I've implemented a file optimization script which takes about 30 seconds or more. There's a loop in which I added echos to track what's being processed.
However, most of the time, no output is being sent, until the end of the process.
How can I control this in order to send the echos in an iteration just as they're being finished?
EDIT:
This is the code in question, with the output buffer functions:
set_time_limit(60);
ini_set("memory_limit", "256M");
$documents = $this->documents_model->get($date1, $date2);
ob_start();
echo '-- Start ' . "<br>\n";
ob_end_flush();
flush();
foreach ($documents as $document) {
ob_start();
echo '-- Processing document ' . $document->id . "<br>\n";
$file = $document->get_file_name();
if (! $file || ! file_exists(DOCUMENT_ROOT . 'documents/' . $file)) {
echo '---- Document ' . $document->id . " has no PDF file yet or it was deleted<br>\n";
$path = $this->documents_model->generatePDF($document);
echo '------ file generated: ' . $path;
}
ob_end_flush();
flush();
}
echo '-- End ' . "<br>\n";
The reason you only get the output after the processing finishes is PHP's output buffer and possibly your web server's buffer.
You have to flush these buffers by calling ob_flush() and flush() after every few echo statements.
For more technical information see PHP buffer ob_flush() vs. flush()
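As a rough sketch of that advice (not the asker's actual code): the helper name emit_progress and the document IDs below are made up for illustration. It echoes one line, calls ob_flush() only when a PHP output buffer is active (calling it without one raises a notice), and then calls flush().

```php
<?php
// Hypothetical helper: print one progress line and push it towards the client.
function emit_progress(string $message): void
{
    echo $message, "<br>\n";

    // ob_flush() sends the contents of the active PHP-level buffer one level up;
    // skip it when no buffer is active.
    if (ob_get_level() > 0) {
        ob_flush();
    }

    // flush() asks PHP to push its system write buffers to the web server.
    flush();
}

// Illustrative IDs only; in the question these would come from $documents.
foreach (['doc-1', 'doc-2', 'doc-3'] as $id) {
    emit_progress("-- Processing document $id");
}
```

Even with this in place, the web server, a proxy, or output compression may still hold the response back, and PHP cannot control that from inside the script.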
I have this code to run a command and print its output in real time. The problem is that the code after the system() call doesn't get executed. How can I fix this problem?
function disable_ob() {
// Turn off output buffering
ini_set('output_buffering', 'off');
// Turn off PHP output compression
ini_set('zlib.output_compression', false);
// Implicitly flush the buffer(s)
ini_set('implicit_flush', true);
ob_implicit_flush(true);
// Clear, and turn off output buffering
while (ob_get_level() > 0) {
// Get the current level
$level = ob_get_level();
// End the buffering
ob_end_clean();
// If the current level has not changed, abort
if (ob_get_level() == $level) break;
}
// Disable apache output buffering/compression
if (function_exists('apache_setenv')) {
apache_setenv('no-gzip', '1');
apache_setenv('dont-vary', '1');
}
}
function build() {
$location = "build/" . date("Y-m-d-h-i-sa");
echo "Build started! Your build location is " . $location;
$url = "http://{$_SERVER['HTTP_HOST']}";
$escaped_url = htmlspecialchars( $url, ENT_QUOTES, 'UTF-8' );
echo "\nIf you don´t get redirected after build go to $escaped_url/" . $location . "/smefs-Indigo-Remastered-master/Release/smef.pw.dll\n";
echo "...\n";
mkdir($location, 0777, true);
dircpy("","buildfiles/smefs-Indigo-Remastered-master",$location . "/smefs-Indigo-Remastered-master",true);
copy("buildfiles/buildscript.bat",$location . "/buildscript.bat");
copy("buildfiles/JunkCode.exe",$location . "/JunkCode.exe");
#shell_exec('call ' . $location . "/buildscript.bat");
system('call ' . $location . "/buildscript.bat");
# THIS CODE NOT EXECUTED
echo "\n\n# BUILD COMPLETED #";
echo "If you aren´t redirected please go to this link to download build: " . $location . "/smefs-Indigo-Remastered-master/Release/smef.pw.dll";
header('Location: '. $location . "/smefs-Indigo-Remastered-master/Release/smef.pw.dll");
}
Fixed! The problem was that the timeout was set to 30 seconds. After adding ini_set('max_execution_time', 3600); everything works!
I use function system() to run system process.
$buff = system('python excel.py ' . $handle->file_dst_pathname, $retval);
It displays the messages on a single line, with no line breaks between them.
In Python I use this line to print data:
print "# %d - article \"%s\" was inserted!" % (i, article)
Precede the output of system(...); with echo "<pre>"; and follow with echo "</pre>";
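Or shorter, using shell_exec(), which returns the whole output as a string (system() prints the output directly and returns only its last line), and then echoing $buff:
$buff = "<pre>" . shell_exec('python excel.py ' . $handle->file_dst_pathname) . "</pre>";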
You could also change all the "\n" chars of your output into "<br/>".
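A minimal sketch of both suggestions, assuming the command output has already been captured as an array of lines (exec() fills such an array through its second argument). The function names as_pre and as_br and the sample lines are invented for illustration, and the escaping step is an addition that the answers above do not mention.

```php
<?php
// Wrap captured output lines in <pre> so the browser keeps the line breaks.
function as_pre(array $lines): string
{
    $text = htmlspecialchars(implode("\n", $lines), ENT_QUOTES, 'UTF-8');
    return '<pre>' . $text . '</pre>';
}

// Alternative: turn every newline into an HTML line break with nl2br().
function as_br(array $lines): string
{
    $text = htmlspecialchars(implode("\n", $lines), ENT_QUOTES, 'UTF-8');
    return nl2br($text);
}

// In the real script the lines would come from something like:
//   exec('python excel.py ' . escapeshellarg($path), $lines, $retval);
// Sample data in the format printed by the Python script:
$lines = [
    '# 1 - article "first" was inserted!',
    '# 2 - article "second" was inserted!',
];

echo as_pre($lines), "\n";
echo as_br($lines), "\n";
```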
My PHP script is running out of memory with the server error "Out of memory: Kill process.." about 25% of the way through the process. Although it searches through about 10,000 lines, fewer than 200 lines match the criteria and therefore need to be stored and written to the file at the end of the process. So I am not sure why it is running out of memory.
Am I receiving this error because I am not clearing variables after each loop, or do I need to increase the memory on the server?
The process in brief is:
- LOOPA - loop through list of 400 zip codes
- using one api call for each zip - get list of all places within each zip (typically about 40-50)
-- SUBLOOP1 - for each place found, use an api call to get all events for that place
---- SUBLOOP1A loop through events to count the number for each place
$zips = file($configFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$dnis = file($dniFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$s3->registerStreamWrapper();
$file = fopen("s3://{$bucket}/{$key}", 'w') or die("Unable to open file!");
fwrite($file, $type . " id" . "\t" . $type . " name" . "\t" . "zip" . "\t" . "event count" . "\n" );
foreach($zips as $n => $zip){
//first line is the label to describe zips, so skip it
if ($n < 1) continue;
$params = $url;
$params .= "&q=" . $zip;
$more_node_pages = true;
while ($more_node_pages){
$res = fetchEvents($params);
//Now find the number of events for each place
foreach($res->data as $node){
//first check if on Do Not Include list
$countevents = true;
foreach($dnis as $dni) {
if ($dni == $node->id) {
echo "Not going to get events for ". $node->name . " id# " . $dni . "\n\n";
$countevents = false;
break;
}
}
//if it found a match, skip this and go to the next
if (!$countevents) continue;
$params = $url . $node->id . "/events/?fields=start_time.order(reverse_chronological)&limit=" . $limit . "&access_token=". $access_token;
//Count the number of valid upcoming events for that node
$event_count = 0;
$more_pages = true;
$more_events = true;
while ($more_pages) {
$evResponse = fetchEvents($params);
if (!empty($evResponse->error)) {
checkError($evResponse->error->message, $evResponse->error->code, $file);
}
//if it finds any events for that place, go through each event for that place one by one to count until you reach today
foreach($evResponse->data as $event){
if(strtotime($event->start_time) > strtotime('now')){
$event_count++;
}
//else we have reached today's events for this node, so get out of this loop, and don't retrieve any more events for this node
else {
$more_events = false;
break;
}
}
if (!empty($evResponse->paging->next) and $more_events) $params = $evResponse->paging->next;
else $more_pages = false;
} //end while loop looking for more pages with more events for that node (page)
if ($event_count > 0) {
fwrite($file, $node->id . "\t" . $node->name . "\t" . $zip . "\t" . $event_count . "\n");
echo $event_count . "\n";
}
} // loop back to the next place until done
//test to see if there is an additional page
if (!empty($res->paging->next)) $params = $res->paging->next; else $more_node_pages = false;
} //close while loop for $more_node_pages containing additional nodes for that zip
} // loop back to the next zip until done
fclose($file);
I would highly recommend adding output to the beginning of each nested loop. I think you most likely have an infinite loop, which is causing the script to run out of memory.
If that isn't the case, then you can try increasing the memory limit for your PHP script by adding this line of PHP to the top of your script:
ini_set("memory_limit", "5G");
If it takes more than 5GB of RAM for your script to process the 400 zip codes, I would recommend breaking your script up so that you can run zip codes 0-10 and then 11-20, then 21-30, etc.
Hope this helps, cheers.
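One way the batching idea could look, as a sketch only: zip_batch is a hypothetical helper, and the zip codes are placeholder values rather than data from the question. Each run of the script would process a single slice of the list, selected by a batch index.

```php
<?php
// Hypothetical helper: return the slice of zip codes for one batch (0-based index).
function zip_batch(array $zips, int $batchIndex, int $batchSize): array
{
    return array_slice($zips, $batchIndex * $batchSize, $batchSize);
}

// Placeholder list; in the question this comes from file($configFile, ...).
$zips = ['10001', '10002', '10003', '10004', '10005'];

// Process two zip codes per run; the batch index could come from a CLI argument.
$batchIndex = 1;
foreach (zip_batch($zips, $batchIndex, 2) as $zip) {
    echo "Would process zip $zip\n";
}
```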
You need to find out where the memory is being lost and then you can either take care of it or work around it. memory_get_usage() is your friend - print it at the top (or bottom) of each loop with some identifier so you can see when & where you are using up memory.
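For instance, a small helper along these lines could be called at the top of each loop. The name memory_checkpoint and the label format are just illustrative, and the loop below only simulates growing data.

```php
<?php
// Hypothetical helper: format current and peak memory usage with a label.
function memory_checkpoint(string $label): string
{
    return sprintf(
        '[%s] usage=%d bytes, peak=%d bytes',
        $label,
        memory_get_usage(),
        memory_get_peak_usage()
    );
}

// Simulated work: each iteration keeps another 100 KB string alive.
$rows = [];
foreach ([1, 2, 3] as $i) {
    $rows[] = str_repeat('x', 100000);
    echo memory_checkpoint("iteration $i"), "\n";
}
```

If the usage figure keeps climbing from one zip code to the next, something is being kept alive across iterations. If it stays flat while the script still dies, the loop itself is probably not terminating.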
This script is supposed to write log files using file locks etc to make sure that scripts running at the same time don't have any read/write complications. I got it off someone on php.net. When I tried to run it twice at the same time, I noticed that it completely ignored the lock file. However, when I ran them consecutively, the lock file worked just fine.
That doesn't make any sense whatsoever. The script just checks if a file exists, and acts based on that. Whether another script is running or not, shouldn't influence it at all. I double checked to make sure the lock file was created in both cases; it was.
So I started to do some testing.
First instance started at 11:21:00 outputs:
Started at: 2012-04-12 11:21:00
Checking if weblog/20120412test.txt.1.wlock exists
Got lock: weblog/20120412test.txt.1.wlock
log file not exists, make new
log file was either appended to or create anew
Wrote: 2012-04-12 11:21:00 xx.xx.xx.xxx "testmsg"
1
Second instance started at 11:21:03 outputs:
Started at: 2012-04-12 11:21:00
Checking if weblog/20120412test.txt.1.wlock exists
Got lock: weblog/20120412test.txt.1.wlock
log file not exists, make new
log file was either appended to or create anew
Wrote: 2012-04-12 11:21:00 xx.xx.xx.xxx "testmsg"
1
So there are two things wrong here: the timestamp, and the fact that the script says the lock file doesn't exist even though it most certainly does.
It's almost as if the second instance of the script simply outputs what the first one did.
<?php
function Weblog_debug($input)
{
echo $input."<br/>";
}
function Weblog($directory, $logfile, $message)
{
// Created 15 september 2010: Mirco Babin
$curtime = time();
$startedat = date('Y-m-d',$curtime) . "\t" . date('H:i:s', $curtime) . "\t";
Weblog_debug("Started at: $startedat");
$logfile = date('Ymd',$curtime) . $logfile;
//Set directory correctly
if (!isset($directory) || $directory === false)
$directory = './';
if (substr($directory,-1) !== '/')
$directory = $directory . '/';
$count = 1;
while(1)
{
//*dir*/*file*.*count*
$logfilename = $directory . $logfile . '.' . $count;
//*dir*/*file*.*count*.lock
$lockfile = $logfilename . '.wlock';
$lockhandle = false;
Weblog_debug("Checking if $lockfile exists");
if (!file_exists($lockfile))
{
$lockhandle = @fopen($lockfile, 'xb'); //lock handle true if lock file opened
Weblog_debug("Got lock: $lockfile");
}
if ($lockhandle !== false) break; //break loop if we got lock
$count++;
if ($count > 100) return false;
}
//log file exists, append
if (file_exists($logfilename))
{
Weblog_debug("log file exists, append");
$created = false;
$loghandle = @fopen($logfilename, 'ab');
}
//log file not exists, make new
else
{
Weblog_debug("log file not exists, make new");
$loghandle = @fopen($logfilename, 'xb');
if ($loghandle !== false) //Did we make it?
{
$created = true;
$str = '#version: 1.0' . "\r\n" .
'#Fields: date time c-ip x-msg' . "\r\n";
fwrite($loghandle,$str);
}
}
//was log file either appended to or create anew?
if ($loghandle !== false)
{
Weblog_debug("log file was either appended to or create anew");
$str = date('Y-m-d',$curtime) . "\t" .
date('H:i:s', $curtime) . "\t" .
(isset($_SERVER['REMOTE_ADDR']) ? $_SERVER['REMOTE_ADDR'] : '-') . "\t" .
'"' . str_replace('"', '""', $message) . '"' . "\r\n";
fwrite($loghandle,$str);
Weblog_debug("Wrote: $str");
fclose($loghandle);
//Only chmod if new file
if ($created) chmod($logfilename,0644); // Read and write for owner, read for everybody else
$result = true;
}
else
{
Weblog_debug("log file was not appended to or create anew");
$result = false;
}
/**
Sleep & disable unlinking of lock file, both for testing purposes.
*/
//Sleep for 10sec to allow other instance(s) of script to run while this one still in progress.
sleep(10);
//fclose($lockhandle);
//@unlink($lockfile);
return $result;
}
echo Weblog("weblog", "test.txt", "testmsg");
?>
UPDATE:
Here's a simple script that just shows the timestamp. I tried it on a different host, so I don't think it's a problem with my server:
<?php
function Weblog_debug($input)
{
echo $input."<br/>";
}
$curtime = time();
$startedat = date('Y-m-d',$curtime) . "\t" . date('H:i:s', $curtime) . "\t";
Weblog_debug("Started at: $startedat");
$timediff = time() - $curtime;
while($timediff < 5)
{
$timediff = time() - $curtime;
}
Weblog_debug("OK");
?>
Again, if I start the second instance of the script while the first is in the while loop, the second script will state it started at the same time as the first.
I can't fricking believe this myself, but it turns out this is just a "feature" in Opera. The script works as intended in Firefox. I kinda wish I tested that before I went all berserk on this but there ya go.
When I use file_get_contents and pass it as a parameter to another function, without assigning it to a variable, does that memory get released before the script execution finishes?
For Example:
preg_match($pattern, file_get_contents('http://domain.tld/path/to/file.ext'), $matches);
Will the memory used by file_get_contents be released before the script finishes?
The temporary string created to hold the file contents will be destroyed. Without delving into the sources to confirm, here are a couple of ways you can test that a temporary value created as a function parameter gets destroyed:
Method 1: a class which reports its destruction
This demonstrates lifetime by using a class which reports on its own demise:
class lifetime
{
public function __construct()
{
echo "construct\n";
}
public function __destruct()
{
echo "destruct\n";
}
}
function getTestObject()
{
return new lifetime();
}
function foo($obj)
{
echo "inside foo\n";
}
echo "Calling foo\n";
foo(getTestObject());
echo "foo complete\n";
This outputs
Calling foo
construct
inside foo
destruct
foo complete
This indicates that the implied temporary variable is destroyed right after the foo function call.
Method 2: measure memory usage
Here's another method which offers further confirmation using memory_get_usage to measure how much we've consumed.
function foo($str)
{
$length=strlen($str);
echo "in foo: data is $length, memory usage=".memory_get_usage()."\n";
}
echo "start: ".memory_get_usage()."\n";
foo(file_get_contents('/tmp/three_megabyte_file'));
echo "end: ".memory_get_usage()."\n";
This outputs
start: 50672
in foo: data is 2999384, memory usage=3050884
end: 51544
In your example the memory will be released when $matches goes out of scope.
If you weren't storing the result of the match the memory would be released immediately
In the following code, memory usage = 6493720:
start: 1050504
end: 6492344
echo "start: ".memory_get_usage()."\n";
$data = file_get_contents("/six_megabyte_file");
echo "end: ".memory_get_usage()."\n";
but in the following code, memory usage = 1049680:
start = 1050504
end = 1050976
echo "start: ".memory_get_usage()."\n";
file_get_contents("/six_megabyte_file");
echo "end: ".memory_get_usage()."\n";
Note: in the first snippet the file contents are stored in a variable.
If you think this will help you avoid out-of-memory errors, you are wrong. Your code (using a bytes_format helper to format the byte counts):
<?php
$url = 'http://speedtest.netcologne.de/test_10mb.bin';
echo 'Before: ' . bytes_format(memory_get_usage()) . PHP_EOL;
preg_match('~~', file_get_contents($url), $matches);
echo 'After: ' . bytes_format(memory_get_usage()) . PHP_EOL;
echo 'Peak: ' . bytes_format(memory_get_peak_usage(true)) . PHP_EOL;
?>
uses 10.5 MB:
Before: 215.41 KB
After: 218.41 KB
Peak: 10.5 MB
and this code:
<?php
$url = 'http://speedtest.netcologne.de/test_10mb.bin';
echo 'Before: ' . bytes_format(memory_get_usage()) . PHP_EOL;
$contents = file_get_contents($url);
preg_match('~~', $contents, $matches);
unset($contents);
echo 'After: ' . bytes_format(memory_get_usage()) . PHP_EOL;
echo 'Peak: ' . bytes_format(memory_get_peak_usage(true)) . PHP_EOL;
?>
uses 10.5 MB as well:
Before: 215.13 KB
After: 217.64 KB
Peak: 10.5 MB
If you want to guard your script, you need to use the $length parameter of file_get_contents() or read the file in chunks.
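Both options might look roughly like this. The function names read_capped and for_each_chunk are invented for the sketch, and the temporary file stands in for the remote URL so the example runs locally.

```php
<?php
// Option 1: cap how much is read, via the $length argument of file_get_contents().
function read_capped(string $source, int $maxBytes): string
{
    $data = file_get_contents($source, false, null, 0, $maxBytes);
    if ($data === false) {
        throw new RuntimeException("Unable to read $source");
    }
    return $data;
}

// Option 2: stream the source in fixed-size chunks; returns total bytes seen.
function for_each_chunk(string $source, int $chunkSize, callable $handler): int
{
    $handle = fopen($source, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Unable to open $source");
    }
    $total = 0;
    try {
        while (!feof($handle)) {
            $chunk = fread($handle, $chunkSize);
            if ($chunk === false || $chunk === '') {
                break;
            }
            $total += strlen($chunk);
            $handler($chunk); // only one chunk is held in memory at a time
        }
    } finally {
        fclose($handle);
    }
    return $total;
}

// Demo against a local temporary file (a stand-in for the remote download).
$path = tempnam(sys_get_temp_dir(), 'chunk');
file_put_contents($path, str_repeat('a', 10000));

echo 'Capped read: ', strlen(read_capped($path, 100)), " bytes\n";
echo 'Streamed: ', for_each_chunk($path, 4096, function (string $chunk): void {
    // e.g. run preg_match() on $chunk here
}), " bytes\n";

unlink($path);
```

One caveat with the chunked approach: a match that straddles two chunks will be missed unless you keep some overlap between consecutive chunks.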