I'm currently working on a small project where some data is gathered from the web and the system creates relations between the entries. Of course it was not perfect from the beginning, so I needed a script which rebuilds all the connections and relations using the updated scripts I made.
Basically the script works, but as there is going to be a nice-looking backend later on, it's not really what I want.
The script needs around 10 minutes, and because I didn't just want to raise PHP's max_execution_time, I thought of another method. Instead of loading 1000 SQL entries at once, I cut it down to 250 at a time and just repeat with the next 250 when the first round has finished. For that I used PEAR's HTTP_Request. Here is a stripped-down version of the script:
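require_once 'HTTP/Request.php';

// $max must end up as the integer row count; how to read it from the result depends on the $db wrapper.
$max = $db->query("SELECT COUNT(id) as max FROM db_table");
// Cast the offset to int so raw request data never reaches the SQL string.
$lower = isset($_POST['lower']) ? (int) $_POST['lower'] : 0;
$plus = 250;
$entries = $db->query("SELECT * FROM db_table LIMIT {$lower},{$plus}");
foreach ($entries as $entry) {
    // DO SOME STUFF TO UPDATE THE RELATIONS BETWEEN THE DATA
}
$lower = $lower + $plus;
// Only start another round while rows remain ("<" rather than "<=" avoids an empty extra round).
if ($lower < $max) {
    $request = new HTTP_Request("path to the script");
    $request->setMethod(HTTP_REQUEST_METHOD_POST);
    $request->addPostData("lower", $lower);
    $result = $request->sendRequest();
}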
That's it. As I said, it works: each round is a new request, so it's not affected by max_execution_time. But the browser just keeps loading and loading and only finishes after a while, so of course I cannot show any refreshed data for something like a progress bar.
I saw many posts using PHP's flush(), but that didn't work for me because of the (I guess) stupid way I solved my problem.
How would you do this if you need to install something on a webspace and you don't have the possibility to change the max execution time or install HTTP_Request?
As I said, it should look like a progress bar later on. I guess I have to use Ajax, report which round the script is at after every round, and update the progress bar via JavaScript.
Can you help me?
You are still having trouble with max_execution_time because the page you requested from the web browser stays active the whole time: it doesn't finish until the chained HTTP requests finish. Instead, redirect the browser to the script again with the parameter lower (the script then has to read lower from $_GET instead of $_POST):
header("Location: myscript.php?lower=$lower");
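If you go the Ajax route mentioned in the question instead, each request could process one batch and report back where the next one should start. Here is a minimal sketch of that bookkeeping, assuming a hypothetical helper named batch_status() (not part of the original script) whose result the endpoint would return as JSON, so the JavaScript side can update the progress bar and request the next batch:

```php
<?php
// Hypothetical helper: computes what a batch endpoint would report to the browser.
// $lower is the offset of the batch just processed, $total the overall row count.
function batch_status(int $lower, int $batchSize, int $total): array
{
    $next = min($lower + $batchSize, $total);
    return [
        'next'    => $next,            // offset the browser should request next
        'done'    => $next >= $total,  // true once every row has been handled
        'percent' => $total > 0 ? (int) floor($next * 100 / $total) : 100,
    ];
}

// Example: after processing rows 0..249 of 1000, answer the Ajax call with JSON.
echo json_encode(batch_status(0, 250, 1000)), "\n";
```

Each request stays short, so max_execution_time is never an issue, and the browser gets a fresh percentage after every round.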
I have a PHP script that reads a CSV and exports it to a database. At the beginning of each execution, the script gets a customer name from $_POST. It takes around 7 minutes to send 120k rows. However, my host only allows PHP scripts to run for up to 165 seconds.
My idea was to refresh the page before the 165 seconds are up and start the export again at the row where it stopped. I've succeeded in refreshing the page, but I struggle to preserve the variable holding the row position at which the script stopped, so that I can use it after the refresh.
I could use $_POST or $_SESSION, but my script may run several times at the same moment, exporting a different CSV on each run. I'm afraid that changing these superglobal variables from scripts running at the same time will make them collide and change their values when I don't want them to.
First: is the above assumption true?
Then, if it is, how can I store the row number the script stopped at before refreshing the page? I thought about creating a file, putting the information inside and then reading it. It might look like this:
customer_name : Jon
row_ended : 10584
customer_name : Jane
row_ended : 11564
But isn't there an easier and more efficient solution?
You can create a run ID and save it on the session.
Ex.
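session_start();
// Initialise the run only once, so a redirect does not reset lastRow.
if (!isset($_SESSION['run'])) {
    $_SESSION['run']['id'] = 1; // or some unique ID
    $_SESSION['run']['user'] = 'jon';
    $_SESSION['run']['lastRow'] = 0;
}
$deadline = time() + 160; // stop this request after 160 seconds
// ... inside the export loop, after each row:
if (time() >= $deadline) {
    // 160 seconds have passed: remember the position and redirect to the same page.
    $_SESSION['run']['lastRow'] = 100000; // the row actually reached
    header("location: page.php");
    exit;
}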
But this will not really solve the problem, and it can turn into redirect hell.
You can try to increase the max execution time at runtime.
ini_set('max_execution_time', 0); // no time limit
Or, the best solution, run it as a shell command with max_execution_time = 0.
Users may navigate away from the page if it takes too long.
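For the file idea from the question, one way to keep parallel runs from colliding is to give every run its own small state file named after its run ID. This is only a sketch under that assumption; the helper names (run_state_path, save_run_state, load_run_state) and the JSON layout are made up for illustration:

```php
<?php
// Hypothetical helpers: one small JSON state file per run, named after the run ID.
function run_state_path(string $dir, string $runId): string
{
    // Hash the ID so arbitrary customer names cannot produce odd file names.
    return rtrim($dir, '/') . '/run_' . sha1($runId) . '.json';
}

function save_run_state(string $dir, string $runId, int $lastRow): void
{
    // LOCK_EX keeps two writers from interleaving their output in the same file.
    file_put_contents(run_state_path($dir, $runId), json_encode(['lastRow' => $lastRow]), LOCK_EX);
}

function load_run_state(string $dir, string $runId): int
{
    $path = run_state_path($dir, $runId);
    if (!is_file($path)) {
        return 0; // no state yet: start from the first row
    }
    $state = json_decode(file_get_contents($path), true);
    return isset($state['lastRow']) ? (int) $state['lastRow'] : 0;
}
```

The run ID then only has to travel along with the redirect (for example as a query parameter), and each export reads and writes nothing but its own file.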
I have a products database that synchronizes with product data every morning.
The process is very clear:
1. Get all products from the database by query
2. Loop through all products, and get an XML from the other server by product_id
3. Update the data from the XML
4. Log the changes to a file.
If I query a small number of items, by limiting it to 500 random products for example, everything goes fine. But when I query all products, my script SOMETIMES goes on the fritz and starts looping multiple times. Hours later I still see my log file growing and products being added.
I checked everything I could think of, for example:
- Are variables used twice, overwriting each other?
- Does the function call itself?
- Does it happen with a small number of products too? No.
- The script is called using a cronjob; are the settings OK? Yes.
What makes it especially weird is that it sometimes goes right and sometimes doesn't. Could this be some memory problem?
EDIT
wget -q -O /dev/null http://example.eu/xxxxx/cron.php?operation=sync
It's called from Webmin at a specific hour and minute.
Code is hundreds of lines long...
Thanks
You have:
- max_execution_time disabled. Your script won't end until the process is complete, however long that takes.
- memory_limit disabled. There is no limit on how much data is stored in memory.
- 500 records completed without issues. This indicates that the script completes its process before the next cronjob iteration. For example, if your cron runs every hour, then the 500 records are processed in less than an hour.
If you have a cronjob that is going to process a large number of records, then consider adding a lock mechanism to the process. Only allow the script to run once at a time, and start again only when the previous run is complete.
You can create a script lock as part of a shell script before executing your PHP script. Or, if you don't have access to your server, you can use a database lock within the PHP script, something like this.
class ProductCronJob
{
    protected $lockValue;

    public function run()
    {
        // Obtain a lock
        if ($this->obtainLock()) {
            try {
                // Run your script if you have a valid lock
                $this->syncProducts();
            } finally {
                // Release the lock on completion, even if the sync throws
                $this->releaseLock();
            }
        }
    }

    protected function syncProducts()
    {
        // your long running script
    }

    protected function obtainLock()
    {
        $time = new \DateTime;
        $timestamp = $time->getTimestamp();
        $this->lockValue = $timestamp . '_syncProducts';
        $db = JFactory::getDbo();
        // lock = '0' indicates that the cronjob is not active, so this UPDATE only
        // matches (and claims the lock) while no other run holds it.
        // Assumes a #__cronlock table with the columns name, lock and timemodified.
        $query = $db->getQuery(true)
            ->update($db->quoteName('#__cronlock'))
            ->set($db->quoteName('lock') . ' = ' . $db->quote($this->lockValue))
            ->set($db->quoteName('timemodified') . ' = ' . (int) $timestamp)
            ->where($db->quoteName('name') . ' = ' . $db->quote('syncProducts'))
            ->where($db->quoteName('lock') . ' = ' . $db->quote('0'));
        $db->setQuery($query);
        $db->execute();
        if ($db->getAffectedRows() !== 1) {
            // Currently there is an active process - can't start a new one.
            // You can simply return false as below, or add extra logic first:
            // read timemodified for 'syncProducts' from #__cronlock and check the
            // lock age, i.e. how long the current run has been active.
            // $diff = $timestamp - $current['timemodified'];
            // if ($diff >= 25200) {
            //     // The current script has been active for 7 hours.
            //     // You can change 25200 to any number of seconds you want.
            //     // Here you can send a notification email to the site administrator.
            // }
            return false;
        }
        return true;
    }

    protected function releaseLock()
    {
        // Update #__cronlock set lock = '0' where name = 'syncProducts'
        $db = JFactory::getDbo();
        $query = $db->getQuery(true)
            ->update($db->quoteName('#__cronlock'))
            ->set($db->quoteName('lock') . ' = ' . $db->quote('0'))
            ->where($db->quoteName('name') . ' = ' . $db->quote('syncProducts'));
        $db->setQuery($query);
        $db->execute();
    }
}
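If a lock table feels like too much, a lock file is a different technique that gives the same "only one run at a time" guarantee. The following is a rough sketch using PHP's flock(); the helper names acquire_cron_lock and release_cron_lock are invented for this example and are not part of the answer's class:

```php
<?php
// Hypothetical helper: returns an open handle while the lock is held, or null if busy.
function acquire_cron_lock(string $lockFile)
{
    $handle = fopen($lockFile, 'c'); // create the file if missing, do not truncate it
    if ($handle === false) {
        return null;
    }
    // LOCK_NB makes flock() fail immediately instead of waiting for the other run.
    if (!flock($handle, LOCK_EX | LOCK_NB)) {
        fclose($handle);
        return null;
    }
    return $handle;
}

// Hypothetical helper: gives the lock back once the sync has finished.
function release_cron_lock($handle): void
{
    flock($handle, LOCK_UN);
    fclose($handle);
}
```

A cron script would call acquire_cron_lock() first and exit right away when it gets null. One nice property of this approach is that the operating system drops the lock when the process dies, so a crashed run cannot leave a stale lock behind.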
Your script is running for quite some time (~45 min) and wget thinks it's "timing out" since you don't return any data. By default wget has a 900 s timeout value and a retry count of 20, so it requests the URL again after each timeout. So first you should probably change your wget command to prevent this (note that --tries=0 would mean unlimited retries, so use --tries=1):
wget --tries=1 --timeout=0 -q -O /dev/null "http://example.eu/xxxxx/cron.php?operation=sync"
Now, removing the timeout could lead to other issues, so instead you could send data from your script (and flush it to force the web server to send it) to make sure wget doesn't think the script "timed out", say every 1000 loops or so. Think of this as a progress bar...
Just keep in mind that you will hit an issue when the run time gets close to your cron period, as two crons will then run in parallel. You should optimize your process and/or add a lock mechanism.
I see two possibilities:
- cron calls the script much more often than expected
- the script somehow takes too long.
You can try to estimate the time a single iteration of the loop takes.
This can be done with time() (or microtime(true) for sub-second resolution). Perhaps the result is surprising, perhaps not. You can probably get the number of results too. Multiply the two, and you will have an estimate of how long the process should take.
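The multiplication can be written down as a tiny helper. This is just a sketch; estimate_total_seconds() is a made-up name, and the idea is to time a small sample of iterations and scale it up to the full result count:

```php
<?php
// Hypothetical helper: scales the time measured for a sample of iterations
// up to the full number of rows.
function estimate_total_seconds(float $sampleSeconds, int $sampleCount, int $totalCount): float
{
    if ($sampleCount <= 0) {
        throw new InvalidArgumentException('sampleCount must be positive');
    }
    return $sampleSeconds / $sampleCount * $totalCount;
}

// Usage idea: take microtime(true) before and after the first 100 products,
// then pass the difference, 100, and the total number of products.
```

If the estimate comes out far above the interval between two cron runs, overlapping runs are the likely explanation.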
$productsToSync = $db->loadObjectList();
and
foreach ($productsToSync AS $product) {
It seems you load every result into an array. This won't work for huge databases, because obviously a million rows won't fit in memory. You should fetch just one result at a time. With MySQL there are functions that fetch a single row at a time from the result resource; I hope your database layer allows the same.
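If the database layer cannot hand out rows one by one, paging through the table in small batches keeps memory usage bounded in a similar way. Below is a minimal sketch with a generator; iterate_in_batches() and the $fetchBatch callable (which would run a LIMIT query with the given offset and limit) are assumptions for illustration, not part of the original code:

```php
<?php
// $fetchBatch is a hypothetical callable: fn(int $offset, int $limit): array of rows.
// Only one batch is held in memory at any time.
function iterate_in_batches(callable $fetchBatch, int $batchSize): Generator
{
    $offset = 0;
    do {
        $rows = $fetchBatch($offset, $batchSize);
        foreach ($rows as $row) {
            yield $row;
        }
        $offset += $batchSize;
    } while (count($rows) === $batchSize); // a short batch means the end was reached
}
```

The calling loop stays the same, e.g. foreach (iterate_in_batches($fetch, 500) as $product) { ... }.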
I also see you execute another query in each iteration of the loop. This is something I try to avoid. Perhaps you can move this to after the first query has finished and do all of those in one big query? On the other hand, this may conflict with my first suggestion.
Also, if something goes wrong, try to be paranoid when debugging. Measure as much as you can. Time as much as you can when it's a performance issue. Put the timings in your log file. Usually you will find the bottleneck.
I solved the problem myself. Thanks for all the replies!
My MySQL timed out, that was the problem. As soon as I added:
ini_set('mysql.connect_timeout', 14400);
ini_set('default_socket_timeout', 14400);
to my script, the problem stopped. I really hope this helps someone. I'll upvote all the locking answers, because those were very helpful!
Firstly I want to give you the basic idea of what I am trying to do:
I'm trying to make a free web hosting service do some work for me. I've created one PHP page and a MySQL db. The basic idea behind my PHP page is that I have a while loop running until $shutdown is set, and a counter inside the while loop to track whether the code is still running:
<?php
/*
    Connect to database etc. etc
*/
$shutdown = false;
// Main loop
while (!$shutdown)
{
    // Check for user shutdown request
    $strq = "SELECT * FROM TB_Shutdown;";
    $result = mysql_query($strq);
    $row = mysql_fetch_array($result);
    if ($row[0] == "true")
    {
        $shutdown = true; // I know this statement is useless but nevermind
        break;
    }
    // Increase counter
    $strq = "SELECT * FROM TB_Counter;";
    $result = mysql_query($strq);
    $row = mysql_fetch_array($result);
    if (intval($row[0]) == 60)
    {
        // Reset counter
        $strq = "UPDATE TB_Counter SET value = 0";
        $result = mysql_query($strq);
        /*
            I have some code to do some works at here its not important just curl stuff
        */
    } // this closing brace was missing before the else
    else
    {
        // Increase counter
        $strq = "UPDATE TB_Counter SET value = " . (intval($row[0]) + 1);
        $result = mysql_query($strq);
    }
    /*
        I have some code to do some works at here its not important just curl stuff
    */
    // Sleep
    sleep(1);
}
?>
And I have a check.php which returns the value from TB_Counter.
The problem is: I'm watching the TB_Counter table every second, and it stops after a while. If I close the web browser from which I called my main loop page, it stops after about 2 minutes. If I don't, after 5-7 minutes I get the error "connection has been reset" in the browser and the loop stops.
What should I do to make my loop last forever?
You need to allow PHP to execute completely. There is an option in the php.ini file which says:
max_execution_time = 30;
This sets the maximum time in seconds a script is allowed to run
before it is terminated by the parser. This helps prevent poorly
written scripts from tying up the server. The default setting is 30.
When running PHP from the command line the default setting is 0.
The function set_time_limit:
Set the number of seconds a script is allowed to run. If this is
reached, the script returns a fatal error. The default limit is 30
seconds or, if it exists, the max_execution_time value defined in the
php.ini.
set_time_limit() has no effect when PHP runs in safe mode (a feature of old PHP versions, removed in PHP 5.4). To check if PHP is running in safe mode, you can use this:
echo ini_get('safe_mode');
If it is going to be a huge process, you can consider running on Cron as a CronJob. A small explanation on it:
Cron is very simply a Linux module that allows you to run commands at predetermined times or intervals. In Windows, it’s called Scheduled Tasks. The name Cron is in fact derived from the same word from which we get the word chronology, which means order of time.
Using Cron, a developer can automate such tasks as mailing ezines that might be better sent during an off-hour, automatically updating stats, or the regeneration of static pages from dynamic sources. Systems administrators and Web hosts might want to generate quota reports on their clients, complete automatic credit card billing, or similar tasks. Cron has something for everyone!
You could use the PHP function set_time_limit().
You should not handle this from a browser. Running a cron job every minute that does the checks you need would be a better solution.
Besides that, why would you update every second? Why not just write down a timestamp, so you know when a request was made?
Making something run forever is not doable. It is more important to make sure that your business process keeps running. So maybe it would be wise to describe your business case here: you seem to need to count seconds and do something within a minute, but it's not totally clear. So what do you need to do?
OK, here is my problem.
1. I have a file which outputs XML based on an input X.
2. I have another file which calls the above file (1) 10000 times (I mean: many times) with different numbers for X.
3. When a user clicks "Go", it should go through all those 10000 Xs and at the same time show him the progress, i.e. how many are done (maybe updated once every 10 seconds).
How do I do it? I need ideas. I know how to do Ajax and stuff, but what structure should my program take?
EDIT
So, following the answer given below, I stored my output in a session variable, and it then outputs the answer. What is happening is:
When I execute a long script, it finishes within, say, 1 minute. But if in the meantime I open (in a new window) just the file which outputs my SESSION variable, it doesn't output anything until the first script has finished. That is the complete opposite of what I want. What's the problem here? Is it my system/server which doesn't handle multiple requests, or what?
EDIT 2
I use the files approach:
To read what I want:
<?php include_once '../includeTop.php'; echo util::readFromLog("../../Files/progressData.tmp"); ?>
and in another script
$processed ++;
util::writeToLog($dir.'/progressData.tmp', "Files processed: $processed");
where the functions are:
public static function writeToLog($file, $data) {
    $f = fopen($file, "w");
    fwrite($f, $data);
    fclose($f);
}
public static function readFromLog($file) {
    return file_get_contents($file);
}
But still the same problem persists :(. I can manually see the file getting updated like 1, 2, 3 etc. But when I run my script to read it from PHP, it just waits until my original script has finished its output.
EDIT 3
OK, I finally found the solution. Instead of requesting the output through the PHP file, I now go directly to the log file and read it.
Put the progress (i.e. how far you are into the 2nd file) into memcached directly from the background job, then deliver that value when it is requested by the JavaScript application (triggered by a timer, as long as you have not reached 100%). The only thing you need to figure out is how to pass some sort of "transaction ID" to both the background job and the JavaScript side, so they access the same key in memcached.
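Edit: I was wrong about $_SESSION. It doesn't update asynchronously, i.e. the values you store in it are not accessible to other requests until the script has finished (or has called session_write_close()). With the default file-based session handler, PHP also keeps the session locked while a script has it open, which is why a second request using the same session has to wait. Whoops.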
So the progress needs to be stored in something that does update asynchronously: Memory (like pyroscope suggests, and which is still the best solution), a file, or the database.
In other words, instead of using $_SESSION to store the value, it should be stored by memcached, in a file or in the database.
E.g. using the database
$progress = 0;
// $uid is a string, so it has to be quoted (and escaped if it ever comes from user input)
mysql_query("INSERT INTO `progress` (`id`, `progress`) VALUES ('$uid', $progress)");
# loop starts
# processing...
$progress += $some_increment;
mysql_query("UPDATE `progress` SET `progress`=$progress WHERE `id`='$uid'");
# loop ends
Or using a file
$progress = 0;
file_put_contents("/path/to/progress_files/$uid", $progress);
# loop starts
# processing...
$progress += $some_increment;
file_put_contents("/path/to/progress_files/$uid", $progress);
# loop ends
And then read the file/select from the database, when requesting progress via ajax. But it's not a pretty solution compared to memcached.
Also, remember to remove the file/database row once it's all done.
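For the file variant, one detail worth considering is that the Ajax request may read the file while the worker is in the middle of writing it. A possible way around that, sketched here with made-up helper names (write_progress_atomic, read_progress), is to write to a temporary file and rename it over the real one; on POSIX systems a rename within the same filesystem replaces the target in one step:

```php
<?php
// Hypothetical helper: writes the value to a temp file first, then swaps it in,
// so a reader never sees a half-written progress file.
function write_progress_atomic(string $file, int $percent): void
{
    $tmp = $file . '.' . getmypid() . '.tmp';
    file_put_contents($tmp, (string) $percent);
    rename($tmp, $file);
}

// Hypothetical helper for the Ajax endpoint: returns 0 when no progress file exists yet.
function read_progress(string $file): int
{
    return is_file($file) ? (int) file_get_contents($file) : 0;
}
```

The endpoint that answers the Ajax call should not call session_start(), otherwise it may end up waiting on the session lock held by the long-running script.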
You could put the progress in a $_SESSION variable (you'll need a unique name for it), and update it while the process runs. Meanwhile your ajax request simply gets that variable at a specific interval
function heavy_process($input, $uid) {
$_SESSION[$uid] = 0;
# loop begins
# processing...
$_SESSION[$uid] += $some_increment;
# loop ends
}
Then have a url that simply spits out the $_SESSION[$uid] value when it's requested via ajax. Then use the returned value to update the progress bar. Use something like sha1(microtime()) to create the $uid
Edit: pyroscope's solution is technically better, but if you don't have a server with memcached or the ability to run background processes, you can use $_SESSION instead
A site I am working with is starting to get a little sluggish, and I would like to refine it. I think the problem is with the PHP, but I can't be sure. How can I see how long functions are taking to perform?
If you want to test the execution time :
<?php
$startTime = microtime(true);
// Your content to test
$endTime = microtime(true);
$elapsed = $endTime - $startTime;
echo "Execution time : $elapsed seconds";
?>
Try the profiler feature in XDebug or Zend Debugger?
Two things you can do.
First, you can place microtime() calls everywhere, although that's not convenient if you want to test more than one function. If you want to test many functions, which I assume you do, there is a simpler way.
Just use a class (follow the tutorial linked below) that measures how long all your functions take, rather than placing microtime() everywhere. It's very convenient:
http://codeaid.net/php/calculate-script-execution-time-%28php-class%29
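To give an idea of what such a helper boils down to, here is a minimal sketch. It is not the class from the linked tutorial; time_call() is a made-up function that wraps any callable and returns its result together with the elapsed wall-clock time:

```php
<?php
// Hypothetical helper: runs $fn with the given arguments and returns
// [result, elapsed seconds], so the timing code lives in one place.
function time_call(callable $fn, ...$args): array
{
    $start = microtime(true);
    $result = $fn(...$args);
    return [$result, microtime(true) - $start];
}

// Usage: time a built-in function call.
list($length, $seconds) = time_call('strlen', 'hello');
```

Logging $seconds per function name makes it easy to spot which call dominates the page time.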
The second thing you can do to optimize your script is to take a look at the memory usage.
By observing the memory usage of your scripts, you may be able to optimize your code better.
PHP has a garbage collector and a pretty complex memory manager. The amount of memory being used by your script can go up and down during its execution. To get the current memory usage, we can use the memory_get_usage() function, and to get the highest amount of memory used at any point, we can use the memory_get_peak_usage() function.
echo "Initial: ".memory_get_usage()." bytes \n";
/* prints
Initial: 361400 bytes
*/
// let's use up some memory
for ($i = 0; $i < 100000; $i++) {
$array []= md5($i);
}
// let's remove the elements again (this loop unsets all of them)
for ($i = 0; $i < 100000; $i++) {
unset($array[$i]);
}
echo "Final: ".memory_get_usage()." bytes \n";
/* prints
Final: 885912 bytes
*/
echo "Peak: ".memory_get_peak_usage()." bytes \n";
/* prints
Peak: 13687072 bytes
*/
http://net.tutsplus.com/tutorials/php/9-useful-php-functions-and-features-you-need-to-know/
You can also do it manually, by recording the microtime() value in various places, like this:
<?
$TIMER['start']=microtime(TRUE);
// some code
$query="SELECT ...";
$TIMER['before q']=microtime(TRUE);
$res=mysql_query($query);
$TIMER['after q']=microtime(TRUE);
while ($row = mysql_fetch_array($res)) {
// some code
}
$TIMER['array filled']=microtime(TRUE);
// some code
$TIMER['pagination']=microtime(TRUE);
// and so on
?>
and then visualize it
<?
if ('127.0.0.1' === $_SERVER['REMOTE_ADDR']) {
echo "<table border=1><tr><td>name</td><td>so far</td><td>delta</td><td>per cent</td></tr>";
reset($TIMER);
$start=$prev=current($TIMER);
$total=end($TIMER)-$start;
foreach($TIMER as $name => $value) {
$sofar=round($value-$start,3);
$delta=round($value-$prev,3);
$percent=round($delta/$total*100);
echo "<tr><td>$name</td><td>$sofar</td><td>$delta</td><td>$percent</td></tr>";
$prev=$value;
}
echo "</table>";
}
?>
The IP address check is there because this profiling is done on the live site, so only requests from 127.0.0.1 get to see the table.
Though I doubt it's PHP itself. Most likely it's the database. So, pay most attention to query execution timing.
However, the term "site" is very broad. It also includes JS, CSS, images and so on. So, I'd suggest starting from Firebug's Net panel to see which part of the whole page takes the most time.
Of course, refining can be done only after analysing the profiling results, and cannot be advised here without them.
Your best bet is Xdebug. I'm happy, as it comes bundled with my PhpED IDE. I can get profiler data at the click of a button.
So maybe you could consider that.
I had similar issues, so I created two new tables in the database and two new functions. One was audit_sql and the other was audit_code. Because I used an SQL abstraction class, it was easy to time every single SQL call (I used PHP's microtime(), as some others have suggested). So, I called microtime() before and after the SQL call and stored the results in the database.
Similarly with pages. I called microtime() at the start and end of each page and, if necessary, at the start and end of functions, divs, whatever I thought might be a culprit.
The general results were:
SQL calls to MySQL were almost instantaneous and were not a problem at all. The only thing I would say is that even I was surprised at the number being executed! The site is generated from the database, even the menus, permissions etc. To produce the home page, the SQL calls numbered in the hundreds.
PHP was not the culprit. This was even more instantaneous than MySQL.
The culprit was.... (big build up!) calls to YouTube and Picasa and other sites like that. I host videos and photo albums on the site (well, I don't actually store them, they are stored on YT etc.) and on the home page are thumbnails that are fetched from YouTube and the like via the YouTube PHP API/Zend Framework. Because this is all HTTP-based communication with the other sites, each call was taking 1, 2 or 3 seconds. This was causing the divs containing these thumbnails to take between 6 and 12 seconds, and the home page up to 17 seconds.
The solution - store all thumbnails on my server. The first time, one has to be served from the remote site (YT, Picasa etc.), so do that and then store it on your own site. After that, you check if you have it and, if so, always serve it from your server. This cuts the page load time down to 2-3 seconds tops. Granted, the first home page load after someone has added more videos/images will take some time, but not thereafter. People will put a long one-off page load time down to their connection/the internet in general. Too many slow loads of your site and they will stop visiting!
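In code, the "fetch once, then serve locally" idea could look roughly like this. It is only a sketch: cached_thumbnail() and the $fetchRemote callable (which would wrap the actual remote API call) are names invented for this example, and the cache directory is assumed to exist and be writable:

```php
<?php
// $fetchRemote is a hypothetical callable: fn(string $url): string (the image bytes).
function cached_thumbnail(string $url, string $cacheDir, callable $fetchRemote): string
{
    // One cache file per URL; the hash keeps the file name safe and unique.
    $path = rtrim($cacheDir, '/') . '/' . sha1($url) . '.img';
    if (is_file($path)) {
        return file_get_contents($path); // served locally, no remote call
    }
    $bytes = $fetchRemote($url);         // slow path, taken on the first request only
    file_put_contents($path, $bytes, LOCK_EX);
    return $bytes;
}
```

Only the first visitor after a new upload pays for the remote round trip; everyone else gets the local copy.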
I hope that helps somewhat.