I have a PHP script that reads a CSV and exports it to a database. At the beginning of each execution, the script gets a customer name from $_POST. It runs for around 7 minutes to send 120k rows, but my host only allows PHP scripts to run for up to 165 seconds.
My idea was to refresh the page before the 165 seconds are up and resume the export at the row where it stopped. I've succeeded in refreshing the page, but I'm struggling to preserve the variable that stores the row position where the script stopped, so I can use it after the refresh.
I could use $_POST or $_SESSION, but several instances of my script may run at the same moment, each exporting a different CSV. I'm afraid that concurrent runs changing these superglobals would collide and overwrite each other's values when I don't want them to.
First: is that concern justified?
If it is, how can I store the row number the script stopped at before refreshing the page? I thought about creating a file, writing the information into it, and reading it back. That might look like this:
customer_name : Jon
row_ended : 10584
customer_name : Jane
row_ended : 11564
But isn't there an easier and more efficient solution?
You can create a run ID and save it in the session.
Ex.
session_start();
$_SESSION['run']['id'] = 1; // or some unique ID
$_SESSION['run']['user'] = 'jon';
$_SESSION['run']['lastRow'] = 0;

$deadline = time() + 160; // stop a few seconds before the 165s limit

// ... then, inside the export loop ...
if (time() >= $deadline) {
    // 160 seconds have passed: save progress and redirect to the same page
    $_SESSION['run']['lastRow'] = 100000; // the row you stopped at
    header("Location: page.php");
    exit;
}
But this will not really solve the problem; it can turn into redirect hell.
You can try to increase the max execution time at runtime:
ini_set('max_execution_time', 0); // will run forever
Or, the best solution: run it as a shell command with max_execution_time = 0, since users may navigate away from the page if it takes too long in the browser.
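A minimal sketch of that shell approach (the file name export.php and the argument handling are assumptions; the web version reads the customer name from $_POST instead):

<?php
// export.php - a hypothetical CLI version of the script;
// $argv[1] replaces the $_POST customer name
set_time_limit(0); // no time limit (already the default on the CLI)

$customer = isset($argv[1]) ? $argv[1] : null;
if ($customer === null) {
    fwrite(STDERR, "Usage: php export.php <customer_name>\n");
    exit(1);
}

// ... run the full 120k-row export for $customer here, no refresh needed ...

You would invoke it as, e.g., php export.php Jon from SSH or a cron job; the CLI SAPI has max_execution_time = 0 by default, so the 165-second web limit typically no longer applies.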
Related
I have a products database that synchronizes with product data every morning.
The process is very clear:
Get all products from the database by query
Loop through all products, and get an XML from the other server by product_id
Update the data from the XML
Log the changes to a file.
If I query a low number of items, limiting it to 500 random products for example, everything goes fine. But when I query all products, my script SOMETIMES goes on the fritz and starts looping multiple times. Hours later I still see my log file growing and products being added.
I checked everything I could think of, for example:
Are variables used twice, overwriting each other? No.
Does the function call itself? No.
Does it happen with a low number of products too? No.
The script is called by a cronjob; are the settings OK? Yes.
What makes it especially weird is that it sometimes goes right and sometimes it doesn't. Could this be some memory problem?
EDIT
wget -q -O /dev/null http://example.eu/xxxxx/cron.php?operation=sync
It's set up in Webmin, called at a specific hour and minute.
Code is hundreds of lines long...
Thanks
You have:
max_execution_time disabled. Your script won't end until the process is complete, however long that takes.
memory_limit disabled. There is no limit to how much data can be stored in memory.
500 records completed without issues. This indicates that the script finishes its work before the next cronjob iteration. For example, if your cron runs every hour, the 500 records are processed in less than an hour.
If you have a cronjob that is going to process a large number of records, consider adding a lock mechanism to the process: only allow the script to run once, and start again only when the previous run is complete.
You can create a script lock as part of a shell script before executing your PHP script (see the flock sketch after the class below). Or, if you don't have that kind of access to your server, you can use a database lock within the PHP script, something like this:
class ProductCronJob
{
    protected $lockValue;

    public function run()
    {
        // Obtain a lock
        if ($this->obtainLock()) {
            // Run your script if you hold a valid lock
            $this->syncProducts();
            // Release the lock on completion
            $this->releaseLock();
        }
    }

    protected function syncProducts()
    {
        // your long-running script
    }

    protected function obtainLock()
    {
        $time = new \DateTime;
        $timestamp = $time->getTimestamp();
        $this->lockValue = $timestamp . '_syncProducts';

        $db = JFactory::getDbo();

        $lock = [
            'lock'         => $this->lockValue,
            'timemodified' => $timestamp
        ];

        // lock = '0' indicates that the cronjob is not active.
        // UPDATE #__cronlock SET lock = '', timemodified = '' WHERE name = 'syncProducts' AND lock = '0'
        // $result = $db->updateObject('#__cronlock', $lock, 'id');

        // $lock = SELECT * FROM #__cronlock WHERE name = 'syncProducts';

        if ($lock !== false && (string)$lock !== (string)$this->lockValue) {
            // There is currently an active process - can't start a new one
            return false;

            // You can return false as above or add extra logic as below:
            // Check the current lock's age - how long it has been running for
            // $diff = $timestamp - $lock['timemodified'];
            // if ($diff >= 25200) {
            //     // The current script has been active for 7 hours.
            //     // You can change 25200 to any number of seconds you want.
            //     // Here you can send a notification email to the site administrator.
            //     // ...
            // }
        }

        return true;
    }

    protected function releaseLock()
    {
        // UPDATE #__cronlock SET lock = '0' WHERE name = 'syncProducts'
    }
}
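For the shell-script lock mentioned above, flock(1) is one common approach (a minimal sketch; the lock-file path and script path are assumptions):

#!/bin/sh
# flock(1) holds an exclusive lock on the lock file; -n makes a second
# invocation exit immediately instead of waiting for the first to finish
flock -n /tmp/syncProducts.lock php /path/to/cron.php || echo "previous sync still running"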
Your script runs for quite some time (~45 min), and wget thinks it's "timing out" since you don't return any data. By default wget has a 900 s timeout and a retry count of 20, so first you should probably change your wget command to prevent this:
wget --tries=0 --timeout=0 -q -O /dev/null http://example.eu/xxxxx/cron.php?operation=sync
Now, removing the timeout could lead to other issues, so instead you could send some data from your script (and flush() it, to force the web server to actually send it) so that wget doesn't think the script timed out; say, something every 1000 loops. Think of it as a progress bar...
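A minimal sketch of that keep-alive output (the loop and variable names are assumptions based on the snippets in the question):

$i = 0;
foreach ($productsToSync as $product) {
    // ... sync this one product ...

    if (++$i % 1000 === 0) {
        echo '.';          // any output counts as activity for wget
        if (ob_get_level() > 0) {
            ob_flush();    // flush PHP's own output buffer, if one is active
        }
        flush();           // ask the web server to push the bytes out now
    }
}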
Just keep in mind that you will hit an issue once the run time gets close to your cron period, as two crons will then run in parallel. You should optimize your process and/or add a lock mechanism, maybe?
I see two possibilities:
- cron calls the script much more often
- the script somehow takes too long
You can try to estimate the time a single iteration of the loop takes; this can be done with time(). Perhaps the result is surprising, perhaps not. You can probably get the number of results too; multiply the two and you have an estimate of how long the whole process should take.
$productsToSync = $db->loadObjectList();
and
foreach ($productsToSync AS $product) {
It seems you load every result into an array. This won't work for huge databases, because obviously a million rows won't fit in memory; you should fetch one result at a time. With MySQL there are methods that fetch just one row at a time from the resource; I hope your database layer allows the same.
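With mysqli, for example, an unbuffered query leaves the result set on the server and hands you one row at a time (a minimal sketch; the connection details and table name are assumptions, and the question's $db wrapper may offer an equivalent):

// MYSQLI_USE_RESULT streams rows instead of buffering the whole result in memory
$mysqli = new mysqli('localhost', 'user', 'pass', 'shop'); // hypothetical credentials
$result = $mysqli->query('SELECT * FROM products', MYSQLI_USE_RESULT);

while ($product = $result->fetch_assoc()) {
    // ... process one product at a time; memory use stays flat ...
}
$result->close();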
I also see that you execute another query on each iteration of the loop. This is something I try to avoid. Perhaps you can move it to after the first query has finished and do all of those in one big query? On the other hand, this may work against my first suggestion.
Also, if something goes wrong, be paranoid when debugging: measure as much as you can, and time as much as you can when it's a performance issue. Put the timings in your log file; usually you will find the bottleneck there.
I solved the problem myself. Thanks for all the replies!
My MySQL connection timed out; that was the problem. As soon as I added:
ini_set('mysql.connect_timeout', 14400);
ini_set('default_socket_timeout', 14400);
to my script, the problem stopped. I really hope this helps someone. I'll upvote all the locking answers, because those were very helpful!
So, I'm working on a time-sensitive website in PHP on my CentOS server. I pick a random time in the future, within 24 hours of the present. At that point, a PHP file needs to execute, a new date to be selected, and the same file run again. How can I accomplish this? I looked briefly at cronjobs, but I couldn't find a way to make them run at a specific, random time.
You can use the at command: run your PHP file and, at the end, register another call to at for the next time. Something like this:
<?php
// your PHP code in here; then work out when the next call should happen
$time = date('H:i', intval($time)); // intval() (or escapeshellarg()) keeps the value safe to use as a shell argument
$run_me = "/usr/bin/env php " . __FILE__;
exec("echo '$run_me' | at '$time'");
One possible workaround is to run a script from a cron job, say, every 10 minutes. At the top of the script, check a specific file which is supposed to contain a timestamp. If the current time is greater than the value from the file, do the job and write the new timestamp value into the file:
$time_to_run = intval(file_get_contents('my.timestamp'));
if (time() >= $time_to_run) {
    // do stuff
    file_put_contents('my.timestamp', time() + rand(0, 86400)); // next random time within 24 hours
}
If you need more granularity, a better option would be to run it as a daemon (see the advice here) and just loop forever (probably with some sleep() inside) until the time comes.
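A minimal sketch of that daemon loop (doStuff() is a hypothetical stand-in for the actual job):

// run with: php daemon.php (e.g. under nohup or a process supervisor)
while (true) {
    $next = time() + rand(0, 86400); // pick a random moment within the next 24 hours
    time_sleep_until($next);         // sleep until that timestamp
    doStuff();                       // hypothetical: the work you need to run
}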
I'm currently working on a small project where some data is gathered from the web and the system creates some relations between the pieces. Of course it was not perfect from the beginning, so I needed to write a script which updates all the connections and relations with the updated scripts I made.
Basically the script works, but as there is supposed to be a nice-looking backend afterwards, it's not really what I want.
The script needs around 10 minutes, and because I didn't just want to raise PHP's max_execution_time, I thought of another method: instead of loading 1000 SQL entries at once, I stripped it down to 200 at a time and simply repeat with the next 200 once the first round has finished. For this I used PEAR's HTTP_Request. Here is a stripped-down version of the script:
require_once 'HTTP/Request.php';

$max = $db->query("SELECT COUNT(id) as max FROM db_table");
$lower = $_POST['lower'] ? $_POST['lower'] : 0;
$plus = 250;

$entries = $db->query("SELECT * FROM db_table LIMIT {$lower},{$plus}");
foreach ($entries as $entry) {
    // do some stuff to update the relations between the data
}

$lower = $lower + $plus;
if ($lower <= $max) {
    $request = new HTTP_Request("path to the script");
    $request->setMethod(HTTP_REQUEST_METHOD_POST);
    $request->addPostData("lower", $lower);
    $result = $request->sendRequest();
}
This is it. As I said, it works, because each round is a new request and therefore not affected by max_execution_time. But the browser just loads and loads and loads, and after a while it finishes; of course I cannot show any refreshed data for something like a progress bar in the meantime.
I saw many posts using PHP's flush(), but that didn't work for me, because of the (I guess) clumsy way I solved my problem.
How would you do this if you need to install something on a webspace where you don't have the possibility to change the max execution time or install HTTP_Request?
As I said, it should look like a progress bar later on. I guess I have to use AJAX, simply pushing the round the script is currently at and updating the progress bar via JavaScript.
Can you help me?
You are still hitting max_execution_time because the page you requested from the web browser is always the active one: it doesn't finish until the nested HTTP_Request calls finish. Try redirecting the page to itself with the lower parameter instead:
header("Location: myscript.php?lower=$lower");
Here is my code:
echo ("<br/>");
if ($APIkbpersec < 30) {
global $SpeedTest;
echo ("Slow speed");
$SpeedTest--;
}
if ($APIkbpersec > 30) {
global $SpeedTest;
echo ("High speed");
$SpeedTest++;
}
echo $SpeedTest;
The page this code is in gets reloaded every second with AJAX, and $APIkbpersec varies between 40 and 0.
I basically want a variable ($SpeedTest) to increase or decrease depending on what $APIkbpersec is:
If $APIkbpersec is less than 30, I want $SpeedTest to decrease by 1 every refresh, down to a minimum of 0.
If $APIkbpersec is greater than 30, I want $SpeedTest to increase by 1 every refresh, up to a maximum of 10.
The problem is, I don't know what the problem is... I'm currently trying to write $SpeedTest to a txt file so I can read it back in on every refresh and do the maths on it without it being reset by PHP.
Any help would be appreciated.
It's being reset because HTTP requests are stateless. Each AJAX call is an isolated event to a PHP script. To make the variable persist, it has to be stored somewhere, such as in $_SESSION.
You have not shown the code you're using to write it to a text file, but unless you need it to persist beyond a user session, that's the wrong approach; you're better served using $_SESSION. If you do need long-term persistence, use a database instead.
session_start();

// Initialize the variable if it doesn't exist yet
if (!isset($_SESSION['SpeedTest'])) {
    $_SESSION['SpeedTest'] = 0;
}

echo ("<br/>");
if ($APIkbpersec < 30) {
    echo ("Slow speed");
    $_SESSION['SpeedTest']--;
}
if ($APIkbpersec > 30) {
    echo ("High speed");
    $_SESSION['SpeedTest']++;
}

echo $_SESSION['SpeedTest'];
You should use $_SESSION for that purpose.
See HERE for an explanation, but basically you would need to do the following:
session_start();

$SpeedTest = isset($_SESSION['speedTest']) ? $_SESSION['speedTest'] : 0;

if ($APIkbpersec < 30)
{
    echo ("Slow speed");
    $SpeedTest--;
}

if ($APIkbpersec > 30)
{
    echo ("High speed");
    $SpeedTest++;
}

$_SESSION['speedTest'] = $SpeedTest;
echo $SpeedTest;
Either:
Return $SpeedTest in the response and pass it back and forth.
Use some kind of persistent storage such as a cookie or PHP sessions.
Both are pretty easy to implement. If you go with persistent storage, I'd suggest a cookie, as both JS and PHP can share it. Sessions, although the obvious candidate, are a bit overkill in this case, IMO.
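A minimal cookie-based sketch (the cookie name is an assumption; note the clamping to 0..10 that the question asks for, and that setcookie() must be called before any output):

$speedTest = isset($_COOKIE['SpeedTest']) ? (int)$_COOKIE['SpeedTest'] : 0;

if ($APIkbpersec < 30) {
    $speedTest = max(0, $speedTest - 1);  // floor of 0
} elseif ($APIkbpersec > 30) {
    $speedTest = min(10, $speedTest + 1); // ceiling of 10
}

setcookie('SpeedTest', (string)$speedTest); // headers go out before any body output
echo $speedTest;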
If this is all of your code, the problem is simple: each time the script runs, the values of all the variables are initialized, so the value of $SpeedTest does not persist; it's reset to zero each time the script is called. You can use a session as @Michael suggests (probably my recommendation), read the value from a text file or database and then write a new value back out, or return the value of $SpeedTest to your AJAX script and pass it back into the PHP script as a parameter. Each of these has various advantages and disadvantages, but using the $_SESSION superglobal is easy to do and takes little modification to your code.
If you want to do it with files, you can use a single file to store a single global value for your variable.
Read data from the file (docs here):
$data = file_get_contents('file.txt');
Put data into the file (docs here):
$bytesWritten = file_put_contents('file.txt', $data);
Otherwise you can use sessions or a database, as others suggested.
Without cookies or sessions you cannot have a real "per user" solution, so if you need that, stick with the other answers or use a hybrid solution combining sessions with files.
If you use the request solution (that kind of ping-pong with POST or GET variables), always pay attention, because those variables can be altered by users.
Other things to remember:
Files and database records last until you delete them (so you may have to manage leftover files or records).
Session duration is configured on your server (so sessions can be too short-lived if you need long-term persistence).
Usually databases are better than files (they do more and give your application more scalability), but in some cases the file solution is faster (tested), especially if your database resides on another host rather than on the same host as your web server.
OK, here is my problem.
I have a file which outputs XML based on an input X.
I have another file which calls the above file (1) 10000 times (I mean many), with different numbers for X.
When a user clicks "Go", it should go through all those 10000 Xs and simultaneously show a progress indicator of how many are done (hmm, maybe updated once every 10 sec).
How do I do it? I need ideas. I know how to AJAX and stuff, but what structure should my program take?
EDIT
Following the answer given below, I stored my output in a session variable, which is then output. What happens is: when I execute a loong script, it finishes within, say, 1 minute. But if, in the meantime, I open (in a new window) just the file which outputs my SESSION variable, it doesn't output anything until the first script has finished. Which is the complete opposite of what I want. What's the problem here? Is it my system/server that can't handle multiple requests, or what?
EDIT 2
I use the files approach. To read what I want:
<?php
include_once '../includeTop.php';
echo util::readFromLog("../../Files/progressData.tmp");
?>
and in another script:
$processed++;
util::writeToLog($dir.'/progressData.tmp', "Files processed: $processed");
where the functions are:
public static function writeToLog($file, $data) {
    $f = fopen($file, "w");
    fwrite($f, $data);
    fclose($f);
}

public static function readFromLog($file) {
    return file_get_contents($file);
}
But the same problem still persists :(. I can see the file being updated manually, like 1, 2, 3, etc. But when I try to read it from PHP, the request just waits until my original script has produced its output.
EDIT 3
OK, I finally found the solution. Instead of fetching the output through the PHP file, I now go straight to the log file and read it from there.
Put the progress (i.e. how far you are into the 2nd file) into memcached directly from the background job, then deliver that value when it is requested by the JavaScript application (triggered by a timer, for as long as you haven't reached 100%). The only thing you need to figure out is how to pass some sort of "transaction ID" to both the background job and the JavaScript side, so they access the same key in memcached.
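A minimal sketch with the Memcached extension ($txnId stands in for that shared transaction ID; the key name and server address are assumptions):

$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

// in the background job: update the key as items complete
$mc->set("progress_$txnId", (int)($done / $total * 100));

// in the ajax endpoint polled by the javascript timer:
echo $mc->get("progress_$txnId");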
Edit: I was wrong about $_SESSION. It doesn't update asynchronously, i.e. the values you store in it are not accessible to other requests until the script has finished. Whoops.
So the progress needs to be stored somewhere that does update asynchronously: memory (as pyroscope suggests, which is still the best solution), a file, or the database.
In other words, instead of using $_SESSION to store the value, store it in memcached, in a file, or in the database.
I.e. using the database
$progress = 0;
mysql_query("INSERT INTO `progress` (`id`, `progress`) VALUES ($uid, $progress)");
# loop starts
# processing...
$progress += $some_increment;
mysql_query("UPDATE `progress` SET `progress`=$progress WHERE `id`=$uid");
# loop ends
Or using a file
$progress = 0;
file_put_contents("/path/to/progress_files/$uid", $progress);
# loop starts
# processing...
$progress += $some_increment;
file_put_contents("/path/to/progress_files/$uid", $progress);
# loop ends
Then read the file / select from the database when the progress is requested via AJAX. It's not a pretty solution compared to memcached, though.
Also, remember to remove the file or database row once it's all done.
You could put the progress in a $_SESSION variable (you'll need a unique name for it) and update it while the process runs. Meanwhile, your AJAX request simply reads that variable at a specific interval:
function heavy_process($input, $uid) {
    $_SESSION[$uid] = 0;
    # loop begins
    # processing...
    $_SESSION[$uid] += $some_increment;
    # loop ends
}
Then have a URL that simply spits out the $_SESSION[$uid] value when it's requested via AJAX, and use the returned value to update the progress bar. Use something like sha1(microtime()) to create the $uid.
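That URL could be as small as this (a sketch; the uid query parameter name is an assumption):

<?php
// progress.php - polled via ajax; returns the current counter for this run
session_start();
$uid = isset($_GET['uid']) ? $_GET['uid'] : '';
echo isset($_SESSION[$uid]) ? (int)$_SESSION[$uid] : 0;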
Edit: pyroscope's solution is technically better, but if you don't have a server with memcached or the ability to run background processes, you can use $_SESSION instead.