I have a PHP script that processes my email subscriptions.
It does something like:
foreach ($emails as $email) {
    $mailer->send($email);
    echo "Email sent to {$email['recipient']}\n";
}
I'm now running into rate limiting from my web host. The mailing library has a built-in throttler that sleeps to keep me under the rate limit. However, this could result in the web page taking multiple hours to actually load.
Will the client side browser ever give up on the page loading? Any suggested better solutions to this?
Why is this being done on a webpage load? This should be an off-line back-end process which is scheduled to run. (Look into cron for scheduling tasks.)
Any long running process should be delegated to a back-end service to handle that process. Application interfaces (such as a web page) should respond back to the user as quickly as possible instead of forcing the user to wait (for upwards of an hour?) for a response.
The application can track the progress of the back-end process, usually by means of some shared data source (a simple database, for example), and present that progress to the user. That's fine. But the process itself should happen outside of the application.
For example, at a high level (a sketch of the worker script follows this list)...
Have a PHP script scheduled to run to process the emails.
When the script starts, save a record to a database indicating that it's started.
Each time the script reaches a milestone of some kind, update the database record to indicate this.
When the script finishes, update the database record to indicate this.
Have a web application which checks for that database record and shows the user the current status of the back-end process.
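A minimal sketch of the scheduled script covering the first four steps, assuming a hypothetical jobs table with id, status, and emails_sent columns and a get_pending_emails() helper (both are illustrative, not part of the original answer):

<?php
// send_emails.php - run by cron, not through the webserver.
$db = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

// Step 2: record that the run has started.
$db->exec("INSERT INTO jobs (status, emails_sent) VALUES ('started', 0)");
$jobId = $db->lastInsertId();

$sent = 0;
foreach (get_pending_emails($db) as $email) {  // hypothetical helper
    $mailer->send($email);                     // your mailing library's send call
    $sent++;
    if ($sent % 100 === 0) {                   // step 3: a milestone every 100 emails
        $db->prepare("UPDATE jobs SET emails_sent = ? WHERE id = ?")
           ->execute(array($sent, $jobId));
    }
}

// Step 4: mark the run as finished.
$db->prepare("UPDATE jobs SET status = 'finished', emails_sent = ? WHERE id = ?")
   ->execute(array($sent, $jobId));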
You may not care, but even if you coerce this script into staying alive, you shouldn't purposely run a long-running script through the webserver. Webservers use resource-heavy threads or processes to run your script, and they have a finite number of them available to serve web requests. A long-running script basically takes one of them out of the pool of processes that can be used to serve web visitors.
Instead, use a cron job which executes the php binary directly. Specifically, do not use wget or lynx or any other web-browser-like program as part of the cron job, because those methods run the script through the webserver. The cron command should include something like
php /full/path/to/the/script.php
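For example, a crontab entry that runs the script every night at 2 a.m. might look like this (the schedule is illustrative):

0 2 * * * php /full/path/to/the/script.php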
Related
I am writing a website with PHP, and there is a part of the code that needs a huge amount of time to execute.
Since I don't use threads, when I run that code the whole server is blocked by it. But that's OK.
However, even though I closed that web page, it still executes and blocks my server. I cannot access any page of my website until the process completes.
Since the execution time is very long, I set a very long set_time_limit() for it, but I don't set ignore_user_abort, so I assumed it should not keep running after the user aborts. Or is it a problem with curl (the code does many curl jobs)?
Can someone tell me why the PHP script does not stop when the user closes the connection? Or is there some way to ensure the script stops when the user aborts?
Thanks.
Closing the browser doesn't tell the server to stop doing something. It doesn't tell the server anything.
Long-running processes don't belong in web applications. Generally you would want some background task to handle the process. Either the web application would spawn this task (this seems like a workable approach) or would in some way queue the processing of this task where a background worker would see that queue (such as a database table polled every X minutes by a daemon process).
The goal is to not block the UI while the task is running. Even if the user were to leave the browser open, the browser itself may "give up" after a while, or something else could sever the connection while the user waits too long. Let the user invoke the process, but separate the invocation of the process from the execution of the process so the user can return to the application interface.
From the PHP Manual:
PHP will not detect that the user has aborted the connection until an attempt is made to send information to the client.
Thus, even when using ignore_user_abort, you must try to send something to the client again inside the script to ensure the abort is detected correctly. Note that on the page in question there are some additional notes about what constitutes 'sending information' (for example, an echo doesn't qualify by itself, apparently).
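A rough illustration of that note (a sketch; $urls and do_curl_job() are hypothetical stand-ins for the questioner's curl work):

<?php
set_time_limit(0);               // allow the long run, as in the question

foreach ($urls as $url) {
    do_curl_job($url);           // hypothetical long-running curl step
    echo ' ';                    // try to send something to the client...
    flush();                     // ...since an echo alone doesn't qualify
    if (connection_aborted()) {  // PHP can now see the closed connection
        exit;
    }
}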
Further Reading:
http://php.net/manual/en/function.ignore-user-abort.php
First of all, sorry to post a question that seems to have been flogged to death on SO before. However, none of the questions I have reviewed helped me solve my specific problem.
I have built a web application that runs an extensive data processing routine in PHP (i.e. MySQL queries, calculations, etc.).
Depending on the amount of data fed to the app this processing can take quite a long time so the script needs to run server-side and independently from the web front-end.
There is a problem, however. It seems I cannot control the script execution time limit as long as the script is invoked via cgi.
When I run the script via SSH and the command line it works fine for however long it takes to process the data.
But if I use the exec() command in a PHP script called via the webserver, I always end up with the error "End of script output before headers" after approximately 45 seconds.
Rather than having to fiddle with server settings (a nightmare in terms of portability) I would like to find a solution that kicks off the script independently from cgi.
Any suggestions?
Don't execute the long script directly from the website (AKA, directly from Apache) because, as you've mentioned, it will block until it finishes and potentially time out. Instead, use the website to schedule a job (an execution of the long script) to be run immediately.
Here is a basic outline of how you can potentially do this:
Create a new, small database to store job requests, with fields such as job_id, processing_status, run_start_time, and anything else relevant.
Create some Ajax that hits your server and writes a "job request" to this jobs database, set to execute immediately.
Add a crontab script or bot that periodically watches for new jobs. If it finds a job that has not yet been processed but has passed its run_start_time, run it using exec() or some other command executor (see the sketch after this list). This way the command won't time out, because it is not being run by Apache but by the cron daemon.
When the command finishes, update the jobs database saying that processing is finished.
From your website, write a frontend that allows the user to see if the requested job is finished yet. Once it finishes, it displays some kind of "Done" indicator or something similar.
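A minimal sketch of the watcher from step 3, assuming the jobs table described above (the table and column names, and the long script's path, are illustrative):

<?php
// job_runner.php - invoked by cron, e.g. every minute.
$db = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

$jobs = $db->query(
    "SELECT job_id FROM job_requests
     WHERE processing_status = 'pending' AND run_start_time <= NOW()"
);

foreach ($jobs as $job) {
    $id = (int) $job['job_id'];
    $db->exec("UPDATE job_requests SET processing_status = 'running' WHERE job_id = $id");

    // Run the long script outside Apache; the cron daemon owns this process.
    exec("php /path/to/long_script.php $id");

    $db->exec("UPDATE job_requests SET processing_status = 'finished' WHERE job_id = $id");
}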
I have this scenario:
A user submits a link to my PHP website and closes the browser. Now that the server has the link, it will analyse the submitted link (page) for broken links, and after it has completely analysed the posted link, it will send an email to the user. I have a complete understanding of the second part, i.e. how to analyse the page for broken links and send the mail to the user. The only problem I have is the first part: how do I make the server keep running these actions on its own, even if there is no request made by the client?
I have learned that "crontab" or a "fork" may work for me. What do you say about these? Is it possible to achieve what I want using them? What are the alternatives?
crontab would be the way to go for something like this.
Essentially you have two applications:
A web site where users submit data to a database.
An offline script, scheduled to run via cron, which checks for records in the database and performs the analysis, sending notifications of the results when complete.
Both of these applications share the same database, but are otherwise oblivious to each other.
A website itself isn't well suited to this sort of offline work; it's mainly a request/response system. But a scheduled task works for this. Unless the user is expecting an immediate response, a small delay of waiting for the next scheduled run of the offline task is fine.
The server should run the script independently of the browser. Once the request is submitted, the PHP server runs the script and returns the result to the browser (if it has a result to return).
An alternative would be to add the request to a database and then use crontab to run the PHP script at a given interval. The script would then check the database to see if there's anything that needs to be processed. You could limit the script to processing one database entry every minute (or whatever works). This will help prevent performance problems if you have a lot of requests at once, but will be slower to send the email.
A typical approach would be to enter the link into a database when the user submits it. You would then use a cron job to execute a script periodically, which will process any pending links.
Exactly how to set up a cron job (or equivalent scheduled task) depends on your server. If you have a host which provides a web-based admin tool (such as cPanel), there will often be a way to do it in there.
A PHP script will keep running after the client closes the browser (terminating the connection).
Just keep in mind that a PHP script's maximum execution time is limited by the max_execution_time directive's value.
Of course, here I assume the link submission happens by calling your script page... I'm not sure if this is your use case...
For the sake of simplicity, a cronjob could do wonders. A user submits a link, and the web handler simply saves the link into a DB (let me pretend here that the table is named "queued_links"). Then a cronjob scheduled to run each minute (for example) selects every link from queued_links, does the application logic (finds broken page links), and sends the email. It then also deletes the link from queued_links (or updates a flag to record that the link has already been processed).
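A minimal sketch of that cronjob, assuming the queued_links table above (the column names and the check_links()/send_report() helpers are illustrative):

<?php
// process_links.php - scheduled via cron to run every minute.
$db = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

$links = $db->query("SELECT id, url, email FROM queued_links WHERE processed = 0");

foreach ($links as $link) {
    $broken = check_links($link['url']);    // hypothetical: find the broken links
    send_report($link['email'], $broken);   // hypothetical: email the results
    $db->exec("UPDATE queued_links SET processed = 1 WHERE id = {$link['id']}");
}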
For the sake of scale and speed, a cronjob wouldn't fit as well as a message queue (see rabbitmq, activemq, gearman, and beanstalkd; gearman and beanstalkd are my favorite two, simple and a good fit with PHP). In lieu of spawning a cronjob every minute, a queue processor listens for 'events' and asynchronously processes them (think 'onLinkSubmission($link)') as soon as they arrive. The cronjob approach is just a simplified implementation of one of these MQ solutions; a real MQ will give better and more predictable results, but at the cost of adding new services to maintain, etc.
Well, there are a couple of ways; the simplest of them would be:
When a user submits a request, save it somewhere (let's call it a jobs table) and inform the customer that the request has been received and that they'll be updated when the site finishes processing it, or whatever suits you.
Now, create one (or multiple) scripts, depending on requirements, and run them from cron; each script picks requests from the jobs table, processes them, and does whatever is required.
Alternatively, you can evaluate the possibility of a message queue, or maybe use a job server for this.
So, it all depends on your requirements.
I started to learn programming about a month ago. I already knew HTML and CSS, and I thought I should learn PHP. I learned a lot of it from tutorials and books, and now I am making MySQL-based websites for practice.
I always used to play browser-based strategy games like Travian when I was a kid, and I was thinking about how those sites worked. I didn't have any problem until I realized that the game actually kept working after you closed the browser. For example: you log in to your account, start a construction, and log off. But even after you close the browser, the game knows that in "x" amount of time it needs to update the data for that specific building.
Can someone tell me how that works? Is it something with PHP or MySQL or some other programming language? Even if you can just tell me what to search for online, that would be enough.
Despite being someone who loves tackling steep learning curves, I would advise against trying to jump into something that requires background processes until you have a bit more programming experience.
But either way, here's what you need to know:
Normal PHP Process
The way that PHP normally works is the following:
User types a url into the browser and hits enter (or just clicks on a link)
Request is sent to a bunch of servers and magically finds its way to the right web server (beyond scope of this answer)
Server program like Apache or IIS listening on port 80 grabs the request
Apache sees that there's a .php extension on the requested page
Apache looks up if any processors have been assigned to .php and finds php.exe
The requested page is fed into php.exe
php.exe starts up a new process for the specific user, runs everything in the script, and returns the result
The result is then sent back to the user
When the user closes the browser and ends the "session", the process started by php exits
So the problem you encounter when you want something running in the background is that PHP is generally accessed through the web server, and hence usually requires a browser (and a user making requests through that browser). Since closing the browser ends the process, you need a way to run PHP scripts without a browser.
Luckily, PHP can be run outside of the webserver as a normal process on the server. But then the problem is that you have to access the server. You probably don't want your users to SSH into your server in order to manually run scripts (and I'm assuming you don't want to do it manually on behalf of your users every single time either). Hence you have the option of creating cronjobs that automatically execute a command at a specific frequency, as if you had typed it in yourself on your server's command line. Another option is to manually start a script once that doesn't shut down unless your server shuts down.
Triggering a Script based on Time:
Cron is a task scheduler on *nix systems; Windows has the Windows Task Scheduler. What you can do is set up a cronjob to run a specific PHP file at a specific frequency, and execute all the "background" tasks you need from within there.
One way of doing this would be to have a mysql table containing things that need to be executed along with when they need to be executed. The script then queries the table based on time to retrieve which tasks need to be executed, executes them, and then marks them executed (or just deletes them) in the mysql table.
This is a basic form of process queuing.
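For the browser-game example, a minimal cron-driven version of that queue might look like this (the table and column names are illustrative):

<?php
// tick.php - run by cron every minute to finish any due construction jobs.
$db = new PDO('mysql:host=localhost;dbname=game', 'user', 'pass');

// Find constructions whose finish time has passed but aren't marked done.
$due = $db->query(
    "SELECT id, player_id, building FROM constructions
     WHERE finishes_at <= NOW() AND done = 0"
);

foreach ($due as $job) {
    // Apply the game effect, e.g. raise the building's level by one.
    $db->exec("UPDATE buildings SET level = level + 1
               WHERE player_id = {$job['player_id']} AND type = '{$job['building']}'");
    $db->exec("UPDATE constructions SET done = 1 WHERE id = {$job['id']}");
}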
Building a Queue Server
This is a lot more advanced, but here's a tutorial for creating a script that will queue processes in the background without the need for any external databases: Building a Queue Server in PHP.
Let me know if this makes sense or if you have any questions :)
PHP is a server-side language. Any time anybody accesses a PHP program on the server, it runs, irrespective of who the client is.
So, imagine a program that holds a counter. It stores this in a database. Every time updatecounter.php is called, the counter gets updated by one.
You browse to updatecounter.php, and it tells you that the counter is now at 34.
Next time you browse to updatecounter.php it tells you that the counter is at 53.
It's gone up by 18 more counts than you were expecting.
This is because updatecounter.php was being run without your intervention. It was being run by other people.
Now, if you looked at updatecounter.php, you might see code like this:
require_once("my_code.php);
$counterValue = increment_counter_value();
echo "New Counter Value = ".$counterValue;
Notice that the main core of the program is stored in a separate program than the program that you are calling.
Also, notice that instead of calling increment_counter_value, you could call anything. So every time somebody browsed to updatecounter.php, or whatever your game would be called, the internal game mechanics could be run. You could, for instance, have an hourly stat management routine which would check, each time it was called, whether it had been run in the last hour, and if it hadn't, it would perform all the stats.
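A minimal sketch of that "run it if it's due" check (the last_run table and the run_hourly_stats() routine are illustrative):

<?php
// At the top of updatecounter.php (or any frequently visited page).
$db = new PDO('mysql:host=localhost;dbname=game', 'user', 'pass');

$lastRun = $db->query("SELECT UNIX_TIMESTAMP(ran_at) FROM last_run WHERE task = 'stats'")
              ->fetchColumn();

if ($lastRun === false || time() - $lastRun >= 3600) {  // an hour has passed
    run_hourly_stats($db);                              // hypothetical routine
    $db->exec("REPLACE INTO last_run (task, ran_at) VALUES ('stats', NOW())");
}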
Now, what if nobody else is playing your game? If that happens, then the hourly stat management wouldn't get called, and your game world would die. So what you would need to do is create another program whose sole function is to run your stats. You would then schedule that program on the server to run at an hourly interval. You do this using something called a CRON job. You will probably find that your host already has this facility built in, if you are on Apache. I won't go into any more detail about task scheduling, as without knowing your environment it's impossible to give the correct answer. But basically, you would need to schedule a PHP program to run on the server to perform the hourly maintenance.
Here's a tutorial on CRON jobs:
http://net.tutsplus.com/tutorials/other/scheduling-tasks-with-cron-jobs/
I haven't used it myself but I've had no problems with other stuff on tutsplus so you should be ok.
This is not only PHP. Browser-based games are a combination of PHP/MySQL/JavaScript/HTML; there are a lot of technologies used for this kind of work. When you do something in the browser, let's say adding a building, an Ajax request is sent to the server so the server can update the database. (It can't wait until logout, because then other players wouldn't see your current status in a multiplayer game.)
I have a simple messaging queue set up and running using the Zend_Queue object hierarchy. I'm using a Zend_Queue_Adapter_Db back-end. I'm interested in using this as a job queue, to schedule things for processing at a later time. They're jobs that don't need to happen immediately, but should happen sooner rather than later.
Is there a best-practices/standard way to set up your infrastructure to run jobs? I understand the code for receiving a message from the queue, but what's not so clear to me is how to run the program that does the receiving. A cron that receives n messages on the command line, run once a minute? A cron that fires off multiple web requests, each web request running the receiver script? Something else?
Tangential bonus question. If I'm running other queries with Zend_Db, will the message queue queries be considered part of that transaction?
You can do it like a thread pool. Create a command line php script to handle the receiving. It should be started by a shell script that automatically restarts the process if it dies. The shell script should not start the process if it is already running (use a $pid.running file or similar). Have cron run several of these every 1-10 minutes. That should handle the receiving nicely.
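A minimal sketch of that command-line receiver, using Zend_Queue's receive/delete calls (the queue name, connection options, and the process_job() handler are illustrative):

<?php
// worker.php - started by the shell script; receives a batch, then exits.
require_once 'Zend/Queue.php';

$queue = new Zend_Queue('Db', array(
    'name'          => 'jobs',
    'driverOptions' => array(
        'host'     => 'localhost',
        'username' => 'user',
        'password' => 'pass',
        'dbname'   => 'app',
        'type'     => 'pdo_mysql',
    ),
));

foreach ($queue->receive(10) as $message) {  // grab up to 10 messages
    process_job($message->body);             // hypothetical job handler
    $queue->deleteMessage($message);         // remove it from the queue
}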
I wouldn't have the cron fire a web request unless your cron is on another server for some strange reason.
Another way to use this would be to have some background process creating data, and web user(s) consume it as they naturally browse the site. A report generator might work this way. Company-wide reports are available to all users, but you don't want them all generating this db/time-intensive report. So you create a queue and process one at a time, possibly removing duplicates. All users can view the report(s) when ready.
According to the docs, it doesn't look like Zend_Queue is even using the same connection as your other Zend_Db queries. But of course the best way to find out is to make a simple test.
EDIT
The multiple lines in the cron are for concurrency; each line represents a worker for the pool. I was not clear earlier: you don't want the pid as the identifier, you want to pass an identifier as a parameter.
/home/byron/run_queue.sh Process1
/home/byron/run_queue.sh Process2
/home/byron/run_queue.sh Process3
The bash script would check for the $process.running file; if it finds it, exit.
otherwise:
Create the $process.running file.
start the php process. Block/wait until finished.
Delete the $process.running file.
This allows the PHP script to die without causing the pool to lose a worker.
If the queue is empty, the PHP script exits immediately and is started again by the next invocation of cron.
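Putting those steps together, run_queue.sh might look something like this (a sketch; the worker script's path and name are illustrative):

#!/bin/bash
# Usage: run_queue.sh Process1
PROCESS="$1"
LOCK="/tmp/$PROCESS.running"

# Another copy of this worker is still alive; do nothing.
[ -f "$LOCK" ] && exit 0

touch "$LOCK"
php /home/byron/worker.php "$PROCESS"   # block/wait until the receiver finishes
rm -f "$LOCK"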