I'm learning php and I'd like to write a simple forum monitor, but I came to a problem. How do I write a script that downloads a file regularly? When the page is loaded, the php is executed just once, and if I put it into a loop, it would all have to be ran before the page is finished loading. But I want to, say, download a file every minute and make a notification on the page when the file changes. How do I do this?
Typically, you'll act in two steps :
First, you'll have a PHP script that will run every minute -- using the crontab
This script will do the heavy job : downloading and parsing the page
And storing some information in a shared location -- a database, typically
Then, your webpages will only have to check in that shared location (database) if the information is there.
This way, your webpages will always work :
Even if there are many users, only the cronjob will download the page
And even if the cronjob doesn't work for a while, the webpage will work ; worst possible thing is some information being out-dated.
Others have already suggested using a periodic cron script, which I'd say is probably the better option, though as Paul mentions, it depends upon your use case.
However, I just wanted to address your question directly, which is to say, how does a daemon in PHP work? The answer is that it works in the same way as a daemon in any other language - you start a process which doesn't end immediately, and put it into the background. That process then polls files or accepts socket connections or somesuch, and in so doing, accepts some work to do.
(This is obviously a somewhat simplified overview, and of course you'd typically need to have mechanisms in place for process management, signalling the service to shut down gracefully, and perhaps integration into the operating system's daemon management, etc. but the basics are pretty much the same.)
How do I write a script that downloads
a file regularly?
there are shedulers to do that, like 'cron' on linux (or unix)
When the page is loaded, the php is
executed just once,
just once, just like the index.php of your site....
If you want to update a page which is show in a browser than you should use some form of AJAX,
if you want something else than your question is not clear to /me......
Related
i started to learn programming like a month ago. I already knew html and css, i thought i should learn PHP. I learned alot of it from from tutorials and books, now I am making mysql based websites for practice.
I always used to play browser based strategy games like travian when i was a kid. I was thinking about how those sites worked. I didnt have any problem till i realized that the game actually worked after you closed the browser. For example; you log in to your account and start a construction and log off. But even after you close the browser, game knows that in "x" amount of time it needs to update your data of that specific building.
can someone tell me how that works? is it something with php or MySQL or some other programming language? even if you can tell me what to search online, it would be enough.
Despite being someone who loves tackling steep learning curves, I would advise against trying jump into something that requires background processes until you have a bit more programming experience.
But either way, here's what you need to know:
Normal PHP Process
The way that PHP normally works is the following:
User types a url into the browser and hits enter (or just clicks on a link)
Request is sent to a bunch of servers and magically finds its way to the right web server (beyond scope of this answer)
Server program like Apache or IIS listening on port 80 grabs the request
Apache sees that there's a .php extension on the requested page
Apache looks up if any processors have been assigned to .php and finds php.exe
The requested page is fed into php.exe
php.exe starts up a new process for the specific user, runs everything on the script, returns its result
The result is then sent back to the user
When the user closes the browser and ends the "session", the process started by php exits
So the problem you encounter when you want something running in the background is that PHP in most cases is generally accessed through the web server, and hence usually requires a browser (and user making requests through the browser). And since closing the browser ends the process, so you need a way to run php scripts without a browser.
Luckily PHP can be accessed outside of just the webserver as a normal process on the server. But then the problem is that you have to access the server. You probably don't want your users to ssh into your server in order to manually run scripts (and I'm assuming you don't want to do it manually on behalf of your users every single time either). Hence you have the options either creating cronjobs that will automatically execute a command at a specific frequency as if you had typed it in yourself on your server's commandline. Another option is to manually start a script once that doesn't shutdown unless your server shuts down.
Triggering a Script based on Time:
Cron that is a task scheduler on *nix systems and Windows Task Scheduler on Windows. What you can do is set up a cronjob to run a specific php file at a specific frequency, and execute all the "background" tasks you need to run from within there.
One way of doing this would be to have a mysql table containing things that need to be executed along with when they need to be executed. The script then queries the table based on time to retrieve which tasks need to be executed, executes them, and then marks them executed (or just deletes them) in the mysql table.
This is a basic form of process queuing.
Building a Queue Server
This is a lot more advanced, but here's a tutorial for creating a script that will queue processes in the background without the need for any external databases: Building a Queue Server in PHP .
Let me know if this makes sense or if you have any questions :)
PHP is a server side language. Any time anybody accesses a PHP program on the server, it runs, irrespective of who is a client.
So, imagine a program that holds a counter. It stores this in a database. Every time updatecounter.php is called, the counter gets updated by one.
You browse to updatecounter.php, and it tells you that the counter is now at 34.
Next time you browse to updatecounter.php it tells you that the counter is at 53.
Its gone up by 18 more counts than you were expecting.
This is because updatecounter.php was being run without your intervention. It was being run by other people.
Now, if you looked at updatecounter.php, you might see code like this:
require_once("my_code.php);
$counterValue = increment_counter_value();
echo "New Counter Value = ".$counterValue;
Notice that the main core of the program is stored in a separate program than the program that you are calling.
Also, notice that instead of calling increment_counter_value, you could call anything. So every time somebody browsed to updatecounter.php, or whatever your game would be called, the internal game mechanics could be run. You could for instance, have an hourly stat management routine which would check each time it was called if it had been run in the last hour, and if it hadn't it would perform all the stats.
Now, what if nobody else is playing your game? If that happens, then the hourly stat management wouldn't get called, and your game world would die. So what you would need to do is create another program who's sole function is to run your stats. You would then schedule that program on the server to run at an hourly interval. You do this using something called a CRON job. You will probably find that your host already has this facility built in, if you are on Apache. I won't go into any more detail about task scheduling as without knowing your environment its impossible to give the correct answer. But basically, you would need to schedule a PHP program to run on the server to perform the hourly maintenance.
Here's a tutorial on CRON jobs:
http://net.tutsplus.com/tutorials/other/scheduling-tasks-with-cron-jobs/
I haven't used it myself but I've had no problems with other stuff on tutsplus so you should be ok.
This is not only php . Browser based game are combination of php/mysql/javascript/html . There are lot of technologies being used for this kind of work. When you are doing something on the browser, lets say adding a building ,an ajax request is being sent to the server so the server updates the database (can't wait until logout because then other users won't know your status to play (in case of multiparty) .
I want to have my own variable that would be (most likely an array) storing what my php application is up to right now.
The application can trigger few processes that are in background (like downloading files) and I want to have a list what is being currently processed.
For example
if php calls exec() that will be downloading for 15mins
and then another download starts
and another download starts
then if I access my application I want to be able to see that 3 downloads are in process. If none of them finished yet.
Can do that? Only in memory, not storing anything on the disk?
I thought that the solution would be a some kind of server variable.
PHP doesn't have knowledge of previous processes. As soon has a php process is finished everything it knows about itself goes with it.
I can think of two options. Write knowledge about spawned processes to a file or database and use it to sync all your php request, (store the PID of each spawned process)
Or
Create an Daemon. The people behind PHP have worked hard to clean up PHP memory handling and such to make this more feasible. Take a look at their PEAR package - http://pear.php.net/package/System_Daemon
Off the top of my head, a quick architecture would compose of 3 peices
Part A) The web app that will take in request for downloads, and report back the progress of all request
Part B) You daemon, which accepts requests for downloads, spawns process, and will report back status of all spawned reqeust
Part C) The spawn request that will perform the download you need.
Anyone for shared memory?
Obviously you would have to have some sort of daemon, but you could use the inbuilt semaphore functions to easily have contact between each of the scripts. You need to be careful though because sometimes if you're not closing the memory block properly, you could risk ending up with no blocks left.
You can't store your own variables in $_SERVER. The best method would be to store your data in a database where and query/update it as required.
I am developing a website that requires a lot background processes for the site to run. For example, a queue, a video encoder and a few other types of background processes. Currently I have these running as a PHP cli script that contains:
while (true) {
// some code
sleep($someAmountOfSeconds);
}
Ok these work fine and everything but I was thinking of setting these up as a deamon which will give them an actual process id that I can monitor, also I can run them int he background and not have a terminal open all the time.
I would like to know if there is a better way of handling these? I was also thinking about cron jobs but some of these processes need to loop every few seconds.
Any suggestions?
Creating a daemon which you can make calls to and ask questions would seem the sensible option. Depends on wether your hoster permits such things, especially if you're requiring it to do work every few seconds, then definately an OS based service/daemon would seem far more sensible than anything else.
You could create a daemon in PHP, but in my experience this is a lot of hard work and the result is unreliable due to PHP's memory management and error handling.
I had the same problem, I wanted to write my logic in PHP but have it daemonised by a stable program that could restart the PHP script if it failed and so I wrote The Fat Controller.
It's written in C, runs as a daemon and can run PHP scripts, or indeed anything. If the PHP script ends for whatever reason, The Fat Controller will restart it. This means you don't have to take care of daemonising or error recovery - it's all handled for you.
The Fat Controller can also do lots of other things such as parallel processing which is ideal for queue processing, you can read about some potential use cases here:
http://fat-controller.sourceforge.net/use-cases.html
I've done this for 5 years using PHP to run background tasks and its no different to doing in any other language. Just use CRON and lock files. The lock file will prevent multiple instances of your script running.
Also its important to monitor your code and one check I always do to prevent stale lock files from preventing scripts to run is to have second CRON job to check if if the lock file is older than a few minutes and if an instance of the PHP script is running, if not it then removes the lock file.
Using this technique allows you to set your CRON to run the script every minute without issues.
Use the System::Daemon module from PEAR.
One solution (that I really need to try myself, as I may need it) is to use cron, but get the process to loop for five mins or so. Then, get cron to kick it off every five minutes. As one dies, the next one should be finishing (or close to finishing).
Bear in mind that the two may overlap a bit, and so you need to ensure that this doesn't cause a clash (e.g. writing to the same video file). Some simple inter-process communication may be useful, even if it is just writing to a PID file in the temp directory.
This approach is a bit low-tech but helps avoid PHP hanging onto memory over the longer term - sort of in-built task restarts!
As my server is not supporting cron job, I want a file in my server to trigger its action on a particular time every day..
Please let me know whether it possible to do run a script at a particular time from the server side itself without any external act.
I agree with Kel's answer.
You could try out one of the free cronjob services available, if your server doesn't support it.
Online Cronjobs
Set Cronjob
Just the first two found on Google, there's likely to be more if you search a little.
You cannot start script without ANY external act.
If your file server has SSH or HTTP server or something like that, you can configure cron job on another server to start your script via SSH / HTTP / something like that.
Also, you can create PHP script, which would do sleeping in a loop all the time, and wake up and do some job only if current time is near some specific value. You will have to correct maximum execution time for php script (see here for details), and you will have to start your script on server startup. BTW, this does not look like good solution.
As mentioned before, this is not possible literally "without external act".
A nice solution I found in the ThinkUp software (don't know where else this is used) to use a RSS feed reader. From the point of simplicity, this is probably the best option.
The idea is that you use your feed reader to automatically call a script on your site every XX hours (or whatever interval you want). When called, this script executes the maintenance tasks or whatever it is that you want to do.
To make sure that not everybody can run that script and cause your server to break down (I suppose this is a somewhat heavy task), you can use a unique long identifier string appended as URL parameter to make sure that the script only gets called by you.
Other than that, you can use one of the "poor man's" web cron job services that have been suggested in other answers.
if (rand(0,100)==0){
if (!file_exists($tf='tmp/job.crontime') || (time() - filemtime($tf))>(60*60*24)){
... # your tasks
touch($tf);
}
}
This simple & stupid script uses a file to store the time of last job-ecexution. If >60*60*24 has passed — it launches the job code. rand(0,100) should lower the overhead of checking for jobs on each request: 1/100 is the chance of running your jobs.
Put it in the end of your 'index.php'. Don't use in projects with modelate to high load :))
The Great Disadvantage: it won't run if you don't have any visitors.
UPD: Write a script that runs indefinitely and every 30s does touch('tmp/job.crontime') to report it's still alive. It should also check the current time & perform actions.
In index.php, if more than 30s has passed — re-launch the daemon with an HTTP-request. Ugly, but fully functional. You'll also deal with time limits, be careful!
Well, if this is on a public web server and you have enough visits, you could always use those to run code to check for a given value, say hour of day, number of times a file have been accessed (or store your number in a file). Just put your php code on top of a web page.
I'm currently running a Linux based VPS, with 768MB of Ram.
I have an application which collects details of domains and then connect to a service via cURL to retrieve details of the pagerank of these domains.
When I run a check on about 50 domains, it takes the remote page about 3 mins to load with all the results, before the script can parse the details and return it to my script. This causes a problem as nothing else seems to function until the script has finished executing, so users on the site will just get a timer / 'ball of death' while waiting for pages to load.
**(The remote page retrieves the domain details and updates the page by AJAX, but the curl request doesnt (rightfully) return the page until loading is complete.
Can anyone tell me if I'm doing anything obviously wrong, or if there is a better way of doing it. (There can be anything between 10 and 10,000 domains queued, so I need a process that can run in the background without affecting the rest of the site)
Thanks
A more sensible approach would be to "batch process" the domain data via the use of a cron triggered PHP cli script.
As such, once you'd inserted the relevant domains into a database table with a "processed" flag set as false, the background script would then:
Scan the database for domains that aren't marked as processed.
Carry out the CURL lookup, etc.
Update the database record accordingly and mark it as processed.
...
To ensure no overlap with an existing executing batch processing script, you should only invoke the php script every five minutes from cron and (within the PHP script itself) check how long the script has been running at the start of the "scan" stage and exit if its been running for four minutes or longer. (You might want to adjust these figures, but hopefully you can see where I'm going with this.)
By using this approach, you'll be able to leave the background script running indefinitely (as it's invoked via cron, it'll automatically start after reboots, etc.) and simply add domains to the database/review the results of processing, etc. via a separate web front end.
This isn't the ideal solution, but if you need to trigger this process based on a user request, you can add the following at the end of your script.
set_time_limit(0);
flush();
This will allow the PHP script to continue running, but it will return output to the user. But seriously, you should use batch processing. It will give you much more control over what's going on.
Firstly I'm sorry but Im an idiot! :)
I've loaded the site in another browser (FF) and it loads fine.
It seems Chrome puts some sort of lock on a domain when it's waiting for a server response, and I was testing the script manually through a browser.
Thanks for all your help and sorry for wasting your time.
CJ
While I agree with others that you should consider processing these tasks outside of your webserver, in a more controlled manner, I'll offer an explanation for the "server standstill".
If you're using native php sessions, php uses an exclusive locking scheme so only a single php process can deal with a given session id at a time. Having a long running php script which uses sessions can certainly cause this.
You can search for combinations of terms like:
php session concurrency lock session_write_close()
I'm sure its been discussed many times here. I'm too lazy to search for you. Maybe someone else will come along and make an answer with bulleted lists and pretty hyperlinks in exchange for stackoverflow reputation :) But not me :)
good luck.
I'm not sure how your code is structured but you could try using sleep(). That's what I use when batch processing.