A few years ago i made am webbapp with ajax and php, i really loved the polling mechanism.
Now i`m working on a streamingserver which is controlled by php. The client makes a call to the php to start en stop streaming. These are calls the clients initialises, now i want to do some "server-maintenace". (which is not initialized by a client)
What i'm trying to do is creating a script thats check for clients, the script needs to loop and query`s the database for the clients table every 20 seconds,
is this possible with using PHP only? can you give me some tricks and tips?
help is appreciated
If you want it to be under a minute you'll need a long running PHP process. If you can live w/ every minute, a cron job will suffice. Another option would be to run the check as part of requests as they came into the webserver for other things as part of your application.
Related
I need to write a cron (php) script to get the html result from several websites.
Let's say my database has 50 websites records (ie. http://www.somewebsite.com/page.php). So the cron job will be set to run every x mins. When it's running, it will load the records from database, check the status of each websites then get the HTML result back from it then analysis it.
My concern is, if the website from n'th record is not responding, or it takes some time to load (ie. oversea website), then n+1's record won't be run, not until n'th record is finished, then this cron job will take a while to finish.
If I execute the script on a browser, then it can be easily handled by using ajax async, however it's a cron job, so I have no idea how to handle this situation.
Here is what you can do,
Run the sh script from crontab and in the script invoke .php program that takes care of async tasks.
I think its better to move to other language if there is need of 'Async'. It's a major upgrade to how you would create the overall system architecture. So this is a major upgrade to any PHP developer as myself spending a lot in PHP and looking into NodeJS for better solutions. And PHP does not support such need internally or from the core yet, although the term 'Aync' is introduced.
(Our server is Linux based)
I'm an experienced PHP developer but first time i'll develop a bot which always running and fetch some datas.
I'll explain my application with a simple (and sample) scenario. I have about 2000 web site url and my application will visit this url's and record contents of web page's . This application will work 7 days 24 hours. It will start working again when it's finish 2000 web sites.
But i need some suggestions for my server. As you see, my application will be run infinity until i shut down server. I can do this infinity loop with this :
while(true)
{
APPLICATION CODES HERE
}
But i think this will be an evil for server :) Is it possible to doing something like this, on server side?
Also i think using cronjobs but it's not work for my scenario. Because my script start working again asap it's finish working. I have to "start again when you finish your work" , not "start every 30 minutes" . Because i don't know, maybe fetching all 2000 websites, will take more than 30 minutes or less than 30 minutes.
I hope i explained it very well.
Also i'm worried about memory usage. As you know garbage collector cleans memory after every PHP script stop. But as i said, my app won't stop for days (maybe weeks) . So garbage collector won't be triggered. I'm manually unsetting (unset() function) all used variables at end of script. Is it enough?
I need some suggestions from server administrators :)
PS. I'm developing it as console application, not a web application. I can execute it from command line.
Batch processing.. store all the sites in a csv or something, mark them after completion, then work on all the ones non-marked, then work on all the marked.. etc. Only do say 1 or 5 at a time, initiate batch script every minute from cron..
Don't even try to work on all of them at once.. any errors and you won't know what happened..
Could even store the jobs in a database, store processing stats etc.. allows for fine-tuning and better reporting.
You will probably hit time-limits trying to run infinite php scripts, even from the command line.. also your server admin will hate you. Will probably run into Memory limits if you don't release resources properly.. far too easily done with php.
Read: http://www.ibm.com/developerworks/opensource/library/os-php-batch/
Your script could just run through the list once and quit. That way, what ever resources php is holding can be freed.
Then have a shell script that calls the php script in an infinite loop.
As php is not designed for long running task, I am not sure if the garbage collection is up to the task. Quiting after every run will force it to release everything.
I don't have a server, so i don't have crontab access. I am thinking of using a PHP script to check current time and how much time has passed. For example, I run the script, it stores the current date in MySQL and check if 30 days have passed. If so, do my stuff.
Is it possible to do all these without MySQL? And of course it is only my thinking, i haven't tried yet.
Keeping script running:
The issue is that you've either got to keep that script running for a long, long time (which PHP doesn't like) or you'll have to manually run that script every day or whatever.
One thing you could do is write a script to run on your local machine that accesses that PHP script (e.g. using the commandline tool 'wget') every minute or hour or whatever.
If you want to have a long-running script, you'll need to use: http://php.net/manual/en/function.set-time-limit.php. That'll let you execute a script for much longer.
As noted in another answer, there are also services like this: https://stackoverflow.com/questions/163476/free-alternative-to-webcron
Need for MySQL?
As for whether you need MySQL - definitely not, though it isn't a bad option if you have it available. You can use a file if required (http://php.net/manual/en/function.fopen.php) or even SQLite (http://php.net/manual/en/book.sqlite.php) which is a file-based SQL database.
As i understand, you can only run php scripts, which involved by user request.
I tried this once, but it is dangerous. If the scheduled process took too long time, user may be interrupting it, and hanging up the script, causing half-processed data. I'm not suggesting it for production environment.
Take a look at Drupals Poormanscron module: http://drupal.org/project/poormanscron. From the introduction text:
The module inserts a small amount of JavaScript on each page of your
site that when a certain amount of time has passed since the last cron
run, calls an AJAX request to run the cron tasks.
You can implement something like this yourself, possibly using their code as a starting point. However, this implementation depends on regular visits to the pages of your website. If nobody visits your website, the cronjobs do not get executed.
You'd need some kind of persistent storage, but a simple file should do the trick. Give it a go and see how you get on. :) Come back for help if you get stuck. Here are some pointers to get you started:
http://php.net/manual/en/function.file-get-contents.php
http://nz.php.net/manual/en/function.file-put-contents.php
You could use a webcron (basically a cronjob on another server that calls your script on a given time)
https://stackoverflow.com/questions/163476/free-alternative-to-webcron
I'm currently running a Linux based VPS, with 768MB of Ram.
I have an application which collects details of domains and then connect to a service via cURL to retrieve details of the pagerank of these domains.
When I run a check on about 50 domains, it takes the remote page about 3 mins to load with all the results, before the script can parse the details and return it to my script. This causes a problem as nothing else seems to function until the script has finished executing, so users on the site will just get a timer / 'ball of death' while waiting for pages to load.
**(The remote page retrieves the domain details and updates the page by AJAX, but the curl request doesnt (rightfully) return the page until loading is complete.
Can anyone tell me if I'm doing anything obviously wrong, or if there is a better way of doing it. (There can be anything between 10 and 10,000 domains queued, so I need a process that can run in the background without affecting the rest of the site)
Thanks
A more sensible approach would be to "batch process" the domain data via the use of a cron triggered PHP cli script.
As such, once you'd inserted the relevant domains into a database table with a "processed" flag set as false, the background script would then:
Scan the database for domains that aren't marked as processed.
Carry out the CURL lookup, etc.
Update the database record accordingly and mark it as processed.
...
To ensure no overlap with an existing executing batch processing script, you should only invoke the php script every five minutes from cron and (within the PHP script itself) check how long the script has been running at the start of the "scan" stage and exit if its been running for four minutes or longer. (You might want to adjust these figures, but hopefully you can see where I'm going with this.)
By using this approach, you'll be able to leave the background script running indefinitely (as it's invoked via cron, it'll automatically start after reboots, etc.) and simply add domains to the database/review the results of processing, etc. via a separate web front end.
This isn't the ideal solution, but if you need to trigger this process based on a user request, you can add the following at the end of your script.
set_time_limit(0);
flush();
This will allow the PHP script to continue running, but it will return output to the user. But seriously, you should use batch processing. It will give you much more control over what's going on.
Firstly I'm sorry but Im an idiot! :)
I've loaded the site in another browser (FF) and it loads fine.
It seems Chrome puts some sort of lock on a domain when it's waiting for a server response, and I was testing the script manually through a browser.
Thanks for all your help and sorry for wasting your time.
CJ
While I agree with others that you should consider processing these tasks outside of your webserver, in a more controlled manner, I'll offer an explanation for the "server standstill".
If you're using native php sessions, php uses an exclusive locking scheme so only a single php process can deal with a given session id at a time. Having a long running php script which uses sessions can certainly cause this.
You can search for combinations of terms like:
php session concurrency lock session_write_close()
I'm sure its been discussed many times here. I'm too lazy to search for you. Maybe someone else will come along and make an answer with bulleted lists and pretty hyperlinks in exchange for stackoverflow reputation :) But not me :)
good luck.
I'm not sure how your code is structured but you could try using sleep(). That's what I use when batch processing.
i wonder how can i schedule and automate tasks in PHP? can i? or is web server features like cron jobs needed.
i am wondering if there is a way i can say delete files after say 3 days when the file are likely outdated or not needed
PHP natively doesn't support automating tasks, you have to build a solution yourself or search google for available solutions. If you have a frequently visited site/page, you could add a timestamp to the database linking to the file, when visiting your site in a chosen time (e.g. 8 in the morning) the script (e.g. deleteOlderDocuments.php) runs and deletes the files that are older.
Just an idea. Hope it helps.
PHP operates under the request-response model, so it won't be the responsibility of PHP to initiate and perform the scheduled job. Use cron, or make your PHP site to register the cron jobs.
(Note: the script that the job executes can be written in PHP of course)
In most shared hosting environments, a PHP interpreter is started for each page request. This means that for each PHP script in said environment, all that script will know about is the fact that it's handling a request, and the information that request gave it. Technically you could check the current time in PHP and see if a task needs to be performed, but that is relying on a user requesting that script near a given time.
It is better to use cron for such tasks. especially if the tasks you need performed can be slow -- then, every once in a while, around a certain time, a user would have a particularly slow response, because them accessing a script caused the server to do a whole bunch of scheduled stuff.