PHP curl to self to avoid FastCGI timeout - php

So here is my dilemma. I need to pull several hundred API calls worth of data, parse them one at a time, and log matching data. My problem is this takes awhile and I'm on shared hosting and my FastCGI busy timeout cant be altered (Web Host won't do it because of shared hosting I believe). So I'm completely stumped on how to get around this. I can't do CLI because it's a user facing tool where they input a list of data and thats what I match against. So once the input is recieved I need the PHP to run by itself until completed (probably awhile like couple hours).
I tried everything and nothing works. At this point to try and trick the system I have the file being self-referential instead of a loop but that doesnt seem to work. I think thats my only way (unless someone has a better idea) and I'm trying to figure out how to make every call back on itself "restart" in the eyes of FastCGI. HELP!!

If you have access to exec then you can always create either another PHP script to actually do the execution or some other program or script to do it, and then call that script with exec so you can have it run on the machine instead of through FastCGI. You'd then want to use some sort of progress tracking in your script to keep track of how far it's gotten, or when it's done, and then have a page to check on the progress of a request :)
Note: This really isn't a great idea for a production solution, but it will work better than figuring out a recursive curl call :)

Related

PHP infinite loop process; Is it good solution?

I'm creating a plugin for a CMS and need one or more preriodical tasks in background. As it is a plugin for an open source CMS, cron job is not a perfect solution because users may not have access to cron on their server.
I'm going to start a infinite loop via an AJAX request then abort XHR request. So HTTP connection will be closed but script continue running.
Is it a good solution generally? What about server resources? Is there any shutdown or limitation policies in servers (such as Apache) for long time running threads?
Long running php scripts are not too good idea. If your script uses session variables your user won't be able to load any pages until the other session based script is closed.
If you really need long running scripts make sure its not using any session and keep them under the maximum execution time. Do not let it run without your control. It can cause various problems. I remember when I made such a things like that and my server just crashed several times.
Know what you want to do and make sure it's well tested on different servers.
Also search for similiar modules and check what methods they use for such a problems like that. Learn from the pros. :)

Is it possible to trigger AJAX when the file has changed?

I am trying to use AJAX to read when a file or database has changed (had an extra post added to it by another user), and display the newest post (kind of like SO)
And it worked, but the thing is the host I'm using only allows a certain amount of "resource usage per hour" and once the limit is hit, the site is locked out for an hour. This is a free host I'm mainly using for testing and learning.
So before, I had the AJAX set to a setInterval of checking every 2-4 seconds, from a file that was just echoing the last post made in the system. Which I'm guessing is what shut down the site for an hour in a matter of minutes.
So I'm wondering if there's anyway to make it only retrieve the newest post ONLY when the result has changed from what it last found. It sounds like that can't be done because it'd still have to check every time, activating the PHP, regardless of what sends back.
Any ideas how I can do this or something similar?
You could use http://en.wikipedia.org/wiki/WebSocket (but I guess not on your host, since there is a apache extension you have to install) or you use http://en.wikipedia.org/wiki/Push_technology#Long_polling.
With long polling you send one request to your PHP and the PHP script loops until a new post was found and then send the response.
But you really should consider changing the hoster, because realtime web application need moe ressources. Why not testing and learning locally on your machine?

PHP script that works forever :)

I'm looking for some ideas to do the following. I need a PHP script to perform certain action for quite a long time. This is an extension for a CMS and this can't be anything else but PHP. It also can't be a command line script because it should be used by common people that will have only the standard means of the CMS. One of the options is having a cron job (most simple hostings have it) that will trigger the script often so that instead of working for a long time it could perform the action step by step preserving its state from one launch to the next one. This is not perfect but I can't see of any other solutions. If the script will be redirecting to itself server will interrupt it. What other options can suit?
Thanks everyone in advance!
What you're talking about is a daemon or long running program that waits for calls by client programs, performs and action, provides a response then keeps on waiting for more calls.
You might be familiar w/ these in the form of Apache & MySQL ;) Anyway PHP is generally OK in this regard, it does have the ability to function over raw sockets as well as fork sub-processes to handle multiple requests simultaneously.
Having said that PHP daemons are a tool where YMMV. Some folks will say they work great, other folks like me will say they have issues w/ interprocess communication and leaking memory even amidst plethora unset() calls.
Anyway you likely won't be able to deploy a daemon of any type on a shared hosting environment. You'll need to get a better server package or stick with a Cron based solution.
Here's a link about writing a PHP daemon.
Also, one more note. Daemons do crash from time to time and therefore you may still need to store state about whats going on, just in case someone trips over the power cord to your shared server :)
I would also suggest that you think about making it a daemon but if not then you can simply use
set_time_limit(0);
ignore_user_abort(true);
at the top to tell it not to time out and not to get interrupted by anything. Then call it from the cron to start it every day or whatever. I have this on many long processing daily tasks and it works great for me. However, it won't be able to easily talk to the outside world (other scripts can't query it or anything -- if that is what you want look into php services) so once you get it running make sure it will stop and have it print its progress to a logfile.

Long Polling causing server problems?

I've finally made a simple chat page that I had wanted to make for a while now, but I'm running into problems with my servers.
I'm not sure if long polling is the correct term, but from what I understand, I think it is. I have an ajax call to a php page that checks a mysql database for messages with times newer than the time sent in the ajax request. If there isn't a newer message, it keeps looping and checking until there is. Else, it just returns the new messages and the client script sends another ajax request as soon as it gets the messages.
Everything is working fine, except for the part where the server on 000webhost stops responding after a few chat messages, and the server on x10 hosting gives me a message about hitting a resource limit.
Maybe this is a dumb way to do a chat system, but it's all I know how to do. If there is a better way please let me know.
edit: Holy hell, it's just occurred to me that I didn't put any sleep time in the while loop on the server.
You can find a lot of reading on this, but I disbelieve that free web hosting is going to allow to do what you are thinking of doing. PHP was also not really designed to create chat systems.
I would recommend using WebSockets, and use for example, Node.JS with Socket.IO, or Tornado with Python; There is a lot of solutions out there, but most of them would require you to run your own server since it requires to run a whole program that interacts with many connections at once instead of simple scripts that just start and finish with a single connection.
What about using the same strategy whether there are newer messages on the server or not. The server would always return a list of newer messages - this list could be empty when there are no newer messages. The empty list could be also be encoded as a special data token.
The client then proceeds in both cases the same way: it processes the received data and requests new messages after a time period.
Make sure you sleep(1) your code on each loop, the code gonna enter the loop several times per second, stressing your database/server.
But still, nodejs or websockets are better tecnologies to deal with real time chats.

How do I avoid this PHP Script causing a server standstill?

I'm currently running a Linux based VPS, with 768MB of Ram.
I have an application which collects details of domains and then connect to a service via cURL to retrieve details of the pagerank of these domains.
When I run a check on about 50 domains, it takes the remote page about 3 mins to load with all the results, before the script can parse the details and return it to my script. This causes a problem as nothing else seems to function until the script has finished executing, so users on the site will just get a timer / 'ball of death' while waiting for pages to load.
**(The remote page retrieves the domain details and updates the page by AJAX, but the curl request doesnt (rightfully) return the page until loading is complete.
Can anyone tell me if I'm doing anything obviously wrong, or if there is a better way of doing it. (There can be anything between 10 and 10,000 domains queued, so I need a process that can run in the background without affecting the rest of the site)
Thanks
A more sensible approach would be to "batch process" the domain data via the use of a cron triggered PHP cli script.
As such, once you'd inserted the relevant domains into a database table with a "processed" flag set as false, the background script would then:
Scan the database for domains that aren't marked as processed.
Carry out the CURL lookup, etc.
Update the database record accordingly and mark it as processed.
...
To ensure no overlap with an existing executing batch processing script, you should only invoke the php script every five minutes from cron and (within the PHP script itself) check how long the script has been running at the start of the "scan" stage and exit if its been running for four minutes or longer. (You might want to adjust these figures, but hopefully you can see where I'm going with this.)
By using this approach, you'll be able to leave the background script running indefinitely (as it's invoked via cron, it'll automatically start after reboots, etc.) and simply add domains to the database/review the results of processing, etc. via a separate web front end.
This isn't the ideal solution, but if you need to trigger this process based on a user request, you can add the following at the end of your script.
set_time_limit(0);
flush();
This will allow the PHP script to continue running, but it will return output to the user. But seriously, you should use batch processing. It will give you much more control over what's going on.
Firstly I'm sorry but Im an idiot! :)
I've loaded the site in another browser (FF) and it loads fine.
It seems Chrome puts some sort of lock on a domain when it's waiting for a server response, and I was testing the script manually through a browser.
Thanks for all your help and sorry for wasting your time.
CJ
While I agree with others that you should consider processing these tasks outside of your webserver, in a more controlled manner, I'll offer an explanation for the "server standstill".
If you're using native php sessions, php uses an exclusive locking scheme so only a single php process can deal with a given session id at a time. Having a long running php script which uses sessions can certainly cause this.
You can search for combinations of terms like:
php session concurrency lock session_write_close()
I'm sure its been discussed many times here. I'm too lazy to search for you. Maybe someone else will come along and make an answer with bulleted lists and pretty hyperlinks in exchange for stackoverflow reputation :) But not me :)
good luck.
I'm not sure how your code is structured but you could try using sleep(). That's what I use when batch processing.

Categories