Continuously running a PHP script waiting for videos to transcode - php

I'm making a transcoding server which uses FFmpeg to convert videos to FLV. After a user uploads a video, it's queued for processing in Amazon Simple Queue Service. The system is Ubuntu Linux.
Instead of running cron every minute, I wonder if it would be possible to continuously run several PHP scripts (download queued files, process downloaded ones, etc.). Each of them would have its own queue, which it would read every 10s or so looking for new tasks.
My question is:
How do I detect whether the script is already running? I'd run cron every minute, and if one of the programs wasn't running I'd start it again. How is this kind of thing done on Linux? PID files?
Thanks for the help,
ian

Instead of doing this with only pure PHP, I would probably go with a solution based on Gearman (quoting Wikipedia):

Gearman is an open source application framework [...]. Gearman is designed to distribute appropriate computer tasks to multiple computers, so large tasks can be done more quickly.
It works well with PHP, thanks to the gearman extension, and will deal with most of the queuing stuff for you.
Note that it'll also facilitate things when you have more videos to transcode, making scaling to several servers easier.
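For illustration, here is a minimal sketch of what a transcoding worker and its client could look like with the gearman extension. The server address, function name, file paths, and the ffmpeg invocation are assumptions, not part of the original answer:

// worker.php - runs continuously, waiting for transcode jobs
$worker = new GearmanWorker();
$worker->addServer('127.0.0.1', 4730);
$worker->addFunction('transcode', function (GearmanJob $job) {
    $source = $job->workload();
    $target = preg_replace('/\.\w+$/', '.flv', $source);
    // Shell out to ffmpeg to produce the .flv (paths are hypothetical)
    exec('ffmpeg -i ' . escapeshellarg($source) . ' ' . escapeshellarg($target));
});
while ($worker->work());

// upload handler - queues a job without blocking the request
$client = new GearmanClient();
$client->addServer('127.0.0.1', 4730);
$client->doBackground('transcode', '/uploads/video123.avi');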

Yes, you can use PID files.
Or a temporary table, or memcache, etc.
But I do it like this: cron runs a script that executes the video conversion, and that script checks whether the previous process has terminated or not. The cron script gets the movie that needs converting from a database or a file.

PHP's PEAR repository has a System_Daemon class for creating daemons out of PHP. I've used it for a couple systems with good results.
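For reference, a typical System_Daemon script looks roughly like this. This is a sketch; the app name and the loop body are placeholders:

require_once 'System/Daemon.php';

System_Daemon::setOption('appName', 'transcoder'); // hypothetical name
System_Daemon::start(); // fork and detach from the terminal

while (!System_Daemon::isDying()) {
    // poll the queue, transcode one item, etc.
    System_Daemon::iterate(2); // sleep two seconds between passes
}

System_Daemon::stop();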

I've created a similar script specifically for this issue.
Check: https://github.com/SirNarsh/EasyCron
The idea is to save the PID of the script to a file, then check whether the process is running by checking for the existence of /proc/PID.
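A minimal sketch of that approach (the PID file path is hypothetical):

$pidFile = '/var/run/transcoder.pid';

if (is_file($pidFile)) {
    $pid = (int) file_get_contents($pidFile);
    if ($pid > 0 && file_exists("/proc/$pid")) {
        exit; // a previous instance is still running
    }
}
file_put_contents($pidFile, getmypid()); // record our own PID

// ... main work ...

unlink($pidFile); // clean up on normal exit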

Related

Creating an FTP downloading process

How can I create a process on the server, like ftp_get(), without waiting for its result before continuing the PHP script?
My issue is that I'm working on a synchronization script, and some files are too huge to download using PHP since that conflicts with the max execution time.
Is there any way to initiate a process to download one file and move on to process another?
You need threading in PHP.
See http://php.net/manual/en/class.thread.php. If you don't have experience with threading, you should look up some tutorials and examples, and then some more. Once you think you understand them, research them some more.
And maybe a bit more...
Creating a multi-threaded application that is stable is a hard task.
Otherwise you could always increase the max execution time, or set up a cron job that downloads the FTP files in advance, say 30 minutes earlier, with other Linux utilities.
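If threading is out of reach, one low-tech way to start a transfer and move on is to hand the download to the OS and return immediately. A sketch, where the URL, paths, and the use of wget are all assumptions:

$url  = 'ftp://example.com/huge-file.zip'; // hypothetical
$dest = '/tmp/huge-file.zip';              // hypothetical

// nohup + & detaches the download; echo $! hands its PID back to PHP
$cmd = sprintf(
    'nohup wget -q -O %s %s > /dev/null 2>&1 & echo $!',
    escapeshellarg($dest),
    escapeshellarg($url)
);
$pid = (int) shell_exec($cmd);
// The script can continue (or exit) while the download runs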

Background PHP Processes

I am developing a website that requires a lot of background processes for the site to run. For example, a queue, a video encoder, and a few other types of background processes. Currently I have these running as PHP CLI scripts that contain:
while (true) {
    // some code
    sleep($someAmountOfSeconds);
}
OK, these work fine and everything, but I was thinking of setting them up as daemons, which would give them an actual process ID that I can monitor; I could also run them in the background without keeping a terminal open all the time.
I would like to know if there is a better way of handling these. I also thought about cron jobs, but some of these processes need to loop every few seconds.
Any suggestions?
Creating a daemon which you can make calls to and ask questions of would seem the sensible option. It depends on whether your host permits such things; especially if you require it to do work every few seconds, an OS-based service/daemon seems far more sensible than anything else.
You could create a daemon in PHP, but in my experience this is a lot of hard work and the result is unreliable due to PHP's memory management and error handling.
I had the same problem, I wanted to write my logic in PHP but have it daemonised by a stable program that could restart the PHP script if it failed and so I wrote The Fat Controller.
It's written in C, runs as a daemon and can run PHP scripts, or indeed anything. If the PHP script ends for whatever reason, The Fat Controller will restart it. This means you don't have to take care of daemonising or error recovery - it's all handled for you.
The Fat Controller can also do lots of other things such as parallel processing which is ideal for queue processing, you can read about some potential use cases here:
http://fat-controller.sourceforge.net/use-cases.html
I've done this for 5 years, using PHP to run background tasks, and it's no different from doing it in any other language. Just use cron and lock files. The lock file will prevent multiple instances of your script running.
Also, it's important to monitor your code. One check I always do to stop stale lock files from blocking scripts is a second cron job that checks whether the lock file is older than a few minutes and whether an instance of the PHP script is actually running; if not, it removes the lock file.
Using this technique allows you to set your cron to run the script every minute without issues.
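A sketch of that pattern, with the staleness check folded into the same script for brevity (the path and the five-minute threshold are arbitrary):

$lockFile = '/tmp/encoder.lock'; // hypothetical path

if (is_file($lockFile)) {
    $pid     = (int) file_get_contents($lockFile);
    $running = $pid > 0 && file_exists("/proc/$pid");
    $stale   = (time() - filemtime($lockFile)) > 300;
    if ($running || !$stale) {
        exit; // another instance is (probably) still at work
    }
    unlink($lockFile); // stale lock: the process is gone
}

file_put_contents($lockFile, getmypid());
// ... process the queue ...
unlink($lockFile);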
Use the System_Daemon module from PEAR.
One solution (that I really need to try myself, as I may need it) is to use cron, but get the process to loop for five mins or so. Then get cron to kick off a new one every five minutes; as each new one starts, the previous one should be finishing (or close to finishing).
Bear in mind that the two may overlap a bit, and so you need to ensure that this doesn't cause a clash (e.g. writing to the same video file). Some simple inter-process communication may be useful, even if it is just writing to a PID file in the temp directory.
This approach is a bit low-tech but helps avoid PHP hanging onto memory over the longer term - sort of in-built task restarts!
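A sketch of the looping part (the interval and the work are placeholders):

// Run for roughly five minutes, then exit and let cron start a fresh copy
$deadline = time() + 5 * 60;

while (time() < $deadline) {
    // ... do one unit of work ...
    sleep(5);
}
// exiting here releases whatever memory PHP accumulated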

Schedule and execute a PHP script automatically

I have written a PHP script which generates an SQL file containing all tables in my database.
What I want to do is execute this script daily or every n days. I have read about cron jobs but I am using Windows. How can I automate the script execution on the server?
You'll need to add a scheduled task to call the URL.
First of all, read up here:
MS KB - this is for Windows XP.
Second, you'll need some way to call the URL - I'd recommend using something like wget. This way you can call the URL and save the output to a file, so you can see what the debug output is. You can get hold of wget on this page.
Final step is, as Gabriel says, write a batch file to tie all this up, then away you go.
Edit: wget is pretty simple to use, but if you have any issues, leave a comment and I'll help out.
Edit 2: thinking about it, you don't even really need a batch file; you could just call wget directly.
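For instance, the scheduled task's command could be something like this (the URL and log path are hypothetical):

wget -q -O C:\logs\backup.log "http://www.example.com/backup.php"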
Add a scheduled task to request the URL, either using a batch file or a script file (WSH).
http://blog.netnerds.net/2007/01/vbscript-download-and-save-a-binary-file/
This script will allow you to download binary data from a web source. Modify it to work for your particular case. The VBS file can either be run directly or executed from within a script. Alternatively, you don't have to save the file using the script; you can just output the contents (WScript.Echo objXMLHTTP.ResponseBody) and use CMD's redirect-to-file argument:
cscript download.vbs > logfile.log
Save that bad boy in a .bat file somewhere useful and call it from the scheduler: http://lifehacker.com/153089/hack-attack-using-windows-scheduled-tasks
Cron is not always available on many hosting accounts.
But try this:
http://www.phpjobscheduler.co.uk/
It's free, has a useful interface so you can see all the scheduled tasks, and will run on any host that provides PHP and MySQL.
You can use the ATrigger scheduling service. A PHP library is also available to create scheduled tasks without overhead. Reporting, analytics, error handling, and more benefits.
Disclaimer: I was among the ATrigger team. It's freeware, and I have no commercial purpose.
Windows doesn't have cron, but it does come with the 'at' command. It's not as flexible as cron, but it will allow you to schedule arbitrary tasks for execution from the command line.
Yes, you can schedule your PHP script to run automatically on Windows. On a Linux-like OS you would have cron, but on Windows you can schedule tasks using Task Scheduler.
If your code is on a remotely hosted server, then create a cron job for it.
If it's local, then use a scheduled task in Windows. It's easy to implement; I have servers with many scheduled tasks running.

How to do this (PHP) in Python or Ruby?

My app takes a loooong list of URLs and splits it into X parts (where X = $threads), so I can start a thread.php for each part and hand it its share of the URLs. Each one then does GET and POST requests to retrieve data.
I am using this:
for ($x = 1; $x <= $threads; $x++) {
    $pid[] = exec("/path/bin/php thread.php <options> > /dev/null & echo \$!");
}
For "threading" (I know its not really threading, is it forking or what?), I save the pids into a file for later checking if N thread is running and to stop them.
Now I want to move away from PHP; I was thinking about using Python because I'd like to learn more about it.
How can I achieve this kind of "threading" with Python (or Ruby)?
Or is there a better way to launch multiple background threads in Python or Ruby that run in parallel (at the same time)?
The threads don't need to communicate with each other or with a main thread; they are independent. They do HTTP requests and interact with a MySQL DB, and they may need to access/modify the same table entries (I haven't thought about this or how I will solve it yet).
The app works with "projects", each project has a "max threads" variable and I use a web interface to control it (so I could still use php for the interface [starting/stopping threads] in the new app).
I wanted to use
from threading import Thread
in Python, but I've been told those threads won't run in parallel, only one at a time.
The app is intended to run on Linux web servers.
Any suggestion will be appreciated.
For Python 2.6+, consider the multiprocessing module:
multiprocessing is a package that supports spawning processes using an API similar to the threading module. The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads. Due to this, the multiprocessing module allows the programmer to fully leverage multiple processors on a given machine. It runs on both Unix and Windows.
For Python 2.5, the same functionality is available via pyprocessing.
In addition to the example at the links above, here are some additional links to get you started:
multiprocessing Basics
Communication between processes with multiprocessing
You don't want threading. You want a work queue like Gearman that you can send jobs to asynchronously.
It's worth noting that this is a cross-platform, cross-language solution. There are bindings for many languages (including Python and PHP) provided officially, and many more unofficially with a bit of Googling.
The original intent is effectively load balancing, but it works just as well with only one machine. Basically, you can create one or more Workers that listen for Jobs. You can control the number of Workers and the types of Jobs they can listen for.
If you insert five Jobs into the queue at the same time, and there happen to be five Workers waiting, each Worker will be handed one of the Jobs. If there are more Jobs than Workers, the Jobs get handled sequentially. Your Client (the thing that submits Jobs) can either wait for all of the Jobs it's created to complete, or it can simply place them in the queue and continue on.

Multithreaded Programming in PHP to avoid runtime limitations

I know PHP isn't multithreaded, but I talked with a friend about this: if I have a large algorithmic problem I want to solve with PHP, isn't the solution simply to use the curl_multi_* interface and start n HTTP requests against the same server? This is what I would call PHP-style multithreading.
Are there any problems with this in the typical webserver environment? The master request, which is waiting on curl_multi_exec(), shouldn't count any of that time against its own maximum runtime or memory limit.
I have never seen this promoted anywhere as a solution to prevent a script being killed by overly restrictive PHP admin settings.
If I add this as a feature to a popular PHP system, will server admins hire a Russian mafia hitman to get revenge for this hack?
If I add this as a feature to a popular PHP system, will server admins hire a Russian mafia hitman to get revenge for this hack?
No, but it's still a terrible idea, if for no other reason than that PHP is supposed to render web pages, not run big algorithms. I see people trying to do this in ASP.NET all the time. There are two proper solutions:
1. Have your PHP script spawn a process that runs independently of the web server and updates a common data store (probably a database) with information about the progress of the task, which your PHP scripts can access.
2. Have a constantly running daemon that checks for jobs in a common data store; the PHP scripts can issue jobs to it and view the progress of currently running jobs (see the sketch below).
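A sketch of the second option - a long-running worker polling a jobs table. The schema, credentials, and processing step are all hypothetical:

$db = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

while (true) {
    $job = $db->query("SELECT id, payload FROM jobs WHERE status = 'pending' LIMIT 1")
              ->fetch(PDO::FETCH_ASSOC);
    if ($job === false) {
        sleep(2); // nothing queued; poll again shortly
        continue;
    }
    $db->prepare("UPDATE jobs SET status = 'running' WHERE id = ?")->execute([$job['id']]);
    // ... run the heavy computation on $job['payload'] ...
    $db->prepare("UPDATE jobs SET status = 'done' WHERE id = ?")->execute([$job['id']]);
}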
By using curl, you are adding a network timeout dependency into the mix. Ideally you would run everything from the command line to avoid timeout issues.
PHP does support forking (pcntl_fork). You can fork some processes and then monitor them with something like pcntl_waitpid. You end up with one "parent" process to monitor the children it spawned.
Keep in mind that while one process can start up, load everything, and then fork, you can't share things like database connections, so each forked process should establish its own. I've used forking for up to 50 processes.
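A sketch of that fork-and-monitor pattern (the child's work is a placeholder):

$children = [];
for ($i = 0; $i < 4; $i++) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        die("fork failed\n");
    } elseif ($pid === 0) {
        // Child: open its own DB connection here, do its share of the work, then exit
        exit(0);
    }
    $children[] = $pid; // parent keeps each child's PID
}
// Parent: wait for all children to finish
foreach ($children as $pid) {
    pcntl_waitpid($pid, $status);
}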
If forking isn't available for your install of PHP, you can spawn a process as Spencer mentioned. Just make sure you spawn the process in such a way that it doesn't stop processing of your main script. You also want to get the process ID so you can monitor the spawned processes.
exec("nohup /path/to/php.script > /dev/null 2>&1 & echo $!", $output);
$pid = $output[0];
You can also use the above exec() setup to spawn a process started from a web page and get control back immediately.
Out of curiosity - what is your "large algorithmic problem" attempting to accomplish?
You might be better off writing it as an Amazon EC2 service, then selling access to the service rather than the package itself.
Edit: you now mention "mass emails". There are already services that do this, they're generally known as "spammers". Please don't.
Lothar,
As far as I know, PHP doesn't work with services the way its competitors do, so you don't have a way for PHP to know how much time has passed unless you're constantly interrupting the process to check the elapsed time. So, IMO, no, you can't do that in PHP :)
