I would like a real-world, experienced answer on this.
Which one is faster: writing a shell script or a PHP script? This script will be set up in cron.
Here is the brief idea of what I am trying to accomplish.
We get a lot of PGP-encrypted files from clients. We download them to our local server, decrypt them and move them to a different location for further processing.
There could be around 20-25 files a day to do and the number goes up gradually.
We have written both a PHP script and a shell script to do this, for testing purposes.
But we are not sure which is going to be faster and more advantageous.
Has anyone tried? Any inputs?
Thanks much!
As indicated in the comments, you ought to just benchmark.
The overhead associated with the script will certainly be insignificant compared to the time spent in the decryption phase. (Cryptography is a notoriously computationally expensive process, especially public-key crypto.)
Also, 20-25 requests, even 1000 requests, is nothing on a modern machine, unless we are talking about decrypting gigantic files (in which case, again, the crypto step will swamp any optimizations in the wrapper script). Asking this question and benchmarking are probably more time consuming than any overhead you will encounter.
(As an aside, I really hope that you are doing the decryption on a back-end machine not directly facing the public. Guard your key!)
Both use an interpreter to execute your tasks. Depending on which OS you are using, both of their engines may well have been written in C.
I would use PHP, because it has more modules you can add on.
Say you do your PGP decryption, then you want to update a MySQL DB, send an email, post to Facebook, or send a tweet out that your task is complete.
Edit - PHP doesn't require a web server. I'm referring to the command-line execution of PHP and shell scripts.
PHP Command line help - http://php.net/manual/en/features.commandline.php
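For illustration only, a rough sketch of what a CLI PHP cron script for this job could look like - the directories and the gpg invocation are assumptions, not something from the question:
<?php
// Hypothetical directories - adjust to your environment.
$incoming = '/data/incoming';   // where the encrypted files are downloaded
$outgoing = '/data/processing'; // where decrypted files should be moved

foreach (glob($incoming . '/*.pgp') as $encrypted) {
    $decrypted = $outgoing . '/' . basename($encrypted, '.pgp');

    // Shell out to gpg; assumes gpg is installed and the private key is
    // already in the keyring of the user the cron job runs as.
    $cmd = sprintf('gpg --batch --yes --output %s --decrypt %s',
        escapeshellarg($decrypted), escapeshellarg($encrypted));
    exec($cmd, $output, $status);

    if ($status === 0) {
        unlink($encrypted); // decrypted successfully, remove the original
        // ...update MySQL, send an email, etc. from here
    } else {
        error_log("Failed to decrypt $encrypted (exit code $status)");
    }
}
?>
The shell-script version would be a near-identical loop around gpg, which is why the crypto step, not the wrapper language, dominates the runtime.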
Related
I'm looking for some ideas on how to do the following. I need a PHP script to perform a certain action for quite a long time. This is an extension for a CMS, so it can't be anything else but PHP. It also can't be a command-line script, because it should be usable by ordinary people who will have only the standard means of the CMS. One of the options is having a cron job (most simple hostings have it) that triggers the script often, so that instead of working for a long time it performs the action step by step, preserving its state from one launch to the next. This is not perfect, but I can't think of any other solutions. If the script keeps redirecting to itself, the server will interrupt it. What other options would suit?
Thanks everyone in advance!
What you're talking about is a daemon, or long-running program, that waits for calls from client programs, performs an action, provides a response, then keeps on waiting for more calls.
You might be familiar w/ these in the form of Apache & MySQL ;) Anyway PHP is generally OK in this regard, it does have the ability to function over raw sockets as well as fork sub-processes to handle multiple requests simultaneously.
Having said that, PHP daemons are a tool where YMMV. Some folks will say they work great; other folks like me will say they have issues w/ interprocess communication and leaking memory, even amidst a plethora of unset() calls.
Anyway you likely won't be able to deploy a daemon of any type on a shared hosting environment. You'll need to get a better server package or stick with a Cron based solution.
Here's a link about writing a PHP daemon.
Also, one more note. Daemons do crash from time to time, and therefore you may still need to store state about what's going on, just in case someone trips over the power cord to your shared server :)
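As a rough illustration of that last point - the state file name and structure are invented for this sketch - the daemon's main loop could persist its progress every iteration so a crash or restart loses almost nothing:
<?php
// Hypothetical state file; a database row would work just as well.
$stateFile = '/var/run/myapp/worker-state.json';

$state = is_file($stateFile)
    ? json_decode(file_get_contents($stateFile), true)
    : array('last_processed_id' => 0);

while (true) {
    // ...process the next batch of work after $state['last_processed_id']...
    $state['last_processed_id']++; // placeholder for real progress tracking

    // Write the state back every iteration so a restart can resume from here.
    file_put_contents($stateFile, json_encode($state), LOCK_EX);

    sleep(5);
}
?>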
I would also suggest that you think about making it a daemon but if not then you can simply use
set_time_limit(0);
ignore_user_abort(true);
at the top to tell it not to time out and not to get interrupted by anything. Then call it from cron to start it every day or whatever. I have this on many long-running daily tasks and it works great for me. However, it won't be able to easily talk to the outside world (other scripts can't query it or anything -- if that is what you want, look into PHP services), so once you get it running, make sure it will stop, and have it print its progress to a logfile.
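A minimal sketch of that setup - the log path is an assumption:
<?php
set_time_limit(0);        // never time out
ignore_user_abort(true);  // keep running even if the caller disconnects

$log = fopen('/var/log/myapp/daily-task.log', 'a'); // hypothetical log file
fwrite($log, date('c') . " started\n");

// ...do the long-running work, logging progress as you go...
fwrite($log, date('c') . " step 1 done\n");

fwrite($log, date('c') . " finished\n");
fclose($log);
?>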
I am developing a website that requires a lot of background processes for the site to run. For example, a queue, a video encoder and a few other types of background processes. Currently I have these running as a PHP CLI script that contains:
while (true) {
    // some code
    sleep($someAmountOfSeconds);
}
OK, these work fine and everything, but I was thinking of setting these up as a daemon, which will give them an actual process ID that I can monitor; I can also run them in the background and not have a terminal open all the time.
I would like to know if there is a better way of handling these? I was also thinking about cron jobs but some of these processes need to loop every few seconds.
Any suggestions?
Creating a daemon which you can make calls to and ask questions would seem the sensible option. It depends on whether your hoster permits such things; especially if you're requiring it to do work every few seconds, then definitely an OS-based service/daemon would seem far more sensible than anything else.
You could create a daemon in PHP, but in my experience this is a lot of hard work and the result is unreliable due to PHP's memory management and error handling.
I had the same problem, I wanted to write my logic in PHP but have it daemonised by a stable program that could restart the PHP script if it failed and so I wrote The Fat Controller.
It's written in C, runs as a daemon and can run PHP scripts, or indeed anything. If the PHP script ends for whatever reason, The Fat Controller will restart it. This means you don't have to take care of daemonising or error recovery - it's all handled for you.
The Fat Controller can also do lots of other things such as parallel processing which is ideal for queue processing, you can read about some potential use cases here:
http://fat-controller.sourceforge.net/use-cases.html
I've done this for 5 years, using PHP to run background tasks, and it's no different to doing it in any other language. Just use cron and lock files. The lock file will prevent multiple instances of your script running.
Also, it's important to monitor your code, and one check I always do to prevent stale lock files from stopping scripts from running is to have a second cron job that checks whether the lock file is older than a few minutes and whether an instance of the PHP script is actually running; if not, it removes the lock file.
Using this technique allows you to set your cron to run the script every minute without issues.
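A minimal sketch of the lock-file guard described above (the path and the few-minute threshold are assumptions):
<?php
// Hypothetical lock-file location.
$lockFile = '/tmp/my-task.lock';

// Skip this run if a fresh lock exists - another instance is probably still working.
// A second cron job, as described above, can clean up locks older than a few minutes.
if (is_file($lockFile) && (time() - filemtime($lockFile)) < 300) {
    exit(0);
}

file_put_contents($lockFile, (string) getmypid());

// ...do the actual work here...

unlink($lockFile); // release the lock when done
?>
An alternative is flock(), which the operating system releases automatically if the process dies, avoiding stale lock files altogether.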
Use the System_Daemon module from PEAR.
One solution (that I really need to try myself, as I may need it) is to use cron, but get the process to loop for five minutes or so. Then get cron to kick it off every five minutes. As one dies, the next one should just be starting (or about to start).
Bear in mind that the two may overlap a bit, and so you need to ensure that this doesn't cause a clash (e.g. writing to the same video file). Some simple inter-process communication may be useful, even if it is just writing to a PID file in the temp directory.
This approach is a bit low-tech but helps avoid PHP hanging onto memory over the longer term - sort of in-built task restarts!
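A rough sketch of that pattern - the five-minute window and the PID file name are assumptions:
<?php
// Record our PID so overlapping runs can find each other if they need to coordinate.
file_put_contents(sys_get_temp_dir() . '/video-worker.pid', (string) getmypid());

$deadline = time() + 5 * 60; // work for roughly five minutes, then exit

while (time() < $deadline) {
    // ...process one unit of work (e.g. one queued video)...
    sleep(2);
}
// Exiting here frees all memory; cron's next invocation carries on.
?>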
I am writing a WordPress plugin in PHP, and the next step is some kind of add-on to this plugin.
The add-on would scrape data from the web, submit forms, etc. I have this part almost ready from the time before I had any thoughts about a WordPress plugin - it's coded in Ruby using Mechanize. I haven't found anything similar to Mechanize in PHP anyway.
But I do not know the best way to call my Ruby script from WordPress. Some tasks will be managed by cron. What about the ones based on user requests?
The PHP script only triggers the Ruby script. It won't wait for or require anything from Ruby's output.
The WordPress plugin is fully portable and functional without the Ruby script. Ruby adds something more, if somebody requires it.
Everything will be running on my Linux server, where I have root access.
A WordPress plugin that depends on Ruby isn't going to be portable. That's OK if you're the only one who will be using it, though.
If the Ruby script needs to return a result that will be used immediately by the PHP script that's calling it, then something like exec() is the only way. Make sure you escape any arguments you pass to the Ruby script; otherwise you'll be vulnerable to injection attacks.
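A minimal sketch of that exec() route - the Ruby script path and parameter are made up for illustration:
<?php
$url = $_POST['url']; // user-supplied input: must be escaped before it reaches the shell

$cmd = '/usr/bin/ruby /path/to/scraper.rb ' . escapeshellarg($url);
$lastLine = exec($cmd, $output, $status);

if ($status !== 0) {
    // handle the failure; $output holds everything the Ruby script printed
}
?>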
If the Ruby script doesn't need to return a result immediately (e.g. some background processing, such as thumbnail generation) then I think the best way would be for the PHP script to insert a row into a MySQL database or something similar. The Ruby script can work in the background or run from cron, check the database periodically for new jobs, and do whatever processing it needs to do. This approach avoids the performance overhead and security issues of exec(), and it's arguably also more scalable. (A similar approach would have the Ruby script listen on a socket, and your PHP scripts would connect to the socket. But this requires more work to get it right.)
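For the background-queue variant, the PHP side can be as small as inserting a row - the ruby_jobs table and its columns are invented for this sketch:
<?php
// Assumes a "ruby_jobs" table with job_type, payload and status columns.
$pdo = new PDO('mysql:host=localhost;dbname=myapp', 'user', 'secret');

$stmt = $pdo->prepare('INSERT INTO ruby_jobs (job_type, payload, status) VALUES (?, ?, ?)');
$stmt->execute(array('scrape', json_encode(array('url' => 'http://example.com')), 'pending'));

// The Ruby script (run from cron or as a daemon) polls ruby_jobs for rows
// with status = 'pending', processes them and marks them done.
?>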
If I were you, I would handle all the Ruby stuff from cron. Make a queue in the DB to hold user requests, then have the script (in Ruby?) invoked by cron grab all the unprocessed jobs from the queue and start running them, then remove each job from the queue (or set some kind of flag marking it done). This way you don't have to call exec(), which in most cases is going to be off limits unless the user is running on a VPS/dedicated server where they have root access.
You could also make this a separate job and have it poll the DB for unprocessed jobs more regularly than the primary job... if necessary.
Still, this begs the question... why use Ruby in a PHP blog/CMS app?
Use exec() to run the ruby interpreter, giving it the path to your ruby script.
http://php.net/manual/en/function.exec.php
I am trying to create a multi-threaded PHP application right now. I have read lots of papers that explain how to create multi-threading. All of those examples are built on dividing the processes among different worker PHP files. Actually, that is also what I am trying to do, but there is a problem :)
There are too many jobs to divide, even within 30 seconds (which is the execution time limit).
We are using a multi-server environment on a local network to complete the processes, as the processes are not linked to each other and do not share the same memory. We just need to fire them up and let them work at an exact time. Each of the processes works for 0.5 seconds, but it could possibly work for up to 30 seconds.
Most of the examples fire up the PHP scripts and wait for the results. But unfortunately, in my situation I don't need to expect a result from the thread. I just need it to execute the command and write the result to its own database.
How can I manage to fire up the PHP scripts and let them do their work, for 10,000 processes?
ADDITIONAL INFO:
I know that PHP neither has a multi-threading feature nor is built for one. But we have to find a way to use it; for instance, we can send a request to http://server1/dothis.php?jobid=5, but standard methods make us wait for the result. If we can manage to send a request to this server without waiting for the result, it would solve our problem, I think; otherwise we will need a completely different approach, such as a process divider in C++ or Qt.
As has been pointed out, php doesn't support multi threading. However, and as tomaszsobczak mentioned, there is a library which will let you create "threads" and leave them running, and reconnect to them through other scripts to check their status and so on, called "Gearman".
From the project homepage: "Gearman provides a generic application framework to farm out work to other machines or processes that are better suited to do the work. It allows you to do work in parallel, to load balance processing, and to call functions between languages. It can be used in a variety of applications, from high-availability web sites to the transport of database replication events. In other words, it is the nervous system for how distributed processing communicates."
Rasmus' blog has a great write up about it here:
playing with gearman. For your case, it might just be the solution, although I've not read any in-depth test cases... I'd be interested to know, though, so if you end up using this, please report back!
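To give a feel for it, here is a hedged sketch using the pecl gearman extension; the server address, function name and payload are assumptions. doBackground() in particular matches the "fire it and don't wait for a result" requirement:
<?php
// worker.php - run one (or several) of these on each machine that should do work.
$worker = new GearmanWorker();
$worker->addServer('127.0.0.1', 4730);
$worker->addFunction('do_job', function (GearmanJob $job) {
    $jobId = $job->workload();
    // ...do the 0.5-30 second task and write the result to its own database...
});
while ($worker->work()); // block, handling jobs as they arrive

// client.php - queue 10,000 jobs without waiting for any results.
$client = new GearmanClient();
$client->addServer('127.0.0.1', 4730);
for ($jobId = 1; $jobId <= 10000; $jobId++) {
    $client->doBackground('do_job', (string) $jobId);
}
?>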
As the comments say, multi-threading is not possible in PHP. But based on your comment:
If we can manage to send request to this server without waiting for result it would solve our problem I think
You can start a PHP script to run in the background using exec(), redirecting the script's output somewhere else (e.g. /dev/null). I think that's the best you will get. From the manual:
Note: If a program is started with this function, in order for it to continue running in the background, the output of the program must be redirected to a file or another output stream. Failing to do so will cause PHP to hang until the execution of the program ends.
There are several notes and pointers in the User Contributed Comments, for example this snippet that allows background execution on both Windows and Linux platforms.
Of course, that PHP script will not share the state, or any data, of the PHP script you are running. You will have to initialize the background script as if you are making a completely new request.
Since PHP does not have the ability to support multithreading, I don't quite know how to advise you. Each time a new script is loaded, a new instance of PHP is loaded, so if your server can handle X many PHP instances, then you can do what you want.
Here is something however:
Code for background execution
<?php
// Launch a command in the background and return immediately,
// on both Windows and Linux.
function execInBackground($cmd) {
    if (substr(php_uname(), 0, 7) == "Windows") {
        // "start /B" runs the command without opening a new window.
        pclose(popen("start /B " . $cmd, "r"));
    } else {
        // Redirect output and append "&" so the shell doesn't wait.
        exec($cmd . " > /dev/null &");
    }
}
?>
Found at http://php.net/manual/en/function.exec.php#86329
Do you really want to have multi-threading in PHP?
Or do you just want to execute a PHP script every second? For the latter case, a cron-job-like "execute this file every second" approach via Linux console tools should be enough.
If your task is to make a lot of HTTP requests you can use curl multi. There is a good library for doing it: http://code.google.com/p/rolling-curl/
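As a rough illustration of the raw curl_multi functions that rolling-curl builds on - the URLs are placeholders - requests are fired in parallel and the loop simply waits for all of them to complete:
<?php
$urls = array('http://server1/dothis.php?jobid=1', 'http://server1/dothis.php?jobid=2');

$mh = curl_multi_init();
$handles = array();

foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch);
    $handles[] = $ch;
}

// Drive all transfers until every request has finished.
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh, 1.0);
} while ($running > 0);

foreach ($handles as $ch) {
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);
?>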
As everyone's already mentioned, PHP doesn't natively support multi-threading and the workarounds are, well, workarounds...
That said, have you heard of the Facebook PHP Compiler? Basically it compiles your PHP to highly optimized C++ and uses g++ to compile it. This opens up a world of opportunities, including but not limited to multi-threading!
The project is open-source and it's on GitHub.
If you just want to post an HTTP request, just do it using the PHP cURL library. It will solve your issue.
I know about PHP not being multi-threaded, but I talked with a friend about this: if I have a large algorithmic problem I want to solve with PHP, isn't the solution simply to use the curl_multi_xxx interface and start n HTTP requests on the same server? This is what I would call PHP-style multi-threading.
Are there any problems with this in the typical web server environment? The master request, which is waiting on curl_multi_exec, shouldn't have any of that time counted against its maximum runtime or memory limit.
I have never seen this promoted anywhere as a solution to prevent a script from being killed by overly restrictive admin settings for PHP.
If I add this as a feature into a popular PHP system, will there be server admins hiring a Russian mafia hitman to get revenge for this hack?
"If I add this as a feature into a popular PHP system will there be server admins hiring a Russian mafia hitman to get revenge for this hack?"
No, but it's still a terrible idea, for no other reason than that PHP is supposed to render web pages, not run big algorithms. I see people trying to do this in ASP.NET all the time. There are two proper solutions:
1. Have your PHP script spawn a process that runs independently of the web server and updates a common data store (probably a database) with information about the progress of the task, which your PHP scripts can access.
2. Have a constantly running daemon that checks for jobs in a common data store, to which the PHP scripts can issue jobs and in which they can view the progress of currently running jobs.
By using curl, you are adding a network timeout dependency into the mix. Ideally you would run everything from the command line to avoid timeout issues.
PHP does support forking (pcntl_fork). You can fork some processes and then monitor them with something like pcntl_waitpid. You end up with one "parent" process to monitor the children it spawned.
Keep in mind that while one process can start up, load everything, then fork, you can't share things like database connections. So each forked process should establish its own. I've used forking for up to 50 processes.
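A stripped-down sketch of that fork-and-wait pattern - the number of children and the work itself are placeholders:
<?php
$children = array();

for ($i = 0; $i < 5; $i++) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        die('could not fork');
    } elseif ($pid === 0) {
        // Child: open its own DB connection here (connections are not shared),
        // do its slice of the work, then exit.
        exit(0);
    }
    $children[] = $pid; // parent: remember the child's PID
}

// Parent: wait for every child to finish.
foreach ($children as $pid) {
    pcntl_waitpid($pid, $status);
}
?>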
If forking isn't available for your install of PHP, you can spawn a process as Spencer mentioned. Just make sure you spawn the process in such a way that it doesn't stop processing of your main script. You also want to get the process ID so you can monitor the spawned processes.
exec("nohup /path/to/php.script > /dev/null 2>&1 & echo $!", $output);
$pid = $output[0];
You can also use the above exec() setup to spawn a process started from a web page and get control back immediately.
Out of curiosity - what is your "large algorithmic problem" attempting to accomplish?
You might be better to write it as an Amazon EC2 service, then sell access to the service rather than the package itself.
Edit: you now mention "mass emails". There are already services that do this, they're generally known as "spammers". Please don't.
Lothar,
As far as I know, PHP doesn't work with services, unlike its competitors, so you don't have a way for PHP to know how much time has passed unless you're constantly interrupting the process to check the elapsed time... So, IMO, no, you can't do that in PHP :)