I have read the other questions on SO with a similar title, but that's not what this question is about. I know HOW to execute a PHP script from another PHP script. The problem is, when I do so, it uses far too much CPU. I would like to know how to reduce this.
I have a simple front-controller-like script called index.php. It processes GET requests from a client and depending on the "action" parameter passed, it sends the request to the appropriate file to handle it. For example, this is a client request:
xhttp.open("GET", serverURL + "?action=doSomething" + "&userID=" + user.ID + "&time=" + lastServerTime, true);
index.php has an array, $url_map, that maps the "action" parameter to the appropriate file, which it then executes:
exec('php ' . $url_map[$action] . ' "' . $parameter1 . '"' . ' "' . $parameter2 . '" 2>&1', $output, $return_value);
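For illustration, the mapping and the dispatch might look roughly like this (the action names, file names and sanitisation below are placeholders, not the real code):

<?php
// hypothetical example of the $url_map dispatch described above
$url_map = array(
    'doSomething'     => 'do_something.php',
    'doSomethingElse' => 'do_something_else.php',
);

$action     = $_GET['action'];
$parameter1 = $_GET['userID'];
$parameter2 = $_GET['time'];

if (isset($url_map[$action])) {
    exec('php ' . $url_map[$action] . ' ' . escapeshellarg($parameter1) . ' '
        . escapeshellarg($parameter2) . ' 2>&1', $output, $return_value);
}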
For testing purposes, I have created a PHP script that does nothing except measure CPU utilisation and dump it to a log file:
<?php
function varDumpToFile($parameter1) {
    $file   = 'log.txt';
    $dump   = $parameter1;
    $output = print_r($dump, true);
    file_put_contents($file, $output, FILE_APPEND | LOCK_EX);
}
varDumpToFile(`ps -eo pcpu,pid,user,args --no-headers | sort -t. -nk1,2 -k4,4 -r | head -n 5`);
?>
This produces a log file that looks like this:
9.0 3123052 user /opt/cpanel/ea-php56/root/usr/bin/php cputest.php 10 147424 1537625595
Clearly, a PHP script shouldn't take 9% of CPU to execute. For comparison, I've run the same script directly accessing it via a GET request:
0.1 3186198 user lsphp:ic_html/dev/php/cputest.php
0.1% is more like it. But why does calling this PHP script from another PHP script use so much CPU? Is it because I have to execute a "new instance" of PHP when I exec PHP, which has a lot of overhead? If so, is there a way to exec a PHP script using an "already running" instance of PHP? Or is there another way of doing this?
I always say "when in doubt, look at the PHP source code". Here, for instance: while doing exec, you have to fork the process, create a new stream, read from the input buffer, and so on.
And also, although PHP code is compiled to opcodes before it runs, the newly forked process must run the opcode compiler itself to generate those opcodes (instructions similar to Java bytecode) and then execute them. You can read all about it here. In the end you run the compiler twice, once for each process separately.
Is it worth 9% of your CPU? I have no idea. Maybe. Maybe not. Who knows.
"Better solution"? Upgrade to the latest version of PHP. PHP 5.6 is not supported anymore and security updates will cease in 3 months. An even better solution: keep normal, object-oriented, maintainable code without using exec. IMO it's okay to play around with exec like you are. But if it's your production code, I pray for the souls of those who will maintain your code after you.
Whichever way you run your application, be it mod_php or PHP-FPM, it relies on having worker processes ready to handle your requests. Process management is built in: they will do their best to keep as many workers idle as you specify and to reuse them, avoiding exactly this problem of having to fork processes at the least desirable moment.
Not only is there overhead in executing new processes, but the execution environment will be completely different too. If you look into your PHP configuration there will be several php.ini files, one for each specific environment. This means that one environment could have different modules enabled or an outright different configuration. It's not uncommon for CLI scripts to have max_execution_time or memory_limit set to unlimited. This can affect resource usage on your server, and it's also a pain to maintain.
Also, since your script will be running in a brand new process in a different execution environment, it won't have access to some variables (like $_SERVER or $_POST) or to capabilities like sending headers.
And there's this thing called shared memory. As @Alex mentions, scripts have to be compiled. If you have the opcode cache enabled (which you should), the bytecode gets cached when compiled, and this compilation step can be skipped if the resulting bytecode is already there. For this to work you need a persistent running process that can keep this memory around. If you are creating a new process it can't access this shared area and has to do the compilation all by itself.
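Which is one more reason to avoid exec here and dispatch in-process instead; a minimal sketch of what that could look like for the front controller in the question, assuming the handlers can be turned into includable scripts (all names here are made up):

<?php
// hypothetical in-process dispatch: no fork, no second compile,
// and the opcode cache of the running worker is reused
$action_map = array(
    'doSomething' => __DIR__ . '/actions/do_something.php',
);

if (isset($action_map[$_GET['action']])) {
    $parameter1 = $_GET['userID'];
    $parameter2 = $_GET['time'];
    include $action_map[$_GET['action']]; // the handler sees $parameter1/$parameter2
}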
READ FURTHER BELOW at CLI, FOR THE CLI QUESTION, WHICH WAS JUST ADDED TO THE CONVERSATION! THX!
I have written a script which processes an XML file of around 160'000 entries (48.1 MB) and a text file of 150'000 entries (31.1 MB), including some directory searches for external files, heavy interlinking and recursive checks; the result is formatted and saved into HTML files.
I did review the program a couple of times and ended up with the most efficient code I could think of. This is a local program and the generator doesn't need to run regularly. One could argue that I should use another language than PHP, but PHP with simplexml, etc. just works best for me and for this purpose. Also, a set_time_limit('70000') doesn't bother me.
So here is my question: is it possible to make Apache2 on my Linux system use all 4 of my CPU cores to run my PHP script?
Even if I split the process and make several requests simultaneously, the CPU usage doesn't go above one core at a time.
I googled this topic but couldn't find a solution, so I may just have to run it overnight; still, I would appreciate some help to boost that thing!!!
ADDED INFO - And here is a picture of my processes:
CLI:
I need to call my index.php in the Linux terminal to execute it. But I also want to send four POST variables ($_POST['example']) to the script. On top of that, I would like my echo output written to some output file. Could anyone help quickly with the terminal command and the PHP code to pick up those 4 POST variables inside:
if (PHP_SAPI === 'cli')
{
// ...
}
? ...sorry but this is my first php-cli interaction. Thx!
No, a single PHP script will never use multiple threads and thus always run on single core.
Depending on how much the things you do depend on each other, you couldn't easily split them across multiple threads anyway.
EDIT: Author's response
This is not a solution but a nice workaround. I clone my virtual machine with the Linux/Apache2 install and kick off the same process, but on different parts of the file, on different VMs. That lets the host system give one core to each virtual system, and that way I could cut the processing time down by roughly a factor of 4. Thanks for your posts!
===============
If it's local, and you want to run it every now and then, you should probably just invoke it from a cron job. That way, you can spawn a process for each task you are doing. If you really do want to use PHP for it, you can even invoke PHP to do it from the cron line.
Nonetheless, it sounds like you're doing an inherently single-threaded process anyway, and if you want it faster, you should probably use something that isn't PHP for this.
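For example, a crontab entry along these lines would run the generator once a night (the paths are placeholders):

# run the generator at 03:00 every night and keep its output in a log
0 3 * * * /usr/bin/php /path/to/generator/index.php >> /var/log/generator.log 2>&1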
Maybe you can use Spork! It's a PHP library that allows you to fork the PHP process into multiple ones.
<?php
use Spork\Deferred\DeferredFactory;
use Spork\ProcessManager;

$manager = new ProcessManager(new DeferredFactory());
$manager->fork(function () {
    // do something in another process!
})->then(function ($output, $status) {
    // do something in the parent process when it's done!
});
https://github.com/kriswallsmith/spork
SOLUTION, THX TO ThiefMaster and Zebediah49 for recommending the CLI, and to my friend who supported me with the links: http://ch.php.net/manual/en/reserved.variables.argv.php / http://ch.php.net/manual/en/function.getopt.php
And here is how I call the PHP through the CLI:
//whenRunFromCLI
//callCLI
//php index.php './data/xyfullFile1.xml' './data/xxfullFile2.utf' 0 60000
//php index.php './data/xyfullFile1.xml' './data/xxfullFile2.utf' 60000 120000
//php index.php './data/xyfullFile1.xml' './data/xxfullFile2.utf' 120000 all
if (PHP_SAPI === 'cli') {
    $_POST['xml']     = $argv[1];
    $_POST['example'] = $argv[2];
    #$_POST['rangeFrom'] = $argv[3];
    #$_POST['rangeTo']   = $argv[4];
}
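To get the echo output into a file, as asked above, the same calls can simply redirect stdout (the log file names are just examples):

php index.php './data/xyfullFile1.xml' './data/xxfullFile2.utf' 0 60000 > part1.log 2>&1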
And the result of calling the PHP file in three terminals:
I know, I must give some more RAM to my virtual machine; lucky that I still have 8 GB spare ;-)
Cheers and peace!
I have 1 cronjob that runs every 60 minutes but for some reason, recently, it is running slow.
Env: centos5 + apache2 + mysql5.5 + php 5.3.3 / raid 10/10k HDD / 16gig ram / 4 xeon processor
Here's what the cronjob do:
1. Parse the last 60 minutes' data:
a) one process parses user agents and saves the data to the database
b) one process parses impressions/clicks on the website and saves them to the database
2. From the data in step 1:
a) build a small report and send emails to the administrator/business
b) save the report into a daily table (available in the admin section)
I now see 8 processes (of the same file) when I run the command ps auxf | grep process_stats_hourly.php (found this command on Stack Overflow).
Technically I should only have 1, not 8.
Is there any tool in Cent OS or something I can do to make sure my cronjob will run every hour and not overlapping the next one?
Thanks
Your hardware seems to be good enough to process this.
1) Check if you already have hanging processes. Using ps auxf (see tcurvelo's answer), check if you have one or more processes that take too many resources. Maybe you don't have enough resources left to run your cronjob.
2) Check your network connections:
If your database and your cronjob are on different servers, you should check what the response time is between the two machines. Maybe you have network issues that make the cronjob wait for packets to come back.
You can use: Netcat, Iperf, mtr or ttcp
3) Server configuration
Is your server configured correctly? Are your OS and MySQL set up correctly? I would recommend reading these articles:
http://www3.wiredgorilla.com/content/view/220/53/
http://www.vr.org/knowledgebase/1002/Optimize-and-disable-default-CentOS-services.html
http://dev.mysql.com/doc/refman/5.1/en/starting-server.html
http://www.linux-mag.com/id/7473/
4) Check your database:
Make sure your database has the correct indexes and make sure your queries are optimized. Read this article about the EXPLAIN command.
If a query over a few hundred thousand records takes a long time to execute, that will affect the rest of your cronjob; if you have such a query inside a loop, it's even worse.
Read these articles:
http://dev.mysql.com/doc/refman/5.0/en/optimization.html
http://20bits.com/articles/10-tips-for-optimizing-mysql-queries-that-dont-suck/
http://blog.fedecarg.com/2008/06/12/10-great-articles-for-optimizing-mysql-queries/
5) Trace and optimize your PHP code
Make sure your PHP code runs as fast as possible.
Read these articles:
http://phplens.com/lens/php-book/optimizing-debugging-php.php
http://code.google.com/speed/articles/optimizing-php.html
http://ilia.ws/archives/12-PHP-Optimization-Tricks.html
A good technique to validate your cronjob is to trace your cronjob script:
Based on your cronjob process, add some debug tracing, including how much memory was used and how much time it took to execute each step. For example:
<?php
echo "\n-------------- DEBUG --------------\n";
echo "memory (start): " . memory_get_usage(TRUE) . "\n";
$start = microtime(TRUE);

// some process

$end = microtime(TRUE);
echo "\n-------------- DEBUG --------------\n";
echo "memory after some process: " . memory_get_usage(TRUE) . "\n";
echo "executed time: " . ($end - $start) . "\n";
By doing that you can easily find which process takes how much memory and how long it takes to execute it.
6) External servers/web service calls
Does your cronjob call external servers or web services? If so, make sure these respond as fast as possible. If you request data from a third-party server and that server takes a few seconds to return an answer, it will affect the speed of your cronjob, especially if these calls are made in loops.
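One way to keep a slow third party from stalling the whole cronjob is to put a hard timeout on those calls; a minimal sketch with curl (the URL and the limits are just examples):

<?php
$ch = curl_init('https://api.example.com/stats');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);    // give up connecting after 5 seconds
curl_setopt($ch, CURLOPT_TIMEOUT, 15);          // abort the whole request after 15 seconds
$response = curl_exec($ch);
if ($response === false) {
    error_log('external call failed: ' . curl_error($ch));
}
curl_close($ch);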
Try that and let me know what you find.
The output of ps also shows when each process started (see the STARTED column).
$ ps auxf
USER PID %CPU %MEM VSZ RSS TTY STAT STARTED TIME COMMAND
root 2 0.0 0.0 0 0 ? S 18:55 0:00 [kthreadd]
^^^^^^^
(...)
Or you can customize the output:
$ ps axfo start,command
STARTED COMMAND
18:55 [kthreadd]
(...)
Thus, you can be sure if they are overlapping.
You should use a lockfile mechanism within your process_stats_hourly.php script. It doesn't have to be anything overly complex; you could have PHP write the PID that started the process to a file like /var/mydir/process_stats_hourly.txt. So if it takes longer than an hour to process the stats and cron kicks off another instance of the process_stats_hourly.php script, the new instance can check whether the lockfile already exists; if it does, it will not run.
However you are left with the problem of how to "re-queue" the hourly script if it did find the lock file and couldn't start.
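A minimal sketch of that idea, assuming the POSIX extension is available and the script can write to /var/mydir (stale-lock handling is simplified):

<?php
$lockFile = '/var/mydir/process_stats_hourly.txt';

if (file_exists($lockFile)) {
    $oldPid = (int) trim(file_get_contents($lockFile));
    // if that PID is still alive, the previous run hasn't finished yet
    if ($oldPid > 0 && posix_kill($oldPid, 0)) {
        exit("previous run (PID $oldPid) still in progress\n");
    }
}

file_put_contents($lockFile, getmypid());

// ... do the hourly stats work ...

unlink($lockFile);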
You might use strace -p 1234, where 1234 is the id of one of the processes that is running too long. Perhaps you'll understand why it is so slow, or even blocked.
Is there any tool in Cent OS or something I can do to make sure my cronjob will run every hour and not overlapping the next one?
Yes. CentOS' standard util-linux package provides a command-line convenience for filesystem locking. As Digital Precision suggested, a lockfile is an easy way to synchronize processes.
Try invoking your cronjob as follows:
flock -n /var/tmp/stats.lock process_stats_hourly.php || logger -p cron.err 'Unable to lock stats.lock'
You'll need to edit paths and adjust for $PATH as appropriate. That invocation will attempt to lock stats.lock, spawning your stats script if successful, otherwise giving up and logging the failure.
Alternatively your script could call PHP's flock() itself to achieve the same effect, but the flock(1) utility is already there for you.
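For completeness, the flock()-from-PHP variant could look like this; just a sketch, with the lock path as an assumption:

<?php
$fp = fopen('/var/tmp/stats.lock', 'c'); // 'c' creates the file if it doesn't exist
if (!flock($fp, LOCK_EX | LOCK_NB)) {
    // another instance holds the lock; bail out instead of overlapping
    exit("stats job already running\n");
}

// ... do the hourly stats work ...

flock($fp, LOCK_UN);
fclose($fp);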
How often is that logfile rotated?
A log-parsing job suddenly taking longer than usual sounds like the log isn't being rotated and is now too big for the parser to handle efficiently.
Try resetting the logfile and see if the job runs faster. If that solves the problem, I recommend logrotate as a means of preventing the problem in the future.
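If that turns out to be the issue, a simple logrotate rule keeps the file bounded; a sketch (the path and retention are placeholders):

/var/log/myapp/stats.log {
    daily
    rotate 14
    compress
    missingok
    notifempty
}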
You could add a step to the cronjob to check the output of your above command:
ps auxf | grep process_stats_hourly.php
Keep looping until the command returns nothing, indicating that the process isn't running, then allow the remaining code to execute.
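A sketch of that check from within PHP (the brackets in the grep pattern stop it from matching its own grep process):

<?php
// loop until no process_stats_hourly.php instance shows up in ps anymore
while (true) {
    $out = array();
    exec("ps auxf | grep '[p]rocess_stats_hourly.php'", $out);
    if (count($out) === 0) {
        break; // nothing running, safe to continue
    }
    sleep(30);
}

// ... remaining code ...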
I created a script that runs in the background using the ignore_user_abort() function. However, I was foolish enough not to insert any sort of code to make the script stop and now it is sending e-mails every 30 seconds...
Is there any way to stop the script? I am in a shared hosting, so I don't have access to the command prompt, and I don't know the PID.
Is there any way to stop the script? I am in a shared hosting, so I don't have access to the command prompt, and I don't know the PID.
Then no.
But are you sure you don't have any shell access? Even via PHP? If you do, you could try....
<?php
print `ps -ef | grep php`;
...and if you can identify the process from that then....
<?php
$pid=12345; // for example.
print `kill -9 $pid`;
And even if you don't have access to run shell commands, you may be able to find the pid in /proc (on a linux system) and terminate it using the POSIX extension....
<?php
$ps = glob('/proc/[0-9]*');
foreach ($ps as $p) {
    if (is_dir($p) && is_writeable($p)) {
        print "proc= " . basename($p);
        $cmd = file_get_contents($p . '/cmdline');
        print " / " . $cmd;
        if (preg_match('/(php).*(myscript\.php)/', $cmd)) {
            posix_kill((int) basename($p), SIGKILL);
            print " xxxxx....";
            break;
        }
        print "\n";
    }
}
I came to this thread yesterday! I had, by mistake, an infinite loop in a page that was not supposed to be visited, and it pushed my I/O and CPU usage to 100. The I/O was caused by some PHP errors that were being logged, and the log file size was growing beyond anything you could imagine.
None of the above tricks worked on my shared hosting.
MY SOLUTION
In cPanel, go to PHP Version.
Select any PHP version other than the current one, for the time being.
Then Apply Changes.
REASON WHY IT WORKED
The script with the infinite loop and the PHP errors was just a process, so I needed to kill it. Changing the PHP version forces a restart of services like PHP and Apache, and since a restart was involved, the earlier processes were killed and I was relieved as I/O and CPU usage stabilized. Also, I fixed that bug beforehand, before changing the PHP version :)
How did you deploy the script? Surely you can just remove it (if that's an acceptable option). Otherwise, modify it and insert some logic to only allow it to send a mail once every n minutes/hours/days, based on the server time.
Re. stopping the script from executing (or rather the system trying to execute it): how did you schedule it for execution? Is it some type of GUI to a crontab or something? Can you not just undo what you did there (seeing as you have no access to the command line/terminal)?
rob ganly
Simply:
Call the support, get it cancelled.
Next time, don't execute something you can't control.
I am coding a PHP-scripted web page that is intended to accept the filename of a JFFS2 image which was previously uploaded to the server. The script is to then re-flash a partition on the server with the image, and output the results. I had been using this:
$tmp = shell_exec("update_flash -v " . $filename . " 4 2>&1");
echo '<h3>' . $tmp . '</h3>';
echo verifyResults($tmp);
(The verifyResults function will return some HTML that indicates to the user whether the update command completed successfully. I.e., in the case that the update completes successfully, display a button to restart the device, etc.)
The problem with this is that the update command takes several minutes to complete, and the PHP script blocks until the shell command is complete before it returns any of the output. This typically means that the update command will continue running, while the user will see an HTTP 504 error (at worst) or wait for the page to load for several minutes.
I was thinking about doing something like this instead:
shell_exec("rm /tmp/output.txt");
shell_exec("update_flash -v " . $filename . " 4 >> /tmp/output.txt 2>&1 &");
echo '<div id="output"></div>';
echo '<div id="results"></div>';
This would theoretically put the command in the background and append all output to /tmp/output.txt.
And then, in a Javascript function, I would periodically request getOutput.php, which would simply print the contents of /tmp/output.txt and stick it into the "output" div. Once the command is completely done, another Javascript function would process the output and display a result in the "results" div.
But the problem I see here is that getOutput.php will eventually become inaccessible during the process of updating the device's flash memory, because it's on the partition that is targeted for the update. So that could leave me in the same position as before, albeit without the 504 or a seemingly eternally-loading page.
I could move getOutput.php to another partition in the device, but then I think I would still have to do some funky stuff with the webserver configuration to be able to access it there (a symlink to it from the webroot would, like any other file, eventually be overwritten during the re-flash).
Is there any other way of displaying the output of the command as it runs, or should I just make do with the solution I have?
Edit 1: I'm currently testing some solutions. I'll update my question with results later.
Edit 2: It seems that the filesystem does not get overwritten as I had originally thought. Instead, the system seems to mount the existing filesystem in read-only mode, so I can still access getOutput.php even after the filesystem is re-flashed.
The second solution I described in my question does seem to work, in combination with using popen (as mentioned in an answer below) instead of shell_exec. The page loads, and via Ajax I can display the contents of output.txt.
However, it seems that output.txt does not reflect the output from the re-flash command in real time; it seems to display nothing until the update command returns from execution. I will need to do further testing to see what's going on here.
Edit 3: Never mind, it looks like the file is current as I access it. I was just hitting a delay while the kernel did some JFFS2-related tasks triggered by my use of the partition on which the source JFFS2 image is stored. I don't know why, but this apparently causes all PHP scripts to block until it's done.
To work around that, I'm going to put the update command invocation in a separate script and request it via Ajax-- that way, the user will at least receive some prepackaged feedback while technically still waiting on the system.
Look at popen: http://it.php.net/manual/en/function.popen.php
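A sketch of how that could be wired up for the scenario above; the flushing details are assumptions and may need tuning for the webserver's output buffering:

<?php
// stream the output of the long-running command to the client as it is produced
$allOutput = '';
$handle = popen('update_flash -v ' . escapeshellarg($filename) . ' 4 2>&1', 'r');
if ($handle === false) {
    die('could not start update_flash');
}
while (($line = fgets($handle)) !== false) {
    $allOutput .= $line;
    echo htmlspecialchars($line) . '<br />';
    if (ob_get_level() > 0) {
        ob_flush();
    }
    flush();
}
pclose($handle);
echo verifyResults($allOutput);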
Interesting scenario.
My first thought was to do something regarding proc_* and $_SESSION, but I'm not sure if that will work or not. Give it a try, but if not...
If you're worried about the file being flashed during the process, you could always instantiate a mysql database in the secondary process and write to that. The database can exist on another partition, and you can address it by local ip and the system will take care of the routing.
Edit
When I mentioned proc_* with sessions, I meant something similar to this where $descriptorspec would become:
$_SESSION = array(
1 => array("pipe", "w"),
);
However I kind of doubt that will work. The process will end up writing to the $_SESSION in memory which no longer exists once the first script is killed.
Edit 2
ACTUALLY, on that note, you could install memcache and have your secondary process write directly to memory, which can then be re-read by your web-interfaced process.
If you wipe the DocRoot there is no resource/script that can respond to requests from the user during this time. Therefore you have to send updates to the user in the same request that does the wipe. This requires you to start the shell process and immediately return to PHP. This can be accomplished with pcntl_fork() and pcntl_exec(). Your PHP script should now continuously send the output of the shell script to the client. If the shell script appends to a file in /tmp, you could fpassthru() that file and clear it until the shell script ends.
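A rough sketch of that approach, assuming the pcntl extension is available on the device (the paths and polling interval are placeholders):

<?php
$logFile = '/tmp/output.txt';
@unlink($logFile);

$pid = pcntl_fork();
if ($pid === 0) {
    // child: replace this process with a shell running the flash command
    pcntl_exec('/bin/sh', array('-c',
        'update_flash -v ' . escapeshellarg($filename) . ' 4 >> ' . $logFile . ' 2>&1'));
    exit(1); // only reached if pcntl_exec() fails
}

// parent: keep streaming whatever the child has written so far
$offset = 0;
while (true) {
    $done = (pcntl_waitpid($pid, $status, WNOHANG) > 0);
    clearstatcache();
    if (is_file($logFile) && filesize($logFile) > $offset) {
        $fh = fopen($logFile, 'r');
        fseek($fh, $offset);
        echo fread($fh, filesize($logFile) - $offset);
        $offset = ftell($fh);
        fclose($fh);
        flush();
    }
    if ($done) {
        break;
    }
    sleep(1);
}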
Regarding your However:
My guess is you are trying to use the file as a stream. I haven't done any production tests, but I believe that the file will only be written back to disk on fclose().
If you are writing to the file continually in script #2, those writes are actually going directly into memory until the file is closed.
Again - I cannot verify this, but if you want to test it, try re-opening and closing the file for every write. This will confirm or deny my theory and you can modify your approach accordingly.
We are running PHP on a Windows server (a source of many problems indeed, but migrating is not an option currently). There are a few points where a user-initiated action will need to kick off a few things that take a while and about which the user doesn't need to know if they succeed or fail, such as sending off an email or making sure some third-party accounts are updated. If I could just fork with pcntl_fork(), this would be very simple, but the PCNTL functions are not available in Windows.
It seems the closest I can get is to do something of this nature:
exec( 'php-cgi.exe somescript.php' );
However, this would be far more complicated. The actions I need to kick off rely on a lot of context that already will exist in the running process; to use the above example, I'd need to figure out the essential data and supply it to the new script in some way. If I could fork, it'd just be a matter of letting the parent process return early, leaving the child to work on a few more things.
I've found a few people talking about their own work in getting various PCNTL functions compiled on Windows, but none seemed to have anything available (broken links, etc).
Despite this question having practically the same name as mine, it seems the problem was more execution timeout than needing to fork. So, is my best option to just refactor a bit to deal with calling php-cgi, or are there other options?
Edit: It seems exec() won't work for this, at least not without me figuring some other aspect of it, as it waits until the call returns. I figured I could use START, sort of like exec( 'start php-cgi.exe somescript.php' );, but it still waits until the other script finishes.
How about installing PsExec and using the -d (don't wait) option?
exec('psexec -d php-cgi.exe somescript.php');
Get PSExec and run the command:
exec("psexec -d php-cgi.exe myfile.php");
PSTools are a good patch in, but I'll leave this here:
If your server runs Windows 10 and has the latest updates, you can install a Linux subsystem, which has its own kernel that supports native forking.
This is supported by Microsoft officially.
Here's a good guide on how to do it.
Once you've installed the subsystem itself, you need to install php on the subsystem.
Your windows "c:\" drive can be found under "/mnt/c", so you can run your php from the subsystem, which supports forking (and by extension the subsystem's php can use pcntl_fork).
Example: php /mnt/c/xampp/htdocs/test.php
If you want to run the subsystem's php directly from a windows command line you can simply use the "wsl" command.
Assuming you're running this from under "C:\xampp\htdocs\"
Example: wsl php main.php
The "wsl" command will resolve the path for you, so you don't need to do any dark magic, if you call the command under c:\xampp\htdocs, the subsystem will resolve it as "/mnt/c/xampp/htdocs/".
If you're running your server as an apache server, you don't really need to do anything extra, just stop the windows apache server and start the linux one and you're done.
Obviously you'll need to install all the missing php modules that you need on the subsystem.
You can create a daemon/background process to run the code (e.g. sending emails), and the request would just have to add items to a queue and let the daemon do the heavy lifting.
For example, a file send_emails.bat:
cls
C:\PHP533\php.exe D:\web\server.php
exit
Open Windows Task Scheduler and have the above send_emails.bat run every 30 minutes. Make sure only one instance runs at a time, or you might run each task multiple times or send each email twice. I say 30 minutes so that if something breaks temporarily (memory issues, database unavailable, etc.), it will restart within 30 minutes rather than relying on a never-ending process that just stops. The following is a skeleton daemon... not complete or tested, I am just typing out an example:
<?php
set_time_limit(60 * 30);     // don't run for more than 30 minutes
$timeout = time() + 60 * 29; // stop looping after 29 minutes

// $db is assumed to be an already-connected mysqli instance
while (time() < $timeout)
{
    // grab emails from database
    $result = $db->query('SELECT id, subject, body, to_email FROM email_queue');
    if ($result->num_rows == 0)
    {
        sleep(10); // so we are not taxing the database
    }
    else
    {
        while ($row = $result->fetch_assoc())
        {
            // send email, then delete (or flag) the row so it isn't sent again
        }
    }
}
exit;
?>
Finally you just need the request to add the item to the queue in a database, and let the daemon handle the heavy lifting.
$db->query("INSERT INTO email_queue (to_email, subject, body) VALUES ('customer@email.com', 'important email', '<b>html body!</b>')");