I'm trying to develop a crontab task that checks my email every 5 seconds. Normally I could run it every minute instead of every 5 seconds, but while reading some other posts with no solution, I found one with the same problem as mine: the script, after a period of time, was stopping. That by itself is not a real problem, because I can configure a crontab task and use sleep(5). I also have the same 1and1 server as the other question, which I'm including here:
PHP script stops running arbitrarily with no errors
The real problem I had when I tried to solve this via crontab is that every minute a new PID was created, so within an hour I could have almost 50 processes doing the same thing at the same time.
Here is the .php file called by crontab every minute:
date_default_timezone_set('Europe/Madrid');
require_once ( $_SERVER['DOCUMENT_ROOT'] . '/folder1/path.php' );
require_once ( CLASSES . 'Builder.php');
$UIModules = Builder::getUIModules();
$UIModules->getfile();
So I found a solution by checking the PID table. The idea is: if 2 processes are running in the PID table, the last process is still working, so the new one just finishes without doing anything; if only 1 process is running, the previous one has expired, so this new one can take over. The approach looks something like the following code:
$var_aux = exec("ps -A | grep php");
if (!is_array($var_aux)) {
    date_default_timezone_set('Europe/Madrid');
    require_once ( $_SERVER['DOCUMENT_ROOT'] . '/folder1/path.php' );
    require_once ( CLASSES . 'Builder.php');
    $UIModules = Builder::getUIModules();
    $UIModules->getfile();
}
I'm not sure about the is_array($var_aux) condition, because $var_aux always gives me the last PID line as a string of 28 characters; since we want to detect more than one process, the condition could even change to if (strlen($var_aux) < 34). Note: I've given the length some margin, because PIDs sometimes go above 9999, which adds one more character.
The main problem I found with this is that the exec statement only gives me the last process; in other words, it always returns a string with a length of 28 (the PID line for that script).
I don't know if what I've proposed is a crazy idea, but is it possible to get the whole PID table with PHP?
You can use a much simpler solution than emulating crontab in PHP: use crontab itself.
Make multiple entries so that your PHP program is called every 5 seconds.
A good description of how to set up crontab to perform sub-minute actions can be found here:
https://usu.li/how-to-run-a-cron-job-every-x-seconds
This solution requires at most 12 processes running every minute.
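For illustration, a sketch of what such a crontab could look like (the script path here is hypothetical; the linked article covers the pattern in detail). Every entry fires at the top of the minute, sleeps a fixed offset, then runs the script:
* * * * * php /path/to/check_email.php
* * * * * sleep 5; php /path/to/check_email.php
* * * * * sleep 10; php /path/to/check_email.php
# ...and so on, one entry per 5-second offset, up to:
* * * * * sleep 55; php /path/to/check_email.php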
I'm having some trouble using the PHP exec() function. Whenever the script I'm attempting to run is short, exec() works just fine, but if the script takes any more than a second or so, it fails. Note that I've attempted to run the long script manually on the command line, and it works just fine. It seems as though the PHP interpreter is killing my external script if it takes any longer than a second or so to run. Any thoughts or suggestions? Here is my code:
<?php
$fileName = "foobar.docx";
$argVar = $fileName;
exec("python3 /var/www/html/jan8/alexandrina.py /var/www/html/jan8/$argVar");
echo "$output";
?>
And here is my script:
#!/usr/bin/env python3
import docx
import sys
docxFile = "".join(sys.argv[1:])
# The Three Lines Below Create a New Variable "htmlFile"
# "htmlFile" is the same as "docxFile", except ".docx" is cut off
# and replaced with ".html"
myNumber = len(docxFile) - 5
htmlFile = docxFile[0:myNumber]
htmlFile = htmlFile + '.html'
def generateHTML(filename):
    doc = docx.Document(filename)
    fullText = []
    for para in doc.paragraphs:
        fullText.append('<p>')
        fullText.append(para.text)
        fullText.append('</p>')
        fullText.append('\n')
    return '\n'.join(fullText)

file = open(htmlFile, "w")
file.write(generateHTML(docxFile))
file.close()
print("end python script")
Additional notes: I've increased the max execution time limits in php.ini, but I don't think that should matter, as the "long script" should only take a few seconds to run. Also, when I refer to the "short script" and the "long script", I'm actually referring to the same script; the difference between them is just the execution time, which varies depending on the size of the file I'm asking the script to process. Anyway... any suggestions would really be appreciated!
Ordinarily, PHP's exec function blocks until the command you run has completed; i.e., the PHP script halts, waiting for the command to finish, before continuing with the rest of your script. I was half thinking that your server was experiencing a max_execution_time timeout, but you've clearly stated that even just a couple of seconds is too long and that even these fairly short scripts are having trouble.
A couple of solutions occur to me. The simplest one is to alter the python command so that a) any output is routed to a file or output stream and b) the process is run in the background. According to the docs on exec:
If a program is started with this function, in order for it to continue running in the background, the output of the program must be redirected to a file or another output stream. Failing to do so will cause PHP to hang until the execution of the program ends.
I'd also like you to make use of the two additional optional parameters of the exec function.
$fileName = "foobar.docx";
$argVar = $fileName;
$cmd = "python3 /var/www/html/jan8/alexandrina.py /var/www/html/jan8/$argVar";
// modify your command to toss output, background the process, and output the process id
$cmd_modified = $cmd . " >/dev/null & echo \$!";
$cmd_output = NULL; // this will be an array of output
$cmd_return_value = NULL; // this will be the return value of the script
exec($cmd_modified, $cmd_output, $cmd_return_value);
echo "exec has completed</br>";
echo "output:<br>" . print_r($cmd_output, TRUE) . "<br>";
echo "return value: " . print_r($cmd_return_value, TRUE);
This may or may not help. If it does not, we still might be able to solve the problem using posix commands.
EDIT: According to crispytx, the long scripts result in a $cmd_return_value of 1, which means an error is happening. Try changing this one line:
$cmd_modified = $cmd . " >/dev/null & echo \$!";
to this:
$cmd_modified = $cmd . " & echo \$!";
And let us know what the output of $cmd_output is -- it should at the very least have the process id of the newly spawned process.
Thanks for all the help S. Imp. I had a little trouble debugging with your suggestions because I happened to be using AJAX to call the script. However, I wrote a simpler script using your suggestions to try to debug the problem, and this is what I found:
Array ( [0] => Traceback (most recent call last): [1] => File "/var/www/html/jan8/alexandrina.py", line 28, in [2] => file.write(generateHTML(docxFile)) [3] => UnicodeEncodeError: 'ascii' codec can't encode character '\u2026' in position 25: ordinal not in range(128) )
So it looks like the problem has to do with ASCII encoding! Even though the larger file was just a docx file with the same text as the shorter docx file repeated over and over again for 300 pages. It seems that once a docx file exceeds one page, non-ASCII characters (such as '\u2026') appear that aren't present in single-page docx files. I have no idea if this post will ever end up helping anyone, but who knows!
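If anyone hits the same error: the traceback points at the file.write() call, so the usual fix (an assumption on my part, not tested against this exact file) is to open the output file with an explicit encoding in the Python script:
file = open(htmlFile, "w", encoding="utf-8")  # force UTF-8 instead of the locale default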
[SOLVED]
I have developed a metasearch engine, and one of the optimisations I would like to make is to process the search APIs in parallel. Imagine that results are retrieved from search engine A in 0.24 seconds, SE B in 0.45 seconds and SE C in 0.5 seconds. With other overheads the metasearch engine can return aggregated results in about 1.5 seconds, which is viable. Now what I would like to do is send those requests in parallel rather than in series, as at present, and get that time down to under a second. I have investigated exec, forking and threading, and all, for various reasons, have failed. I have only spent a day or two on this, so I may have missed something. Ideally I would like to implement this on a WAMP stack on my development machine (localhost) and see about implementing it on a Linux webserver thereafter. Any help appreciated.
Let's take a simple example: say we have two files we want to run simultaneously. File 1:
<?php
// file1.php
echo 'File 1 - Test 1'.PHP_EOL;
$sleep = mt_rand(1, 5);
echo 'Start Time: '.date("g:i:sa").PHP_EOL;
echo 'Sleep Time: '.$sleep.' seconds.'.PHP_EOL;
sleep($sleep);
echo 'Finish Time: '.date("g:i:sa").PHP_EOL;
?>
Now, imagine file two is the same... The idea is that if they run in parallel, the command-line output for the times should be the same, for example:
File 1 - Test 1
Start Time: 9:30:43am
Sleep Time: 4 seconds.
Finish Time: 9:30:47am
But whether I use exec, popen or whatever, I just cannot get this to work in PHP!
I would use socket_select(). That way only the connection time is cumulative, as you can read from the sockets in parallel. This will give you a big performance boost.
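A minimal sketch of that idea using stream_select(), the stream-wrapper counterpart of socket_select() (the hosts and query here are hypothetical; plain HTTP on port 80 for simplicity). The connections are opened one by one, but the responses are read in parallel:
<?php
$hosts = array('se-a.example.com', 'se-b.example.com', 'se-c.example.com');
$streams = array();
$responses = array();
foreach ($hosts as $i => $host) {
    $fp = stream_socket_client("tcp://$host:80", $errno, $errstr, 5);
    if ($fp === false) {
        continue; // skip engines we cannot reach
    }
    fwrite($fp, "GET /search?q=test HTTP/1.0\r\nHost: $host\r\n\r\n");
    stream_set_blocking($fp, false);
    $streams[$i] = $fp;
    $responses[$i] = '';
}
// Read from all sockets in parallel until each has reached EOF.
while ($streams) {
    $read = $streams;
    $write = null;
    $except = null;
    if (stream_select($read, $write, $except, 5) < 1) {
        break; // stop on error or a 5-second stall
    }
    foreach ($read as $i => $fp) { // stream_select() keeps the array keys
        $responses[$i] .= fread($fp, 8192);
        if (feof($fp)) {
            fclose($fp);
            unset($streams[$i]);
        }
    }
}
// $responses now holds the raw HTTP responses, fetched concurrently.
?>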
There is one viable approach: make a CLI PHP file that receives what it has to do as arguments and returns whatever result is produced, serialized.
In your main app you may popen as many of these workers as you need and then collect the outputs in a simple loop:
[edit] I used your worker example; I just had to chmod +x it and add a #!/usr/bin/php line on top:
#!/usr/bin/php
<?php
echo 'File 1 - Test 1'.PHP_EOL;
$sleep = mt_rand(1, 5);
echo 'Start Time: '.date("g:i:sa").PHP_EOL;
echo 'Sleep Time: '.$sleep.' seconds.'.PHP_EOL;
sleep($sleep);
echo 'Finish Time: '.date("g:i:sa").PHP_EOL;
?>
I also modified the run script a little bit - ex.php:
#!/usr/bin/php
<?php
$pha = array();
$res = array();
$pha[1] = popen("./file1.php", "r");
$res[1] = '';
$pha[2] = popen("./file2.php", "r");
$res[2] = '';
// foreach replaces the old each() loop (each() was removed in PHP 8)
foreach ($pha as $id => $ph) {
    while (!feof($ph)) {
        $res[$id] .= fread($ph, 8192);
    }
    pclose($ph);
}
echo $res[1] . $res[2];
Here is the result when tested in the CLI (it's the same when ex.php is called from the web, but the paths to file1.php and file2.php should be fixed):
$ time ./ex.php
File 1 - Test 1
Start Time: 11:00:33am
Sleep Time: 3 seconds.
Finish Time: 11:00:36am
File 2 - Test 1
Start Time: 11:00:33am
Sleep Time: 4 seconds.
Finish Time: 11:00:37am
real 0m4.062s
user 0m0.040s
sys 0m0.036s
As seen in the result, one script takes 3 seconds to execute and the other takes 4; running in parallel, both complete within 4 seconds.
[end edit]
In this way the slow operations run in parallel; you only collect the results serially.
In total it will take (slowest worker time) + (collection time) to execute. Since the time for collecting the results, unserializing, and so on can be ignored, you get all the data in the time of the slowest request.
As a side note, you may try the igbinary serializer, which is much faster than the built-in one.
As noted in comments:
worker.php is executed outside of the web request, so you have to pass all its state via arguments. Passing arguments can also be a problem because of escaping, security, etc.; a not-so-efficient but simple way is to use base64 (see the sketch after this list).
A major drawback in this approach is that it is not easy to debug.
It can be further improved by using stream_select instead of fread and also collecting data in parallel.
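For example, a hypothetical sketch of the base64 route, with $query standing in for whatever state the worker needs:
// In the main app: pack the argument so escaping is never an issue.
$arg = base64_encode(serialize($query));
$ph = popen('./worker.php ' . escapeshellarg($arg), 'r');
// In worker.php: unpack it again.
$query = unserialize(base64_decode($argv[1]));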
I'm having a problem with my PHP file that takes more than 30 seconds to execute.
After searching, I added set_time_limit(0); at the start of the code, but the file still times out with a 500 error after 30 seconds.
log: PHP Fatal error: Maximum execution time of 30 seconds exceeded in /xxx/xx/xxx.php
safe-mode : off
Check php.ini, or set the limit at runtime:
ini_set('max_execution_time', 300); //300 seconds = 5 minutes
ini_set('max_execution_time', 0); //0=NOLIMIT
This is an old thread, but I thought I would post this link, as it helped me quite a bit with this issue. Essentially, the server configuration can override the PHP config. From the article:
For example mod_fastcgi has an option called "-idle-timeout" which controls the idle time of the script. So if the script does not output anything to the fastcgi handler for that many seconds then fastcgi would terminate it. The setup is somewhat like this:
Apache <-> mod_fastcgi <-> php processes
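With mod_fastcgi, for example, the fix would be a server-side directive rather than anything in php.ini; a sketch with a hypothetical wrapper path (mod_fastcgi's default idle timeout is 30 seconds, which would match the error above):
FastCgiServer /var/www/cgi-bin/php-wrapper -idle-timeout 310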
The article has other examples and further explanation. Hope this helps somebody else.
I usually use set_time_limit(30) within the main loop (so each loop iteration is limited to 30 seconds rather than the whole script).
I do this in multiple database update scripts, which routinely take several minutes to complete but less than a second for each iteration - keeping the 30 second limit means the script won't get stuck in an infinite loop if I am stupid enough to create one.
I must admit that my choice of 30 seconds for the limit is somewhat arbitrary - my scripts could actually get away with 2 seconds instead, but I feel more comfortable with 30 seconds given the actual application - of course you could use whatever value you feel is suitable.
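A minimal sketch of the pattern ($rows and update_row() are hypothetical stand-ins for your own data and work); each set_time_limit() call restarts the timer, so the limit applies to one iteration, not the whole run:
<?php
foreach ($rows as $row) {
    set_time_limit(30); // a fresh 30-second budget for this iteration only
    update_row($row);   // hypothetical per-row update taking well under 30s
}
?>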
Hope this helps!
Use this:
ini_set('max_execution_time', 300); // 300 seconds = 5 minutes
Check out this note from the PHP manual; it may help you:
If you're using PHP_CLI SAPI and getting error "Maximum execution time of N seconds exceeded" where N is an integer value, try to call set_time_limit(0) every M seconds or every iteration. For example:
<?php
require_once('db.php');
$stmt = $db->query($sql);
while ($row = $stmt->fetchRow()) {
    set_time_limit(0);
    // your code here
}
?>
I think you must raise the execution time limit for PHP. Try this:
ini_set('max_execution_time', 0);
I have the PHP code below. I want to be able to continue reading the text file from the point where the script stopped; the text file is over 90 MB.
Is it possible to continue reading from the point the script stopped running?
$in = fopen('email.txt', 'r');
while ($kw = trim(fgets($in))) {
    // my code
}
No, that's not easily possible without saving the current state from time to time.
However, instead of doing that, you'd be better off fixing whatever causes your script to stop. set_time_limit(0); and ignore_user_abort(true); will most likely prevent your script from being stopped while it's running.
If you do want to be able to continue from some position, use ftell($in) to get the position and store it in a file/database from time to time. When starting the script you check if you have a stored position and then simply fseek($in, $offset); after opening the file.
If the script is executed from a browser and it takes long enough to make aborts likely, you could also consider splitting it into chunks and cleanly terminating the script with a redirect containing an argument indicating where to continue. Your script would then process, e.g., 1000 lines, restart itself with an offset of 1000, process the next 1000 lines, and so on.
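A sketch of the ftell()/fseek() idea applied to the code from the question (the offset file name is made up; a database field would work just as well):
<?php
$offsetFile = 'email.txt.offset'; // hypothetical state file
$in = fopen('email.txt', 'r');
// Resume from the stored position, if a previous run saved one.
if (is_file($offsetFile)) {
    fseek($in, (int) file_get_contents($offsetFile));
}
$count = 0;
while ($kw = trim(fgets($in))) {
    // ...your code...
    // Persist the position every 1000 lines so an abort loses little work.
    if (++$count % 1000 === 0) {
        file_put_contents($offsetFile, (string) ftell($in));
    }
}
fclose($in);
if (is_file($offsetFile)) {
    unlink($offsetFile); // finished cleanly: forget the saved position
}
?>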
I have a file that I'm using to log IP addresses for a client. They want to keep the last 500 lines of the file. It is on a Linux system with PHP4 (oh no!).
I was going to add to the file one line at a time with new IP addresses. We don't have access to cron so I would probably need to make this function do the line-limit cleanup as well.
I was thinking either using like exec('tail [some params]') or maybe reading the file in with PHP, exploding it on newlines into an array, getting the last 1000 elements, and writing it back. Seems kind of memory intensive though.
What's a better way to do this?
Update:
Per @meagar's comment below, if I wanted to use the zip functionality, how would I do that within my PHP script? (no access to cron)
if (rand(0, 10) == 10) {
    shell_exec("find . logfile.txt [where size > 1mb] -exec zip {} \;");
}
Will zip enumerate the files automatically if there is an existing file or do I need to do that manually?
The fastest way is probably, as you suggested, to use tail:
passthru("tail -n 500 $filename");
(passthru does the same as exec, except that it sends the program's entire output to stdout. You can capture the output using an output buffer.)
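For instance, a sketch of capturing the output instead of letting it go straight to the browser:
<?php
ob_start();
passthru('tail -n 500 ' . escapeshellarg($filename));
$last500 = ob_get_clean(); // the last 500 lines as a single string
?>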
[edit]
I agree with a previous comment that a log rotate would be infinitely better... but you did state that you don't have access to cron so I'm assuming you can't do logrotate either.
logrotate
This would be the "proper" answer, and it's not difficult to set this up either.
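If you can get it running, a sketch of what the config could look like (hypothetical path; see man logrotate for the details):
/var/www/logs/ips.log {
    rotate 1
    size 100k
    missingok
    notifempty
}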
You may get the number of lines using count(explode("\n", file_get_contents("log.txt"))) and, if it is equal to 1000, take the substring starting from the first \n to the end, add the new IP address, and write the whole file again.
It's almost the same as writing the new IP by opening the file in a+ mode.
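A sketch of that read-trim-rewrite idea, keeping the newest 500 lines (hypothetical file name; note it reads the whole file into memory, as the question worried about):
<?php
$logFile = 'log.txt'; // hypothetical log path
$maxLines = 500;
$lines = file($logFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$lines[] = $_SERVER['REMOTE_ADDR'];       // append the new IP address
$lines = array_slice($lines, -$maxLines); // keep only the newest entries
file_put_contents($logFile, implode("\n", $lines) . "\n");
?>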