Make Python scripts run threaded when called from PHP - php

I have a PHP web page that makes multiple queries to the database and displays the results as charts.
The logic is: there is index.php, where the query can be made. After submitting the data, 6 different PHP pages are called. Those PHP pages log the query, run the appropriate Python script, and build charts with JavaScript. Each of the 6 PHP pages is displayed in index.php in a div. All of the Python scripts take the same input and query the same database; the difference is in the data pulled from the database and in the JavaScript that builds the charts.
Example of calling one of the PHP pages:
$("#chartFOO").load("http://example/test/get_foo.php?bar=" + bar + "&start=" + start + "&end=" + end, function(responseTxt, statusTxt, xhr) {
    if (statusTxt == "error")
        alert("Error: " + xhr.status + ": " + xhr.statusText);
});
Example of calling the Python script:
if ($bar) {
    $command = escapeshellcmd("/home/example/scripts/graph_foo.py $bar $start $end");
    $output = shell_exec($command);
}
The output is then used in the PHP file to make the charts. All of the PHP files are displayed in divs with different styling on index.php.
The problem is that the scripts don't run in parallel, which locks up the page and makes the response time for the query quite slow. Is it right that only one shell command can be run at a time?!
I have tried putting all the Python scripts as functions in one file, with the 6 PHP files as strings, and calling it all with one command. But so far I have problems formatting the PHP files: I can't use '{}' placeholders to format, because the PHP files already contain braces. I had the idea of using the threading module to run the functions, and of using one database connection to save the time of connecting 6 times, because each connection takes time.
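The consolidation idea could be sketched in Python with concurrent.futures. The graph_* functions and their return values here are hypothetical stand-ins for the real query code, and each worker should open its own database connection, since most drivers' connections are not safe to share across threads:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the graph_*.py scripts; each real function
# would open its own DB connection, run its query, and return the rows
# needed for one chart.
def graph_foo(bar, start, end):
    return {"chart": "foo", "args": [bar, start, end]}

def graph_foo2(bar, start, end):
    return {"chart": "foo2", "args": [bar, start, end]}

def run_all(bar, start, end):
    funcs = [graph_foo, graph_foo2]  # the real script would list all six
    with ThreadPoolExecutor(max_workers=len(funcs)) as pool:
        futures = [pool.submit(f, bar, start, end) for f in funcs]
        return [f.result() for f in futures]

results = run_all("123", "2019-01-01", "2019-01-31")
```

PHP would then make a single shell_exec call to this one script, and the combined results could be printed as JSON for the six divs to pick apart.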
Is there any reasonable solution to have the scripts run threaded, without having to rework the whole web page? How can PHP, JavaScript and Python be mixed?
A lot to read and a lot to ask, but thanks in advance for your time.
EDIT:
I created a new file which basically contains all 6 files. Calling the Python scripts is a bit different now, and index.php now calls only this one file, like I did before with the 6 files.
Example of new way:
$part->handles = [
    popen("/home/example/scripts/graph_foo.py {$bar} {$start} {$end}", 'r'),
    popen("/home/example/scripts/graph_foo2.py {$bar} {$start} {$end}", 'r')
];
And the way I solved the memory issue:
$output0 = '';
while (!feof($part->handles[0])) {
    $output0 .= fread($part->handles[0], 32768);
}

$output1 = '';
while (!feof($part->handles[1])) {
    $output1 .= fread($part->handles[1], 32768);
}
I don't know if this is the best way, but it works. I don't know PHP well, but it did take half a minute off the request time, which helps.

Related

how to interact with an already-running php cli script from a php cgi script

I made a PHP CLI script which runs in a loop. I run it from the server's terminal (Windows). This server also functions as a PHP web server (XAMPP).
The PHP CLI script deals with hardware I/O (responding and giving logic to a microcontroller board through a serial port) and is always running.
What I'm trying to accomplish is to make a web-based app (PHP CGI) to control that CLI script, like sending a command to make it do something.
What i've tried
I have tried using a kind of temporary JSON file, whose contents are generated by the CGI script.
The CLI script then reads that file on every loop, and if there is a change in the JSON (a timestamp), the script uses the data inside the JSON to act accordingly, then stores that timestamp to compare against the JSON on the next loop.
But this causes a huge load on the server, and the CLI script becomes much slower, which affects the responsiveness of the microcontroller.
The PHP CLI loop is something like this:
<?php
$lastTimestamp = 0;
while (true) {
    // read json file
    $json = json_decode(file_get_contents("temp.json"));
    if ($lastTimestamp < $json->stamp) {
        // do something with $json->data
        // ...
        // update $lastTimestamp
        $lastTimestamp = $json->stamp;
    }
    // rest of microcontroller logic here
    // ...
}
and the temp.json file is something like
{
    "stamp": 1557653475,
    "data": {
        "nd_a": 1,
        "nd_b": 1,
        "nd_c": 0
    }
}
So the question is: how do I interact with the already-running PHP CLI script from the CGI script without using the methods above? I'm hoping for a better way that doesn't affect server load and performance.
Edit: I also tried using a database in place of the JSON file, but the performance was still not good.
This way is inefficient. The simplest way is to make a cron job that executes the CLI script.
If you're using a Linux system, here is how to set up a cron job (on Ubuntu):
https://help.ubuntu.com/community/CronHowto
Then in your CLI script you remove the loop; the job will read temp.json every time it runs and compare both timestamps.
Now for keeping $lastTimestamp, I would suggest creating a text file right next to your CLI script and storing the value of $lastTimestamp in that file.
Then, only if there is a difference between the timestamp retrieved from the JSON and the timestamp in the text file, you overwrite the text file with the new timestamp.
You can use file_put_contents or fopen/fwrite to write to the file.
So your script would be like the following:
<?php
$lastTimestamp = 0;
$lastTimeStampFilePath = "lastTimeStampFile.txt";

// read text file
$lastTimestamp = file_get_contents($lastTimeStampFilePath);

// read json file
$json = json_decode(file_get_contents("temp.json"));

if ($lastTimestamp < $json->stamp) {
    // do something with $json->data
    // ...
    // update $lastTimestamp
    file_put_contents($lastTimeStampFilePath, $json->stamp);
}

// rest of microcontroller logic here
Maybe configure the job to run every 2 mins or 5 mins, depending on your requirements.
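Assuming the CLI script lives at /path/to/serial-script.php (the path is a placeholder), a 5-minute schedule would be a single crontab line:

```
*/5 * * * * php /path/to/serial-script.php
```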

Start and terminate a background python script with PHP and a timer

What I have to do is a bit complicated.
I have a Python script and I want to run it from PHP, but in the background. I had seen somewhere that to run a Python script in the background I can use the PHP command exec(script.py) without waiting for a return; so far, no problem.
First question:
I have to stop this looping script with another PHP command. How do I do this?
Second question:
I have to implement a server-side timer that stops the script when the time is up.
I found this code:
<?php
$timer = 60 * 5; // seconds
$timestamp_file = 'end_timestamp.txt';

if (!file_exists($timestamp_file))
{
    file_put_contents($timestamp_file, time() + $timer);
}

$end_timestamp = file_get_contents($timestamp_file);
$current_timestamp = time();
$difference = $end_timestamp - $current_timestamp;

if ($difference <= 0)
{
    echo 'time is up, BOOOOOOM';
    // execute your function here
    // reset timer by writing new timestamp into file
    file_put_contents($timestamp_file, time() + $timer);
}
else
{
    echo $difference . 's left...';
}
?>
From this answer.
Then, is there a way to implement this with a MySQL database? (The integration with the script stop is not a problem.)
That's actually pretty simple. You can use a memory object caching system. I would recommend memcached. Memory objects from memcached can be accessed literally from anywhere in your system. The only requirement is that a connection to the memcached backend server is supported. (PHP does, Python does, etc.)
Answer to your first question:
Create a variable called stopme with the value 0 in the memcached database.
Connect from your Python script to the memcached database and poll the variable stopme continuously. Let's say the Python script keeps running as long as the variable stopme has the value 0.
In order to stop your script from PHP, make a connection from your PHP script to the memcached server and set stopme to 1.
The Python script sees the updated value on its next poll and exits.
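The polling loop on the Python side could look like the sketch below. A real script would talk to memcached through a client library such as pymemcache (connecting with Client(("localhost", 11211)) and calling .get()/.set()); here a plain dict stands in for that connection so the control flow is clear:

```python
# Stand-in for a memcached connection; a real script would use a client
# such as pymemcache and call .get("stopme") / .set("stopme", ...) instead.
store = {"stopme": "0"}

def main_loop(store, max_iterations=1000):
    """Keep doing hardware work until the shared stopme flag flips to "1"."""
    done = 0
    while store.get("stopme") == "0" and done < max_iterations:
        # ... the serial-port / microcontroller logic would run here ...
        done += 1
    return done

# Simulate what the PHP side would do (the equivalent of setting stopme to 1):
store["stopme"] = "1"
result = main_loop(store)  # exits immediately because the flag is already set
```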
Answer to your second question:
It could be done as explained in my answer above, by reading shared variables; additionally, I would like to mention that you could also use a cron job to kill a running script.

Parse a website, getting all the links and save into mysql database

I'm working with PHP and MySQL along with PHP Simple HTML DOM Parser. I have to parse a website's pages and fetch some content. For that I used the website's homepage as the initial URL and fetched all the anchor tags available on that page.
I have to filter those URLs, as not every link is useful to me, so I used a regular expression. The required links must be saved into my MySQL database.
My questions are:
When I extract all the links (around 120,000) and try to save them into the MySQL DB, I get the following error:
Fatal error: Maximum execution time of 60 seconds exceeded in C:\xampp\htdocs\search-engine\index.php on line 12
I can't store the data into the database.
I couldn't filter the links.
include('mysql_connection.php');
include('simplehtmldom_1_5/simple_html_dom.php');

$website_name = "xyz.html/";
$html = file_get_html("xyz.html/");

// Looping over every div and then over every anchor scanned the whole
// document once per div; one pass over the anchors is enough.
foreach ($html->find('a') as $a_burrp) {
    echo $a1 = $a_burrp->href . '<br>';
    if (preg_match('/.+?event.+/', $a1, $a_match)) {
        mysql_query("INSERT INTO scrap_urls (url, website_name, date_added)
                     VALUES ('$a1', '$website_name', now())");
    }
}
You are receiving "Fatal error: Maximum execution time of 60 seconds exceeded" because of a configuration limit in PHP. You can increase this limit by adding a line like this at the top of your code:
set_time_limit(320);
More info: http://www.php.net/manual/en/function.set-time-limit.php
You can also just increase max_execution_time in your php.ini file in XAMPP.
Actually, PHP is not the best solution here. PHP scripts are intended to perform quick operations and return a response; in your case the script can run for quite a long time. Although you are able to increase max_execution_time, I encourage you to use another technology that is much more flexible than standard PHP, such as Python or JavaScript (Node.js).
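As a rough sketch of the same crawl in Python (the sample page and URLs here are made up; html.parser from the standard library stands in for Simple HTML DOM, and the regex is the one from the question):

```python
import re
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href attribute of every anchor tag."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def filter_event_links(html):
    parser = LinkCollector()
    parser.feed(html)
    # Same filter as the question's preg_match('/.+?event.+/', ...)
    return [link for link in parser.links if re.search(r".+?event.+", link)]

page = '<a href="/event/1">one</a><a href="/about">two</a>'
links = filter_event_links(page)
```

The filtered links would then be batch-inserted into the database instead of one INSERT per link, which is where most of the 60 seconds goes.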
I also work with PHP scripts that need "some time" to finish.
I always run those scripts either as a cron job or directly from the shell or command line using:
php script.php parameters
That way I don't have to worry about the execution time limit.
There is a reason max_execution_time is usually set to <= 60 seconds.
Regards.

php - simple cache feature - retrieve data every hour

I want to set up a simple cache feature with PHP. I want the script to get data from somewhere, but not on every page view; only every hour.
I know I can have a cron job that runs a PHP script every hour.
But I was wondering if this can be achieved without cron, just inside the PHP script that creates the page based on the fetched (or cached) data. I'm really looking for the simplest solution possible. It doesn't have to be accurate.
I would use APC as well, but in either case you still need some logic. Basic file cache in PHP:
if (file_exists($cache_file) and time() - filemtime($cache_file) < 3600)
{
    $content = unserialize(file_get_contents($cache_file));
}
else
{
    $content = your_get_content_function_here();
    file_put_contents($cache_file, serialize($content));
}
You only need to serialize/unserialize if $content is not a string (e.g. an array or object).
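The same file-cache pattern translates directly to other languages; here is a hedged Python sketch, where the cache path and the fetch function are placeholders:

```python
import json
import os
import time

CACHE_FILE = "cache.json"  # placeholder path
CACHE_TTL = 3600           # one hour, as in the question

def fetch_fresh_content():
    # Placeholder for the real "get data from somewhere" call.
    return {"fetched_at": time.time(), "items": [1, 2, 3]}

def get_content():
    """Return cached content if it is younger than CACHE_TTL, else refresh."""
    if (os.path.exists(CACHE_FILE)
            and time.time() - os.path.getmtime(CACHE_FILE) < CACHE_TTL):
        with open(CACHE_FILE) as fh:
            return json.load(fh)
    content = fetch_fresh_content()
    with open(CACHE_FILE, "w") as fh:
        json.dump(content, fh)
    return content

content = get_content()
```

The file's modification time doubles as the cache timestamp, so no separate bookkeeping is needed.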
Why not just use APC?
You can do:
apc_store('yourkey', 'yourvalue', 3600);
And then you can retrieve the content with:
apc_fetch('yourkey');

check cron job has run script properly - proper way to log errors in batch processing

I have set up a cron job to run a script daily. This script pulls a list of IDs from a database, loops through each to get more data from the database, and generates an XML file based on the data retrieved.
This seemed to run fine for the first few days; however, the list of IDs is getting bigger, and today I noticed that not all of the XML files were generated. It seems to be random IDs that have not run. I have manually run the script for some of the missing IDs individually and they ran without any issues.
I am not sure how to locate the problem, as the cron job is definitely running but not always generating all of the XML files. Any ideas on how I can pinpoint this problem and quickly find out which files have not been generated?
I thought perhaps I could add timestart and timeend fields to the database and set these values at the start and end of each XML generation run; this way I could see what had run and what hadn't, but I wondered if there was a better way.
set_time_limit(0);

//connect to database
$db = new msSqlConnect('dbconnect');

$select = "SELECT id FROM ProductFeeds WHERE enabled = 'True' ";
$run = mssql_query($select);

while ($row = mssql_fetch_array($run)) {
    $arg = $row['id'];
    //echo $arg . '<br />';
    exec("php index.php \"$arg\"", $output);
    //print_r($output);
}
My suggestion would be to add some logging to the script. A simple
error_log("Passing ID: " . $arg . "\n", 3, "log.txt");
can give you some info on whether the ID is being passed. If you find that it is, you can introduce logging in index.php to further evaluate the problem.
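Expanded a little (and sketched here in Python, with a placeholder generate_xml and made-up IDs), the same per-ID logging idea makes gaps and failures easy to spot in one log file:

```python
import logging

# Log to a file next to the script; path is a placeholder.
logging.basicConfig(filename="xml_batch.log", level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")

def generate_xml(product_id):
    # Placeholder for the real per-ID XML generation.
    return "<product id='%s'/>" % product_id

def run_batch(ids):
    """Generate XML per ID, logging start and end so missing runs stand out."""
    results, failures = {}, []
    for product_id in ids:
        logging.info("start id=%s", product_id)
        try:
            results[product_id] = generate_xml(product_id)
            logging.info("end id=%s ok", product_id)
        except Exception:
            logging.exception("end id=%s FAILED", product_id)
            failures.append(product_id)
    return results, failures

results, failures = run_batch([1, 2, 3])
```

Any ID with a "start" entry but no matching "end" entry is exactly the case described in the question.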
By the way, can you explain why you are using exec() to run a PHP script? Why not execute a function in the loop? This could well be the source of the problem.
Because with exec I think the process will run in the background and the loop will continue, so you could really choke your server that way; maybe that's worth checking as well. (I think this also depends on the way of outputting:
Note: If a program is started with this function, in order for it to continue running in the background, the output of the program must be redirected to a file or another output stream. Failing to do so will cause PHP to hang until the execution of the program ends.
Maybe some other users can comment on this.)
Turned out Apache was timing out, so it had nothing to do with using a function or the exec() call.
