Big XML file import to database PHP

I am facing a problem that somehow I don't see the solution to. I have an XML file that needs to be imported into a custom DB structure. When the user uploads / imports the file, the AJAX POST waits until the import is finished, but this could take 5 hours or more, I don't know. What is the best way to handle this UI issue?
I was thinking about a threaded upload, to split the file into multiple parts and upload each in its own thread (pthreads, though I'm having problems installing it on CentOS 7 / PHP 7).
Or is there any other way to import the file in the background, so that whenever the user refreshes the page there is a status log output letting them know when the import is finished and whether it was successful?

You would want to run it as a background job (a detached process). This way the end user gets a confirmation message right away, and you can send an email when the long-running task is complete, so they don't have to wait for it to finish. As I mentioned in the comments, I have a class I wrote for this on my GitHub
https://github.com/ArtisticPhoenix/MISC/blob/master/BgProcess.php
But it passes the args as a path because it's set up for CodeIgniter, so you would have to change that or split the arguments up within your code.
Anyway, the basic idea is similar to running a cron job. The implementation varies depending on the OS of the server, but on Linux the command looks like this:
php -f "path/to/phpfile.php" "{args}" > /dev/null &
The > /dev/null part sends the output to null (throws it away), and the & runs it as a non-blocking process, meaning the script starting the command can continue on. So, using an example like this:
.. other code before starting background job ..
exec( 'php -f "path/to/phpfile/xmlProcessor.php" "testXML/2" > /dev/null &');
.. code to tell the user the job has started .. this runs right after the call, without waiting for that process to finish.
Then in xmlProcessor.php you would have this
<?php
$args = explode('/', $argv[1]);
$file = $args[0];
$user_id = $args[1];
// ... code to process the XML
// ... email the user a confirmation of completion
http://php.net/manual/en/reserved.variables.argv.php
As I said, typically you would call it this way:
exec( 'php -f "path/to/phpfile/xmlProcessor.php" "testXML" "2" > /dev/null &');
And access them using
$argv[1] // = testXML
$argv[2] // = 2
But because I use this with CI, it does its routing for me to a special controller and handles all that. The nice thing about my class is that it should find the PHP executable in most cases, and it has Windows compatibility built in (which was a pain in the ...)
Using that class you would just call it like this
$command = new BgProcess( "path/to/phpfile/xmlProcessor.php", "testXML", 2);
echo $command;
This would output 'php -f "path/to/phpfile/xmlProcessor.php" "testXML/2" > /dev/null &' after starting the process (the return value is just for debugging).
Basically, you're running a separate background job with PHP via the command line.
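To address the status-log part of the question: one simple pattern (a sketch; the file path and field names here are made up) is to have the background import script write its progress to a small status file, which the page reads on every refresh:

```php
<?php
// Hypothetical status file shared between the background import
// script and the page the user refreshes.
$statusFile = sys_get_temp_dir() . '/xml_import_status.json';

// Called from xmlProcessor.php as the import progresses
function update_status($statusFile, $state, $rowsDone, $rowsTotal)
{
    file_put_contents($statusFile, json_encode([
        'state'      => $state,      // 'running', 'done' or 'failed'
        'rows_done'  => $rowsDone,
        'rows_total' => $rowsTotal,
        'updated_at' => date('c'),
    ]));
}

// Called from the page the user refreshes
function read_status($statusFile)
{
    if (!is_file($statusFile)) {
        return 'No import has been started yet.';
    }
    $s = json_decode(file_get_contents($statusFile), true);
    return sprintf('Import %s: %d of %d rows', $s['state'], $s['rows_done'], $s['rows_total']);
}

update_status($statusFile, 'running', 500, 2000);
echo read_status($statusFile), "\n"; // Import running: 500 of 2000 rows
```

The background script would call update_status() after each batch of rows, and write 'done' or 'failed' at the end, so a refresh always shows the latest state.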

Related

Run script on other Server

I have 2 websites, hosted on 2 different servers. They are kind of interlinked: sometimes I just do stuff on Website-1 and run a script on Website-2. For example, I edit something on Website-1 and then want to run a script on Website-2 to update accordingly on its server.
Until now I have been using the following code on Website-1:
$file = file_get_contents('Website-2/update.php');
But the problem with this is that my Website-1 server script stops running and waits for the file to return some data, and I don't want to do anything with that data. I just want to run the script.
Is there a way to do this better, or to tell PHP to move on to the next line of code?
If you want to call the second site without making your user wait for a response,
I would recommend using a message queue.
The Site 1 request would put a message on the queue.
A cron job checks the queue and runs the update on site 2 when a message exists.
Common queue apps to look at:
https://aws.amazon.com/sqs/?nc2=h_m1
https://beanstalkd.github.io/
https://www.iron.io/mq
What you're trying to achieve is called a web hook and should be implemented with proper authentication, so that not anybody can execute your scripts at any time and overload your server.
On server 2 you need to execute your script asynchronously via workers, threads, message queues or similar.
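For the authentication part, a minimal scheme (a sketch; the secret and payload below are made up) is to have server 1 sign the webhook body with an HMAC, and have server 2 verify the signature before doing any work:

```php
<?php
// Shared secret known only to the two servers (placeholder value)
const WEBHOOK_SECRET = 'change-me';

// Server 1: compute a signature to send along with the request body
function sign_webhook($body)
{
    return hash_hmac('sha256', $body, WEBHOOK_SECRET);
}

// Server 2: verify the signature before queuing/executing the update
function verify_webhook($body, $signature)
{
    // hash_equals() compares in constant time, avoiding timing attacks
    return hash_equals(sign_webhook($body), $signature);
}

$body = '{"action":"update"}';
$sig  = sign_webhook($body);
var_dump(verify_webhook($body, $sig));     // bool(true)  - genuine request
var_dump(verify_webhook($body, 'forged')); // bool(false) - rejected
```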
You can also run the asynchronous command on your server 1. There are many ways to achieve this; here are some links with more on the topic:
(Async curl request in PHP)
(https://segment.com/blog/how-to-make-async-requests-in-php/)
Call your remote server as normal, but in the PHP script you normally call, take all the functionality and put it in a third script. Then from the old script call the new one with (on Linux):
exec('php -f "{path to new script}.php" $args > /dev/null &');
The & at the end makes this a background, or non-blocking, call. Because you call it from the remote server you don't have to change anything on the calling server. The php -f runs a PHP file. The > /dev/null sends the output from that file to the garbage.
On windows you can use COM and WScript.Shell to do the same thing
$WshShell = new \COM('WScript.Shell');
$oExec = $WshShell->Run('cmd /C php {path to new script}.php', 0, false);
You may want to use escapeshellarg on the filename and any arguments supplied.
So it will look like this
Server1 calls Server2
Script that was called (on Server2) runs exec and kicks off a background job (Server2) then exits
Server1 continues as normal
Server2 continues the background process
So using your example instead of calling:
file_get_contents('Website-2/update.php');
You will call
file_get_contents('Website-2/update_kickstart.php');
In update_kickstart.php put this code
<?php
exec('php -f "{path}update.php" > /dev/null &');
Which will run update.php as a separate background (non-blocking) call. Because it's non-blocking, update_kickstart.php will finish and return to Server1, which can go about its business while update.php runs on Server2 independently.
Simple...
The last note is that file_get_contents is a poor choice. I would use SSH, probably via phpseclib 2.0, to connect to Server2 and run the exec command directly with a user that has access only to that file (chroot it or something similar). As it is, anyone can call that file and run it. Behind an SSH login it's protected; chrooted, that "special" user can only run that one file.

PHP Recurring Operation

For an iOS Push Notification server, I am implementing a web service that checks a feed on the net for a particular price.
Therefore I need my PHP to keep checking a price (every 20 seconds or so) and compare the values.
I was wondering (forgive my ignorance, I just started with PHP today): is the way people do this a cron job? Or is there some special way to fire a PHP script that runs until it's killed, repeating a task?
Thanks!
John
If PHP was your preferred route, a simple script such as the following can be set to run indefinitely in the background (name this grabber.php):
#!/usr/bin/php
<?php
do {
    // Grab the data from your URL
    $data = file_get_contents("http://www.example.com/data.source");
    // Write the data out somewhere so your push notifications script can read it
    file_put_contents("/path/to/shared/data.store", $data);
    // Wait and do it all over again
    sleep(20);
} while (true);
And to start it (assuming you're on a unixy OS):
$ chmod u+x grabber.php
$ ./grabber.php > /path/to/a/file/logging/script/output.log 2>&1 &
That & at the end sends the process to run in the background.
PHP is probably overkill for this, however; perhaps a simple bash script would be better:
#!/bin/bash
# Downloads data and writes it to a file ('data-file'), repeating every 20 seconds
while true; do
    data=$(curl -L http://www.example.com/data.source)
    echo "$data" > data-file
    sleep 20
done
$ chmod u+x grabber.sh
$ ./grabber.sh > /path/to/logger.log 2>&1 &
That is possible by setting up a cron job on your server.
Log in to your web hosting control panel, e.g. cPanel, create a new cron job and add the path to the PHP file that you want to run, e.g. php /home/[your username]/public_html/rss/import_feeds.php. There is a field where you can input how often, in minutes, you would like the PHP script to run.
Run a PHP file in a cron job using CPanel
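Outside of a control panel, the same thing can be done by editing the crontab directly with crontab -e; for example (the paths are placeholders), to run the import every 15 minutes and keep a log:

```
*/15 * * * * php /home/[your username]/public_html/rss/import_feeds.php >> /home/[your username]/import_feeds.log 2>&1
```

The five schedule fields are minute, hour, day of month, month, and day of week; >> appends the script's output to the log file instead of discarding it.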

run a php script in background

I have a PHP script in which I have written the following code:
$client = new \Predis\Client();
$client->select(4);
$client->lpush('emailid',$x['to']);
$command = "/usr/bin/php5 -f /var/www/Symfony/src/Ens/NewBundle/Controller/cron.php";
exec( "$command > /dev/null &", $arrOutput );
return $this->render('EnsNewBundle:Email:header.html.twig');
In this I have another PHP script named cron.php. I want to run that script in the background, and I want to check whether it is actually running in the background. How can I check that?
Maybe you could have a look at the Symfony2 Process component.
It's quite useful for running commands from PHP.
You can capture the output of cron.php in a file with > filename and check whether it really runs.
Or check the process list for a new php process appearing when you run this one.
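If you start the script with proc_open() instead of exec(), PHP can also tell you directly whether it is still running. A minimal sketch ('sleep 1' stands in for the actual cron.php command):

```php
<?php
// Start a long-running command without waiting for it to finish;
// 'sleep 1' stands in for: /usr/bin/php5 -f .../cron.php
$pipes = [];
$process = proc_open('sleep 1', [], $pipes);

// proc_get_status() reports the child's PID and whether it is alive
$status = proc_get_status($process);
var_dump($status['running']); // bool(true) - still executing
var_dump($status['pid']);     // the OS process id, useful for logging

// After the command has had time to finish, check again
sleep(2);
$status = proc_get_status($process);
var_dump($status['running']); // bool(false) - it has exited

proc_close($process);
```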
You should also look at the Codememe bundle here.
Do check out open-source queuing systems too; they are helpful in many cases.
Like Beanstalkd or RabbitMQ.
You can push data to these queues (say, "filenames"), and a worker takes data from the queue's "tubes", runs, say, php filename, and then picks up the next item from the queue.

Initiating background process for running php file as background from another php page

I want to initiate one PHP page as a background process from another PHP page.
Use popen():
$command = 'php somefile.php';
pclose(popen($command,'r'));
This launches somefile.php as a background process.
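One caveat worth hedging: pclose() normally waits for the command to exit, so for a truly non-blocking launch you still want to redirect output and append & inside the command. A quick way to convince yourself ('sleep 2' stands in for somefile.php):

```php
<?php
// Without the redirection and '&', pclose() would block for 2 seconds;
// with them, the shell backgrounds the command and returns immediately.
$start = microtime(true);
pclose(popen('sleep 2 > /dev/null &', 'r'));
$elapsed = microtime(true) - $start;

var_dump($elapsed < 1.0); // bool(true) - returned almost instantly
```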
This is a technique I used to get around restrictions applied by my webhost (who limited cronjobs to 15 minutes of execution time, so my backup scripts would always timeout).
exec( 'php somefile.php > /dev/null &' );
The breakdown of this line is:
exec() (PHP reference) - Runs the specified command, as if from the Linux command line.
php somefile.php: Invokes PHP to open, and run, somefile.php. This is the same behaviour as what would happen if that file was accessed through a web browser.
> ("redirect") - Sends the output of the preceding command to a specified target. In this instance, it redirects the content which would normally be read by the web browser accessing the file. (Note that a pipe, |, would not work here: | /dev/null would try to execute /dev/null as a command.)
/dev/null - A blackhole. No, not kidding. It is a place where you send output if you just want it to disappear.
& - Appending this character to the end of a Linux command means "Do not wait - send this to the background and continue."
So, in summary, the provided code will execute a PHP script, return no output, and not wait for it to finish before continuing onto the next line.
(And, as always, if any of these assumptions on my part are in error, I would love to be corrected by more knowledgeable members of the community.)
You have to make sure that the background process is not terminated when the processing of the page finishes. If you are on a Linux system, you could try using the nohup command:
$command = 'nohup php somefile.php';
pclose(popen($command,'r'));
If it still gets terminated, you could try the "daemon" command.

Run a ffmpeg process in the background

I want to use ffmpeg to convert video to .flv in PHP. Currently I have this working, but it hangs the browser until the upload is finished and the conversion is done. I have been looking at the PHP docs on how to run an exec() process in the background, while tracking its progress using the returned PID. Here is what I found:
//Run linux command in background and return the PID created by the OS
function run_in_background($Command, $Priority = 0)
{
    if ($Priority) {
        $PID = shell_exec("nohup nice -n $Priority $Command > /dev/null & echo $!");
    } else {
        $PID = shell_exec("nohup $Command > /dev/null & echo $!");
    }
    return $PID;
}
There is also a trick which I use to track whether the background task is still running, using the returned PID:
//Verifies if a process is running in linux
function is_process_running($PID)
{
    exec("ps $PID", $ProcessState);
    return count($ProcessState) >= 2;
}
Am I supposed to create a separate .php file which is then run from the PHP CLI to execute one of these functions? I just need a little nudge in getting this working and then I can take it from there.
Thanks!
Am I suppose to create a separate .php file which then runs from the php cli to execute one of these functions?
This is probably the way I would do it :
the PHP webpage adds a record in database to indicate "this file has to be processed"
and displays a message to the user ; something like "your file will be processed soon"
In the CLI, have a batch process the newly inserted files:
first, mark a record as "processing"
do the ffmpeg thing
mark the file as "processed"
And, on the webpage, you can show the user which state their file is in:
if it has not been processed yet
if it's being processed
or if it's been processed -- you can then give them the link to the new video file.
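The three states above could be surfaced on the page with a small helper like this (a sketch; the state names and the URL value are hypothetical and would come from the DB record):

```php
<?php
// Map the state stored in the DB record to a message for the user
// (the state values 'pending', 'processing', 'processed' are made up)
function status_message($state, $videoUrl = null)
{
    switch ($state) {
        case 'pending':
            return 'Your file has not been processed yet.';
        case 'processing':
            return 'Your file is being processed.';
        case 'processed':
            return 'Done! Your video is available at ' . $videoUrl;
        default:
            return 'Unknown state.';
    }
}

echo status_message('processing'), "\n"; // Your file is being processed.
```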
Here are a couple of other thoughts:
The day your application becomes bigger, you can have :
one "web server"
many "processing servers"; in your application, it's the ffmpeg work that will require lots of CPU, not serving web pages, so being able to scale that part is nice (that's another reason to "lock" files by marking them as "processing" in the DB: that way, you will not have several processing servers trying to process the same file)
You only use PHP from the web server to generate web pages, which is the job of a web server.
Heavy / long processing is not the job of a web server!
The day you want to switch to something other than PHP for the "processing" part, it'll be easier.
Your "processing script" would have to be launched every couple of minutes; you can use cron for that if you are on a Linux-like machine.
Edit: a bit more information, after seeing the comment.
As the processing part is done from CLI, and not from Apache, you don't need any kind of "background" manipulation: you can just use shell_exec, which will return the whole output of the command to your PHP script when it's finished doing its job.
For the user watching the web page saying "processing", it will seem like background processing; and, in a way, it will be, as the processing will be done by another process (maybe even on another machine).
But, for you, it'll be much simpler :
one webpage (nothing "background")
one CLI script, with no background stuff either.
Your processing script could look something like this, I suppose:
// Fetch information from DB about one file to process
// and mark it as "processing"

// Those would be determined from the data you just fetched from DB
$in_file = 'in-file.avi';
$out_file = 'out-file.avi';

// Launch the ffmpeg processing command (will probably require more options ^^ )
// The PHP script will wait until it's finished:
// no background work, and no need for any kind of polling
$output = shell_exec('ffmpeg -i ' . escapeshellarg($in_file) . ' ' . escapeshellarg($out_file));

// File has been processed:
// store the "output name" to DB
// and mark the record in DB as "processed"
Really easier than what you first thought, isn't it ? ;-)
Just don't worry about the background stuff anymore: the only important thing is that the processing script is launched regularly, from crontab.
Hope this helps :-)
You don't need to write a separate PHP script to do this (though you may want to later if you implement some sort of queuing system).
You're almost there. The only problem is that the shell_exec() call blocks to wait for the return of the shell. You can avoid this if you redirect all output from the command in the shell to either a file or /dev/null and background the task (with the & operator).
So your code would become:
//Run linux command in background
function run_in_background($Command, $Priority = 0)
{
    if ($Priority) {
        shell_exec("nohup nice -n $Priority $Command 2> /dev/null > /dev/null &");
    } else {
        shell_exec("nohup $Command 2> /dev/null > /dev/null &");
    }
}
If you still need the PID, you can keep the & echo $! from your original function; the output redirection does not interfere with it.
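Putting launch and polling together, here is a small sketch ('sleep 2' stands in for the ffmpeg command; this assumes a Linux system with ps available):

```php
<?php
// Launch a command in the background and return its PID
function run_in_background($Command, $Priority = 0)
{
    $nice = $Priority ? "nice -n $Priority " : '';
    return trim(shell_exec("nohup {$nice}{$Command} > /dev/null 2>&1 & echo $!"));
}

// Check whether that PID is still alive
// (ps prints a header line plus one line per matching process)
function is_process_running($PID)
{
    exec("ps $PID", $ProcessState);
    return count($ProcessState) >= 2;
}

// 'sleep 2' stands in for the ffmpeg conversion
$pid = run_in_background('sleep 2');
var_dump(is_process_running($pid)); // bool(true)  - still converting
sleep(3);
var_dump(is_process_running($pid)); // bool(false) - finished
```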
