PHP + large CSV file + Shell - php

My hosting is shared, and their rule caps set_time_limit at 30 seconds; I already tried changing it in several ways via cPanel and .htaccess. I have many lines in different files to save.
Currently I am splitting the contents into several smaller files so as not to exceed the time limit:
// note: get_template_directory() (the filesystem path) would avoid fetching this file over HTTP
$lines = file(get_template_directory_uri() . '/lines1.csv', FILE_IGNORE_NEW_LINES);
foreach ($lines as $line_num => $line) {
    // here is some code to save the content of the line
}
But, someone told me to use the code:
exec("php csv_import.php > /dev/null &");
This would process only a single .csv file in the background instead of multiple files, without running into the time limit.
It is the first time I have dealt with shell and PHP together, and I have doubts about how it works.
Do I have to create a file csv_import.php with the normal PHP code? And how do I run this in the shell of my server?

If your host allows you to change the value, you can define a different time limit in the PHP file.
<?php
$minutes = 30; // just for easy management
$runfor = $minutes * 60;
set_time_limit($runfor);
?>
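If the host permits exec(), the background approach from the question can be wired up with a small trigger script; the following is only a sketch, the file names are assumptions rather than code from the original post, and shared hosts may still limit long-running processes.
<?php
// trigger.php — starts the import as a separate background CLI process (sketch only;
// file names are hypothetical and the host must allow exec()).
// The importer then runs outside the web request, so the 30-second limit on the
// request itself no longer applies to it.
exec('php ' . escapeshellarg(__DIR__ . '/csv_import.php') . ' > /dev/null 2>&1 &');
echo 'Import started in the background.';

// csv_import.php would simply contain the normal line-by-line loop, e.g.:
// $lines = file(__DIR__ . '/lines1.csv', FILE_IGNORE_NEW_LINES);
// foreach ($lines as $line) { /* save the line */ }
?>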

Related

How to write PHP script for cronjob that uses cat /dev/null > to reset log files

I could use some assistance in making a PHP script that I could add to a cronjob that would include multiple (10 to 15) commands such as:
line1: cat /dev/null > /var/www/vhosts/website.com/logs/access_log.webstat
line2: cat /dev/null > /var/www/vhosts/website.com/logs/big_access_log
line3: cat /dev/null > /var/log/plesk-roundcube/largefile.log
and so on. The commands work great from a command line, but doing this daily is time consuming and the files grow way too large even though they are being rotated. Any assistance would be greatly appreciated, thank you.
Could you possibly use the shell_exec command to complete these actions:
Example:
<?php
$output = shell_exec('cat /dev/null > /var/www/vhosts/website.com/logs/access_log.webstat');
echo "<pre>$output</pre>";
?>
Then just create a cron job to run it at regular intervals.
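For instance, a crontab entry along these lines would run such a script once a day at midnight (the PHP binary location and the script path here are assumptions):
0 0 * * * /usr/bin/php /var/www/vhosts/website.com/httpdocs/reset_logs.php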
You can easily achieve the same result using native PHP code:
// The list of files to truncate
$listFiles = array(
    '/var/www/vhosts/website.com/logs/access_log.webstat',
    '/var/www/vhosts/website.com/logs/big_access_log',
    '/var/log/plesk-roundcube/largefile.log',
);

// Process all the files in the list
foreach ($listFiles as $filename) {
    // Open the file for writing ('w'):
    // truncate it to zero length if it exists, create it if it doesn't exist
    $fh = fopen($filename, 'w');
    // Close the file; this commits the new file size to the disk
    fclose($fh);
}
Thank you all for your assistance, the end result is awesome!
41 log files will no longer grow to gargantuan sizes. The implementation was as follows:
PHP script written as such:
<?php
$output = shell_exec('cat /dev/null > /var/www/vhosts/website.com/logs/access_log.webstat');
$output = shell_exec('cat /dev/null > /var/www/vhosts/website.com/logs/big_access_log');
$output = shell_exec('cat /dev/null > /var/log/plesk-roundcube/largefile.log');
?>
Then uploaded and set as a cron from the Plesk 12.5 panel. Tested and functioning beautifully!
It's quite strange; by default these files should be rotated by psa-logrotate.
Maybe something happened with the logrotate package or the cron task.
Here are the default settings for rotating a domain's logs:

Can I set max execution time in php.ini to be 30,000 in my case?

The scenario is that I want to save 4046 images to a folder (I have coded it in PHP). I guess it would take a maximum of 5 hours. Initially max execution time in php.ini was set to 30 seconds. After 650 images got saved, the browser froze and none of the images got saved, but the process was running and there was no error either. Can anybody give me an idea of the max execution time I should set in this case?
P.S. If my approach is wrong, do guide me.
Thanks
I'm not sure if your problem isn't caused simply by the wrong tool - PHP isn't meant for such long tasks.
If those images are on some server, better use an FTP client.
If you have the list of files saved in a text file, use cURL to download them.
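For illustration, a minimal cURL sketch of that idea might look like this; the list file, target folder, and naming are assumptions:
<?php
// download_list.php — sketch: download every URL listed in urls.txt into ./images/
// (the file name and folder are hypothetical)
$urls = file(__DIR__ . '/urls.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

foreach ($urls as $url) {
    // Write straight to disk instead of buffering the whole image in memory
    $fp = fopen(__DIR__ . '/images/' . basename(parse_url($url, PHP_URL_PATH)), 'w');
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_exec($ch);
    curl_close($ch);
    fclose($fp);
}
?>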
I'd highly suggest modifying your script to do the job incrementally. So basically break the job up into smaller parts and provide a break in between. The basic logic flow would be like this.
<?php
$start = (int) $_GET['start']; // where to start the job at
$end = $start + 250;           // queue this and the next 250 positions
for ($i = $start; $i <= $end; $i++) {
    // do the operations needed for position $i
}
header("Location: /urlToScript?start=" . ($end + 1)); // refresh the page, moving to the next 250 jobs
?>
This will do small parts of the total job and avoid any issues with the interpreter. Add any INI modifications to increase memory usage and time as needed and you'll be fine.
You can extend the time using this line in the script that saves the images.
ini_set('max_execution_time', 30000);
A second approach is to use .htaccess:
php_value max_execution_time 30000

How to parse Large CSV file without timing out?

I'm trying to parse a 50 megabyte .csv file. The file itself is fine, but I'm trying to get past the massive timeout issues involved. Everything is set upload-wise; I can easily upload and re-open the file, but after the browser times out, I receive a 500 Internal error.
My guess is I can save the file onto the server, open it, and keep a session value of what line I dealt with. After a certain line I reset the connection via a refresh and open the file at the line I left off with. Is this a doable idea? The previous developer made a very inefficient MySQL class and it controls the entire site, so I don't want to write my own class if I don't have to, and I don't want to mess with his class.
TL;DR version: Is it efficient to save the last line I'm currently on of a CSV file that has 38K lines of products and, after X number of rows, reset the connection and start from where I left off? Or is there another way to parse a large CSV file without timeouts?
NOTE: It's the PHP script execution time. Currently at 38K lines, it takes about 46 minutes and 5 seconds to run via the command line. It works correctly 100% of the time when I take it out of the browser, suggesting that it is a browser timeout. Chrome's timeout is not editable as far as Google has told me, and Firefox's timeout rarely works.
You could do something like this:
<?php
namespace database;

class importcsv
{
    private $crud;

    public function __construct($dbh, $table)
    {
        $this->crud = new \database\crud($dbh, $table);
        return $this;
    }

    public function import($columnNames, $csv, $seperator)
    {
        $lines = explode("\n", $csv);
        $x = 0;
        foreach ($lines as $line)
        {
            // Reset the timer on every line so the script never hits the limit
            \set_time_limit(30);
            $line = explode($seperator, $line);
            $data = new \stdClass();
            foreach ($line as $i => $item)
            {
                if (isset($columnNames[$i]) && !empty($columnNames[$i]))
                    $data->{$columnNames[$i]} = $item;
            }
            $x++;
            $this->crud->create($data);
        }
        return $x;
    }

    public function importFile($columnNames, $csvPath, $seperator)
    {
        if (file_exists($csvPath))
        {
            $content = file_get_contents($csvPath);
            return $this->import($columnNames, $content, $seperator);
        }
        else
        {
            // Error
        }
    }
}
TL;DR: calling \set_time_limit(30); every time you loop through a line might fix your timeout issues.
I suggest running PHP from the command line and setting it up as a cron job. This way you don't have to modify your code, there will be no timeout issue, and you can easily parse large CSV files.
Also check this link
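For illustration, a CLI importer that streams the file with fgetcsv() (instead of loading it all at once) could look roughly like this; the path and import_row() are placeholders, not code from the original answer:
<?php
// import_csv_cli.php — meant to be run from the command line or cron, not the browser.
// Sketch only: the file path and import_row() are hypothetical.
$handle = fopen('/path/to/products.csv', 'r');
if ($handle === false) {
    exit("Could not open CSV file\n");
}

while (($row = fgetcsv($handle)) !== false) {
    // $row is an array of the columns on the current line
    import_row($row); // hypothetical function: insert/update one product
}

fclose($handle);
?>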
Your post is a little unclear due to the typos and grammar, could you please edit?
If you are saying that the upload itself is okay, but the delay is in the processing of the file, then the easiest thing to do is to parse the file in parallel using multiple threads. You can use the Java built-in Executor class, or Quartz or Jetlang, to do this.
Find the size of the file or number of lines.
Select a Thread load (Say 1000 lines per thread)
Start an Executor
Read the file in a loop.
For each 1000 lines, create a Runnable and load it into the Executor
Start the Executor
Wait till all threads are finished
Each runnable does this:
Fetch a connection
Insert the 1000 lines
Log the results
Close the connection

How to show random content every 15 minutes - php

Ok so I have a .txt file with a bunch of urls. I've got a script that gets one of the lines randomly. I then included this into another page.
However, I want the url to change every 15 minutes. So I'm guessing I'm gonna need to use a cron, however I'm not sure how I should put it all into place.
I found that if you include a file, it's still going to give a random output, so I'm guessing that if I run the cron and the include file it's going to get messy.
So what I'm thinking is I have a script that randomly selects a url from my initial text file, then it saves it to another .txt file, and I include that file on the final page.
I just found this which is sort of in the right direction:
Include php code within echo from a random text
I'm not the best with writing php (can understand it perfectly) so all help is appreciated!
So what I'm thinking is I have a script that randomly selects a url from my initial text file then it saves it to another .txt file and I include that file on the final page.
That's pretty much what I would do.
To re-generate that file, though, you don't necessarily need a cron.
You could use the following idea :
If the file has been modified less than 15 minutes ago (which you can find out using filemtime() and comparing it with time())
then, use what's in the file
else
re-generate the file, randomly choosing one URL from the big file
and use the newly generated file
This way, there's no need for a cron: the first user that arrives more than 15 minutes after the previous modification of the file will re-generate it, with a new URL.
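A rough sketch of that idea, assuming the big list is urls.txt and the cached choice lives in current_url.txt (both names are just placeholders):
<?php
// Sketch of the "regenerate every 15 minutes" idea; file names are hypothetical.
$source = __DIR__ . '/urls.txt';
$cache  = __DIR__ . '/current_url.txt';

if (!file_exists($cache) || time() - filemtime($cache) > 15 * 60) {
    // Cache is missing or older than 15 minutes: pick a new random URL
    $urls = file($source, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    file_put_contents($cache, $urls[array_rand($urls)]);
}

// Use whatever is currently cached
echo trim(file_get_contents($cache));
?>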
Alright so I sorta solved my own question:
<?php
// load the file that contains the urls
$adfile = "urls.txt";
$ads = array();

// one url per line
$fh = fopen($adfile, "r");
while (!feof($fh)) {
    $line = fgets($fh, 10240);
    $line = trim($line);
    if ($line != "") {
        $ads[] = $line;
    }
}
fclose($fh);

// randomly pick a url
$num = count($ads);
$idx = rand(0, $num - 1);

$f = fopen("output.txt", "w");
fwrite($f, $ads[$idx]);
fclose($f);
?>
However, is there any way I can delete the chosen line once it has been picked?
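One way to do that (just a sketch, continuing from the snippet above) is to rewrite urls.txt without the entry that was picked:
<?php
// Sketch: after picking $ads[$idx], drop that entry and write the list back out.
// $ads, $idx and $adfile come from the snippet above.
unset($ads[$idx]); // remove the chosen url from the array
file_put_contents($adfile, implode("\n", $ads)); // rewrite urls.txt without it
?>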

How to schedule PHP script on Job Finish (from CLI)

I am processing a big .gz file using PHP (transferring data from gz to MySQL).
It takes about 10 minutes per .gz file.
I have a lot of .gz files to be processed.
After PHP is finished with one file, I have to manually change the PHP script to select another .gz file and then run the script again manually.
I want it to automatically run the next job to process the next file.
The gz files are named 1, 2, 3, 4, 5 ...
I can simply make a loop, something like this (process files 1 - 5):
for ($i = 1; $i <= 5; $i++)
{
    $file = gzfile($i.'.gz');
    ...gz content processing...
}
However, since the gz files are really big, I cannot do that, because if I use this loop, PHP will process multiple big gz files in a single script job (which takes a lot of memory).
What I want to do is: after PHP is finished with one job, I want a new job to process the next file.
Maybe it's going to be something like this:
$file = gzfile($_GET['filename'].'.gz');
...gz content processing...
Thank You
If you clean up after processing and free all memory using unset(), you could simply wrap the whole script in a foreach (glob(...) as $filename) loop. Like this:
<?php
foreach (glob(...) as $filename) {
    // your script code here
    unset($thisVar, $thatVar, ...);
}
?>
What you should do is:
Schedule a cronjob to run your PHP script every x minutes
When the script runs, check if there is a lock file in place; if not, create one and start processing the next unprocessed gz file; if yes, abort (a sketch of this follows below)
Wait for the queue to get cleared
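A minimal sketch of that lock-file pattern, where the lock path, the ".done" markers, and process_gz() are all assumptions rather than code from the answer:
<?php
// process_next.php — run every few minutes from cron (sketch; names are hypothetical).
$lock = __DIR__ . '/import.lock';

// Another run is still busy: bail out immediately
if (file_exists($lock)) {
    exit;
}
touch($lock);

// Find the next unprocessed file (1.gz, 2.gz, ...), marked with ".done" once finished
foreach (glob(__DIR__ . '/*.gz') as $gzFile) {
    if (!file_exists($gzFile . '.done')) {
        $lines = gzfile($gzFile); // same gz reading as in the question
        process_gz($lines);       // hypothetical function: the existing import logic
        touch($gzFile . '.done'); // mark this file as finished
        break;                    // one file per run, as suggested above
    }
}

unlink($lock);
?>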
You should call the PHP script with an argument from a shell script. Here's the documentation on how to use command-line parameters in PHP: http://php.net/manual/en/features.commandline.php
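For example, the processing script could read the file number from $argv; a sketch (the script name is hypothetical):
<?php
// process_one.php — invoked as: php process_one.php 1   (to process 1.gz)
if ($argc < 2) {
    exit("Usage: php process_one.php <file-number>\n");
}

$file = gzfile($argv[1] . '.gz');
// ...gz content processing...
?>
A small shell loop such as for n in 1 2 3 4 5; do php process_one.php $n; done would then run the jobs one after another, each in its own PHP process.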
Or, I can't try it right now, but you could give unset($file) a chance after processing the gzip:
for ($i = 1; $i <= 5; $i++)
{
    $file = gzfile($i.'.gz');
    ...gz content processing...
    unset($file);
}
