I have a large JSON file (>100MB) which I want to parse and then convert its objects into WordPress posts.
I have written a function for this task, but it is not able to loop through all the objects; instead it dies with PHP Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 92 bytes).
The function handles files under 1MB, but for anything larger it fails.
I have asked the server administrators to increase the memory limit, and no memory errors are visible now, but it still does not process the whole JSON.
I have gone through many posts and questions but couldn't find or understand anything useful. This is also my first attempt at writing such a function, so any help or guidance would be appreciated.
EDIT
Added the code for the function used:
function create_post_from_json($json_key) {
    $json_options = get_option('json_file_data');
    // Fetch the remote JSON file and decode the whole body in memory
    $obj = wp_remote_retrieve_body(wp_remote_get($json_options[$json_key]['url'], array('timeout' => -1)));
    $obj = json_decode($obj);
    $new_posts_id_array = array();
    $new_json_posts_id_array = get_option('json_' . $json_options[$json_key]['name'] . "_post_ids");
    // IDs of products that have already been imported
    $id_stored = get_option('json_' . $json_options[$json_key]['name']);
    if (!$id_stored) {
        $id_stored = array();
    }
    foreach ($obj->products as $one_post) {
        $post_char_id = $one_post->ID;
        $new_posts_id_array[] = $post_char_id;
        $cat_array = array();
        if (!in_array($post_char_id, $id_stored)) {
            $id_stored[] = $post_char_id;
            update_option('json_' . $json_options[$json_key]['name'], $id_stored);
            $post = array(
                'post_title'   => $one_post->name,
                'post_status'  => 'publish',
                'post_author'  => 1,
                'post_content' => $one_post->description,
                'post_type'    => 'destinations',
            );
            $new_post_id = wp_insert_post($post); // ID of the newly inserted post
        }
    }
    return true;
}
Chances are your script is running out of execution time.
You can raise the maximum execution time using set_time_limit(100); // 100 seconds
Or, to remove the limit entirely, use set_time_limit(0);
Note: call set_time_limit() at the top of your script.
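For example, a minimal sketch of that placement, assuming the import function from the question above (the key 0 is only an example value):
<?php
// Lift the execution time limit before kicking off the long-running import.
// create_post_from_json() is the asker's function; the key 0 is a placeholder.
set_time_limit(0);          // 0 = no time limit; prefer a finite value on shared hosts
create_post_from_json(0);   // the long-running JSON import runs after the limit is lifted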
Related
I'm importing data from a CRM server via JSON into WordPress.
I know the load may take several minutes, so the script runs outside WordPress and I execute it with "php load_data.php".
But when the script reaches the part where the images are uploaded, it throws an error:
php: time limit exceeded `Success' # fatal/cache.c/GetImagePixelCache/2042.
and it stops.
This is my code to upload an image to the media library:
<?php
function upload_image_to_media( $postid, $image_url, $set_featured = 0 ) {
    // Download the remote image to a temporary file
    $tmp = download_url( $image_url );
    // Check for download errors before using the temporary file
    if ( is_wp_error( $tmp ) ) {
        return false;
    }
    // Fix the filename for query strings
    preg_match( '/[^\?]+\.(jpg|jpe|jpeg|gif|png)/i', $image_url, $matches );
    $before_name = $postid == 0 ? 'upload' : $postid;
    $file_array = array(
        'name'     => $before_name . '_' . basename( $matches[0] ),
        'tmp_name' => $tmp,
    );
    // Sideload the file into the media library and attach it to the post
    $media_id = media_handle_sideload( $file_array, $postid );
    // Check for sideload errors and clean up the temporary file
    if ( is_wp_error( $media_id ) ) {
        @unlink( $file_array['tmp_name'] );
        return false;
    }
    if ( $postid != 0 && $set_featured == 1 ) {
        set_post_thumbnail( $postid, $media_id );
    }
    return $media_id;
}
?>
There are about 50 posts, and each one has 10 large images.
Regards
The default execution time is 30 seconds, so it looks like you are exceeding that. We have a similar script that downloads up to a couple of thousand photos per run; adding set_time_limit(60) to reset the timer on each loop iteration fixed our timeout issues. In your case you can probably just add it at the beginning of the function. Just be very careful you don't create any infinite loops, as they will run forever (or until the next reboot).
To make sure it works, you can add the following as the first line inside your upload function:
set_time_limit(0);
This will allow the script to run until it is finished, but watch it: it can also run forever, which WILL hurt your server's available memory. To see whether the script works, put that in, then adjust it to a sensible time limit if need be.
If you get another error, or the same one, that will at least confirm it's not a time issue (error messages are not always accurate).
The other possibility is that you are on a shared server and are exceeding its time allotment for your account (continuous processor use for more than 30 seconds, for example).
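As a rough illustration, a minimal sketch of resetting the timer inside the image loop; the $images array and $postid are hypothetical, and upload_image_to_media() is the function from the question:
<?php
// Hypothetical list of image URLs for one imported post
$images = array('https://example.com/a.jpg', 'https://example.com/b.jpg');
$postid = 123; // placeholder post ID

foreach ($images as $index => $image_url) {
    // Give each download a fresh 60-second budget so one slow image
    // doesn't blow the cumulative execution time limit.
    set_time_limit(60);
    // Mark the first image as the featured image
    upload_image_to_media($postid, $image_url, $index === 0 ? 1 : 0);
}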
I am trying to upload a 9GB .tar.gz file which I created using the cPanel Backup wizard. This file should be stored as-is on Amazon Glacier, but Amazon Glacier has an upload limit of 4GB per request.
Is there a way to do this using PHP, the AWS SDK v2, and uploadMultipartPart?
This is the code I have so far:
<?php
require 'aws-autoloader.php';
use Aws\Glacier\GlacierClient;
use Aws\Glacier\Model\MultipartUpload\UploadPartGenerator;
//#####################################################################
//SET AMAZON GLACIER VARIABLES
//#####################################################################
$key = 'XXXXXXXXXXXXXXXXX';
$secret = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX';
$region = 'us-west-2';
$accountId = 'XXXXXXXXXXXX';
$vaultName = 'XXXXXXXXXXXX';
$partSize = 4 * 1024 * 1024;
$fileLocation = 'path/to/.tar.gz file/';
//#####################################################################
//DECLARE THE AMAZON CLIENT
//#####################################################################
$client = GlacierClient::factory(array(
'key' => $key,
'secret' => $secret,
'region' => $region,
));
//#####################################################################
//GET ALL FILES INTO AN ARRAY
//#####################################################################
$files = scandir($fileLocation);
$filename = $files[2];
//#####################################################################
// USE HELPERS IN THE SDK TO GET INFORMATION ABOUT EACH OF THE PARTS
//#####################################################################
$archiveData = fopen($fileLocation.$filename, 'r');
$parts = UploadPartGenerator::factory($archiveData, $partSize);
//#####################################################################
// INITIATE THE UPLOAD AND GET THE UPLOAD ID
//#####################################################################
$result = $client->initiateMultipartUpload(array(
'vaultName' =>$vaultName,
'partSize' => $partSize,
));
$uploadId = $result->get('uploadId');
//#####################################################################
// UPLOAD EACH PART INDIVIDUALLY USING DATA FROM THE PART GENERATOR
//#####################################################################
$archiveData = fopen($fileLocation.$filename, 'r');
foreach ($parts as $part) {
set_time_limit (120);
fseek($archiveData, $part->getOffset());
$client->uploadMultipartPart(array(
'vaultName' => $vaultName,
'uploadId' => $uploadId,
'body' => fread($archiveData, $part->getSize()),
'range' => $part->getFormattedRange(),
'checksum' => $part->getChecksum(),
'ContentSHA256' => $part->getContentHash(),
));
}
//#####################################################################
// COMPLETE THE UPLOAD BY USING DATA AGGREGATED BY THE PART GENERATOR
//#####################################################################
$result = $client->completeMultipartUpload(array(
'vaultName' => $vaultName,
'uploadId' => $uploadId,
'archiveSize' => $parts->getArchiveSize(),
'checksum' => $parts->getRootChecksum(),
));
$archiveId = $result->get('archiveId');
fclose($archiveData);
?>
Note that partSize needs to be n * 1024 * 1024, where n is a power of 2. You're using 104857600 = 100 * 1024 * 1024; your n = 100 is even, but it is not a power of two. http://docs.aws.amazon.com/amazonglacier/latest/dev/api-multipart-initiate-upload.html
I don't have a complete answer, but it would help if you specified what error you are getting.
Also from the docs: "The minimum allowable part size is 1 MB, and the maximum is 4 GB (4096 MB)." In other words, n >= 1, n <= 4096, and n is a power of 2. So what's a good number to use? I think the idea is to use a smaller n if you have problems, subject to these constraints:
You pay per part: $0.050 per 1,000 requests in US-East.
There's a maximum number of parts: 10,000. For your 9GB upload, that works out to a part size of roughly 966,368 bytes ~ 0.9 MB if you use the maximum number of parts, so 0.9 MB is the theoretical minimum part size for 9GB. You are right to want a part size larger than 1 MB to stay comfortably within the limits.
There's also a reason not to use overly large part sizes: it has to do with memory, CPU, and saturating your internet connection. All I can really say is that the software I use defaults to 16 MB. Here is a discussion of the tradeoffs on its issue tracker: https://github.com/vsespb/mt-aws-glacier/issues/55
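As a rough aid, a minimal sketch (plain PHP, no SDK calls) of picking the smallest power-of-two part size that keeps a given archive within the 10,000-part limit; the 9GB figure is just the example from the question:
<?php
// Pick the smallest valid Glacier part size (n MB, n a power of two, 1..4096)
// that keeps the upload within the 10,000-part limit.
function pick_glacier_part_size($archiveBytes, $maxParts = 10000) {
    for ($n = 1; $n <= 4096; $n *= 2) {
        $partSize = $n * 1024 * 1024;
        if (ceil($archiveBytes / $partSize) <= $maxParts) {
            return $partSize;
        }
    }
    return false; // archive too large even with 4GB parts
}

$archiveBytes = 9 * 1024 * 1024 * 1024;     // the 9GB backup from the question
echo pick_glacier_part_size($archiveBytes); // 1048576 (1MB parts, ~9216 parts)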
I am creating an Excel sheet using:
Codeigniter 2.2.1
PHP 5.4.25
Apache 2.4.7
XAMPP 1.8.2
PHPExcel 1.8.0
I am fetching 35,000 rows (mostly empty) with 79 columns from the database and writing them to an Excel file (Excel5).
It works just fine for 30,000 rows, with a peak memory usage of 1.44GB and a file size of 24MB. But when I go for 35,000 rows it dies with:
Fatal error: Out of memory (allocated 1780219904) (tried to allocate 17301483 bytes) in E:\XAMPP\htdocs\ProjectName\application\libraries\PHPExcel\Writer\Excel5\BIFFwriter.php on line 144
Creating the PHPExcel object uses a huge amount of memory and eventually fails. I have set "memory_limit = -1" and "max_execution_time = 1000" in php.ini and tried the different cache storage methods in PHPExcel.
My algorithm in the controller looks like this:
public function write_controller() {
error_reporting(E_ALL);
ini_set("display_errors", 1);
ini_set('memory_limit', '-1'); //-1 for unlimited memory
$dir = "assets/output/";
//FIRST CHECK IF PREVIOUS FILE EXISTS OR NOT
$this->clear_directory($dir);
// Loading PHPExcel library
$this->load->library('PHPExcel');
$this->load->library('PHPExcel/IOFactory');
$cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
$cacheSettings = array('memoryCacheSize' => '5000MB', 'cacheTime' => '1000');
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);
//First Create the xls file and then insert rest of the data
$objPHPExcel = new PHPExcel();
$objPHPExcel->getProperties()->setTitle("export")->setDescription("none");
//activate sheet number 1
$objPHPExcel->setActiveSheetIndex(0);
//Setting font styles
$objPHPExcel->getActiveSheet()->getDefaultStyle()->getFont()->setName('Arial')->setSize(8)->setBold(false);
//Setting number format as TEXT
$objPHPExcel->getActiveSheet()->getDefaultStyle()->getNumberFormat()->setFormatCode(PHPExcel_Style_NumberFormat::FORMAT_TEXT);
//Freezing first top row
$objPHPExcel->getActiveSheet()->freezePane('A2');
$objWorksheet = $objPHPExcel->getActiveSheet();
$row1 = 1;
$objWorksheet->setCellValueByColumnAndRow(0, $row1, "Site name");
$objWorksheet->setCellValueByColumnAndRow(1, $row1, "Vendor_1");
$objWorksheet->setCellValueByColumnAndRow(2, $row1, "Status");
$objWorksheet->setCellValueByColumnAndRow(3, $row1, "Easting");
$objWorksheet->setCellValueByColumnAndRow(4, $row1, "Northing");
$objWorksheet->setCellValueByColumnAndRow(5, $row1, "Sector_1");
.
.
.
//Rest of the 74 columns
$style = array(
'alignment' => array(
'horizontal' => PHPExcel_Style_Alignment::HORIZONTAL_CENTER,
'vertical' => PHPExcel_Style_Alignment::VERTICAL_CENTER)
);
$objWorksheet->getDefaultStyle()->applyFromArray($style);
$objWriter = IOFactory::createWriter($objPHPExcel, 'Excel5');
$saved_location ='assets/output/Piano11.xls';
$objWriter->save($saved_location);
//Now reading the saved xls file
$objReader = new PHPExcel_Reader_Excel5();
$newPHPExcel = $objReader->load($saved_location);
$newWorksheet = $newPHPExcel->getActiveSheet();
//Now insert rest of the data from Piano table which will come from database
$table_name = 'piano_test';
$query = $this->db->get('tbl_piano');
if (!$query) {
return false;
}
// Fetching the data from table
$fields = $query->list_fields();
$row = 2;
foreach ($query->result() as $data) {
set_time_limit(0);
$col = 0;
foreach ($fields as $field) {
$newWorksheet->setCellValueByColumnAndRow($col, $row, $data->$field); //<- This skips leading 0s
$col++;
}
$row++;
}
$newobjWriter = IOFactory::createWriter($newPHPExcel, 'Excel5');
$newobjWriter->save('assets/output/Piano11.xls');
echo 'Memory peak usage: <b>'.$this->convert(memory_get_peak_usage(true)).'</b><br/>';
gc_collect_cycles();//garbage collector
echo 'inserted.';
}
Is there any way to minimize the memory usage and execution time? Any alternative solution? Or should I change my algorithm?
You're using php://temp for caching, but with the settings
$cacheSettings = array('memoryCacheSize' => '5000MB', 'cacheTime' => '1000');
This means that PHPExcel will use 5000MB (5GB) of PHP memory before it starts using php://temp for caching. I'd be surprised if your php.ini memory settings allowed PHP to use that much memory.
You should use a much lower value for memoryCacheSize, perhaps 512MB, so that PHPExcel only uses 512MB of PHP memory for caching cell data before it switches to php://temp.
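For example, a minimal sketch of the corrected cache configuration (512MB is just the suggested starting point; tune it to what your server allows):
<?php
// Cache cell data in php://temp once the in-memory cache exceeds 512MB
$cacheMethod   = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
$cacheSettings = array('memoryCacheSize' => '512MB');
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);

$objPHPExcel = new PHPExcel(); // create the workbook only after the cache method is set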
So I'm trying to cache an array in a file and use it somewhere else.
import.php
// The code above reads each line of the CSV and puts it into an array
// (one line per element of the multidimensional array $csv)
$export = var_export($csv, true);
$content = "<?php \$data=" . $export . ";?>";
$target_path1 = "/var/www/html/Samples/test";
file_put_contents($target_path1 . "recordset.php", $content);
somewhere.php
ini_set('memory_limit','-1');
include_once("/var/www/html/Samples/test/recordset.php");
print_r($data);
Now, I've included recordset.php in somewhere.php to use the array stored in it. It works fine when the uploaded CSV file has 5,000 lines, but if I try to upload a CSV with 50,000 lines, for example, I get a fatal error:
Fatal error: Allowed memory size of 67108864 bytes exhausted (tried to allocate 79691776 bytes)
How can I fix this, or is there a more convenient way to achieve what I want? Regarding performance, should I consider the CPU of the server? I've already overridden the memory limit and set it to -1 in somewhere.php.
There are 2 ways to fix this:
You can increase the memory (RAM) on the server, since memory_limit can only use memory that is actually available on the server, and it seems you have very little RAM available for PHP.
To check the total RAM on a Linux server:
<?php
$fh = fopen('/proc/meminfo','r');
$mem = 0;
while ($line = fgets($fh)) {
$pieces = array();
if (preg_match('/^MemTotal:\s+(\d+)\skB$/', $line, $pieces)) {
$mem = $pieces[1];
break;
}
}
fclose($fh);
echo "$mem kB RAM found"; ?>
Source: get server ram with php
You should parse your CSV file in chunks, releasing the occupied memory after each chunk with unset().
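For instance, a rough sketch of processing the CSV in fixed-size chunks instead of exporting the whole array into recordset.php; the chunk size, file path, and process_chunk() callback are placeholders for your own values and logic:
<?php
// Read the CSV in chunks of 1,000 rows so only one chunk is in memory at a time.
// 'upload.csv' and process_chunk() are placeholders.
$handle    = fopen('/var/www/html/Samples/test/upload.csv', 'r');
$chunk     = array();
$chunkSize = 1000;

while (($row = fgetcsv($handle)) !== false) {
    $chunk[] = $row;
    if (count($chunk) >= $chunkSize) {
        process_chunk($chunk);  // your own handling of 1,000 parsed rows
        unset($chunk);          // release the memory held by this chunk
        $chunk = array();
    }
}
if (!empty($chunk)) {
    process_chunk($chunk);      // handle the final partial chunk
}
fclose($handle);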
For one of my projects I need to import a huge text file (~950MB). I'm using Symfony2 and Doctrine 2.
My problem is that I get errors like:
Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to allocate 24 bytes)
The error still occurs even if I increase the memory limit to 1GB.
I tried to analyze the problem using Xdebug and KCacheGrind (as part of PHPEdit), but I don't really understand the values :(
I'm looking for a quick and simple tool or method (I don't have much time) to find out why memory is being allocated and not freed again.
Edit
To clear some things up here is my code:
$handle = fopen($geonameBasePath . 'allCountries.txt','r');
$i = 0;
$batchSize = 100;
if($handle) {
while (($buffer = fgets($handle,16384)) !== false) {
if( $buffer[0] == '#') //skip comments
continue;
//split parts
$parts = explode("\t",$buffer);
if( $parts[6] != 'P')
continue;
if( $i%$batchSize == 0 ) {
echo 'Flush & Clear' . PHP_EOL;
$em->flush();
$em->clear();
}
$entity = $em->getRepository('MyApplicationBundle:City')->findOneByGeonameId( $parts[0] );
if( $entity !== null) {
$i++;
continue;
}
//create city object
$city = new City();
$city->setGeonameId( $parts[0] );
$city->setName( $parts[1] );
$city->setInternationalName( $parts[2] );
$city->setLatitude($parts[4] );
$city->setLongitude( $parts[5] );
$city->setCountry( $em->getRepository('MyApplicationBundle:Country')->findOneByIsoCode( $parts[8] ) );
$em->persist($city);
unset($city);
unset($entity);
unset($parts);
unset($buffer);
echo $i . PHP_EOL;
$i++;
}
}
fclose($handle);
Things I have tried, but nothing helped:
Adding second parameter to fgets
Increasing memory_limit
Unsetting vars
Increasing the memory limit is not going to be enough. When importing files like that, you should buffer the reading.
$f = fopen('yourfile', 'r');
while (!feof($f)) {
    $data = fread($f, 4096); // read the file in 4KB chunks
    // Do your stuff using the read $data
}
fclose($f);
Update:
When working with an ORM, you have to understand that nothing is actually inserted into the database until flush() is called. Until then, all those objects are kept by the ORM, tagged as "to be inserted". Only when flush() is called does the ORM go through the collection and start inserting.
Solution 1: Flush often, and clear the entity manager afterwards.
Solution 2: Don't use the ORM. Go for plain SQL commands; they take up far less memory than the object + ORM approach.
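As a rough illustration of solution 2, a minimal sketch that bypasses managed entities and inserts rows through Doctrine's DBAL connection; the table name 'city' and the column names are assumptions based on the entity setters in the question:
<?php
// Sketch of solution 2: insert via the DBAL connection instead of persisting entities.
// 'city' and the column names below are assumed, not taken from the question's schema.
$conn = $em->getConnection();

$conn->insert('city', array(
    'geoname_id'         => $parts[0],
    'name'               => $parts[1],
    'international_name' => $parts[2],
    'latitude'           => $parts[4],
    'longitude'          => $parts[5],
));
// No entity objects are created, so there is nothing for the ORM to keep in memory.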
33554432 bytes is 32MB.
Change the memory limit in php.ini, for example to 75MB:
memory_limit = 75M
and restart the server.
Instead of reading the whole file at once, you should read it line by line and process the data as soon as each line has been read. Do NOT try to fit EVERYTHING in memory; you will fail. Even if you can fit the raw text file in RAM, you will not also be able to hold the data as PHP objects/variables at the same time, since PHP needs far more memory for each of them.
What I suggest instead is (a sketch follows below):
a) read a new line,
b) parse the data in the line,
c) create the new object to store in the database,
d) go to step a, unset()ting the old object first or reusing its memory.
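A minimal sketch of that loop, with the parsing and persistence steps left as placeholders (parse_line() and save_to_database() are hypothetical helpers, not part of the question's code):
<?php
// Hypothetical line-by-line import loop; parse_line() and save_to_database()
// stand in for your own parsing and persistence code.
$handle = fopen('allCountries.txt', 'r');

while (($line = fgets($handle)) !== false) {   // a) read a new line
    $record = parse_line($line);               // b) parse the data in the line
    $object = save_to_database($record);       // c) create/store the new object
    unset($record, $object);                   // d) free the memory before the next line
}

fclose($handle);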