xml_parse huge file PHP

I have an issue with the PHP function xml_parse: it does not work with huge files. I have an XML file that is 10 MB in size.
The problem is that I have the old XML-RPC library from Zend, and there are other functions involved as well (element handlers and case folding):
$parser_resource = xml_parser_create('utf-8');
xml_parser_set_option($parser_resource, XML_OPTION_CASE_FOLDING, true);
xml_set_element_handler($parser_resource, 'XML_RPC_se', 'XML_RPC_ee');
xml_set_character_data_handler($parser_resource, 'XML_RPC_cd');
if (!xml_parse($parser_resource, $data, 1)) {
    // ends here with the 10 MB file
}
Elsewhere I just use simplexml_load_file with the LIBXML_PARSEHUGE option, but in this case I don't know what I can do.
The best solution would be if xml_parse also had some option for huge files.
Thank you for your advice.
The error is:
XML error: No memory at line ...

The chunk of the file you pass to xml_parse may simply be too large.
If you read the file with fread, e.g.
while ($data = fread($fp, 1024*1024)) {...}
use a smaller length (in my case it had to be smaller than 10 MB), e.g. 1 MB, and call xml_parse inside the while loop.
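For example, a minimal sketch of that loop ('big.xml' is a placeholder for the real file path, and $parser_resource is the parser set up as in the question):
$fp = fopen('big.xml', 'r');
while ($data = fread($fp, 1024*1024)) { // 1 MB chunks
    if (!xml_parse($parser_resource, $data, false)) {
        // parse error: see xml_error_string(xml_get_error_code($parser_resource))
        break;
    }
}
xml_parse($parser_resource, '', true); // tell the parser the input is finished
fclose($fp);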

Related

file_put_contents truncates content to max int on 32-bit php

I have Nextcloud running on my Raspberry Pi 4, which uses a 32-bit architecture.
When trying to upload a file larger than 2147483647 bytes, the file is uploaded completely and is accessible through SSH. However, when I try to access it in any way through the web client, it fails. The error seen in the web client's logging is the following:
file_put_contents(): content truncated from 4118394086 to 2147483647 bytes at /var/www/html/nextcloud/lib/private/Files/Storage/Local.php#556
When I try to access the file, this error message is logged:
Sabre\DAV\Exception\RequestedRangeNotSatisfiable: The start offset (0) exceeded the size of the entity (-176573210)
The file in question here is a .mp4 file, but I have been able to replicate the issue with other file types.
I have read that the 2 GB upload limit for 32-bit architectures has been fixed, but I don't know why it still fails in my case.
Problem
Well, you can't get around this by tweaking any config, since it's a hard limit set by PHP (PHP_INT_MAX on a 32-bit architecture is 2 GB: 2^(32-1) - 1 = 2147483647).
There is hope
You can patch manually, or even better, override the responsible Nextcloud code:
Patch manually (since you are not using Composer, this is what you probably want to do):
// this one is pretty memory expensive, but works with a resource and a string
// Test: 4 GB file, 2 GB chunks (at 32 bits)
// 12 GB memory usage! - hell no
public function file_put_contents($path, $data) {
    $bytesWritten = 0;
    foreach (explode(PHP_EOL, chunk_split($data, PHP_INT_MAX, PHP_EOL)) as $chunk) {
        $bytesWritten += file_put_contents($this->getSourcePath($path), $chunk, FILE_APPEND|LOCK_EX);
    }
    return $bytesWritten;
}
or
// better use this in case $data is a resource - I don't know, you have to test it!
// Test: 4 GB file, 1 MB chunks
// 2 MB memory usage - much better :)
public function file_put_contents($path, $data) {
    $bytesWritten = 0;
    while ($chunk = fread($data, 2**20)) {
        $bytesWritten += file_put_contents($this->getSourcePath($path), $chunk, FILE_APPEND|LOCK_EX);
    }
    return $bytesWritten;
}
In case you want to override (Composer):
class PatchedLocal extends OC\Files\Storage\Local {
    public function file_put_contents($path, $data) {
        // same as above ...
    }
}
And here is everything you need to know to force the autoloader to use your PatchedLocal: as mentioned, you want to use Composer's PSR-4 implementation for this, via composer.json.

Php break or split a big ICS file

I have an ICS file which is 2 GB in size, and I want to parse the ICS data from that file, but PHP is not able to read such a big file and I am getting a fatal "Out of memory" error even though I have set ini_set('memory_limit', '-1').
So I want to somehow break or split the big ICS file into smaller files, or find some other way to read the data from such a big ICS file.
I have some small files and they all work fine; I can extract data from them, but the 2 GB ICS file is the one that really matters for me to extract/parse.
Thanks in advance
The usual method to handle an out-of-memory error is to allocate more RAM to PHP in php.ini, but because you have a 2 GB file that is not a valid option unless you have a lot of memory on your system. Basically, you are reading the file the wrong way. Rather than reading the whole file and saving it to a variable, which costs as much RAM as the size of the file, you can parse it line by line or byte by byte, depending on the format of the file you are working with. Here is a basic example that you can build on:
<?php
$handle = fopen("fileName", "r");
if ($handle) {
    while (($line = fgets($handle)) !== false) {
        // process line here
    }
    fclose($handle);
} else {
    // handle file read error
}
?>
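Because ICS is a line-oriented format (events are delimited by BEGIN:VEVENT / END:VEVENT lines), the same fgets loop can be extended to split the big calendar into smaller files. This is only a rough sketch under those assumptions; the file names and the events-per-part limit are made up for illustration, and calendar header properties (VERSION, PRODID, timezones) are not carried over:
<?php
$handle = fopen("big.ics", "r");
$out = null;
$part = 0;
$eventsInPart = 0;
$maxEventsPerPart = 5000; // arbitrary limit, tune to taste

while (($line = fgets($handle)) !== false) {
    if (strpos($line, 'BEGIN:VEVENT') === 0) {
        // start a new part file when the current one is full (or none is open yet)
        if ($out === null || $eventsInPart >= $maxEventsPerPart) {
            if ($out !== null) {
                fwrite($out, "END:VCALENDAR\r\n");
                fclose($out);
            }
            $out = fopen("part_" . (++$part) . ".ics", "w");
            fwrite($out, "BEGIN:VCALENDAR\r\n");
            $eventsInPart = 0;
        }
        $eventsInPart++;
    }
    // copy event lines; skip the original calendar wrapper lines
    if ($out !== null && strpos($line, 'BEGIN:VCALENDAR') !== 0 && strpos($line, 'END:VCALENDAR') !== 0) {
        fwrite($out, $line);
    }
}
if ($out !== null) {
    fwrite($out, "END:VCALENDAR\r\n");
    fclose($out);
}
fclose($handle);
?>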
Hope this helps you.

Issue to determine a currently downloading file size?

I have an interesting problem. I need to show a progress bar for an asynchronous file download done by a PHP script. I thought the best way to do it is that, before the download starts, the script creates a txt file containing the file name and the original file size.
Now there is an AJAX function calling a PHP script which is intended to check the local file size. I have 2 main problems:
the files are bigger than 2 GB, so the filesize() function is out of business
I tried to find a different way to determine the local file size, like this:
function getSize($filename) {
    $a = fopen($filename, 'r');
    fseek($a, 0, SEEK_END);
    $filesize = ftell($a);
    fclose($a);
    return $filesize;
}
Unfortunately, the second way gives me a ton of errors, presumably because I cannot open a file which is currently being downloaded.
Is there any way I can check the size of a file which is currently downloading and whose size will be bigger than 2 GB?
Any help is greatly appreciated.
I found the solution by using an exec() function:
exec("ls -s -k /path/to/your/file/".$file_name,$out);
Just change your OS and PHP to 64-bit, and you can still use filesize().
From filesize() manual:
Return Values
Returns the size of the file in bytes, or FALSE (and generates an error of level E_WARNING) in case of an error.
Note: Because PHP's integer type is signed and many platforms use 32bit integers, some filesystem functions may return unexpected results for files which are larger than 2GB.

How to decompress gzip stream chunk by chunk using PHP?

I can't read from an active HTTP gzip stream chunk by chunk.
In short, it can't decompress the stream chunk by chunk: it needs the first chunk when decompressing the second one, and the first and second when decompressing the third one, otherwise it returns strange characters (the raw gzip string, I guess).
I guess there is no existing way to do this, as I have googled it for 2 days; anyway, I'll be appreciative if you have any suggestions.
Following is the function which I am using for decompressing:
function gzdecode1($data){
    $g = tempnam('./', 'gz');
    file_put_contents($g, $data);
    ob_start();
    readgzfile($g);
    $d = ob_get_clean();
    unlink($g);
    return $d;
}
Here are ten example chunks
http://2.youpiaoma.com/chunk_s.rar
Use gzopen() and gzread()
$h = gzopen($filename, 'r');
while ($chunk = gzread($h, $chunksize)) {
    // do magic
}
If it's a remote file, you might need to enable remote file opens (allow_url_fopen); I've never done it in that kind of environment, though.
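If the chunks are arriving from a live HTTP response rather than a file on disk, PHP 7's incremental zlib API (inflate_init() / inflate_add()) does exactly this kind of chunk-by-chunk decompression. A minimal sketch, with $chunks standing in for however the pieces arrive in your code:
$ctx = inflate_init(ZLIB_ENCODING_GZIP); // gzip-wrapped DEFLATE stream
foreach ($chunks as $chunk) { // the raw compressed pieces, in arrival order
    $plain = inflate_add($ctx, $chunk, ZLIB_SYNC_FLUSH);
    // do magic with $plain
}
$plain = inflate_add($ctx, '', ZLIB_FINISH); // flush whatever is left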

gzdeflate() and large amount of data

I've been building a class to create ZIP files in PHP, as an alternative to ZipArchive in case it is not available on the server. Something to use with those free hosts.
It is already sort of working: it builds the ZIP structures in PHP and uses gzdeflate() to generate the compressed data.
The problem is that gzdeflate() requires me to load the whole file into memory, and I want the class to work within a 32 MB memory limit. Currently it stores files bigger than 16 MB with no compression at all.
I imagine I should compress the data in blocks, 16 MB by 16 MB, but I don't know how to concatenate the results of two gzdeflate() calls.
I've been testing it, and it seems to require some math on the last 16 bits, something like buff->last16bits = (buff->last16bits & newblock->first16bits) | 0xfffe; it works, but not for all samples...
Question: How do I concatenate two DEFLATEd streams without decompressing them?
PHP stream filters are used to perform such tasks. stream_filter_append can be used while reading from or writing to streams. For example
$fp = fopen($path, 'r');
stream_filter_append($fp, 'zlib.deflate', STREAM_FILTER_READ);
Now fread will return you deflated data.
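A minimal sketch of consuming that filtered handle in pieces, so a large file is deflated without ever being fully in memory (the paths are placeholders):
$fp = fopen('/path/to/big/file', 'r');
$out = fopen('/path/to/output.deflate', 'w');
stream_filter_append($fp, 'zlib.deflate', STREAM_FILTER_READ);
while (!feof($fp)) {
    // fread now yields already-deflated bytes, 64 KB at a time
    fwrite($out, fread($fp, 65536));
}
fclose($fp);
fclose($out);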
This may or may not help. It looks like gzwrite will allow you to write files without having them completely loaded in memory. This example from the PHP Manual page shows how you can compress a file using gzwrite and fopen.
http://us.php.net/manual/en/function.gzwrite.php
function gzcompressfile($source, $level = false) {
    // $dest = $source.'.gz';
    $dest = 'php://stdout'; // This will stream the compressed data directly to the screen.
    $mode = 'wb'.$level;
    $error = false;
    if ($fp_out = gzopen($dest, $mode)) {
        if ($fp_in = fopen($source, 'rb')) {
            while (!feof($fp_in)) {
                gzwrite($fp_out, fread($fp_in, 1024*512));
            }
            fclose($fp_in);
        } else {
            $error = true;
        }
        gzclose($fp_out);
    } else {
        $error = true;
    }
    if ($error) {
        return false;
    }
    return $dest;
}
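As written, the function streams the compressed output to php://stdout and returns that destination string (or false on failure); a tiny usage sketch with a placeholder path:
// compress a large file in 512 KB pieces without loading it all into memory
if (gzcompressfile('/path/to/large-file.dat', 9) === false) {
    echo "compression failed\n";
}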
