file_put_contents truncates content to max int on 32-bit php - php

I have nextcloud running on my Raspberry Pi 4, which uses 32-bit architecture.
When trying to upload a file larger than 2147483647 bytes, the file is uploaded completely and is accessible through SSH. However, when I try to access it in any way through the web client it fails. The error seen in the web client's logging is the following:
file_put_contents(): content truncated from 4118394086 to 2147483647 bytes at /var/www/html/nextcloud/lib/private/Files/Storage/Local.php#556
When I try to access the file this error message is logged:
Sabre\DAV\Exception\RequestedRangeNotSatisfiable: The start offset (0) exceeded the size of the entity (-176573210)
The file in question here is a .mp4 file, however I have been able to replicate the issue with other file types.
I have read that the 2 GB upload limit for 32-bit architectures has been fixed, but I don't know why it still fails in my case.

Problem
Well, you can't get around this by tweaking any config, since it's a hard limit set by PHP (PHP_INT_MAX on a 32-bit architecture is 2^31 - 1 = 2147483647, i.e. 2 GB).
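As a quick illustration of the limit (assuming a 32-bit build of PHP; the numbers are the ones from the error messages above):
<?php
// On a 32-bit build, PHP integers are signed 32-bit values:
var_dump(PHP_INT_SIZE); // int(4)
var_dump(PHP_INT_MAX);  // int(2147483647), i.e. 2**31 - 1

// A 4118394086-byte file does not fit; wrapped into a signed 32-bit integer it
// becomes 4118394086 - 2**32 = -176573210, which is exactly the negative
// "entity size" reported in the Sabre error above.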
There is hope
You can patch manually or, even better, override the responsible Nextcloud code:
Patch manually (since you are not using Composer, this is probably what you want to do):
// this one is pretty memory expensive, but only works when $data is a string
// Test: 4GB file, 2GB chunks (at 32 bits)
// 12GB memory usage! - hell no
public function file_put_contents($path, $data) {
    $bytesWritten = 0;
    // split into PHP_INT_MAX-sized chunks without touching the bytes themselves;
    // FILE_APPEND assumes the target file does not already exist
    foreach (str_split($data, PHP_INT_MAX) as $chunk) {
        $bytesWritten += file_put_contents($this->getSourcePath($path), $chunk, FILE_APPEND | LOCK_EX);
    }
    return $bytesWritten;
}
or
// better use this, in case $data is a resource - I don't know, you have to test it!
// Test: 4GB file, 1MB chunks
// 2MB memory usage - much better :)
public function file_put_contents($path, $data) {
    $bytesWritten = 0;
    // read the stream 1 MB at a time; FILE_APPEND assumes the target file does not already exist
    while (($chunk = fread($data, 2 ** 20)) !== false && $chunk !== '') {
        $bytesWritten += file_put_contents($this->getSourcePath($path), $chunk, FILE_APPEND | LOCK_EX);
    }
    return $bytesWritten;
}
In case you want to override (Composer):
class PatchedLocal extends \OC\Files\Storage\Local {
    public function file_put_contents($path, $data) {
        // same as above ...
    }
}
And here is everything you need to know to force the autoloader to use your PatchedLocal. - As mentioned, you want to use Composer's PSR-4 implementation for this, via composer.json.
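A rough sketch of what such a patched class file might look like - the MyPatches namespace and the patches/ directory are assumptions, wired up through a PSR-4 entry in composer.json, and you still have to make Nextcloud instantiate this class instead of the stock Local storage:
<?php
// patches/PatchedLocal.php - hypothetical path, autoloaded via a PSR-4 entry
// such as "MyPatches\\": "patches/" in composer.json's autoload section.
namespace MyPatches;

use OC\Files\Storage\Local;

class PatchedLocal extends Local {
    public function file_put_contents($path, $data) {
        // same chunked implementation as shown above ...
    }
}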

Related

Run out of memory writing files to zip with flysystem

I'm programming a tool that gathers images uploaded by a user into a zip archive. For this I came across the ZipArchiveAdapter from Flysystem, which seems to do a good job.
I'm encountering an issue with the memory limit when the number of files in the zip archive goes into the thousands.
When the number of images for a user goes beyond 1,000 it usually fails because the available memory is exhausted. To get to the point where it handles most users with fewer than 1,000 images I've increased the memory limit to 4 GB, but increasing it beyond that is not really an option.
Simplified code at this point:
<?php
use League\Flysystem\Filesystem;
use League\Flysystem\ZipArchive\ZipArchiveAdapter;
use League\Flysystem\Memory\MemoryAdapter;

class User {
    // ... Other user code

    public function createZipFile()
    {
        $tmpFile = tempnam('/tmp', "zippedimages_");
        $download = new Filesystem(new ZipArchiveAdapter($tmpFile));

        if ($this->getImageCount()) {
            foreach ($this->getImages() as $image) {
                $path_in_zip = "My Images/{$image->category->title}/{$image->id}_{$image->image->filename}";
                $download->write($path_in_zip, $image->image->getData());
            }
        }
        $download->getAdapter()->getArchive()->close();

        return $tmpFile;
        // Upload zip to s3-storage
    }
}
So my questions:
a) Is there a way to have Flysystem write the zip file to disk "on the go"? Currently it stores the entire zip in memory and only writes it to disk when the object is destroyed.
b) Should I utilize another library that would be better for this?
c) Should I take another approach here? For example having the user download multiple smaller zips instead of one large zip. (Ideally I want them to download just one file regardless)
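One way to keep memory flat, assuming the images can be staged on disk first, is to drop down to PHP's native ZipArchive and add entries by file path; the staged files are then read one by one when close() is called. A rough sketch (the getImages()/getData() accessors are the ones from the question, the temp-file staging is an assumption):
<?php
// Inside the User class from the question: stage each image to a temp file and
// register it with ZipArchive. Only one image's data is held in memory at a
// time, and the staged files are streamed into the archive when close() runs.
public function createZipFile()
{
    $tmpFile = tempnam(sys_get_temp_dir(), 'zippedimages_');
    $zip = new \ZipArchive();
    $zip->open($tmpFile, \ZipArchive::CREATE | \ZipArchive::OVERWRITE);

    $staged = [];
    foreach ($this->getImages() as $image) {
        $pathInZip = "My Images/{$image->category->title}/{$image->id}_{$image->image->filename}";
        $tmpImage  = tempnam(sys_get_temp_dir(), 'img_');
        file_put_contents($tmpImage, $image->image->getData());
        $zip->addFile($tmpImage, $pathInZip); // contents are read at close()
        $staged[] = $tmpImage;                // must survive until close()
    }

    $zip->close();                // writes the archive to $tmpFile
    array_map('unlink', $staged); // now the staged copies can go

    return $tmpFile;
}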

xml_parse huge file PHP

I have an issue with the PHP function xml_parse. It's not working with huge files - I have an XML file that is 10 MB in size.
The problem is that I am using the old XML-RPC library from Zend, and it also sets up other things (element handlers and case folding).
$parser_resource = xml_parser_create('utf-8');
xml_parser_set_option($parser_resource, XML_OPTION_CASE_FOLDING, true);
xml_set_element_handler($parser_resource, 'XML_RPC_se', 'XML_RPC_ee');
xml_set_character_data_handler($parser_resource, 'XML_RPC_cd');

if (!xml_parse($parser_resource, $data, 1)) {
    // ends here with 10MB file
}
In another place I just use simplexml_load_file with the LIBXML_PARSEHUGE option, but in this case I don't know what I can do.
The best solution would be if xml_parse had some parameter for huge files too.
Thank you for your advice.
Error is:
XML error: No memory at line ...
The chunk of data you pass to xml_parse in one call may be too large.
If you read the file with fread, e.g.
while ($data = fread($fp, 1024*1024)) {...}
use a smaller length (in my case it had to be smaller than 10 MB), e.g. 1 MB, and put the xml_parse call inside the while loop, as sketched below.
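A minimal sketch of that loop, assuming $parser_resource is set up as in the question and 'huge.xml' stands in for the real input file:
<?php
// Feed the parser 1 MB at a time; the third argument marks the final chunk.
$fp = fopen('huge.xml', 'rb');
while ($data = fread($fp, 1024 * 1024)) {
    if (!xml_parse($parser_resource, $data, feof($fp))) {
        die(sprintf('XML error: %s at line %d',
            xml_error_string(xml_get_error_code($parser_resource)),
            xml_get_current_line_number($parser_resource)));
    }
}
fclose($fp);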

Is ini_set("memory_limit", "512M"); too much?

I am using 2 PHP libraries to serve files:
unzip: dUnzip2
zip: zip.lib (http://www.zend.com/codex.php?id=470&single=1)
They work fine on files under 10 MB, but with files over 10 MB I have to set the memory limit to 256M. With files over 25 MB, I set it to 512M. It seems kind of high... Is it?
I'm on a dedicated server- 4 CPU's and 16GB RAM - but we also have a lot of traffic and downloading, so I'm kind of wondering here.
Perhaps you're using PHP to load the whole file into memory before serving it to the user? I've used a function found at http://www.php.net/manual/en/function.readfile.php (comments section) that reads the file in parts, keeping memory low. Copying from that post (because my own version is modified):
<?php
function readfile_chunked($filename, $type = 'array') {
    $chunksize = 1 * (1024 * 1024); // how many bytes per chunk
    $handle = fopen($filename, 'rb');
    if ($handle === false) {
        return false;
    }
    $lines = ($type === 'string') ? '' : array();
    while (!feof($handle)) {
        switch ($type) {
            case 'array':
                // Returns an array of lines, like file()
                $lines[] = fgets($handle, $chunksize);
                break;
            case 'string':
                // Returns the contents as a string, like file_get_contents()
                $lines .= fread($handle, $chunksize);
                break;
        }
    }
    fclose($handle);
    return $lines;
}
?>
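For actually serving a download with flat memory usage, a variant closer to the original php.net example echoes each chunk instead of collecting it; readfile_streamed is a hypothetical name for such a helper:
<?php
// Streams the file to the client ~1 MB at a time, so memory stays low
// regardless of file size.
function readfile_streamed($filename) {
    $chunksize = 1024 * 1024; // bytes per chunk
    $handle = fopen($filename, 'rb');
    if ($handle === false) {
        return false;
    }
    while (!feof($handle)) {
        echo fread($handle, $chunksize);
        flush(); // push the chunk out to the client
    }
    return fclose($handle);
}
?>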
What my script does is license files before the user downloads them.
Generally you should avoid any web script that consumes that much memory or CPU time. Ideally you'd convert the task to run via a detached process. Setting a high limit isn't bad per se, but it makes it harder to detect poorly written scripts, and makes it easier for high demand on the complex pages to hurt performance even on the simple pages.
For example, with Gearman you can easily set up some license worker scripts that run on the CLI and communicate with them via the Gearman API.
This way your web server can remain free to perform low CPU and memory tasks, and you can easily guarantee that you'll never run more than X licensing tasks at once (where X is based on how many worker scripts you let run).
Your front end script could just be an AJAX widget that polls the server, checking to see if the task has completed.
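A rough sketch with the pecl/gearman extension; the 'license_file' job name, the JSON payload, and the server address are made-up examples:
<?php
// Front end (web request): queue the job and return immediately.
$client = new GearmanClient();
$client->addServer('127.0.0.1', 4730);
$client->doBackground('license_file', json_encode(['file_id' => 123]));

// Worker (separate CLI process, one per allowed concurrent licensing task):
$worker = new GearmanWorker();
$worker->addServer('127.0.0.1', 4730);
$worker->addFunction('license_file', function (GearmanJob $job) {
    $payload = json_decode($job->workload(), true);
    // ... do the memory/CPU-heavy licensing work here ...
});
while ($worker->work());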

Issue to determine a currently downloading file size?

I have an interesting problem. I need to build a progress bar for an asynchronous PHP file download. I thought the best way to do it is this: before the download starts, the script creates a txt file which includes the file name and the original file size.
Now we have an AJAX function calling a PHP script that is intended to check the local file size. I have 2 main problems:
Files are bigger than 2 GB, so the filesize() function is out of business.
I tried to find a different way to determine the local file size, like this:
function getSize($filename) {
    $a = fopen($filename, 'r');
    fseek($a, 0, SEEK_END);
    $filesize = ftell($a);
    fclose($a);
    return $filesize;
}
Unfortunately the second way gives me tons of errors; it seems I cannot open a file which is currently being downloaded.
Is there any way I can check the size of a file which is currently downloading, when the file size will be bigger than 2 GB?
Any help is greatly appreciated.
I found the solution by using the exec() function:
exec("ls -s -k /path/to/your/file/".$file_name,$out);
Just change your OS and PHP to support 64-bit computing, and you can still use filesize().
From filesize() manual:
Return Values
Returns the size of the file in bytes, or FALSE (and generates an
error of level E_WARNING) in case of an error.
Note: Because PHP's integer type is signed and many platforms use
32bit integers, some filesystem functions may return unexpected
results for files which are larger than 2GB.

Allowed memory size exhausted error exporting from mongodb

I am trying to export some documents from MongoDB to .csv. For some large lists the files would be something like 40 MB, and I get errors about the memory limit:
Fatal error: Allowed memory size of 134217728 bytes exhausted
(tried to allocate 44992513 bytes) in
/usr/share/php/Zend/Controller/Response/Abstract.php on line 586
I wonder why this error happens. What consumes such an amount of memory? How do I avoid such an error without changing memory_limit, which is set to 128M now?
I use something like this:
public static function exportList($listId, $state = self::SUBSCRIBED)
{
    $list = new Model_List();
    $fieldsInfo = $list->getDescriptionsOfFields($listId);
    $headers = array();
    $params['list_id'] = $listId;
    $mongodbCursor = self::getCursor($params, $fieldsInfo, $headers);
    $mongodbCursor->timeout(0);
    $fp = fopen('php://output', 'w');
    foreach ($mongodbCursor as $subscriber) {
        foreach ($fieldsInfo as $fieldInfo) {
            $field = ($fieldInfo['constant']) ? $fieldInfo['field_tag'] : $fieldInfo['field_id'];
            if (!isset($subscriber->$field)) {
                $row[$field] = '';
            } elseif (Model_CustomField::isMultivaluedType($fieldInfo['type'])) {
                $row[$field] = array();
                foreach ($subscriber->$field as $value) {
                    $row[$field][] = $value;
                }
                $row[$field] = implode(self::MULTIVALUED_DELEMITOR, $row[$field]);
            } else {
                $row[$field] = $subscriber->$field;
            }
        }
        fputcsv($fp, $row);
    }
}
Then in my controller I try to call it something like this:
public function exportAction()
{
    set_time_limit(300);
    $this->_helper->layout->disableLayout();
    $this->_helper->viewRenderer->setNoRender();
    $fileName = $list->list_name . '.csv';
    $this->getResponse()->setHeader('Content-Type', 'text/csv; charset=utf-8')
        ->setHeader('Content-Disposition', 'attachment; filename="' . $fileName . '"');
    Model_Subscriber1::exportList($listId);
    echo 'Peak memory usage: ', memory_get_peak_usage()/1024, ' Memory usage: ', memory_get_usage()/1024;
}
So at the very end of the exported file I print the memory usage. It's rather strange that a list with something like 1M documents exports successfully and displays:
> Peak memory usage: 50034.921875 Kb Memory usage: 45902.546875 Kb
But when I try to export 1.3M documents, then after several minutes all I get in the export file is:
Fatal error: Allowed memory size of 134217728 bytes exhausted
(tried to allocate 44992513 bytes) in
/usr/share/php/Zend/Controller/Response/Abstract.php on line 586.
The size of documents I export are approximately the same.
I increased memory_limit to 256M and tried to export the 1.3M list; this is what it showed:
Peak memory usage: 60330.4609375Kb Memory usage: 56894.421875 Kb.
This seems very confusing to me. Isn't this data inaccurate, then? Otherwise, why does it cause a memory exhausted error with memory_limit set to 128M?
While the size of the documents may be about the same, the size allocated by PHP to process them isn't directly proportional to the document size or number of documents. This is because different types require different memory allocation in PHP. You may be able to free some memory as you go, but I don't see any place where you can in your code.
The best answer is to probably just increase the memory limit.
One thing you could do is offload the processing to an external script and call that from PHP. Many languages do this sort of processing in a more memory efficient way than PHP.
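For instance, the web request could simply hand the work to a CLI script and stream its output back; export_csv.php and its argument are hypothetical here:
<?php
// passthru() streams the child's stdout straight to the client, so the web
// process itself never buffers the CSV in memory.
passthru('php ' . escapeshellarg(__DIR__ . '/export_csv.php') . ' ' . escapeshellarg($listId));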
I've also noticed that memory_get_peak_usage() isn't always accurate. I would try an experiment: increase memory_limit to, say, 256M and run it on the larger data set (the 1.3 million). You are likely to find that it reports below the 128M limit as well.
I could reproduce this issue in a similar case of exporting a CSV file, where my system should have had enough memory, as shown by memory_get_usage(), but ended up with the same fatal error:
Fatal error: Allowed memory size.
I circumvented this issue by writing the CSV contents into a physical temporary file, which I eventually zipped before reading it out.
I wrote the file in a loop, so that each iteration wrote only a limited chunk of data and I never exceeded the memory limit.
After zipping, the compression ratio was such, that I could handle raw files of over 10 times the size I initially hit the wall at. All up, it was a success.
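The writing loop itself can be as simple as streaming one row at a time into the temporary file; $mongodbCursor mirrors the export code above and buildRow() is a hypothetical helper that assembles one CSV row:
<?php
// Each iteration holds only one row in memory; fputcsv() writes it straight
// to the temporary file on disk, which gets zipped afterwards.
$fp = fopen($full_csv_path, 'w');
foreach ($mongodbCursor as $subscriber) {
    $row = buildRow($subscriber); // hypothetical: build one flat row per subscriber
    fputcsv($fp, $row);
}
fclose($fp);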
Hint: when creating your archive, don't unlink the archive component(s) before invoking $zip->close(), as this call seems to be the one doing the business. Otherwise you'll end up with an empty archive!
Code sample:
<?php
$zip = new ZipArchive;
if ($zip->open($full_zip_path, ZipArchive::CREATE) === TRUE) {
    $zip->addFile($full_csv_path, $csv_name);
    $zip->close();
    $Response->setHeader("Content-type", "application/zip; charset=utf-8");
    $Response->setHeader("Content-disposition", "attachment; filename=" . $zip_name);
    $Response->setBody(file_get_contents($full_zip_path));
} else {
    var_dump(error_get_last());
    echo utf8_decode("Couldn't create zip archive '$full_zip_path'."), "\r\n";
}
unset($zip);
?>
Attention: when adding items to the zip archive, don't prepend a leading slash to the item's name if using Windows based OS.
Discussion of the original issue:
The Zend code at the quoted line is
public function outputBody()
{
    $body = implode('', $this->_body);
    echo $body;
}
from the outputBody() method of the Zend_Controller_Response_Abstract class.
It looks like, however you do it - through echo, print, or readfile - the output is always captured and stuck into the response body, even if you turn the response return feature off before the dispatch.
I even tried to use the clearBody() class method within the echo loop, with the idea that each $response->sendResponse() followed by $response->clearBody() would release memory, but it failed.
The way Zend handles the sending of the response is such that I always got the memory allocation of the full size of the raw CSV file.
Yet to be determined how it would be possible to tell Zend not to "capture" the output buffer.
