Curl hangs when downloading large file (500MB+) - php

I'm using cURL to download large XML files (between 500MB and 1GB) from a remote server. Although the script works fine for smaller test files, every time I try to download a file larger than a few hundred megabytes, the script seems to hang - it doesn't quit, there's no error message, it just hangs there.
I'm executing the script from the command line (CLI), so PHP itself should not time out. I have also tried cURL's verbose mode, but it shows nothing beyond the initial connection. Every time I download the file, it stops at exactly the same size (463.3MB), and the XML in the file is incomplete at that point.
Any ideas much appreciated.
$ch = curl_init();
$fh = fopen($filename, 'w');
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_FILE, $fh);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_TIMEOUT, 0);

if (curl_exec($ch) === false) {
    echo 'Curl error: ' . curl_error($ch) . "\n";
} else {
    echo 'Operation completed without any errors';
}

$response = array(
    'header' => curl_getinfo($ch)
);

curl_close($ch);
fclose($fh);

if ($response['header']['http_code'] == 200) {
    echo "File downloaded and saved as " . $filename . "\n";
}
Again, this script works fine with smaller files, but with the large file it doesn't even get as far as printing an error message.
Could something else (Ubuntu 10.04 on Linode) be terminating the script? As far as I understand, my webserver shouldn't matter here since I am running this through the CLI.
Thanks,
Matt

The files appear to be complete when they stop downloading, right? Adding -m 10800 to the curl command will time out and end the transfer after the specified number of seconds. This works as long as you set the timeout to be longer than the transfer takes, but it is still annoying.
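For what it's worth, the PHP-side counterpart of the command-line -m flag is CURLOPT_TIMEOUT. A minimal sketch, reusing the question's $url and $filename and an arbitrary three-hour cap (adjust to something longer than your transfer actually takes):
$ch = curl_init();
$fh = fopen($filename, 'w');
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FILE, $fh);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);   // limit for the connect phase only
curl_setopt($ch, CURLOPT_TIMEOUT, 10800);       // hard cap on the whole transfer (3 hours)
if (curl_exec($ch) === false) {
    echo 'Curl error: ' . curl_error($ch) . "\n";   // reports a timeout instead of hanging silently
}
curl_close($ch);
fclose($fh);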

Check this post; maybe you can try to download the file in parts. Or, if you have access to the remote server, you can archive the file there and then download it. You can also check your php.ini configuration: look at file size, memory limits and similar settings.
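A rough sketch of the download-in-parts idea (not the linked post's code): fetch the file in ranged chunks and append them locally. It assumes the remote server honours HTTP Range requests and reuses the question's $url and $filename; the 50 MB chunk size is arbitrary.
$chunkSize = 50 * 1024 * 1024; // 50 MB per request
$offset = 0;
$fh = fopen($filename, 'w');

while (true) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RANGE, $offset . '-' . ($offset + $chunkSize - 1));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    $data = curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    if ($data === false || ($code !== 206 && $code !== 200)) {
        break; // transfer error
    }
    fwrite($fh, $data);
    if ($code === 200 || strlen($data) < $chunkSize) {
        break; // server sent the whole file at once, or this was the last chunk
    }
    $offset += $chunkSize;
}
fclose($fh);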

You might be out of disk space on the partition you are saving to, or running over quota for the user running the script.
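If disk space is the suspect, a quick check before the download can confirm it. A sketch, with $filename from the question and an assumed expected size (per-user quotas are not reflected here and have to be checked with system tools such as quota):
$expectedBytes = 1024 * 1024 * 1024; // e.g. 1 GB, adjust to the file you expect
$freeBytes = disk_free_space(dirname($filename));

if ($freeBytes !== false && $freeBytes < $expectedBytes) {
    printf("Not enough disk space: %.1f MB free, %.1f MB needed\n",
        $freeBytes / 1048576, $expectedBytes / 1048576);
    exit(1);
}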

Related

How to upload big files via Storage-ftp in Laravel 5.1

I'm trying to upload a big file to an FTP server with Laravel's Storage facade, and it keeps giving me the error
Out of memory (allocated 473432064) (tried to allocate 467402752 bytes)
I've tried to change the memory limit in php.ini and it still won't work. When I upload a file to the server normally it works, and the size doesn't matter.
I've tried everything but nothing works.
Again - I'm trying to upload via FTP.
One more question: is there a way to upload the file directly to the FTP server from the client? I see that Storage first uploads to my server and then transfers to the second server...
It definitely sounds like a PHP limitation. Raising the memory limit probably isn't the best way to do it, though; that leads to nothing but hassle, trust me.
The best method I can think of off the top of my head is to use Envoy (the server script method, not the deployment service) to put together an SSH task. That way your job is executed outside of PHP, so you're not subject to the same memory limitations. Your Envoy script (envoy.blade.php in your project root) would probably look something like this:
@servers(['your_server_name' => 'your.server.ip'])

@task('upload', ['on' => ['your_server_name']])
    # perform your FTP setup, login etc.
    put your_big_file.extension
@endtask
I've only got one of these set up for a deployment job which is called from Jenkins, so I'm not sure if you can launch it from within Laravel, but I launch it from the command line like this:
vendor/bin/envoy run myJobName
Like I said, the only thing I can't quite remember is whether you can run Envoy from within Laravel itself, and the docs seem a little hazy on it. Definitely an option worth checking out, though :)
https://laravel.com/docs/5.1/envoy
Finally I solved the problem by using cURL instead of Storage.
$ch = curl_init();
$localfile = $file->getRealPath();
$fp = fopen($localfile, 'r');

curl_setopt($ch, CURLOPT_URL, 'ftp://domain/' . $fileName);
curl_setopt($ch, CURLOPT_USERPWD, "user:pass");
curl_setopt($ch, CURLOPT_UPLOAD, 1);
curl_setopt($ch, CURLOPT_INFILE, $fp);                      // stream from disk instead of loading the file into memory
curl_setopt($ch, CURLOPT_INFILESIZE, filesize($localfile));

curl_exec($ch);
$error_no = curl_errno($ch);                                // 0 means the upload succeeded
curl_close($ch);
fclose($fp);
Works perfectly! Better than the Storage option of Laravel.

Update local php file from remote php file

I am working on a CMS that will be installed for many clients, but as I keep improving it I make modifications to a few files. I want these updates to be applied automatically to all the projects using the same files.
I thought of executing a check file every time the CMS is opened. This file would compare the version of the local file with the remote file; for this I can keep a log or something for the versions, no big deal, but that's not the problem. Here is some sample code I thought of:
$url = 'http://www.example.com/myfile.php';
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_HEADER, false);
$data = curl_exec($curl);
curl_close($curl);
The problem is getting the content of myfile.php: since it's a PHP file, the server will execute it and return the output, but I want the actual source of the file. I understand that this is normally not possible, as it would be a security problem (anybody would be able to get the PHP code of other sites), but is there any way to get the contents of a remote PHP file, maybe by giving special permissions to a remote connection?
Thanks.
You should create a download script on your remote server which will return the original php code by using readfile().
<?php
$file = $_SERVER['DOCUMENT_ROOT'] . $_GET['file'];
// #TODO: Add security check if file is of type php and below document root. Use realpath() to check this.
header("Content-Type: text/plain");
header("Content-Disposition: attachment; filename=\"$file\"");
readfile($file);
?>
Get the file contents by fetching http://example.com/download.php?file=fileName
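A rough sketch of the security check the TODO above hints at, assuming the files you want to expose live below the document root; the names are placeholders, not a drop-in download.php:
<?php
$base = realpath($_SERVER['DOCUMENT_ROOT']);
$file = realpath($_SERVER['DOCUMENT_ROOT'] . '/' . $_GET['file']);

// Refuse anything that resolves outside the document root or isn't a .php file.
if ($file === false
    || strpos($file, $base . DIRECTORY_SEPARATOR) !== 0
    || pathinfo($file, PATHINFO_EXTENSION) !== 'php') {
    http_response_code(404);
    exit;
}

header('Content-Type: text/plain');
header('Content-Disposition: attachment; filename="' . basename($file) . '"');
readfile($file);
?>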

PHP curl resume download

I'm currently trying to download satellite images from esa's Copernicus / Sentinel project with curl. Unfortunately the download keeps stopping at around 90% and the php script returns an Internal Server Error (500).
Therefore I would like to resume the download at a specific byte number. It seems that the esa server just ignores the http-range-header (CURLOPT_RANGE) and CURLOPT_RESUME_FROM doesn't change anything either.
If I use Google Chrome to download the file manually, the download also interrupts but continues after some time.
So, if Google Chrome can resume the download, curl should be able to do that, too. I would appreciate any help on how to do that.
Some details:
The file I'm trying to download is here (420MB); to access it you need to register at scihub.esa.int/dhus/.
Content-Type is application/octet-stream
My code:
$save_file = fopen($save_filepath, "w+");
$open_file = curl_init(str_replace(" ","%20", $url));
curl_setopt($open_file, CURLOPT_USERPWD, $username.":".$password);
curl_setopt($open_file, CURLOPT_TIMEOUT, 300);
curl_setopt($open_file, CURLOPT_FILE, $save_file);
curl_setopt($open_file, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($open_file, CURLOPT_PROGRESSFUNCTION, "trackprogress");
curl_setopt($open_file, CURLOPT_NOPROGRESS, false);
curl_exec($open_file);
curl_close($open_file);
fclose($save_file);
It works perfectly for smaller files (I've tested it with some images and pdf-files) and I can also download most of the satellite image (the first 380MB are downloaded). I tried to increase the timeout value, too, but the script terminates long before the 5 minutes are reached.
I tried curl_setopt($open_file, CURLOPT_RESUME_FROM, 1048576); and curl_setopt($open_file, CURLOPT_RANGE, "1048576-"); but the file always starts with the same bytes.
EDIT:
I can't answer my own question, but for this specific case I found a workaround. So, if anybody reading this happens to want to download these satellite images with cURL as well, here is what I did:
When downloading not just the image file but the zip file with some additional data, the download still keeps stopping; however, with curl_setopt($open_file, CURLOPT_RESUME_FROM, $bytes_already_loaded); it is possible to skip the bytes that have already been loaded and resume the download (which isn't possible for the image file). So use this link instead of the image file.
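For readers in the same situation, a minimal sketch of that workaround built on the question's variables: check how much of the file is already on disk, open it in append mode, and let CURLOPT_RESUME_FROM continue from there (this only helps when the server honours range requests, which in the poster's case was true for the zip but not for the bare image):
$bytes_already_loaded = file_exists($save_filepath) ? filesize($save_filepath) : 0;

$save_file = fopen($save_filepath, 'a');   // append instead of truncating the partial file
$open_file = curl_init(str_replace(" ", "%20", $url));
curl_setopt($open_file, CURLOPT_USERPWD, $username . ":" . $password);
curl_setopt($open_file, CURLOPT_FILE, $save_file);
curl_setopt($open_file, CURLOPT_FOLLOWLOCATION, true);
if ($bytes_already_loaded > 0) {
    curl_setopt($open_file, CURLOPT_RESUME_FROM, $bytes_already_loaded);
}
curl_exec($open_file);
curl_close($open_file);
fclose($save_file);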

Corrupt image when extracting from zip

I'm trying to download a zip file using cURL from one virtual host to another on the same server. The zip file contains *.php and *.jpg files.
The problem is: sometimes the JPG files get corrupted, like this:
Here is my code:
$out = fopen(ABSPATH.'/templates/default.zip', 'w+');
$ch = curl_init();
curl_setopt($ch, CURLOPT_FILE, $out);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_URL, 'http://share.example.com/templates/default.zip');
curl_exec($ch);
curl_close($ch);
fclose($out); // make sure the zip is fully flushed to disk before opening it

$zip = new ZipArchive;
if ($zip->open(ABSPATH.'/templates/default.zip') === TRUE) {
    if ($zip->extractTo(ABSPATH.'/templates')) {
        echo 'OK';
    }
    $zip->close();
}
I don't understand what happens to my JPGs. I also tried using pclzip.lib.php, but no luck. How can I solve this problem?
Thanks in advance
Have you tried downloading the file via curl and unzipping it normally (i.e. without PHP)? That would tell you whether the download or the unzip causes the problem.
You might also try to replace one of the two parts using shell_exec (wget instead of curl, unzip instead of ZipArchive). I mean just for debugging, not for production.
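A debug-only sketch of that second suggestion, assuming wget and unzip are available on the server; it reuses the ABSPATH constant and URL from the question and is not meant for production:
$zipPath = ABSPATH . '/templates/default.zip';

// Fetch with wget instead of cURL...
shell_exec('wget -O ' . escapeshellarg($zipPath)
    . ' http://share.example.com/templates/default.zip 2>&1');

// ...and extract with unzip instead of ZipArchive, printing its output.
echo shell_exec('unzip -o ' . escapeshellarg($zipPath)
    . ' -d ' . escapeshellarg(ABSPATH . '/templates') . ' 2>&1');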
Finally I found what the problem is.
I'm using the Nginx web server. When I changed the nginx config from:
sendfile on;
to
sendfile off;
my images were no longer corrupted. So it's not a PHP or cURL problem. Interesting article: http://technosophos.com/node/172

Downloading remote files with PHP/cURL: a bit more robustly

I have a script that pulls URLs from the database and downloads them (pdf or jpg) to a local file.
Code is:
$cp = curl_init($remote_url);
$fp = fopen($dest_temp, "w");
curl_setopt($cp, CURLOPT_FILE, $fp);       // write the response body to the local file
#curl_setopt($cp, CURLOPT_HEADER, TRUE);
curl_exec($cp);
curl_close($cp);
fclose($fp);
If the remote file is there, it works fine. If the remote file is not there, it just bombs and the browser hangs forever.
What's the best approach to handling this? Should I somehow check for the file first, or can I set options above that will handle it? I tried setting timeouts but it had no effect.
This is my first experience using cURL.
I used to use wget much as you're using curl and got frustrated with the lack of ability to know what was going on, because it's essentially calling out to an external program.
I use Perl's WWW::Mechanize, and the link below is a PHP version which might be a bit more robust for you to be able to deal with such instances.
http://www.compasswebpublisher.com/php/www-mechanize-for-php
Hope this helps.
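For completeness, a curl-only sketch of how the original snippet could fail fast instead of hanging when the remote file is missing; it reuses $remote_url and $dest_temp from the question, and the timeout values are only examples:
$cp = curl_init($remote_url);
$fp = fopen($dest_temp, "w");
curl_setopt($cp, CURLOPT_FILE, $fp);
curl_setopt($cp, CURLOPT_FAILONERROR, true);    // treat HTTP errors (404 etc.) as failures
curl_setopt($cp, CURLOPT_CONNECTTIMEOUT, 10);   // give up connecting after 10 seconds
curl_setopt($cp, CURLOPT_TIMEOUT, 120);         // hard cap on the whole transfer

$ok = curl_exec($cp);
$code = curl_getinfo($cp, CURLINFO_HTTP_CODE);
curl_close($cp);
fclose($fp);

if ($ok === false || $code != 200) {
    unlink($dest_temp);                         // don't keep a partial or empty file
    // log or report the failure here instead of leaving the browser hanging
}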
