PHP cURL, read remote file and write contents to local file

I want to connect to a remote file and write its output to a local file. This is my function:
function get_remote_file_to_cache()
{
$the_site="http://facebook.com";
$curl = curl_init();
$fp = fopen("cache/temp_file.txt", "w");
curl_setopt ($curl, CURLOPT_URL, $the_site);
curl_setopt($curl, CURLOPT_FILE, $fp);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
curl_exec ($curl);
$httpCode = curl_getinfo($curl, CURLINFO_HTTP_CODE);
if($httpCode == 404) {
touch('cache/404_err.txt');
}else
{
touch('cache/'.rand(0, 99999).'--all_good.txt');
}
curl_close ($curl);
}
It creates the two files in the "cache" directory, but it does not write any data into "temp_file.txt". Why is that?

Actually, the suggestion to use fwrite is only partially correct.
To avoid running out of memory with large files (exceeding PHP's maximum memory limit), you'll need to set up a callback function that writes to the file.
NOTE: I would recommend creating a class specifically to handle file downloads, file handles, etc. rather than EVER using a global variable, but for the purposes of this example, the following shows how to get things up and running.
So, do the following:
# setup a global file pointer
$GlobalFileHandle = null;
function saveRemoteFile($url, $filename) {
global $GlobalFileHandle;
set_time_limit(0);
# Open the file for writing...
$GlobalFileHandle = fopen($filename, 'w+');
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FILE, $GlobalFileHandle);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_USERAGENT, "MY+USER+AGENT"); //Make this valid if possible
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); # optional
curl_setopt($ch, CURLOPT_TIMEOUT, -1); # optional: -1 = unlimited, 3600 = 1 hour
curl_setopt($ch, CURLOPT_VERBOSE, false); # Set to true to see all the innards
# Only if you need to bypass SSL certificate validation
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
# Assign a callback function to the CURL Write-Function
curl_setopt($ch, CURLOPT_WRITEFUNCTION, 'curlWriteFile');
# Execute the download - note we DO NOT put the result into a variable!
curl_exec($ch);
# Close CURL
curl_close($ch);
# Close the file pointer
fclose($GlobalFileHandle);
}
function curlWriteFile($cp, $data) {
global $GlobalFileHandle;
$len = fwrite($GlobalFileHandle, $data);
return $len;
}
You can also create a progress callback to show how much / how fast you're downloading, however that's another example as it can be complicated when outputting to the CLI.
Essentially, this will take each block of data downloaded, and dump it to the file immediately, rather than downloading the ENTIRE file into memory first.
Much safer way of doing it!
Of course, you must make sure the URL is correct (convert spaces to %20 etc.) and that the local file is writeable.
Cheers,
James.

Let's try sending GET request to http://facebook.com:
$ curl -v http://facebook.com
* Rebuilt URL to: http://facebook.com/
* Hostname was NOT found in DNS cache
* Trying 69.171.230.5...
* Connected to facebook.com (69.171.230.5) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.35.0
> Host: facebook.com
> Accept: */*
>
< HTTP/1.1 302 Found
< Location: https://facebook.com/
< Vary: Accept-Encoding
< Content-Type: text/html
< Date: Thu, 03 Sep 2015 16:26:34 GMT
< Connection: keep-alive
< Content-Length: 0
<
* Connection #0 to host facebook.com left intact
What happened? Facebook redirected us from http://facebook.com to the secure https://facebook.com/. Note the length of the response body:
Content-Length: 0
It means that zero bytes will be written to temp_file.txt. This is why the file stays empty.
Your solution is absolutely correct:
$fp = fopen('file.txt', 'w');
curl_setopt($handle, CURLOPT_FILE, $fp);
curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
All you need to do is change the URL to https://facebook.com/.
Regarding the other answers:
@JonGauthier: No, there is no need to use fwrite() after curl_exec().
@doublehelix: No, you don't need CURLOPT_WRITEFUNCTION for an operation as simple as copying the contents to a file.
@ScottSaunders: touch() creates an empty file if it doesn't exist. I think that was the OP's intention.
Seriously, three answers and every single one is invalid?

You need to explicitly write to the file using fwrite, passing it the file handle you created earlier:
if ( $httpCode == 404 ) {
...
} else {
$contents = curl_exec($curl);
fwrite($fp, $contents);
}
curl_close($curl);
fclose($fp);

In your question you have
curl_setopt($curl, CURLOPT_FILE, $fp);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
but from PHP's curl_setopt documentation notes...
It appears that setting CURLOPT_FILE before setting CURLOPT_RETURNTRANSFER doesn't work, presumably because CURLOPT_FILE depends on CURLOPT_RETURNTRANSFER being set.
So do this:
<?php
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FILE, $fp);
?>
not this:
<?php
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
?>
...stating "CURLOPT_FILE depends on CURLOPT_RETURNTRANSFER being set".
Reference: https://www.php.net/manual/en/function.curl-setopt.php#99082

To avoid memory leak problems:
I was confronted with this problem as well. It's really stupid to say but the solution is to set CURLOPT_RETURNTRANSFER before CURLOPT_FILE!
it seems CURLOPT_FILE depends on CURLOPT_RETURNTRANSFER.
$curl = curl_init();
$fp = fopen("cache/temp_file.txt", "w+");
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($curl, CURLOPT_FILE, $fp);
curl_setopt($curl, CURLOPT_URL, $url);
curl_exec ($curl);
curl_close($curl);
fclose($fp);
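Pulling the points from the answers above together (the redirect to https and the conflict between CURLOPT_FILE and CURLOPT_RETURNTRANSFER), here is a minimal sketch. It assumes it is acceptable to leave CURLOPT_RETURNTRANSFER unset altogether, so the ordering question never comes up, and to follow redirects. The function name download_to_file is hypothetical and not from the original question.

```php
<?php
// Hypothetical helper: streams $url into $path.
// Returns the HTTP status code, or false if the transfer itself failed.
function download_to_file(string $url, string $path)
{
    $fp = fopen($path, 'w');
    if ($fp === false) {
        return false; // local file is not writable
    }
    $ch = curl_init($url);
    // CURLOPT_RETURNTRANSFER is deliberately left unset so the body goes to $fp.
    curl_setopt($ch, CURLOPT_FILE, $fp);
    // Follow "Location:" headers such as the 302 from http:// to https://.
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    $ok = curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    fclose($fp);
    return $ok ? $code : false;
}
```

A caller could then branch on the returned status code (404 or otherwise) as the question's function does, instead of inspecting the file.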

The touch() function doesn't do anything to the contents of the file. It just updates the modification time. Look at the file_put_contents() function.

Related

how to get image from url in php?

I have one remote url which outputs the image
The url is in format like this
http://domain.com/my_file/view/<file_id>/FULL/
In the url "my_file" is a controller name, "view" is a function name and the other two are the parameters
If I hit this url in browser it shows me image
I want to take that image in my projects folder
I have tried with file_get_contents but it gives me warning with 404
How can I achieve that?

$img=file_get_contents('http://example.com/image/test.jpg');
file_put_contents('/your/project/folder/imgname.jpg',$img);
This works only if allow_url_fopen is set to 1 in your php.ini file.
If you can change this value, enable it and you're done.
Another option is CURL. Check if this module is enabled in your PHP configuration.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://example.com/image/test.jpg');
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch, CURLOPT_HEADER , 0);
curl_setopt($ch, CURLOPT_VERBOSE, 0);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$result = @curl_exec($ch);
$curl_err = curl_error($ch);
curl_close($ch);
if (empty($curl_err)) {
file_put_contents('/your/project/folder/imgname.jpg',$result);
}
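If CURL is not enabled, your remaining option is to write a simple HTTP client like this:
$buf='';
$fp = fsockopen('example.com',80);
// HTTP/1.0 avoids a chunked response; header lines must end with CRLF
fputs($fp, "GET /image/test.jpg HTTP/1.0\r\n" );
fputs($fp, "Host: example.com\r\n" );
fputs($fp, "Connection: close\r\n\r\n" );
while (!feof($fp)) {
$buf .= fread($fp,8192);
}
fclose($fp);
// Strip the response headers; the body starts after the first blank line
$body = substr($buf, strpos($buf, "\r\n\r\n") + 4);
file_put_contents('/your/project/folder/imgname.jpg',$body);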

Use Curl for this:
function curlFile($url)
{
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
function readurl()
{
$url="http://domain.com/my_file/view/<file_id>/FULL/";
return curlFile($url);
}
echo readurl();

If you are trying
$image = file_get_contents('http://domain.com/my_file/view/<file_id>/FULL/');
if($image) file_put_contents('some/folder/<file_id>', $image);
and it does not work, it probably means either:
That there is access control that prevents it on the remote server. In that case, you must set the appropriate cookies. I suggest using curl to do that.
The image path is wrong; try to view source when navigating to http://domain.com/my_file/view/<file_id>/FULL/ using your browser.
The trailing / in your URL should not be there, e.g. http://domain.com/my_file/view/<file_id>/FULL?

How to tell if curl download is complete using CURLOPT_FILE in PHP

I have a function that downloads a specific zip file from a remote server to my local Windows server. The file ranges in size from 1-10 mb and I want my script to wait until it's complete. How can I tell with certainty that it's done? The actual downloading works fine.
The function is:
function get_download($url, $path)
{
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL,$url);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($curl, CURLOPT_RETURNTRANSFER,1);
curl_setopt($curl, CURLOPT_COOKIEFILE, 'cookie2.txt');
curl_setopt($curl, CURLOPT_FILE, $path);
$results = curl_exec($curl);
return $results;
}
I thought I would get the size of the remote file first and then compare it to the local file size. However, this doesn't seem to work. I'm not sure whether it is because the file is a .zip file, but it always returns -1 or -11. I tried this function:
function remotefileSize($url)
{
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_NOBODY, 1);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 0);
curl_setopt($curl, CURLOPT_HEADER, 0);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($curl, CURLOPT_MAXREDIRS, 3);
curl_exec($curl);
$filesize = curl_getinfo($curl, CURLINFO_CONTENT_LENGTH_DOWNLOAD);
curl_close($curl);
return $filesize;
}

Have you ever seen your browser show "Unknown time remaining" or "Unknown bytes remaining" while downloading a file? That happens because the browser couldn't retrieve the file size and so couldn't calculate the remaining time or size. The same thing is happening to you. Your filesize retrieval code is correct.
Another possible reason: sometimes the server doesn't support HEAD requests (which is what gets sent when you use CURLOPT_NOBODY).
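Since curl_exec() blocks until the transfer has finished or failed, its return value is the first thing to check; the size comparison only helps when the server actually reported a length. A minimal sketch of that decision follows. The helper name download_is_complete is hypothetical, and it assumes the body was not transparently decompressed (with CURLOPT_ENCODING the bytes on disk can differ from Content-Length).

```php
<?php
// Hypothetical helper: decide whether a finished transfer can be trusted.
// $curlOk         - return value of curl_exec() (false on failure)
// $expectedLength - CURLINFO_CONTENT_LENGTH_DOWNLOAD, -1 when no Content-Length was sent
// $localSize      - filesize() of the file written through CURLOPT_FILE
function download_is_complete($curlOk, $expectedLength, $localSize)
{
    if ($curlOk === false) {
        return false; // transfer error (timeout, connection reset, ...)
    }
    if ($expectedLength < 0) {
        return true; // size unknown: the curl_exec() result is all we have
    }
    return (int) $expectedLength === (int) $localSize;
}
```

When using it, close the file handle and call clearstatcache() before filesize(), otherwise PHP may report a stale size.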

CURL get file only if had been modified

How can I tell whether a file has been modified before opening the stream with cURL?
(Then I can open it with file_get_contents.)
Thanks

Check for CURLINFO_FILETIME:
$ch = curl_init('http://www.mysite.com/index.php');
curl_setopt($ch, CURLOPT_FILETIME, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_NOBODY, true);
$exec = curl_exec($ch);
$fileTime = curl_getinfo($ch, CURLINFO_FILETIME);
if ($fileTime > -1) {
echo date("Y-m-d H:i", $fileTime);
}

Try sending a HEAD request first to get the Last-Modified header for the target URL and compare it with your cached version. You could also send the GET request with an If-Modified-Since header set to the time your cached version was created, so the other side can respond with 304 Not Modified.
Sending a HEAD request with curl looks something like this:
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_NOBODY, true);
curl_setopt($curl, CURLOPT_HEADER, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_HTTP_VERSION , CURL_HTTP_VERSION_1_1);
$content = curl_exec($curl);
curl_close($curl);
$content will now contain the returned HTTP headers as one long string. You can look for last-modified: in it like this:
if (preg_match('/last-modified:\s?(?<date>.+)\n/i', $content, $m)) {
// the last-modified header is found
if (filemtime('your-cached-version') >= strtotime($m['date'])) {
// your cached version is newer or same age than the remote content, no re-fetch required
}
}
You should handle the Expires header the same way (extract the value from the header string and check whether it is in the future).
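The If-Modified-Since variant mentioned in this answer can be sketched with cURL's time-condition options, which build that header from a timestamp. This is only an illustration: the function name refresh_if_modified and the choice of the cached file's mtime as the reference time are assumptions, and it relies on the server honouring the header.

```php
<?php
// Hypothetical helper: re-download $url into $cacheFile only if the server
// reports that it changed since the cached copy was written.
// Returns true when the cache was refreshed, false when it was left alone.
function refresh_if_modified(string $url, string $cacheFile): bool
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    if (is_file($cacheFile)) {
        // Sends "If-Modified-Since: <mtime of the cached copy>"
        curl_setopt($ch, CURLOPT_TIMECONDITION, CURL_TIMECOND_IFMODSINCE);
        curl_setopt($ch, CURLOPT_TIMEVALUE, filemtime($cacheFile));
    }
    $body = curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    if ($body === false || $code !== 200) {
        return false; // 304 Not Modified, or an error: keep the cached copy
    }
    return file_put_contents($cacheFile, $body) !== false;
}
```

Compared with a separate HEAD request, this needs only one round trip: an unchanged resource comes back as a 304 with no body.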

Download file Curl with url var

I would like to download a file with Curl.
The problem is that the download link is not direct, for example:
http://localhost/download.php?id=13456
When I try to download the file with curl, it downloads the file download.php!
Here is my curl code:
###
function DownloadTorrent($a) {
$save_to = $this->torrentfolder; // Set torrent folder for download
$filename = str_replace('.torrent', '.stf', basename($a));
$fp = fopen ($this->torrentfolder.strtolower($filename), 'w+');//This is the file where we save the information
$ch = curl_init($a);//Here is the file we are downloading
curl_setopt($ch, CURLOPT_ENCODING, "gzip"); // Important
curl_setopt($ch, CURLOPT_TIMEOUT, 50);
curl_setopt($ch, CURLOPT_URL, $fp);
curl_setopt($ch, CURLOPT_HEADER,0); // None header
curl_setopt($ch, CURLOPT_BINARYTRANSFER,1); // Binary trasfer 1
curl_exec($ch);
curl_close($ch);
fclose($fp);
}
Is there a way to download the file without knowing the path?

You may try CURLOPT_FOLLOWLOCATION
TRUE to follow any "Location: " header that the server sends as part
of the HTTP header (note this is recursive, PHP will follow as many
"Location: " headers that it is sent, unless CURLOPT_MAXREDIRS is
set).
So it will result into:
function DownloadTorrent($a) {
$save_to = $this->torrentfolder; // Set torrent folder for download
$filename = str_replace('.torrent', '.stf', basename($a));
$fp = fopen ($this->torrentfolder.strtolower($filename), 'w+');//This is the file where we save the information
$ch = curl_init($a);//Here is the file we are downloading
curl_setopt($ch, CURLOPT_ENCODING, "gzip"); // Important
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 50);
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_HEADER,0); // None header
curl_setopt($ch, CURLOPT_BINARYTRANSFER,1); // Binary transfer 1
curl_exec($ch);
curl_close($ch);
fclose($fp);
}

Set the FOLLOWLOCATION option to true, e.g.:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
Options are documented here: http://www.php.net/manual/en/function.curl-setopt.php

Oooh!
CURLOPT_FOLLOWLOCATION works perfectly...
The problem was that I used CURLOPT_URL for the fopen() handle. I simply replaced CURLOPT_URL with CURLOPT_FILE
and it works very well!
Thank you for your help =)

Check size of external files, php

I fetch files by their URLs with this code:
file_get_contents($_POST['url']);
Then I do something with them.
But I don't want to work with big files. How do I limit the size of the received file?
It should throw an error if the file is bigger than 500 KB.

See my answer to this question. You need to have the cURL extension, with which you can make a HEAD HTTP request to the remote server. The response will let you know how big the file is, and you can then decide accordingly.
You are interested specifically in this line:
$size = curl_getinfo($ch, CURLINFO_CONTENT_LENGTH_DOWNLOAD);

Agree with @Jon
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_URL, $url); //specify the url
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
$head = curl_exec($ch);
$size = curl_getinfo($ch,CURLINFO_CONTENT_LENGTH_DOWNLOAD);
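curl_close($ch);
// -1 means the server did not report a Content-Length
if ($size >= 0 && $size <= 500 * 1024) {
$contents = file_get_contents($url);
}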
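The HEAD-based check depends on the server reporting a Content-Length. As a complement, here is a minimal sketch that enforces the limit during the download itself by aborting from a write callback. The helper name fetch_with_limit is hypothetical; the technique relies on cURL aborting the transfer when the callback returns anything other than the number of bytes it was given.

```php
<?php
// Hypothetical helper: fetch $url but give up once more than $maxBytes arrive.
// Returns the body as a string, or false on error or when the limit is exceeded.
function fetch_with_limit(string $url, int $maxBytes)
{
    $body = '';
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_WRITEFUNCTION, function ($ch, $chunk) use (&$body, $maxBytes) {
        if (strlen($body) + strlen($chunk) > $maxBytes) {
            return -1; // any value other than strlen($chunk) aborts the transfer
        }
        $body .= $chunk;
        return strlen($chunk);
    });
    $ok = curl_exec($ch);
    curl_close($ch);
    return $ok ? $body : false;
}
```

For the 500 KB limit in the question, the call would be fetch_with_limit($url, 500 * 1024), with a false result treated as the error case.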
