I have the following PHP code, which I found here:
function download_xml()
{
$url = 'http://tv.sygko.net/tv.xml';
$ch = curl_init($url);
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$data = curl_exec($ch);
echo("curl_exec was succesful"); //This never gets called
curl_close($ch);
return $data;
}
$my_file = 'tvdata.xml';
$handle = fopen($my_file, 'w');
$data = download_xml();
fwrite($handle, $data);
What I'm trying to do is download the XML at the specified URL and save it to disk. However, the download stops when it is about 80% finished and never reaches the echo call after the curl_exec call. I'm not sure why, but I believe it runs out of memory. Therefore I would like to ask whether it is possible to make curl write the data to the file every time it has downloaded, say, 4 KB. If this is not possible, does anybody know another way to download the XML file stored at the URL and save it to my disk using PHP?
Thank you very much,
BEN.
EDIT:
This is the code now; it doesn't work. It writes the data to the file, but still only about 80% of the document. Maybe the cause isn't memory but something else? I really can't believe it is this hard to copy a file from a URL to the disk...
<?
$url = 'http://tv.sygko.net/tv.xml';
$my_file = fopen('tvdata.xml', 'w');
$ch = curl_init($url);
$timeout = 300;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FILE, $my_file);
curl_setopt($ch, CURLOPT_FAILONERROR, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_BUFFERSIZE, 4096);
curl_exec($ch) OR die("Error in curl_exec()");
echo("got to after curl exec");
fclose($my_file);
curl_close($ch);
?>
Here is a fully working example:
function saveFile($url, $dest) {
    if (!file_exists($dest))
        touch($dest);
    $file = fopen($dest, 'w');
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    // 'progressCallback' must be the name of a function you have defined
    curl_setopt($ch, CURLOPT_PROGRESSFUNCTION, 'progressCallback');
    curl_setopt($ch, CURLOPT_BUFFERSIZE, (1024*1024*512));
    curl_setopt($ch, CURLOPT_NOPROGRESS, FALSE);
    curl_setopt($ch, CURLOPT_FAILONERROR, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
    // CURLOPT_RETURNTRANSFER is not set: the response goes straight to $file
    curl_setopt($ch, CURLOPT_FILE, $file);
    curl_exec($ch);
    curl_close($ch);
    fclose($file);
}
The secret lies in setting CURLOPT_NOPROGRESS to FALSE; CURLOPT_BUFFERSIZE will then make the callback report every time CURLOPT_BUFFERSIZE bytes have been received. The smaller the value, the more frequently it reports. This also depends on your download speed and so on, so don't count on a report every X seconds; it reports for every X bytes received or transferred.
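The example above passes 'progressCallback' by name but never defines it. Here is a minimal sketch of what such a callback might look like, assuming PHP 5.5 or later, where the callback receives the cURL handle as its first argument. The names make_progress_reporter, save_file_with_progress and the $log array are hypothetical and only serve the illustration.

```php
<?php
// Hypothetical helper: returns a closure usable as CURLOPT_PROGRESSFUNCTION.
// Assumes PHP >= 5.5, where the first callback argument is the cURL handle.
function make_progress_reporter(array &$log)
{
    return function ($ch, $downloadTotal, $downloadedNow, $uploadTotal, $uploadedNow) use (&$log) {
        if ($downloadTotal > 0) {
            // Record the percentage of the download completed so far.
            $log[] = (int) floor($downloadedNow * 100 / $downloadTotal);
        }
        return 0; // returning a non-zero value would abort the transfer
    };
}

// Sketch of wiring the reporter into a download that streams to a file.
function save_file_with_progress($url, $dest, array &$log)
{
    $file = fopen($dest, 'w');
    if ($file === false) {
        return false;
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $file);
    curl_setopt($ch, CURLOPT_NOPROGRESS, false); // enable progress reporting
    curl_setopt($ch, CURLOPT_PROGRESSFUNCTION, make_progress_reporter($log));
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    $ok = curl_exec($ch);
    curl_close($ch);
    fclose($file);
    return $ok !== false;
}
```

The total size is reported as 0 until the server has sent a Content-Length, which is why the callback skips the percentage in that case.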
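Your timeout is set to 5 seconds, which might be too short depending on the size of the document. Try increasing it to 10-15 seconds to make sure the transfer has enough time to complete. Note that CURLOPT_CONNECTTIMEOUT only limits the connection phase; CURLOPT_TIMEOUT is the option that limits the whole transfer.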
There's an option called CURLOPT_FILE that allows you to specify a file handle that curl should write to. I'm pretty sure it will do the "right" thing and "write" as it reads, avoiding your memory problem.
$file = fopen('test.txt', 'w'); //<--------- file handler
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,'http://example.com');
curl_setopt($ch, CURLOPT_FAILONERROR,1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION,1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch, CURLOPT_TIMEOUT, 15);
curl_setopt($ch, CURLOPT_FILE, $file); //<------- this is your magic line
curl_exec($ch);
curl_close($ch);
fclose($file);
From the curl_setopt documentation: CURLOPT_FILE - The file that the transfer should be written to. The default is STDOUT (the browser window).
http://us2.php.net/manual/en/function.curl-setopt.php
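None of the snippets above check whether the transfer actually succeeded, which makes a truncated download hard to diagnose. The following is a minimal sketch, not a definitive implementation, of streaming to a file with CURLOPT_FILE while reporting failures. The function name download_to_file, the by-reference $error parameter and the timeout values are assumptions chosen for illustration.

```php
<?php
// Hypothetical helper: stream $url straight to $dest and report failures.
function download_to_file($url, $dest, &$error = null)
{
    $fp = fopen($dest, 'w');
    if ($fp === false) {
        $error = "cannot open $dest for writing";
        return false;
    }

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);          // write the body to the file as it arrives
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);  // treat HTTP status >= 400 as a failure
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10); // limit for establishing the connection
    curl_setopt($ch, CURLOPT_TIMEOUT, 300);       // limit for the whole transfer (assumed value)

    $ok = curl_exec($ch);
    if ($ok === false) {
        $error = curl_error($ch); // e.g. a timeout or connection error message
    }
    curl_close($ch);
    fclose($fp);

    if ($ok === false) {
        unlink($dest); // do not leave a truncated file behind
        return false;
    }
    return true;
}
```

If a download keeps stopping partway, the message in $error should say whether a timeout, a dropped connection or an HTTP error is responsible.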
Related
I am trying to download a ZIP file using cURL, from a given URL.
I received a URL from a supplier from which I should download a ZIP file. But every time I try to download the ZIP file, I get a page that says I am not logged in.
The url where I should get the file from looks like this:
https://www.tyre24.com/nl/nl/user/login/userid/USERID/password/PASSWORD/page/L2V4cG9ydC9kb3dubG9hZC90L01nPT0vYy9NVFE9Lw==
Here you see that USERID and PASSWORD are variables that are filled in with the correct data. The strange thing is that if I enter the URL in my browser it seems to work: the ZIP file gets downloaded.
But every time I call that URL with cURL, I seem to get an incorrect-login page. Could someone tell me what I am doing wrong?
It seems that there is a redirect behind the given URL, which is why I have put this in the cURL call: curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
Here is my code:
set_time_limit(0);
//File to save the contents to
$fp = fopen ('result.zip', 'w+');
$url = "https://www.tyre24.com/nl/nl/user/login/userid/118151/password/5431tyre24/page/L2V4cG9ydC9kb3dubG9hZC90L01nPT0vYy9NVFE9Lw==";
//Here is the file we are downloading, replace spaces with %20
$ch = curl_init(str_replace(" ","%20",$url));
curl_setopt($ch, CURLOPT_TIMEOUT, 50);
//give curl the file pointer so that it can write to it
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$data = curl_exec($ch);//get curl response
//done
curl_close($ch);
Am I doing something wrong?
To download a zip file from the external source via CURL use one of the following approaches:
First approach:
function downloadZipFile($url, $filepath){
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HEADER, 0); // keep the response headers out of the saved file
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
$raw_file_data = curl_exec($ch);
if(curl_errno($ch)){
echo 'error:' . curl_error($ch);
}
curl_close($ch);
file_put_contents($filepath, $raw_file_data);
return (filesize($filepath) > 0)? true : false;
}
downloadZipFile("http://www.colorado.edu/conflict/peace/download/peace_essay.ZIP", "result.zip");
A few comments:
- To get data back from the remote source you have to set the CURLOPT_RETURNTRANSFER option.
- Instead of consecutive fopen ... fwrite calls you can use file_put_contents, which is more convenient.
And here is a screenshot of result.zip, which was downloaded a few minutes earlier using the above approach:
Second approach:
function downloadZipFile($url, $filepath){
$fp = fopen($filepath, 'w+');
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, false);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
//curl_setopt( $ch, CURLOPT_SSL_VERIFYPEER, false );
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_exec($ch);
curl_close($ch);
fclose($fp);
return (filesize($filepath) > 0)? true : false;
}
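Both approaches only check that the saved file is not empty, so an HTML login page saved as result.zip would still count as a success. As a rough sketch, the file's first bytes can be compared against the ZIP signatures. The helper name looks_like_zip is hypothetical, and this checks the signature only, not the integrity of the archive.

```php
<?php
// Hypothetical helper: check whether a file starts with a ZIP signature.
function looks_like_zip($path)
{
    $fp = @fopen($path, 'rb');
    if ($fp === false) {
        return false;
    }
    $magic = fread($fp, 4);
    fclose($fp);
    // "PK\x03\x04" starts a normal archive, "PK\x05\x06" an empty one.
    return $magic === "PK\x03\x04" || $magic === "PK\x05\x06";
}
```

Calling looks_like_zip('result.zip') after the download would distinguish an archive from a saved error page.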
Include the following lines of code after curl_init(). I think this will work.
CURLOPT_RETURNTRANSFER: TRUE to return the transfer as a string of the return value of curl_exec() instead of outputting it directly.
CURLOPT_USERAGENT: The contents of the "User-Agent: " header to be used in an HTTP request.
Read more about curl_setopt here.
$ch = curl_init(str_replace(" ","%20",$url));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6");
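One possible reason a browser succeeds where cURL gets the login page is that the browser keeps the session cookie across the redirect. That is an assumption about this site, not something confirmed here. The sketch below builds an option set that enables cURL's cookie engine alongside the user agent; the function name browser_like_options and the cookie file path are hypothetical.

```php
<?php
// Hypothetical helper: options for a download that behaves more like a browser.
// $cookieFile is an assumed path to a writable file.
function browser_like_options($fp, $cookieFile, $userAgent)
{
    return array(
        CURLOPT_FILE           => $fp,         // stream the body to this file handle
        CURLOPT_FOLLOWLOCATION => true,        // follow the redirect after the login URL
        CURLOPT_MAXREDIRS      => 10,
        CURLOPT_COOKIEJAR      => $cookieFile, // save cookies when the handle is closed
        CURLOPT_COOKIEFILE     => $cookieFile, // read cookies and enable the cookie engine
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_FAILONERROR    => true,
    );
}

// Usage sketch (not executed here):
// $fp = fopen('result.zip', 'w');
// $ch = curl_init($url);
// curl_setopt_array($ch, browser_like_options($fp, 'cookie.txt', 'Mozilla/5.0'));
// curl_exec($ch);
// curl_close($ch);
// fclose($fp);
```

CURLOPT_RETURNTRANSFER is deliberately left out, because setting it after CURLOPT_FILE would stop the response from being written to the file.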
I am stuck with this. I want to download a CSV file from a URL using curl. I have read all the answers on Stack Overflow and tried them all, but I am not getting what I expect. I have the following code.
define("COOKIE_FILE", "cookie.txt");
$path = "settlement_file/test.csv";
set_time_limit(0);
$fp = fopen ($path, 'w+');//This is the file where we save the information
$ch = curl_init(str_replace(" ","%20",$url));//Here is the file we are downloading, replace spaces with %20
curl_setopt($ch, CURLOPT_TIMEOUT, 50);
curl_setopt($ch, CURLOPT_FILE, $fp); // write curl response to file
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER,false);
curl_setopt ($ch, CURLOPT_COOKIEFILE, COOKIE_FILE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSLVERSION,3);
curl_exec($ch); // get curl response
curl_close($ch);
fclose($fp);
You have to remove the curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); statement. It makes curl_exec return the data instead of writing it to a file. Since it comes after curl_setopt($ch, CURLOPT_FILE, $fp); it overrides that setting, so just remove the CURLOPT_RETURNTRANSFER line.
After two days of struggling my last hopes are on you.
I'm trying to download a large (+/- 160 mb) XML-file from the Zanox servers.
The download link to this file is dynamic and does not directly point to the file itself.
I'm trying to download this file to my own server to parse it, but it's not working out for me.
I've been using curl with the CURLOPT_HEADER set to 0.
Can you guys help me out maybe?
Regards.
One of the snippets I tried:
$fp = fopen("productfeed1.xml", 'w+');
$c = curl_init($url);
curl_setopt($c, CURLOPT_FILE, $fp);
curl_setopt($c, CURLOPT_HEADER, 0);
curl_setopt($c, CURLOPT_FOLLOWLOCATION, 1);
$contents = curl_exec($c);
$info = curl_getinfo($c);
fwrite($fp, $contents);
curl_close($c);
fclose($fp);
Give this a try (it works for me):
$fp = fopen("productfeed1.xml", 'w+'); // file the feed is streamed into
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0.1) Gecko/20100101 Firefox/4.0.1");
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true );
curl_setopt($ch, CURLOPT_MAXREDIRS, 10 );
curl_setopt($ch, CURLOPT_TIMEOUT, 36000);
curl_exec($ch);
curl_close($ch);
fclose($fp); // flush and close the file once the transfer is done
I want to download an image from a URL, but it is not getting downloaded. Please check the code and tell me where I am going wrong.
<?php
$ch = curl_init ("http://l1.yimg.com/t/frontpage/cannes_anjelina_60.jpg");
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_BINARYTRANSFER,1);
$rawdata=curl_exec ($ch);
curl_close ($ch);
$fp = fopen("img.jpg",'w');
fwrite($fp, $rawdata);
fclose($fp);
?>
It's working fine for me. Set proper file permissions on the folder where the image should be saved. In this case that is the same folder as your script; you might want to save it somewhere else, like:
// where "images" folder can be written with files
// set permissions to 0755
$fp = fopen("images/img.jpg",'w');
function download_image ($url, $the_filename) {
$cookie = tempnam ("/tmp", "CURLCOOKIE");
$ch = curl_init ($url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Debian) Firefox/0.10.1");
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie);
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_BINARYTRANSFER,true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
curl_setopt($ch, CURLOPT_MAXREDIRS, 5);
curl_setopt($ch, CURLOPT_ENCODING, "");
// CURLINFO_EFFECTIVE_URL is a curl_getinfo() constant, not a curl_setopt() option, so it is not set here
curl_setopt($ch, CURLOPT_REFERER, "http://tomakefast.com");
curl_setopt($ch, CURLOPT_POST, false);
$rawdata=curl_exec($ch);
curl_close($ch);
file_put_contents("../somewhere/". $the_filename, $rawdata);
}
If you see any problems, please let me know. This occasionally gives me a 0 byte image file but that could happen for any number of reasons. This could be improved to return false on 0 bytes, maybe even do a basic test to see if indeed an image was downloaded.
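The improvement suggested above, returning false on 0 bytes and doing a basic image test, could look roughly like this. It is a sketch under assumptions: save_if_image is a hypothetical name, $rawdata is the return value of curl_exec() from download_image, and getimagesizefromstring() requires PHP 5.4 or later.

```php
<?php
// Hypothetical helper: save downloaded bytes only if they look like an image.
function save_if_image($rawdata, $path)
{
    if (!is_string($rawdata) || $rawdata === '') {
        return false; // failed or 0-byte download
    }
    // getimagesizefromstring() returns false when the data is not a recognised image.
    if (@getimagesizefromstring($rawdata) === false) {
        return false;
    }
    return file_put_contents($path, $rawdata) !== false;
}
```

Inside download_image, the final file_put_contents call could be replaced with return save_if_image($rawdata, "../somewhere/" . $the_filename); so that callers can tell a failed download from a good one.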
You can use this simpler code instead if you prefer:
$img_file = file_get_contents("http://l1.yimg.com/t/frontpage/cannes_anjelina_60.jpg");
file_put_contents("myImage.jpg", $img_file);
I've been killing myself all day for this one bug. I can't tell you enough how much I'd appreciate any possible help on this.
Basically, I have a very simple script. It logs into a website, looks at a file's header to see if it is an image type and then it downloads it. It then repeats this three times.
The problem here is that I cannot set CURLOPT_NOBODY without curl_exec crashing the entire script with no errors at all. (I can't even call curl_error!) It would seem that it is impossible for me to go from CURLOPT_NOBODY, true to CURLOPT_NOBODY, false. The loop below runs one time and then hits the die().
What could possibly be causing this bug?
Here is the script:
// Log into the Website
curl_setopt($ch, CURLOPT_URL, 'http://myexample.com/login');
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.12) Gecko/2009070611 Firefox/3.0.12");
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
curl_exec($ch);
// Begin the Loop for Finding Images
for($i = 0; $i < 3; $i++) {
curl_setopt($ch, CURLOPT_URL, 'http://myexample.com/file.php?id=' . $i);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_HEADER, true);
$output = curl_exec($ch) or die('WHY DOES THIS DIE!!!');
$curl_info = curl_getinfo($ch);
echo '<br/>' . $output;
// (Normally checks for content type here) Download the File
curl_setopt($ch, CURLOPT_URL, 'http://myexample.com/file.php?id=' . $i);
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_NOBODY, false);
$filename = 'downloads/test-' . $i . '.jpg';
$fp = fopen($filename, 'w');
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_exec($ch);
fclose($fp);
}
I'm running apache 2.2 and PHP version 5.2.13.
Thanks for any help- I can't tell you how much I'd appreciate it. I'm completely stuck here. :(
It looks to me like the curl library is getting confused, especially since you are reusing the resource.
You should do:
$ch = curl_init();
// do stuff with curl
curl_close($ch);
$ch = curl_init();
// another curl call
curl_close($ch);
$ch = curl_init();
// yet another curl call
curl_close($ch);
I got the same errors when running your script, but adding the curl_close() calls and calling curl_init() again to reinitialize seems to fix the problem. I don't know if this is acceptable. If not, I'd use fopen() for the HTTP download; it is much more intuitive than cURL, unless you need something that fopen doesn't support.
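For the fopen() route, a minimal sketch might look like the following. It assumes allow_url_fopen is enabled when the source is an http:// URL, and it does not handle the login and cookies from the question. The function name copy_stream_to_file is hypothetical.

```php
<?php
// Hypothetical helper: copy a URL (or any readable stream path) to a local file.
function copy_stream_to_file($source, $dest)
{
    $in = @fopen($source, 'rb');
    if ($in === false) {
        return false;
    }
    $out = @fopen($dest, 'wb');
    if ($out === false) {
        fclose($in);
        return false;
    }
    // Copies in chunks, so the whole download is never held in memory.
    $bytes = stream_copy_to_stream($in, $out);
    fclose($in);
    fclose($out);
    return $bytes; // number of bytes copied, or false on failure
}
```

A call such as copy_stream_to_file('http://myexample.com/file.php?id=0', 'downloads/test-0.jpg') would replace the second curl_exec in the loop.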