Download Zanox product feed to server - php

After two days of struggling, my last hopes are on you.
I'm trying to download a large (roughly 160 MB) XML file from the Zanox servers.
The download link is dynamic and does not point directly to the file itself.
I want to save this file to my own server so I can parse it, but I can't get it to work.
I've been using cURL with CURLOPT_HEADER set to 0.
Can you help me out?
Regards.
One of the snippets I tried:
$fp = fopen("productfeed1.xml", 'w+');
$c = curl_init($url);
curl_setopt($c, CURLOPT_FILE, $fp);
curl_setopt($c, CURLOPT_HEADER, 0);
curl_setopt($c, CURLOPT_FOLLOWLOCATION, 1);
$contents = curl_exec($c);
$info = curl_getinfo($c);
fwrite($fp, $contents);
curl_close($c);
fclose($fp);

Give this a try (it works for me):
$fp = fopen("productfeed1.xml", 'w'); // open the target file before handing it to cURL
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0.1) Gecko/20100101 Firefox/4.0.1");
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_FILE, $fp); // stream the response body straight into the file
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // the dynamic link may redirect to the real file
curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
curl_setopt($ch, CURLOPT_TIMEOUT, 36000); // generous limit for a large download
curl_exec($ch);
curl_close($ch);
fclose($fp);
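The same idea can be wrapped with some error handling, so that a failed or truncated transfer does not leave a half-written feed on disk. This is a minimal sketch, assuming PHP 7+ with the cURL extension; the function name downloadToFile and the timeout values are illustrative choices, not anything taken from the question or prescribed by Zanox.

```php
<?php
// Hypothetical helper: stream $url straight into $dest without holding the
// whole body in memory. Throws RuntimeException on any failure.
function downloadToFile(string $url, string $dest, int $timeout = 3600): int
{
    $fp = @fopen($dest, 'wb');
    if ($fp === false) {
        throw new RuntimeException("Cannot open $dest for writing");
    }

    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_FILE           => $fp,      // body goes directly to the file
        CURLOPT_FOLLOWLOCATION => true,     // the link may redirect to the real file
        CURLOPT_MAXREDIRS      => 10,
        CURLOPT_FAILONERROR    => true,     // treat HTTP status >= 400 as an error
        CURLOPT_CONNECTTIMEOUT => 30,       // assumed value
        CURLOPT_TIMEOUT        => $timeout, // assumed value
    ]);

    $ok    = curl_exec($ch);
    $error = curl_error($ch);
    $bytes = (int) curl_getinfo($ch, CURLINFO_SIZE_DOWNLOAD);
    curl_close($ch);
    fclose($fp);

    if ($ok === false) {
        @unlink($dest); // do not keep a partial file
        throw new RuntimeException("Download failed: $error");
    }
    return $bytes; // byte count as reported by cURL
}
```

Calling it would look like `$bytes = downloadToFile($url, 'productfeed1.xml');` inside a try/catch block, after which the file can be handed to a streaming parser such as XMLReader.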

Related

Upload the file to the FTP server over HTTPS using curl in php

I am trying to upload data (text or zip) to an FTP site.
Because our environment sits behind a proxy, I decided to upload the data using cURL. Before configuring the proxy settings, I tested the script in an environment without a proxy.
I followed:
FTP upload file to distant server with CURL and PHP uploads a blank file
However, I wasn't able to upload a file.
The link I would like to upload to has this format:
https://fp.emc.com/.....
Does anyone know how to upload a file to an FTP server over HTTPS using PHP's cURL functions?
<?php
$sendTo = 'https://ftp.emc.com/....?domain=XX&user=XXX&password=XXX';
$localfile ="23.txt";
$fp = fopen($localfile, 'r');
// Create CURL Connection
$ch = curl_init();
//curl_setopt($ch, CURLOPT_PROTOCOLS, CURLPROTO_HTTPS);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_USERPWD, "usr:passwd");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt ($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)");
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_URL, $sendTo);
//curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_UPLOAD, 1);
curl_setopt($ch, CURLOPT_INFILE, $fp);
curl_setopt($ch, CURLOPT_INFILESIZE, filesize($localfile));
echo $ch;
$m=curl_exec ($ch);
echo "m $m<br>";
$error_no = curl_errno($ch);
echo "error_no $error_no<br>";
curl_close ($ch);
if ($error_no == 0) {
$error = 'File uploaded succesfully.';
} else {
$error = 'File upload error.';
}
echo $error;
?>
Try this:
$fp = fopen($filepath, 'r');
// Credentials are separated from the host by "@"
$ftp_url = "ftp://user:password@ftpserver:21/" . $filename;
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $ftp_url);
curl_setopt($ch, CURLOPT_UPLOAD, 1);
curl_setopt($ch, CURLOPT_INFILE, $fp);
curl_setopt($ch, CURLOPT_INFILESIZE, filesize($filepath));
curl_setopt($ch, CURLOPT_PROXY, $proxy_server_ip);
curl_setopt($ch, CURLOPT_PROXYPORT, $proxy_server_port);
curl_setopt($ch, CURLOPT_PROXYTYPE, CURLPROXY_HTTP); // expects a CURLPROXY_* constant
curl_setopt($ch, CURLOPT_PROXYUSERPWD, $proxy_login);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_exec($ch);
curl_close($ch);
fclose($fp);
It works for me.
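When the credentials are embedded in the URL like this, any special characters in the user name or password have to be percent-encoded, otherwise a character such as "@" or ":" changes how the URL is parsed. A small sketch of building such a URL follows; buildFtpUrl is a hypothetical helper name, and the host, port and path values are placeholders.

```php
<?php
// Hypothetical helper: build an ftp:// URL with safely encoded credentials.
function buildFtpUrl(string $user, string $password, string $host, int $port, string $remotePath): string
{
    // Encode each path segment separately so the slashes survive.
    $segments = array_map('rawurlencode', explode('/', ltrim($remotePath, '/')));

    return sprintf(
        'ftp://%s:%s@%s:%d/%s',
        rawurlencode($user),
        rawurlencode($password),
        $host,
        $port,
        implode('/', $segments)
    );
}

echo buildFtpUrl('user', 'p@ss:word', 'ftpserver', 21, 'incoming/my file.txt'), "\n";
// prints ftp://user:p%40ss%3Aword@ftpserver:21/incoming/my%20file.txt
```

An alternative that avoids encoding altogether is to keep the URL free of credentials and pass them through CURLOPT_USERPWD as "user:password".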

cURL not saving the image correctly

$ch = curl_init();
$fp = fopen("$localName",'w');
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_URL, $src);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_BINARYTRANSFER,1);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)");
curl_setopt($ch, CURLOPT_TIMEOUT, 200);
curl_setopt($ch, CURLOPT_AUTOREFERER, false);
curl_setopt($ch, CURLOPT_REFERER, "http://google.com");
curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
curl_setopt($ch, CURLOPT_HEADER, 0);
$rawdata=curl_exec ($ch);
curl_close ($ch);
fwrite($fp, $rawdata);
fclose($fp);
This writes the file, but the result is invalid (0 bytes). Please tell me what I'm doing wrong.
I ran your code and there were some errors. I have rectified those here:
$src = '<URL to the image>';
$ch = curl_init($src);
//curl_setopt($ch, CURLOPT_FILE, $fp); This option is not required
//curl_setopt($ch, CURLOPT_URL, $host); Since you are setting the source in init skip this
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_BINARYTRANSFER,1);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)");
curl_setopt($ch, CURLOPT_TIMEOUT, 200);
curl_setopt($ch, CURLOPT_AUTOREFERER, false);
curl_setopt($ch, CURLOPT_REFERER, "http://google.com");
curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE); // Follows redirect responses.
$raw=curl_exec($ch);
if ($raw === false) {
trigger_error(curl_error($ch));
}
curl_close ($ch);
$localName = basename($src); // The file name of the source can be used locally
if(file_exists($localName)){
unlink($localName);
}
$fp = fopen($localName,'wb');
fwrite($fp, $raw);
fclose($fp);
It turned out that some images were served with a 302 redirect status, which cURL does not follow unless CURLOPT_FOLLOWLOCATION is set.
Try adding a header() call for the image type.
For example, if it is a PNG image, add this to your code:
// ...
header('Content-type: image/png');
file_put_contents($localName, $raw); // first argument is the target filename
Had a play; all you seem to need is the following:
<?php
$src = 'http://bit.ly/TT5N5M';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $src);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$output = curl_exec($ch);
curl_close($ch);
$fp = fopen("img.jpg",'w');
fwrite($fp, $output);
fclose($fp);
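What the answers in this thread have in common is following redirects and checking the result before writing it to disk. The sketch below adds one more check, on the final status code and content type, so that an error page is not saved under an image name. It assumes PHP 7.1+ with the cURL extension; isImageResponse and saveImage are hypothetical names.

```php
<?php
// Hypothetical check: accept only a 2xx response whose Content-Type is image/*.
function isImageResponse(int $httpCode, ?string $contentType): bool
{
    return $httpCode >= 200 && $httpCode < 300
        && $contentType !== null
        && stripos($contentType, 'image/') === 0;
}

// Hypothetical helper: fetch $src and write it to $localName only if it is an image.
function saveImage(string $src, string $localName): bool
{
    $ch = curl_init($src);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body as a string
        CURLOPT_FOLLOWLOCATION => true, // follow 301/302 to the real image
        CURLOPT_TIMEOUT        => 200,
    ]);
    $raw  = curl_exec($ch);
    $code = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);    // status of the final response
    $type = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
    curl_close($ch);

    if ($raw === false || !isImageResponse($code, is_string($type) ? $type : null)) {
        return false;
    }
    return file_put_contents($localName, $raw) !== false;
}
```

Note that CURLOPT_FILE is deliberately not combined with CURLOPT_RETURNTRANSFER here; the body is returned as a string and written once the checks have passed.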

cUrl Login then cUrl Download

I am writing a script to download files from a password-protected members area. I have it working right now by using a single cURL handle to log in and then download. What I would like instead is for one script to log in and save the cookie, and for another script to use that cookie to download the file. I am not sure whether this is possible.
Here is my working code:
$cookie_file_path = "downloads/cookie.txt";
$fp = fopen($cookie_file_path, "w");
fclose($fp);
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_NOBODY, false);
curl_setopt($ch, CURLOPT_URL, $loginUrl);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
curl_setopt($ch, CURLOPT_USERAGENT,
"Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $loginPostInfo);
curl_exec($ch);
// hardcode some known data
$downloadSize = 244626770;
$chuckSize = 1024*2048;
$filePath = "downloads/file.avi";
$file = fopen($filePath, "w");
$downloaded = 0;
$startTime = microtime(true);
while ($downloaded < $downloadSize) {
// DOWNLOAD
curl_setopt($ch, CURLOPT_RANGE, $downloaded."-".($downloaded + $chuckSize - 1));
curl_setopt($ch, CURLOPT_URL, $downloadUrl);
$result = curl_exec($ch);
$nowTime = microtime(true);
fwrite($file, $result);
echo "\n\nprogress: ".$downloaded."/".$downloadSize." - %".(round($downloaded / $downloadSize, 4) * 100);
$downloaded += $chuckSize;
// calculate kbps
$totalTime = $nowTime - $startTime;
$kbps = $downloaded / $totalTime;
echo "\ndownloaded: ".$downloaded." bytes";
echo "\ntime: ".round($totalTime, 2);
echo "\nkbps: ".(round($kbps / 1024, 2));
}
fclose($file);
curl_close($ch);
So is it possible to close the cURL handle after the login curl_exec and then open a new one to download the file using the cookie I saved during login?
Yes it's possible.
CURLOPT_COOKIEJAR is the write path for cookies, while CURLOPT_COOKIEFILE is the read path for cookies. If you provide CURLOPT_COOKIEFILE with the same path as you did with CURLOPT_COOKIEJAR, cURL will persist the cookies across requests:
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
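Split across two scripts, this could look roughly like the sketch below. It assumes PHP 7+ with the cURL extension; the function names are hypothetical, and the login form fields depend entirely on the site. One detail worth knowing is that libcurl writes the cookie jar when the handle is cleaned up, so the login handle has to be closed or destroyed before the second script reads the file.

```php
<?php
// Hypothetical helper: make sure the cookie file exists and return its absolute path.
function ensureCookieFile(string $path): string
{
    if (!file_exists($path)) {
        touch($path);
        chmod($path, 0600); // session cookies are credentials
    }
    $real = realpath($path);
    if ($real === false) {
        throw new RuntimeException("Cannot create cookie file $path");
    }
    return $real;
}

// Script 1 (hypothetical): log in and let cURL store the session cookies.
function loginAndSaveCookies(string $loginUrl, array $postFields, string $cookiePath): bool
{
    $ch = curl_init($loginUrl);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($postFields),
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEJAR      => $cookiePath, // written when the handle is destroyed
    ]);
    $ok = curl_exec($ch) !== false;
    curl_close($ch); // has no effect on PHP 8+
    unset($ch);      // destroying the handle is what flushes the jar
    return $ok;
}

// Script 2 (hypothetical): reuse the saved cookies on a brand-new handle.
function downloadWithCookies(string $url, string $dest, string $cookiePath): bool
{
    $fp = fopen($dest, 'wb');
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_COOKIEFILE     => $cookiePath, // read the cookies saved by script 1
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_FILE           => $fp,
    ]);
    $ok = curl_exec($ch) !== false;
    curl_close($ch);
    fclose($fp);
    return $ok;
}
```

Both scripts would call ensureCookieFile() with the same path first, since an absolute path avoids surprises when the two scripts run from different working directories.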
On ImpressPages I've done it this way:
//initial request with login data
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.example.com/login.php');
curl_setopt($ch, CURLOPT_USERAGENT,'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/32.0.1700.107 Chrome/32.0.1700.107 Safari/537.36');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, "username=XXXXX&password=XXXXX");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie-name'); //could be empty, but cause problems on some hosts
curl_setopt($ch, CURLOPT_COOKIEFILE, '/var/www/ip4.x/file/tmp'); //could be empty, but cause problems on some hosts
$answer = curl_exec($ch);
if (curl_error($ch)) {
echo curl_error($ch);
}
//another request preserving the session
curl_setopt($ch, CURLOPT_URL, 'http://www.example.com/profile');
curl_setopt($ch, CURLOPT_POST, false);
curl_setopt($ch, CURLOPT_POSTFIELDS, "");
$answer = curl_exec($ch);
if (curl_error($ch)) {
echo curl_error($ch);
}
Yes, please look at CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE. I see you already use CURLOPT_COOKIEJAR, so you probably only need to look into CURLOPT_COOKIEFILE.

Unable to set CURLOPT_NOBODY without crashing curl_exec

I've been killing myself all day for this one bug. I can't tell you enough how much I'd appreciate any possible help on this.
Basically, I have a very simple script. It logs into a website, looks at a file's header to see if it is an image type and then it downloads it. It then repeats this three times.
The problem is that I cannot set CURLOPT_NOBODY without curl_exec crashing the entire script with no errors at all. (I can't even call curl_error!) It seems impossible to go from CURLOPT_NOBODY, true back to CURLOPT_NOBODY, false. The loop below runs once and then hits the die().
What could possibly be causing this bug?
Here is the script:
// Log into the Website
curl_setopt($ch, CURLOPT_URL, 'http://myexample.com/login');
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.12) Gecko/2009070611 Firefox/3.0.12");
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
curl_exec($ch);
// Begin the Loop for Finding Images
for($i = 0; $i < 3; $i++) {
curl_setopt($ch, CURLOPT_URL, 'http://myexample.com/file.php?id=' . $i);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_HEADER, true);
$output = curl_exec($ch) or die('WHY DOES THIS DIE!!!');
$curl_info = curl_getinfo($ch);
echo '<br/>' . $output;
// (Normally checks for content type here) Download the File
curl_setopt($ch, CURLOPT_URL, 'http://myexample.com/file.php?id=' . $i);
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_NOBODY, false);
$filename = 'downloads/test-' . $i . '.jpg';
$fp = fopen($filename, 'w');
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_exec($ch);
fclose($fp);
}
I'm running Apache 2.2 and PHP 5.2.13.
Thanks for any help- I can't tell you how much I'd appreciate it. I'm completely stuck here. :(
It looks to me like the cURL library is getting confused, especially since you are reusing the handle.
You should do:
$ch = curl_init();
// do stuff with curl
curl_close($ch);
$ch = curl_init();
// another curl call
curl_close($ch);
$ch = curl_init();
// yet another curl call
curl_close($ch);
I got the same errors when running your script, but adding the curl_close() calls and re-initializing with curl_init() seems to fix the problem. If that isn't acceptable, I'd use fopen() for the HTTP download; it's much more intuitive than cURL, unless you need something fopen() doesn't support.
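Applied to the loop in the question, the one-handle-per-request advice might look like the sketch below, with the session carried over through the cookie file rather than through the handle. It assumes PHP 7.1+ with the cURL extension, and all three function names are hypothetical. If a handle is reused instead, the PHP manual notes that setting CURLOPT_NOBODY back to false does not switch the method back to GET; CURLOPT_HTTPGET has to be set for that.

```php
<?php
// Hypothetical helper: pull the final Content-Type out of raw response headers.
// With redirects, cURL returns one header block per hop, so the last match wins.
function contentTypeFromHeaders(string $rawHeaders): ?string
{
    if (!preg_match_all('/^Content-Type:[ \t]*([^;\r\n]+)/mi', $rawHeaders, $m)) {
        return null;
    }
    return strtolower(trim(end($m[1])));
}

// Hypothetical helper: HEAD request on a fresh handle, cookies read from $cookiePath.
function fetchHeaders(string $url, string $cookiePath): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true, // switches the request method to HEAD
        CURLOPT_HEADER         => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEFILE     => $cookiePath,
    ]);
    $headers = curl_exec($ch);
    curl_close($ch);
    return $headers === false ? null : $headers;
}

// Hypothetical helper: GET on another fresh handle, streamed into $dest.
function fetchBody(string $url, string $cookiePath, string $dest): bool
{
    $fp = fopen($dest, 'wb');
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_FILE           => $fp,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEFILE     => $cookiePath,
    ]);
    $ok = curl_exec($ch) !== false;
    curl_close($ch); // release the handle before closing the file it writes to
    fclose($fp);
    return $ok;
}
```

Inside the loop this would be used roughly as `$type = contentTypeFromHeaders(fetchHeaders($url, 'cookie.txt') ?? '');`, followed by a call to fetchBody() when $type starts with "image/". Because no handle outlives the file it writes to, CURLOPT_FILE never points at an already closed file pointer.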

Make cURL write data as it receives it

I have the following php code which I found here:
function download_xml()
{
$url = 'http://tv.sygko.net/tv.xml';
$ch = curl_init($url);
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$data = curl_exec($ch);
echo("curl_exec was succesful"); //This never gets called
curl_close($ch);
return $data;
}
$my_file = 'tvdata.xml';
$handle = fopen($my_file, 'w');
$data = download_xml();
fwrite($handle, $data);
What I'm trying to do is download the XML at the specified URL and save it to disk. However, it stops at about 80% and never reaches the echo call after curl_exec. I'm not sure why, but I believe it runs out of memory. So I would like to ask whether it is possible to make cURL write the data to the file every time it has downloaded, say, 4 KB. If that is not possible, does anybody know another way to download the XML file at that URL and store it on disk using PHP?
Thank you very much,
BEN.
EDIT:
This is the code now; it doesn't work. It writes the data to the file, but still only about 80% of the document. Maybe the cause isn't memory but something else? I really can't believe it is this hard to copy a file from a URL to disk...
<?
$url = 'http://tv.sygko.net/tv.xml';
$my_file = fopen('tvdata.xml', 'w');
$ch = curl_init($url);
$timeout = 300;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FILE, $my_file);
curl_setopt($ch, CURLOPT_FAILONERROR, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_BUFFERSIZE, 4096);
curl_exec($ch) OR die("Error in curl_exec()");
echo("got to after curl exec");
fclose($my_file);
curl_close($ch);
?>
Here is a working example:
// Progress callback (PHP 5.5+ signature); returning a non-zero value aborts the transfer.
function progressCallback($ch, $downloadTotal, $downloaded, $uploadTotal, $uploaded) {
return 0;
}
function saveFile($url, $dest) {
if (!file_exists($dest))
touch($dest);
$file = fopen($dest, 'w');
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROGRESSFUNCTION, 'progressCallback');
curl_setopt($ch, CURLOPT_BUFFERSIZE, (1024*1024*512));
curl_setopt($ch, CURLOPT_NOPROGRESS, FALSE);
curl_setopt($ch, CURLOPT_FAILONERROR, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 15);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
curl_setopt($ch, CURLOPT_FILE, $file); // set after CURLOPT_RETURNTRANSFER so the body goes to the file
curl_exec($ch);
curl_close($ch);
fclose($file);
}
The secret lies in setting CURLOPT_NOPROGRESS to FALSE; CURLOPT_BUFFERSIZE then makes the callback report roughly every CURLOPT_BUFFERSIZE bytes received. The smaller the value, the more frequently it reports. This also depends on your download speed and so on, so don't count on a report every X seconds; it reports per X bytes received or transferred.
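Your timeout is set to 5 seconds, which might be too short depending on the size of the document. Try increasing it to 10-15 just to make sure it has enough time to complete the transfer. Note that CURLOPT_CONNECTTIMEOUT only limits the connection phase; CURLOPT_TIMEOUT is the option that limits the whole transfer.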
There's an option called CURLOPT_FILE that lets you specify a file handle that cURL should write to. I'm pretty sure it will do the right thing and write as it reads, avoiding your memory problem:
$file = fopen('test.txt', 'w'); //<--------- file handle
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,'http://example.com');
curl_setopt($ch, CURLOPT_FAILONERROR,1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION,1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch, CURLOPT_TIMEOUT, 15);
curl_setopt($ch, CURLOPT_FILE, $file); //<------- this is your magic line
curl_exec($ch);
curl_close($ch);
fclose($file);
From the curl_setopt documentation: CURLOPT_FILE is the file that the transfer should be written to. The default is STDOUT (the browser window).
http://us2.php.net/manual/en/function.curl-setopt.php
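If more control over each received chunk is wanted, for example to count bytes or log progress, CURLOPT_WRITEFUNCTION can be used instead of CURLOPT_FILE. cURL then calls the callback for every chunk as it arrives, and the callback must return the number of bytes it handled. The sketch below assumes PHP 7+ with the cURL extension; makeChunkWriter and streamToFile are hypothetical names, and the connect timeout is an assumed value.

```php
<?php
// Hypothetical factory: returns a cURL write callback that appends each
// received chunk to $fp and keeps a running byte count in $total.
function makeChunkWriter($fp, int &$total): callable
{
    return function ($ch, string $chunk) use ($fp, &$total): int {
        $written = fwrite($fp, $chunk);
        if ($written === false) {
            return 0; // a short count tells cURL to abort the transfer
        }
        $total += $written;
        return $written;
    };
}

// Hypothetical helper: download $url to $dest chunk by chunk.
function streamToFile(string $url, string $dest): int
{
    $fp    = fopen($dest, 'wb');
    $total = 0;

    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_FAILONERROR    => true,
        CURLOPT_CONNECTTIMEOUT => 30, // assumed value; no overall CURLOPT_TIMEOUT is set
        CURLOPT_WRITEFUNCTION  => makeChunkWriter($fp, $total),
    ]);
    $ok = curl_exec($ch);
    curl_close($ch);
    fclose($fp);

    if ($ok === false) {
        throw new RuntimeException("Transfer of $url failed");
    }
    return $total; // bytes written to $dest
}
```

For the file in the question this would be called as `streamToFile('http://tv.sygko.net/tv.xml', 'tvdata.xml');`. Memory use stays at roughly one chunk regardless of the size of the document.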
