I've been killing myself all day for this one bug. I can't tell you enough how much I'd appreciate any possible help on this.
Basically, I have a very simple script. It logs into a website, looks at a file's header to see if it is an image type and then it downloads it. It then repeats this three times.
The problem here is that I cannot set CURLOPT_NOBODY without curl_exec crashing the entire script with -no- errors. (I can't even call or get an curl_error!)It would seem that it is impossible for me to go from CURLOPT_NOBODY, true to CURLOPT_NOBODY, false. The loop below runs one time and then dies().
What could possibly be causing this bug?
Here is the script:
// Log into the Website
curl_setopt($ch, CURLOPT_URL, 'http://myexample.com/login');
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.12) Gecko/2009070611 Firefox/3.0.12");
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
curl_exec($ch);
// Begin the Loop for Finding Images
for($i = 0; $i < 3; $i++) {
curl_setopt($ch, CURLOPT_URL, 'http://myexample.com/file.php?id=' . $i);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_HEADER, true);
$output = curl_exec($ch) or die('WHY DOES THIS DIE!!!');
$curl_info = curl_getinfo($ch);
echo '<br/>' . $output;
// (Normally checks for content type here) Download the File
curl_setopt($ch, CURLOPT_URL, 'http://myexample.com/file.php?id=' . $i);
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_NOBODY, false);
$filename = 'downloads/test-' . $i . '.jpg';
$fp = fopen($filename, 'w');
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_exec($ch);
fclose($fp);
}
I'm running apache 2.2 and PHP version 5.2.13.
Thanks for any help- I can't tell you how much I'd appreciate it. I'm completely stuck here. :(
Looks to me like the curl library is getting confused, esp when you are reusing the resource.
you should do
$ch = curl_init();
// do stuff with curl
curl_close($ch);
$ch = curl_init();
// another curl call
curl_close($ch);
$ch = curl_init();
// yet another curl call
curl_close($ch);
I was given the same errors executing your script, but adding in the curl_close's and the curl_init to reinitialize, seem to fix the problem. I don't know if this is acceptable, if not. I'd use the fopen() to do your http downloading, its much more intuitive than using curl, unless you need something that isn't supported in fopen.
Related
I know this is a common problem when using Curl but I have not found a solution after looking through StackOverflow and Google.
I've tried different User Agents and I'm getting different errors:
The requested URL returned error: 400 Bad Requestresource(19) of type (Unknown)
The requested URL returned error: 400 Bad Requeststring(42) of type (Unknown) (I noticed the 42 refers to the '=' in the $target_url)
depending on some of the modifications I make to my code below, however none has pointed me in the direction to solve this problem.
I appreciate any advice:
$target_url = "http://www.hockeydb.com/ihdb/stats/pdisplay.php?pid=170307";
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)');
curl_setopt($ch, CURLOPT_URL,$target_url);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
//curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$html = curl_exec($ch);
if ($html === false) $html = curl_error($ch);
echo stripslashes($html);
curl_close($ch);
var_dump($ch);
*** I should note that I'm actually reading the url (and a few others) from a file, so maybe there is something wrong with the format of the url?
I've done this before and had no problem with it, but now I'm stumped.
I read each line/url and place it into an array which I loop through later on.
*** If I hardcode the url then it works fine, but for some reason reading it from the file produces the error.
Don't use stripslashes() use preg_replace() to filter the URLs
<?php
$target_url="http://www.hockeydb.com/ihdb/stats/pdisplay.php?pid=170307";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$target_url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT ,4);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$html = curl_exec($ch);
$html = preg_replace("#(<\s*a\s+[^>]*href\s*=\s*[\"'])(?!http)([^\"'>]+) ([\"'>]+)#",'$1'.$target_url.'$2$3', $html);
echo $html;
curl_close($ch);
var_dump($ch);
?>
I have written a PHP script that is using CURL to transfer large files from one Windows server to another. Both are running IIS, which has a 2GB limit for file uploads. My process works great for files less than 2GB. However, it fails for files > 2GB.
I have heard that CURL may have the ability to send files in chunks, which may solve my problem. However, I can't find any good examples of how to do this. My current working code is posted below. Does anyone know how to modify it to send my files in chunks (i.e. 100MB chunks)?
$postVars = array("file1" => $fileContents1,
"file2" => $fileContents2,
"outputPath" => $outputPath);
// Initializing curl
$ch = curl_init();
// Configuring curl options
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_USERPWD, $username . ":" . $password);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_VERBOSE, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_FORBID_REUSE, true);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, true);
curl_setopt($ch, CURLOPT_BUFFERSIZE, '64000');
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
curl_setopt($ch, CURLOPT_SSLVERSION, '3');
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-type: multipart/form-data'));
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)");
curl_setopt($ch, CURLOPT_POSTFIELDS,$postVars);
echo "Sending files...\n";
curl_setopt($ch, CURLOPT_TIMEOUT, 9999);
if(curl_exec($ch) === false)
{
echo 'Curl error: ' . curl_error($ch);
}
else
{
echo 'Operation completed without any errors';
}
$info = curl_getinfo($ch);
print_r($info);
curl_close($ch);
plupload is a javascript/php library, and it's quite easy to use and allows chunking of your desired size.
It uses HTML5 though. I found only the way to uploading a large file is PlUpload Library and make the FileManager Application. Hope it will help you also.
Hi i'm trying to get the content from a json file, but i can't i have many troubles to do it, my code is:
<?PHP
$url = 'http://www.taringa.net/api/efc6d445985d5c38c5515dfba8b74e74/json/Users-GetUserData/apptastico';
$ch = curl_init();
$timeout = 0; // set to zero for no timeout
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$file_contents = curl_exec($ch); // take out the spaces of curl statement!!
curl_close($ch);
var_dump($file_contents);
?>
if i put the address in the browser i get the content without any issue, but if i try to get it on PHP or other code i have issues, what can i do? i try use file_get_contents(); too but i don't get anything, only issues and issues
It appears that they are checking user agents and blocking PHP. Set this before using file_get_contents:
ini_set('user_agent', 'Mozilla/5.0 (Windows NT 5.2; rv:2.0.1) Gecko/20100101 Firefox/4.0.1');
If they are checking user agents, they may be doing so to prevent people from doing this kind of thing.
Have you read a little bit about cURL on php.net?
here is how it works:
From php.net:
$url = 'http://www.taringa.net/api/efc6d445985d5c38c5515dfba8b74e74/json/Users-GetUserData/apptastico';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, FALSE); // set to true to see the header
curl_setopt($ch, CURLOPT_NOBODY, FALSE); // show the body
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
$content = curl_exec($ch);
curl_close($ch);
echo $content
I want to download a image from the url but it is not getting downloaded.Please check out the code and tell me where i am going wrong.
<?php
$ch = curl_init ("http://l1.yimg.com/t/frontpage/cannes_anjelina_60.jpg");
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_BINARYTRANSFER,1);
$rawdata=curl_exec ($ch);
curl_close ($ch);
$fp = fopen("img.jpg",'w');
fwrite($fp, $rawdata);
fclose($fp);
?>
It's working fine, set proper file permissions to where the image should be saved. In this case it's the same folder where your script is, might want to move it somewhere else like:
// where "images" folder can be written with files
// set permissions to 0755
$fp = fopen("images/img.jpg",'w');
function download_image ($url, $the_filename) {
$cookie = tempnam ("/tmp", "CURLCOOKIE");
$ch = curl_init ($url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Debian) Firefox/0.10.1");
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie);
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_BINARYTRANSFER,true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
curl_setopt($ch, CURLOPT_MAXREDIRS, 5);
curl_setopt($ch, CURLOPT_ENCODING, "");
curl_setopt($ch, CURLINFO_EFFECTIVE_URL, true);
curl_setopt($ch, CURLOPT_REFERER, "http://tomakefast.com");
curl_setopt($ch, CURLOPT_POST, false);
$rawdata=curl_exec($ch);
curl_close($ch);
file_put_contents("../somewhere/". $the_filename, $rawdata);
}
If you see any problems, please let me know. This occasionally gives me a 0 byte image file but that could happen for any number of reasons. This could be improved to return false on 0 bytes, maybe even do a basic test to see if indeed an image was downloaded.
you can use this simple code instead if you feel this is better.
$img_file = file_get_contents("http://l1.yimg.com/t/frontpage/cannes_anjelina_60.jpg");
file_put_contents("myImage.jpg", $img_file);
I have the following php code which I found here:
function download_xml()
{
$url = 'http://tv.sygko.net/tv.xml';
$ch = curl_init($url);
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$data = curl_exec($ch);
echo("curl_exec was succesful"); //This never gets called
curl_close($ch);
return $data;
}
$my_file = 'tvdata.xml';
$handle = fopen($my_file, 'w');
$data = download_xml();
fwrite($handle, $data);
What I'm trying to do is to download the xml at the specified url and save it to the disk. However, it stops once about 80% finished and never reaches the echo call after the curl_exec call. I'm not sure why, but I believe this is because it runs out of memory. Therefore I would like to ask if it is possible to make curl write the data to the file every time it has downloaded say 4kb. If this is not possible, do anybody know a way to get the xml file stored at the url downloaded and stored on my disk using php?
Thank you very much,
BEN.
EDIT:
This is the code now, it doesnt work. It writes the data to the file but still only about 80% of the document. Maybe it isn't because it exceeds memory but some other reason? I really can't believe it is this hard to copy a file from a URL to the disc...
<?
$url = 'http://tv.sygko.net/tv.xml';
$my_file = fopen('tvdata.xml', 'w');
$ch = curl_init($url);
$timeout = 300;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FILE, $my_file);
curl_setopt($ch, CURLOPT_FAILONERROR, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_BUFFERSIZE, 4096);
curl_exec($ch) OR die("Error in curl_exec()");
echo("got to after curl exec");
fclose($my_file);
curl_close($ch);
?>
Here comes a fully working example:
public function saveFile($url, $dest) {
if (!file_exists($dest))
touch($dest);
$file = fopen($dest, 'w');
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROGRESSFUNCTION, 'progressCallback');
curl_setopt($ch, CURLOPT_BUFFERSIZE, (1024*1024*512));
curl_setopt($ch, CURLOPT_NOPROGRESS, FALSE);
curl_setopt($ch, CURLOPT_FAILONERROR, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 15);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
curl_setopt($ch, CURLOPT_FILE, $file);
curl_exec($ch);
curl_close($ch);
fclose($file);
}
?>
The secret lies withing setting CURLOPT_NOPROGRESS to FALSE, and then, CURLOPT_BUFFERSIZE will make the callback report for every CURLOPT_BUFFERSIZE bytes reached. The smaller value, the more frequently it will report. This also depends on your download speed, etc, so don't count on it to report every X seconds, since it will report for every X bytes received/transferred.
Your timeout is set to 5 seconds which might be too short depending on the file size of the document. Try increasing it to 10-15 just to make sure it has enough time to complete the transfer.
There's an option called CURELOPT_FILE that allows you to specify a file handler that curl should write to. I'm pretty sure it will do "right" thing and "write" as it reads, avoiding your memory problem
$file = fopen('test.txt', 'w'); //<--------- file handler
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,'http://example.com');
curl_setopt($ch, CURLOPT_FAILONERROR,1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION,1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch, CURLOPT_TIMEOUT, 15);
curl_setopt($ch, CURLOPT_FILE, $file); //<------- this is your magic line
curl_exec($ch);
curl_close($ch);
fclose($file);
curl_setopt the CURLOPT_FILE - The file that the transfer should be written to. The default is STDOUT (the browser window)
http://us2.php.net/manual/en/function.curl-setopt.php