301 error when using the function file_get_contents_curl on a URL

I get a 301 error when I make a cURL request to leboncoin.fr.
I tried to solve the problem by adding curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1) to my cURL function.
The code worked fine for one day only. The next day I got the same 301 error again.
Here is the cURL code:
function file_get_contents_curl($url)
{
    $ch = curl_init();
    $timeout = 10; // seconds, used for both the connect and the total timeout
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    // This is the line I added to follow the 301 redirect:
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_FRESH_CONNECT, 1);
    curl_setopt($ch, CURLOPT_FORBID_REUSE, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20041001 Firefox/0.10.1");
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
Do you have any idea how to solve this?
Thanks.

If it works one day and nothing has changed on your side or on the site you are connecting to, the result should stay the same.
In any case, since you know the address changed, update it in your code; that saves the extra redirect round trip.
You may also be running into the connection time limit. Try increasing CURLOPT_CONNECTTIMEOUT a little, to 13 for example, in case the server is slow to respond or to issue the redirect.
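
To find out where the 301 points, one option is to make the request once without following redirects and read back the status code and the redirect target. The following is a minimal sketch, not the asker's code; the function name inspect_redirect() and the returned array keys are made up for illustration:

```php
<?php
// Hypothetical helper: request a URL WITHOUT following redirects and report
// the HTTP status code and the redirect target, if the server sent one.
function inspect_redirect(string $url): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,   // keep the body out of the output
        CURLOPT_FOLLOWLOCATION => false,  // we want to see the 301 itself
        CURLOPT_CONNECTTIMEOUT => 10,
        CURLOPT_TIMEOUT        => 10,
    ]);

    $body     = curl_exec($ch);
    $error    = ($body === false) ? curl_error($ch) : null;
    $status   = curl_getinfo($ch, CURLINFO_HTTP_CODE);     // 0 if no response
    $location = curl_getinfo($ch, CURLINFO_REDIRECT_URL);  // empty if none
    curl_close($ch);

    return [
        'error'    => $error,
        'status'   => $status,
        'location' => $location ? $location : null,
    ];
}

// Usage (commented out so the sketch makes no request when loaded):
// $info = inspect_redirect('http://www.leboncoin.fr/');
// if ($info['status'] === 301) { echo 'Moved to: ' . $info['location']; }
```

If the reported location is stable, it can replace the old address in the code, as suggested above.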

Related

PHP Curl gets 403 error, but browser from same machine can request page?

I've got this script working with generally no problems. I say generally, because while it retrieves pages from CNN.com, allrecipes.com, reddit.com, etc - when I point it towards at least one URL (foxnews.com), I get a 403 error instead.
As you can see, I've set the user agent to the same as my machine's browser (that was necessitated by sending a request to Facebook's homepage, which returned a message that the browser wasn't supported).
So, basically wondering what step(s) I need to take to have as many sites as possible recognize the CURL request as coming from a real, actual browser, rather than 403'ing it.
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $this->url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_USERAGENT,'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/602.4.8 (KHTML, like Gecko) Version/10.0.3 Safari/602.4.8');
curl_setopt($ch, CURLOPT_FRESH_CONNECT, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);

Fox News appears to be blocking access to their website from any request passing a USERAGENT. Simply removing the USERAGENT string works fine for me:
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $this->url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
Hope this helps! :)
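
On the broader question of looking more like a browser: a browser sends more than a user agent, so it can be worth sending a few of the other usual headers too. The following is only a sketch; the helper name browser_like_options() and the header values are assumptions chosen for illustration, and no combination of headers is guaranteed to be accepted by every site, so each site has to be tested individually:

```php
<?php
// Hypothetical helper: build a cURL option array that sends some of the
// headers a desktop browser normally sends. The header values are examples.
function browser_like_options(string $url, ?string $userAgent = null): array
{
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_CONNECTTIMEOUT => 5,
        // Empty string: advertise every encoding cURL supports and decode it.
        CURLOPT_ENCODING       => '',
        CURLOPT_HTTPHEADER     => [
            'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'Accept-Language: en-US,en;q=0.5',
        ],
    ];

    // The user agent is optional, since some sites react badly to some values.
    if ($userAgent !== null) {
        $options[CURLOPT_USERAGENT] = $userAgent;
    }

    return $options;
}

// Usage:
// $ch = curl_init();
// curl_setopt_array($ch, browser_like_options($url));
// $html = curl_exec($ch);
```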

PHP Curl - 400 Bad Request

I know this is a common problem when using Curl but I have not found a solution after looking through StackOverflow and Google.
I've tried different User Agents and I'm getting different errors:
The requested URL returned error: 400 Bad Requestresource(19) of type (Unknown)
The requested URL returned error: 400 Bad Requeststring(42) of type (Unknown) (I noticed the 42 refers to the '=' in the $target_url)
depending on some of the modifications I make to my code below, however none has pointed me in the direction to solve this problem.
I appreciate any advice:
$target_url = "http://www.hockeydb.com/ihdb/stats/pdisplay.php?pid=170307";
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)');
curl_setopt($ch, CURLOPT_URL,$target_url);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
//curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$html = curl_exec($ch);
if ($html === false) $html = curl_error($ch);
echo stripslashes($html);
curl_close($ch);
var_dump($ch);
*** I should note that I'm actually reading the url (and a few others) from a file, so maybe there is something wrong with the format of the url?
I've done this before and had no problem with it, but now I'm stumped.
I read each line/url and place it into an array which I loop through later on.
*** If I hardcode the url then it works fine, but for some reason reading it from the file produces the error.

Don't use stripslashes(); use preg_replace() to filter the URLs:
<?php
$target_url="http://www.hockeydb.com/ihdb/stats/pdisplay.php?pid=170307";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$target_url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT ,4);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$html = curl_exec($ch);
$html = preg_replace("#(<\s*a\s+[^>]*href\s*=\s*[\"'])(?!http)([^\"'>]+)([\"'>]+)#",'$1'.$target_url.'$2$3', $html);
echo $html;
curl_close($ch);
var_dump($ch);
?>
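
Since the hardcoded URL works and the one read from the file does not, a likely cause is a trailing newline or carriage return left on each line, which makes the request URL invalid. This is an assumption, because the question does not show the file-reading code. A minimal sketch of trimming the lines before use, with a hypothetical helper name read_urls():

```php
<?php
// Hypothetical helper: read one URL per line and strip the whitespace
// (including "\r" and "\n") that may be left at the ends of each line.
function read_urls(string $path): array
{
    $lines = file($path, FILE_IGNORE_NEW_LINES);
    if ($lines === false) {
        return []; // file could not be read
    }

    $urls = array_map('trim', $lines);

    // Drop blank lines and reindex the array from 0.
    return array_values(array_filter($urls, static function ($url) {
        return $url !== '';
    }));
}

// Usage:
// foreach (read_urls('urls.txt') as $target_url) {
//     curl_setopt($ch, CURLOPT_URL, $target_url);
//     ...
// }
```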

cURL follow redirect causing 500 server error

I'm trying to replicate the functionality of a PHP 'header' redirect using cURL so that it works within a cronjob triggered script.
The following code runs within a while loop (which I know works) but it returns a 500 internal server error when I try to run it, as though the request is timing out on the server (it should not take any more than 30 seconds).
Is there something wrong with the way I'm doing this:
<?php
$url = 'http://exampleurl/?ld_LeadProductType=PROF&ld_contactFirstName='.rawurlencode($quotes['name']).'&ld_businessName='.rawurlencode($quotes['company_name']).'&QuotePrice=%20/%20Quote%20Price:'.number_format($quotes['quote'],2).'.';
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT ,0);
curl_setopt($ch, CURLOPT_TIMEOUT, 128);
$html = curl_exec($ch);
$redirectURL = curl_getinfo($ch,CURLINFO_EFFECTIVE_URL );
curl_close($ch);
?>

CURL FOLLOWLOCATION without displaying

When I add this option to my cURL handle:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
the content of the site I'm sending the request to gets printed. I want to parse the returned content manually ($response = curl_exec($ch);), so I need the content to stay in my $response variable while not being displayed.
Is that feasible?
The curl code:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $action);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6");
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_COOKIEJAR, '');
curl_setopt($ch, CURLOPT_COOKIEFILE, '');
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $fields);
$response = curl_exec($ch);
curl_close($ch);

You should set the CURLOPT_RETURNTRANSFER flag to true. From the PHP.net manual:
CURLOPT_RETURNTRANSFER: TRUE to return the transfer as a string of the return value of curl_exec() instead of outputting it directly.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

Stop Curl redirecting to new page

I am trying to make a cURL request to the domain http://xyz.com. Here is my code:
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_URL, $strURL);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($ch, CURLOPT_POSTFIELDS, $arrData);
curl_exec($ch);
While making the request, it gets redirected to some other page on that site and doesn't come back to my page.
How can I stop it from being redirected in the middle of the cURL request?
I'm sorry, guys...
After the suggestion I tried setting CURLOPT_FOLLOWLOCATION to 0 and it worked. It was my mistake: I hadn't removed the header redirection on the next line, so it kept redirecting over and over.
Sorry, my mistake.
Once more: with CURLOPT_FOLLOWLOCATION set to 0, cURL won't follow the redirect.

I think it's because that page checks your user agent or sets cookies, so you need to mimic a web browser as closely as possible.
For example, add a user agent:
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_2) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.63 Safari/535.7');
Or try setting a cookie jar:
$cookieJar = tempnam ("/tmp", "CURLCOOKIE");
curl_setopt ($ch, CURLOPT_COOKIEJAR, $cookieJar);
If you provide the URL, maybe I can help more.

Try using the CURLOPT_MAXREDIRS option.
CURLOPT_MAXREDIRS : The maximum amount of HTTP redirections to follow. Use this option alongside CURLOPT_FOLLOWLOCATION.

Try experimenting with:
CURLOPT_RETURNTRANSFER,
CURLOPT_FOLLOWLOCATION,
CURLOPT_COOKIEJAR,
CURLOPT_COOKIEFILE
And remember to log; it makes debugging much easier:
$handle = fopen('log.tmp', 'w');
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_STDERR, $handle);
