Why cURL is timed out? - php

I want to fetch the result of a webpage, exactly like a simple user, through a browser. I setted the headers and the sended cookies of the request, to which I got with fiddler4. Last night it's worked, but now it's throw cURL error 28, request timed out.
Here is the code what I used:
function cURL($url){
$cURL=curl_init();
$header[0] = "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: hu-HU,hu;q=0.8,en-US;q=0.6,en;q=0.4,es;q=0.2,it;q=0.2,de;q=0.2,fr;q=0.2";
$header[] = "Accept-Encoding: gzip, deflate, sdch";
$header[] = "Pragma: ";
curl_setopt($cURL, CURLOPT_URL, $url);
curl_setopt($cURL, CURLOPT_USERAGENT, 'User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.115 Safari/537.36');
curl_setopt($cURL, CURLOPT_HTTPHEADER, $header);
curl_setopt($cURL, CURLOPT_POST, true);
curl_setopt($cURL, CURLOPT_POSTFIELDS, 'action=verChau');
curl_setopt($cURL, CURLOPT_REFERER, 'http://www.google.com');
curl_setopt($cURL, CURLOPT_AUTOREFERER, true);
curl_setopt($cURL, CURLOPT_COOKIE, 'Cookie: __gfp_64b=mgK79a4qc_M9RH4eFToSbGkkxUWaWD2tKPQ51RreN8r.A7; PHPSESSID=780d83cb35c5b82098e33fde9c101d08; __atuvc=0%7C4%2C0%7C5%2C1%7C6%2C6%7C7%2C1%7C8; cTest=1; resDone20101213=1; _gat=1; _ga=GA1.2.307256553.1418233339; _goa3=eyJ1IjoiMTQxMjEyNDEwODM2NDE4MTU4NzAxNiIsImgiOiJCQzI0OEVFRC5kc2wucG9vbC50ZWxla29tLmh1IiwicyI6MTQxODMzODgwMDAwMH0=; _goa3TC=eyI1NjM2NCI6MTQyNTEzMjMwNDE2OCwiMzEzNDUzMiI6MTQyNTE0ODk1NDg2M30=; _goa3TS=e30=');
curl_setopt($cURL, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($cURL, CURLOPT_TIMEOUT, 10);
$html=curl_exec($cURL);
if ($html === false){
echo "cURL exception: ".curl_errno($cURL).": ".curl_error($cURL);
}
curl_close($cURL);
return $html;
}
Can anybody help me out?

Related

Curl work suitable on localhost but not working on server

Right now I use curl to get html5player.setVideoUrlLow and is working good but is only poor quality. So i need to get html5player.setVideoUrlHigh but this param don't appear with curl response if I run from server! On localhost work fine! What I missing from my code?
Tried already with different CURLOPT_USERAGENT ans same problem! Thank you!
<?php
function getstring($string,$start,$end)
{
$str = explode($start,$string);
$str = explode($end,$str[1]);
return $str[0];
}
$viewkey = $_GET['viewkey']; // https://mypage.com/view_video.php?viewkey=54501623
$url = "http://www.xvideos.com/embedframe/".$viewkey."";
// $url = "http://www.xvideos.com/video".$viewkey.""; // alternate but same result //
$curl = curl_init();
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: ";
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) Ap");
curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
curl_setopt($curl, CURLOPT_REFERER, "http://www.google.com/bot.html");
curl_setopt($curl, CURLOPT_ENCODING, "gzip,deflate");
curl_setopt($curl, CURLOPT_AUTOREFERER, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_TIMEOUT, 30);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION,true);
$html = curl_exec($curl);
curl_close($curl);
$VideoUrlLow=getstring($html,"html5player.setVideoUrlLow('","');");
$VideoUrlHigh=getstring($html,"html5player.setVideoUrlHigh('","');");
if($VideoUrlHigh!="")
{
$mp4 = $VideoUrlHigh; // empty on server but work in localhost
} else
{
$mp4 = $VideoUrlLow;
}
header('Location: '.$mp4);
?>

What's the equivalent of CURL in SCRAPY

I want to scrape a website by SCRAPY with AJAX PAGINATION, i scraped this web site by PHP by using CURL, i monitored the network by Firebug, with firebug we have a option "Copy for CURL" for POST REQUEST.
My question is how can i do the same for SCRAPY.
my function in PHP:
function forCurl($url,$refer, $jsessionid){
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:34.0) Gecko/20100101 Firefox/34.0');
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: no-cache' --data 't%3Azoneid=forceAjax";
$header[] = "Connection: keep-alive";
$header[] = "Accept-Language: fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3";
$header[] = "Pragma: no-cache";
$header[] = "X-Requested-With: XMLHttpRequest";
$header[] = "Keep-Alive: 700";
$cookie = "JSESSIONID=" . $jsessionid. '; langueFront=fr; tc_cj_v2=%5Ecl_%5Dny%5B%5D%5D_mmZZZZZZKNLLMQOMROKJRZZZ%5D777_rn_lh%5BfyfcheZZZ%7B%7E%28%24%29H/*+%7E-%241%20H%21-ZZZKNLLMQOSQMMRNZZZ%5D777%5Ecl_%5Dny%5B%5D%5D_mmZZZZZZKNLLNNJJKNRRMZZZ%5D777_rn_lh%5BfyfcheZZZ%7B%7E%28%24%29H/*+%7E-%241%20H%21-ZZZKNLLNNKNJOJSKZZZ%5D777%5Ecl_%5Dny%5B%5D%5D_mmZZZZZZKNLLNNMLSNSKLZZZ%5D777_rn_lh%5BfyfcheZZZ222H%7B0%7D%23%7B%29H%21-ZZZKNLLNNMMLMJNJZZZ%5D777%5Ecl_%5Dny%5B%5D%5D_mmZZZZZZKNLLNOOJSKRKMZZZ%5D777_rn_lh%5BfyfcheZZZ%7B%7E%28%24%29H/*+%7E-%241%20H%21-ZZZKNLLNOOLSOMPNZZZ%5D777%5Ecl_%5Dny%5B%5D%5D_mmZZZZZZKNLLNOPJMROQLZZZ%5D777_rn_lh%5BfyfcheZZZ%7B%7E%28%24%29H/*+%7E-%241%20H%21-ZZZKNLLNOPMQSKNOZZZ%5D; _ga=GA1.2.487921595.1421941922; aurol=GA1.2.865695137.1421941922; __utma=239562643.487921595.1421941922.1422452658.1422454606.14; __utmz=239562643.1422443324.10.2.utmcsr=Sphere_myWebSite|utmccn=myWebSitefr_logo|utmcmd=Interne; kameleoonVisitIdentifier=rj1hnzh5ux1n2gxr/4; myWebSiteCook=\"869|\"; revelationDriveWin=2; myWebSite.hamon=1; __utmv=239562643.|1=visite_myWebSitedrive=239562643.487921595.1421941922.1422452658.1422454606.14=1; tosend=%7B%22p%22%3A%7B%22tracker%22%3A%22myWebSitedrive%22%2C%20%22url%22%3A%22rayon%22%2C%20%22mtime%22%3A1422455760000%2C%20%22ref%22%3A%22http%3A%2F%2Fwww.myWebSitedrive.fr%2Fdrive%2Frecherche%2Fbio%22%2C%20%22dest%22%3A%22http%3A%2F%2Fwww.myWebSitedrive.fr%2Fdrive%2FNice-Cote-dAzur-869%2FSurgeles-R41355%2FViandes-Volailles-41478%2F%22%7D%2C%22d%22%3A%7B%22dv%22%3A%22NA%22%7D%2C%20%22t%22%3A%7B%22iplobserverstart%22%3A%221422455762613%22%2C%22jsinit%22%3A%221422455763871%22%2C%22domload%22%3A%221422455764728%22%2C%22clicklink%22%3A%221422455817128%22%2C%22unload%22%3A%221422455817521%22%7D%7D; kameleoonExperiment-14570=86018/1422452656881/false; __utmc=239562643; rdmvalidation=1; layerDrivePromos=2; __utmb=239562643.19.10.1422454606; _gat=1; _gat_myWebSiteRollup=1; __utmt=1; __utmt_secondTracker=1; __utmli=toPage_14b30fac8d4_0';
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_REFERER, $refer);
curl_setopt($ch, CURLOPT_COOKIE, $cookie);
$content = curl_exec($ch);
curl_close($ch);
return $content ;
i want to know how can i post the same parametres with SCRAPY, is that a good idea for scraping a website with ajax pagination?
i tried this:
yield Request(sousUrl, headers={'Referer':'%s' % url}, callback=self.parse_page)
In Python you can use PyCurl
PycURL is a Python interface to libcurl.

PHP CURL - Why is my HTTP_REFERER script not working?

I need help fixing my script so that visitors through the script will show up in analytics as coming from tumblr.com.
When I visit my php page, it shows http://prntscr.com/aczmvi
Heres my script---->
<?php
print curl_spoof("http://whatsmyreferer.com/");
function curl_spoof($url)
{
$curl = curl_init();
$header[] = "Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
curl_setopt($curl, CURLOPT_URL, $url);
//Spoof the agent
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.17 Safari/537.36');
//Spoof the Referer
curl_setopt($curl, CURLOPT_REFERER, "http://tumblr.com");
curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($curl, CURLOPT_AUTOREFERER, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_TIMEOUT, 10);
if (!$html = curl_exec($curl))
{
$html = file_get_contents($url);
}
curl_close($curl);
return $html;
}
?>
Ideas?

Php CURL to work with cookies and session

function get_data($url,$proxy=Null){
$agents = array(
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:7.0.1) Gecko/20100101 Firefox/7.0.1',
'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100508 SeaMonkey/2.0.4',
'Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 6.0; en-US)',
'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7; da-dk) AppleWebKit/533.21.1 (KHTML, like Gecko) Version/5.0.5 Safari/533.21.1'
);
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: ";
$curl = curl_init();
curl_setopt($curl, CURLOPT_HTTPPROXYTUNNEL, 1);
curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
curl_setopt($curl,CURLOPT_USERAGENT,$agents[array_rand($agents)]);
curl_setopt($curl, CURLOPT_REFERER, "http://google.com/");
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, TRUE); ///** Follow Redirect
$html1 = curl_exec($curl);
curl_close($curl);
return $html1;
}
Above is my function and i am trying to get a page from proxy site
echo get_data('http://www.hostfast.info/browse.php?u=lZpnCp2dHRM0%2BnBp1Ljfmr8I%2BA%3D%3D&b=5');
But this is not working ....its giving me home page of that site and if i am trying new search its also not working... i am new to CURL ... but i think there is some thing to do with cookies ... how can i fix this
thx
To save cookie in cURL with PHP:
curl_setopt($curl, CURLOPT_COOKIEFILE, "yourcookiefile.txt");
curl_setopt($curl, CURLOPT_COOKIEJAR, "yourcookiefile.txt");
define('POSTURL', 'http://hostfast.info/includes/process.php?action=update');
define('POSTVARS', 'u=google.com/complete/search?output=toolbar&q=love'); // POST VARIABLES TO BE SENT
$ch = curl_init(POSTURL);
curl_setopt($ch, CURLOPT_POST ,1);
curl_setopt($ch, CURLOPT_POSTFIELDS ,POSTVARS);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION ,1);
curl_setopt($ch, CURLOPT_HEADER ,0); // DO NOT RETURN HTTP HEADERS
curl_setopt($ch, CURLOPT_RETURNTRANSFER ,1); // RETURN THE CONTENTS OF THE CALL
curl_setopt($ch, CURLOPT_COOKIEFILE, "yourcookiefile.txt");
curl_setopt($ch, CURLOPT_COOKIEJAR, "yourcookiefile.txt");
$Rec_Data = curl_exec($ch);
curl_close($ch);
echo $Rec_Data;
This works .. ;)

Changing a request's referer doesn't work

I'm trying to fake the referer of a request using:
<?php
$url = "http://www.blabla.com";
function doMagic($url)
{
$curl = curl_init();
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: ";
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:7.0.1) Gecko/20100101 Firefox/7.0.12011-10-16 20:23:00");
curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
curl_setopt($curl, CURLOPT_REFERER, "http://www.fakeRef.com");
curl_setopt($curl, CURLOPT_ENCODING, "gzip,deflate");
curl_setopt($curl, CURLOPT_AUTOREFERER, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_TIMEOUT, 10);
$html = curl_exec($curl);
echo 'Curl error: '. curl_error($curl);
curl_close($curl);
return $html;
}
$text = doMagic($url);
print("$text");
?>
I have a local apache server that I'm using to run this PHP script: localhost/script.php. The problem is that the actual referer (that Piwik reports) is localhost/script.php, not http://www.fakeRef.com.
What's the issue here?
The problem is that the actual referer (that Piwik reports) is localhost/script.php, not http://www.fakeRef.com.
What's the issue here?
You seem to be viewing the output of your curl operation in a browser. Then the explanation is simple. Piwik uses a tracking image to count your hit. The browser loads the tracking image; the image's referer will be your script, not the fake value you used to fetch the HTML code.
If you want to test whether setting the referer this way works, look into your server access logs. The script.php request there should contain the faked referer.

Categories