I'm trying to load Facebook content with cURL, but I get the login page instead of my timeline, even though I'm already signed in. Is this related to cookies? What am I doing wrong? There are no errors in this code...
$ch = curl_init('https://www.facebook.com');
curl_setopt($ch, CURLOPT_POST, true );
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true );
$header = array();
$header[] = "Accept-Language: pt-br,pt;q=0.5";
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7");
curl_setopt($ch, CURLOPT_HEADER, false );
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true );
curl_setopt($ch, CURLOPT_VERBOSE,true);
$data = curl_exec( $ch );
echo $data;
I hope you can help me. Thank you!
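PHP is server-side: your server runs the PHP code, makes the cURL request itself, and only then returns the output to the browser. The request therefore does not carry your browser's Facebook session, which is why you see the login page. To show a Facebook timeline you need to load it in the browser, for example in an iframe.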
Related
I want to get the HTML content from this link:
https://store.nike.com/in/en_gb/pw/boys-shoes/7pvZoi3
For this I have written the following PHP cURL script:
$ua = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.A.B.C Safari/525.13';
$ch = curl_init();
curl_setopt($ch,CURLOPT_URL, 'https://store.nike.com/in/en_gb/pw/boys-shoes/7pvZoi3');
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_USERAGENT, $ua);
curl_setopt($ch, CURLOPT_COOKIE, '<Pasted_cookie>');
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_MAXREDIRS, 20);
$result = curl_exec($ch);
$last = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
curl_close($ch);
print_r($result);
But the script above gets redirected to a page that asks me to select a region.
What do I need to change to make the script work?
Thanks.
Selecting a region always triggers some network call that stores your location in a cookie or somewhere else; the details depend entirely on the site.
What you can do is find that call (for example in the browser's network tab), replay it first to set the location, and then request the main page with the same cookies.
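The two-step approach described above might be sketched as follows. The helper names `sessionOptions` and `fetchWithSession` are made up for this example, and the actual region-setting call has to be found on the site itself. The point is only that both requests share one cookie file.

```php
<?php
// Hypothetical helper: cURL options that make every request read from and
// write to the same cookie file, so cookies set by one call reach the next.
function sessionOptions(string $url, string $cookieFile): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies are saved here when the handle is released
        CURLOPT_COOKIEFILE     => $cookieFile, // cookies are loaded from here
    ];
}

// Hypothetical helper: run one request with the shared cookie file.
function fetchWithSession(string $url, string $cookieFile)
{
    $ch = curl_init();
    curl_setopt_array($ch, sessionOptions($url, $cookieFile));
    $body = curl_exec($ch); // false on failure
    curl_close($ch);
    return $body;
}
```

Usage would be to call `fetchWithSession($regionUrl, $jar)` first and `fetchWithSession($pageUrl, $jar)` second, where `$regionUrl` stands for whatever request sets the region on the site and `$jar` is a writable file path.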
I'm new to PHP and trying to program a few things.
As a first step I want to show the following page on my website: "https://www.mytischtennis.de/public/home". I am using cURL to grab the page, but every time I try to output it I get a blank page.
My code looks like this:
<?php
function grab_page($site){
$ch = curl_init();
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.52 Safari/537.17');
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 40);
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch, CURLOPT_URL, $site);
// Store the response so the handle can be closed before returning
$result = curl_exec($ch);
curl_close($ch);
return $result;
}
echo grab_page("https://www.mytischtennis.de/public/home");
echo "hallo";
?>
This code works with other pages. Only "https://www.mytischtennis.de/public/home" won't work for me.
Can someone explain why I get a blank page only with this site?
Thank you :)
You need to set two more options in your curl request:
// Add some more headers here if you need to
$headers = [
"Accept-Encoding: gzip, deflate, br"
];
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
// Decode the response automatically when needed
curl_setopt($ch, CURLOPT_ENCODING, '');
After this you will get the page you want.
I'm creating a script that scrapes the site www.piratebay.se. The script was working fine two or three days ago, but now I'm having problems with it.
This is my code:
$URL = 'http://thepiratebay.se';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $URL);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1");
curl_setopt($ch, CURLOPT_COOKIE, "language=pt_BR; c[thepiratebay.se][/][language]=pt_BR");
$fonte = curl_exec ($ch);
curl_close ($ch);
echo $fonte;
The response of this code is not clean HTML, but looks like this instead:
��[s۸N>��k�9��-ىmI7��$�8�.v��͕���$h���y�G�Sg:ӷ>�5����ʱ�aor&���.v)���������) d�w��8w�l����c�u""1����F*G��ِ�2$�6�C�}��z(bw�� 4Ƒz6�S��t4�K��x�6u���~�T���ACJb��T^3�USPI:Mf��n�'��4��� ��XE�QQ&�c5�`'β�T Y]D�Q�nBfS�}a�%� ���R) �Zn��̙ ��8IB�a����L�
I already tried setting the user agent in .htaccess, PHP and cURL, without success.
Add this:
curl_setopt($ch, CURLOPT_ENCODING , "gzip");
Tested on my local environment, works fine with it.
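The garbled output is most likely a gzip-compressed body printed raw. As a small illustration that needs no network access, the sketch below compresses a sample string with `gzencode`, checks for the gzip magic bytes, and restores it with `gzdecode`. The helper name `looksGzipped` and the sample HTML are made up for the example, and the zlib extension is assumed to be available.

```php
<?php
// Hypothetical helper: gzip streams start with the two bytes 0x1f 0x8b.
function looksGzipped(string $body): bool
{
    return strncmp($body, "\x1f\x8b", 2) === 0;
}

$html       = '<html><body>Hello</body></html>'; // sample data, not a real response
$compressed = gzencode($html);                   // what a server sends with Content-Encoding: gzip
$decoded    = null;

if (looksGzipped($compressed)) {
    // Manual equivalent of the decoding cURL performs when CURLOPT_ENCODING is set
    $decoded = gzdecode($compressed);
}
```

With `CURLOPT_ENCODING` set on the handle, cURL performs this decoding itself, so the manual step is only useful for understanding or debugging.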
URL:
You can see the URL here (I put it in a pastebin because it is quite long).
cURL & headers:
$header=array();
$header[]="Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$header[]="Accept-Encoding: gzip, deflate";
$header[]="Accept-Language: en-US,en;q=0.5";
$header[]="Connection: keep-alive";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7');
curl_setopt($ch, CURLOPT_AUTOREFERER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_exec($ch);
Result:
Error 400--Bad Request
From RFC 2068 Hypertext Transfer Protocol -- HTTP/1.1:
10.4.1 400 Bad Request
The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.
Opening the URL directly in the browser, without cURL:
Displays correctly.
There is a problem with your URL; chances are it was computed incorrectly.
If you're generating that long URL from your script, make sure it's the right one.
The reason I suspect this: if you start deleting parts of it, say you end up with https://wftc3.e-travel.com/plnext/garuda-indonesia/Override.action, you will see that accessing this page also results in a 400 error.
I hope this helps.
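If the URL is built by a script, a quick sanity check before the request can catch obvious mistakes such as stray whitespace or a line break inside the string. This is only a sketch: the helper name `urlLooksWellFormed` is made up, and passing the check does not prove the server will accept the URL.

```php
<?php
// Hypothetical helper: reject URLs containing whitespace or control
// characters, then let PHP's built-in validator check the overall shape.
function urlLooksWellFormed(string $url): bool
{
    if (preg_match('/[\s\x00-\x1f\x7f]/', $url) === 1) {
        return false;
    }
    return filter_var($url, FILTER_VALIDATE_URL) !== false;
}

var_dump(urlLooksWellFormed("https://example.com/page?a=1"));      // a plain URL
var_dump(urlLooksWellFormed("https://example.com/page?a=1\nb=2")); // line break inside the string
```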
Edit: this works, so the problem is probably your $url.
<?php
$url = "https://wftc3.e-travel.com/plnext/garuda-indonesia/Override.action?SITE=CBEECBEE&LANGUAGE=ID&EMBEDDED_TRANSACTION=FlexPricerAvailability&ENCT=1&ENC=BE37D8CC9CE37D25AB7C16ED9DC32B9AB70052C76CC99D6460FB6990C37619D08A86D19D12DB2E39B1C2572C7C97A890E4D0CA079A35CB075FC284C469128F210A361D6121DEF1E64C3E153CCF855158302AA41136F317A8F143E4A2DDFEF68DB413BD337613EF92A4809626A4E3CB107A5666859C612C388539292CD16A1FF421C143F7D74A504845EBC98B1476E79EB32DFB32E46B43ADF0B514FB472B0258C41F696441043714660B3493F3395E8329B93A0C4E7EA3E1E466025EAE4AE2562754B6324C4C4CFEDC3CA4548A17AAFA500FC0A331F1FB1B770FE91E31B88D5C391DA66B00A5AD8F83D02BBB962F35B6BAAAAE34984EF07693352467B3AC7C1F62D8AA70B791C71CAC7AD4E4744563A096471F0CF79FF28425EFD39C2A58D0D52F279632268E3FBB1217DB8DF5A61181D466B62CA13CC7044EC9E90A550DC3CFD3A28EBC4FB9FC451D0C34BA94E7CF46B5FF1C9E1ADECA8EAE477B1112AA911E8826C07311B033F5D3AD39F26814A7399072805235049E856C9BC9E9C0819AC596471F0CF79FF28425EFD39C2A58D0D5A7B93A6BBADB799F3B3B95A975D76E523BD3C538C3B91D308FC57D8F84ACD46D57A25DAD2528B4486D0DC651B85D1DD27680F4762A813920C0D7DBF02676A659F6479A9F3A48B7202F10A44379467113DC817B3F3908F5F21D13389934D53CCDC787523D2953A5401152E090735051220AEA4FC852F0A20BCD957F8F2BDA35C0C9AC95AF6075C2923C1FA881D3D3C0484BA6740A4CCF8CBB8FB1E0C2FC9C1B0E35764D079EA758ED28405DDE81CB538B69A5C75758DA45C03F45EEF0D75B6E714ABED40ED5E467D99F4CBBFEAAAC28EB7BD54AA4E28454445AE822E53EDC3DAD7634FDBC01F64DB410A450D54EECBE4F9BB8BB8FA9E2B8CFE4CD88019D3BCAE97A041BDB6C72AD195FB68212FEC44CE587C5CA13B74686FD62FEE6AA43C4FA3DE765A4EAA2034043E2CB24F5BD0B48F771F51FDD8197D0C0B6DF85DD8D5EBED594F80B56C7963080333C519C67B88961921D48431AF465CA9A94060E8DC600EE3BDBAFEB22FBE9E105C6F386A0580A6F6E7FC39BC0D3F12A253DD73581FC7C6FA4EF58293CEE6869B817EE1F1D4801A3282C9F857924158FB6D2EFC7A02057155B69A8271F69B754AE7D978062AD01AC449A3D598CFB37921ECC4932CFE4A19A891C29A0B1C234E3950520529F97DDD2FA5793ADEF0FC1D327C3E38C77455FF12AF99DF582CB6BE66F9FA601DF00AD3EBE281CCC3B9BA63A47860E793A6D5002486A06345EB691B25214
91F8694797EE0EFEC1C90C082B815D23EE0E46E4CA6A9EB06EB1483FC07C1D7B17818AA8B20F16223C113ACBB81B628BE6EDDA4E96D559E7A7BA9A1BCD31FB3FEEAE509704B54B426646A42CA6F7C75E85BBB32FA49E60102E76D13F7961343025E44CF14705EF7424EC3578B294BF87D34DB49040CBCEABC06B466033A4AB5BEF9660F69B68BFB71206446F8A8EDECD068C8EAE159840BE226495914996D001BC6872525FB8D5A43A545CCF106EA9E823CEEE64F6955AAF3340E15DE72ED4D1865D63C9D85C3B0CC627381311163D08103D86C0392C1FDCD7065892EF3519C6A802940125B7D6C167C10E3D4750BC762DE1F10A15C0C8FD23A77E1E1310AE9CE073CA809773C21794EC1F190E868C513B83CB35EDBDDC31297078D472BE9A37C2F70A1BE31C0A5042E89214851AC675AEB34690ADED5AFA187CE56AE5D3270B07B6986EF5FEF05AB2C4BF44F5281CC3779E98EAE5F090AA07928D4FFEC8A893799F5BB3BA57ED47422E7532DB4F570F7E2CF8F2C9FF76CE87DBEA84738C535794EFD373A1080D7E12CB3B7C37AA566D663E54CCCBDCFE9E7970B61AB40F02A528F2107E9DEBD6B0795D766FB16AA71E0BF1091F2F897BBE39B1E11B3B610B5DF0CF98ABDE6A9B1D5C5784144D68A4629FDD409B7D6349D888162741633A718ED89B555EB147B67A79B06055E02BFBCEC56CD768BEFD38391AE4B7F13CD3AC6D9FADE73C1C2E313B83FDE3FAB3D60BE111D43EFA7565D5614427F7F0CBD7913C0E9496FA2978868C1D983C14212C987D6B0E38BEF1701B4120E6147BB88E776E1C05574475A7E44F4D11963189DB5BA6EEDB6E514D543BA8CA23A216AF3C5E876E99BCBD46F3B066A5BCE4FDBAA0CA012DFEF2A256652B8DE8AF04A0C27E58379BB1768602DBE55717B38AF3EDB8570FD4A9CC80D7D27A41AA2AF727C833A46583C3955E5BD0CE289BAF1F9AFD9415619A00EE2E965A46AE7891A4F3A303F5E44183DD542F13";
$header=array();
$header[]="Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$header[]="Accept-Encoding: gzip, deflate";
$header[]="Accept-Language: en-US,en;q=0.5";
$header[]="Connection: keep-alive";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7');
curl_setopt($ch, CURLOPT_AUTOREFERER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_ENCODING , "gzip");
$x = curl_exec($ch);
die(($x));
I tried to use file_get_contents and cURL to get the content of a website. I also tried to open the same site using Lynx and could not get the content. I got a 406 Not Acceptable; it seems that the site checks whether I'm using a browser. Is there a workaround?
It probably expects the user agent to be a web browser. You can set this easily using cURL:
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
Where $useragent is the string you want to use for a user agent. Try it with some common ones for the major browsers and see if that helps. This page lists some common user agents.
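Since the question also mentions `file_get_contents`, the same idea can be sketched there with a stream context. The user agent string and the `Accept` header below are example values, and `https://example.com/` in the comment is a placeholder; whether a given site accepts them is something to test.

```php
<?php
// Browser-like user agent and Accept header for PHP's HTTP stream wrapper.
$useragent = 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13';

$context = stream_context_create([
    'http' => [
        'user_agent' => $useragent,
        'header'     => "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n",
    ],
]);

// The context is passed as the third argument, for example:
// $html = file_get_contents('https://example.com/', false, $context);
```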
// make a call to the webpage to get his handicap
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.golfspain.com/portalgolf/HCP/handicap_resul.aspx?sLic=CB00693474");
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 60);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE );
curl_setopt($ch, CURLOPT_REFERER, "http://google.com" );
curl_setopt($ch, CURLOPT_HEADER, TRUE );
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13');
$header = array(
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7',
'Accept-Language: en-us;q=0.8,en;q=0.6'
);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
$html = curl_exec($ch);
curl_close($ch);
$doc = new DOMDocument();
$doc->strictErrorChecking = FALSE;
$doc->loadHTML($html);
$xml = simplexml_import_dom($doc);
Maybe you have to set some more HTTP headers like a 'real' browser. With cURL:
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13');
$header = array(
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7',
'Accept-Language: en-us;q=0.8,en;q=0.6'
);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);