I am using PHP cURL to fetch data from a remote site. My script is:
<?php
$url = 'https://www.(url).com/';
$sleep = rand(10, 12);
sleep($sleep);
$agent= 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.110 Safari/537.36';
$ch = curl_init();
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8','accept-encoding:gzip, deflate, sdch','accept:image/webp,image/*,*/*;q=0.8'));
curl_setopt($ch, CURLOPT_PROXY, "x.x.x.x:x");
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_URL,$url);
$result=curl_exec($ch);
$mainPage = new simple_html_dom;
echo $mainPage->load($result);
But it returns a 403 Forbidden error in the response.
I tried including more complete User-Agent strings, but I still get this error in the response.
Thanks in advance for suggestions and comments.
I am trying to reuse a response header value on the next URL visit in a PHP cURL function.
For example, a visit to https://example.com/page1 gives a 302 redirect and returns a new header with a new value. I am trying to send this value with the request to the redirect URL, and again if that URL also answers with a 302.
My PHP code is:
// Build both headers in one array; assigning $header twice would drop the Authorization header.
$header = [
    "Authorization: Bearer $token",
    "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36",
];
$data = Curl("https://example.com/page1",$header);
function Curl($url,$head= [],$post=''){
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_TIMEOUT,8);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT,5);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
if($post):
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS,$post);
endif;
if($head)
curl_setopt($ch, CURLOPT_HTTPHEADER, $head);
$page = curl_exec($ch);
curl_close($ch);
return $page;
}
$ch2 = SSLCURL("https://www.tcpvpn.com/create-tcpvpn-account-server");
curl_setopt($ch2, CURLOPT_REFERER, "https://www.tcpvpn.com/free-vpn-server-continent-europe");
curl_setopt($ch2, CURLOPT_POST, 1);
curl_setopt($ch2, CURLOPT_POSTFIELDS, "server=115");
curl_setopt($ch2, CURLOPT_RETURNTRANSFER, 1);
$ex = curl_exec($ch2);
echo nl2br(str_replace("<","!!",$ex));
curl_close($ch2);
This is the code I use to access the website. I handle cookies, SSL access, redirects and the User-Agent (latest Chrome) in the SSLCURL function.
The thing is, when I access that website from my browser or even through Glype (a proxy script written in PHP), I can reach it without a problem, but every time I try to access it from my script, I just get a meta redirect. How can I fix it?
Edit: here is SSLCURL:
function SSLCURL($url) {
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36");
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_COOKIEFILE, dirname(__FILE__)."/jamjar.txt");
curl_setopt($ch, CURLOPT_COOKIEJAR, dirname(__FILE__)."/jamjar.txt");
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
"Accept-Language:tr-TR,tr;q=0.8,en-US;q=0.6,en;q=0.4",
"Connection:keep-alive",
"Upgrade-Insecure-Requests:1"
));
return $ch;
}
I am trying to retrieve the HTML from a user profile on Instagram using cURL.
I am new to cURL, so I do not know the cause of this error.
Nothing happens when the cURL request is executed; the page just seems to refresh.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://www.instagram.com/zohebchaudhry1/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookiess.txt');
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookiess.txt');
curl_setopt($ch ,CURLOPT_TIMEOUT , 10);
curl_setopt( $ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36" );
$html = curl_exec($ch);
curl_close($ch);
echo $html;
Above is the PHP cURL code.
It appears that cURL is working; however, you can't see the output because echoing the raw HTML makes the browser render it as part of your page instead of showing the markup.
I suggest replacing echo $html; with echo htmlentities($html);
Read more: php.net/htmlentities
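To make the difference concrete, here is a minimal sketch. The $html string below is a made-up sample standing in for whatever curl_exec() returned; it is not Instagram's actual markup.

```php
<?php
// Sample markup (an assumption) standing in for the string returned by curl_exec().
$html = '<h1 class="title">Profile & posts</h1>';

// htmlentities() converts <, >, & and double quotes into HTML entities,
// so the browser displays the markup as text instead of rendering it.
$escaped = htmlentities($html);

echo $escaped, "\n";
```

Viewed in a browser, the escaped string shows up as literal source text, which makes it easy to check what the remote server actually sent.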
Can anyone explain what is wrong with my code and how I can get the height value? I am trying to get the heights of celebrities. Any suggestions?
Thanks.
My code (Updated with CURL user agent setting as advised):
$url='https://www.google.com/webhp?ie=UTF-8#q=ailee+height';
//Set CURL user agent
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36');
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
$data = curl_exec($ch);
curl_close($ch);
//simple html dom
require_once('lib/simple_html_dom.php');
$html = str_get_html($data);
$height= $html->find('div[class="_eF"]',0)->innertext;
echo $height;
I get an empty result from the above code. In this case, I want it to return:
5' 5" (1.65 m)
The problem is that cURL doesn't process JavaScript, and Google serves a different page when JavaScript is disabled. In this case, the div changes to a span with a different class:
<span class="_m3b">1.65 m</span>
Also, the link you were using wasn't working for me.
Try this instead:
<?php
header('Content-Type: text/html; charset=utf-8');
$url='https://www.google.pt/search?q=ailee+height&num=10&gbv=1';
//Set CURL user agent
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36');
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
$data = curl_exec($ch);
curl_close($ch);
require_once('simple_html_dom.php');
$html = str_get_html($data);
$height= $html->find('span[class="_m3b"]',0)->innertext;
echo $height;
//1.65 m
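Since Google's class names change over time, the lookup can easily come back empty, and calling ->innertext on a missing element fails. Below is a rough sketch of a guarded lookup. Note that it swaps in PHP's built-in DOMDocument and DOMXPath instead of simple_html_dom so that it runs without the library; firstTextByClass is a hypothetical helper name, and the sample markup is just the span quoted above, not a live Google response.

```php
<?php
// Hypothetical helper: text of the first element carrying the given class, or null.
// The class name is inserted into the XPath query as-is, so it must not contain quotes.
function firstTextByClass(string $html, string $class): ?string
{
    if ($html === '') {
        return null; // DOMDocument::loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    // Collect parser warnings internally; real-world HTML is rarely well-formed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $query = sprintf(
        '//*[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null; // nothing matched, so the caller can report it
    }

    return trim($nodes->item(0)->textContent);
}

// Sample markup taken from the span shown above.
$sample = '<html><body><span class="_m3b">1.65 m</span></body></html>';
$height = firstTextByClass($sample, '_m3b');

echo $height ?? 'height not found', "\n";
```

With simple_html_dom the same idea applies: store the result of find(..., 0) in a variable and check that it is not null before reading innertext.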
I have been given a project that involves fetching data from this URL.
Simple HTML DOM has already failed for this, so I am now working with cURL:
function curl_download($Url){
if (!function_exists('curl_init')){
die('Sorry cURL is not installed!');
}
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $Url);
curl_setopt($ch, CURLOPT_REFERER, "www.idealo.de/preisvergleich/MainSearchProductCategory.html?q=0018208925063");
curl_setopt($ch, CURLOPT_USERAGENT, "MozillaXYZ/1.0");
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$output = curl_exec($ch);
curl_close($ch);
return $output;
}
print curl_download('www.idealo.de/preisvergleich/MainSearchProductCategory.html?q=0018208925063');
This code returns a blank page. Can anyone please help me?
The reason is that the User-Agent you used is too short to look like a real browser's.
Try this one instead:
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.130 Safari/537.38");
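If the page is still blank after changing the User-Agent, it can help to look at what cURL itself reports before guessing further. The following is only a sketch: describeTransfer and fetchWithDiagnostics are hypothetical helper names, and the wrapper repeats the options from the question rather than anything the site is known to require.

```php
<?php
// Hypothetical helper: turns the outcome of a cURL transfer into a short message.
// $body is the return value of curl_exec() (string, or false on failure).
function describeTransfer($body, int $status, string $error): string
{
    if ($body === false) {
        return "cURL error: $error";
    }
    if ($status >= 400) {
        return "HTTP $status returned by the server";
    }
    if ($body === '') {
        return "HTTP $status with an empty body";
    }
    return "HTTP $status, " . strlen($body) . " bytes";
}

// Hypothetical wrapper: same kind of request as in the question, plus diagnostics.
function fetchWithDiagnostics(string $url, string $userAgent): array
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);

    $body = curl_exec($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE); // 0 if no response arrived
    $error = curl_error($ch);                              // '' if no error occurred

    return [$body, describeTransfer($body, $status, $error)];
}

// Demonstrates the message for an empty 403 response without touching the network.
echo describeTransfer('', 403, ''), "\n";
```

Calling fetchWithDiagnostics() with the real URL and printing the second array element shows whether the blank page comes from a transport error, an error status, or a genuinely empty body.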