I cannot download this link using PHP cURL:
https://www.economy.gov.ae/PublicationsArabic/2%20%D9%86%D8%B4%D8%B1%D8%A9%20%D8%A7%D9%84%D8%B9%D9%84%D8%A7%D9%85%D8%A7%D8%AA%20%D8%A7%D9%84%D8%AA%D8%AC%D8%A7%D8%B1%D9%8A%D8%A9%20%D8%A7%D9%84%D8%B9%D8%AF%D8%AF%20199-%20%D8%A7%D9%84%D9%86%D8%B4%D8%B1%20%D8%B9%D9%86%20%D8%A7%D9%84%D8%B9%D9%84%D8%A7%D9%85%D8%A7%D8%AA%20%D8%A7%D9%84%D9%85%D9%82%D8%A8%D9%88%D9%84%D8%A9.pdf
I tried basic cURL and it didn't work; wget doesn't work either.
$ch = curl_init();
curl_setopt($ch,CURLOPT_URL,$url);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch,CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
curl_setopt($ch, CURLOPT_ENCODING, "utf-8");
echo $output=curl_exec($ch);
curl_close($ch);
I get an empty PDF, or a file of only 189 bytes.
The website issues a POST request before that GET request, and you need to reproduce ("forge") it first, because the application has its own logic. Once I did that, it worked correctly.
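A rough sketch of that idea follows. The warm-up URL here is an assumption, not taken from the site; check the actual request sequence (and any required POST fields) in your browser's developer tools:
// Hypothetical warm-up request: load a page on the site first so the server
// sets its session cookies, then fetch the PDF reusing the same handle/cookies.
$cookieJar = tempnam(sys_get_temp_dir(), 'cj');

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://www.economy.gov.ae/'); // assumed entry page, not confirmed
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieJar);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieJar);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
curl_exec($ch);

// Second request: the PDF itself, with the cookies gathered above.
curl_setopt($ch, CURLOPT_URL, $url); // the percent-encoded PDF URL from the question
$pdf = curl_exec($ch);
curl_close($ch);

file_put_contents('publication.pdf', $pdf);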
Try urldecode() on the URL before using cURL:
$ch = curl_init();
$url = urldecode($url);
curl_setopt($ch,CURLOPT_URL,$url);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch,CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
curl_setopt($ch, CURLOPT_ENCODING, "utf-8");
echo $output=curl_exec($ch);
curl_close($ch);
So I found a solution to a problem I was having (how to determine a domain's protocol): How to find the domain is whether HTTP or HTTPS (with or without WWW) using PHP?
Below are two versions of my code.
The first doesn't work as expected; it only echoes out my domains.
<?php
$url_list = file('urls.txt');
foreach ($url_list as $url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
    curl_exec($ch);
    $real_url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    echo $real_url;
}
?>
The second version of my code gives me an error about an invalid argument supplied for my foreach statement...
<?php
$fn = fopen("urls.txt", "r");
while (!feof($fn)) {
    $url_list = fgets($fn);
    foreach ($url_list as $url) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
        curl_exec($ch);
        $real_url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
        echo $real_url;
        #echo $result;
        fclose($fn);
    }
}
?>
What am I doing wrong?
The expected result is the output shown below, but with the domains read from a file.
The code comes from this reference: How to find the domain is whether HTTP or HTTPS (with or without WWW) using PHP?
<?php
$url_list = ['facebook.com', 'google.com'];
foreach ($url_list as $url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
    curl_exec($ch);
    $real_url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    echo $real_url; // add here your db commands
}
?>
Output:
test#linux: php domain-fuzzer.php
https://www.facebook.com/http://www.google.com/#
I found the solution; in case anyone runs into the same issue, here is the code:
<?php
$fn = file_get_contents("urls.txt");
// split the file contents into one URL per line
$url_list = explode(PHP_EOL, $fn);
foreach ($url_list as $url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
    curl_exec($ch);
    // CURLINFO_EFFECTIVE_URL is the final URL after redirects, so it carries the real scheme
    $real_url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    echo nl2br("$real_url \n");
}
?>
By default, fgets() returns a single line from the file, not an array of all lines.
So your code should be:
<?php
$fn = fopen("urls.txt", "r");
while (!feof($fn)) {
    // fgets() returns one line per call and keeps the trailing newline, so trim it
    $url = trim(fgets($fn));
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
    curl_exec($ch);
    $real_url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    echo $real_url;
    #echo $result;
}
// close the file once, after the loop, not inside it
fclose($fn);
?>
I am trying to retrieve the content of a page with PHP cURL. It works well on other websites, but on this website it does not work and I don't know why.
Here is my code:
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, 'https://ratings.fide.com/top.phtml');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
$result = curl_exec($curl);
curl_close($curl);
echo $result;
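One way to at least see what is going wrong (a small diagnostic sketch, not a fix for this particular site) is to check curl_error() and the HTTP status code after the request:
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, 'https://ratings.fide.com/top.phtml');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
$result = curl_exec($curl);

// curl_error() is non-empty for transport-level failures (DNS, TLS, timeouts);
// CURLINFO_HTTP_CODE shows whether the server answered with 403, 503, etc.
if ($result === false) {
    echo 'cURL error: ' . curl_error($curl);
} else {
    echo 'HTTP status: ' . curl_getinfo($curl, CURLINFO_HTTP_CODE);
}
curl_close($curl);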
I am trying to use cURL to get a response from a web page, but I get different responses when using cURL and when browsing normally.
PHP file:
$ch = curl_init();
echo "trying<br>";
//$url = "home.iitk.ac.in/~gopi/student_search/feedback.php";
$roll_no = "11101";
$name="";
$program="all";
$department="all";
$email="";
$gender="both";
$city="";
$course="";
$order="id";
$hostel="";
$bg='';
$tile = '0';
$offset = 0;
$url = "http://search.junta.iitk.ac.in/get2.php?&tile=0&roll_no=".$roll_no."&name=".$name."
&program=".$program."&dept=".$department."&login=".$email."&gender=".$gender."
&city=".$city."&course=".$course."&hostel=".$hostel."&bg=".$bg."&offset=".$offset;
echo $url;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($ch);
$ret_val = curl_error($ch);
echo $result;
echo $ret_val;
curl_close($ch);
Running this gives me a different page (screenshot omitted), but when I go directly to the same URL in a browser, it gives me 44 results (screenshot omitted).
How do I get the same result using cURL?
Edit: the following doesn't work either.
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_REFERER, 'http://search.junta.iitk.ac.in');
curl_setopt($ch,CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
$result = curl_exec($ch);
It looks like the targeted server checks the user agent and, if it is not a real browser, turns the request away (or generally behaves differently).
Try specifying the user agent; see http://curl.haxx.se/docs/manpage.html.
For example:
curl -A "Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_3_3 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8J2 Safari/6533.18.5" http://www.apple.com
In PHP:
curl_setopt($ch,CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
It works great, even when I just type it into the browser's address bar.
But now, in my PHP script, when I build the URL it is not working.
Yellow pages API:
http://api2.yp.com/listings/v1/search?searchloc=91203&term=pizza&format=json&sort=distance&radius=5&listingcount=10&key=xxxxxxxxxx
Here is my code snippet:
$apiURL = 'http://api2.yp.com/listings/v1/search?searchloc=91203&term=pizza&format=json&sort=distance&radius=5&listingcount=10&key=xxxx';
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_URL,$apiURL);
curl_setopt($ch,CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
$data = curl_exec($ch);
curl_close($ch);
var_dump($data);
Thanks in advance
Add this cURL option:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION,1);
That is because you are getting a 301 redirect from that URL, so you need to follow it. [Personally tested]
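For completeness, a minimal sketch of the full call with the redirect followed and the JSON body decoded (the key is the same redacted placeholder as above; json_decode() is just the standard way to read a format=json response):
$apiURL = 'http://api2.yp.com/listings/v1/search?searchloc=91203&term=pizza'
        . '&format=json&sort=distance&radius=5&listingcount=10&key=xxxx';

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $apiURL);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow the 301 redirect
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
$data = curl_exec($ch);
curl_close($ch);

// format=json was requested, so decode the body into an array before using it
$listings = json_decode($data, true);
var_dump($listings);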
I am trying to use PHP cURL to fetch an Amazon web page, but I get HTTP/1.1 503 Service Temporarily Unavailable instead. Is Amazon blocking cURL?
http://www.amazon.com/gp/offer-listing/B003B7Q5YY/
<?php
function get_html_content($url) {
    // fake user agent
    $userAgent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.2) Gecko/20070219 Firefox/2.0.0.2';
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_HEADER, 1);
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookies.txt');
    curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookies.txt');
    $string = curl_exec($ch);
    curl_close($ch);
    return $string;
}

echo get_html_content("http://www.amazon.com/gp/offer-listing/B003B7Q5YY");
?>
I use this simple approach:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $offers_page);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 GTB5');
$html = curl_exec($ch);
curl_close($ch);
But I have another problem: if you send a lot of queries to Amazon, they start sending you a 500 page.
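A minimal sketch of one way to soften that, assuming a fixed delay plus a single retry is acceptable (the delay values and retry count are arbitrary choices, not anything Amazon documents):
// Fetch a list of offer pages with a pause between requests, retrying once on a 5xx answer.
$offer_pages = array(/* ... your offer-listing URLs ... */);

foreach ($offer_pages as $offers_page) {
    $html = false;
    for ($attempt = 0; $attempt < 2; $attempt++) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $offers_page);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 GTB5');
        $html = curl_exec($ch);
        $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_close($ch);

        if ($status < 500) {
            break;      // got a usable answer (or a non-retryable error)
        }
        sleep(10);      // arbitrary back-off before the single retry
    }
    // ... parse $html here ...
    sleep(3);           // arbitrary pause between pages to avoid hammering the site
}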