file_get_contents from specific URL - php

I have an API key that is validated against the request URL.
If I run:
echo file_get_contents('http://myfilelocation.com/?apikey=1234');
RESULT : this api key is not authorized for this domain
However, if I put the requested URL within an iframe with the same URL:
RESULT : this api key is authorized
Obviously, the server that returns the requested JSON data is working properly, because the iframe outputs the correct information. However, how can I verify that PHP is making the request with the proper domain and URL settings?
With file_get_contents I always get back that the API key is not authorized, even though I'm running the PHP script from the authorized domain.

Try this PHP code:
<?php
$options = array(
'http'=>array(
'method'=>"GET",
'header'=>"Host: myfilelocation.com\r\n". // Don't forget to replace this with your domain
"Accept-language: en\r\n" .
"User-Agent: Mozilla/5.0 (iPad; U; CPU OS 3_2 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Version/4.0.4 Mobile/7B334b Safari/531.21.10\r\n"
)
);
$context = stream_context_create($options);
$file = file_get_contents("http://myfilelocation.com/?apikey=1234", false, $context);
?>

file_get_contents doesn't send any referrer information by default, and the API may need it. This may help you:
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://myfilelocation.com/?apikey=1234');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_REFERER, 'http://authorized-domain.here');
$html = curl_exec($ch);
echo $html;
?>
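The two answers above can be combined without cURL. The following is a minimal sketch, assuming the API checks the Referer and User-Agent headers; the helper name build_referer_context, the referer value, and the client string are placeholders invented for illustration, not anything the API is known to require.

```php
<?php
// Hypothetical helper: builds a stream context that sends Referer and
// User-Agent headers with a file_get_contents() request.
function build_referer_context(string $referer, string $userAgent)
{
    $headers = [
        'Referer: ' . $referer,
        'User-Agent: ' . $userAgent,
        'Accept: application/json',
    ];

    return stream_context_create([
        'http' => [
            'method'  => 'GET',
            // Header lines are joined with CRLF, as the http wrapper expects.
            'header'  => implode("\r\n", $headers) . "\r\n",
            'timeout' => 10, // seconds; an arbitrary choice for this sketch
        ],
    ]);
}
```

The returned context would then be passed as the third argument, for example file_get_contents('http://myfilelocation.com/?apikey=1234', false, $context). Whether the key is accepted still depends on what the server actually checks.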

Related

file_get_contents does not work for getting json from API in PHP 7

I just upgraded PHP from 5.6 to 7.2.
Before 7.2, file_get_contents worked fine for getting JSON from the API, but after the upgrade to 7.2
it returns false.
file_get_contents($url) => false
The url is like this:
'https://username:password@project_domain/api/json/xxx/?param_a=' . $a . '&param_b='. $b
I didn't even touch the default php.ini setting that is probably related to file_get_contents:
allow_url_fopen = On
I googled this, but found no straight answer for my problem.
What is the reason for this?
How can I fix it?
Thanks!
$url = "https://www.f5buddy.com/";
$options = array(
'http'=>array(
'method'=>"GET",
'header'=>"Accept-language: en\r\n" .
"User-Agent: Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_0 like Mac OS X; en-us) AppleWebKit/532.9 (KHTML, like Gecko) Version/4.0.5 Mobile/8A293 Safari/6531.22.7\r\n"
));
$context = stream_context_create($options);
$file = file_get_contents($url, false, $context);
$file=htmlentities($file);
echo json_encode($file);
I finally got it working with cURL. It only worked when I skipped SSL peer verification. It is just HTTPS to my own project anyway, so it shouldn't be a security problem.
function getJsonFromAPI($url) {
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$result = curl_exec($ch);
curl_close($ch);
$data = json_decode($result);
return $data;
}
By the way, I found that file_get_contents only worked over HTTPS for external URLs, not for HTTPS connections to the project itself. Any fix for that is appreciated.
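One way to narrow this down is to capture the warning that file_get_contents raises instead of only seeing false. The sketch below is an assumption about how that could look, not a confirmed fix for this setup: fetch_with_diagnostics is a made-up helper name, and the optional CA bundle path is something you would have to supply yourself (for example the certificate authority that signed the project's own certificate). It keeps peer verification on rather than disabling it.

```php
<?php
// Hypothetical helper: fetch a URL and report why the request failed, if it did.
function fetch_with_diagnostics(string $url, ?string $caFile = null): array
{
    // Keep certificate verification enabled.
    $ssl = ['verify_peer' => true, 'verify_peer_name' => true];
    if ($caFile !== null) {
        // Path to a PEM bundle containing the CA that signed the server certificate.
        $ssl['cafile'] = $caFile;
    }
    $context = stream_context_create(['ssl' => $ssl]);

    // Forget older errors so error_get_last() reflects this call only.
    error_clear_last();
    $body = @file_get_contents($url, false, $context);

    if ($body === false) {
        $error = error_get_last();
        return [
            'ok'    => false,
            'body'  => null,
            'error' => $error['message'] ?? 'unknown error',
        ];
    }

    return ['ok' => true, 'body' => $body, 'error' => null];
}
```

If the 'error' entry mentions certificate verification, the problem is likely the certificate chain of the project's own host rather than file_get_contents itself.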

File_get_contents, curl not working

Something strange is going on, and I would like to know why.
The URL http://api.promasters.net.br/cotacao/v1/valores?moedas=USD&alt=json works well in the browser, but when I tried to retrieve the content with PHP:
echo file_get_contents('http://api.promasters.net.br/cotacao/v1/valores?moedas=USD&alt=json');
it printed nothing, and var_dump(...) showed string(0) "", so I went a little further and used:
function get_page($url) {
$curl = curl_init();
curl_setopt($curl, CURLOPT_RETURNTRANSFER, True);
curl_setopt($curl, CURLOPT_URL, $url);
$return = curl_exec($curl);
curl_close($curl);
return $return;
}
echo get_page('http://api.promasters.net.br/cotacao/v1/valores?moedas=USD&alt=json');
That also printed nothing, so I tried Python (3.x):
import requests
print(requests.get('http://api.promasters.net.br/cotacao/v1/valores?moedas=USD&alt=json').text)
And it WORKED. Why is this happening? What's going on?
It looks like they're blocking requests based on the user agent, or the lack of one, considering that PHP's cURL and file_get_contents don't seem to set that value in the request header.
You can fake this by setting it to something like Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:7.0.1) Gecko/20100101 Firefox/7.0.1
<?php
function get_page($url) {
$curl = curl_init();
curl_setopt($curl, CURLOPT_RETURNTRANSFER, True);
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl,CURLOPT_USERAGENT,'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:7.0.1) Gecko/20100101 Firefox/7.0.1');
$return = curl_exec($curl);
curl_close($curl);
return $return;
}
echo get_page('http://api.promasters.net.br/cotacao/v1/valores?moedas=USD&alt=json');
I experienced the same behaviour.
Fetching the URL using the CLI Curl worked for me.
I then wrote a script with a file_get_contents call to another script that dumped all request headers to a file using getallheaders:
<?php
file_put_contents('/tmp/request_headers.txt', var_export(getallheaders(),true));
Output of file:
array (
'Host' => 'localhost',
)
I then inspected the curl request headers,
$ curl -v URL
And tried adding one at a time to the file_get_contents request. It turned out a User agent header was needed.
<?php
$opts = array(
'http'=>array(
'method'=>"GET",
'header'=>
"User-Agent: examplebot\r\n"
)
);
$context = stream_context_create($opts);
$response = file_get_contents($url, false , $context);
This gave me a useful response.
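When a server rejects a request like this, it can help to see the status code instead of an empty string. The following sketch assumes the http wrapper's user_agent and ignore_errors context options; the helper names status_code_from_headers and build_probe_context are invented for illustration.

```php
<?php
// Hypothetical helper: pull the numeric status code out of the header lines
// that the http wrapper stores in $http_response_header.
function status_code_from_headers(array $headerLines): ?int
{
    $status = null;
    foreach ($headerLines as $line) {
        // After redirects there can be several status lines; keep the last one.
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];
        }
    }
    return $status;
}

// Hypothetical helper: a context that sets a User-Agent and still returns
// the response body when the server answers with a 4xx or 5xx status.
function build_probe_context(string $userAgent)
{
    return stream_context_create([
        'http' => [
            'user_agent'    => $userAgent,
            'ignore_errors' => true,
        ],
    ]);
}
```

After calling file_get_contents($url, false, build_probe_context('examplebot')), passing $http_response_header to status_code_from_headers shows whether the server answered with, say, a 403 rather than a 200.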

Send request with user ip when scraping data in php

I am stuck on a problem. I have a URL with a geolocation restriction: it can only be viewed from Europe or the USA, and I am located in Asia. I want to extract all hrefs from the URL.
I am using cURL, but the problem is that it sends the server's IP address. I want the request to be made with the user's IP address, in order to track which links a user has visited. If you can guide me on how to send the request with the user's IP address, and without using cURL, I'll be grateful.
The source code follows. The URL I am accessing is:
http://partnerads.ysm.yahoo.com/ypa/?ct=2&c=000000809&u=http%3A%2F%2Ftrouve.autocult.fr%2F_test.php%3Fq%3Dtarif%2520skoda%2520superb%2520combi&r=&w=1&tv=&tt=&lo=&ty=&ts=1458721731523&ao=&h=1&CoNo=3292b85181511c0a&dT=1&er=0&si=p-Autocult_FRA_SERP_2%3A600x796
<?php
include_once 'simple_html_dom.php';
$html = file_get_html('iframe.html');
// find iframe from within html doc
foreach($html->find('iframe') as $iframe)
{
$src = $iframe->getAttribute('src'); // src extracted
$ch = curl_init(); // Initialise a cURL handle
// Set any other cURL options that are required
curl_setopt($ch, CURLOPT_HEADER, TRUE);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, TRUE);
curl_setopt($ch, CURLOPT_COOKIESESSION, TRUE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_USERAGENT,'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36');
curl_setopt($ch, CURLOPT_URL,$src);
$results = curl_exec($ch); // Execute a cURL request
//echo curl_error($ch);
curl_close($ch); // Closing the curl
// Collect every href in the response; preg_match_all avoids looping forever on the same first match
if(preg_match_all('/<a[^>]+href=([\'"])(.+?)\1[^>]*>/i', $results, $matches))
{
foreach($matches[2] as $href)
{
// print the captured group, which is the URL you are searching for
echo $href.'<br>'.'<br>'.'<br>'.'<br>';
}
}
}
You can use a proxy.
$ip = '100.100.100.100:234'; // example $ip
curl_setopt($ch, CURLOPT_PROXY,$ip);
without curl:
$aContext = array(
'http' => array(
'proxy' => 'tcp://'.$ip,
'request_fulluri' => true,
),
);
$cxContext = stream_context_create($aContext);
$sFile = file_get_contents("http://www.google.com", False, $cxContext);
If you're looking for proxies, there are some addresses that are easy to scrape:
'http://proxylist.hidemyass.com/',
'http://ipaddress.com/proxy-list/',
'http://nntime.com/proxy-ip-'.$i.'.htm',
'http://www.proxylisty.com/ip-proxylist-'.$i
over 2000 ips
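For the link-extraction step in the question, a DOM parser tends to be more forgiving than a regular expression. This is a minimal sketch using PHP's bundled DOM extension instead of simple_html_dom; the helper name extract_hrefs is an assumption made for illustration.

```php
<?php
// Hypothetical helper: collect every href attribute from <a> elements
// found in an HTML string.
function extract_hrefs(string $html): array
{
    if (trim($html) === '') {
        return []; // nothing to parse
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        if ($anchor->hasAttribute('href')) {
            $hrefs[] = $anchor->getAttribute('href');
        }
    }
    return $hrefs;
}
```

With CURLOPT_HEADER enabled as in the question, the response headers are prepended to the body, so they would need to be stripped (or the option turned off) before passing the result to this helper.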

Can't load the XML file?

http://westwood-backup.com/podcast?categoryID2=403
This is the XML file that I want to load and echo via PHP. I tried file_get_contents and load. Both return an empty string. If I change the URL to another XML file, the functions work fine. What can be special about this URL?
<?php
$content = file_get_contents("http://westwood-backup.com/podcast?categoryID2=403");
echo $content;
?>
Another try with load, same empty result.
<?php
$feed = new DOMDocument();
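if (@$feed->load("http://westwood-backup.com/podcast?categoryID2=403")) {
$xpath = new DOMXpath($feed);
$linkPath = $xpath->query("/rss/channel/link");
// query() returns a DOMNodeList, so echo the text of the first match if there is one
if ($linkPath->length > 0) { echo $linkPath->item(0)->nodeValue; }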
}
?>
Use CURL and you can do it like this:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,'http://westwood-backup.com/podcast?categoryID2=403');
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 2);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/1.22 (compatible; MSIE 2.0d; Windows NT)');
$xml = curl_exec($ch);
curl_close($ch);
$xml = new SimpleXMLElement($xml);
echo "<pre>";
print_r($xml);
echo "</pre>";
This prints the parsed feed.
I think the server implements a "User-Agent" check to make sure the XML data is only loaded within a browser (not via bots, file_get_contents, etc.),
so by using cURL and setting a dummy user agent, you can get around the check and load the data.
You need to set a User-Agent header that the server is happy with. There is no need for cURL if you don't want to use it; you can use stream_context_create with file_get_contents:
$options = array(
'http'=>array(
'method'=>"GET",
'header'=>"Accept-language: en\r\n" .
"User-Agent: Mozilla/5.0 (iPad; U; CPU OS 3_2 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Version/4.0.4 Mobile/7B334b Safari/531.21.10\r\n" // i.e. an iPad
)
);
$context = stream_context_create($options);
$content = file_get_contents("http://westwood-backup.com/podcast?categoryID2=403", false, $context);
echo $content;
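The DOMDocument attempt from the question can be given a User-Agent as well. The sketch below assumes libxml_set_streams_context is used to hand a stream context to DOMDocument::load; the helper names channel_link_from_xml and load_feed_with_user_agent are hypothetical, and whether this particular server accepts the chosen user agent is not something the code can guarantee.

```php
<?php
// Hypothetical helper: read /rss/channel/link from an RSS document given as a string.
function channel_link_from_xml(string $xml): ?string
{
    $feed = new DOMDocument();
    // An empty or malformed document yields null instead of a warning.
    if ($xml === '' || !@$feed->loadXML($xml)) {
        return null;
    }

    $xpath = new DOMXPath($feed);
    $nodes = $xpath->query('/rss/channel/link');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

// Hypothetical helper: make DOMDocument::load() send a User-Agent header.
function load_feed_with_user_agent(string $url, string $userAgent): ?DOMDocument
{
    $context = stream_context_create(['http' => ['user_agent' => $userAgent]]);
    // libxml-based loaders use this context for the next stream they open.
    libxml_set_streams_context($context);

    $feed = new DOMDocument();
    return @$feed->load($url) ? $feed : null;
}
```

A document returned by load_feed_with_user_agent can be serialized with saveXML() and passed to channel_link_from_xml, or queried directly with DOMXPath.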

Unable to get website content using file_get_contents in PHP

When I try to get the website content from the external URL fanpop.com using file_get_contents in PHP, I get empty data. I used the code below to get the contents:
$add_url= "http://www.fanpop.com/";
$add_domain = file_get_contents($add_url);
echo $add_domain;
But here I get an empty result for $add_domain. The same code works for other URLs, and when I sent the request from the browser instead of from the script, it did not work either.
Below is the same request, but in CURL:
error_reporting(-1);
ini_set('display_errors','On');
$url="http://www.fanpop.com/";
$ch = curl_init();
$header=array('GET /1575051 HTTP/1.1',
'Host: adfoc.us',
'Accept:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language:en-US,en;q=0.8',
'Cache-Control:max-age=0',
'Connection:keep-alive',
'Host:adfoc.us',
'User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36',
);
curl_setopt($ch,CURLOPT_URL,$url);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch,CURLOPT_CONNECTTIMEOUT,0);
curl_setopt( $ch, CURLOPT_COOKIESESSION, true );
curl_setopt($ch,CURLOPT_COOKIEFILE,'cookies.txt');
curl_setopt($ch,CURLOPT_COOKIEJAR,'cookies.txt');
curl_setopt($ch,CURLOPT_HTTPHEADER,$header);
echo $result=curl_exec($ch);
curl_close($ch);
... but the above is also not working. Can anyone tell me whether any changes have to be made to it?
The problem with this particular site is that it only serves compressed contents and throws a 404 error otherwise.
Easy fix:
$ch = curl_init('http://www.fanpop.com');
curl_setopt($ch,CURLOPT_ENCODING , "");
curl_exec($ch);
You can also make this work for file_get_contents() but with a substantial amount of effort, as described in this article.
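The article referenced above is not reproduced here, so the following is only a rough sketch of one possible file_get_contents approach, under the assumption that the server answers with a gzip body when asked for one. The helper names build_gzip_context and decode_body are invented for illustration, and gzdecode requires the zlib extension.

```php
<?php
// Hypothetical helper: a context that advertises gzip support to the server.
function build_gzip_context()
{
    return stream_context_create([
        'http' => [
            'method' => 'GET',
            'header' => "Accept-Encoding: gzip\r\n",
        ],
    ]);
}

// Hypothetical helper: decompress the body if the response headers
// say it is gzip-encoded; otherwise return it unchanged.
function decode_body(string $body, array $responseHeaders): string
{
    foreach ($responseHeaders as $line) {
        if (preg_match('/^Content-Encoding:\s*(?:x-)?gzip\s*$/i', $line)) {
            $decoded = @gzdecode($body); // false if the data is not valid gzip
            return $decoded === false ? $body : $decoded;
        }
    }
    return $body;
}
```

In use, the raw result of file_get_contents('http://www.fanpop.com/', false, build_gzip_context()) would be passed to decode_body together with $http_response_header.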
