Can't load the XML file? - php

http://westwood-backup.com/podcast?categoryID2=403
This is the XML file that I want to load and echo via PHP. I tried file_get_contents and DOMDocument::load. Both return an empty string. If I change the URL to another XML file, the functions work fine. What is special about this URL?
<?php
$content = file_get_contents("http://westwood-backup.com/podcast?categoryID2=403");
echo $content;
?>
Another try with load, same empty result.
<?php
$feed = new DOMDocument();
if (@$feed->load("http://westwood-backup.com/podcast?categoryID2=403")) {
    $xpath = new DOMXpath($feed);
    $linkPath = $xpath->query("/rss/channel/link");
    // query() returns a node list, so echo the text of the first match
    echo $linkPath->item(0)->nodeValue;
}
?>

Use cURL; you can do it like this:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,'http://westwood-backup.com/podcast?categoryID2=403');
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 2);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/1.22 (compatible; MSIE 2.0d; Windows NT)');
$xml = curl_exec($ch);
curl_close($ch);
$xml = new SimpleXMLElement($xml);
echo "<pre>";
print_r($xml);
echo "</pre>";
I think the server implements a User-Agent check so that the XML data is only served to browsers (not to bots, file_get_contents, etc.).
By using cURL and setting a dummy user agent, you can get around the check and load the data.

You need to set a User-Agent header that the server is happy with. There is no need for cURL if you don't want to use it; you can use stream_context_create with file_get_contents:
$options = array(
'http'=>array(
'method'=>"GET",
'header'=>"Accept-language: en\r\n" .
"User-Agent: Mozilla/5.0 (iPad; U; CPU OS 3_2 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Version/4.0.4 Mobile/7B334b Safari/531.21.10\r\n" // i.e. an iPad
)
);
$context = stream_context_create($options);
$content = file_get_contents("http://westwood-backup.com/podcast?categoryID2=403", false, $context);
echo $content;
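The question's second attempt used DOMDocument::load, which fetches the URL through the same HTTP stream wrapper, so the same header fix should apply there too. Here is a minimal sketch, assuming the server only checks that a User-Agent header is present. The helper names makeFeedContext and loadChannelLink and the agent string example-feed-reader/1.0 are made up for illustration:

```php
<?php
// Hypothetical helper: builds a stream context that sends a User-Agent.
function makeFeedContext(string $userAgent)
{
    return stream_context_create([
        'http' => [
            'method'     => 'GET',
            'user_agent' => $userAgent, // sent as the User-Agent request header
            'timeout'    => 5,          // seconds; an arbitrary choice
        ],
    ]);
}

// Hypothetical helper: loads an RSS document and returns the text of
// /rss/channel/link, or null if loading fails or the element is missing.
function loadChannelLink(string $url, string $userAgent): ?string
{
    // Make libxml (and therefore DOMDocument::load) use our context.
    libxml_set_streams_context(makeFeedContext($userAgent));

    $feed = new DOMDocument();
    if (!@$feed->load($url)) {
        return null; // fetch or parse failed
    }
    $xpath = new DOMXPath($feed);
    $nodes = $xpath->query('/rss/channel/link');
    return $nodes->length > 0 ? $nodes->item(0)->nodeValue : null;
}
```

Calling loadChannelLink("http://westwood-backup.com/podcast?categoryID2=403", "example-feed-reader/1.0") would then replace the DOMDocument code from the question.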

Related

file_get_contents does not work for getting json from API in PHP 7

I just upgraded PHP from 5.6 to 7.2.
Before 7.2, file_get_contents worked fine for getting JSON from the API, but after the upgrade to 7.2
it returns false.
file_get_contents($url) => false
The url is like this:
'https://username:password@project_domain/api/json/xxx/?param_a=' . $a . '&param_b='. $b
And I didn't even touch the default setting in php.ini which is probably related to file_get_contents:
allow_url_fopen = On
I googled this but found no straight answer to my problem.
What is the reason for this?
How can I fix it?
Thanks!
$url = "https://www.f5buddy.com/";
$options = array(
'http'=>array(
'method'=>"GET",
'header'=>"Accept-language: en\r\n" .
"User-Agent: Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_0 like Mac OS X; en-us) AppleWebKit/532.9 (KHTML, like Gecko) Version/4.0.5 Mobile/8A293 Safari/6531.22.7\r\n"
));
$context = stream_context_create($options);
$file = file_get_contents($url, false, $context);
$file=htmlentities($file);
echo json_encode($file);
Finally got it working with cURL. It only worked when I skipped SSL peer verification. It is just HTTPS to my own project anyway, so it shouldn't be a security problem.
function getJsonFromAPI($url) {
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$result = curl_exec($ch);
curl_close($ch);
$data = json_decode($result);
return $data;
}
By the way, I found that file_get_contents worked for HTTPS requests to external URLs but not for HTTPS connections to the project itself. Any fix for that is appreciated.
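When file_get_contents returns false, the warning it raised usually says why (for example a failed TLS handshake or a refused connection). Below is a minimal sketch of surfacing that message instead of only seeing false. The function name fetchWithReason is hypothetical, and the sketch assumes PHP 7 or later (for error_clear_last) and no custom error handler:

```php
<?php
// Hypothetical helper: returns ['body' => string|null, 'error' => string|null].
function fetchWithReason(string $url, $context = null): array
{
    error_clear_last();                                 // forget earlier warnings
    $body = @file_get_contents($url, false, $context);  // @ keeps the warning out of the output
    if ($body === false) {
        $last = error_get_last();                       // the suppressed warning is still recorded
        return ['body' => null, 'error' => $last['message'] ?? 'unknown error'];
    }
    return ['body' => $body, 'error' => null];
}

// Usage: a path that does not exist, so the error branch is taken.
$result = fetchWithReason('/no/such/file-for-this-example');
echo $result['error'] ?? 'ok', PHP_EOL;
```

The same call with the API URL from the question should show the actual reason for the failure, which helps decide whether disabling peer verification is really needed.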

File_get_contents, curl not working

Something strange is going on, and I would like to know why.
This URL works well in the browser: http://api.promasters.net.br/cotacao/v1/valores?moedas=USD&alt=json. But when I tried to retrieve the content with PHP:
echo file_get_contents('http://api.promasters.net.br/cotacao/v1/valores?moedas=USD&alt=json');
it printed nothing, and var_dump(...) gave string(0) "", so I went a little further and used:
function get_page($url) {
$curl = curl_init();
curl_setopt($curl, CURLOPT_RETURNTRANSFER, True);
curl_setopt($curl, CURLOPT_URL, $url);
$return = curl_exec($curl);
curl_close($curl);
return $return;
}
echo get_page('http://api.promasters.net.br/cotacao/v1/valores?moedas=USD&alt=json');
That also printed nothing, so I tried Python (3.x):
import requests
print(requests.get('http://api.promasters.net.br/cotacao/v1/valores?moedas=USD&alt=json').text)
And it WORKED. Why is this happening? What's going on?
It looks like they're blocking the user agent, or the lack of one, considering that PHP cURL and file_get_contents don't seem to set the value in the request header.
You can fake this by setting it to something like Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:7.0.1) Gecko/20100101 Firefox/7.0.1
<?php
function get_page($url) {
$curl = curl_init();
curl_setopt($curl, CURLOPT_RETURNTRANSFER, True);
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl,CURLOPT_USERAGENT,'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:7.0.1) Gecko/20100101 Firefox/7.0.1');
$return = curl_exec($curl);
curl_close($curl);
return $return;
}
echo get_page('http://api.promasters.net.br/cotacao/v1/valores?moedas=USD&alt=json');
I experienced the same behaviour.
Fetching the URL using the CLI Curl worked for me.
I then wrote a script with a file_get_contents call to another script that dumped all request headers to a file using getallheaders:
<?php
file_put_contents('/tmp/request_headers.txt', var_export(getallheaders(),true));
Output of file:
array (
'Host' => 'localhost',
)
I then inspected the curl request headers,
$ curl -v URL
And tried adding them one at a time to the file_get_contents request. It turned out a User-Agent header was needed.
<?php
$opts = array(
'http'=>array(
'method'=>"GET",
'header'=>
"User-Agent: examplebot\r\n"
)
);
$context = stream_context_create($opts);
$response = file_get_contents($url, false , $context);
This gave me a useful response.
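The one-header-at-a-time experiment above is easier to repeat if the header string is built from an array. This is a minimal sketch; buildHeaderString and the example header values are illustrative and not part of the original answer:

```php
<?php
// Hypothetical helper: turns ['Name' => 'value', ...] into the CRLF-separated
// string that the 'header' option of an http stream context accepts.
function buildHeaderString(array $headers): string
{
    $lines = '';
    foreach ($headers as $name => $value) {
        $lines .= $name . ': ' . $value . "\r\n";
    }
    return $lines;
}

// Start with one header and add more until the server answers as it does for curl.
$candidates = ['User-Agent' => 'examplebot'];
$context = stream_context_create([
    'http' => ['method' => 'GET', 'header' => buildHeaderString($candidates)],
]);
// $response = file_get_contents($url, false, $context); // $url as in the answer above
```

Adding a further entry to $candidates, such as 'Accept' => '*/*', and rerunning the request shows which header the server actually insists on.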

RSS-Feed returns an empty string

I have a news portal that displays RSS Feeds Items. Approximately 50 sources are read and it works very well.
With only one source I always get an empty string. The W3C RSS Validator can read the feed, and even my feed reader Vienna receives data.
What can I do?
Here is my simple code:
$link = 'http://blog.bosch-si.com/feed/';
$response = file_get_contents($link);
if($response !== false) {
var_dump($response);
} else {
echo 'Error ';
}
The server serving that feed expects a User Agent to be set. You apparently don't have a User Agent set in your php.ini, nor do you set it in the call to file_get_contents.
You can either set the User Agent for this particular request through a stream context:
echo file_get_contents(
'http://blog.bosch-si.com/feed/',
FALSE,
stream_context_create(
array(
'http' => array(
'user_agent' => 'php'
)
)
)
);
Or globally for any http calls:
ini_set('user_agent', 'php');
echo file_get_contents($link);
Both will give you the desired result.
The blog at http://blog.bosch-si.com/feed/ requires certain headers before it serves content, so it is better to use cURL for this.
See the solution below:
<?php
$link = 'http://blog.bosch-si.com/feed/';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $link);
// Return the body from curl_exec() instead of printing it directly
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Host: blog.bosch-si.com', 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.80 Safari/537.36'));
$result = curl_exec($ch);
if ($result === false)
{
echo curl_error($ch);
}
curl_close($ch);
echo $result;

PHP 500 internal server error file_get_contents

Using PHP I'm trying to crawl a website page and then grab an image automatically.
I've tried the following:
<?php
$url = "http://www.domain.co.uk/news/local-news";
$str = file_get_contents($url);
?>
and
<?php
$opts = array('http'=>array('header' => "User-Agent:Mozilla/5.0 (Windows NT 6.2) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.75 Safari/537.1\r\n"));
$context = stream_context_create($opts);
$header = file_get_contents('http://www.domain.co.uk/news/local-news',false,$context);
?>
and also
<?php
include('simple_html_dom.php');
$html = file_get_html('http://www.domain.co.uk/news/local-news');
$result = $html->find('section article img', 0)->outertext;
?>
but these all return with Internal Server Error. I can view the site perfectly in the browser but when I try to grab the page in PHP it fails.
Is there anything I can try?
Try the code below. It saves the content to a local file.
<?php
$ch = curl_init("http://www.domain.co.uk/news/local-news");
$fp = fopen("localfile.html", "w");
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_exec($ch);
curl_close($ch);
fclose($fp);
?>
Now you can read localfile.html.
Sometimes you might get an error opening an HTTP URL with file_get_contents even though you have set allow_url_fopen = On in php.ini.
For me the solution was to also set "user_agent" to something.

file_get_contents from specific URL

I have an API Key that verifies the request URL
If I do
echo file_get_contents('http://myfilelocation.com/?apikey=1234');
RESULT : this api key is not authorized for this domain
However, if I put the requested URL within an iframe with the same URL:
RESULT : this api key is authorized
Obviously, the server that returns the JSON data is working properly, because the iframe outputs the proper information. However, how can I verify that PHP makes the request with the proper domain and URL settings?
With file_get_contents I always get back that the API key is not authorized, even though I'm running the PHP script from the authorized domain.
Try this PHP code:
<?php
$options = array(
'http'=>array(
'method'=>"GET",
'header'=>"Host: myfilelocation.com\r\n". // Don't forget to replace this with your domain
"Accept-language: en\r\n" .
"User-Agent: Mozilla/5.0 (iPad; U; CPU OS 3_2 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Version/4.0.4 Mobile/7B334b Safari/531.21.10\r\n"
)
);
$context = stream_context_create($options);
$file = file_get_contents("http://myfilelocation.com/?apikey=1234", false, $context);
?>
file_get_contents doesn't send any referrer information, and the API may need it. This may help you:
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://myfilelocation.com/?apikey=1234');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_REFERER, 'http://authorized-domain.here');
$html = curl_exec($ch);
echo $html;
?>
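If the API really does check the Referer header, the same thing can be tried without cURL by adding that header to a stream context. This is a sketch under that assumption; the function name makeRefererContext is hypothetical, and the referer value is a placeholder just like the one in the cURL answer:

```php
<?php
// Hypothetical helper: builds an http stream context that sends a Referer header.
function makeRefererContext(string $referer)
{
    return stream_context_create([
        'http' => [
            'method' => 'GET',
            'header' => 'Referer: ' . $referer . "\r\n",
        ],
    ]);
}

// Placeholder domain; replace it with the domain the API key is registered for.
$context = makeRefererContext('http://authorized-domain.example');
// $json = file_get_contents('http://myfilelocation.com/?apikey=1234', false, $context);
```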
