DOMDocument - Load xml rss - failed to open stream - php

I want to get the links from an RSS URL. This is my code:
$doc = new DOMDocument();
$doc->load("http://www.alef.ir/rssdx.gmyefy,ggeltshmci.62ay2x.y.xml");
$arrFeeds = array();
foreach ($doc->getElementsByTagName('item') as $node) {
$title = $node->getElementsByTagName('title')->item(0)->nodeValue;
$title=strip_tags($title);
$link=$node->getElementsByTagName('link')->item(0)->nodeValue;
}
I've used this code for several other URLs and all of them worked, but on this one I get:
Warning:
DOMDocument::load(http://www.alef.ir/rssdx.gmyefy,ggeltshmci.62ay2x.y.xml): failed to open stream: HTTP request failed!
HTTP/1.1 403 Forbidden in /home/xxxxxxx/domains/xxxxxxx/public_html/data.php on line 14
Warning:
DOMDocument::load(): I/O warning: failed to load external entity "http://www.alef.ir/rssdx.gmyefy,ggeltshmci.62ay2x.y.xml"
in /home/xxxxxxx/domains/xxxxxxx/public_html/data.php on line 14
http://www.alef.ir/rssdx.gmyefy,ggeltshmci.62ay2x.y.xml
Line 14 is:
$doc->load("http://www.alef.ir/rssdx.gmyefy,ggeltshmci.62ay2x.y.xml");
Could you help me? Why does this request give me an error?
Thanks

The code above failed for me too, and not because of the comma in the URL, as I had suggested in my comment. Using cURL with a User-Agent header set, I was able to retrieve the XML file.
$c=curl_init('http://www.alef.ir/rssdx.gmyefy,ggeltshmci.62ay2x.y.xml');
curl_setopt( $c, CURLOPT_USERAGENT,'nginx-curl-blahblahblah' );
curl_setopt( $c, CURLOPT_RETURNTRANSFER, true );
$r=curl_exec( $c );
curl_close( $c );
$doc = new DOMDocument();
$doc->loadXML($r);
$arrFeeds = array();
foreach ($doc->getElementsByTagName('item') as $node) {
$title=$node->getElementsByTagName('title')->item(0)->nodeValue;
$title=strip_tags($title);
$link=$node->getElementsByTagName('link')->item(0)->nodeValue;
}

Add this code before loading your feed; it sets the user agent that libxml sends with the request.
$opts = array(
'http' => array(
'user_agent' => 'PHP libxml agent',
)
);
$context = stream_context_create($opts);
libxml_set_streams_context($context);
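Put together, the idea could look something like the sketch below. The function name `load_feed_document` and the default user-agent string are only illustrative, and it assumes the server is rejecting requests that carry no User-Agent header, which the 403 suggests but does not prove.

```php
<?php
// Hypothetical helper: load an XML document through libxml with a custom user agent.
// Returns null when the document cannot be loaded or parsed.
function load_feed_document(string $source, string $userAgent = 'PHP libxml agent'): ?DOMDocument
{
    // The 'http' options only matter for http(s) URLs; local paths ignore them.
    // Note that this context stays in effect for later libxml loads as well.
    $context = stream_context_create(['http' => ['user_agent' => $userAgent]]);
    libxml_set_streams_context($context);

    // Collect libxml errors internally instead of printing them as warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $ok = $doc->load($source);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok ? $doc : null;
}

// Usage (URL taken from the question):
// $doc = load_feed_document('http://www.alef.ir/rssdx.gmyefy,ggeltshmci.62ay2x.y.xml');
// if ($doc === null) { /* handle the failure instead of iterating */ }
```

Returning null on failure lets the caller skip the foreach loop rather than run it on an empty document.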

Related

PHP - simplexml_load_file() - I/O warning : failed to load external entity [duplicate]

I'm trying to create a small application that will simply read an RSS feed and then lay out the info on the page.
All the instructions I find make this seem simple, but for some reason it just isn't working. I have the following:
include_once(ABSPATH.WPINC.'/rss.php');
$feed = file_get_contents('http://feeds.bbci.co.uk/sport/0/football/rss.xml?edition=int');
$items = simplexml_load_file($feed);
That's it, it then breaks on the third line with the following error
Error: [2] simplexml_load_file() [function.simplexml-load-file]: I/O warning : failed to load external entity "<?xml version="1.0" encoding="UTF-8"?> <?xm
The rest of the XML file is shown.
I have turned on allow_url_fopen and allow_url_include in my settings, but still nothing.
I've tried multiple feeds, and they all end up with the same result.
I'm going mad here.
simplexml_load_file() interprets an XML file (either a file on your disk or a URL) into an object. What you have in $feed is a string.
You have two options:
Use file_get_contents() to get the XML feed as a string, and use simplexml_load_string():
$feed = file_get_contents('...');
$items = simplexml_load_string($feed);
Load the XML feed directly using simplexml_load_file():
$items = simplexml_load_file('...');
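If you are not sure which of the two you are holding, a small wrapper can pick the right loader. This is only a sketch: `load_simplexml` is a made-up name, and the "starts with `<`" check is a heuristic that assumes a path or URL never begins with that character.

```php
<?php
// Hypothetical helper: accept either raw XML or a path/URL and use the matching loader.
function load_simplexml(string $source): ?SimpleXMLElement
{
    $previous = libxml_use_internal_errors(true);

    // Raw XML starts with '<' once leading whitespace is removed;
    // anything else is treated as a file path or URL.
    $isMarkup = strncmp(ltrim($source), '<', 1) === 0;
    $xml = $isMarkup ? simplexml_load_string($source) : simplexml_load_file($source);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Compare against false explicitly: an element with no children is
    // a valid result but evaluates as falsy in a boolean context.
    return $xml === false ? null : $xml;
}
```

With this, passing the output of file_get_contents() (as in the question) and passing the URL itself both end up in the correct function.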
You can also load the content with cURL if fetching URLs with file_get_contents() isn't enabled on your server (allow_url_fopen is off).
Example:
$ch = curl_init();
curl_setopt($ch,CURLOPT_URL,"http://feeds.bbci.co.uk/sport/0/football/rss.xml?edition=int");
curl_setopt($ch,CURLOPT_RETURNTRANSFER,true);
$output = curl_exec($ch);
curl_close($ch);
$items = simplexml_load_string($output);
This also works:
$url = "http://www.some-url";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$xmlresponse = curl_exec($ch);
$xml=simplexml_load_string($xmlresponse);
Then I just run a for loop to grab the values from the nodes, like this:
$html = '';
// Never read past the last item when the feed has fewer than 20 entries.
$count = min(20, count($xml->channel->item));
for ($i = 0; $i < $count; $i++) {
    $title = $xml->channel->item[$i]->title;
    $link = $xml->channel->item[$i]->link;
    $desc = $xml->channel->item[$i]->description;
    $html .= "<div><h3>$title</h3>$link<br />$desc</div><hr>";
}
echo $html;
Note that your node names will differ, your HTML may be structured differently, and your loop may be set to a higher or lower number of results.
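A variation on the loop above, as a rough sketch: iterate with foreach so the feed length never matters, and escape the values before putting them into HTML. `render_items` is a hypothetical name, and escaping the description is my own choice; feeds that deliberately ship HTML in their descriptions would need different handling.

```php
<?php
// Hypothetical helper: render at most $limit items of an RSS 2.0 feed as HTML.
function render_items(SimpleXMLElement $xml, int $limit = 20): string
{
    $html = '';
    $count = 0;
    foreach ($xml->channel->item as $item) {
        if ($count++ >= $limit) {
            break; // stop once the requested number of items is rendered
        }
        // Cast to string first: SimpleXML returns objects, not plain text.
        $title = htmlspecialchars((string) $item->title);
        $link  = htmlspecialchars((string) $item->link);
        $desc  = htmlspecialchars((string) $item->description);
        $html .= "<div><h3>$title</h3>$link<br />$desc</div><hr>";
    }
    return $html;
}

// Usage: echo render_items($xml, 20);
```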
$url = 'http://legis.senado.leg.br/dadosabertos/materia/tramitando';
$xml = simplexml_load_file($url);

PHP and JSON for Reddit

I want to grab some data for a specific Reddit user.
There's a dynamic JSON file on the Reddit servers which can be accessed remotely.
The JSON file path is: http://www.reddit.com/user/tiagoperes/about.json
(where you can replace “tiagoperes” in the URL with whatever user you were trying to look up) - thank you Tom Chapin
Problem: I get the error message
http error 500: reddiant.com/reddit.php
Error Log:
PHP Warning:
file_get_contents(https://www.reddit.com/user/tiagoperes/about.json):
failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden on
line 5
PHP Fatal error: Uncaught exception 'InvalidArgumentException'
with message 'Passed variable is not an array or object, using empty
array instead' on line 8
Code:
<?php
$url = "https://www.reddit.com/user/tiagoperes/about.json";
$json = file_get_contents($url);
$jsonIterator = new RecursiveIteratorIterator(
new RecursiveArrayIterator(json_decode($json, TRUE)),
RecursiveIteratorIterator::SELF_FIRST);
foreach ($jsonIterator as $key => $val) {
if(is_array($val)) {
echo "$key:\n";
} else {
echo "$key => $val\n";
}
}
(inspired by this: http://codepad.org/Gtk8DqJE)
Solution: Ask for debug.
What is the problem here?
I can't find a way to make it work, and it should be quite straightforward.
Thank you!
I was having trouble with the first approach, so I decided to change it.
I got it to work like this:
<?php
$opts = array(
'http'=>array(
'method'=>"GET",
'header'=>"User-Agent: reddiant api script\r\n"
));
$context = stream_context_create($opts);
$url = "http://www.reddit.com/user/tiagoperes/about.json";
$json = file_get_contents($url, false, $context);
$result = json_decode($json, true);
// Result:
var_dump($result);
//echo data
echo $result['data']['name'];
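The fatal error in the question follows from the 403: file_get_contents() returned false, json_decode() turned that into null, and RecursiveArrayIterator refuses anything that is not an array or object. A small guard avoids that chain. This is a sketch only, and `decode_json_body` is a hypothetical name.

```php
<?php
// Hypothetical helper: decode a fetched JSON body into an array,
// or return null when the fetch or the decode failed.
function decode_json_body($body): ?array
{
    // file_get_contents() returns false when the request fails (e.g. on a 403).
    if (!is_string($body) || $body === '') {
        return null;
    }
    $data = json_decode($body, true);
    // json_decode() returns null for invalid JSON; scalars are rejected too.
    return is_array($data) ? $data : null;
}

// Usage:
// $result = decode_json_body(file_get_contents($url, false, $context));
// if ($result === null) { /* report the failure instead of iterating */ }
```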

PHP - Getting Text From Inside Class

I'm attempting to gather text from a webpage using PHP, so that when the text on that website is updated, it's also automatically updated.
Take the site http://www.roblox.com/CW-Ultimate-Amethyst-Addiction-item?id=188004500 for example - inside the class robux-text, there's a figure saying R$ 20,003 - my aim is to get that text from Roblox, to my site.
I have attempted this using the code below, but to no avail. I'm being presented with the following errors:
Warning: file_get_contents(): php_network_getaddresses: getaddrinfo
failed: Temporary failure in name resolution in
/home/public_html/index.php on line 9
Warning:
file_get_contents(http://www.roblox.com/CW-Ultimate-Amethyst-Addiction-item?id=188004500):
failed to open stream: php_network_getaddresses: getaddrinfo failed:
Temporary failure in name resolution in /home/public_html/index.php on
line 9
Warning: DOMDocument::loadHTML(): Empty string supplied as input in /home/public_html/index.php on line 11
<?php
$html = file_get_contents("http://www.roblox.com/CW-Ultimate-Amethyst-Addiction-item?id=188004500");
$DOM = new DOMDocument();
$DOM->loadHTML($html);
$finder = new DomXPath($DOM);
$classname = 'robux-text';
$nodes = $finder->query("//*[contains(@class, '$classname')]");
foreach ($nodes as $node) {
echo $node->nodeValue;
}
?>
It may be that allow_url_fopen is disabled on your system (php.ini), although the warning itself reports a name resolution (DNS) failure on the server.
Try it with curl:
<?php
libxml_use_internal_errors(true);
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.roblox.com/CW-Ultimate-Amethyst-Addiction-item?id=188004500");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$html = curl_exec($ch);
curl_close($ch);
$DOM = new DOMDocument();
$DOM->loadHTML($html);
$finder = new DomXPath($DOM);
$classname = 'robux-text';
$nodes = $finder->query("//*[contains(@class, '$classname')]");
foreach ($nodes as $node) {
echo $node->nodeValue;
}
?>
You can easily get the HTML content of a URL via cURL. You just have to set the CURLOPT_RETURNTRANSFER option to true.
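One thing to watch with contains(@class, ...) is that it matches substrings, so a class such as robux-text-small would match as well. A commonly used XPath idiom pads the class list with spaces and looks for the whole token. The sketch below assumes the class name contains no quote characters, and `text_by_class` is a hypothetical name.

```php
<?php
// Hypothetical helper: return the text of every element whose class list
// contains $class as a whole, space-separated token.
function text_by_class(string $html, string $class): array
{
    if ($html === '') {
        return []; // nothing was fetched, so there is nothing to parse
    }

    $previous = libxml_use_internal_errors(true); // real-world HTML is rarely valid
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Assumes $class contains no quotes; it is placed inside an XPath string literal.
    $query = sprintf("//*[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]", $class);

    $texts = [];
    foreach ($xpath->query($query) as $node) {
        $texts[] = trim($node->nodeValue);
    }
    return $texts;
}

// Usage: print_r(text_by_class($html, 'robux-text'));
```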

Twitter feed not working

Hi, I use the following PHP code to parse a Twitter feed and display the latest two tweets in the footer of a website.
<ul id="twitter_update_list" style="word-wrap:break-word;">
<?php
$doc = new DOMDocument();
$doc->load('http://twitter.com/statuses/user_timeline/fixedgearfrenzy.rss');
$arrFeeds = array();
$count = 0;
foreach ($doc->getElementsByTagName('item') as $node)
{
if($count < 2)
echo('<li><span style="word-wrap:break-word;">'.substr($node->getElementsByTagName('description')->item(0)->nodeValue, 17).' </span>'.substr($node->getElementsByTagName('pubDate')->item(0)->nodeValue, 0, 16).'</li>');
$count = $count + 1;
}
?></ul>
For some reason it does not always work, and most of the time the following error is displayed:
Warning: DOMDocument::load(http://twitter.com/statuses/user_timeline/fixedgearfrenzy.rss) [domdocument.load]: failed to open stream: HTTP request failed! HTTP/1.1 400 Bad Request in /home/fixedge1/public_html/catalog/view/theme/CartMania-Clean/template/common/footer.tpl on line 336
Warning: DOMDocument::load() [domdocument.load]: I/O warning : failed to load external entity "http://twitter.com/statuses/user_timeline/fixedgearfrenzy.rss" in /home/fixedge1/public_html/catalog/view/theme/CartMania-Clean/template/common/footer.tpl on line 336
I can't work out why on earth it sometimes works and sometimes doesn't, any ideas??
The website is http://www.fixedgearfrenzy.co.uk and it's the twitter feed in the bottom right
IIRC, Twitter impose a per-hour limit on the number of times you can load a feed. If your results are intermittent, this is probably the reason why. Try caching the feed results locally to avoid re-loading the same data from Twitter over and over again.
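A file-based cache along those lines might look roughly like this. Everything here is an assumption for illustration: the name `cached_fetch`, the cache path and the lifetime are made up, and the actual rate limit is not something I can state, so pick a lifetime that suits your page.

```php
<?php
// Hypothetical helper: return the cached copy while it is fresh,
// otherwise call $fetch and store whatever it returns.
function cached_fetch(string $cacheFile, int $ttlSeconds, callable $fetch): ?string
{
    // Read a file, mapping a failed read to null.
    $read = static function (string $path): ?string {
        $data = @file_get_contents($path);
        return $data === false ? null : $data;
    };

    clearstatcache(true, $cacheFile); // make sure the modification time is current
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttlSeconds) {
        return $read($cacheFile);
    }

    $fresh = $fetch();
    if (is_string($fresh) && $fresh !== '') {
        file_put_contents($cacheFile, $fresh, LOCK_EX);
        return $fresh;
    }

    // The fetch failed: fall back to a stale copy if one exists.
    return is_file($cacheFile) ? $read($cacheFile) : null;
}

// Usage (path and lifetime are placeholders):
// $rss = cached_fetch('/tmp/twitter_feed.xml', 600, function () {
//     return @file_get_contents('http://twitter.com/statuses/user_timeline/fixedgearfrenzy.rss');
// });
// if ($rss !== null) { $doc = new DOMDocument(); $doc->loadXML($rss); }
```

Falling back to the stale copy means a rejected request still leaves something to show in the footer.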
You should use a valid user-agent when using DOMDocument to load a remote resource.
<?php
// Set a valid user-agent
$opts = array(
'http' => array(
'user_agent' => 'PHP libxml agent',
)
);
$context = stream_context_create($opts);
libxml_set_streams_context($context);
$doc = new DOMDocument();
$doc->load('http://twitter.com/statuses/user_timeline/fixedgearfrenzy.rss');
$arrFeeds = array();
// rest of your code...
