SimpleXML will not parse this one URL - php

I am developing a plugin for WordPress to read from an XML Output which is being delivered from a cloud-based POS.
This is not about complicated programming; this is more of a debugging exercise.
The XML url is: Removed :)
And the basic, simple code is:
<?php
error_reporting(E_ALL);
ini_set('display_errors', true);
$url = '--URL to the XML--';
$xml=simplexml_load_file($url);
print_r($xml);
?>
I have tried all the approaches: DOMDocument, cURL, and SimpleXML. Everything spits out errors. The coder who made the output has been able to get it working within their own domain, but I need to debug further to find out where the error lies.
I have run into quite a series of errors depending on how I feed the XML into the script.
Fatal error: Uncaught Exception: String could not be parsed as XML in /var/www/html/test2.php:27 Stack trace: #0 /var/www/html/test2.php(27): SimpleXMLElement->__construct('http://jewelbas...', NULL, true) #1 {main} thrown in /var/www/html/test2.php on line 27
And sometimes I get these weird support errors, which I assume are coming from their host, but they cannot identify them. These support errors appear no matter what server I use, and they show up when using simplexml_load_string():
Warning: simplexml_load_string(): act the webmaster. <br><br>Your support ID is: 9641638103684613562</body></html>

The server requires a User-Agent header to be set. All of the standard XML APIs in PHP are based on libxml, so you can set a stream context for it:
libxml_set_streams_context(
    stream_context_create([
        'http' => [
            'header' => "User-Agent: Foo\r\n"
        ]
    ])
);

$document = new DOMDocument();
$document->load($url);
echo $document->saveXML();
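The same libxml-wide context also applies to simplexml_load_file(), which the question's code uses, since SimpleXML is libxml-based as well. A minimal sketch (the User-Agent value is a placeholder, and a local temp file stands in for the remote URL so the snippet runs offline; with an http:// URL the header would actually be sent):

```php
<?php
// Set a libxml-wide stream context so every libxml-based loader
// (SimpleXML, DOMDocument, XMLReader) sends this User-Agent on http:// URLs.
libxml_set_streams_context(
    stream_context_create([
        'http' => ['header' => "User-Agent: Foo\r\n"]
    ])
);

// Demonstrated against a local file so the snippet runs offline.
$path = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($path, '<root><item>ok</item></root>');

$xml = simplexml_load_file($path);
echo $xml->item, "\n"; // prints "ok"
unlink($path);
```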

Just so it's here for future generations to see, here is the complete code, now parsing the XML in a nice format:
<?php
error_reporting(E_ALL);
ini_set('display_errors', true);
$url = 'http://sample.com/yourphpxmlfile.php?variables=rule';
$feed = $url;
$options = array(
    'http' => array(
        'method' => "GET",
        'header' => "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.57 Safari/537.17\r\n" // Chrome v24
    )
);
$context = stream_context_create($options);
$content = new SimpleXMLElement(file_get_contents($feed, false, $context));
echo "<pre>";
print_r($content);
echo "</pre>";
?>

Related

How to get data from a particular node in remote XML in PHP?

I want to get the data from a particular node of an XML file on a remote site, but I'm getting the following error:
Warning: simplexml_load_file(): in php line
Judging by the second warning, it is loading the file's data. The result I want is the GRate value.
Note: I have enabled the SimpleXML module in my PHP installation.
<?php
$url = "http://api.srinivasajewellery.com/getrate/getrate";
$xml = simplexml_load_file($url) or die("not open");
?><pre><?php //print_r($xml); ?></pre><?php
foreach ($xml->GRate as $GRate) {
    printf('%s', $GRate);
}
?>
I expected to get "3640.00" as my output, but the error is as follows:
Warning: simplexml_load_file(): http://api.srinivasajewellery.com/getrate/getrate:1: parser error : Start tag expected, '<' not found in H:\root\home\srinivasauser-001\www\goldrate\wp-content\themes\twentynineteen\footer.php on line 24
Warning: simplexml_load_file(): {"GRate":"3640.00","SRate":"49.00","PRate":"0.00"} in H:\root\home\srinivasauser-001\www\goldrate\wp-content\themes\twentynineteen\footer.php on line 24
Warning: simplexml_load_file(): ^ in H:\root\home\srinivasauser-001\www\goldrate\wp-content\themes\twentynineteen\footer.php on line 24
not open.
When the URL "http://api.srinivasajewellery.com/getrate/getrate" is requested from PHP with the default settings, it returns the data as JSON, which might be even easier to parse in this case:
<?php
$url = "http://api.srinivasajewellery.com/getrate/getrate";
$json = json_decode(file_get_contents($url));
echo '$GRate: ' . $json->GRate, "\n";
Output:
$GRate: 3670.00
This can easily be checked by fetching the URL and outputting it verbatim:
$buffer = file_get_contents($url);
echo $buffer, "\n";
{"GRate":"3670.00","SRate":"50.00","PRate":"0.00"}
As Vijay Dohare has demonstrated, it is possible to tell the server that XML is preferred. Whether that works can be checked this way, too:
stream_context_get_default(['http' => ['header' => 'Accept: application/xml']]);
$buffer = file_get_contents($url);
echo $buffer, "\n";
The output is not as readable then (with more data, the JSON would not be that easy to read either, and it would also grow larger):
<GetRateController.Rate xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.datacontract.org/2004/07/Savings.Controllers"><GRate>3670.00</GRate><PRate>0.00</PRate><SRate>50.00</SRate></GetRateController.Rate>
This is likely similar to what happens when opening the URL in the browser, because the browser also sends an Accept request header, and it contains XML as well:
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3
Browsers normally accept XML too (albeit they prefer HTML over it).
So in the end it depends on what you prefer: either JSON, which is less verbose than XML (see the very first example code above), or XML with SimpleXML:
<?php
$url = "http://api.srinivasajewellery.com/getrate/getrate";
stream_context_get_default(['http' => ['header' => 'Accept: application/xml']]);
$xml = simplexml_load_file($url) or die("not open");
echo '$GRate: ' . $xml->GRate, "\n";
Output:
$GRate: 3670.00
Try the following code:
<?php
$url = "http://api.srinivasajewellery.com/getrate/getrate";
$context = stream_context_create(array('http' => array('header' => 'Accept: application/xml')));
$xml = file_get_contents($url, false, $context);
$xml = simplexml_load_string($xml) or die("not open");
foreach ($xml->GRate as $GRate) {
    echo '$GRate: ' . $GRate;
}
?>
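Since the endpoint can serve either JSON or XML depending on the Accept header, a defensive caller can pick the parser based on the payload's first character. A minimal sketch (the helper name extract_grate is made up; the sample payloads mirror the ones shown above, with the XML namespace omitted for brevity):

```php
<?php
// Hypothetical helper: extract GRate from either representation.
function extract_grate(string $body): string
{
    $body = ltrim($body);
    if ($body !== '' && $body[0] === '{') {
        // JSON form: {"GRate":"3670.00",...}
        return json_decode($body)->GRate;
    }
    // XML form (namespace omitted here for brevity)
    return (string) simplexml_load_string($body)->GRate;
}

echo extract_grate('{"GRate":"3670.00","SRate":"50.00","PRate":"0.00"}'), "\n"; // 3670.00
echo extract_grate('<Rate><GRate>3670.00</GRate></Rate>'), "\n";                // 3670.00
```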

file_get_contents returns unreadable text for a specific url

When I try to read the RSS feeds of kat.cr using PHP's file_get_contents() function, I get some unreadable text, but when I open the feed in my browser it is fine.
I have tried many other hosts, but had no luck getting the correct data.
I have even tried setting the user agent to different browsers, but still no change.
This is a simple piece of code that I've tried:
$options = array('http' => array('user_agent' => 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1'));
$url = 'https://kat.cr/movies/?rss=1';
$data = file_get_contents($url, false, stream_context_create($options));
echo $data;
I'm curious how they're doing it and what I can do to overcome the problem.
A sample of the unreadable text:
‹ي]يrم6–‎?Oپي©™ت,à7{»‌âgw&يؤe;éN¹\S´HK\S¤–¤l+ے÷ِùِIِ”(إژzA5‌ةض؛غ%K4ـ{qtqy½ùوa^ »¬nٍھ|ûٹSِ eه¤Jَrِْصڈ1q^}sü§7uسlدزؤYً¾²yفVu‌•يغWGG·Iس&m>،“j~$ےzؤ(?zï‍ج’²جٹم?!ّ÷¦حغ";‏گ´Yس¢ï³{tر5ز ³َsgYٹْ.ں#
Actually, every time I open the link there is some different unreadable text.
As I mentioned in the comment, the contents returned are gzip-encoded, so you need to un-gzip the data. Depending on your version of PHP you may or may not have gzdecode() available; I don't, but the function here does the trick.
if (!function_exists('gzdecode')) {
    function gzdecode($data) {
        // Write the gzip bytes to a temp file so readgzfile() can inflate them
        $g = tempnam('/tmp', 'ff');
        file_put_contents($g, $data);
        ob_start();
        readgzfile($g);
        $d = ob_get_clean();
        unlink($g);
        return $d;
    }
}

$data = gzdecode(file_get_contents($url));
echo $data;
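For reference, gzdecode() has been built into PHP since 5.4, so the fallback above is only needed on older installations. A quick offline round-trip, using gzencode() to stand in for the server's compression:

```php
<?php
// gzencode() produces the same gzip format a server sends with
// Content-Encoding: gzip; gzdecode() reverses it.
$original   = '<rss version="2.0"><channel><title>demo</title></channel></rss>';
$compressed = gzencode($original);

echo gzdecode($compressed), "\n"; // prints the original RSS string
```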

Why file_get_contents returning garbled data?

I am trying to grab the HTML from the below page using some simple php.
URL: https://kat.cr/usearch/architecture%20category%3Abooks/
My code is:
$html = file_get_contents('https://kat.cr/usearch/architecture%20category%3Abooks/');
echo $html;
where file_get_contents() works but returns scrambled data.
I have tried using cURL as well as various functions like htmlentities(), mb_convert_encoding(), utf8_encode() and so on, but I just get different variations of the scrambled text.
The source of the page says it is charset=utf-8, but I am not sure what the problem is.
Calling file_get_contents() on the base URL kat.cr returns the same mess.
What am I missing here?
It is gzip-compressed; when fetched by the browser, the browser decompresses it, so you need to decompress it yourself. To decompress and output it in one step you can use readgzfile():
readgzfile('https://kat.cr/usearch/architecture%20category%3Abooks/');
The site's response is being compressed, so you have to uncompress it to recover the original form.
The quickest way is to use gzinflate() as below:
$html = gzinflate(substr(file_get_contents("https://kat.cr/usearch/architecture%20category%3Abooks/"), 10, -8));
Or, for a more advanced solution, consider the following function (found on this blog):
function get_url($url)
{
    // A User-Agent is necessary; otherwise some websites (e.g. google.com) won't serve gzipped content
    $opts = array(
        'http' => array(
            'method' => "GET",
            'header' => "Accept-Language: en-US,en;q=0.8\r\n" .
                        "Accept-Encoding: gzip,deflate,sdch\r\n" .
                        "Accept-Charset: UTF-8,*;q=0.5\r\n" .
                        "User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:19.0) Gecko/20100101 Firefox/19.0 FirePHP/0.4\r\n"
        )
    );
    $context = stream_context_create($opts);
    $content = file_get_contents($url, false, $context);

    // If the HTTP response headers say the content is gzipped, uncompress it
    foreach ($http_response_header as $c => $h) {
        if (stristr($h, 'content-encoding') and stristr($h, 'gzip')) {
            // Strip the 10-byte gzip header and 8-byte trailer, then inflate
            $content = gzinflate(substr($content, 10, -8));
        }
    }
    return $content;
}

echo get_url('http://www.google.com/');
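Another option worth knowing about is PHP's compress.zlib:// stream wrapper, which inflates gzip data transparently as it is read, so no manual substr() header slicing is needed. A sketch against a local gzip file (this assumes the zlib extension is enabled, which it is on most builds):

```php
<?php
// Write gzip bytes to a temp file, then read them back through the
// compress.zlib:// wrapper, which strips the gzip framing and inflates.
$path = tempnam(sys_get_temp_dir(), 'gz');
file_put_contents($path, gzencode('<html><body>hello</body></html>'));

echo file_get_contents('compress.zlib://' . $path), "\n"; // prints the original HTML
unlink($path);
```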

file_get_contents with header: Send current header or current cookie in magento or common

I want to call a URL and get the result with PHP using file_get_contents (I know about cURL, but first I want to try it with file_get_contents). In my case it's a request to the Magento shop system, which requires a previously completed login to the backend.
If I execute the URL manually in my browser, the right page comes up. If I send the URL with file_get_contents, I also get logged in (because I added the cookie to the request), but every time I only get the dashboard home page; maybe something causes a redirect.
I tried to simulate the same HTTP request as my browser sends. My question is: is there a possibility to send the same header data (cookie, session ID, etc.) directly as a parameter to file_get_contents, without manual serialization?
It's a common PHP question, the basic script would be:
$postdata = http_build_query(
    array(
        'var1' => 'some content',
        'var2' => 'doh'
    )
);
$opts = array('http' =>
    array(
        'method' => 'POST',
        'header' => 'Content-type: application/x-www-form-urlencoded',
        'content' => $postdata
    )
);
$context = stream_context_create($opts);
$result = file_get_contents('http://example.com/submit.php', false, $context);
And in my case the code is:
$postdata = http_build_query(
    array(
        'selected_products' => 'some content',
    )
);
$opts = array('http' =>
    array(
        'method' => 'POST',
        'header' => "Content-type: application/x-www-form-urlencoded; charset=UTF-8\r\n".
                    "Cookie: __utma=".Mage::getModel('core/cookie')->get("__utma").";".
                    "__utmz=".Mage::getModel('core/cookie')->get("__utmz").
                    " __utmc=".Mage::getModel('core/cookie')->get("__utmc").';'.
                    "adminhtml=".Mage::getModel('core/cookie')->get("adminhtml")."\r\n".
                    "X-Requested-With: XMLHttpRequest\r\n".
                    "Connection: keep-alive\r\n".
                    "Accept: text/javascript, text/html, application/xml, text/xml, */*\r\n".
                    "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20100101 Firefox/12.0",
        'content' => $postdata
    )
);
$context = stream_context_create($opts);
var_dump(file_get_contents($runStopAndRemoveProducts, false, $context));
The result should be the same error message I get in the browser when calling the URL manually ("please select some products" as plain text), but the response is a full dashboard home page as an HTML website.
I'm looking for a script like the one below: I want all parameters to be set automatically, without manually building the cookie string and the other headers :)
file_get_contents('http://example.com/submit.php', false, $_SESSION["Current_Header"]);
EDIT: I've found the mistake: two special GET parameters (isAjax=1 and form_key = Mage::getSingleton('core/session', array('name' => 'adminhtml'))->getFormKey()) are required. In my case the missing form_key caused the error. But the ugly cookie string is still there; I'm still looking for a prettier solution.
To me this looks like you are trying to write a hack for something that you can do more elegantly, the proper, fully documented way. Please have a look at the Magento API.
If you want to delete products (or do anything else):
http://www.magentocommerce.com/api/soap/catalog/catalogProduct/catalog_product.delete.html
You will get a proper response back to know if things have been successful. If there are things the API cannot do then you can extend/hack it if you wish.
To get started you will need an API user/pass and get up to speed with SOAP. The examples in the Magento documentation should suffice. Good luck!

PHP file_get_contents and VAST xml

This is what I am trying to do: download a VAST XML document from a URL and save it locally to an XML file, in PHP. For that I am using file_get_contents and file_put_contents. This is the script I am using:
<?php
$tid=time();
$xml1 = file_get_contents('http://ad.afy11.net/ad?enc=4&asId=1000009566807&sf=0&ct=256');
file_put_contents("downloads/file1_$tid.xml", $xml1);
echo "<p>file 1 recorded</p>";
?>
The URL in question is a real URL that delivers VAST XML code. My problem is that when I save the file, it writes an empty VAST tag:
<?xml version="1.0" encoding="UTF-8"?> <VAST version="2.0"> </VAST>
But if I open it in Firefox, it actually delivers some code:
<VAST version="2.0"><Ad id="Adify"><Wrapper><AdSystem>Eyeblaster</AdSystem><VASTAdTagURI>http://bs.serving-sys.com/BurstingPipe/adServer.bs?cn=is&c=23&pl=VAST&pli=6583370&PluID=0&pos=7070&ord=4288438534]&cim=1</VASTAdTagURI><Impression>http://ad.afy11.net/ad?ipc=NMUsqYdyBUCjh4-i2HwWfK1oILM2AAAAN6-rBkSy8JNMZcuzAlj1XlSySpo6Hi7xEYULS+UgOVN5D3UuhFUVSWbFHoLE-+3su0-QnGgZgMJyiTm-R6O+yQ==</Impression><Creatives/></Wrapper></Ad></VAST>
Not 100% of the time (they do cap the number of requests), but WAY more often than when I try to save the file using the PHP script.
Is there a way to make the PHP script mimic a browser? I don't know if this is the right question, but that's the only thing I can think of to explain why I get an empty VAST tag when using the PHP script and a full tag when using the browser.
Any ideas?
Thanks :)
Update: After doing some extra research, I found some info about the stream_context_create() function, but I haven't been able to duplicate the browser's results.
Here's my new code:
<?php
$tid = time();
$opts = array('http' =>
    array(
        'method' => 'GET',
        //'user_agent' => "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2) Gecko/20100301 Ubuntu/9.10 (karmic) Firefox/3.6",
        'header' => array(
            'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'
        ),
    )
);
$context = stream_context_create($opts);
$xml1 = file_get_contents('http://ad.afy11.net/ad?enc=4&asId=1000009566807&sf=0&ct=256');
file_put_contents("downloads/file1_$tid.xml", $xml1);
echo "<p>file 1 recorded</p>";
echo "<textarea rows='6' cols='80'> $xml1 </textarea> ";
echo "<br><iframe src='http://ad.afy11.net/ad?enc=4&asId=1000009566807&sf=0&ct=256' width='960' height='300'></iframe>";
?>
I also added an iframe to compare when the browser gets the right file and the PHP function does not.
After some research I found a solution to my problem, and I would like to share it here for future reference.
The idea was to pass some HTTP headers with file_get_contents. I accomplished that with this:
$opts = array(
    'http' => array(
        'method' => "GET",
        'header' => array("Accept-language: en", "Content-Type: multipart/form-data\r\n"),
        'user_agent' => $_SERVER['HTTP_USER_AGENT']
    )
);
$context = stream_context_create($opts);
$xml4 = file_get_contents($url1, false, $context);
That's it, now I can get the same xml as if I was using the browser.
