Parsing XML with PHP? - php

This has been driving me insane for about the last hour. I'm trying to parse a bit of XML from Last.fm's API, and I've tried about 35 permutations of the code below, all of which have failed. I'm really bad at XML parsing. Can anyone help me get the first toptags > tag > name value from this XML API in PHP? :(
http://ws.audioscrobbler.com/2.0/?method=track.getinfo&api_key=b25b959554ed76058ac220b7b2e0a026&artist=Owl+city&track=fireflies
Which in that case ^ would be 'electronic'
Right now, all I have is this
<?
$xmlstr = file_get_contents("http://ws.audioscrobbler.com/2.0/?method=track.getinfo&api_key=b25b959554ed76058ac220b7b2e0a026&artist=Owl+city&track=fireflies");
$genre = new SimpleXMLElement($xmlstr);
echo $genre->lfm->track->toptags->tag->name;
?>
This returns a blank page, with no errors either, which is incredibly annoying!
Any help is really, really greatly appreciated. Thank you very much! :)

The response contains several <tag> elements, so SimpleXML exposes them as a list that you can index or loop over with foreach. In your case, grabbing just the first looks like this:
<?php
$xmlstr = file_get_contents("http://ws.audioscrobbler.com/2.0/?method=track.getinfo&api_key=b25b959554ed76058ac220b7b2e0a026&artist=Owl+city&track=fireflies");
$genre = new SimpleXMLElement($xmlstr);
echo $genre->track->toptags->tag[0]->name;
Also note that <lfm> must not appear in the path: it is the root element, which $genre already represents.
UPDATE
I find it's much easier to grab exactly what I'm looking for in a SimpleXMLElement by using print_r(). It'll show you what's an array, what's a simple string, what's another SimpleXMLElement, etc.
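To make the indexing and looping concrete, here is a minimal sketch that runs offline. The XML string is a hypothetical, trimmed-down stand-in shaped like the track.getinfo response (the second tag name is invented for illustration); the real response has many more fields.

```php
<?php
// Hypothetical sample shaped like the Last.fm response, trimmed for brevity.
$xmlstr = <<<XML
<lfm status="ok">
  <track>
    <name>Fireflies</name>
    <toptags>
      <tag><name>electronic</name></tag>
      <tag><name>pop</name></tag>
    </toptags>
  </track>
</lfm>
XML;

// $lfm is the root <lfm> element itself, so navigation starts at ->track.
$lfm = new SimpleXMLElement($xmlstr);

$names = [];
foreach ($lfm->track->toptags->tag as $tag) {
    // Properties are SimpleXMLElement objects; cast to get plain strings.
    $names[] = (string) $tag->name;
}

echo $names[0], "\n";              // first tag only
echo implode(', ', $names), "\n";  // every tag
```

Casting to string matters once you store or compare the values; echo does the conversion implicitly, which is why the one-liner above works without it.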

Try using
$url = "http://ws.audioscrobbler.com/2.0/?method=track.getinfo&api_key=b25b959554ed76058ac220b7b2e0a026&artist=Owl+city&track=fireflies";
$xml = simplexml_load_file($url);
echo $xml->track->toptags->tag[0]->name;

Suggestion: insert a statement to echo $xmlstr, and make sure you are getting something back from the API.

You don't need to reference lfm. Actually, $genre already is lfm. Try this:
echo $genre->track->toptags->tag->name;

If you want to read XML data, follow these steps:
$xmlURL = "your xml url / file name goes here";
try {
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $xmlURL);
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Content-type: text/xml'
));
$content = curl_exec($ch);
$error = curl_error($ch);
curl_close($ch);
$obj = new SimpleXMLElement($content);
echo "<pre>";
var_dump($obj);
echo "</pre>";
}
catch(Exception $e){
var_dump($e);exit;
}
You will get a dump of the whole XML file's structure.
Thanks.

Related

Learning how to parse basic json in to php echo of table

I know this has been done to death, but I am really struggling.
I have put together a web service that generates JSON.
I understand this part:
// create a new cURL resource
$ch = curl_init();
// set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://marcom.domain.com/corp/pub.php");
curl_setopt($ch, CURLOPT_HEADER, 0);
This dumps [{"Torders":"3222","name":"john"},{"Torders":"579","name":"Kevin"}] onto the page.
There are 5 entries in total, but I am keeping it simple here.
I don't understand how to get these into an array so that my end result is a list of name and Torders.
The rendered HTML would be something like this:
<li>john 3222</li>
<li>Kevin 579 </li>
Please don't send me to the PHP manual page for json_decode, because I am struggling to understand it.
Thank you.
<?php
$json = file_get_contents('http://marcom.domain.com/corp/pub.php');
$people = json_decode($json);
?>
<ul>
<?php foreach ($people as $person): ?>
<li><?=$person->name?> <?=$person->Torders?></li>
<?php endforeach; ?>
</ul>
If you want to use curl, you'll want to set curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); and get the return from curl_exec()
You need to use json_decode():
$j = '[{"Torders":"3222","name":"john"},{"Torders":"579","name":"Kevin"}]';
$data = json_decode($j,true);
//print_r($data);
echo '<ul>';
foreach($data as $key=>$val){
echo '<li>'.$val['name'].' '.$val['Torders'].'</li>';
}
echo '</ul>';
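Building on the json_decode() answers above, here is a slightly more defensive sketch. The helper name render_order_list() is hypothetical; it checks that decoding actually succeeded and escapes the values before writing them into HTML.

```php
<?php
/**
 * Turn the JSON payload into an HTML list.
 * render_order_list() is a made-up helper name for this sketch.
 */
function render_order_list(string $json): string
{
    // true: decode JSON objects as associative arrays.
    $rows = json_decode($json, true);
    if (!is_array($rows)) {
        // json_decode() returns null when the input is not valid JSON.
        return '<p>Could not decode JSON: ' . json_last_error_msg() . '</p>';
    }

    $html = "<ul>\n";
    foreach ($rows as $row) {
        // Escape values so stray < or & in the data cannot break the page.
        $name   = htmlspecialchars((string) ($row['name'] ?? ''));
        $orders = htmlspecialchars((string) ($row['Torders'] ?? ''));
        $html  .= "<li>{$name} {$orders}</li>\n";
    }
    return $html . "</ul>\n";
}

echo render_order_list('[{"Torders":"3222","name":"john"},{"Torders":"579","name":"Kevin"}]');
```

In a real page the string would come from file_get_contents() or curl_exec() rather than a literal.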

Scrape a statistic from YouTube using PHP

After struggling for 3 hours at trying to do this on my own, I have decided that it is either not possible or not possible for me to do on my own. My question is as follows:
How can I scrape the numbers in the attached image using PHP to echo them in a webpage?
Image URL: http://gyazo.com/6ee1784a87dcdfb8cdf37e753d82411c
Please help. I have tried almost everything, from using cURL, to using a regex, to trying an xPath. Nothing has worked the right way.
I only want the numbers by themselves in order for them to be isolated, assigned to a variable, and then echoed elsewhere on the page.
Update:
http://youtube.com/exonianetwork - The URL I am trying to scrape.
/html/body[@class='date-20121213 en_US ltr ytg-old-clearfix guide-feed-v2 site-left-aligned exp-new-site-width exp-watch7-comment-ui webkit webkit-537']/div[@id='body-container']/div[@id='page-container']/div[@id='page']/div[@id='content']/div[@id='branded-page-default-bg']/div[@id='branded-page-body-container']/div[@id='branded-page-body']/div[@class='channel-tab-content channel-layout-two-column selected blogger-template ']/div[@class='tab-content-body']/div[@class='secondary-pane']/div[@class='user-profile channel-module yt-uix-c3-module-container ']/div[@class='module-view profile-view-module']/ul[@class='section'][1]/li[@class='user-profile-item '][1]/span[@class='value']
The xPath I tried, which didn't work for some unknown reason. No exceptions or errors were thrown, and nothing was displayed.
Perhaps a simple XPath would be easier to manipulate and debug.
Here's a Short Self-Contained Correct Example (watch for the space at the end of the class name):
#!/usr/bin/env php
<?php
$url = "http://youtube.com/exonianetwork";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$html = curl_exec($ch);
if (!$html)
{
print "Failed to fetch page. Error handling goes here";
}
curl_close($ch);
$dom = new DOMDocument();
// @ suppresses the warnings loadHTML() emits for real-world, non-validating HTML
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$profile_items = $xpath->query("//li[@class='user-profile-item ']/span[@class='value']");
if ($profile_items->length === 0) {
print "No values found\n";
} else {
foreach ($profile_items as $profile_item) {
printf("%s\n", $profile_item->textContent);
}
}
?>
Execute:
% ./scrape.php
57
3,593
10,659,716
113,900
United Kingdom
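Because an exact @class comparison breaks as soon as the trailing space or the class list changes, a token-style match may be more forgiving. The sketch below is a variation, not the answer's original query, and the HTML fragment is a hypothetical imitation of the channel markup so that it runs offline.

```php
<?php
// Hypothetical fragment imitating the channel markup; note the trailing
// space inside the class attribute.
$html = <<<HTML
<ul class="section">
  <li class="user-profile-item "><span class="item-name">Views:</span> <span class="value">10,659,716</span></li>
  <li class="user-profile-item "><span class="item-name">Subscribers:</span> <span class="value">113,900</span></li>
</ul>
HTML;

libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// Match "user-profile-item" as a whole class token, so extra classes or
// stray whitespace in the attribute do not matter.
$query = "//li[contains(concat(' ', normalize-space(@class), ' '), ' user-profile-item ')]"
       . "/span[@class='value']";

$values = [];
foreach ($xpath->query($query) as $span) {
    $values[] = trim($span->textContent);
}
print_r($values);
```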
If you are willing to try a regex again, this pattern should work:
!Network Videos:</span>\r\n +<span class=\"value\">([\d,]+).+Views:</span>\r\n +<span class=\"value\">([\d,]+).+Subscribers:</span>\r\n +<span class=\"value\">([\d,]+)!s
It captures the numbers with their embedded commas, which would then need to be stripped out. I'm not familiar with PHP, so cannot give you more complete code
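For completeness, here is roughly how that pattern could be used from PHP. This is only a sketch: the $html string is a hypothetical snippet built to match the line endings and indentation the pattern expects, and the real page source may differ.

```php
<?php
// Hypothetical page source with CRLF line endings and indented value spans.
$eol  = "\r\n";
$html = '<span class="item-name">Network Videos:</span>' . $eol
      . '    <span class="value">3,593</span>' . $eol
      . '<span class="item-name">Views:</span>' . $eol
      . '    <span class="value">10,659,716</span>' . $eol
      . '<span class="item-name">Subscribers:</span>' . $eol
      . '    <span class="value">113,900</span>' . $eol;

// Same pattern as above; "!" is the delimiter and the s flag lets "." span lines.
$pattern = '!Network Videos:</span>\r\n +<span class="value">([\d,]+)'
         . '.+Views:</span>\r\n +<span class="value">([\d,]+)'
         . '.+Subscribers:</span>\r\n +<span class="value">([\d,]+)!s';

$stats = null;
if (preg_match($pattern, $html, $m)) {
    // Strip the thousands separators before converting to integers.
    $stats = [
        'videos'      => (int) str_replace(',', '', $m[1]),
        'views'       => (int) str_replace(',', '', $m[2]),
        'subscribers' => (int) str_replace(',', '', $m[3]),
    ];
}
var_dump($stats);
```

If the page is served with plain \n line endings, the \r\n parts of the pattern would need to be relaxed (for example to \s+).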

RSS parse with PHP gives nothing?

I am trying to parse a rss-feed with php but it gives me nothing. As you can see I am trying to echo both the $doc and the $itemRSS array and it gives me nothing and the "Success!" for when item 0 in the array is never reached.
I would be thrilled if someone could say "have you thought about [idiotic mistake]?", so please consider me a noob for this question. Thank you!
$doc = new DOMDocument();
$doc->load('http://pipes.yahoo.com/pipes/pipe.run?_id=566903fd393811762dc74aadc701badd&_render=rss');
$arrFeeds = array();
foreach ($doc->getElementsByTagName('item') as $node) {
$itemRSS = array (
'guid' => $node->getElementsByTagName('guid')->item(0)->nodeValue
);
array_push($arrFeeds, $itemRSS);
}
if ($itemRSS[0] != NULL) {
echo 'Success!';
}
echo $itemRSS;
echo $doc;
And by this I mean that the page is completely blank. No error, nothing.
Update:
Apparently my web host has allow_url_fopen deactivated, so I have to find another way to do this. *sigh*
As mentioned in the other answers, you're doing a couple of things wrong, such as trying to echo arrays and objects. For some reason you're not getting any results in $arrFeeds either, although you should.
A simpler way to do this is change the render method of the feed to JSON: http://pipes.yahoo.com/pipes/pipe.run?_id=566903fd393811762dc74aadc701badd&_render=json
Then you can use json_decode() to get an array of all items:
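// The second argument (true) makes json_decode() return associative arrays instead of objects
$contents = json_decode(file_get_contents('http://pipes.yahoo.com/pipes/pipe.run?_id=566903fd393811762dc74aadc701badd&_render=json'), true);
// Adjust the key path if the items turn out to be nested deeper in the decoded structure
foreach($contents['items'] as $item) {
// use $item['title'], $item['description'] etc...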
}
Note you can only echo strings, ints, etc. not structured data such as arrays or objects.
To make it easier, you can open that URL in your browser and analyze the JSON contents at http://json.parser.online.fr/ - you'll see how your array is structured then.
JSON is a much easier format to work with IMO.
Edit:
Since allow_url_fopen is disabled, file_get_contents() cannot fetch remote URLs. You can use cURL instead (it is installed on most servers, and hosts that disable allow_url_fopen usually provide it):
function file_get_contents_curl($url) {
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); //Set curl to return the data instead of printing it to the browser.
curl_setopt($ch, CURLOPT_URL, $url);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
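Once the feed is available as a string (for instance from file_get_contents_curl() above), it can be parsed with DOMDocument::loadXML() instead of load(). A minimal sketch, assuming a made-up two-item feed in place of the real pipe output:

```php
<?php
// Hypothetical two-item feed standing in for the string returned by the fetch.
$rss = <<<RSS
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First</title><guid>http://example.com/1</guid></item>
    <item><title>Second</title><guid>http://example.com/2</guid></item>
  </channel>
</rss>
RSS;

$doc = new DOMDocument();
$doc->loadXML($rss); // loadXML() parses a string; load() expects a URL or file path

$arrFeeds = [];
foreach ($doc->getElementsByTagName('item') as $node) {
    $arrFeeds[] = [
        'guid' => $node->getElementsByTagName('guid')->item(0)->nodeValue,
    ];
}

// Test the collected list rather than the last $itemRSS.
if (count($arrFeeds) > 0) {
    echo "Success!\n";
}
print_r($arrFeeds); // print_r() or var_dump(), because echo cannot print arrays
```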
Add these lines to the end of your script and check the output:
var_dump($itemRSS);
var_dump($arrFeeds);
var_dump($doc);
The problem is that you are not displaying the information properly.
There is nothing wrong with the script other than that you are trying to echo one object and one array. Use print_r($arrFeeds) and you can see everything.

PHP Regex for IP to Location API

How would I use a regex to get the information from an IP-to-location API?
This is the API
http://ipinfodb.com/ip_query.php?ip=74.125.45.100
I would need to get the Country Name, Region/State, and City.
I tried this:
$ip = $_SERVER["REMOTE_ADDR"];
$contents = @file_get_contents('http://ipinfodb.com/ip_query.php?ip=' . $ip . '');
$pattern = "/<CountryName>(.*)<CountryName>/";
preg_match($pattern, $contents, $regex);
$regex = !empty($regex[1]) ? $regex[1] : "FAIL";
echo $regex;
When I echo $regex I always get FAIL. How can I fix this?
As Aaron has suggested, it is best not to reinvent the wheel, so try parsing it with simplexml_load_string():
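// URL of the API, built from the $ip in your question
$url = 'http://ipinfodb.com/ip_query.php?ip=' . $ip;
// Init the CURL
$curl = curl_init();
// Setup the curl settings
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
// 0 means wait indefinitely for the connection
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 0);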
// grab the XML file
$raw_xml = curl_exec($curl);
curl_close($curl);
// Setup the xml object
$xml = simplexml_load_string( $raw_xml );
You can now access any part of the $xml variable as an object. With that in mind, here is the response from the API you posted:
<Response>
<Ip>74.125.45.100</Ip>
<Status>OK</Status>
<CountryCode>US</CountryCode>
<CountryName>United States</CountryName>
<RegionCode>06</RegionCode>
<RegionName>California</RegionName>
<City>Mountain View</City>
<ZipPostalCode>94043</ZipPostalCode>
<Latitude>37.4192</Latitude>
<Longitude>-122.057</Longitude>
<Timezone>0</Timezone>
<Gmtoffset>0</Gmtoffset>
<Dstoffset>0</Dstoffset>
</Response>
After you have loaded this XML string with simplexml_load_string(), you can access the response's IP address like so (element names are case-sensitive, so it is Ip, not IP):
$xml->Ip;
simplexml_load_string() transforms well-formed XML into an object that you can manipulate. The only other thing I can say is go and try it out and play with it.
EDIT:
Source
http://www.php.net/manual/en/function.simplexml-load-string.php
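As a rough sketch of pulling out the three fields the question asks for, here is the sample response from above (trimmed to the relevant elements) parsed from a string, so it runs without calling the API:

```php
<?php
// Sample response copied from above, trimmed to the fields of interest.
$raw_xml = <<<XML
<Response>
  <Ip>74.125.45.100</Ip>
  <Status>OK</Status>
  <CountryName>United States</CountryName>
  <RegionName>California</RegionName>
  <City>Mountain View</City>
</Response>
XML;

$xml = simplexml_load_string($raw_xml);
if ($xml === false) {
    // simplexml_load_string() returns false when the XML cannot be parsed.
    exit("Could not parse the response\n");
}

// Element names are case-sensitive; cast to string to get plain values.
$location = [
    'country' => (string) $xml->CountryName,
    'region'  => (string) $xml->RegionName,
    'city'    => (string) $xml->City,
];
echo implode(', ', $location), "\n";
```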
You really are better off using an XML parser to pull the information.
For example, this script will parse it into an array.
Regex really shouldn't be used to parse HTML or XML.
If you really need to use regular expressions, then you should correct the one you are using. "|<CountryName>([^<]*)</CountryName>|i" would work better.

Parsing XML data with Namespaces in PHP

I'm trying to work with this XML feed that uses namespaces, and I'm not able to get past the colon in the tags. Here's what the XML feed looks like:
<r25:events pubdate="2010-05-19T13:58:08-04:00">
<r25:event xl:href="event.xml?event_id=328" id="BRJDMzI4" crc="00000022" status="est">
<r25:event_id>328</r25:event_id>
<r25:event_name>Testing 09/2005-08/2006</r25:event_name>
<r25:alien_uid/>
<r25:event_priority>0</r25:event_priority>
<r25:event_type_id xl:href="evtype.xml?type_id=105">105</r25:event_type_id>
<r25:event_type_name>CABINET</r25:event_type_name>
<r25:node_type>C</r25:node_type>
<r25:node_type_name>cabinet</r25:node_type_name>
<r25:state>1</r25:state>
<r25:state_name>Tentative</r25:state_name>
<r25:event_locator>2005-AAAAMQ</r25:event_locator>
<r25:event_title/>
<r25:favorite>F</r25:favorite>
<r25:organization_id/>
<r25:organization_name/>
<r25:parent_id/>
<r25:cabinet_id xl:href="event.xml?event_id=328">328</r25:cabinet_id>
<r25:cabinet_name>cabinet 09/2005-08/2006</r25:cabinet_name>
<r25:start_date>2005-09-01</r25:start_date>
<r25:end_date>2006-08-31</r25:end_date>
<r25:registration_url/>
<r25:last_mod_dt>2008-02-27T14:22:43-05:00</r25:last_mod_dt>
<r25:last_mod_user>abc00296004</r25:last_mod_user>
</r25:event>
</r25:events>
And here is the code I'm using. I'm trying to put these into a bunch of arrays so that I can format the output however I want:
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://somedomain.com/blah.xml");
curl_setopt ($ch, CURLOPT_HTTPHEADER, Array("Content-Type: text/xml"));
curl_setopt($ch, CURLOPT_USERPWD, "username:password");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($ch);
curl_close($ch);
$xml = new SimpleXmlElement($output);
foreach ($xml->events->event as $entry){
$dc = $entry->children('http://www.collegenet.com/r25');
echo $entry->event_name . "<br />";
echo $entry->event_id . "<br /><br />";
}
Figured out the issue was with the XML feed rather than code:
XML feed was missing this line:
<r25:events xmlns:r25="http://www.collegenet.com/r25" xmlns:xl="http://www.w3.org/1999/xlink" pubdate="2010-05-19T13:58:08-04:00">
Thanks for the help though.
"All kinds of errors" isn't a helpful description; what errors are you actually getting?
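You should give the object a namespace option like this:
$xml = new SimpleXMLElement($output, 0, false, 'r25', true);
The fourth argument is the namespace or prefix, and the fifth (true) tells SimpleXML that 'r25' is a prefix rather than a namespace URI. See the manual.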
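Alternatively, since nearly everything in the feed is in the r25 namespace, the prefix is not especially helpful, so I just strip it from the raw string before parsing:
$output = preg_replace('/r25:/', '', $output);
$xml = new SimpleXMLElement($output);
That removes the r25: prefix from the element names. You can then navigate much more easily with SimpleXML, just like in your example.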
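If you would rather keep the namespaces, children() and attributes() accept the namespace URI directly. The sketch below assumes the feed declares xmlns:r25 and xmlns:xl as in the corrected root element shown earlier, and uses a shortened copy of the sample event so it runs offline.

```php
<?php
// Shortened copy of the feed, including the namespace declarations.
$output = <<<XML
<r25:events xmlns:r25="http://www.collegenet.com/r25" xmlns:xl="http://www.w3.org/1999/xlink" pubdate="2010-05-19T13:58:08-04:00">
  <r25:event xl:href="event.xml?event_id=328" id="BRJDMzI4">
    <r25:event_id>328</r25:event_id>
    <r25:event_name>Testing 09/2005-08/2006</r25:event_name>
  </r25:event>
</r25:events>
XML;

$r25 = 'http://www.collegenet.com/r25';
$xl  = 'http://www.w3.org/1999/xlink';

// $xml is the <r25:events> root, so there is no ->events step in the path.
$xml = new SimpleXMLElement($output);

$events = [];
foreach ($xml->children($r25)->event as $event) {
    // children() must be called again at each level to stay in the namespace.
    $fields   = $event->children($r25);
    $events[] = [
        'id'   => (string) $fields->event_id,
        'name' => (string) $fields->event_name,
        'href' => (string) $event->attributes($xl)->href,
    ];
}
print_r($events);
```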
