Reading RSS feed - php

I need to read an XML file. I watched some tutorials and tried different solutions, but for some reason I can't figure out why it doesn't work.
The XML file that I want to read: http://www.voetbalzone.nl/rss/rss.xml
This is the code that I'm using:
$xml= "http://www.voetbalzone.nl/rss/rss.xml"
for ($i = 0; $i < 10; $i++)
{
$title = $xml->rss->channel->item[$i]->title;
}
The error I get: Premature end of data in tag

It works for me like this:
<?php
$xml = simplexml_load_file("http://www.voetbalzone.nl/rss/rss.xml");
for ($i = 0; $i < 10; $i++)
{
$title = $xml->channel->item[$i]->title;
}
?>
Note that you are overwriting the variable $title on each iteration, so after the loop finishes it holds only the title of the 10th element (I assume that is not what you want).
To get all 'item' elements inside 'channel' as an array to iterate over, you can use XPath like this:
<?php
$xml = simplexml_load_file("http://www.voetbalzone.nl/rss/rss.xml");
$item_array = $xml->xpath("//rss/channel/item");
foreach($item_array as $item) {
echo $item->title . "\n";
}
?>
I would suggest reading about PHP's SimpleXML here: http://php.net/manual/en/book.simplexml.php
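To keep every title rather than only the last one, the loop can collect them into an array. This is a minimal sketch that uses an inline sample in place of the real feed; the sample headlines and the variable name $titles are made up for illustration, and the structure (rss > channel > item > title) is assumed from the code above.

```php
<?php
// Inline sample standing in for the real feed (assumed structure: rss > channel > item > title).
$rss = <<<XML
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <item><title>First headline</title></item>
    <item><title>Second headline</title></item>
    <item><title>Third headline</title></item>
  </channel>
</rss>
XML;

// The returned object is the root <rss> element, so channel is reached directly.
$xml = simplexml_load_string($rss);

$titles = [];
foreach ($xml->channel->item as $item) {
    // Cast to string so the array holds plain text, not SimpleXMLElement objects.
    $titles[] = (string) $item->title;
    if (count($titles) >= 10) {
        break; // keep at most the first 10 items
    }
}

print_r($titles);
```

Using foreach with a cap also avoids notices when the feed happens to contain fewer than 10 items.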

Related

How to get image url by page in PHP

This is my code :
<form method="POST">
<input name="link">
<button type="submit">></button>
</form>
<title>GET IMAGE URL</title>
<?php
if (!isset($_POST['link'])) exit();
$link = $_POST['link'];
$parse = explode('.html', $link);
echo '<div id="pin" style="float:center"><textarea class="text" cols="110" rows="50">';
for ($i = 1; $i <=5; $i++)
{
if ($i > 1)
$link = "$parse[0]-$i.html";
$get = file_get_contents($link);
if (preg_match_all('/src="(.*?)"/', $get, $matches))
{
foreach ($matches[1] as $content)
echo $content."\r\n";
}
}
echo '</textarea>';
The gallery I'm trying to get the img src values from spans 10 to 15 pages, so I want my code to get all the img URLs up to the last page. How can I do that without the fixed loop?
If I use:
for ($i = 1; $i <=5; $i++)
this will get the img URLs of only 5 pages, but I want it to continue until the end. Then I don't need to edit the loop every time I submit another URL with a different number of pages.
From this:
"this will get only 5 page img urls, but I want to make it get until the end. Then I don't need to edit the loop everytime I submit another URL with a different number of pages."
I understand that your problem is the dynamic number of pages. Your URLs have a "next page" link at the bottom:
下一页 ("Next page")
Detect it and fetch your images in a while loop:
<?php
// Link given in form
$link = "http://www.xiumm.org/photos/XiuRen-17305.html";
$parse = explode('.html', $link);
$i=1;
// Initialize a boolean
$nextPageFound = true;
while($nextPageFound) {
// Construct URL Every time when nextPageFound
if ($i == 1) {
$url = "$parse[0].html";
echo "First Page<br><br>";
} else {
$url = "$parse[0]-$i.html";
}
// Getting URL Contents
$get = file_get_contents($url);
if (preg_match_all('/src="(.*?)"/', $get, $matches))
{
// echoing contents
foreach ($matches[1] as $content)
echo $content."<br>";
}
// check nextPageBtn if available
if (strpos($get, '"nextPageBtn"') !== false) {
$nextPageFound = true;
// increment +1
$i++;
echo "<br>Page $i<br><br>";
} else {
$nextPageFound = false;
echo "THE END";
}
}
?>
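A variation on the while loop above, as a rough sketch: it adds an upper bound on the number of pages and stops when a fetch fails, so a missing page or an unexpected marker cannot cause an endless loop. The function name collectImageUrls, the $maxPages parameter, the example.test URLs and the in-memory $pages map (a stand-in for file_get_contents()) are all hypothetical. The regex is also narrowed to img tags, which is a change from the original pattern.

```php
<?php
// Hypothetical stand-in for file_get_contents(): maps URLs to page bodies.
$pages = [
    'http://example.test/gallery.html'   => '<img src="a.jpg"><a class="nextPageBtn" href="#">next</a>',
    'http://example.test/gallery-2.html' => '<img src="b.jpg"><a class="nextPageBtn" href="#">next</a>',
    'http://example.test/gallery-3.html' => '<img src="c.jpg">',
];
$fetch = function (string $url) use ($pages) {
    return $pages[$url] ?? false;
};

function collectImageUrls(string $firstUrl, callable $fetch, int $maxPages = 50): array
{
    // Strip the trailing ".html" so page numbers can be appended.
    $base = preg_replace('/\.html$/', '', $firstUrl);
    $urls = [];

    for ($page = 1; $page <= $maxPages; $page++) {
        $url = $page === 1 ? $base . '.html' : $base . '-' . $page . '.html';
        $html = $fetch($url);
        if ($html === false) {
            break; // fetch failed: stop instead of looping forever
        }
        // Collect the src of every img tag on this page.
        if (preg_match_all('/<img[^>]+src="([^"]*)"/i', $html, $matches)) {
            $urls = array_merge($urls, $matches[1]);
        }
        // Same marker as in the answer above: no marker means last page.
        if (strpos($html, '"nextPageBtn"') === false) {
            break;
        }
    }

    return $urls;
}

$images = collectImageUrls('http://example.test/gallery.html', $fetch);
print_r($images);
```

For real pages, $fetch could simply wrap file_get_contents(); passing it in as a callable keeps the loop testable without network access.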
You should use an HTML/XML parser, like DOMDocument, in combination with DOMXPath (XPath is a query language for (X)HTML data structures):
// create DOMDocument
$doc = new DOMDocument();
// load remote HTML file
$doc->loadHTMLFile( $link );
// create DOMXPath
$xpath = new DOMXPath( $doc );
// fetch all IMG elements that have a src attribute
$nodes = $xpath->query( '//img[@src]' );
// loop through found IMG elements and echo their src attribute values
for( $i = 0; $i < $nodes->length; $i++ ) {
echo $nodes->item( $i )->getAttribute( 'src' ) . PHP_EOL;
}
Regarding the XPath query //div[contains(@class,'pic_box')]//@src, mentioned by @Enuma in the comments:
The resulting DOMNodeList of that query will not contain DOMElement objects, but DOMAttr objects, because the query directly asks for attributes, not elements. Since DOMAttr represents an attribute and not an element, the method getAttribute() does not exist. To get the value of the attribute you have to use the property DOMAttr->value.
So, we have to slightly alter the relevant part of our example code from above to:
// loop through found src attributes and echo their value
for( $i = 0; $i < $nodes->length; $i++ ) {
echo $nodes->item( $i )->value . PHP_EOL;
}
Putting it all together, our example code then becomes:
// create DOMDocument
$doc = new DOMDocument();
// load remote HTML file
$doc->loadHTMLFile( $link );
// create DOMXPath
$xpath = new DOMXPath( $doc );
// fetch all src attributes that are descendants of div.pic_box
$nodes = $xpath->query( "//div[contains(@class,'pic_box')]//@src" );
// loop through found src attributes and echo their value
for( $i = 0; $i < $nodes->length; $i++ ) {
echo $nodes->item( $i )->value . PHP_EOL;
}
PS: For DOMDocument to be able to load remote files, I believe some PHP config settings may need to be set, which I don't know off the top of my head right now. But since it already appeared to be working for @Enuma, it's not actually relevant here. Perhaps I'll look them up later.
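As a self-contained sketch of the approach above, the same query can be run against an inline HTML string, which avoids the remote-loading question altogether. The sample markup is invented. Real-world HTML is often malformed, so libxml_use_internal_errors() is used here to keep parser warnings out of the output. The allow_url_fopen check is an assumption: it is the php.ini setting that controls URL access through PHP's stream wrappers, and may be the setting the PS refers to.

```php
<?php
// Assumption: allow_url_fopen is the setting that remote loading depends on in your setup.
$remoteAllowed = (bool) ini_get('allow_url_fopen');

// Inline sample standing in for the remote page (made-up markup).
$html = '<html><body><div class="pic_box"><img src="one.jpg"><img src="two.jpg"></div>'
      . '<img src="banner.png"></body></html>';

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);              // loadHTML() parses a string; loadHTMLFile() takes a path or URL
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$srcs = [];
foreach ($xpath->query("//div[contains(@class,'pic_box')]//@src") as $attr) {
    $srcs[] = $attr->value;         // each result is a DOMAttr, so read ->value
}

print_r($srcs);
```

The banner image is skipped because it is not a descendant of the div with class pic_box.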

Loop through xml nodes with simplexml

I'm trying to create a table of jobs on my site, pulling info from an XML feed I have access to. I've looked at various examples online and videos, but I can't seem to understand how it works. My XML feed returns the following node structure:
<OutputVacancyAsXml>
<Vacancy>
<VacancyID></VacancyID>
<Job></Job>
<ClosingDate></ClosingDate>
</Vacancy>
</OutputVacancyAsXml>
I've had success with pulling through one item with this code:
<?php
$x = simplexml_load_file('https://www.octopus-hr.co.uk/recruit/OutputVacancyAsXml.aspx?CompanyID=400-73A3BCA1-D952-4BA6-AADB-D8BF3B495DF6');
echo $x->Vacancy[5]->Job;
?>
But converting it to foreach seems to be where I'm struggling. Here's the code I have tried so far with no luck:
<?php
$html = "";
$url = "https://www.octopus-hr.co.uk/recruit/OutputVacancyAsXml.aspx?CompanyID=400-73A3BCA1-D952-4BA6-AADB-D8BF3B495DF6";
$xml = simplexml_load_file($url);
for ($i = 0; $i < 10; $i++) {
$title = $xml->OutputVacancyAsXml->Vacancy[$i]->job;
$html .= "<p>$title</p>";
}
echo $html;
?>
Thanks all :)
Taken from the documentation:
Note:
Properties ($movies->movie in previous example) are not arrays. They
are iterable and accessible objects.
With that in mind, you can simply run over the nodes with foreach:
$xml = simplexml_load_file($url);
// $xml is already the root <OutputVacancyAsXml> element, so Vacancy is addressed directly
foreach ($xml->Vacancy as $vacancy)
{
echo (string)$vacancy->Job; // Echo out the Job Title
}
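The point from the documentation can be tried without the live feed. Below is a minimal sketch with an inline sample that follows the node structure from the question; the job titles, IDs and dates are invented. Note that the object returned by the loader is the root element itself, so Vacancy is reached directly from it, and element names are case-sensitive (Job, not job).

```php
<?php
// Inline sample following the structure in the question (values are made up).
$feed = <<<XML
<OutputVacancyAsXml>
  <Vacancy><VacancyID>1</VacancyID><Job>Developer</Job><ClosingDate>2024-01-31</ClosingDate></Vacancy>
  <Vacancy><VacancyID>2</VacancyID><Job>Designer</Job><ClosingDate>2024-02-15</ClosingDate></Vacancy>
</OutputVacancyAsXml>
XML;

$xml = simplexml_load_string($feed);   // $xml is the root <OutputVacancyAsXml> element

$html = '';
foreach ($xml->Vacancy as $vacancy) {
    // Cast to string, then escape before embedding in HTML.
    $html .= '<p>' . htmlspecialchars((string) $vacancy->Job) . '</p>';
}

// The property is not an array, but it is countable.
$total = count($xml->Vacancy);

echo $html, "\n";
```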
OK, looks like I found a solution. Here's the code that worked for me; it also contains a little bit of code that filters out duplicates (it was displaying each item 4 times!):
<?php
$x = simplexml_load_file('https://www.octopus-hr.co.uk/recruit/OutputVacancyAsXml.aspx?CompanyID=400-73A3BCA1-D952-4BA6-AADB-D8BF3B495DF6');
$num = count($x->Vacancy);
//echo "num is $num";
$stopduplicates = array();
for ($i = 0; $i < $num; $i++) {
$job = $x->Vacancy[$i]->Job;
$closingdate = $x->Vacancy[$i]->ClosingDate;
// http://stackoverflow.com/questions/416548/forcing-a-simplexml-object-to-a-string-regardless-of-context
$vacancyid = (string) $x->Vacancy[$i]->VacancyID;
if (!in_array($vacancyid, $stopduplicates)) {
echo '
<tr class="job-row">
<td class="job-cell">'.$job.'</td>
<td class="date-cell">'.$closingdate.'</td>
<td class="apply-cell">
Apply Here
</td>
</tr>';
}
$stopduplicates[] = $vacancyid;
} //print_r($stopduplicates);
?>
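A possible variation on the duplicate filter above, as a sketch: using the VacancyID as an array key makes the "already seen" check an isset() lookup instead of an in_array() scan, and foreach removes the need for count(). The sample data and the names $seen and $rows are made up for illustration.

```php
<?php
// Inline sample with a repeated vacancy (values are made up).
$feed = <<<XML
<OutputVacancyAsXml>
  <Vacancy><VacancyID>A1</VacancyID><Job>Developer</Job><ClosingDate>2024-01-31</ClosingDate></Vacancy>
  <Vacancy><VacancyID>A1</VacancyID><Job>Developer</Job><ClosingDate>2024-01-31</ClosingDate></Vacancy>
  <Vacancy><VacancyID>B2</VacancyID><Job>Designer</Job><ClosingDate>2024-02-15</ClosingDate></Vacancy>
</OutputVacancyAsXml>
XML;

$xml = simplexml_load_string($feed);

$seen = [];   // VacancyID => true for every ID already emitted
$rows = [];   // de-duplicated [job, closing date] pairs

foreach ($xml->Vacancy as $vacancy) {
    $id = (string) $vacancy->VacancyID;
    if (isset($seen[$id])) {
        continue; // duplicate entry: skip it
    }
    $seen[$id] = true;
    $rows[] = [(string) $vacancy->Job, (string) $vacancy->ClosingDate];
}

foreach ($rows as [$job, $closingDate]) {
    echo '<tr class="job-row"><td class="job-cell">', htmlspecialchars($job),
         '</td><td class="date-cell">', htmlspecialchars($closingDate), "</td></tr>\n";
}
```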

PHP XML Parse data

I have never used XML before and am trying to loop through the XML and display all the display names in the 'A Team'. The code I am using is outputting 10 zeros and not the names.
The code I am using is attached below along with the feed.
Any assistance is much appreciated!
feed: https://apn.apcentiaservices.co.uk/ContactXML/agentfeed?organisation=se724de89ca150f
<?php
$url = 'https://apn.apcentiaservices.co.uk/ContactXML/agentfeed?organisation=se724de89ca150f';
$html = "";
$xml = simplexml_load_file($url);
for($i = 0; $i < 10; $i++){
$title = $xml->organisation['APN']->brand['AllStar Psychics']->pool['A Team']->agent[$i]->display-name;
echo $title;
}
echo $html;
?>
This might get the basics you asked for. Not sure if it's what you want. I'm not that good at XPath.
$mydata = $xml->xpath('/organisation/brand/pool[@name="A Team"]//display-name');
foreach($mydata as $key=>$value){
echo('Name:' . $value .'<br>');
}
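Two details may explain the zeros in the question. First, PHP reads ->display-name as the property display minus something called name, that is, a subtraction, so an element name containing a hyphen has to be written as ->{'display-name'}. Second, square brackets with a string on a SimpleXML element read an attribute, they do not select a child by its name attribute, which is why the XPath predicate is needed. The sketch below assumes a feed structure inferred from the question and the query above; the real feed may differ, and the agent names are invented.

```php
<?php
// Assumed structure, inferred from the question; the real feed may differ.
$feed = <<<XML
<organisation name="APN">
  <brand name="AllStar Psychics">
    <pool name="A Team">
      <agent><display-name>Alice</display-name></agent>
      <agent><display-name>Bob</display-name></agent>
    </pool>
    <pool name="B Team">
      <agent><display-name>Carol</display-name></agent>
    </pool>
  </brand>
</organisation>
XML;

$xml = simplexml_load_string($feed);

$names = [];
// Select only the agents inside the pool whose name attribute is "A Team".
foreach ($xml->xpath('/organisation/brand/pool[@name="A Team"]/agent') as $agent) {
    // Curly-brace syntax is required because the element name contains a hyphen.
    $names[] = (string) $agent->{'display-name'};
}

print_r($names);
```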

how to use loop for array

I have an array whose cells are read from an XML file. I wrote the loop with "for", but because I don't know how many nodes there are, I want to write it so that it runs from the start to the end of the XML file. My code with for is:
$description=array();
for($i=0;$i<2;$i++)
{
$description[$i]=read_xml_node("dscription",$i);
}
and my xml file:
<eth0>
<description>WAN</description>
</eth0>
<eth1>
<description>LAN</description>
</eth1>
In this code I must know the "2", but I want a way that doesn't require knowing it.
I am not sure what kind of parser you are using, but it is very easy with SimpleXML, so I put together some sample code using SimpleXML.
Something like this should do the trick:
$xmlstr = <<<XML
<?xml version='1.0' standalone='yes'?>
<node>
<eth0>
<description>WAN</description>
</eth0>
<eth1>
<description>LAN</description>
</eth1>
</node>
XML;
$xml = new SimpleXMLElement($xmlstr);
foreach ($xml as $xmlnode) {
foreach ($xmlnode as $description) {
echo $description . " ";
}
}
output:
WAN LAN
$length = count($description);
for ($i = 0; $i < $length; $i++) {
print $description[$i];
}
The parser you use might allow you to use a while loop which will return false when it has reached the end of the XML document. For example:
while ($node = $xml->read_next_node($mydoc)) {
//Do whatever...
}
If none exists, you can try using count() in the condition of your for loop. It returns the length of the array you specify. For example:
for ($i = 0; $i < count($myarray); $i++) {
//Do whatever...
}
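If SimpleXML is available, the count does not need to be known at all: children() yields every child element, however many there are. A minimal sketch follows, assuming the interfaces are wrapped in a single root element (named node here, as in the earlier answer, since an XML document needs one root); keying the result by element name is a choice made for this example.

```php
<?php
$xmlstr = <<<XML
<node>
  <eth0><description>WAN</description></eth0>
  <eth1><description>LAN</description></eth1>
</node>
XML;

$xml = new SimpleXMLElement($xmlstr);

$description = [];
foreach ($xml->children() as $interface) {
    // getName() returns the element name, e.g. "eth0".
    $description[$interface->getName()] = (string) $interface->description;
}

print_r($description);
```

Adding an eth2 element to the XML would extend the array without any change to the loop.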

Fetch the attributes using PHP crawler

I am trying to fetch the name, address and location by crawling a website. It's a single page and I don't want anything other than this. I am using the code below.
<?php
include 'simple_html_dom.php';
$html = "http://www.phunwa.com/phone/0191/2604233";
$dom = new DomDocument();
$dom->loadHtml($html);
$xpath = new DomXpath($dom);
$div = $xpath->query('//*[@class="address-tags"]')->item(0);
for($i=0; $i < $div->length; $i++ )
{
print "nodename=".$div->item( $i )->nodeName;
print "\t";
print "nodevalue : ".$div->item( $i )->nodeValue;
print "\r\n";
echo $link->getElementsByTagName("<p>");
}
?>
The website html source code is
<div class="address-tags">
<p><strong>Name:</strong> RAJ GOPAL SINGH</p>
<p><strong>Address:</strong> R/O BARNAI NETARKOTHIAN, P.O.MUTHI TEH.& DISTT.JAMMU,X, 181206</p>
<p><strong>Location:</strong> JAMMU, Jammu & Kashmir, India</p>
<p><strong>Other Numbers:</strong> 01912604233 | +911912604233 | +91-191-2604233</p>
Can someone please help me get the three attributes as output? Nothing is echoed on the page as of now.
Thanks a lot.
You need $dom->load($html); instead of $dom->loadHtml($html);. After doing this you will find that the HTML is not well formed, so $xpath stays empty.
Maybe try something like:
$html = file_get_contents('http://www.phunwa.com/phone/0191/2604233');
$name = preg_replace('/(.*)(<p><strong>Name:<\/strong> )([^<]+)(<\/p>)(.*)/mis','$3',$html);
$address = preg_replace('/(.*)(<p><strong>Address:<\/strong> )([^<]+)(<\/p>)(.*)/mis','$3',$html);
$location = preg_replace('/(.*)(<p><strong>Location:<\/strong> )([^<]+)(<\/p>)(.*)/mis','$3',$html);
$othernumbers = preg_replace('/(.*)(<p><strong>Other Numbers:<\/strong> )(.*)/mis','$3',$html);
list($othernumbers,$trash)= preg_split('/<\/p>/mis',$othernumbers,0);
echo 'name: '.$name.'<br>address: '.$address.'<br>location: '.$location.'<br>other numbers: '.$othernumbers;
exit;
You should use the following for your XPath query:
//*[@class='address-tags']/p
so you're retrieving the actual paragraph nodes that are children of the 'address-tags' parent. Then you can use a loop on them:
$nodes = $xpath->query('//*[@class="address-tags"]/p');
for ($i = 0; $i < $nodes->length; $i++) {
echo $nodes->item($i)->nodeValue;
}
// or just
foreach($nodes as $node) {
echo $node->nodeValue;
}
Right now your code is properly fetching the first div that's found, but then you continue treating that div as if it was a DOMNodeList returned from an xpath query, which is incorrect. ->item() returns a DOMNode object, which does NOT have an ->item() method.
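Putting the XPath advice together, here is a rough sketch that runs against the markup quoted in the question rather than the live page. One further point: loadHTML() expects the markup itself, not a URL, so the page would first have to be fetched (or loadHTMLFile() used). The $fields array and the split on the first colon are choices made for this example, and libxml_use_internal_errors() is there because the page contains unescaped ampersands that would otherwise trigger parser warnings.

```php
<?php
// Markup copied from the question, standing in for the fetched page.
$html = '<div class="address-tags">'
      . '<p><strong>Name:</strong> RAJ GOPAL SINGH</p>'
      . '<p><strong>Address:</strong> R/O BARNAI NETARKOTHIAN, P.O.MUTHI TEH.& DISTT.JAMMU,X, 181206</p>'
      . '<p><strong>Location:</strong> JAMMU, Jammu & Kashmir, India</p>'
      . '</div>';

libxml_use_internal_errors(true);   // the bare "&" characters would otherwise raise warnings
$dom = new DOMDocument();
$dom->loadHTML($html);              // takes markup, not a URL
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$fields = [];
foreach ($xpath->query('//*[@class="address-tags"]/p') as $p) {
    // textContent looks like "Name: RAJ GOPAL SINGH"; split on the first colon only.
    $parts = explode(':', $p->textContent, 2);
    if (count($parts) === 2) {
        $fields[trim($parts[0])] = trim($parts[1]);
    }
}

print_r($fields);
```

Each $p here is a single node from the DOMNodeList, so its text is read directly; there is no second ->item() call.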
