Symfony 2 test xml with Symfony\Component\DomCrawler\Crawler - php

I've got a URL that returns XML, but I'm having trouble extracting the "link" element.
<rss xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
<channel>
<item>
<id>123</id>
<title>my title</title>
<link>
http://example.org
</link>
</item>
</channel>
</rss>
I need to test it with
Symfony\Component\DomCrawler\Crawler
These are my tests:
$crawler = $this->client->get('/my-feed');
$items = $crawler->filterXPath('//channel/item');
$this->assertGreaterThanOrEqual(1, $items->count()); // ok pass
// ...
$titles = $items->filterXPath('//title')->extract(array('_text'));
$this->assertContains("my title", $titles); // ok pass
// ...
$links = $items->filterXPath('//link')->extract(array('_text'));
$this->assertContains("example.org", $links); // KO!!! does not pass
var_dump($links); // empty string
Is "link" a reserved word?

Your XML is broken:
- you don't have a closing channel node </channel>
- you don't have a closing rss node </rss>
Here is the corrected XML:
<rss xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
<channel>
<item>
<id>123</id>
<title>my title</title>
<link>http://example.org</link>
</item>
</channel>
</rss>
Then, ->extract() returns an array of extracted values, so don't run the assertion against the array itself; take the first element and test that:
$this->assertContains("my title", $titles[0]);
// ...
$this->assertContains("example.org", $links[0]);
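The same check can be sketched without Symfony, using PHP's built-in DOMDocument and DOMXPath. This is only an illustration: the inline XML is a stand-in for the feed response, and the loop imitates what extract(array('_text')) returns. One possible cause of the empty string in the question (an assumption, not confirmed by the original post) is the feed being parsed as HTML, where <link> is an empty element; parsing as XML avoids that.

```php
<?php
// Stand-in for the feed body; in a real test this would come from the client response.
$xml = <<<XML
<rss xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
<channel>
<item>
<id>123</id>
<title>my title</title>
<link>http://example.org</link>
</item>
</channel>
</rss>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml); // parse as XML, not as HTML

$xpath = new DOMXPath($dom);

// Collect the text of every <link> under an <item>, one array entry per node.
$links = array();
foreach ($xpath->query('//channel/item/link') as $node) {
    $links[] = trim($node->textContent);
}

// Test a single element of the array, not the array itself.
echo $links[0], "\n";
```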

Related

Can't Loop Through XML Elements Using SimpleXML

I'm trying to loop through an XML document and get out the details of each item.
I manage to get into a single item like this and print its values:
foreach($xml->children() as $games){
echo $games->item->title;
}
But it doesn't loop through all the values.
My instinct to loop through says it should look something like this:
foreach($xml->children()->item as $games){
echo $games->title;
}
But this doesn't return anything.
Example of XML:
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:g="http://base.google.com/ns/1.0">
<channel>
<title>feed EUR</title>
<link>https://www.google.com/</link>
<description>Lorem ipsum</description>
<item>
<g:id>99f3a672-c54a-11e8-83c9-186590d66063</g:id>
<title>7 Days to Die Steam Key GLOBAL</title>
</item>
<item>
<g:id>9aacf9ec-c54a-11e8-9925-186590d66063</g:id>
<title>A Hat in Time Steam Key GLOBAL</title>
</item>
<item>
Children of root <rss> node are <channel>, so you can iterate over them:
foreach ($xml->channel as $channel) {
echo ($channel->title);
}
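To reach every <item>, the loop can go through channel first. The following is a minimal sketch on a shortened copy of the feed (the closing tags are added here so the sample parses); it also reads the prefixed g:id element through children(), which is an addition not asked for in the question.

```php
<?php
// Shortened, well-formed copy of the example feed.
$feed = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:g="http://base.google.com/ns/1.0">
<channel>
<title>feed EUR</title>
<item>
<g:id>99f3a672-c54a-11e8-83c9-186590d66063</g:id>
<title>7 Days to Die Steam Key GLOBAL</title>
</item>
<item>
<g:id>9aacf9ec-c54a-11e8-9925-186590d66063</g:id>
<title>A Hat in Time Steam Key GLOBAL</title>
</item>
</channel>
</rss>
XML;

$xml = simplexml_load_string($feed);

$titles = array();
$ids = array();

// <item> elements are children of <channel>, not of the <rss> root.
foreach ($xml->channel->item as $item) {
    $titles[] = (string) $item->title;
    // Elements with the "g" prefix are only visible through children().
    $ids[] = (string) $item->children('g', true)->id;
}

print_r($titles);
```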

How can I retrieve specific XML tag names?

The feed above returns an XML document. I can successfully retrieve elements such as title, description and link using this code:
$xml = file_get_contents($feed_url);
$xml = trim($xml);
$xmlObject = new SimpleXmlElement($xml);
foreach ($xmlObject->channel->item as $item) {
$title = strip_tags($item->title);
$description = strip_tags($item->description);
}
How can I get <a10:updated> ?
<rss xmlns:a10="http://www.w3.org/2005/Atom" version="2.0">
<channel>
<title>title</title>
<link>link</link>
<description>news</description>
<item>
<guid isPermaLink="true">link</guid>
<link>link</link>
<title>Tiele</title>
<description>Descr</description>
<enclosure url="image" type="image/jpeg"/>
<a10:updated>2017-05-07T09:14:00+03:00</a10:updated>
</item>
</channel>
</rss>
Here we use DOMDocument to extract the data from the namespaced tag:
<?php
ini_set('display_errors', 1);
$xml='<rss xmlns:a10="http://www.w3.org/2005/Atom" version="2.0">
<channel>
<title>title</title>
<link>link</link>
<description>news</description>
<item>
<guid isPermaLink="true">link</guid>
<link>link</link>
<title>Tiele</title>
<description>Descr</description>
<enclosure url="image" type="image/jpeg"/>
<a10:updated>2017-05-07T09:14:00+03:00</a10:updated>
</item>
</channel>
</rss>';
$xmlObject = new DOMDocument();
$xmlObject->loadXML($xml);
$result=$xmlObject->getElementsByTagNameNS("http://www.w3.org/2005/Atom", "*");
print_r($result->item(0)->textContent);
Output:
2017-05-07T09:14:00+03:00
That element is in a different XML namespace, so plain property access will not find it. With SimpleXML, select the namespace through children() first:
$a10 = $item->children('a10', true)->updated;
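As a minimal sketch of namespace-aware access in SimpleXML, the snippet below reads a10:updated through children(), passing the Atom namespace URI instead of the prefix. The trimmed feed is a stand-in for the real one, and the variable names are illustrative.

```php
<?php
// Trimmed stand-in for the feed from the question.
$xml = <<<XML
<rss xmlns:a10="http://www.w3.org/2005/Atom" version="2.0">
<channel>
<title>title</title>
<item>
<title>Tiele</title>
<a10:updated>2017-05-07T09:14:00+03:00</a10:updated>
</item>
</channel>
</rss>
XML;

$xmlObject = new SimpleXMLElement($xml);

$updated = array();
foreach ($xmlObject->channel->item as $item) {
    // children() with a namespace URI exposes the elements of that namespace.
    $atom = $item->children('http://www.w3.org/2005/Atom');
    $updated[] = (string) $atom->updated;
}

echo $updated[0], "\n";
```

Using the URI rather than the prefix keeps the code working if the feed later declares the same namespace under a different prefix.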

PHP SimpleXML read comment with xPath

I'm loading an XML feed, which works.
But it seems that I am missing something in my XPath query.
I want to read the comment inside the title node, but it doesn't seem to work.
$xml = simplexml_load_file( $feed_url );
$comment = $xml->xpath('//channel/item[1]/title//comment()');
The feed has the following structure
<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
<title>My main feed</title>
<description>feed description</description>
<link>http://www.example.com</link>
<language>en_US</language>
<lastBuildDate>Wed, 26 Nov 2014 11:44:13 UTC</lastBuildDate>
<item>
<title><!-- Special Image --></title>
<link></link>
<guid>http://www.example.com/page/345</guid>
<media:category>horizontal</media:category>
<media:thumbnails>
<media:thumbnail url="www.example.com/test.jpg" type="image/jpeg" height="324" width="545" />
</media:thumbnails>
</item>
<item>
<title>Here's a normal title</title>
<description><![CDATA[Description Text]]></description>
<link></link>
<guid>http://www.example.com/page/123</guid>
<media:category>horizontal</media:category>
</item>
</channel>
</rss>
Does anyone have a clue how I could read the comment?
Alternatively, you could use DOMDocument to access those comments. Example:
$dom = new DOMDocument();
$dom->load($feed_url);
$xpath = new DOMXpath($dom);
$comment = $xpath->evaluate('string(//channel/item[1]/title/comment())');
echo $comment;
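A self-contained version of the DOMDocument approach, as a sketch: the inline XML is a reduced stand-in for the feed, loaded with loadXML() instead of load($feed_url). The string() function returns the text of the first matching comment node, including its surrounding spaces, so the result is trimmed here.

```php
<?php
// Reduced stand-in for the feed; only the parts needed for the query are kept.
$feed = <<<XML
<rss version="2.0">
<channel>
<title>My main feed</title>
<item>
<title><!-- Special Image --></title>
<guid>http://www.example.com/page/345</guid>
</item>
<item>
<title>Here's a normal title</title>
<guid>http://www.example.com/page/123</guid>
</item>
</channel>
</rss>
XML;

$dom = new DOMDocument();
$dom->loadXML($feed);

$xpath = new DOMXPath($dom);

// string() converts the first comment node under the first item's title to text.
$comment = trim($xpath->evaluate('string(//channel/item[1]/title/comment())'));

echo $comment, "\n";
```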

Adding version to a Generated XML tag

I'm trying to generate an RSS feed as XML, and I found a site that gives an example of what the XML should look like:
<?xml version="1.0"?>
<rss version="2.0">
<channel>
<title>The Channel Title Goes Here</title>
<description>The explanation of how the items are related goes here</description>
<link>http://www.directoryoflinksgohere</link>
<item>
<title>The Title Goes Here</title>
<description>The description goes here</description>
<link>http://www.linkgoeshere.com</link>
</item>
<item>
<title>Another Title Goes Here</title>
<description>Another description goes here</description>
<link>http://www.anotherlinkgoeshere.com</link>
</item>
</channel>
</rss>
However, when I check my current XML, I notice that the version is missing from both the xml and rss tags.
<xml>
<rss>
<channel>
<title>#####</title>
<description>
#####
</description>
<path>#####</path>
How can I add the version to the start tags of XML and RSS?
PHP
$newspages = $this->newspages;
$xml = new SimpleXMLElement('<xml/>');
$rss = $xml->addChild('rss');
$channel = $rss->addChild('channel');
$channel->addChild('title', txt('rss.channelname'));
$channel->addChild('description', txt('rss.channeldescription'));
$channel->addChild('path', 'http://'.$_SERVER['HTTP_HOST']);
foreach ($newspages as $newspage) {
if ($newspage['id'] !== 'news-archive') {
$item = $channel->addChild('item');
$item->addChild('title', $newspage['title']);
$item->addChild('description', $newspage['description']);
$item->addChild('path', url('###/pageid', array('language'=>$this->language, 'id'=>$newspage['id'])));
}
}
header('Content-Type: text/xml');
print($xml->asXML());
Use the addAttribute() method.
<?php
$xml = new SimpleXMLElement('<xml/>');
$xml->addAttribute('version', '1.0');
$rss = $xml->addChild('rss');
$rss->addAttribute('version', '2.0');
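An alternative sketch, assuming the goal is the layout from the example feed: make <rss> the root element and set its version in the constructor string. asXML() then emits the <?xml version="1.0"?> declaration on its own, so no <xml> wrapper element is needed. The titles and link below are placeholder values.

```php
<?php
// <rss> is the root element; the version attribute is set directly in the markup.
$rss = new SimpleXMLElement('<rss version="2.0"/>');

$channel = $rss->addChild('channel');
$channel->addChild('title', 'The Channel Title Goes Here');
$channel->addChild('link', 'http://www.example.com');

$item = $channel->addChild('item');
$item->addChild('title', 'The Title Goes Here');

// asXML() without arguments returns the document as a string,
// starting with the XML declaration.
$out = $rss->asXML();
echo $out;
```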

Replacing innertext of XML node using PHP DOMDocument

I want to replace the inner text of an XML node. My XML file, named test.xml, is:
<?xml version="1.0" encoding="utf-8"?>
<ads>
<loop>no</loop>
<item>
<description>Description 1</description>
</item>
<item>
<description>Text in item2</description>
</item>
<item>
<description>Let play with this XML</description>
</item>
</ads>
I want to change the values of both the loop and description tags,
and the result should be saved in test.xml like this:
<?xml version="1.0" encoding="utf-8"?>
<ads>
<loop>yes</loop>
<item>
<description>Description Changing Here</description>
</item>
<item>
<description>Changing text in item2</description>
</item>
<item>
<description>We will play later</description>
</item>
</ads>
I tried code in PHP:
<?php
$file = "test.xml";
$fp = fopen($file, "rb") or die("cannot open file");
$str = fread($fp, filesize($file));
$dom=new DOMDocument();
$dom->formatOutput = true;
$dom->preserveWhiteSpace = false;
$dom->loadXML($str) or die("Error");
//$dom->load("items.xml");
$root=$dom->documentElement; // This can differ (I am not sure, it can be only documentElement or documentElement->firstChild or only firstChild)
$loop=$root->getElementsByTagName('loop')->item(0);//->textContent;
//echo $loop;
if(trim($loop->textContent)=='no')
{
echo 'ok';
$root->getElementsByTagName('loop')->item(0)->nodeValue ='yes';
}
echo "<xmp>NEW:\n". $dom->saveXML() ."</xmp>";
?>
I have only tried this for the loop tag. I don't know how to replace the node value in the description tags.
When I run this page, it shows output like:
ok
NEW:
<?xml version="1.0" encoding="utf-8"?>
<ads>
<loop>yes</loop>
<item>
<description>Description 1</description>
</item>
<item>
<description>Text in item2</description>
</item>
<item>
<description>Let play with this XML</description>
</item>
</ads>
It shows the value yes in the browser but doesn't save it in test.xml. Any idea why?
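Since you have already created a DOMDocument, you can use DOMXPath, or keep using getElementsByTagName().
You could do this (note that it sets every description to the same value):
$descriptions = $root->getElementsByTagName('description');
foreach ($descriptions as $nodeDescription)
{
$nodeDescription->nodeValue = 'Your custom value';
}
As for the file not changing: saveXML() only returns the XML as a string. To write the changes back to disk, call $dom->save($file).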
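Putting the pieces together, here is a minimal sketch that changes loop, gives each description its own text, and writes the result back with DOMDocument::save(). A temporary file stands in for test.xml so the example can run anywhere, and the $newTexts array is an illustrative assumption about how the replacement values are supplied.

```php
<?php
// Temporary file standing in for test.xml.
$file = tempnam(sys_get_temp_dir(), 'ads');
file_put_contents($file, '<?xml version="1.0" encoding="utf-8"?>
<ads>
<loop>no</loop>
<item><description>Description 1</description></item>
<item><description>Text in item2</description></item>
<item><description>Let play with this XML</description></item>
</ads>');

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false; // must be set before loading
$dom->formatOutput = true;
$dom->load($file);

// Hypothetical replacement texts, one per <description>, in document order.
$newTexts = array(
    'Description Changing Here',
    'Changing text in item2',
    'We will play later',
);

$loop = $dom->getElementsByTagName('loop')->item(0);
if (trim($loop->textContent) === 'no') {
    $loop->nodeValue = 'yes';
}

$i = 0;
foreach ($dom->getElementsByTagName('description') as $description) {
    if (isset($newTexts[$i])) {
        $description->nodeValue = $newTexts[$i];
    }
    $i++;
}

// save() writes the document to disk; saveXML() would only return a string.
$dom->save($file);

$saved = file_get_contents($file);
echo $saved;
unlink($file); // clean up the temporary file
```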
