PHP Search between two points - php

I'm trying to deal with some XML in PHP.
I have code, such as this:
<?php
$stream = fopen("xml","r");
?>
Where "xml" contains something such as this:
<name>name1</name>
<key>key1</key>
<name>name2</name>
<key>key2</key>
etc.
I'd like to create an array out of the contents of the <key> tags, something like where
keys[0] = "key1"
and
keys[1] = "key2"
Any help is appreciated, thank you very much :)
Solution:
$xmlstr = fread($stream,filesize("xml-file"));
$sxe = new SimpleXMLElement($xmlstr);
echo $sxe->getName() . "\n";
foreach ($sxe->children() as $child) {
echo $child->children();
}

You should use DOM functions for this case. Let's suppose a well-formed XML document (xmltest.xml):
<?xml version="1.0" encoding="utf-8"?>
<root>
<name>name1</name>
<key>key1</key>
<name>name2</name>
<key>key2</key>
</root>
This code loads the xml file into DOM document and gets all nodes with tag key;
<?php
$dom = new DOMDocument('1.0','utf-8');
$dom->load('xmltest.xml');
$keys = $dom->getElementsByTagName('key');
for ($i = 0; $i < $keys->length; $i++) {
echo $keys->item($i)->nodeValue . "</br>";
}
?>

Related

Trouble creating a valid RSS feed in PHP

I'm trying to get an RSS feed, change some text, and then serve it again as an RSS feed. However, the code I've written doesn't validate properly. I get these errors:
line 3, column 0: Missing rss attribute: version
line 14, column 6: Undefined item element: content (10 occurrences)
Here is my code:
<?php
header("Content-type: text/xml");
echo "<?xml version='1.0' encoding='UTF-8'?>
<?xml-stylesheet type='text/xsl'?>
<?xml-stylesheet type='text/xsl' media='screen'
href='/~d/styles/rss2full.xsl'?>
<rss xmlns:content='http://purl.org/rss/1.0/modules/content/'>
<channel>
<title>Blaakdeer</title>
<description>Blog RSS</description>
<language>en-us</language>
";
$html = "";
$url = "http://feeds.feedburner.com/vga4a/mPSm";
$xml = simplexml_load_file($url);
for ($i = 0; $i < 10; $i++){
$title = $xml->channel->item[$i]->title;
$description = $xml->channel->item[$i]->description;
$content = $xml->channel->item[$i]->children("content", true);
$content = preg_replace("/The post.*/","", $content);
echo "<item>
<title>$title</title>
<description>$description</description>
<content>$content</content>
</item>";
}
echo "</channel></rss>";
Just as you don't treat XML as a string when parsing it, you don't treat it as as string when you create it. Use the proper tools to create your XML; in this case, the DomDocument class.
You had a number of problems with your XML; biggest is that you were creating a <content> element, but the original RSS had a <content:encoded> element. That means the element name is encoded but it's in the content namespace. Big difference between that and an element named content. I've added comments to explain the other steps.
<?php
// create the XML document with version and encoding
$xml = new DomDocument("1.0", "UTF-8");
$xml->formatOutput = true;
// add the stylesheet PI
$xml->appendChild(
$xml->createProcessingInstruction(
'xml-stylesheet',
'type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"'
)
);
// create the root element
$root = $xml->appendChild($xml->createElement('rss'));
// add the version attribute
$v = $root->appendChild($xml->createAttribute('version'));
$v->appendChild($xml->createTextNode('2.0'));
// add the namespace
$root->setAttributeNS(
'http://www.w3.org/2000/xmlns/',
'xmlns:content',
'http://purl.org/rss/1.0/modules/content/'
);
// create some child elements
$ch = $root->appendChild($xml->createElement('channel'));
// specify the text directly as second argument to
// createElement because it doesn't need escaping
$ch->appendChild($xml->createElement('title', 'Blaakdeer'));
$ch->appendChild($xml->createElement('description', 'Blog RSS'));
$ch->appendChild($xml->createElement('language', 'en-us'));
$url = "http://feeds.feedburner.com/vga4a/mPSm";
$rss = simplexml_load_file($url);
for ($i = 0; $i < 10; $i++) {
if (empty($rss->channel->item[$i])) {
continue;
}
$title = $rss->channel->item[$i]->title;
$description = $rss->channel->item[$i]->description;
$content = $rss->channel->item[$i]->children("content", true);
$content = preg_replace("/The post.*/","", $content);
$item_el = $ch->appendChild($xml->createElement('item'));
$title_el = $item_el->appendChild($xml->createElement('title'));
// this stuff is unknown so it has to be escaped
// so have to create a separate text node
$title_el->appendChild($xml->createTextNode($title));
$desc_el = $item_el->appendChild($xml->createElement('description'));
// the other alternative is to create a cdata section
$desc_el->appendChild($xml->createCDataSection($description));
// the content:encoded element is not the same as a content element
// the element must be created with the proper namespace prefix
$cont_el = $item_el->appendChild(
$xml->createElementNS(
'http://purl.org/rss/1.0/modules/content/',
'content:encoded'
)
);
$cont_el->appendChild($xml->createCDataSection($content));
}
header("Content-type: text/xml");
echo $xml->saveXML();
The first error is just a missing attribute, easy enough:
<rss version="2.0" ...>
For the <p> and other HTML elements, you need to escape them. The file should look like this:
<p>...
There are other ways, but this is the easiest way. In PHP you can just call a function to encode entities.
$output .= htmlspecialchars(" <p>Paragraph</p> ");
As for the <content> tag problem, it should be <description> instead. The <content> tag currently generates two errors. Changing it to <description> in both places should fix both errors.
Otherwise it looks like you understand the basics. You <open> and </close> tags and those have to match. You can also use what is called empty tags: <empty/> which exist on their own but to not include content and no closing tag.

How to load string back xml with simpleXML in PHP?

Would anyone know how i can "explode" a string back into "normal" xml format?
I found this script (ref:gooseflight,2010) that looks like it can do the job but the output comes out stuck together.
Here's the code:
enter code herefunction combineXML($file)
{
global $xmlstr;
$xml = simplexml_load_file($file);
foreach($xml as $element)
$xmlstr .= $element->asXML();
}
$files[] = "tmp.xml";
$files[] = "traduction.xml";
$xmlstr = '<CAB>';
foreach ($files as $file)
combineXML($file);
$xmlstr .= '</CAB>';
// Convert string to XML for further processing
$xml = simplexml_load_string($xmlstr);
$bytes = file_put_contents("combined.xml", $xml->asXML())
Here is the output:
<?xml version="1.0" encoding="UTF-8"?>
<CAB>
<CABO>XXXXXXXXXX0987650003</CABO><ACTIVITY>NONE</ACTIVITY><BEORI>blablaE</BEORI>BEDEST>blabla</BEDEST><NATRELA>more blabla</NATRELA><ANE>2014</ANE><NODEP>1111</NODEP>
</CAB>
So how could i seperate the nodes to look like this?:
<?xml version="1.0" encoding="UTF-8"?>
<CAB>
<CABO>XXXXXXXXXX0987650003</CABO>
<ACTIVITY>NONE</ACTIVITY>
<BEORI>blablaE</BEORI>
<BEDEST>blabla</BEDEST>
<NATRELA>more blabla</NATRELA>
<ANE>2014</ANE>
<NODEP>1111</NODEP>
.....
</CAB>
Would anyone know how to fix it?
I would suggest to use DomDocument class to save the XML; check this:
$dom_obj = new DOMDocument();
$dom_obj->loadXML($file);
// Do all your changes to the file by using DomDocument command (e.g. CreateElement, CreateAttribute, etc)
$dom_obj->formatOutput = true;
$dom_obj->save($file);

DOM get node value by "brother" value in this string

I am creating PHP system for edit XML files to translation of game.
I am using DOM e.g for file-comparision for translators (with update XML file).
I have old and new XML (in advance: I can not change XML structure) with new strings and/or new IDs.
For future echo node value to comparision by the same ID order, I have following code:
<?php
$xml2 = new DOMDocument('1.0', 'utf-16');
$xml2->formatOutput = true;
$xml2->preserveWhiteSpace = false;
$xml2->load(substr($file, 0, -4).'-pl.xml');
$xml = new DOMDocument('1.0', 'utf-16');
$xml->formatOutput = true;
$xml->preserveWhiteSpace = false;
$xml->load($file);
for ($i = 0; $i < $xml->getElementsByTagName('string')->length; $i++) {
if ($xml2->getElementsByTagName('string')->item($i)) {
$element_pl = $xml2->getElementsByTagName('string')->item($i);
$body_pl = $element_pl->getElementsByTagName('body')->item(0);
$id_pl = $element_pl->getElementsByTagName('id')->item(0);
} else $id_pl->nodeValue = "";
$element = $xml->getElementsByTagName('string')->item($i);
$id = $element->getElementsByTagName('id')->item(0);
$body = $element->getElementsByTagName('body')->item(0);
if ($id_pl->nodeValue == $id->nodeValue) {
$element->appendChild( $xml->createElement('body-pl', $body_pl->nodeValue) );
}
}
$xml = simplexml_import_dom($xml);
?>
Above code change:
<?xml version="1.0" encoding="utf-16"?>
<strings>
<string>
<id>1</id>
<name>ABC</name>
<body>English text</body>
</string>
</strings>
to (by adding text from *-pl.xml file):
<?xml version="1.0" encoding="utf-16"?>
<strings>
<string>
<id>1</id>
<name>ABC</name>
<body>English text</body>
<body-pl>Polish text</body-pl>
</string>
</strings>
But I need find "body" value in *-pl.xml by "name" value.
"For" loop:
get "ABC" from "name" tag [*.xml] ->
find "ABC" in "name" tag [*-pl.xml] ->
get body node from that "string" [*-pl.xml]
I can do that by strpos(), but my (the smallest) file have 25346 lines..
Is there something to do e.g. "has children ("name", "ABC") -> parent" ?
Then I can get "body" value of this string.
Thank you in advance for suggestions or link to similar, resolved ask,
Greetings
You need XPath expressions:
//name[text()='ABC']/../body
or
//name[text()='ABC']/following-sibling::body
Check the PHP manual for DOMXPath class and its query method. In a nutshell, you'd use it like this:
$xpath = new DOMXPath($dom_document);
// find all `body` nodes that have a `name` sibling
// with an `ABC` value in the entire document
$nodes = $xpath->query("//name[text()='ABC']/../body");
foreach($nodes as $node) {
echo $node->textContent , "\n\n";
}

How to remove XML elements and all children?

I need to read an XML file and delete all the elements named <images> and all the children associated. I have found similar old questions that did not work. What am I doing wrong? Is there a better method?
XML:
<?xml version='1.0' encoding='UTF-8'?>
<settings>
<background_color>#000000</background_color>
<show_context_menu>yes</show_context_menu>
<image>
<thumb_path>210x245.png</thumb_path>
<big_image_path>620x930.png</big_image_path>
</image>
<image>
<thumb_path>200x295.png</thumb_path>
<big_image_path>643x950.png</big_image_path>
</image>
</settings>
PHP:
$dom = new DOMDocument();
$dom->load('test.xml');
$thedocument = $dom->documentElement;
$elements = $thedocument->getElementsByTagName('image');
foreach ($elements as $node) {
$node->parentNode->removeChild($node);
}
$save = $dom->saveXML();
file_put_contents('test.xml', $save)
I figured it out after a good night of sleep. It was quite simple actually.
$xml = simplexml_load_file('test.xml');
unset($xml->image);
$xml_file = $xml->asXML();
$xmlFile = 'test.xml';
$xmlHandle = fopen($xmlFile, 'w');
fwrite($xmlHandle, $xml_file);
fclose($xmlHandle);
Edit: You probably want to make it save directly:
$file = 'test.xml';
$xml = simplexml_load_file($file);
unset($xml->image);
$success = $xml->asXML($file);
See SimpleXMLElement::asXML()Docs.
In the PHP Manual page (where you should always go 1st :-) one awesome contributor points out that:
You can't remove DOMNodes from a DOMNodeList as you're iterating over them in a foreach loop.
Then goes on to offer a potential solution. Try something like this instead:
<?php
$domNodeList = $domDocument->getElementsByTagname('p');
$domElemsToRemove = array();
foreach ( $domNodeList as $domElement ) {
// ...do stuff with $domElement...
$domElemsToRemove[] = $domElement;
}
foreach( $domElemsToRemove as $domElement ){
$domElement->parentNode->removeChild($domElement);
}
?>
First of all, your XML is broken, see <thumb>...</thumb_path>and next line as well -> fix it!
Then, real simple in 3 lines of code:
$xml = simplexml_load_string($x); // $x holds your xml
$count = $xml->image->count()-1;
for ($i = $count;$i >= 0;$i--) unset($xml->image[$i]);
See live demo # http://codepad.viper-7.com/HkGy5o

xml parsing with php

I would like to create a new simplified xml based on an existing one:
(using "simpleXml")
<?xml version="1.0" encoding="UTF-8"?>
<xls:XLS>
<xls:RouteInstructionsList>
<xls:RouteInstruction>
<xls:Instruction>Start</xls:Instruction>
</xls:RouteInstruction>
</xls:RouteInstructionsList>
<xls:RouteInstructionsList>
<xls:RouteInstruction>
<xls:Instruction>End</xls:Instruction>
</xls:RouteInstruction>
</xls:RouteInstructionsList>
</xls:XLS>
Because there are always colons in the element-tags, it will mess with "simpleXml", I tried to use the following solution->link.
How can I create a new xml with this structure:
<main>
<instruction>Start</instruction>
<instruction>End</instruction>
</main>
the "instruction-element" gets its content from the former "xls:Instruction-element".
Here is the updated code:
But unfortunately it never loops through:
$source = "route.xml";
$xmlstr = file_get_contents($source);
$xml = #simplexml_load_string($xmlstr);
$new_xml = simplexml_load_string('<main/>');
foreach($xml->children() as $child){
print_r("xml_has_childs");
$new_xml->addChild('instruction', $child->RouteInstruction->Instruction);
}
echo $new_xml->asXML();
there is no error-message, if I leave the "#"…
/* the use of # is to suppress warning */
$xml = #simplexml_load_string($YOUR_RSS_XML);
$new_xml = simplexml_load_string('<main/>');
foreach ($xml->children() as $child)
{
$new_xml->addChild('instruction', $child->RouteInstruction->Instruction);
}
/* to print */
echo $new_xml->asXML();
You could use xpath to simplify things. Without knowing the full details, I don't know if it will work in all cases:
$source = "route.xml";
$xmlstr = file_get_contents($source);
$xml = #simplexml_load_string($xmlstr);
$new_xml = simplexml_load_string('<main/>');
foreach ($xml->xpath('//Instruction') as $instr) {
$new_xml->addChild('instruction', (string) $instr);
}
echo $new_xml->asXML();
Output:
<?xml version="1.0"?>
<main><instruction>Start</instruction><instruction>End</instruction></main>
Edit: The file at http://www.gps.alaingroeneweg.com/route.xml is not the same as the XML you have in your question. You need to use a namespace like:
$xml = #simplexml_load_string(file_get_contents('http://www.gps.alaingroeneweg.com/route.xml'));
$xml->registerXPathNamespace('xls', 'http://www.opengis.net/xls'); // probably not needed
$new_xml = simplexml_load_string('<main/>');
foreach ($xml->xpath('//xls:Instruction') as $instr) {
$new_xml->addChild('instruction', (string) $instr);
}
echo $new_xml->asXML();
Output:
<?xml version="1.0"?>
<main><instruction>Start (Southeast) auf Sihlquai</instruction><instruction>Fahre rechts</instruction><instruction>Fahre halb links - Ziel erreicht!</instruction></main>

Categories