PHP SimpleXML: Remove items with for

I can remove an item from a SimpleXML element with:
unset($this->simpleXML->channel->item[0]);
but I can't do it in a for loop:
$items = $this->simpleXML->xpath('/rss/channel/item');
for($i = count($items); $i > $itemsNumber; $i--) {
unset($items[$i - 1]);
}
Some items are removed from $items (the NetBeans debugger confirms that), but when I query the path again (/rss/channel/item), nothing has been deleted.
What's wrong?

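Unsetting entries of the array returned by xpath() does not delete anything from the document, and SimpleXML has no dedicated method for removing a node; you need to use DOMNode for this.
Happily, when you import your nodes into DOMNode, the instances point to the same tree.
So you can do this (the loop below removes every matched item; restrict $items first if you want to keep some):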
<?php
$items = $this->simpleXML->xpath('/rss/channel/item');
foreach ($items as $item) {
$node = dom_import_simplexml($item);
$node->parentNode->removeChild($node);
}
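To keep the first few items, as the question asks, the same idea can be applied to only the tail of the xpath() result. The following is a minimal sketch under stated assumptions: the sample feed is made up, and $itemsNumber (the name used in the question) is taken to be the number of items to keep.

```php
<?php
// Hypothetical sample feed; the real one would come from the asker's source.
$rss = <<<XML
<rss><channel><title>Demo</title>
<item><title>A</title></item>
<item><title>B</title></item>
<item><title>C</title></item>
</channel></rss>
XML;

$sxe = simplexml_load_string($rss);
$itemsNumber = 1; // assumed meaning: how many items to keep

$items = $sxe->xpath('/rss/channel/item');

// Remove everything after the first $itemsNumber items from the real tree.
foreach (array_slice($items, $itemsNumber) as $item) {
    $node = dom_import_simplexml($item);
    $node->parentNode->removeChild($node);
}

// Query again to confirm the document itself changed.
$remaining = count($sxe->xpath('/rss/channel/item'));
echo $remaining, "\n";
```

Because dom_import_simplexml() returns a node in the same underlying tree, the second xpath() call sees the shortened list.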

As you know, you're currently only unsetting the entries of the array that xpath() returned.
To get the magical unsetting to work on the SimpleXMLElement, you have to either do as Xavier Barbosa suggested or give PHP a little nudge into firing off the correct unsetting behaviour.
The only change in the code snippet below is the addition of [0]. Heavy emphasis on the word magical.
$items = $this->simpleXML->xpath('/rss/channel/item');
for($i = count($items); $i > $itemsNumber; $i--) {
unset($items[$i - 1][0]);
}
With that said, I would recommend (as Xavier and Josh have) moving into DOM-land for manipulating the document.

I was racking my brain trying to figure out how to delete the last child from an XML document and then insert a new element at the top, so that my RSS feed always holds a fixed number of items. I could not get the XPath approach to work (possibly because of the free server I am using), so this is what I did instead. My XML document is an RSS feed, so there are 6 elements under the channel (title, description and so on) before the items start.
$file = 'newrss.xml';//get file
$fp = fopen($file, "rb") or die("cannot open file");//open the file
$str = fread($fp, filesize($file));//read the file
$xml = new DOMDocument();//new xml DOMDocument
$xml->formatOutput = true;
$xml->preserveWhiteSpace = false;
$xml->loadXML($str) or die("Error");//Load Document
// get document element
$root = $xml->documentElement;
$fnode = $root->firstChild;
$ori = $fnode->childNodes->item(6);// index 6 is the 7th child, i.e. the first <item>
//Get the number of items in my xml.
$nodeLength = $fnode->getElementsByTagName('item')->length;//count nodes
$itemNum=$nodeLength+5;// index of the last item: 6 leading elements, zero-based
$lNode = $fnode->childNodes->item($itemNum);//Get the last child node
$fnode->removeChild($lNode);//finally remove that node.
I know this is not pretty, but it works well. It took me forever to figure this out, so I hope it helps someone else, since I see this question a lot. If you are not interested in adding your new item to the top of the RSS list, you can skip the $ori variable. Furthermore, if you do leave out the $ori variable, you will have to adjust $itemNum so that you remove the correct item.
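The same rotation can be written without counting the leading channel elements by hand. This is only a sketch of an alternative, not the code above: it looks the items up by tag name, and the inline feed and the new item's title are invented for illustration.

```php
<?php
$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->formatOutput = true;
// Hypothetical feed; in practice this would be loaded from the RSS file.
$doc->loadXML(
    '<rss><channel><title>Demo</title><description>d</description>'
    . '<item><title>old1</title></item><item><title>old2</title></item>'
    . '</channel></rss>'
);

$channel = $doc->getElementsByTagName('channel')->item(0);
$items = $channel->getElementsByTagName('item'); // live list, updates as nodes change

// Remove the last <item>.
$last = $items->item($items->length - 1);
$last->parentNode->removeChild($last);

// Build a new <item> and insert it before the current first <item>.
$new = $doc->createElement('item');
$new->appendChild($doc->createElement('title', 'new'));
$channel->insertBefore($new, $items->item(0));

// Collect the item titles to show the resulting order.
$titles = [];
foreach ($channel->getElementsByTagName('item') as $item) {
    $titles[] = $item->getElementsByTagName('title')->item(0)->textContent;
}
print_r($titles);
```

Looking items up by name keeps the code working if the number of elements in front of the items changes.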

Related

Using PHP to convert XML to CSV but with a twist

I'm trying to convert some XML files I have to CSV using PHP SimpleXML class. However, I'm unable to achieve the result I want, because one parent could have several child elements with the same name. My current XML file is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<club>
<name>Green Riders</name>
<membership>Free</membership>
<boardMember>
<name>James F.</name>
<position>CEO</position>
</boardMember>
<boardMember>
<name>Helen D.</name>
<position>Associate Director</position>
</boardMember>
</club>
<club>
<name>Broken Dice</name>
<membership>Paid</membership>
<boardMember>
<name>Patrick B.</name>
<position>CEO</position>
</boardMember>
</club>
</root>
The CSV output I was hoping to achieve is as such:
club,name,membership,boardMember>Name,boardMember>position
Green Riders,Free,James F.,CEO
Green Riders,Free,Helen D., Associate Director
Broken Dice,Paid,Patrick B., CEO
Is there anyway to achieve this without hard-coding the element names into the script (i.e. make it work on any generic XML file)?
I'm really hoping this is possible, given that I'll be having more than 25 XML variants; so would really be inefficient to write a dedicated script for each.
Thanks!
Since every child node's data needs to become a row in the CSV together with its parent's data, you can first capture and store the parent (club) data, then traverse the children and print each one with the parent's data preceding it.
Please check the following code:
$xml = simplexml_load_file("your_xml_file.xml") or die("Error: Cannot create object");
$csv_delimeter = ",";
$csv_new_line = "\n";
foreach($xml->children() as $n) {
$club_data = array();
$club_data[] = $n->name;
$club_data[] = $n->membership;
if (isset($n->boardMember)) {
foreach ($n->boardMember as $boardMember) {
$boardMember_data = $club_data;
$boardMember_data[] = $boardMember->name;
$boardMember_data[] = $boardMember->position;
echo implode($csv_delimeter, $boardMember_data).$csv_new_line;
}
}
else {
echo implode($csv_delimeter, $club_data).$csv_new_line;
}
}
After testing with the example xml data, it generated the following type of output:
Green Riders,Free,James F.,CEO
Green Riders,Free,Helen D., Associate Director
Broken Dice,Paid,Patrick B., CEO
You can set different values based on your scenario for:
$csv_delimeter = ",";
$csv_new_line = "\n";
There are no strict rules for CSV output: the delimiter can be ",", ";" or "|", and the line ending can also be "\r\n".
The code prints the CSV rows one by one on the fly. If you want to save the CSV data to a file, a better approach than writing row by row is to build the entire array and write it once (disk access is costly), unless the XML data is large. You will find plenty of simple PHP array-to-CSV function examples on the net.
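As a rough illustration of the array-to-CSV step, the sketch below collects the rows first and then writes them in a single pass with PHP's built-in fputcsv(), which also takes care of quoting. The rows and the temporary output path are placeholders, not part of the original answer.

```php
<?php
// Hypothetical rows standing in for the data gathered in the loop above.
$rows = [
    ['Green Riders', 'Free', 'James F.', 'CEO'],
    ['Broken Dice', 'Paid', 'Patrick B.', 'CEO'],
];

// Placeholder output location; use your own file name here.
$path = tempnam(sys_get_temp_dir(), 'csv');

$fh = fopen($path, 'w');
foreach ($rows as $row) {
    // Delimiter, enclosure and escape character are passed explicitly.
    fputcsv($fh, $row, ',', '"', '\\');
}
fclose($fh);

$written = file_get_contents($path);
echo $written;
```

Note that fputcsv() wraps fields containing spaces in double quotes, so the file will differ slightly from the hand-built implode() output.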
It is not really possible. XML is a nested structure and you miss the information. You can define some default mapping for XML structures, but that gets really complex really fast. So it is far easier (and less time consuming) to define the mapping by hand.
A Reusable Conversion
function readXMLAsRecords(string $xml, array $map) {
// load the xml
$document = new DOMDocument();
$document->loadXml($xml);
$xpath = new DOMXpath($document);
// iterate the elements defining the rows
foreach ($xpath->evaluate($map['row']) as $row) {
$line = [];
// get the field values from the current $row
foreach ($map['columns'] as $name => $expression) {
$line[$name] = $xpath->evaluate($expression, $row);
}
// return a line
yield $line;
}
}
The Mapping
With DOMXpath::evaluate() Xpath expressions can return strings. So we need one expression that returns the boardMember nodes and a list of expressions for the fields.
$map = [
'row' => '/root/club/boardMember',
'columns' => [
'club_name' => 'string(parent::club/name)',
'club_membership' => 'string(parent::club/membership)',
'board_member_name' => 'string(name)',
'board_member_position' => 'string(position)'
]
];
To CSV
readXMLAsRecords() returns a generator, so you can use foreach on it:
$csv = fopen('php://stdout', 'w');
fputcsv($csv, array_keys($map['columns']));
foreach (readXMLAsRecords($xml, $map) as $record) {
fputcsv($csv, $record);
}
Output:
club_name,club_membership,board_member_name,board_member_position
"Green Riders",Free,"James F.",CEO
"Green Riders",Free,"Helen D.","Associate Director"
"Broken Dice",Paid,"Patrick B.",CEO

variable constantly retaining value once set

I'm looping through a directory, trying to find XML files with errors.
$baddies = array();
foreach (glob("fonts/*.svg") as $filename) {
libxml_use_internal_errors(true);
$str = file_get_contents($filename);
$sxe = simplexml_load_string($str);
$errors = libxml_get_errors();
$num_of_errors = 0;
$num_of_errors = sizeof($errors);
if ($num_of_errors > 0){
array_unshift($baddies, $filename);
}
}
However, it seems that once the errors are put into this object, they persist there through subsequent iterations of the loop, and files without errors still test positive. $num_of_errors remains high for good files. I have it being reset to zero, and have even tried unsetting it after each pass through the loop. I suppose libxml_get_errors continues to retain a value once set. How can I reset it?
I think you should call libxml_clear_errors() at the end of each iteration. According to the documentation, libxml keeps the errors in a buffer until you clear it, so libxml_get_errors() keeps returning the errors from earlier files.
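A minimal sketch of the loop with the buffer cleared on every pass might look like this. The two in-memory documents are invented stand-ins for the files that glob() would return.

```php
<?php
libxml_use_internal_errors(true);

// Hypothetical documents; the broken one comes first on purpose.
$documents = [
    'bad.svg'  => '<svg><g></svg>',
    'good.svg' => '<svg><g/></svg>',
];

$baddies = [];
foreach ($documents as $filename => $str) {
    simplexml_load_string($str);

    if (count(libxml_get_errors()) > 0) {
        $baddies[] = $filename;
    }

    // Empty the error buffer so the next file starts clean.
    libxml_clear_errors();
}

print_r($baddies);
```

Without the libxml_clear_errors() call, the errors from bad.svg would still be in the buffer when good.svg is checked.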

Get anchor tags from multiple HTML Files

I am not sure if this is even possible but I am trying to extract all the anchor tag links in a few HTML files on my website. I have currently written a php script that scans a few directories and sub directories that builds an array of HTML file links. Here is that code:
$di = new RecursiveDirectoryIterator('Migration');
$migrate = array();
foreach (new RecursiveIteratorIterator($di) as $filename => $file) {
// eregi() was removed in PHP 7; preg_match() with the i flag matches the same names
if (preg_match('/\.html?/i', $filename)) {
$migrate[] = $filename;
}
}
This method successfully produces the HTML File links that I need. Ex:
Migration/administration/billing/Billing.htm
Migration/administration/billing/_notes/Billing.htm.mno
Migration/administration/new business/_notes/New Business.htm.mno
Migration/administration/new business/New Business.htm
Migration/account/nycds/_notes/NYCDS Index.htm.mno
Migration/account/nycds/NYCDS Index.htm
There's more links but this gives you an idea. The next part is where I am stuck. I was thinking that I would need a for loop to loop through each array element, open the file, extract the links, then store those links somewhere. I am just not sure how I would go about this process. I tried to google this question but I never seemed to get results that matched what I was looking to do. Here is the simplified for loop that I have.
var obj = <?php echo json_encode($migrate); ?>;
for(var i=0;i< obj.length;i++){
// alert(obj[i]);
}
The above code is in JavaScript. From what I am reading, it seems that I shouldn't be using JavaScript but should continue using PHP. I am confused about what my next steps should be. If someone can point me in the right direction I would really appreciate it. Thank you so much for your time.
Use DOMDocument::getElementsByTagName to retrieve all <a> tags
http://www.php.net/manual/en/domdocument.getelementsbytagname.php
Example,
$doc = new DOMDocument();
$doc->loadHTMLFile("filename.html");
$anchors = $doc->getElementsByTagName('a'); //retrieve all anchor tags
foreach ($anchors as $a) { //loop anchors
echo $a->nodeValue;
}
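To collect the link targets rather than the link text, and to do it per file, something along these lines could work. extractLinks() is a hypothetical helper name, and the sample markup is made up; the sketch assumes non-empty HTML input.

```php
<?php
// Hypothetical helper: returns the href of every <a> in an HTML string.
function extractLinks(string $html): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // real-world HTML often triggers parser warnings
    $doc->loadHTML($html);
    libxml_clear_errors();

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        // Skip named anchors that have no href.
        if ($a->hasAttribute('href')) {
            $links[] = $a->getAttribute('href');
        }
    }
    return $links;
}

$sample = '<html><body><a href="billing.htm">Billing</a>'
    . '<a name="top">Top</a><a href="index.htm">Index</a></body></html>';
$links = extractLinks($sample);
print_r($links);
```

For the array built in the question, you would call extractLinks(file_get_contents($path)) for each $path in $migrate and store the results keyed by file name.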

In PHP, how can I get an XML attribute based on a variable?

I'm retrieving files like so (from the Internet Archive):
<files>
<file name="Checkmate-theHumanTouch.gif" source="derivative">
<format>Animated GIF</format>
<original>Checkmate-theHumanTouch.mp4</original>
<md5>72ec7fcf240969921e58eabfb3b9d9df</md5>
<mtime>1274063536</mtime>
<size>377534</size>
<crc32>b2df3fc1</crc32>
<sha1>211a61068db844c44e79a9f71aa9f9d13ff68f1f</sha1>
</file>
<file name="CheckmateTheHumanTouch1961.thumbs/Checkmate-theHumanTouch_000001.jpg" source="derivative">
<format>Thumbnail</format>
<original>Checkmate-theHumanTouch.mp4</original>
<md5>6f6b3f8a779ff09f24ee4cd15d4bacd6</md5>
<mtime>1274063133</mtime>
<size>1169</size>
<crc32>657dc153</crc32>
<sha1>2242516f2dd9fe15c24b86d67f734e5236b05901</sha1>
</file>
</files>
They can have any number of <file>s, and I'm solely looking for the ones that are thumbnails. When I find them, I want to increase a counter. When I've gone through the whole file, I want to find the middle Thumbnail and return the name attribute.
Here's what I've got so far:
//pop previously retrieved XML file into a variable
$elem = new SimpleXMLElement($xml_file);
//establish variable
$i = 0;
// Look through each parent element in the file
foreach ($elem as $file) {
if ($file->format == "Thumbnail"){$i++;}
}
//find the middle thumbnail.
$chosenThumb = ceil(($i/2)-1);
//Gloriously announce the name of the chosen thumbnail.
echo($elem->file[$chosenThumb]['name']);
The final echo doesn't work because it doesn't like having a variable choose the XML element. It works fine when I hardcode the index. Can you guess that I'm new to handling XML files?
Edit:
Francis Avila's answer from below sorted me right out!:
$sxe = simplexml_load_file($url);
$thumbs = $sxe->xpath('/files/file[format="Thumbnail"]');
$n_thumbs = count($thumbs);
$middlethumb = $thumbs[(int) ($n_thumbs/2)];
$happy_string = (string) $middlethumb['name'];
echo $happy_string;
Use XPath.
$sxe = simplexml_load_file($url);
$thumbs = $sxe->xpath('/files/file[format="Thumbnail"]');
$n_thumbs = count($thumbs);
$middlethumb = $thumbs[(int) ($n_thumbs/2)];
$middlethumbname = (string) $middlethumb['name'];
You can also accomplish this with a single XPath expression if you don't need the total count. Inside the second predicate, last() is the number of thumbnails matched by the first one, and XPath positions start at 1:
$thumbs = $sxe->xpath('/files/file[format="Thumbnail"][position() = floor(last() div 2) + 1]/@name');
$middlethumbname = (count($thumbs)) ? (string) $thumbs[0] : '';
A limitation of SimpleXML's xpath method is that it can only return nodes and not simple types. This is why you need the string cast on $thumbs[0]. If you use DOMXPath::evaluate(), you can do this instead:
$doc = new DOMDocument();
$doc->load($url);
$xp = new DOMXPath($doc);
$middlethumbname = $xp->evaluate('string(/files/file[format="Thumbnail"][position() = floor(last() div 2) + 1]/@name)');
$elem->file[$chosenThumb] will give the $chosenThumb'th element from the main file[] not the filtered(for Thumbnail) file[], right?
foreach ($elem as $file) {
if ($file->format == "Thumbnail"){
$i++;
//add this item to a new array($filteredFiles)
}
}
$chosenThumb = ceil(($i/2)-1);
//echo($elem->file[$chosenThumb]['name']);
echo($filteredFiles[$chosenThumb]['name']);
Some problems:
Middle thumbnail is incorrectly calculated. You'll have to keep a separate array for those thumbs and get the middle one using count.
file might need to be {'file'}, I'm not sure how PHP sees this.
you don't have a default thumbnail
Code you should use is this one:
$files = new SimpleXMLElement($xml_file);
$thumbs = array();
foreach($files as $file)
if($file->format == "Thumbnail")
$thumbs[] = $file;
$chosenThumb = ceil((count($thumbs)/2)-1);
echo (count($thumbs)===0) ? 'default-thumbnail.png' : $thumbs[$chosenThumb]['name'];
/edit: but I recommend that guy's solution, to use XPath. Way easier.

PHP's SimpleXML doesn't save edited data

I'm trying to edit some xml data. After this I want to save the data to file.
The problem is that the edited data isn't saved by simplexml but the node has changed.
$spieler = $xml->xpath("/planer/spieltag[@datum='" .$_GET['date']. "']/spielerliste/spieler");
for ( $i = 1; $i < 13; $i++ ){
if (!empty($_POST['spieler' .$i ])){
$spieler[$i-1] = $_POST['spieler' .$i];
}
}
var_dump($spieler);
$xml->asXML("data.xml");
var_dump() shows the new data, but asXML() doesn't.
Make sure your script has write permission to data.xml
The XPath result array elements aren't PHP ($ref = &$var) references to the actual tree nodes, so this line
$spieler[$i-1] = $_POST['spieler' .$i];
isn't modifying anything in the tree, you're simply overwriting an entry in a completely independent array.
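One way to write through to the tree is to assign to the node's own [0] offset instead of replacing the array entry. The sketch below assumes that approach; the sample document, $date and $post are stand-ins for the real file, $_GET['date'] and $_POST.

```php
<?php
// Hypothetical document with the structure implied by the XPath in the question.
$xml = simplexml_load_string(
    '<planer><spieltag datum="2012-01-01"><spielerliste>'
    . '<spieler>Anna</spieler><spieler>Ben</spieler>'
    . '</spielerliste></spieltag></planer>'
);
$date = '2012-01-01';            // stands in for $_GET['date']
$post = ['spieler2' => 'Clara']; // stands in for $_POST

$spieler = $xml->xpath("/planer/spieltag[@datum='$date']/spielerliste/spieler");

foreach ($spieler as $index => $node) {
    $key = 'spieler' . ($index + 1);
    if (!empty($post[$key])) {
        // Assigning to [0] changes the element's text in the real tree.
        $node[0] = $post[$key];
    }
}

$result = $xml->asXML();
echo $result;
```

In real code the date should be validated before it is placed into the XPath string, since it comes straight from the request.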
