I'm writing a script that searches multiple XML files for a certain tag. If that tag has a child named update, I need to delete that child and then add it again.
The problem is that I don't understand why it doesn't delete the nodes I want to delete.
The important part of my script looks like this:
/*
// Pushing all offers from all files to $allOffers array
*/
foreach ($offerFiles as $file)
{
$file = $path . "\\" . $file;
$currentXML = new SimpleXMLElement($file, 0, true);
foreach($currentXML->offer as $offer)
{
if ($offer->number) {
if (!check_if_exists($compiledXML, $offer->number))
{
//array_push($allOffers, $offer);
}
if (check_if_exists($compiledXML, $offer->number) && $offer->action == "update")
{
update_existing_entry($compiledFile, $compiledXML, $offer);
// var_dump($allOffers);
}
}
}
}
/*
// Find and delete existing XML entry offer with update action
*/
function update_existing_entry ($compiledFile, $compiledXML, $parsedOffer) {
$index = 0;
$doc = new DOMDocument();
$doc->load($compiledFile);
$elem = $doc->documentElement;
foreach ($compiledXML->offer as $offer) {
if ((string)$parsedOffer->number === (string)$offer->number) {
$firstchild = $doc->getElementsByTagName('offer')->item($index);
// $firstchild->nodeValue = null;
$elem->removeChild($firstchild);
$doc->save($compiledFile);
//var_dump($parsedOffer->asXML());
}
$index++;
}
var_dump($deleteNodes);
}
Now if I have 2 XML files, one with the update action and the other without it, it works perfectly. The problems start when both files have the update action: I always end up with only one deleted node and this error:
Fatal error: Uncaught TypeError: Argument 1 passed to
DOMNode::removeChild() must be an instance of DOMNode, null given
Why can't I delete the nodes at the selected index?
I don't know if it's the best approach, but I have fixed it this way:
function update_existing_entry ($compiledFile, $compiledXML, $parsedOffer) {
$doc = new DOMDocument();
$doc->load($compiledFile);
$node = $doc->documentElement;
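// Copy the live node list first; removing nodes while iterating it directly can skip elements
foreach (iterator_to_array($doc->getElementsByTagName('offer')) as $child) {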
if (strpos($child->nodeValue, (string)$parsedOffer->number) !== false) {
$node->removeChild($child);
}
}
$doc->save($compiledFile);
}
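For what it's worth, the original version presumably failed because $compiledXML was loaded before the file was changed, so after the first removal its indexes no longer line up with the freshly loaded DOMDocument and item($index) can return null. Below is a minimal sketch of a stricter variant that compares the number child exactly instead of using strpos on the whole node value. The helper name removeOffersByNumber and the sample XML are made up for illustration, and it assumes each offer has a direct number child.

```php
<?php
// Hypothetical helper (not from the original post): remove every <offer>
// whose <number> child equals $number, and return how many were removed.
function removeOffersByNumber(DOMDocument $doc, string $number): int
{
    $removed = 0;
    // getElementsByTagName() returns a live list, so copy it before removing nodes
    $offers = iterator_to_array($doc->getElementsByTagName('offer'));
    foreach ($offers as $offer) {
        $numberNode = $offer->getElementsByTagName('number')->item(0);
        if ($numberNode !== null && trim($numberNode->textContent) === $number) {
            $offer->parentNode->removeChild($offer);
            $removed++;
        }
    }
    return $removed;
}

// Sample data invented for the example
$doc = new DOMDocument();
$doc->loadXML(
    '<offers>'
    . '<offer><number>10</number></offer>'
    . '<offer><number>110</number></offer>'
    . '<offer><number>10</number></offer>'
    . '</offers>'
);
$removed = removeOffersByNumber($doc, '10');
$remaining = $doc->getElementsByTagName('offer')->length;
echo "removed $removed, remaining $remaining\n";
```

Note that offer 110 survives here, whereas a strpos check for "10" would also have matched it.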
I read this article.
https://qiita.com/yasumodev/items/74a73ed4b3f1dd45edb8
And I did the same thing.
// Fetch the XML (RSS, etc.)
$strXml = file_get_contents("./doc.xml");
// Convert XML to JSON
$strJson = xml_to_json($strXml);
// Output
echo $strJson;
//**********************************
// Function that converts XML to JSON
//**********************************
function xml_to_json($xml)
{
// Replace colons with underscores (namespace workaround)
$xml = preg_replace("/<([^>]+?):([^>]+?)>/", "<$1_$2>", $xml);
// Restore the ones that belong to a protocol
$xml = preg_replace("/_\/\//", "://", $xml);
// Convert the XML string to an object (CDATA included)
$objXml = simplexml_load_string($xml, NULL, LIBXML_NOCDATA);
// Expand the attributes
xml_expand_attributes($objXml);
// Convert to a JSON string
$json = json_encode($objXml, JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE);
// Replace "\/" with "/"
return preg_replace('/\\\\\//', '/', $json);
}
//**********************************
// Function that expands XML tag attributes
//**********************************
function xml_expand_attributes($node)
{
if($node->count() > 0) {
foreach($node->children() as $child)
{
foreach($child->attributes() as $key => $val) {
$node->addChild($child->getName()."@".$key, $val);
}
xml_expand_attributes($child); // recursive call
}
}
}
But this way, several object names change to "@attributes".
I want to keep the original object names here (T_T)
Please help me.
When you json_encode() XML with attributes, this creates the @attributes elements you're getting. The only way round this is to remove the attributes as you expand them. I've changed the routine: first, I've moved the part that processes the attributes to the top, which ensures that the root node gets processed as well.
The main change is that it now uses XPath to retrieve the attributes. It still encodes them as you did, but it also lets you remove each attribute from the original node (using unset($attribute[0]);)...
function xml_expand_attributes($node)
{
foreach ($node->xpath("@*") as $attribute) {
$node->addChild($node->getName()."@".$attribute->getName(), (string)$attribute);
unset($attribute[0]);
}
if($node->count() > 0) {
foreach($node->children() as $child)
{
xml_expand_attributes($child); // recursive call
}
}
}
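To see where the "@attributes" key comes from in the first place, here is a small sketch that encodes an element with one attribute and inspects the result. The sample XML is invented for the example; it only illustrates the default json_encode() behaviour described above, not the fixed routine.

```php
<?php
// Sample XML invented for the example
$xml = simplexml_load_string('<root><item id="1"><name>A</name></item></root>');

// Round-trip through JSON to get a plain array
$decoded = json_decode(json_encode($xml), true);

// Attributes of <item> end up under the special "@attributes" key
$hasAttributesKey = isset($decoded['item']['@attributes']);
$id = $decoded['item']['@attributes']['id'] ?? null;

var_dump($hasAttributesKey, $id);
```

Removing each attribute after copying it into a child element is what keeps this key out of the final JSON.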
I'm trying to loop through an XML file and save each node paired with its value into an array (key => value). I also want it to keep track of the nodes it passed (something like array(users_user_name => "myName", users_user_email => "myEmail") etc.).
I know how to do this but there is a problem. All the nodes could have children and those children might also have children etc. so I need some sort of recursive function to keep looping through the children until it reaches the last child.
So far I got this:
//loads the xml file and creates simpleXML object
$xml = simplexml_load_string($content);
// for each root value
foreach ($xml->children() as $children) {
// for each child of the root node
$node = $children;
while ($children->children()) {
foreach ($children as $child) {
if($child->children()){
break;
}
$children = $node->getName();
//Give key a name
$keyOfValue = $xml->getName() . "_" . $children . "_" . $child->getName();
// pass value from child to children
$children = $child;
// no children, fill array: key => value
if ($child->children() == false) {
$parent[$keyOfValue] = (string)$child;
}
}
}
$dataObject[] = $parent;
}
The "break;" is to prevent it from giving me the wrong values because "child" is an object and not the last child.
Using recursion, you can write some 'complicated' processing, but the problem is losing your place.
The function I use here is passed a couple of things to keep track of the name and the current output, as well as the node it's currently working with. As you can see, the method checks whether there are any child nodes and calls itself again to process each one of them.
$content = <<< XML
<users>
<user>
<name>myName</name>
<email>myEmail</email>
<address><line1>address1</line1><line2>address2</line2></address>
</user>
</users>
XML;
function processNode ( $base, SimpleXMLElement $node, &$output ) {
$base[] = $node->getName();
$nodeName = implode("_", $base);
$childNodes = $node->children();
if ( count($childNodes) == 0 ) {
$output[ $nodeName ] = (string)$node;
}
else {
foreach ( $childNodes as $newNode ) {
processNode($base, $newNode, $output);
}
}
}
$xml = simplexml_load_string($content);
$output = [];
processNode([], $xml, $output);
print_r($output);
This prints out...
Array
(
[users_user_name] => myName
[users_user_email] => myEmail
[users_user_address_line1] => address1
[users_user_address_line2] => address2
)
With this implementation, there are limitations to the content - so for example - repeating content will only keep the last value (say for example there were multiple users).
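One possible way around the repeated-content limitation is to add an index to the key whenever a name occurs more than once under the same parent. This is only a sketch: the function name flattenNode, the 0-based "_0", "_1" suffix and the sample XML are my own assumptions, not part of the code above.

```php
<?php
// Hypothetical variant of processNode(): repeated sibling names get a 0-based index
function flattenNode(string $key, SimpleXMLElement $node, array &$output): void
{
    $children = $node->children();
    if (count($children) === 0) {
        // Leaf node: store its text under the accumulated key
        $output[$key] = (string) $node;
        return;
    }
    // Count how often each child name occurs under this node
    $totals = [];
    foreach ($children as $child) {
        $name = $child->getName();
        $totals[$name] = ($totals[$name] ?? 0) + 1;
    }
    $seen = [];
    foreach ($children as $child) {
        $name = $child->getName();
        $childKey = $key . '_' . $name;
        if ($totals[$name] > 1) {
            // Only repeated names get a suffix, so unique names keep the short key
            $index = $seen[$name] ?? 0;
            $seen[$name] = $index + 1;
            $childKey .= '_' . $index;
        }
        flattenNode($childKey, $child, $output);
    }
}

// Sample data invented for the example
$xml = simplexml_load_string(
    '<users><user><name>Ann</name></user><user><name>Bob</name></user></users>'
);
$output = [];
flattenNode($xml->getName(), $xml, $output);
print_r($output);
```

With two users this yields the keys users_user_0_name and users_user_1_name instead of a single overwritten users_user_name.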
You'll want to use recursion!
Here's a simple example of recursion:
function doThing($param) {
// Do what you need to do
$param = alterParam($param);
// If there's more to do, do it again and return that result
if ($param != $condition) {
return doThing($param);
}
// Otherwise, we are ready to return the result
return $param;
}
You can apply this thinking to your specific use case.
//Using SimpleXML library
// Parses XML but returns an Object for child nodes
public function getNodes($root)
{
$output = array();
if($root->children()) {
$children = $root->children();
foreach($children as $child) {
if(!($child->children())) {
$output[] = (array) $child;
}
else {
$output[] = self::getNodes($child->children());
}
}
}
else {
$output = (array) $root;
}
return $output;
}
I'll just add to this.
I've had some trouble when namespaces come into the mix, so I made the following recursive function to resolve a node.
This method goes down to the deepest node and uses it as the value. In my case the top node's nodeValue contains all the values nested within it, so we have to dig into the lowest level and use that as the true value.
// using the XMLReader to read an xml file ( in my case it was a 80gig xml file which is why i don't just load everything into memory )
$reader = new \XMLReader;
$reader->open($path); // where $path is the file path to the xml file
// using a dirty trick to skip most of the xml that is irrelevant where $nodeName is the node im looking for
// then in the next while loop i skip to the next node
while ($reader->read() && $reader->name !== $nodeName);
while ($reader->name === $nodeName) {
$doc = new \DOMDocument;
$dom = $doc->importNode($reader->expand(), true);
$data = $this->processDom($dom);
$reader->next($dom->localName);
}
public function processDom(\DOMNode $node)
{
$data = [];
/** @var \DOMNode $childNode */
foreach ($node->childNodes as $childNode) {
// child nodes include of a lot of #text nodes which are irrelevant for me, so i just skip them
if ($childNode->nodeName === '#text') {
continue;
}
$childData = $this->processDom($childNode);
if ($childData === null || $childData === []) {
$data[$childNode->localName] = $childNode->nodeValue;
} else {
$data[$childNode->localName] = $childData;
}
}
return $data;
}
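For anyone who wants to try the streaming pattern without a huge file, here is a self-contained sketch that runs it on a small in-memory string. The element names item and id and the sample XML are invented for the example; with a real file you would call open($path) as above instead of XML().

```php
<?php
// Sample data invented for the example
$xml = '<items><header/><item><id>1</id></item><item><id>2</id></item></items>';

$reader = new XMLReader();
$reader->XML($xml);

// Skip forward until the first <item> element
while ($reader->read() && $reader->name !== 'item');

$ids = [];
while ($reader->name === 'item') {
    // Expand only the current <item> into a small DOM fragment
    $doc = new DOMDocument();
    $element = $doc->importNode($reader->expand(), true);
    $ids[] = $element->getElementsByTagName('id')->item(0)->textContent;

    // Jump to the next sibling <item>; stop when there is none
    if (!$reader->next('item')) {
        break;
    }
}
$reader->close();

print_r($ids);
```

Only one item is expanded into memory at a time, which is the point of using XMLReader for very large files.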
I am trying to read this xml file, but the code I am trying to make should work for any xml-file:
<?xml version="1.0"?>
<auto>
<binnenkant>
<voorkant>
<stuur/>
<gas/>
</voorkant>
<achterkant>
<stoel/>
<bagage/>
</achterkant>
</binnenkant>
<buitenkant>
<dak>
<dakkoffer>
<sky/>
<schoen/>
</dakkoffer>
</dak>
<trekhaak/>
<wiel/>
</buitenkant>
</auto>
I am using the two functions below to turn the XML-file into an array and turn that array into a tree.
I am trying to keep the parent-child relationship of the XML file. All I am getting back from the second function is an array with all the tags in the xml-file.
Can someone please help me?
function build_xml_tree(array $vals, $parent, $level) {
$branch = array();
foreach ($vals as $item) {
if (($item['type'] == "open") || $item['type'] == "complete") {
if ($branch && $level == $item['level']) {
array_push($branch, ucfirst(strtolower($item['tag'])));
} else if ($parent == "" || $level < $item['level']) {
$branch = array(ucfirst(strtolower($item['tag'])) => build_xml_tree($vals, strtolower($item['tag']), $level));
}
}
}
return $branch;
}
function build_tree ($begin_tree, $content_naam) {
$xml = file_get_contents('xml_files/' . $content_naam . '.xml');
$p = xml_parser_create();
xml_parse_into_struct($p, $xml, $vals, $index);
?>
<pre>
<?php
print_r($vals);
?>
</pre>
<?php
$eindarray = array_merge($begin_tree, build_xml_tree($vals, "", 1));
return $eindarray;
}
There are many classes which can load an XML file. Many of them already represent the file in a tree structure: DOMDocument is one of them.
It seems a bit strange that you want to make a tree as an array when you already have a tree in a DOMDocument object: since you'll have to traverse the array-tree in some way... why don't you traverse directly the tree structure of the object-tree for printing for example?
Anyway the following code should do what you're asking for: I've used a recursive function in which the array-tree is passed by reference.
It should be trivial at this point to arrange my code to better suit your needs - i.e. complete the switch statement with more case blocks.
the $tree array has a numeric key for each node level
the tag node names are string values
if an array follows a string value, it contains the node's children
any text node is treated as a child node
function buildTree(DOMNode $node, &$tree) {
foreach ($node->childNodes as $cnode) {
switch($cnode->nodeType) {
case XML_ELEMENT_NODE:
$tree[] = $cnode->nodeName;
break;
case XML_TEXT_NODE:
$tree[] = $cnode->nodeValue;
break;
}
if ($cnode->hasChildNodes())
buildTree($cnode, $tree[count($tree)]);
}
}
$source ='the string which contains the XML';
$doc = new DOMDocument();
$doc->preserveWhiteSpace = FALSE;
$doc->loadXML($source, LIBXML_NOWARNING);
$tree = array();
buildTree($doc, $tree);
var_dump($tree);
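If a nested associative array keyed by tag name is closer to what you want, a different layout is possible. The sketch below is an alternative to the function above, not a rewrite of it. The function name elementTree is made up, it ignores text nodes, and it assumes sibling tag names are unique, since repeated names would overwrite each other.

```php
<?php
// Hypothetical alternative: each element name maps to an array of its child elements
function elementTree(DOMNode $node): array
{
    $tree = [];
    foreach ($node->childNodes as $child) {
        if ($child->nodeType === XML_ELEMENT_NODE) {
            // Leaf elements end up as empty arrays
            $tree[$child->nodeName] = elementTree($child);
        }
    }
    return $tree;
}

// Shortened version of the XML from the question
$source = '<auto><binnenkant><voorkant><stuur/><gas/></voorkant></binnenkant><trekhaak/></auto>';
$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->loadXML($source);

$tree = elementTree($doc);
print_r($tree);
```

The parent-child relationship is then visible directly in the nesting, for example $tree['auto']['binnenkant']['voorkant'].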
Is there any way to download STANDARD-XML metadata from RETS using PHRETS?
Currently I am able to extract each class's metadata as an array using the PHRETS function GetMetadataTable, and then combine and convert it to XML.
But recently I found differences between the single STANDARD-XML metadata (for all resources and classes) and the individual class metadata. Even with the metadata viewer service RETSMD.com (built on PHRETS), the class name obtained from the STANDARD-XML metadata is different, and I am unable to view the details.
Note: I got the STANDARD-XML metadata via direct browser login using my credentials, like this:
http://rets.login.url/GetMetadata?Type=METADATA-TABLE&Format=STANDARD-XML&ID=0
Has anyone faced the same issue? Is there any solution using PHP?
Thanks in advance!
I got a solution by modifying the PHRETS library.
I added a new function there with the following code:
if (empty($this->capability_url['GetMetadata'])) {
die("GetServerInformation() called but unable to find GetMetadata location. Failed login?\n");
}
$optional_params['Type'] = 'METADATA-SYSTEM';
$optional_params['ID'] = '*';
$optional_params['Format'] = 'STANDARD-XML';
//request server information
$result = $this->RETSRequest($this->capability_url['GetMetadata'], $optional_params );
if (!$result) {
return false;
}
list($headers, $body) = $result;
$xml = $this->ParseXMLResponse($body);
Note: the main thing to note is this line:
$optional_params['ID'] = '*';
The ID should be '*' instead of '0'.
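For comparison with the browser URL in the question, here is a rough sketch of the query string those parameters produce. It uses only PHP's built-in http_build_query(), not PHRETS, and the base URL is just the placeholder from the question.

```php
<?php
// Parameter values taken from the answer above
$params = [
    'Type'   => 'METADATA-SYSTEM',
    'Format' => 'STANDARD-XML',
    'ID'     => '*',
];

// Placeholder base URL copied from the question
$url = 'http://rets.login.url/GetMetadata?' . http_build_query($params);

// The asterisk is percent-encoded in the resulting query string
echo $url, "\n";
```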
If anyone is still unable to retrieve STANDARD-XML data from the CREA DDF data feed using PhRETS v2.x.x, I created a fork to the ./src/Parsers/Search/OneX.php file. You can add the following protected methods to the end of the file:
protected function parseDDFStandardXMLData(&$xml)
{
// we can only work with an array
$property_details = json_decode(json_encode($xml), true);
$retn = array();
if(! empty($property_details['RETS-RESPONSE']['PropertyDetails'])) {
foreach($property_details['RETS-RESPONSE']['PropertyDetails'] as $property_array) {
$retn[] = $this->parseArrayElements(null, $property_array);
}
}
return $retn;
}
protected function parseArrayElements($parent_key, $element)
{
// three possible $element types
// 1. scalar value
// 2. sub-array
// 3. SimpleXMLElement Object
$retn = array();
if(is_object($element)) {
$element = json_decode(json_encode($element), true);
}
if(is_array($element)) {
foreach($element as $node_key => $node) {
$key = $node_key;
if(! empty($parent_key)) {
$key = $parent_key . '|' . $key;
}
if(is_array($node) || is_object($node)) {
$nodes = $this->parseArrayElements($key, $node);
if(!empty($nodes)) {
foreach($nodes as $k => $n) {
$retn[$k] = $n;
}
}
}else{
$retn[$key] = $node;
}
}
}else{
$retn[$parent_key] = $element;
}
return $retn;
}
protected function parseRecordFromArray(&$array, Results $rs)
{
$r = new Record;
foreach($rs->getHeaders() as $key => $name) {
$r->set($name, $array[$name]);
}
return $r;
}
Then replace the parseRecords() method with:
protected function parseRecords(Session $rets, &$xml, $parameters, Results $rs)
{
if (isset($xml->DATA)) {
foreach ($xml->DATA as $line) {
$rs->addRecord($this->parseRecordFromLine($rets, $xml, $parameters, $line, $rs));
}
}elseif (isset($xml->{"RETS-RESPONSE"}->PropertyDetails)) {
$data = $this->parseDDFStandardXMLData($xml);
if(! empty($data)) {
$fields_saved = false;
foreach ($data as $line) {
if(!$fields_saved) {
$rs->setHeaders(array_keys($line));
}
$rs->addRecord($this->parseRecordFromArray($line, $rs));
}
}
}
}
The line, }elseif (isset($xml->{"RETS-RESPONSE"}->PropertyDetails)) { in the latter method does the trick to identify the STANDARD-XML RETS-RESPONSE node and parse the data.
Hope this helps,
Cheers!
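To make the flattening step easier to follow, here is a standalone sketch of the same idea as parseArrayElements(): nested arrays are collapsed into one level, with the key path joined by "|". The function name flattenKeys and the sample record are invented for illustration and are not part of PHRETS or the DDF feed.

```php
<?php
// Hypothetical standalone version of the flattening idea: nested arrays become
// a single-level array whose keys are the path segments joined by "|".
function flattenKeys(array $data, string $prefix = ''): array
{
    $flat = [];
    foreach ($data as $key => $value) {
        $path = $prefix === '' ? (string) $key : $prefix . '|' . $key;
        if (is_array($value)) {
            // Recurse and copy the flattened children into the result
            foreach (flattenKeys($value, $path) as $childPath => $childValue) {
                $flat[$childPath] = $childValue;
            }
        } else {
            $flat[$path] = $value;
        }
    }
    return $flat;
}

// Sample record invented for the example
$record = [
    'ListingID' => '123',
    'Address'   => ['City' => 'Ottawa', 'Province' => 'Ontario'],
];
$flat = flattenKeys($record);
print_r($flat);
```

The flattened keys (for example Address|City) are what end up as the result headers in the parseRecords() replacement.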
I'm trying to come up with a solution that will allow users to upload a mail merge-enabled Word DOCX template file. Ideally, the system will read the DOCX file, extract the XML, find the mail merge fields and save them to a database for mapping down the road. I may go with a SOAP service such as Zend LiveDocX or PHPDOCX or something else entirely -- but for now I need to figure out how to identify the fields in a DOCX file. To do that I've started with this article: http://dfmaxwell.wordpress.com/2012/02/24/using-php-to-process-a-word-document-mail-merge/
I've adapted it a bit to my needs (which may be a problem, though I get the same error with the original code as well.) Specifically I'm not using it to perform the mail merge at this time, I just want to identify the fields. Here's what I've got:
$newFile = '/var/www/mysite.com/public_html/template.docx';
$zip = new ZipArchive();
if( $zip->open( $newFile, ZIPARCHIVE::CHECKCONS ) !== TRUE ) { echo 'failed to open template'; exit; }
$file = 'word/document.xml';
$data = $zip->getFromName( $file );
$zip->close();
$doc = new DOMDocument();
$doc->loadXML( $data );
$wts = $doc->getElementsByTagNameNS('http://schemas.openxmlformats.org/wordprocessingml/2006/main', 'fldChar');
$mergefields = array();
function getMailMerge(&$wts, $index) {
$loop = true;
$counter = $index;
$startfield = false;
while ($loop) {
if ($wts->item($counter)->attributes->item(0)->nodeName == 'w:fldCharType') {
$nodeName = '';
$nodeValue = '';
switch ($wts->item($counter)->attributes->item(0)->nodeValue) {
case 'begin':
if ($startfield) {
$counter = getMailMerge($wts, $counter);
}
$startfield = true;
if ($wts->item($counter)->parentNode->nextSibling) {
$nodeName = $wts->item($counter)->parentNode->nextSibling->childNodes->item(1)->nodeName;
$nodeValue = $wts->item($counter)->parentNode->nextSibling->childNodes->item(1)->nodeValue;
}
else {
// No sibling
// check next node
$nodeName = $wts->item($counter + 1)->parentNode->previousSibling->childNodes->item(1)->nodeName;
$nodeValue = $wts->item($counter + 1)->parentNode->previousSibling->childNodes->item(1)->nodeValue;
}
if (substr($nodeValue, 0, 11) == ' MERGEFIELD') {
$mergefields[] = strtolower(str_replace('"', '', trim(substr($nodeValue, 12))));
}
$counter++;
break;
case 'separate':
$counter++;
break;
case 'end':
if ($startfield) {
$startfield = false;
}
$loop = false;
}
}
}
return $counter;
}
for ($x = 0; $x < $wts->length; $x++) {
if ($wts->item($x)->attributes->item(0)->nodeName == 'w:fldCharType' && $wts->item($x)->attributes->item(0)->nodeValue == 'begin') {
$newcount = getMailMerge($wts, $x);
$x = $newcount;
}
}
I have no problem opening the DOCX file with ZipArchive() and if I use print_r($doc->saveHTML()); I see the XML data just fine. The problem is that when I execute my code I get Fatal error: Call to a member function item() on a non-object pointing to this:
$nodeName = $wts->item($counter)->parentNode->nextSibling->childNodes->item(1)->nodeName;
Google has let me down when trying to figure out this error, can anyone point me in the right direction? Thanks in advance!
Found a solution -- it's not quite as elegant as what I was hoping for but here goes.
Using xml_parser_create_ns I can search the DOCX file for the keys I need, specifically "HTTP://SCHEMAS.OPENXMLFORMATS.ORG/WORDPROCESSINGML/2006/MAIN:INSTRTEXT" which identifies all fields marked as "MERGEFIELD". Then I can dump the results into an array and use them to update the database. To wit:
// Word file to be opened
$newFile = '/var/www/mysite.com/public_html/template.docx';
// Extract the document.xml file from the DOCX archive
$zip = new ZipArchive();
if( $zip->open( $newFile, ZIPARCHIVE::CHECKCONS ) !== TRUE ) { echo 'failed to open template'; exit; }
$file = 'word/document.xml';
$data = $zip->getFromName( $file );
$zip->close();
// Create the XML parser and create an array of the results
$parser = xml_parser_create_ns();
xml_parse_into_struct($parser, $data, $vals, $index);
xml_parser_free($parser);
// Look up the index entry for the important key; it lists the matching positions in $vals
$key = 'HTTP://SCHEMAS.OPENXMLFORMATS.ORG/WORDPROCESSINGML/2006/MAIN:INSTRTEXT';
$found = isset($index[$key]) ? $index[$key] : array();
// Cycle *that* array looking for "MERGEFIELD" and grab the field name to yet another array
// Make sure to check for duplicates since fields may be re-used
$mergefields = array();
foreach ($found as $field) {
$value = isset($vals[$field]['value']) ? $vals[$field]['value'] : '';
// Skip instruction text that is not a merge field
if (strpos($value, 'MERGEFIELD') === false) {
continue;
}
$name = strtolower(trim(substr($value, 12)));
if (!in_array($name, $mergefields)) {
$mergefields[] = $name;
}
}
// View the fruits of your labor
print_r($mergefields);
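The same lookup can probably also be done with DOMDocument by asking for the instrText elements directly, which avoids the uppercase index keys. This is a sketch under assumptions: the XML string is a tiny invented stand-in for word/document.xml, and it assumes each field instruction sits in a single instrText element, which may not hold for every document since Word can split an instruction across several runs.

```php
<?php
// Tiny stand-in for word/document.xml, invented for the example
$ns = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';
$data = '<w:document xmlns:w="' . $ns . '"><w:body><w:p>'
      . '<w:r><w:instrText xml:space="preserve"> MERGEFIELD FirstName </w:instrText></w:r>'
      . '<w:r><w:instrText xml:space="preserve"> PAGE </w:instrText></w:r>'
      . '<w:r><w:instrText xml:space="preserve"> MERGEFIELD "Last Name" \* MERGEFORMAT </w:instrText></w:r>'
      . '<w:r><w:instrText xml:space="preserve"> MERGEFIELD FirstName </w:instrText></w:r>'
      . '</w:p></w:body></w:document>';

$doc = new DOMDocument();
$doc->loadXML($data);

$mergefields = [];
foreach ($doc->getElementsByTagNameNS($ns, 'instrText') as $node) {
    // Capture either a quoted name or a single unquoted word after MERGEFIELD
    if (preg_match('/^\s*MERGEFIELD\s+("[^"]+"|\S+)/', $node->textContent, $m)) {
        $name = strtolower(trim($m[1], '"'));
        // Fields may be re-used, so skip duplicates
        if (!in_array($name, $mergefields, true)) {
            $mergefields[] = $name;
        }
    }
}

print_r($mergefields);
```

The PAGE field is ignored because its instruction text does not start with MERGEFIELD, and the repeated FirstName field is stored once.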