Hello All!
I'm active in a fairly large project, but I have limited experience with XML. I am dynamically generating XML data, which may be customized to the needs of individual customers. The current solution has been (please don't hurt me, I'm the new guy) to inline a PHP template with include(). This is not good practice and I want to move to a better solution.
Structure
<?xml version='1.0'?>
<Product id="">
<AswItem></AswItem>
<EanCode></EanCode>
<ImagePopup></ImagePopup>
<ImageInfo></ImageInfo>
<ImageThumbnail></ImageThumbnail>
<PriceCurrency></PriceCurrency>
<PriceValueNoTax></PriceValueNoTax>
<Manufacture></Manufacture>
<ProductDescriptions>
<ProductDescrition language="" id="">
<Name></Name>
<Description></Description>
<Color></Color>
<Size></Size>
<NavigationLevel1></NavigationLevel1>
<NavigationLevel2></NavigationLevel2>
<NavigationLevel3></NavigationLevel3>
<NavigationLevel4></NavigationLevel4>
</ProductDescrition>
</ProductDescriptions>
<MatrixProducts>
<AswItem></AswItem>
<EanCode></EanCode>
<ParentId></ParentId>
<PriceCurrency></PriceCurrency>
<PriceValueNoTax></PriceValueNoTax>
<ImagePopup></ImagePopup>
<ImageInfo></ImageInfo>
<ImageThumbnail></ImageThumbnail>
</MatrixProducts>
</Product>
This is our main structure. ProductDescriptions and MatrixProducts are basically list items, and may contain none to several children. Our object to be translated into XML is a PHP hash tree with a similar structure but with different keys.
Problem
The problem is that I get stuck when thinking about how to dynamically create a tree from an object. My current plan is to have a key conversion table (see Current solution), but a voice in the back of my head is telling me that it's not best practice.
Previous solution
populate.php
foreach($products as $product) {
// too much black magic in here
include($chosenTemplate);
// $productXMLString is generated in the previous include
printToXML($productXMLString);
}
template.php
<?php
echo "<Product id='{$product['id']}'>";
// etc...
echo "</Product>";
As you can see, this is a pretty bad approach: bad error handling, messy syntax and lots of other quirks.
Current solution
$xmlProductTemplate = simplexml_load_file($currentTemplate);
// $outputHandle is assumed to be a writable file handle from fopen()
foreach ($products as $product) {
    $xmlObj = clone $xmlProductTemplate;
    foreach ($product as $key => $productValue) {
        // if the value is a plain string, write it into the child
        // element named by the translated key
        if (!is_array($productValue)) {
            // property syntax sets a child element; array syntax would set an attribute
            $xmlObj->{translateKeyToXML($key)} = $productValue;
        }
        // otherwise we need to call the magic: traverse a child array
        // and still hold true to the templating
        else {
            // what DO you do?
        }
    }
    // save xml
    fputs($outputHandle, $xmlObj->asXML());
}
How would you go about this and what is best practice? I'm a bit hungry and dehydrated so please tell me if I'm missing something basic.
I am having a bit of trouble understanding what you're trying to do so excuse me if I'm off here. What it seems like you are trying to do is create an XML file based on a "template" with an ArrayObject containing the attributes and values of the XML elements.
Perhaps, instead of trying to do that, you just create a SimpleXML object. I think that would be much easier for what you're trying to do and it adds the value of error catching. See SimpleXML on PHP.net.
If I am not on the right track with an answer, can you post more source code like the class that contains the values? Thanks.
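One way to fill in the "what DO you do?" branch is to build the SimpleXML tree recursively from the array instead of cloning a template. The following is only a minimal sketch under stated assumptions: the function name appendArray(), the KEY_MAP table, the "@" prefix for attributes and the "key[]" entry for naming list items are all conventions invented for this example, and the sample array keys are made up because the real hash tree was not posted.

```php
<?php
// Hypothetical key conversion table: internal array key => XML element name.
// An entry of the form 'key[]' names the repeated children of a list.
const KEY_MAP = [
    'item_no'        => 'AswItem',
    'descriptions'   => 'ProductDescriptions',
    'descriptions[]' => 'ProductDescrition', // spelled as in the posted structure
    'name'           => 'Name',
];

function appendArray(SimpleXMLElement $parent, array $data, array $keyMap, string $itemName = 'Item'): void
{
    foreach ($data as $key => $value) {
        // Assumed convention: keys starting with '@' become attributes.
        if (is_string($key) && strncmp($key, '@', 1) === 0) {
            $parent->addAttribute(substr($key, 1), (string) $value);
            continue;
        }
        // Integer keys are list entries and reuse the item name chosen by the caller.
        $name  = is_int($key) ? $itemName : ($keyMap[$key] ?? $key);
        $child = $parent->addChild($name);
        if (is_array($value)) {
            // Recurse; list entries below this element are named via the 'key[]' entry.
            appendArray($child, $value, $keyMap, $keyMap[$key . '[]'] ?? 'Item');
        } else {
            // Assigning through [0] sets the text and escapes '&' and '<'.
            $child[0] = (string) $value;
        }
    }
}

// Made-up sample product.
$product = [
    '@id'          => '42',
    'item_no'      => 'A-1',
    'descriptions' => [
        ['@language' => 'en', 'name' => 'Tea & biscuits'],
    ],
];

$xml = new SimpleXMLElement('<Product/>');
appendArray($xml, $product, KEY_MAP);
echo $xml->asXML();
```

With this shape, lists such as ProductDescriptions can hold zero or more entries without any template, and the key table stays the only place where internal names are mapped to XML names.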
Related
I am trying to change a value in an xml file using php. I am loading the xml file using php into an object like this..
if(file_exists('../XML/example.xml')) {
$example = simplexml_load_file('../XML/example.xml');
}
else {
exit ("can't load the file");
}
Then once it is loaded I am changing values within tags, by assigning them the contents of another variable, like this...
$example->first_section->second_section->third_section->title = $var['data'];
Then once I've made the necessary changes the file is saved. So far this process is working well, but have now hit a stumbling block.
I want to change a value within a particular tag in my xml file, which has an id. In the XML file it looks like this.
<first_section>
<second_section>
<third_section id="2">
<title>Mrs</title>
</third_section>
</second_section>
</first_section>
How can I change this value using similar syntax to what I've been using?
doing..
$example->first_section->second_section->third_section id="2" ->title = $var['data']
doesn't work as the syntax is wrong.
I've been scanning through stack overflow, and all over the net for an example of doing it this way but come up empty.
Is it possible to target and change a value in an xml like this, or do I need to change the way I am amending this file?
Thanks.
Some dummy code as your provided XML is surely not the original one.
$xml = simplexml_load_file('../XML/example.xml');
$section = $xml->xpath("//third_section[@id='2']")[0];
// runs a query on the xml tree
// gives always back an array, so pick the first one directly
$section["id"] = "3";
// check if it has indeed changed
echo $xml->asXML();
As @Muhammed M. already said, check the SimpleXML documentation for more information. Check the corresponding demo on ideone.com.
Figured it out after much messing around. Thanks to your contributions, I did indeed need to use XPath. However, the reason it wasn't working for me was that I wasn't specifying the entire path to the node I wanted to edit.
For example, after loading the xml file into an object ($xml):
foreach($xml->xpath("/first_section/second_section/third_section[@id='2']") as $entry ) {
$entry->title = "mr";
}
This will work, because the whole path to the node is included in the parentheses.
But in our above examples eg:
foreach($xml->xpath("//third_section[@id='2']") as $entry ) {
$entry->title = "mr";
}
This didn't work for me, even though my understanding was that the leading // makes XPath search the whole XML structure and return every third_section whose id is 2. After hours of testing, that was not what I saw: the query only matched once I included the entire path to the node.
Also, on a side note: $section = $xml->xpath("//third_section[@id='2']")[0];
was flagged as incorrect syntax for me. Indexing the result of a function call directly ("[0]") is only valid from PHP 5.4 onward; Dreamweaver's syntax checker flagged it, and uploading the code anyway broke it. Without the index, all you need is the line below. xpath() returns an array, so read the first match from $section[0] afterwards.
$section = $xml->xpath(" entire path to node in here [@id='2']");
Thanks for helping and suggesting xpath. It works very well... once you know how to use it.
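For comparison, here is a small self-contained sketch that runs both query forms against an inline copy of the sample document. The <root> wrapper element and the second third_section entry are assumptions added for the example; the real file was not posted. Elements returned by xpath() still belong to the loaded document, so assigning to them changes what asXML() writes out.

```php
<?php
// Inline sample; <root> and the id="1" entry are assumed for illustration.
$source = <<<XML
<root>
  <first_section>
    <second_section>
      <third_section id="1"><title>Mr</title></third_section>
      <third_section id="2"><title>Mrs</title></third_section>
    </second_section>
  </first_section>
</root>
XML;

$xml = simplexml_load_string($source);

// Descendant search: matches third_section at any depth.
$matches = $xml->xpath("//third_section[@id='2']");

// Absolute path: must start at the document's root element.
$absolute = $xml->xpath("/root/first_section/second_section/third_section[@id='2']");

// xpath() returns an array, which is empty when nothing matches.
if (!empty($matches)) {
    $matches[0]->title = 'Ms';
}

echo $xml->asXML();
```

Note that an absolute path has to begin with the actual root element of the file, which is one place where a hand-written full path and a // query can behave differently.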
I have a DOM structure which acts as a template for building a larger document. The template looks something like this (oversimplified example)
<book> // $cache[0]
<data></data>
<author></author> // $cache[1]
<published>
<company></company> // $cache[3]
<date></date>
</published>
<blurb></blurb>
<related></related> // $cache[2]
</book>
As you can hopefully see, I cache certain nodes within this template with the hope of doing expensive searches only once. (XPath is unusable in this situation due to the strict standards of the template.)
The above template will be added to a document looking like this:
<store>
<genre>
<computing>
// Insert here
</computing>
<nature>
// Again here
</nature>
</genre>
</store>
Basically, it can be inserted anywhere. The problem I can't figure out how to solve is how to keep or quickly update the cache points after the template has been inserted with methods like appendChild and insertBefore. The only solution I can see is to re-search the inserted node, but like I mentioned, this is expensive and certain tags which aided the first search will have been removed.
I find the insert points the way any template engine does: by iterating the DOM and performing actions on certain handlers, e.g. {{book}} will request that the above template be inserted.
The cache is simply an array of DomNodes but this can easily be changed if there is a better cross document method. I'm open to suggestions or pointers to code that have implemented similar.
I solved this by not caching the DomNode but rather a path to the node. I first looked at getNodePath() which returns an XPath to the node, but just looking at the returned path I saw that XPath must do a lot of branching under the hood. So I came up with this:
foreach ( $node->childNodes as $child ) {
$index++;
$path = $path . "->childNodes->item($index)";
}
Then after inserting the node into the second document, those cache points can be quickly referenced by
eval("\$node = \$node$path;");
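The same idea can be sketched without eval() by storing the path as a list of child indexes and walking it again after the insert. This is a sketch under assumptions, not the code from the post: indexPath() and resolvePath() are hypothetical helper names, and the sample template is written without whitespace so that text nodes do not shift the indexes.

```php
<?php
// Child-index path from $root down to $node, e.g. [2, 0].
function indexPath(DOMNode $root, DOMNode $node): array
{
    $path = [];
    while (!$node->isSameNode($root)) {
        $parent = $node->parentNode;
        if ($parent === null) {
            throw new InvalidArgumentException('Node is not inside the given root.');
        }
        // The position among siblings equals the number of previous siblings.
        $index = 0;
        for ($sib = $node->previousSibling; $sib !== null; $sib = $sib->previousSibling) {
            $index++;
        }
        array_unshift($path, $index);
        $node = $parent;
    }
    return $path;
}

// Follow a stored index path from $root; returns null if it no longer fits.
function resolvePath(DOMNode $root, array $path): ?DOMNode
{
    $node = $root;
    foreach ($path as $index) {
        $node = $node->childNodes->item($index);
        if ($node === null) {
            return null;
        }
    }
    return $node;
}

$template = new DOMDocument();
$template->loadXML('<book><data/><author/><published><company/><date/></published><blurb/><related/></book>');

// Search once in the template and cache the path instead of the node.
$company = $template->getElementsByTagName('company')->item(0);
$cached  = indexPath($template->documentElement, $company);

$target = new DOMDocument();
$target->loadXML('<store><genre><computing/><nature/></genre></store>');

// Copy the template into the other document, then resolve the cached path there.
$copy = $target->importNode($template->documentElement, true);
$target->getElementsByTagName('computing')->item(0)->appendChild($copy);

$found = resolvePath($copy, $cached);
echo $found->nodeName, "\n";
```

Resolving a path costs one item() lookup per level, and the path stays valid for every copy of the template as long as the copy's child order is unchanged.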
I'm migrating a big WordPress site to a custom CMS. I need to extract information from a big (20MB+) XML file exported from WordPress.
I don't have any experience with XML in PHP and I don't know how to start reading the file.
Wordpress file contains structures like this:
<excerpt:encoded><![CDATA[Encoded text here]]></excerpt:encoded>
and I don't know how to handle this in PHP.
You are probably going to do fine with simplexml:
$xml = simplexml_load_file('big_xml_file.xml');
foreach ($xml->element as $el) {
echo $el->name;
}
See php.net for more info
Unfortunately, your XML example didn't come through.
PHP5 ships with two extensions for working with XML - DOM and "SimpleXML".
Generally speaking, I recommend looking into SimpleXML first since it's the more accessible library of the two.
For starters, use "simplexml_load_file()" to read an XML file into an object for further processing.
You should also check out the "SimpleXML basic examples page on php.net".
I don't have any experience in XML under PHP
Take a look at simplexml_load_file() or DomDocument.
<excerpt:encoded><![CDATA[Encoded text here]]></excerpt:encoded>
This should not be a problem for the XML parser. However, you will have a problem with the content exported by WordPress. For example, it can contain WordPress shortcodes, which will come across in their raw format instead of expanded.
Better Approach
Determine if what you are migrating to supports an export from WordPress feature. Many other systems do - Drupal, Joomla, Octopress, etc.
Although Adam is absolutely right, his answer needed a few more details. Here's a simple script that should get you going.
$xmlfile = simplexml_load_file('yourxmlfile.xml');
foreach ($xmlfile->channel->item as $item) {
var_dump($item->xpath('title'));
var_dump($item->xpath('wp:post_type'));
}
simplexml_load_file() is the way to go for creating an object, but you will also need to account for the namespaces WordPress uses. Prefixed elements such as wp:category are not reachable with plain property access; use xpath() or children() with the namespace instead.
$xml = simplexml_load_file( $file );
$xml->xpath('/rss/channel/wp:category');
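As an alternative to XPath, SimpleXML's children() method can select elements by namespace prefix, and casting an element to string returns the text inside a CDATA section. The snippet below is a minimal sketch on an inline sample: the document content is made up, and the namespace URIs are assumptions modelled on a WordPress export, so check the xmlns declarations at the top of the real file.

```php
<?php
// Made-up sample in the shape of a WordPress export (namespace URIs assumed).
$source = <<<XML
<rss version="2.0"
     xmlns:excerpt="http://wordpress.org/export/1.2/excerpt/"
     xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <item>
      <title>Hello world</title>
      <excerpt:encoded><![CDATA[Encoded <b>text</b> here]]></excerpt:encoded>
      <wp:post_type>post</wp:post_type>
    </item>
  </channel>
</rss>
XML;

$xml = simplexml_load_string($source);

$posts = [];
foreach ($xml->channel->item as $item) {
    $posts[] = [
        'title'   => (string) $item->title,
        // children('prefix', true) selects child elements carrying that prefix.
        'excerpt' => (string) $item->children('excerpt', true)->encoded,
        'type'    => (string) $item->children('wp', true)->post_type,
    ];
}

print_r($posts);
```

For the real export, swap simplexml_load_string() for simplexml_load_file() with the path to the file.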
I would recommend looking at what WordPress uses for importing the files.
https://github.com/WordPress/WordPress/blob/master/wp-admin/includes/class-wp-importer.php
First off, I'm far from awesome with PHP, having only a basic familiarity with it, but I'm looking for a way to manipulate the contents of nested divs with PHP. This is a basic site for a local non-profit food bank that will allow them to post events for their clientele.
For example, the file I want to parse and work with has this structure (consider this the complete file though there may be more than 2 entries at any point in time):
<div class="event">
<div class="eventTitle">title text</div>
<div class="eventContent">event content</div>
</div>
<div class="event">
<div class="eventTitle">title2</div>
<div class="eventContent">event content2</div>
</div>
My thoughts are to parse it (what's the best way?), and build a multidimensional array of all div with class="event", and the nested contents of each. However, up to this point all my attempts have ended in failure.
The point of this is allow the user (non-technical food bank admin) to add, edit, and delete these structures. I have the code working to add the structures - but am uncertain as to how I would re-open the file at a later date to then edit and/or delete select instances of the "event" divs and their nested contents. It seems like it should be an easy task but I just can't wrap my head around the search results I have found online.
I have tried some stuff with preg_match(), getElementById(), and getElementsByTagName(). I'd really like to help this organization out but I'm at the point where I have to defer to my betters for advice on how to solve the task at hand.
Thanks in advance.
To Clarify:
This is for their website, hosted on an external service by a provider that does not allow them to host a DB or provide ftp/sftp/ssh access to the server for regular maintenance. The plan is to get the site up there once, and from then on, have it maintained via an unsecure (no other options at this point) url.
Can anyone provide a sample php syntax to parse the above html and create a multidimensional array of the div tags? As I mentioned, I have attempted to thumb my way through it, but have been unsuccessful. I know what I need to do, I just get lost in the syntax.
IE: this is what I've come up with to do this, but it doesn't seem to work, and I don't have a strong enough understanding of php to understand exactly why it does not.
<?php
$doc = new DOMDocument();
$doc->load('events.php');
$events = array();
foreach ($doc->getElementsByTagName('div') as $node) {
// looks at each <div> tag and creates an array from the other named tags below // hopefully...
$edetails = array (
'title' => $node->getElementsByTagName('eventTitle')->item(0)->nodeValue,
'desc' => $node->getElementsByTagName('eventContent')->item(0)->nodeValue
);
array_push($events, $edetails);
}
foreach ($events as &$edetails) {
// walk through the $events array and write out the appropriate information.
echo $edetails['title'] . "<br>";
echo $edetails['desc'] . "<br>";
}
print_r($events); // this is currently empty and not being populated
?>
Error:
PHP Warning: DOMDocument::load(): Extra content at the end of the document in /var/www/html/events.php, line: 7 in /var/www/html/test.php on line 4
Looking at this now, I realize this would never work because it is looking for tags named eventTitle and eventContent, not classes. :(
I would use a "database", whether it's an sqlite database or a simple text file (seems sufficient for your needs), and use php scripts to manipulate that file and build the required html to manage the text/database file and display the contents.
That would be a lot easier than using DOM manipulation to add / edit / remove events.
By the way, I would probably look for a sponsor, get a decent hosting provider and use a real database...
If you want to keep using the "php" file you have (which I think is needlessly complex), the reasons your current code fails are:
1) The load() method for DOMDocument is designed for XML, and expects a well formed file. The work around for this would be to either use the loadHTMLFile() method, or to wrap everything in a parent element.
2) The looping fails as the getElementsByTagName() is looking for tags - so the outermost loop gets 6 different divs in your current example (the parent event, and the children eventTitle and eventContent)
3) The inner loops fail of course, as you're again using getElementsByTagName(). Note that the tag names are all still 'div'; what you're really trying/wanting to search on is the value of 'class' attribute. In theory, you could work around this by putting in a lot of logic using things like hasChildNodes() and/or getAttribute().
Alternatively, you could restructure using valid XML, rather than this weird hybrid you're trying to use - if you do that, you could use DOMDocument to write out the file, as well as read it. Probably overkill, unless you're looking to learn how to use the PHP DOM libraries and XML.
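The three points above can be sketched with loadHTML() and a DOMXPath query on the class attribute. This is a minimal sketch, not a drop-in fix: the markup is inlined as a string for the example (loadHTMLFile('events.php') would read the real file), and the exact-match predicate assumes each div carries exactly one class.

```php
<?php
// Inline copy of the sample markup from the question.
$html = <<<HTML
<div class="event">
  <div class="eventTitle">title text</div>
  <div class="eventContent">event content</div>
</div>
<div class="event">
  <div class="eventTitle">title2</div>
  <div class="eventContent">event content2</div>
</div>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);            // HTML parser: no single root element required
libxml_clear_errors();

$xpath  = new DOMXPath($doc);
$events = [];

// Select only the outer event divs, then look inside each one by class.
foreach ($xpath->query("//div[@class='event']") as $eventDiv) {
    $title   = $xpath->query("./div[@class='eventTitle']", $eventDiv)->item(0);
    $content = $xpath->query("./div[@class='eventContent']", $eventDiv)->item(0);
    $events[] = [
        'title' => $title ? trim($title->textContent) : '',
        'desc'  => $content ? trim($content->textContent) : '',
    ];
}

print_r($events);
```

Passing the event div as the second argument to query() keeps each inner lookup scoped to that one event.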
As others have mentioned, I'd change the format of events.php into something other than a bunch of divs. Since a database isn't an option, I'd probably go for a pipe-delimited file, something like:
title text|event content
title2|event content2
The code to parse this would be much simpler, something along the lines of:
<?php
$events = array();
$filename = 'events.txt';
if (file_exists($filename)) {
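    // drop the trailing newline from each line and skip blank lines
    $lines = file($filename, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    foreach ($lines as $line) {
        // split on the first pipe only, so a description may contain '|';
        // array_pad() guards against lines that have no pipe at all
        list($title, $desc) = array_pad(explode('|', $line, 2), 2, '');
        $event = array('title'=>$title, 'desc'=>$desc);
        $events[] = $event; //better way of adding one element to an array than array_push (http://php.net/manual/en/function.array-push.php)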
}
}
print_r($events);
?>
Note that this code reads the whole file into memory, so if they have too many events or super long descriptions, this could get unwieldy, but should work fine for hundreds, even thousands, of events or so.
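Editing and deleting then come down to changing the array and writing the file back. The following is a sketch under assumptions: loadEvents() and saveEvents() are hypothetical helper names, a temporary file stands in for events.txt, and pipes and line breaks in user input are simply replaced with spaces so one event always stays on one line.

```php
<?php
// Read "title|description" lines into an array of events.
function loadEvents(string $filename): array
{
    $events = [];
    if (!is_file($filename)) {
        return $events;
    }
    foreach (file($filename, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
        $parts    = explode('|', $line, 2);
        $events[] = ['title' => $parts[0], 'desc' => $parts[1] ?? ''];
    }
    return $events;
}

// Write all events back, one per line.
function saveEvents(string $filename, array $events): void
{
    $lines = [];
    foreach ($events as $event) {
        // Keep the format intact: no pipes in titles, no line breaks anywhere.
        $title   = str_replace(['|', "\r", "\n"], ' ', $event['title']);
        $desc    = str_replace(["\r", "\n"], ' ', $event['desc']);
        $lines[] = $title . '|' . $desc;
    }
    file_put_contents($filename, implode("\n", $lines) . "\n", LOCK_EX);
}

// Usage with a temporary file standing in for events.txt.
$filename = tempnam(sys_get_temp_dir(), 'events');
saveEvents($filename, [
    ['title' => 'title text', 'desc' => 'event content'],
    ['title' => 'title2', 'desc' => 'event content2'],
]);

$events = loadEvents($filename);
$events[1]['desc'] = 'updated content';   // edit the second event
unset($events[0]);                         // delete the first event
saveEvents($filename, array_values($events));

$reloaded = loadEvents($filename);
print_r($reloaded);
unlink($filename);
```

LOCK_EX asks for an exclusive lock during the write, which helps if two admin requests save at the same moment.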
I'm trying to take an existing php file which I've built for a page of my site (blue.php), and grab the parts I really want with some xPath to create a different version of that page (blue-2.php).
I've been successful in pulling in my existing .php file with
$documentSource = file_get_contents("http://mysite.com/blue.php");
I can alter an attribute, and have my changes reflected correctly within blue-2.php, for example:
foreach ($xpath->query("//div[@class='walk']") as $node) {
    $source = $node->getAttribute('class');
    $node->setAttribute('class', 'run');
}
With my current code, I'm limited to making changes like in the example above. What I really want to be able to do is remove/exclude certain divs and other elements from showing on my new php page (blue-2.php).
By using echo $doc->saveHTML(); at the end of my code, it appears that everything from blue.php is included in blue-2.php's output, when I only want to output certain elements, while excluding others.
So the essence of my question is:
Can I parse an entire page using $documentSource = file_get_contents("http://mysite.com/blue.php");, and pick and choose (include and exclude) which elements show on my new page, with xPath? Or am I limited to only making modifications to the existing code like in my 'div class walk/run' example above?
Thank you for any guidance.
I've tried this, and it just throws errors:
$xpath->query("//img[@src='blue.png']")->remove();
What part of the documentation made you think remove is a method of DOMNodeList? Use DOMNode::removeChild:
foreach($xpath->query("//img[@src='blue.png']") as $node){
    $node->parentNode->removeChild($node);
}
I would suggest browsing a bit through all classes & functions from the DOM extension (which is not PHP-only BTW), to get a bit of a feel what to find where.
On a side note: it would probably be much more resource-efficient to add a switch to your original blue.php that produces the different output, because this solution (extra HTTP request, full DOM load and manipulation) has a LOT of unneeded overhead compared to that.
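Putting both directions together, a minimal sketch of "exclude" and "include" might look like the following. The markup is a made-up stand-in for blue.php's output, and the "ad" class is hypothetical; only the walk/run class change and blue.png come from the question.

```php
<?php
// Made-up page fragment standing in for the fetched blue.php output.
$html = '<div id="page"><div class="walk">Keep me</div>'
      . '<img src="blue.png" alt=""><div class="ad">Drop me</div></div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Exclude: detach every unwanted node from its parent.
foreach ($xpath->query("//img[@src='blue.png'] | //div[@class='ad']") as $node) {
    $node->parentNode->removeChild($node);
}

// Include: serialise only the chosen elements instead of the whole document.
$output = '';
foreach ($xpath->query("//div[@class='walk']") as $node) {
    $node->setAttribute('class', 'run');
    $output .= $doc->saveHTML($node); // saveHTML() accepts a single node
}

echo $output;
```

Calling saveHTML() without an argument still prints the whole (now reduced) document, so either approach can be used depending on whether the page is mostly kept or mostly dropped.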