I'm trying to get the data from an XML file into an array so that I can import it via 'Magmi'. Using the following code, I'm working with a 3.6GB XML file.
<?php
$z = new XMLReader;
$z->open('wpcatsub.xml');
$doc = new DOMDocument;
// move to the first <App /> node
while ($z->read() && $z->name !== 'App');
// now that we're at the right depth, hop to the next <App/> until the end of the tree
while ($z->name === 'App')
{
// either one should work
//$node = new SimpleXMLElement($z->readOuterXML());
$node = simplexml_import_dom($doc->importNode($z->expand(), true));
// now you can use $node without going insane about parsing
var_dump($node->element_1);
// go to next <App />
$z->next('App');
}
?>
When I load the PHP file, no errors appear -- the page is just blank. My XML data structure is below...
<App action="A" id="1">
<BaseVehicle id= "17491"/>
<Note><![CDATA[License Plate Lamp]]></Note>
<Qty>.000</Qty>
<PartType id= "10043"/>
<Part>W0133-1620896</Part>
<Product>
<PartNumber>W0133-1620896</PartNumber>
<BrandID>OES</BrandID>
<BrandDescription><![CDATA[Genuine]]></BrandDescription>
<WorldpacCategoryID>P9032</WorldpacCategoryID>
<Price>29.85</Price>
<ListPrice>33.17</ListPrice>
<Available>Y</Available>
<OEFlag>OEM</OEFlag>
<Weight>.10</Weight>
<Height>.7</Height>
<Width>4.4</Width>
<Length>4.4</Length>
<SellingIncrement>1</SellingIncrement>
<Popularity>D</Popularity>
<ImageURL><![CDATA[http://img.eautopartscatalog.com/live/W01331620896OES.JPG]]></ImageURL>
<ThumbURL><![CDATA[http://img.eautopartscatalog.com/live/thumb/W01331620896OES.JPG]]></ThumbURL>
</Product>
<ImageURL><![CDATA[http://img.eautopartscatalog.com/live/W01331620896OES.JPG]]></ImageURL>
<ThumbURL><![CDATA[http://img.eautopartscatalog.com/live/thumb/W01331620896OES.JPG]]></ThumbURL>
</App>
Is it stalling because of the size of the file? If so, isn't XMLReader supposed to work for large XML files? If nothing else, what other options would I have?
I suppose I could load the XML data into a database if needed and then use SELECT queries to build the array for the MAGMI import. Though I'm not sure how to import an XML file into a SQL database. If need be, I'll be happy to get guidance with that.
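A minimal sketch of how the XMLReader loop above could build rows one <App> at a time, so that only a single subtree is in memory at once. The function name appRows() and the chosen columns (sku, name, price) are assumptions for illustration, not a format Magmi requires. The sample XML has no element_1 child, so the sketch reads fields that do appear in it.

```php
<?php
// Sketch: stream <App> elements and flatten each one into an array row.
// appRows() and the column names are hypothetical, chosen for illustration.
function appRows(string $path): Generator
{
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        throw new RuntimeException("Cannot open $path");
    }
    $doc = new DOMDocument();

    // Skip ahead to the first <App> element.
    while ($reader->read() && $reader->name !== 'App');

    while ($reader->name === 'App') {
        // Only the current <App> subtree is expanded into memory.
        $app = simplexml_import_dom($doc->importNode($reader->expand(), true));
        yield [
            'sku'   => (string) $app->Product->PartNumber,
            'name'  => (string) $app->Note,
            'price' => (string) $app->Product->Price,
        ];
        // Jump to the next sibling <App>, skipping the subtree just read.
        $reader->next('App');
    }
    $reader->close();
}
```

Iterating with foreach (appRows('wpcatsub.xml') as $row) lets each row be written out (to a CSV file, for instance) instead of holding all rows of a 3.6GB file in one array.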
I just wanted to ask how I can insert a new node into an XML file using PHP. My XML file (questions.xml) is given below:
<?xml version="1.0" encoding="UTF-8"?>
<Quiz>
<topic text="Preparation for Exam">
<subtopic text="Science" />
<subtopic text="Maths" />
<subtopic text="english" />
</topic>
</Quiz>
I want to add a new "subtopic" with the "text" attribute set to "geography". How can I do this using PHP? Thanks in advance.
My code so far is:
<?php
$xmldoc = new DOMDocument();
$xmldoc->load('questions.xml');
$root = $xmldoc->firstChild;
$newElement = $xmldoc->createElement('subtopic');
$root->appendChild($newElement);
// $newText = $xmldoc->createTextNode('geology');
// $newElement->appendChild($newText);
$xmldoc->save('questions.xml');
?>
I'd use SimpleXML for this. It would look something like this:
// Open and parse the XML file
$xml = simplexml_load_file("questions.xml");
// Create a child in the first topic node
$child = $xml->topic[0]->addChild("subtopic");
// Add the text attribute
$child->addAttribute("text", "geography");
You can either display the new XML code with echo or store it in a file.
// Display the new XML code
echo $xml->asXML();
// Store new XML code in questions.xml
$xml->asXML("questions.xml");
The best and safest way is to load your XML document into a PHP DOMDocument object, go to your desired node, add a child, and finally save the new version of the XML into a file.
Take a look at the documentation: DOMDocument
Example of code:
// open and load an XML file
$dom = new DOMDocument();
$dom->load('your_file.xml');
// Apply some modification
// getElementsByTagName() returns a DOMNodeList, so take the first match
$specificNode = $dom->getElementsByTagName('node_to_catch')->item(0);
$newSubTopic = $dom->createElement('subtopic');
$newSubTopicText = $dom->createTextNode('geography');
$newSubTopic->appendChild($newSubTopicText);
$specificNode->appendChild($newSubTopic);
// Save the new version of the file
$dom->save('your_file_v2.xml');
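The DOM example above adds "geography" as text content, whereas the question asks for a text attribute. A minimal sketch of the attribute variant follows, assuming the new element belongs under the first <topic>. The helper name addSubtopic() is hypothetical, and it works on a string so it can be tried without touching questions.xml.

```php
<?php
// Sketch: append <subtopic text="..."/> to the first <topic> element.
// addSubtopic() is a hypothetical helper name, not a library function.
function addSubtopic(string $xml, string $text): string
{
    $dom = new DOMDocument();
    // Both flags must be set before loading for pretty-printing to apply.
    $dom->preserveWhiteSpace = false;
    $dom->formatOutput = true;
    $dom->loadXML($xml);

    $topic = $dom->getElementsByTagName('topic')->item(0);
    if ($topic === null) {
        throw new RuntimeException('No <topic> element found');
    }

    $subtopic = $dom->createElement('subtopic');
    $subtopic->setAttribute('text', $text);   // attribute, not a text node
    $topic->appendChild($subtopic);

    return $dom->saveXML();
}
```

For the file-based case, $dom->load('questions.xml') and $dom->save('questions.xml') would replace loadXML() and saveXML().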
You can use PHP's Simple XML. You have to read the file content, add the node with Simple XML and write the content back.
I was successfully using the following code to merge multiple large XML files into a new (larger) XML file. Found at least part of this on StackOverflow
$docList = new DOMDocument();
$root = $docList->createElement('documents');
$docList->appendChild($root);
$doc = new DOMDocument();
foreach($xmlFilenames as $xmlfilename) {
$doc->load($xmlfilename);
$xmlString = $doc->saveXML($doc->documentElement);
$xpath = new DOMXPath($doc);
$query = self::getQuery(); // this is the name of the ROOT element
$nodelist = $xpath->evaluate($query, $doc->documentElement);
if( $nodelist->length > 0 ) {
$node = $docList->importNode($nodelist->item(0), true);
$xmldownload = $docList->createElement('document');
if (self::getShowFileName())
$xmldownload->setAttribute("filename", $xmlfilename);
$xmldownload->appendChild($node);
$root->appendChild($xmldownload);
}
}
$newXMLFile = self::getNewXMLFile();
$docList->save($newXMLFile);
I started running into out-of-memory errors as the number and size of the files grew.
I found an article here which explained the issue and recommended using XMLWriter.
So now I am trying to use PHP's XMLWriter to merge multiple large XML files into a new (larger) XML file. Later, I will execute XPath queries against the new file.
Code:
$xmlWriter = new XMLWriter();
$xmlWriter->openMemory();
$xmlWriter->openUri('mynewFile.xml');
$xmlWriter->setIndent(true);
$xmlWriter->startDocument('1.0', 'UTF-8');
$xmlWriter->startElement('documents');
$doc = new DOMDocument();
foreach($xmlfilenames as $xmlfilename)
{
$fileContents = file_get_contents($xmlfilename);
$xmlWriter->writeElement('document',$fileContents);
}
$xmlWriter->endElement();
$xmlWriter->endDocument();
$xmlWriter->flush();
Well, the resultant (new) xml file is no longer correct since elements are escaped - i.e.
&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;CONFIRMOWNX&gt;
&lt;Confirm&gt;
&lt;LglVeh id="GLE"&gt;
&lt;AddrLine1&gt;GLEACHER &amp; COMPANY&lt;/AddrLine1&gt;
&lt;AddrLine2&gt;DESCAP DIVISION&lt;/AddrLine2&gt;
Can anyone explain how to take the content from each XML file and write it properly to the new file?
I'm burnt on this and I KNOW it'll be something simple I'm missing.
Thanks.
Robert
See, the problem is that XMLWriter::writeElement is intended to, well, write a complete XML element. That's why it automatically sanitizes (replaces & with &amp;, for example) the contents of what's been passed to it as the second param.
One possible solution is to use the XMLWriter::writeRaw method instead, as it writes the contents as is, without any sanitizing. Obviously it doesn't validate its input, but in your case that does not seem to be a problem (as you're working with an already checked source).
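A minimal sketch of the writeRaw approach, assuming every input is already well-formed XML. The helper name mergeXmlStrings() is hypothetical. It strips each input's XML declaration first, because a declaration is only allowed at the very start of a document, and it does not handle a leading byte-order mark.

```php
<?php
// Sketch: wrap several XML documents in one <documents> root via writeRaw().
// mergeXmlStrings() is a hypothetical helper; inputs are assumed well-formed.
function mergeXmlStrings(array $xmlStrings): string
{
    $writer = new XMLWriter();
    $writer->openMemory();
    $writer->startDocument('1.0', 'UTF-8');
    $writer->startElement('documents');

    foreach ($xmlStrings as $xml) {
        // Drop a leading <?xml ... ?> declaration, if present.
        $body = preg_replace('/^\s*<\?xml[^>]*\?>\s*/', '', $xml);

        $writer->startElement('document');
        $writer->writeRaw($body);   // written as-is, with no escaping
        $writer->endElement();
    }

    $writer->endElement();
    $writer->endDocument();

    return $writer->outputMemory();
}
```

For large inputs, openUri() in place of openMemory(), with a flush() after each file, should keep the writer's buffer small, although file_get_contents() still loads each source file whole.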
Hmm, not sure why it's converting it to HTML entities, but you can decode it like so:
htmlspecialchars_decode($data);
It converts special HTML entities back to characters.
I'm having an issue with the moveToAttribute method from PHP's XMLReader class.
I don't want to read in each line of the XML file. I want to have the capability to traverse the XML file, without going in sequential order; that is, random access. I thought using moveToAttribute would move the cursor to a node with the attribute value specified, where I can then conduct processing on its inner nodes, but this is not working out as planned.
Here's a snippet of the xml file:
<?xml version="1.0" encoding="Shift-JIS"?>
<CDs>
<Cat Type="Rock">
<CD>
<Name>Elvis Prestley</Name>
<Album>Elvis At Sun</Album>
</CD>
<CD>
<Name>Elvis Prestley</Name>
<Album>Best Of...</Album>
</CD>
</Cat>
<Cat Type="JazzBlues">
<CD>
<Name>B.B. King</Name>
<Album>Singin' The Blues</Album>
</CD>
<CD>
<Name>B.B. King</Name>
<Album>The Blues</Album>
</CD>
</Cat>
</CDs>
Here is my PHP code:
<?php
$xml = new XMLReader();
$xml->open("MusicCatalog.xml") or die ("can't open file");
$xml->moveToAttribute("JazzBlues");
print $xml->nodeType . PHP_EOL; // 0
print $xml->readString() . PHP_EOL; // blank ("")
?>
What am I doing wrong, with regards to moveToAttribute? How can I randomly access nodes using a node's attribute? I want to target node Cat Type="JazzBlues" without doing it sequentially (i.e. $xml->read()), and then process its inner nodes.
Thank you very much.
I think there is no way to avoid XMLReader::read. XMLReader::moveToAttribute only works if the XMLReader already points to an element. You can also check XMLReader::moveToAttribute's return value to detect failures. Maybe try something like this:
<?php
$xml = new XMLReader();
$xml->open("MusicCatalog.xml") or die ("can't open file");
while ($xml->read() && $xml->name != "Cat"){ }
//the parser now found the "Cat"-element
//(or the end of the file, maybe you should check that)
//and points to the desired element, so moveToAttribute will work
if (!$xml->moveToAttribute("Type")){
die("could not find the desired attribute");
}
//now $xml points to the attribute, so you can access the value just by $xml->value
echo "found a 'cat'-element, its type is " . $xml->value;
?>
This piece of code should print the value of the Type attribute of the first Cat element in the file. I don't know what you want to do with the file, so you will have to adapt the code to your needs. For processing the inner nodes you can use:
<?php
//continuation of the code above
$depth = $xml->depth;
while ($xml->read() && $xml->depth >= $depth){
//do something with the inner nodes
}
//the loop first fails when the parser reaches the </Cat> end tag,
//because the depth inside the Cat element is higher than
//the depth of the Cat element itself
//maybe you can search for other Cat nodes here, after you have processed one
I can't tell you how to rewrite this code for random access, but I hope this helps.
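XMLReader is a forward-only cursor, so the closest thing to "random access" is to scan ahead and skip everything that does not match. A minimal sketch of that idea follows, using getAttribute() to test each Cat element and collecting the Album values inside the matching one. The helper name albumsInCategory() is hypothetical, and it takes an XML string so it can be tried without a file.

```php
<?php
// Sketch: scan forward to <Cat Type="..."> and collect the albums inside it.
// albumsInCategory() is a hypothetical helper name.
function albumsInCategory(string $xml, string $type): array
{
    $reader = new XMLReader();
    $reader->XML($xml);   // for a file, use $reader->open($path) instead
    $albums = [];

    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT
            && $reader->name === 'Cat'
            && $reader->getAttribute('Type') === $type
        ) {
            $catDepth = $reader->depth;
            // Keep reading until the cursor leaves the matching <Cat>.
            while ($reader->read() && $reader->depth > $catDepth) {
                if ($reader->nodeType === XMLReader::ELEMENT
                    && $reader->name === 'Album'
                ) {
                    $albums[] = $reader->readString();
                }
            }
            break;   // stop after the first matching category
        }
    }

    $reader->close();
    return $albums;
}
```

If the file is small enough to load whole, DOMXPath or SimpleXML with a query such as //Cat[@Type="JazzBlues"] would give direct access without a manual loop.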
I am using PHP curl to retrieve an XML file from a remote URL and save it to a local file on my server. The structure is the following:
<Store Country="Ireland">
<EventsPoints>
<Event ID="1800" >
<ArtistIDs>
<ArtistID ID="109" Type="Primary" />
</ArtistIDs>
<CategoryID>1</CategoryID>
<Country>IRL</Country>
<PerformanceName>Music and Arts</PerformanceName>
<VenueID ID="197" />
</Event>
<Venues>
<Venue ID="197">
<City>Dublin</City>
<Country>IRL</Country>
<VenueName>ABC</VenueName>
<VenueNumber>22</VenueNumber>
</Venue>
</Venues>
The above xml blocks are stored in the same XML file. There are several Event blocks and several Venue blocks.
The problem I'm having is using PHP to access this large XML file and iterate through the Venue blocks, retrieving only the Venue block with a certain ID specified via a parameter.
I then want to iterate through the event blocks - only retrieving the events matching this specified venue ID. I then want to save this to a file on the server.
I want to do this for each venue.
How do I go about doing the above?
EDIT:
For each Venue and events related to that venue, I just want to literally copy them to their own file - basically splitting down the larger file into individual files
$docSource = new DOMDocument();
$docSource->loadXML($xml);
$docDest = new DOMDocument();
$docDest->loadXML(file_get_contents('/var/www/html/xml/testfile.xml'));
$xpath = new DOMXPath($docSource);
$id = "197";
$result = $xpath->query("//Event/VenueID[@ID='$id']")->item(0); //Get directly the node you want
$result = $docDest->importNode($result, true); //Copy the node to the other document
$items = $docDest->getElementsByTagName('items')->item(0);
$items->appendChild($result); //Add the copied node to the destination document
echo $docDest->saveXML();
You are not showing the desired output format, so I will assume this generates what you want. If not, feel free to modify the code so it meets the desired output format. This along with the comments above should have all you need to get this working on your own.
// load Source document
$srcDom = new DOMDocument;
$srcDom->load('/var/www/html/xml/testfile.xml');
$xPath = new DOMXPath($srcDom);
// iterate over all the venues in the source document
foreach ($srcDom->getElementsByTagName('Venue') as $venue) {
// create a new destination document for the current venue
$dstDom = new DOMDocument('1.0', 'utf-8');
// add an EventsPoint element as the root node
$dstDom->appendChild($dstDom->createElement('EventsPoint'));
// import the Venue element tree to the new destination document
$dstDom->documentElement->appendChild($dstDom->importNode($venue, true));
// fetch all the events for the current venue from the source document
$allEventsForVenue = $xPath->query(
sprintf(
'/Store/EventsPoints/Event[VenueID/@ID=%d]',
$venue->getAttribute('ID')
)
);
// iterate all the events found in Xpath query
foreach ($allEventsForVenue as $event) {
// add event element tree to current destination document
$dstDom->documentElement->appendChild($dstDom->importNode($event, true));
}
// make output prettier
$dstDom->formatOutput = true;
// save XML to file named after venue ID
$dstDom->save(sprintf('/path/to/%d.xml', $venue->getAttribute('ID')));
}
This will create an XML file like this
<?xml version="1.0" encoding="utf-8"?>
<EventsPoint>
<Venue ID="197">
<City>Dublin</City>
<Country>IRL</Country>
<VenueName>ABC</VenueName>
<VenueNumber>22</VenueNumber>
</Venue>
<Event ID="1800">
<ArtistIDs>
<ArtistID ID="109" Type="Primary"/>
</ArtistIDs>
<CategoryID>1</CategoryID>
<Country>IRL</Country>
<PerformanceName>Music and Arts</PerformanceName>
<VenueID ID="197"/>
</Event>
</EventsPoint>
in the file 197.xml
Hello, I know there are many questions here about these three topics combined to update XML entries, but it seems every one is very specific to a given problem.
I have been spending some time trying to understand XPath and its ways, but I still can't work out what I need to do.
Here we go
I have this XML file
<?xml version="1.0" encoding="UTF-8"?>
<storagehouse xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="schema.xsd">
<item id="c7278e33ef0f4aff88da10dfeeaaae7a">
<name>HDMI Cable 3m</name>
<weight>0.5</weight>
<category>Cables</category>
<location>B3</location>
</item>
<item id="df799fb47bc1e13f3e1c8b04ebd16a96">
<name>Dell U2410</name>
<weight>2.5</weight>
<category>Monitors</category>
<location>C2</location>
</item>
</storagehouse>
What I would like to do is update/edit any of the nodes above when I need to. I will make an HTML form for that.
But my biggest concern is: how do I find the desired node and update it?
Here is some of what I am trying to do:
<?php
function fnDOMEditElementCond()
{
$dom = new DOMDocument();
$dom->load('storage.xml');
$library = $dom->documentElement;
$xpath = new DOMXPath($dom);
// I kind of understand this one here
$result = $xpath->query('/storagehouse/item[1]/name');
//This one not so much
$result->item(0)->nodeValue .= ' Series';
// This will remove the CDATA property of the element.
//To retain it, delete this element (see delete eg) & recreate it with CDATA (see create xml eg).
//2nd Way
//$result = $xpath->query('/library/book[author="J.R.R.Tolkein"]');
// $result->item(0)->getElementsByTagName('title')->item(0)->nodeValue .= ' Series';
header("Content-type: text/xml");
echo $dom->saveXML();
}
?>
Could someone give me an example with attributes and so on, so that once a user decides to update a desired node, I can find that node with XPath and then update it?
The following example makes use of SimpleXML, which is a close friend of DOMDocument. The xpath shown is the same regardless of which method you use, and I use SimpleXML here to keep the code short. I'll show a more advanced DOMDocument example later on.
So about the xpath: How to find the node and update it. First of all how to find the node:
The node has the element/tagname item. You are looking for it inside the storagehouse element, which is the root element of your XML document. All item elements in your document are expressed like this in xpath:
/storagehouse/item
From the root, first storagehouse, then item. Divided with /. You already know that, so the interesting part is how to only take those item elements that have the specific ID. For that the predicate is used and added at the end:
/storagehouse/item[@id="id"]
This will return all item elements again, but this time only those which have the attribute id with the value id (string). For example in your case with the following XML:
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<storagehouse xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="schema.xsd">
<item id="c7278e33ef0f4aff88da10dfeeaaae7a">
<name>HDMI Cable 3m</name>
<weight>0.5</weight>
<category>Cables</category>
<location>B3</location>
</item>
<item id="df799fb47bc1e13f3e1c8b04ebd16a96">
<name>Dell U2410</name>
<weight>2.5</weight>
<category>Monitors</category>
<location>C2</location>
</item>
</storagehouse>
XML;
that xpath:
/storagehouse/item[@id="df799fb47bc1e13f3e1c8b04ebd16a96"]
will return the computer monitor (because an item with that id exists). If there were multiple items with the same id value, all of them would be returned. If there were none, none would be returned. So let's wrap that into a code example:
$simplexml = simplexml_load_string($xml);
$result = $simplexml->xpath(sprintf('/storagehouse/item[@id="%s"]', $id));
if (!$result || count($result) !== 1) {
throw new Exception(sprintf('Item with id "%s" does not exists or is not unique.', $id));
}
list($item) = $result;
In this example, $item is the SimpleXMLElement object of that computer monitor XML element named item.
So now for the changes, which are extremely easy with SimpleXML in your case:
$item->category = 'LCD Monitor';
And to finally see the result:
echo $simplexml->asXML();
Yes that's all with SimpleXML in your case.
If you want to do this with DOMDocument, it works quite similarly. However, to update an element's value, you need to access the child element of that item as well. The following example first fetches the item, as before. If you compare it with the SimpleXML example above, you can see that things do not really differ:
$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);
$result = $xpath->query(sprintf('/storagehouse/item[@id="%s"]', $id));
if (!$result || $result->length !== 1) {
throw new Exception(sprintf('Item with id "%s" does not exists or is not unique.', $id));
}
$item = $result->item(0);
Again, $item contains the item XML element of the computer monitor, but this time as a DOMElement. To modify the category element in there (or more precisely its nodeValue), that child needs to be obtained first. You can do this again with xpath, but this time with an expression relative to the $item element:
./category
Assuming that there always is a category child-element in the item element, this could be written as such:
$category = $xpath->query('./category', $item)->item(0);
$category now contains the first category child element of $item. What's left is updating its value:
$category->nodeValue = "LCD Monitor";
And to finally see the result:
echo $doc->saveXML();
And that's it. Whether you choose SimpleXML or DOMDocument depends on your needs. You can even switch between both. You may also want to map items and check for changes:
$repository = new Repository($xml);
$item = $repository->getItemByID($id);
$item->category = 'LCD Monitor';
$repository->saveChanges();
echo $repository->getXML();
Naturally this requires more code, which is too much for this answer.
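For orientation only, here is a rough sketch of what such a Repository could look like on top of SimpleXML. The class and method names follow the usage shown above, but the implementation is an assumption. In particular, treating saveChanges() as "take a snapshot that getXML() returns" is just one possible reading of it.

```php
<?php
// Sketch of the Repository idea. Method names follow the usage above;
// the behaviour of saveChanges() is an assumption, not the author's design.
final class Repository
{
    private SimpleXMLElement $xml;
    private string $saved;   // last saved snapshot of the document

    public function __construct(string $xml)
    {
        $this->xml = new SimpleXMLElement($xml);
        $this->saved = $xml;
    }

    public function getItemByID(string $id): SimpleXMLElement
    {
        // The id is placed inside a double-quoted XPath string literal.
        if (strpos($id, '"') !== false) {
            throw new InvalidArgumentException('Invalid item id.');
        }
        $result = $this->xml->xpath(sprintf('/storagehouse/item[@id="%s"]', $id));
        if (!$result || count($result) !== 1) {
            throw new RuntimeException(
                sprintf('Item with id "%s" does not exist or is not unique.', $id)
            );
        }
        return $result[0];   // a live node: edits change the underlying tree
    }

    public function saveChanges(): void
    {
        $this->saved = (string) $this->xml->asXML();
    }

    public function getXML(): string
    {
        return $this->saved;
    }
}
```

With this shape, edits made through the returned item only show up in getXML() after saveChanges(), which leaves room to validate or compare the document before committing.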