How can I search an XML file using regular expressions in PHP

I am curious whether I can search an XML file for a certain tag with regular expressions. I can search the file if I use fopen('foo.xml'), but that only lets me search the content between the tags, not the tags themselves. My goal is to create a function that deletes all the content between two tags, for example between the users tags in an XML file. The language I am using is PHP.
Thanks in advance john.

You should use something like SimpleXML to handle/edit XML files.
If you really insist on doing it by treating the XML file as a string, you can do something like this (or you can use regex). But you should use an XML library.
// get your file as a string
$yourXML = file_get_contents($file) ;
$posStart = stripos($yourXML,'<users>') + strlen('<users>') ;
$posEnd = stripos($yourXML,'</users>') ;
$newXML = substr($yourXML,0,$posStart) . substr($yourXML,$posEnd) ;
// <users> is now empty
echo $newXML ;
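For comparison, here is a minimal sketch of the same "empty the users element" task done with DOMDocument rather than string offsets. The sample document and the variable names are made up for illustration; in practice you would load your own file with $dom->load($file).

```php
<?php
// Hypothetical sample document standing in for the real file.
$xml = '<root><users><user>john</user><user>jane</user></users><other>kept</other></root>';

$dom = new DOMDocument();
$dom->loadXML($xml);

// Empty every <users> element by removing its child nodes one at a time.
foreach ($dom->getElementsByTagName('users') as $users) {
    while ($users->firstChild !== null) {
        $users->removeChild($users->firstChild);
    }
}

// Serialize only the root element (skips the XML declaration).
$result = $dom->saveXML($dom->documentElement);
echo $result, "\n";
```

Unlike the stripos() version, this does not depend on how the tags are written in the file (attributes, whitespace, letter case of the search string), because the parser locates the element.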

DOMDocument & XPath will make things very clean, direct and reliable.
For node-set expressions like the ones below, you can use evaluate() or query(), as they provide the same result.
A leading // seeks out the matching tags regardless of their location.
Be aware that my solution is case-sensitive.
Code:
$xml = <<<XML
<myXml>
<Person>
<firstName>pradeep</firstName>
<lastName>jain</lastName>
<address>
<doorNumber>287</doorNumber>
<street>2nd block</street>
<city>bangalore</city>
</address>
<phoneNums type="mobile">9980572765</phoneNums>
<phoneNums type="landline">080 42056434</phoneNums>
<phoneNums type="skype">123456</phoneNums>
</Person>
<Person>
<firstName>pradeep</firstName>
<lastName>jain</lastName>
<address>
<doorNumber>287</doorNumber>
<street>2nd block</street>
<city>bangalore</city>
</address>
<phoneNums type="mobile">1</phoneNums>
<phoneNums type="landline">2</phoneNums>
<phoneNums type="skype">3</phoneNums>
</Person>
</myXml>
XML;
$dom = new DOMDocument;
$dom->loadXML($xml); // <-- you'll need to import your file instead of a string as demo'ed here
$xpath = new DOMXPath($dom);
echo count($xpath->evaluate("//phoneNums")) , "\n"; // 6
echo count($xpath->evaluate("//street")) , "\n"; // 2
echo count($xpath->evaluate("//myXml")) , "\n"; // 1
echo count($xpath->evaluate("//Person")) , "\n"; // 2
echo count($xpath->evaluate("//person")) , "\n"; // 0 <-- case-sensitive
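The two methods only agree for expressions that select nodes. As a small sketch (the shortened sample XML here is made up for illustration), evaluate() can also return a typed scalar for an expression such as count(), which query() is not meant for:

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<myXml><Person><phoneNums>1</phoneNums><phoneNums>2</phoneNums></Person></myXml>');
$xpath = new DOMXPath($dom);

// Node-set expression: both methods return a DOMNodeList.
$viaQuery = $xpath->query('//phoneNums');
$viaEvaluate = $xpath->evaluate('//phoneNums');

// Scalar expression: evaluate() returns a typed value (a float for count()).
$total = $xpath->evaluate('count(//phoneNums)');

echo $viaQuery->length, ' ', $viaEvaluate->length, ' ', $total, "\n";
```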

As a simple mock-up of the various parts needed to do this in SimpleXML, there are a few concepts you need to know to get it to work.
The main one is XPath, which is a sort of SQL for XML. Of course it has its own notation and can be a little pedantic at times, but you can experiment with it on sites like https://codebeautify.org/Xpath-Tester.
$data = '<?xml version="1.0" encoding="UTF-8"?>
<Users>
<User id="123">
<Name>fred</Name>
<Extension>1234</Extension>
</User>
<User id="124">
<Name>bert</Name>
<Extension>1235</Extension>
</User>
<User id="125">
<Name>foo</Name>
<Extension>1236</Extension>
</User>
</Users>';
$userID = "123";
$users = simplexml_load_string($data);
// Find the user with the id attribute (use [0] as the call to xpath
// returns a list of matches and you only want the first one)
$userMatch = $users->xpath("//User[@id='{$userID}']")[0];
// Just output user id attribute and name
echo "id=".$userMatch['id'].",name=".$userMatch->Name.PHP_EOL;
echo "Removing user...".PHP_EOL;
// Remove the user - note the [0] is required here
unset($userMatch[0]);
// Print out the resulting XML after the removal
echo $users->asXML();
I've put comments throughout the code to explain how it works. The output is...
id=123,name=fred
Removing user...
<?xml version="1.0" encoding="UTF-8"?>
<Users>
<User id="124">
<Name>bert</Name>
<Extension>1235</Extension>
</User>
<User id="125">
<Name>foo</Name>
<Extension>1236</Extension>
</User>
</Users>

Related

Merge people profiles based on email match

I have an existing directory (php with xml datasource) which contains people information such as this:
MainSource.xml
<people>
<person>
<id></id>
<last_name></last_name>
<first_name></first_name>
<email></email>
<phone></phone>
</person>
...
</people>
I need to add a new node to MainSource.xml from NewSource.xml, matching on email address, from the new datasource which contains people info like this:
NewSource.xml
<people>
<person>
<email></email>
<website_url></website_url>
</person>
...
</people>
I have tried a number of variations, but I think my hangup is properly comparing the two documents. Logically, it feels like I need to be iterating, as opposed to foreach? Or two foreach, one for each source? Here's a sample of what I'm thinking. Please offer any clarity or insight which can nudge me along in the right direction.
<?php
$doc1 = new DOMDocument();
$doc1->load('MainSource.xml');
$doc2 = new DOMDocument();
$doc2->load('NewSource.xml');
foreach ($doc1->person as $person) {
if ($person->email === $doc2->person->email) {
$node = $doc1->createElement("website_url", $valueFromDoc2);
$newnode = $doc1->appendChild($node);
}
}
$merged = $doc1->saveXML();
file_put_contents('MergedSource.xml', $merged)
?>
As mentioned by @waterloomatt, you need to use XPath to achieve that.
Assuming that MainSource.xml looks like this:
<people>
<person>
<id>1</id>
<last_name>smith</last_name>
<first_name>john</first_name>
<email>js@example.com</email>
<phone>555-123-1234</phone>
</person>
<person>
<id>2</id>
<last_name>doe</last_name>
<first_name>jane</first_name>
<email>jd@anotherexample.com</email>
<phone>666-234-2345</phone>
</person>
</people>
and NewSource.xml looks like this:
<people>
<person>
<email>js@example.com</email>
<website_url>js.example.com</website_url>
</person>
<person>
<email>jd@anotherexample.com</email>
<website_url>jd.anotherexample.com</website_url>
</person>
</people>
you can try this:
$doc1 = new DOMDocument();
$doc1->load('MainSource.xml'); // load() reads a file; loadXML() expects an XML string
$xpath1 = new DOMXPath($doc1);
# find each person's email address
$sources = $xpath1->query('//person//email');
$doc2 = new DOMDocument();
$doc2->load('NewSource.xml'); // load() reads a file; loadXML() expects an XML string
$xpath2 = new DOMXPath($doc2);
foreach ($sources as $source) {
#for each email address, get the parent and use that as the destination
#of the new web address element
$destination = $xpath1->query('..',$source);
#in the other doc, search for each person whose email address matches
#that of the first doc and get the relevant web address
$exp2 = "//person[email[text()='{$source->nodeValue}']]//website_url";
$target = $xpath2->query($exp2);
#import the result of the search as a node into the first doc
$node = $doc1->importNode($target[0], true);
#finally, append the imported node in the right location of the first doc
$destination[0]->appendChild($node);
};
echo $doc1->saveXml();
Output:
<people>
<person>
<id>1</id>
<last_name>smith</last_name>
<first_name>john</first_name>
<email>js@example.com</email>
<phone>555-123-1234</phone>
<website_url>js.example.com</website_url></person>
<person>
<id>2</id>
<last_name>doe</last_name>
<first_name>jane</first_name>
<email>jd@anotherexample.com</email>
<phone>666-234-2345</phone>
<website_url>jd.anotherexample.com</website_url></person>
</people>
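The XPath approach above assumes every email in the main file has a match in the new file. As a hedged alternative sketch, you could first build an email-to-URL lookup and only append when a match exists. The inline sample data and the variable names ($main, $new, $urls) are assumptions for illustration, standing in for MainSource.xml and NewSource.xml:

```php
<?php
// Inline stand-ins for MainSource.xml and NewSource.xml (hypothetical sample data).
$main = new DOMDocument();
$main->loadXML('<people><person><email>js@example.com</email></person><person><email>nobody@example.com</email></person></people>');
$new = new DOMDocument();
$new->loadXML('<people><person><email>js@example.com</email><website_url>js.example.com</website_url></person></people>');

// Build an email => website_url lookup from the new source.
$urls = [];
foreach ($new->getElementsByTagName('person') as $person) {
    $email = $person->getElementsByTagName('email')->item(0);
    $url = $person->getElementsByTagName('website_url')->item(0);
    if ($email !== null && $url !== null) {
        $urls[trim($email->textContent)] = trim($url->textContent);
    }
}

// Append a website_url only to people whose email has a match.
foreach ($main->getElementsByTagName('person') as $person) {
    $email = $person->getElementsByTagName('email')->item(0);
    if ($email === null) {
        continue;
    }
    $key = trim($email->textContent);
    if (isset($urls[$key])) {
        $node = $main->createElement('website_url');
        $node->appendChild($main->createTextNode($urls[$key]));
        $person->appendChild($node);
    }
}

echo $main->saveXML();
```

The lookup array also avoids interpolating email addresses into an XPath string, which would break if an address contained a quote character.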

PHP XML append to created file

I have the following XML document:
<list>
<person>
<name>Simple name</name>
</person>
</list>
I try to read it, and basically create another "person" element. The output I want to achieve is:
<list>
<person>
<name>Simple name</name>
</person>
<person>
<name>Simple name again</name>
</person>
</list>
Here is how I am doing it:
$xml = new DOMDocument();
$xml->load('../test.xml');
$list = $xml->getElementsByTagName('list') ;
if ($list->length > 0) {
$person = $xml->createElement("person");
$name = $xml->createElement("name");
$name->nodeValue = 'Simple name again';
$person->appendChild($name);
$list->appendChild($person);
}
$xml->save("../test.xml");
What am I missing here?
Edit: I have translated the tags, so that example would be clearer.
Currently, you're pointing/appending to the node list instead of that found parent node:
$list->appendChild($person);
// ^ DOMNodeList
You should point to the element:
$list->item(0)->appendChild($person);
Sidenote: the text can also be passed as the second argument of ->createElement():
$name = $xml->createElement("name", 'Simple name again');
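Putting both points together, a complete version might look like the sketch below. It loads the sample document from an inline string instead of ../test.xml so that it is self-contained; that substitution is an assumption made for illustration.

```php
<?php
$xml = new DOMDocument();
$xml->loadXML('<list><person><name>Simple name</name></person></list>');

// item(0) yields the first <list> element, or null if there is none.
$list = $xml->getElementsByTagName('list')->item(0);
if ($list !== null) {
    $person = $xml->createElement('person');
    $person->appendChild($xml->createElement('name', 'Simple name again'));
    $list->appendChild($person);
}

echo $xml->saveXML();
```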

Prepending raw XML using PHP's SimpleXML

Given a base $xml and a file containing a <something> tag with attributes, children and children of its children, I would like to append it as first child and all of its children as raw XML.
Original XML:
<root>
<people>
<person>
<name>John Doe</name>
<age>47</age>
</person>
<person>
<name>James Johnson</name>
<age>13</age>
</person>
</people>
</root>
XML in file:
<something someval="x" otherthing="y">
<child attr="val" ..> { some children and values ... }</child>
<child attr="val2" ..> { some children and values ... }</child>
...
</something>
Result XML:
<root>
<something someval="x" otherthing="y">
<child attr="val" ..> { some children and values ... }</child>
<child attr="val2" ..> { some children and values ... }</child>
...
</something>
<people>
<person>
<name>John Doe</name>
<age>47</age>
</person>
<person>
<name>James Johnson</name>
<age>13</age>
</person>
</people>
</root>
This tag would contain several children both direct and recursively, so it would not be practical to build the XML via the SimpleXML operations. Besides, keeping it in a file would result in lower maintenance costs.
Technically it would simply be prepending one child. The problem is that this child would have other children and so on.
On the PHP addChild page there's a comment that says:
$x = new SimpleXMLElement('<root name="toplevel"></root>');
$f1 = new SimpleXMLElement('<child pos="1">alpha</child>');
$x->{$f1->getName()} = $f1; // adds $f1 to $x
However, this does not seem to treat my XML as raw XML, so escaped &lt; and &gt; tags appear. Several warnings concerning namespaces appear as well.
I suppose I could do a quick replace of such tags but I am not sure whether it could cause future problems and it certainly does not feel right.
Manually hacking the XML is not an option and neither is adding children one by one. Choosing a different library could be.
Any clues on how to get this working?
Thanks!
I'm really not sure if this will work for you, so try it and downvote if it doesn't, but I hope it helps. Using DOMDocument:
<?php
$xml = new DOMDocument();
$xml->loadXML($yourOriginalXML);
// createElement() only accepts a tag name, so parse the raw XML into a fragment
$newNode = $xml->createDocumentFragment();
$newNode->appendXML($someXMLtoPrepend);
$nodeRoot = $xml->getElementsByTagName('root')->item(0);
$nodeOriginal = $xml->getElementsByTagName('people')->item(0);
// inserting a fragment inserts its children before <people>
$nodeRoot->insertBefore($newNode, $nodeOriginal);
$finalXmlAsString = $xml->saveXML();
?>
Encoding can sometimes cause problems. DOM works with UTF-8 internally, so if your strings are in another encoding and carry no XML declaration naming it, convert them first (ISO-8859-1 below is only an example source encoding):
<?php
$xml = new DOMDocument();
$xml->loadXML(mb_convert_encoding($yourOriginalXML, 'UTF-8', 'ISO-8859-1'));
$newNode = $xml->createDocumentFragment();
$newNode->appendXML(mb_convert_encoding($someXMLtoPrepend, 'UTF-8', 'ISO-8859-1'));
$nodeRoot = $xml->getElementsByTagName('root')->item(0);
$nodeOriginal = $xml->getElementsByTagName('people')->item(0);
$nodeRoot->insertBefore($newNode, $nodeOriginal);
$finalXmlAsString = $xml->saveXML();
?>

PHP restructure recursive XML with closing tag on same line? ie <value="blah"/>

I need to restructure a very large xml source, example is at
http://www.fluffyduck.com.au/sampleXML.xml
I need to modify it for jstree, however I'm not sure how to manipulate the data recursively, as loading it as XML with SimpleXML only sees the first user record.
<user id="41" username="bsmain" firstname="Boss" lastname="MyTest" fullname="Test Name" email="lalal@test.com" logins="1964" lastseen="11/09/2012">
</user>
to
<user id="41">
<content><name>bsmain</name></content>
</user>
The problem is that some XML lines do not have a closing tag such as </user>, but instead look like this:
<user id="61" username="underling" firstname="Under" lastname="MyTest" fullname="Test Name" email="lalal@test.com" logins="4" lastseen="08/09/2009"/>
If I modify this record and add <content><name>underling</name></content>, jstree does not recognise it. I'm presuming the /> at the end is the same as </user>?
I did want to do this in XML but am thinking it may be easier, to simply somehow parse the xml file 'line by line', read in the line of data explode it perhaps,
then create a new variable storing it with modified contents such as :
<user id="61">
<content><name>bsmain</name>
</user>
and on the rows where /> exists at the end, manually insert a tag.
there has to be a smarter/faster way to achieve this.
Your best bet is to use DOMDocument for XML parsing. I have written an example that transforms attributes (excluding the id attribute) to content elements:
Code
<?php
$s =
'<users>' .
'<user id="61" username="underling" firstname="Under" lastname="MyTest" fullname="Test Name" email="lalal@test.com" logins="4" lastseen="08/09/2009"/>' .
'<user id="61" username="underling" firstname="Under" lastname="MyTest" fullname="Test Name" email="lalal@test.com" logins="4" lastseen="08/09/2009"/>' .
'<user id="8" test="testvalue"></user>' .
'</users>';
$doc = new DOMDocument();
$doc->loadXML($s);
$users = $doc->getElementsByTagName("user");
foreach ($users as $user)
{
if ($user->hasAttributes())
{
// create content node
$content = $user->appendChild($doc->createElement("content"));
// transform attributes into content elements
for ($i = 0; $i < $user->attributes->length; $i++)
{
$attr = $user->attributes->item($i);
if (strtolower($attr->name) != "id")
{
if ($user->removeAttribute($attr->name))
{
$content->appendChild($doc->createElement($attr->name, $attr->value));
$i--;
}
}
}
}
}
header("Content-Type: text/xml");
echo $doc->saveXML();
?>
Output
<users>
<user id="61">
<content>
<username>underling</username>
<firstname>Under</firstname>
<lastname>MyTest</lastname>
<fullname>Test Name</fullname>
<email>lalal@test.com</email>
<logins>4</logins>
<lastseen>08/09/2009</lastseen>
</content>
</user>
<user id="61">
<content>
<username>underling</username>
<firstname>Under</firstname>
<lastname>MyTest</lastname>
<fullname>Test Name</fullname>
<email>lalal@test.com</email>
<logins>4</logins>
<lastseen>08/09/2009</lastseen>
</content>
</user>
<user id="8">
<content>
<test>testvalue</test>
</content>
</user>
</users>

PHP XML adding new entry

How do I edit an XML file and add a new entry after the closing </User> tag?
My XML (FileZilla) looks like this:
<FileZillaServer>
<Users>
<User Name="test">
</User>
/* using php to add another users on here <User Name="test2" */
</Users>
</FileZillaServer>
Thank you for help.
You can use the DOMDocument classes to manipulate an XML document.
For instance, you could use something like this :
$str = <<<XML
<FileZillaServer>
<Users>
<User Name="test">
</User>
</Users>
</FileZillaServer>
XML;
$xml = new DOMDocument();
$xml->loadXML($str); // loadXML() must be called on an instance, not statically
$users = $xml->getElementsByTagName('Users');
$newUser = $xml->createElement('User');
$newUser->setAttribute('Name', 'test2'); // XML attribute names are case-sensitive
$users->item($users->length - 1)->appendChild($newUser);
var_dump($xml->saveXML());
Which will get you :
string '<?xml version="1.0"?>
<FileZillaServer>
<Users>
<User Name="test">
</User>
<User Name="test2"/></Users>
</FileZillaServer>
' (length=147)
i.e. you :
create a new User element
you set its name attribute
and you append that new element to Users
(There are probably other ways to do that, avoiding the usage of length ; but this is what I first thought about -- quite early in the morning ^^ )
Use SimpleXML. As the name implies, it's the simplest way to deal with XML documents.
$FileZillaServer = simplexml_load_string(
'<FileZillaServer>
<Users>
<User Name="test" />
</Users>
</FileZillaServer>'
);
$User = $FileZillaServer->Users->addChild('User');
$User['Name'] = 'Test2';
echo $FileZillaServer->asXML();
