I'm building a PHP script to transfer selected contents of an XML file to an SQL database.
One of the hardcoded XML contents is formatted like this:
<visualURL>
id=18144083|img=http://upload.wikimedia.org/wikipedia/en/8/86/Holyrollernovacaine.jpg
</visualURL>
And I'm looking for a way to just get the contents of the URL (all text after img=).
$Image = $xpath->query("substring-after(/Playlist/PlaylistEntry[1]/visualURL[1]/text(), 'img=')", $element)->item(0)->nodeValue;
This produces a "property of non-object" error in my PHP output.
There must be another way to extract just the URL using XPath, no?
Any help would be greatly appreciated!
EDIT:
Here is the minimum code
<?php
$xmlDoc = new DOMDocument();
$xmlDoc->loadXML('<Playlist>
<PlaylistEntry>
<visualURL>
id=12582194|img=http://upload.wikimedia.org/wikipedia/en/9/96/Sometime_around_midnight.jpg
</visualURL>
</PlaylistEntry>
</Playlist>');
$xpath = new DOMXpath($xmlDoc);
$elements = $xpath->query("/Playlist/PlaylistEntry[1]");
if (!is_null($elements))
foreach ($elements as $element)
$Image = $xpath->query("substring-after(/Playlist/PlaylistEntry[1]/visualURL[1]/text(), 'img=')", $element)->item(0)->nodeValue;
print "Finished Item: $Image";
?>
EDIT 2:
After some research I believe I must use
$xpath->evaluate
instead of my current use of
$xpath->query
see this link
Same XPath query is working with Google docs but not PHP
I'm not exactly sure how to do this yet, but I will investigate more in the morning. Again, any help would be appreciated.
You're on the right track. Use DOMXPath::evaluate() for XPath expressions that don't return nodes, such as substring-after() (it returns a string, as documented in the linked page). The following code prints the expected output:
$xmlDoc = new DOMDocument();
$xml = <<<XML
<Playlist>
<PlaylistEntry>
<visualURL>
id=12582194|img=http://upload.wikimedia.org/wikipedia/en/9/96/Sometime_around_midnight.jpg
</visualURL>
</PlaylistEntry>
</Playlist>
XML;
$xmlDoc->loadXML($xml);
$xpath = new DOMXpath($xmlDoc);
$elements = $xpath->query("/Playlist/PlaylistEntry");
foreach ($elements as $element) {
$Image = $xpath->evaluate("substring-after(visualURL, 'img=')", $element);
print "Finished Item: $Image <br>";
}
Output:
Finished Item: http://upload.wikimedia.org/wikipedia/en/9/96/Sometime_around_midnight.jpg
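One detail the output above hides: the text inside visualURL is surrounded by newlines, so the string returned by substring-after() still ends with one. Here is a minimal sketch, assuming you want a clean URL before writing it to the database, that wraps the expression in normalize-space(). The variable names $entry, $raw and $image are only illustrative.

```php
<?php
// Same sample document as above; the surrounding newlines are part of the text node.
$xmlDoc = new DOMDocument();
$xmlDoc->loadXML('<Playlist>
<PlaylistEntry>
<visualURL>
id=12582194|img=http://upload.wikimedia.org/wikipedia/en/9/96/Sometime_around_midnight.jpg
</visualURL>
</PlaylistEntry>
</Playlist>');
$xpath = new DOMXPath($xmlDoc);
$entry = $xpath->query('/Playlist/PlaylistEntry')->item(0);

// substring-after() returns a string, so evaluate() hands back a PHP string.
$raw = $xpath->evaluate("substring-after(visualURL, 'img=')", $entry);

// normalize-space() strips leading and trailing whitespace (here: the trailing newline).
$image = $xpath->evaluate("normalize-space(substring-after(visualURL, 'img='))", $entry);

var_dump(is_string($raw));  // bool(true)
var_dump($raw === $image);  // bool(false), because $raw still ends with a newline
print "Finished Item: $image\n";
```

Note that normalize-space() also collapses runs of whitespace inside the string, which is harmless for a URL but may not be what you want for free text.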
I am trying to echo the "Current Song" in a PHP file using XPath.
When I echo the result of file_get_contents, it returns the web page I'm trying to read, but I can't seem to filter the page content using XPath. It should echo only the Current Song.
This is what I've tried:
<?php
libxml_use_internal_errors(false);
$html = file_get_contents('http://185.40.20.83/radio/8000/');
$doc = new DOMDocument;
$doc->loadHTML($html);
$xpath = new DOMXpath($doc);
$node = $xpath->query('/html/body/div/div[1]/div[2]/table/tbody/tr[10]/td[2]')->item(0);
echo $node->textContent;
?>
I've been trying this for over an hour and I'm losing hope because I can't see what the problem is...
Try changing your $node to:
$node = $xpath->query('//table//tr[./td[text()="Current Song:"]]/td[2]')->item(0);
or
$node = $xpath->query('//table//tr[./td[text()="Current Song:"]]/td[2]');
echo $node[0]->nodeValue;
Output:
Chmst - Pump Up The Jam
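A likely reason the absolute path fails is the tbody step: browsers insert tbody elements when they build the DOM, but the libxml HTML parser used by DOMDocument does not add them, so a path copied from browser developer tools may match nothing. The sketch below illustrates this on a small hypothetical table (the markup is a stand-in, not the real radio page, and whether the real page lacks tbody in its source is an assumption).

```php
<?php
libxml_use_internal_errors(true);

// Hypothetical stand-in for the status page: a table written without <tbody>.
$html = '<html><body><table>
<tr><td>Stream Status:</td><td>Up</td></tr>
<tr><td>Current Song:</td><td>Chmst - Pump Up The Jam</td></tr>
</table></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Path with a tbody step, as browser tools would report it: nothing matches.
$withTbody = $xpath->query('//table/tbody/tr[2]/td[2]');

// Selecting the row by its label does not depend on tbody or on row positions.
$byLabel = $xpath->query('//table//tr[td[text()="Current Song:"]]/td[2]');

echo $withTbody->length, "\n";              // 0
echo $byLabel->item(0)->textContent, "\n";  // Chmst - Pump Up The Jam
```

Matching on the label text also keeps working if rows are added or reordered, which a positional path like tr[10] would not.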
I want to read this xml document:
<?xml version="1.0" encoding="UTF-8"?>
<tns:getPDMNumber xmlns:tns="http://www.testgroup.com/TestPDM" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.testgroup.com/TestPDM getPDMNumber.xsd ">
<tns:getPDMNumberResponse>
<tns:requestID>22222</tns:requestID>
<tns:pdmNumber>654321</tns:pdmNumber>
<tns:responseCode>0</tns:responseCode>
</tns:getPDMNumberResponse>
</tns:getPDMNumber>
I tried it this way:
$dom->load('response/17_getPDMNumberResponse.xml');
$nodes = $dom->getElementsByTagName("tns:requestID");
//$nodes = $dom->getElementsByTagName("tns:getPDMNumber");
//$nodes = $dom->getElementsByTagName("tns:getPDMNumberResponse");
foreach($nodes as $node)
{
$response=$node->getElementsByTagName("tns:getPDMNumber");
foreach($response as $info)
{
$test = $info->getElementsByTagName("tns:pdmNumber");
$pdm = $test->nodeValue;
}
}
The code never enters the foreach loop.
Just to clarify, my goal is to read the "tns:pdmNumber" node.
Does anybody have an idea?
EDIT: I have also tried the commented-out lines.
The XML uses a namespace, so you should use the namespace-aware methods. Their names end with the suffix NS.
$tns = 'http://www.testgroup.com/TestPDM';
$document = new DOMDocument();
$document->loadXml($xml);
foreach ($document->getElementsByTagNameNS($tns, "pdmNumber") as $node) {
var_dump($node->textContent);
}
Output:
string(6) "654321"
A better option is to use XPath expressions. They allow more comfortable access to DOM nodes. In this case you have to register a prefix for the namespace so that you can use it in the XPath expression:
$document = new DOMDocument();
$document->loadXml($xml);
$xpath = new DOMXpath($document);
$xpath->registerNamespace('t', 'http://www.testgroup.com/TestPDM');
var_dump(
$xpath->evaluate('string(/t:getPDMNumber/t:getPDMNumberResponse/t:pdmNumber)')
);
With this line:
$nodes = $dom->getElementsByTagName("tns:requestID");
you look up all the requestID nodes and try to loop over them. That's fine, but then you use each of those nodes as the starting point to find any getPDMNumber nodes UNDER the requestID, and there is nothing there: requestID is a terminal (leaf) node. So
$response=$node->getElementsByTagName("tns:getPDMNumber");
finds nothing, and the inner loop has nothing to do.
It's like saying "Start digging a hole until you reach China. Once you reach China, keep digging until you reach Australia". But you can't keep digging: you've reached the "bottom", and the only thing deeper than China would be going into orbit.
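To make the point concrete, here is a small sketch (using the namespace-aware methods and an inline copy of the XML from the question, without the schema attributes) that searches below requestID, finds nothing, and then reads pdmNumber by starting from its actual parent instead. The variable names are illustrative only.

```php
<?php
$tns = 'http://www.testgroup.com/TestPDM';
$xml = '<tns:getPDMNumber xmlns:tns="http://www.testgroup.com/TestPDM">
<tns:getPDMNumberResponse>
<tns:requestID>22222</tns:requestID>
<tns:pdmNumber>654321</tns:pdmNumber>
<tns:responseCode>0</tns:responseCode>
</tns:getPDMNumberResponse>
</tns:getPDMNumber>';

$dom = new DOMDocument();
$dom->loadXML($xml);

// requestID is a leaf: it has no element descendants at all.
$requestId = $dom->getElementsByTagNameNS($tns, 'requestID')->item(0);
$below = $requestId->getElementsByTagNameNS($tns, 'getPDMNumber')->length;

// Start from the parent element and look for pdmNumber beneath it.
$response = $dom->getElementsByTagNameNS($tns, 'getPDMNumberResponse')->item(0);
// getElementsByTagNameNS() returns a node list, so pick one node with item(0)
// before reading its text (the question read nodeValue from the list itself).
$pdm = $response->getElementsByTagNameNS($tns, 'pdmNumber')->item(0)->textContent;

var_dump($below); // int(0)
var_dump($pdm);   // string(6) "654321"
```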
I have an XML file that contains this:
<ns1:Response xmlns:ns1="http://example.com/">
<ns1:return>
<ns1:mid>39824</ns1:mid>
<ns1:serverType>4</ns1:serverType>
<ns1:size>5</ns1:size>
</ns1:return>
<ns1:return>....
</ns1:return>
Now I want to get the node value of mid where the value of size is 5. I tried the following code, but got no results:
$doc = new DOMDocument();
$doc->load($file);
$xpath = new DOMXPath($doc);
$query = '//Response/return/size[.="5"]/mid';
$entries = $xpath->evaluate($query);
So how can I do that?
Thanks in advance.
PHP automatically registers the namespaces of the current context node, but it is better not to depend on that. The prefixes used in the document can change, and the document can even use a default namespace with no prefixes at all.
Best register your own prefix:
$xpath->registerNamespace('e', 'http://example.com/');
In XPath you define location paths with conditions:
Any return node inside a Response node:
//e:Response/e:return
Only if it has a size child node with the value 5
//e:Response/e:return[e:size = 5]
Get the mid node inside it
//e:Response/e:return[e:size = 5]/e:mid
Cast the first found mid node into a string
string(//e:Response/e:return[e:size = 5]/e:mid)
Complete example:
$xml = <<<'XML'
<ns1:Response xmlns:ns1="http://example.com/">
<ns1:return>
<ns1:mid>39824</ns1:mid>
<ns1:serverType>4</ns1:serverType>
<ns1:size>5</ns1:size>
</ns1:return>
<ns1:return></ns1:return>
</ns1:Response>
XML;
$doc = new DOMDocument();
$doc->loadXml($xml);
$xpath = new DOMXPath($doc);
$xpath->registerNamespace('e', 'http://example.com/');
$mid = $xpath->evaluate(
'string(//e:Response/e:return[e:size = 5]/e:mid)'
);
var_dump($mid);
Output:
string(5) "39824"
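The string() call returns only the first match. If several return elements can have a size of 5, a sketch along the following lines collects every matching mid by iterating over the node list instead. The second and third return entries are hypothetical data added for illustration; they are not in the original file.

```php
<?php
// Hypothetical data: two entries with size 5 and one with size 7.
$xml = '<ns1:Response xmlns:ns1="http://example.com/">
<ns1:return><ns1:mid>39824</ns1:mid><ns1:size>5</ns1:size></ns1:return>
<ns1:return><ns1:mid>40001</ns1:mid><ns1:size>7</ns1:size></ns1:return>
<ns1:return><ns1:mid>40002</ns1:mid><ns1:size>5</ns1:size></ns1:return>
</ns1:Response>';

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);
$xpath->registerNamespace('e', 'http://example.com/');

// Without string(), evaluate() returns a DOMNodeList that can be iterated.
$mids = [];
foreach ($xpath->evaluate('//e:Response/e:return[e:size = 5]/e:mid') as $node) {
    $mids[] = $node->textContent;
}

print_r($mids); // contains "39824" and "40002"
```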
You can also use the following-sibling axis in this case: get the mid value whose following sibling is a size element with text equal to 5. Rough example:
$query = 'string(//ns1:Response/ns1:return/ns1:mid[following-sibling::ns1:size[text()="5"]])';
You're missing the namespace prefixes, and your query looks for a mid child of a size element whose content is 5, whereas mid is actually a sibling of size.
Try this:
$query = '//ns1:Response/ns1:return/ns1:mid[../ns1:size[text()="5"]]';
then, to see the result:
foreach ($entries as $entry) {
echo $entry->nodeValue . "<br />";
}
I am using DOMDocument to parse this small piece of HTML. I am looking for a specific span tag with a specific id.
<span id="CPHCenter_lblOperandName">Hello world</span>
My code:
$dom = new domDocument;
@$dom->loadHTML($html); // the @ is to silence errors caused by malformed HTML
$dom->preserveWhiteSpace = false;
$nodes = $dom->getElementsByTagName('//span[@id="CPHCenter_lblOperandName"');
foreach($nodes as $node){
echo $node->nodeValue;
}
But I think something is wrong with either the code or the HTML (how can I tell?):
When I count nodes with echo count($nodes); the result is always 1
I get nothing outputted in the nodes loop
How can I learn the syntax of these complex queries?
What did I do wrong?
You can simply use getElementById:
$dom->getElementById('CPHCenter_lblOperandName')->nodeValue
or use an XPath selector:
$selector = new DOMXPath($dom);
$list = $selector->query('/html/body//span[@id="CPHCenter_lblOperandName"]');
echo($list->item(0)->nodeValue);
//or
foreach($list as $span) {
$text = $span->nodeValue;
}
Your four part question gets an answer in three parts:
1. getElementsByTagName does not take an XPath expression, you need to give it a tag name;
2. Nothing is output because no tag would ever match the tag name you provided (see #1);
3. It looks like what you want is XPath, which means you need to create an XPath object - see the PHP docs for more.
Also, a better method of controlling the libxml errors is to use libxml_use_internal_errors(true) (rather than the '@' operator, which will also hide other, more legitimate errors). That would leave you with code that looks something like this:
<?php
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
foreach($xpath->query("//span[@id='CPHCenter_lblOperandName']") as $node) {
echo $node->textContent;
}
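On the "how can I tell?" part of the question: once libxml_use_internal_errors(true) is set, the parser's complaints are collected rather than printed, and libxml_get_errors() lets you inspect them. The following is a rough sketch, assuming $html holds the snippet from the question; what gets reported (if anything) depends on the markup and the libxml version, so no particular error list is implied.

```php
<?php
libxml_use_internal_errors(true);

// Assumed input: the snippet from the question, wrapped in a div.
$html = '<div><span id="CPHCenter_lblOperandName">Hello world</span></div>';

$dom = new DOMDocument();
$dom->loadHTML($html);

// Show whatever the HTML parser complained about, then reset the error buffer.
foreach (libxml_get_errors() as $error) {
    printf("line %d: %s\n", $error->line, trim($error->message));
}
libxml_clear_errors();

// Attributes are addressed with @ in XPath.
$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//span[@id="CPHCenter_lblOperandName"]');

printf("Found %d node(s)\n", $nodes->length);
$text = $nodes->length > 0 ? $nodes->item(0)->textContent : null;
echo $text, "\n";
```

Checking $nodes->length before calling item(0) avoids the "property of non-object" style of error when nothing matches.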
I'm trying to write a script that grabs the URL of the first image from this website: http://www.slothradio.com/covers/?adv=&artist=pantera&album=vulgar+display+of+power
Here's my script:
$content = file_get_contents($url);
$doc = new DOMDocument();
$doc->loadHTML($content);
$xpath = new DOMXpath($doc);
$elements = $xpath->query("*/div[@class='album0']/img");
echo '<pre>';print_r($elements);exit;
When I run that, it outputs
DOMNodeList Object
(
)
Even when I change my query to $xpath->query("*/img"), I still get nothing. What am I doing wrong?
$doc->loadHTMLFile($content); takes a file path, not HTML content. See the documentation:
http://php.net/manual/en/domdocument.loadhtmlfile.php
Use
$doc = new DOMDocument();
$doc->loadHTMLFile($url);
To output the elements, use:
var_dump(iterator_to_array($elements));
//Or
print_r(iterator_to_array($elements));
What am I doing wrong?
You are using print_r, but DOMNodeList does not offer any output for that function (because it's an internal class). You can start with outputting the number of items for example. In the end you need to iterate over the node list and deal with each node on your own.
printf("Found %d element(s).\n", $elements->length);
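If the count comes back as zero, the expression itself is worth a second look. Evaluated from the document node, */div only matches div elements that are direct children of the root html element, and in an HTML document those children are head and body. Here is a sketch using a small hypothetical page (the real site's markup and the example.com image URL are assumptions) that compares the relative path with a // search and then reads the src attribute.

```php
<?php
libxml_use_internal_errors(true);

// Hypothetical markup standing in for the cover page.
$content = '<html><body><div id="wrap">
<div class="album0"><img src="http://example.com/cover.jpg"></div>
</div></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($content);
$xpath = new DOMXpath($doc);

// Relative to the document node: html, then a div child of html. No match.
$relative = $xpath->query("*/div[@class='album0']/img");

// Searching at any depth finds the image.
$anywhere = $xpath->query("//div[@class='album0']/img");

printf("Relative: %d, anywhere: %d\n", $relative->length, $anywhere->length);

// Read the URL from the first match, if there is one.
$src = $anywhere->length > 0 ? $anywhere->item(0)->getAttribute('src') : null;
echo $src, "\n";
```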