This is my code:
$xml = file_get_contents('C:\myxml.xml');
$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
$xpath->registerNamespace("bme", "http://www.bmecat.org/bmecat/1.2/bmecat_new_catalog");
$expression = 'string(//bme:ARTICLE)';
var_dump($xpath->evaluate($expression));
This will var_dump the first of all the ARTICLE nodes. How can I get all of them?
Thanks for helping!
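One possible approach, as a minimal sketch: string() converts only the first node of a node set, so the usual way to reach every match is to call query() and loop over the returned DOMNodeList. The sample XML below (including the SUPPLIER_AID children and their values) is hypothetical; only the namespace URI and the ARTICLE element name come from the question.

```php
<?php
// Hypothetical sample document; only the namespace URI and ARTICLE come from the question.
$xml = '<BMECAT xmlns="http://www.bmecat.org/bmecat/1.2/bmecat_new_catalog">'
     . '<ARTICLE><SUPPLIER_AID>A-1</SUPPLIER_AID></ARTICLE>'
     . '<ARTICLE><SUPPLIER_AID>A-2</SUPPLIER_AID></ARTICLE>'
     . '</BMECAT>';

$dom = new DOMDocument();
$dom->loadXML($xml);

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('bme', 'http://www.bmecat.org/bmecat/1.2/bmecat_new_catalog');

// query() returns a DOMNodeList holding every matching node,
// whereas string(...) only converts the first node of the set.
$articles = [];
foreach ($xpath->query('//bme:ARTICLE') as $article) {
    $articles[] = trim($article->textContent);
}

var_dump($articles);
```

Each $article is a DOMElement, so instead of textContent you could also run a relative query such as $xpath->query('bme:SUPPLIER_AID', $article) to read individual fields.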
I'm fetching some Wikipedia content in two different ways:
$html = file_get_contents('https://en.wikipedia.org/wiki/Sans-serif');
The first one gets the first paragraph:
$dom = new DOMDocument();
@$dom->loadHTML($html);
$p = $dom->getElementsByTagName('p')->item(0)->nodeValue;
echo $p;
The second one gets the first paragraph inside the element with a specific $id:
$dom = new DOMDocument();
@$dom->loadHTML($html);
$p = $dom->getElementById($id)->getElementsByTagName('p')->item(0);
echo $p->nodeValue;
I'm looking for a third way that gets the whole intro section.
So I was thinking about selecting all the <p> elements before the element with the id or class "toc", which is the table of contents.
Any idea how to do that?
If you're just looking for the intro in plain text, you can simply use Wikipedia's API:
https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro=&explaintext=&titles=Sans-serif
If you want HTML formatting as well (excluding inner images and the likes):
https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro=&titles=Sans-serif
You could use DOMDocument and DOMXPath with, for example, an XPath expression like:
//div[@id="toc"]/preceding-sibling::p
$doc = new DOMDocument();
libxml_use_internal_errors(true); // real-world HTML is rarely valid; keep parser warnings quiet
$doc->loadHTMLFile("https://en.wikipedia.org/wiki/Sans-serif");
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//div[@id="toc"]/preceding-sibling::p');
foreach ($nodes as $node) {
echo $node->nodeValue;
}
That would give you the content of the paragraphs that precede the div with id "toc" and share its parent element.
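To see the preceding-sibling axis in isolation, here is a small sketch that runs against an inline HTML string instead of the live page. The markup is a hypothetical stand-in, not Wikipedia's actual structure.

```php
<?php
// Hypothetical markup standing in for the downloaded page.
$html = '<html><body><div id="content">'
      . '<p>First intro paragraph.</p>'
      . '<p>Second intro paragraph.</p>'
      . '<div id="toc">Contents</div>'
      . '<p>Paragraph after the table of contents.</p>'
      . '</div></body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Only <p> elements that are siblings of the toc div and come before it are selected.
$intro = [];
foreach ($xpath->query('//div[@id="toc"]/preceding-sibling::p') as $p) {
    $intro[] = $p->nodeValue;
}

var_dump($intro);
```

Note that the axis only sees siblings: if the table of contents were nested one level deeper than the paragraphs, the expression would need to start from the matching ancestor instead.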
I have this keyword: yt-lookup-title.
I want the next 17 letters after this in a variable. So I would have:
"<a href="/watch?v=HnlC81tWoY8"
How can I achieve this for every line that contains this keyword?
If you want to get the href content, you can rely on DOMDocument.
If I'm not mistaken, all the links (<a>) have this class yt-uix-tile-link. So you can do the following:
$dom = new DOMDocument;
// $html is a string containing the html of the page you're parsing
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$links = array();
$nodes = $xpath->query('//a[@class="yt-uix-tile-link"]/@href');
foreach ($nodes as $node) {
$links [] = $node->nodeValue;
}
var_dump ($links);
Hope that helps
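One caveat worth sketching: @class="..." compares the whole attribute value, so it stops matching as soon as a link carries more than one class. A common XPath 1.0 workaround is to pad the normalized class list with spaces and test for the token. The markup below is hypothetical (the extra class names and the second video id are made up for illustration).

```php
<?php
// Hypothetical markup; assumes the links may carry additional classes.
$html = '<html><body>'
      . '<a class="yt-uix-tile-link extra-class" href="/watch?v=HnlC81tWoY8">One</a>'
      . '<a class="other-link" href="/about">About</a>'
      . '<a class="yt-uix-tile-link" href="/watch?v=example00002">Two</a>'
      . '</body></html>';

$dom = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Matches the class as a whole token, regardless of other classes on the element.
$expr = '//a[contains(concat(" ", normalize-space(@class), " "), " yt-uix-tile-link ")]/@href';

$links = [];
foreach ($xpath->query($expr) as $href) {
    $links[] = $href->nodeValue;
}

var_dump($links);
```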
I have an XML file whose format is like this:
<root>
<a>1</a>
<b>2</b>
<c></c>
</root>
This is the code I have tried:
$to = 3;
$dom = new DOMDocument();
$dom->formatOutput = true;
$dom->preserveWhiteSpace = false;
$dom->load("../xxx.xml");
$xpath = new DOMXPath($dom);
$query = "/root/*[position()=$to]";
$nodes = $xpath->query($query);
$node = $nodes[0];
$dom->removeChild($node);
$dom->save("../xxx.xml", LIBXML_NOEMPTYTAG);
How can I delete the tag with name "c" ?
Oh lord, the problem was in this line:
$dom->removeChild($node);
It should be:
$node->parentNode->removeChild($node);
To delete a node, you have to call removeChild() on its parent: the method only removes a direct child of the node it is called on, and <c> is a child of <root>, not of the document itself. That's just my two cents; if someone understands this better, feel free to correct me.
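Here is a minimal, self-contained sketch of that fix. It loads the XML from an inline string rather than from ../xxx.xml and prints the result instead of saving it, so those two details differ from the original code.

```php
<?php
$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->loadXML('<root><a>1</a><b>2</b><c></c></root>');

$xpath = new DOMXPath($dom);
$to = 3;
$node = $xpath->query("/root/*[position()=$to]")->item(0);

// Calling removeChild() on the document fails: <c> is not its direct child.
$failedOnDocument = false;
try {
    $dom->removeChild($node);
} catch (DOMException $e) {
    $failedOnDocument = true;
}

// Calling it on the node's own parent (<root>) works.
if ($node !== null && $node->parentNode !== null) {
    $node->parentNode->removeChild($node);
}

$result = $dom->saveXML($dom->documentElement);
echo $result, PHP_EOL;
```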
I am trying to retrieve content from a p element in this page. As you can see, in the source code there is a paragraph with the content I want:
<p id="qb"><!--
QBlastInfoBegin
Status=READY
QBlastInfoEnd
--></p>
What I actually want is the value of Status.
Here is my PHP code.
@$dom->loadHTML($ncbi->ncbi_request($params));
$XPath = new DOMXPath($dom);
$nodes = $XPath->query('//p[@id="qb"]');
$node = $nodes->item(0)->nodeValue;
var_dump($node);
That returns:
["nodeValue"]=> string(0) ""
Any ideas?
Thanks!
It seems that to get comment values you need to use //comment().
I'm not too familiar with XPath, so I'm not sure of the exact syntax.
Sources: https://stackoverflow.com/a/7548089/723139 / https://stackoverflow.com/a/1987555/723139
Update: with working code
<?php
$data = file_get_contents('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?RID=UY5PPBRH014&CMD=Get');
$dom = new DOMDocument();
@$dom->loadHTML($data);
$XPath = new DOMXPath($dom);
$nodes = $XPath->query('//p[@id="qb"]/comment()');
foreach ($nodes as $comment)
{
var_dump($comment->textContent);
}
I checked the site, and it seems you are after the comment inside the paragraph, so you need to add comment() to your XPath query. Consider this example:
$contents = file_get_contents('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?RID=UY5PPBRH014&CMD=Get');
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($contents);
libxml_clear_errors();
$xpath = new DOMXpath($dom);
$comment = $xpath->query('//p[@id="qb"]/comment()')->item(0)->nodeValue;
echo '<pre>';
print_r($comment);
Outputs:
QBlastInfoBegin
Status=READY
QBlastInfoEnd
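Since the question asks for the Status value itself rather than the whole comment, a possible last step is a small regular expression over the comment text. This is only a sketch: it uses an inline copy of the paragraph from the question instead of fetching the page, and the Status=(\w+) pattern is an assumption about the value's format.

```php
<?php
// Inline copy of the paragraph from the question, standing in for the fetched page.
$html = "<html><body><p id=\"qb\"><!--\n"
      . "QBlastInfoBegin\n"
      . "Status=READY\n"
      . "QBlastInfoEnd\n"
      . "--></p></body></html>";

$dom = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$comment = $xpath->query('//p[@id="qb"]/comment()')->item(0);

// Pull the value after "Status=" out of the comment text, if the comment exists.
$status = null;
if ($comment !== null && preg_match('/Status=(\w+)/', $comment->nodeValue, $m)) {
    $status = $m[1];
}

var_dump($status);
```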
How do I get the value of an input field like the one below where it does not have an ID attribute using PHP's DOMDocument?
<input type="text" name="make" value="Toyota">
XPath makes it simple, assuming that's the only text input with "make" as its name:
$dom = new DOMDocument();
$dom->loadHTML(...);
$xp = new DOMXpath($dom);
$nodes = $xp->query('//input[@name="make"]');
$node = $nodes->item(0);
$car_make = $node->getAttribute('value');
If there's more than one input with that particular field name on the page (which is entirely possible), then you'll have to do some extra work to narrow down WHICH of those multiple inputs you want.
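As an illustration of that narrowing, one option is to anchor the query on something unique around the input, such as the enclosing form. The form ids "search" and "order" and the "Honda" value below are hypothetical; the real page would dictate what you can anchor on.

```php
<?php
// Hypothetical page with two inputs that share the same name.
$html = '<html><body>'
      . '<form id="search"><input type="text" name="make" value="Honda"></form>'
      . '<form id="order"><input type="text" name="make" value="Toyota"></form>'
      . '</body></html>';

$dom = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$xp = new DOMXPath($dom);

// Restrict the match to the input inside one specific form.
$node = $xp->query('//form[@id="order"]//input[@type="text"][@name="make"]')->item(0);

// item(0) returns null when nothing matched, so guard before reading the attribute.
$car_make = $node !== null ? $node->getAttribute('value') : null;

var_dump($car_make);
```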
$dom = new DOMDocument();
$dom->loadHTML($result);
$xpath = new DOMXpath($dom);
$node = $xpath->query('//input[@name="token"]/attribute::value');
$token = $node->item(0)->nodeValue;