DomXPath php omitting html elements [duplicate] - php

This question already has answers here:
How to get innerHTML of DOMNode?
(9 answers)
Closed 9 years ago.
I'm trying to keep certain parts of HTML elements inside a dom that was loaded by DomDocument and CURL.
The problem is that when I run an XPath query and retrieve nodeValue, it omits the HTML elements.
Below is the code. Is there a way to retrieve HTML for that particular node?
$location = $xpath->query("//div[@id='location']/label");
echo $location->item(0)->nodeValue."<br>";

$dom = new DOMDocument();
$dom->loadHTML('<html><div id="location"><label><h1>Hello <b>world</b></h1></label></div></html>');
$xpath = new DOMXPath($dom);
$location = $xpath->query("//div[@id='location']/label/*");
var_dump($dom->saveXML($location->item(0)));
Output:
string(27) "<h1>Hello <b>world</b></h1>"
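The `/*` query returns only element children, so text sitting directly inside the label would be missed. A minimal sketch of an innerHTML-style helper follows, assuming you want the markup of every child node; `innerHTML` is a hypothetical function name, not part of the DOM extension.

```php
<?php
// Hypothetical helper (not a built-in): concatenates the serialized
// markup of every child node of $node, text nodes included.
function innerHTML(DOMNode $node): string
{
    $html = '';
    foreach ($node->childNodes as $child) {
        // saveHTML() accepts an optional node and serializes just that node.
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return $html;
}

$dom = new DOMDocument();
$dom->loadHTML('<html><div id="location"><label><h1>Hello <b>world</b></h1></label></div></html>');
$xpath = new DOMXPath($dom);

// Select the label itself, then serialize its children.
$label = $xpath->query("//div[@id='location']/label")->item(0);
$inner = innerHTML($label);
echo $inner, "\n";
```

Using saveHTML() rather than saveXML() keeps the output in HTML syntax, which matters for void elements such as `<br>`.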

Related

Regex find every instance of element in html [duplicate]

This question already has answers here:
How do you parse and process HTML/XML in PHP?
(31 answers)
Using DOMDocument to extract from HTML document by class
(3 answers)
Closed 9 years ago.
I'm scraping an html page that has X amount of instances of the element class="page-title" inside a div element id="row-1"
So we have something like:
<div id="row-1">
<div class="page-title">
<span><h4><a>text I want to grab</a></h4></span>
</div>
</div>
There could be 1,2,3,10 of these rows. Could anyone help explain how I can grab every instance of the page title if there are multiple rows?
Whatever you do, don't use a regex! HE COMES
Instead, use a parser:
$dom = new DOMDocument();
$dom->loadHTML($your_html_source_here);
$xpath = new DOMXPath($dom);
$nodes = $xpath->query("//*[@id='row-1']/div[@class='page-title']");
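The query returns a DOMNodeList, so grabbing every instance is a matter of looping over it. A rough sketch, assuming simplified sample markup modeled on the question (the two titles are made up for illustration):

```php
<?php
// Assumed sample input: one row holding two page-title blocks.
$html = '<div id="row-1">'
      . '<div class="page-title"><h4><a>first title</a></h4></div>'
      . '<div class="page-title"><h4><a>second title</a></h4></div>'
      . '</div>';

libxml_use_internal_errors(true); // keep parser warnings out of the output
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$titles = [];
foreach ($xpath->query("//*[@id='row-1']/div[@class='page-title']") as $node) {
    // textContent is the concatenated text of the node and its descendants.
    $titles[] = trim($node->textContent);
}
print_r($titles);
```

Note that `@class='page-title'` matches only when the attribute is exactly that string. If the element may carry several classes, the usual XPath 1.0 idiom is `contains(concat(' ', normalize-space(@class), ' '), ' page-title ')`.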

how to get a title of a webpage using php? [duplicate]

This question already has answers here:
Get meta information, title and all images of any webpage using php
(2 answers)
Get title of website via link
(10 answers)
Closed 9 years ago.
I wrote the following code to get the title of a webpage, but it doesn't work and outputs this error message: Object of class DOMElement could not be converted to string
$html = file_get_contents("http://mysmallwebpage.com/");
$dom = new DOMDocument;
@$dom->loadHTML($html);
$links = $dom->getElementsByTagName('title');
foreach ($links as $title)
{
echo (string)$title."<br>";
}
Could you please show me with an example?
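The error comes from the `(string)` cast: a DOMElement is an object with no string conversion, so its text has to be read through a property such as textContent or nodeValue. A minimal sketch, assuming a hard-coded page in place of the file_get_contents() call so that it runs offline:

```php
<?php
// Stand-in for the downloaded page (assumption, so the sketch runs offline).
$html = '<html><head><title>My small web page</title></head><body><p>Hi</p></body></html>';

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom = new DOMDocument;
$dom->loadHTML($html);

$titles = [];
foreach ($dom->getElementsByTagName('title') as $title) {
    // Read the element's text instead of casting the object to string.
    $titles[] = $title->textContent;
}
echo implode("<br>", $titles), "\n";
```

libxml_use_internal_errors(true) is an alternative to the `@` operator for silencing warnings about malformed markup.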

How would one get the value/text of an anchor DOMElement? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How to get Anchor text using DomDocument?
Getting node's text in PHP DOM
I have a script that finds all the anchor tags of a certain class in a DOMDocument. I am looking to echo the text that is contained within the <a>"....."</a> tags.
You can access DOMText node directly using XPath:
$xpath = new DOMXPath($dom_document);
$node = $xpath->query('//a/text()')->item(0);
echo $node->textContent; // text
You can use preg_match(). Here is an example:
$link = '<a href="#">www.CoursesWeb.net</a>'; // the href value is a placeholder
if(preg_match('/\<a([^\>]*)\>(.*?)\<\/a\>/i', $link, $mc)) {
echo $mc[2]; // www.CoursesWeb.net
}

getElementByAttribute PHP DOM [duplicate]

This question already has an answer here:
Closed 10 years ago.
Possible Duplicate:
Select xml node by attribute in php
Is there any function like getElementByAttribute in PHP? If no, how do I create a workaround?
E.g.
<div class="foo">FOO!</div>
How do I match that element?
You can use XPath:
$xpath = new DOMXPath($document);
$results = $xpath->query("//*[@class='foo']");
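As a workaround, the XPath query can be wrapped in a small function. This is only a sketch: `getElementsByAttribute` is a hypothetical name borrowed from the question, and it assumes the value contains no single quote.

```php
<?php
// Hypothetical wrapper; not part of PHP's DOM extension.
function getElementsByAttribute(DOMDocument $doc, string $name, string $value): DOMNodeList
{
    $xpath = new DOMXPath($doc);
    // Assumes $name is a plain attribute name and $value has no single quote.
    return $xpath->query("//*[@{$name}='{$value}']");
}

$document = new DOMDocument();
$document->loadHTML('<div class="foo">FOO!</div><div class="bar">BAR!</div>');

$results = getElementsByAttribute($document, 'class', 'foo');
$text = $results->item(0)->textContent;
echo $results->length, ' match: ', $text, "\n";
```

The function returns a DOMNodeList rather than a single element because an attribute value, unlike an id, is not expected to be unique.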

How to retrieve attribute of root element of DOMDocument? [duplicate]

This question already has an answer here:
Closed 10 years ago.
Possible Duplicate:
DOMDocument::load - PHP - Getting attribute value
I use the following code:
$str = 'text';
$dom = new DOMDocument;
$dom->loadXML($str);
I want to obtain the value of an attribute of the root element of $str.
In this example, "some_link" should be returned. In the real case, $str is read from a file.
How can I achieve this?
Try:
$dom->documentElement->getAttribute('%yourAttrName%');
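A complete sketch of that call is below. The XML in the question did not survive, so the element name `root` and the attribute name `link` are assumptions chosen for illustration only.

```php
<?php
// Hypothetical input: element and attribute names are assumed.
$str = '<root link="some_link">text</root>';

$dom = new DOMDocument;
$dom->loadXML($str);

$root  = $dom->documentElement;        // the root element of the document
$value = $root->getAttribute('link');  // empty string if the attribute is absent
echo $value, "\n"; // some_link
```

When the XML comes from a file, `$dom->load($path)` can replace the loadXML() call.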
