String between php [duplicate] - php

This question already has answers here:
How do you parse and process HTML/XML in PHP?
(31 answers)
Closed 6 years ago.
I've got something like this:
$string = '<some code before><div class="abc">Something written here</div><some other code after>';
What I want is to get what is within the div and output it:
Something written here
How can I do that in PHP? Thanks in advance!

You would use the DOMDocument class.
// HTML document stored in a string
$html = '<strong><div class="abc">Something written here</div></strong>';
// Load the HTML document
$dom = new DOMDocument();
$dom->loadHTML($html);
// Find div with class 'abc' (attributes are addressed with @ in XPath)
$xpath = new DOMXPath($dom);
$result = $xpath->query('//div[@class="abc"]');
// Echo the results...
if ($result->length > 0) {
    foreach ($result as $node) {
        echo $node->nodeValue, "\n";
    }
} else {
    echo "Empty result set\n";
}
Read up on the expression syntax for XPath to customize your DOM searches.
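As a rough illustration of how the expression changes what is matched, here is a minimal sketch. The sample markup and the variable names ($exact, $loose, $first) are made up for this example and are not from the question.

```php
<?php
// Hypothetical sample markup with two divs whose class attributes differ.
$html = '<div id="main"><div class="abc">Something written here</div>'
    . '<div class="abc note">Another one</div></div>';

// Buffer parser warnings instead of printing them.
$previous = libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($dom);

// Exact attribute match: only the div whose class is exactly "abc".
$exact = $xpath->query('//div[@class="abc"]');

// Substring match: also catches class="abc note" (and would catch "abcd").
$loose = $xpath->query('//div[contains(@class, "abc")]');

// evaluate() can return a plain string instead of a node list.
$first = $xpath->evaluate('string(//div[@class="abc"])');

echo $exact->length, "\n";
echo $loose->length, "\n";
echo $first, "\n";
```

The exact match is the strictest of the three. The substring match is convenient but can over-match, so which one fits depends on how the class attribute is written in the real page.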

Related

dom x path to grab href value [duplicate]

This question already has answers here:
How to extract a node attribute from XML using PHP's DOM Parser
(3 answers)
Closed 3 years ago.
I have the following html
<div class="logo">***® text.<sup>TM</sup></div>
I would like to get the value of href with php dom xpath, how would I accomplish that?
This is what I have tried:
$anchors = $domXpath->query("//div[@class='logo']/a");
foreach ($anchors as $a)
{
    print $a->nodeValue." - ".$a->getAttribute("href")."<br/>";
}
Here is the solution.
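$xpath = new DOMXPath($dom);
// query() returns a DOMNodeList, so take the first matching element
$link = $xpath->query('//div[@class="logo"]/a')->item(0);
if ($link !== null) {
    echo $link->getAttribute('href');
}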
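Another option is to select the attribute node itself. The anchor inside the div in the question was not preserved, so the markup below is a made-up stand-in, and $hrefs is just an illustrative variable name. Treat it as a sketch under those assumptions.

```php
<?php
// Hypothetical markup: the real anchor from the question is unknown.
$html = '<div class="logo"><a href="https://example.com/">Example<sup>TM</sup></a></div>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Select the href attribute nodes directly instead of the <a> elements.
$hrefs = [];
foreach ($xpath->query('//div[@class="logo"]/a/@href') as $attr) {
    $hrefs[] = $attr->nodeValue; // a DOMAttr exposes its value as nodeValue
}

print_r($hrefs);
```

Iterating the node list also copes with zero or several links, whereas calling a method on the first item needs a null check.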

How can I get a specific div from website? [duplicate]

This question already has answers here:
How do you parse and process HTML/XML in PHP?
(31 answers)
Closed 5 years ago.
I am trying to get a specific div element (i.e. with attribute id="vung_doc") from a website, but I get almost every element. Do you have any idea what's wrong?
$doc = new DOMDocument;
// We don't want to bother with white spaces
$doc->preserveWhiteSpace = true;
// Most HTML Developers are chimps and produce invalid markup...
$doc->strictErrorChecking = false;
$doc->recover = true;
$doc->loadHTMLFile('http://lightnovelgate.com/chapter/epoch_of_twilight/chapter_300');
$xpath = new DOMXPath($doc);
$query = "//*[@class='vung_doc']";
$entries = $xpath->query($query);
var_dump($entries->item(0)->textContent);
Actually, it appears that that one element, which has both id and class attributes with value vung_doc, has many paragraphs inside its text content. Perhaps you are thinking each paragraph should be in its own div element.
<div id="vung_doc" class="vung_doc" style="font-size: 18px;">
<p></p>
"Mayor song..."
In the screenshot at the bottom of this post, I added an outline style to that element, to show just how many paragraphs are within that element.
If you wanted to separate the paragraphs, you could use preg_split() to split on any new line characters:
$entries = $xpath->query($query);
foreach ($entries as $entry) {
    $paragraphs = preg_split("/[\r\n]+/s", $entry->textContent);
    foreach ($paragraphs as $paragraph) {
        if (trim($paragraph)) {
            echo '<b>paragraph:</b> '.$paragraph;
            break;
        }
    }
}
See a demonstration of this in this playground example. Note that before loading the HTML file, libxml_use_internal_errors() is called, to suppress the XML errors:
libxml_use_internal_errors(true);
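If you also want to see what the parser complained about, the buffered errors can be read back afterwards. This is only a sketch: the markup is a made-up, deliberately sloppy stand-in and does not reflect the real page's structure, and which errors are reported depends on the libxml version.

```php
<?php
// Hypothetical markup with a stray closing tag; the parser may report it.
$html = '<div id="vung_doc"><p>First paragraph</p></span><p>Second paragraph</p></div>';

$previous = libxml_use_internal_errors(true); // buffer errors instead of emitting warnings
$doc = new DOMDocument();
$doc->loadHTML($html);
$errors = libxml_get_errors();  // LibXMLError objects collected while parsing
libxml_clear_errors();          // empty the buffer
libxml_use_internal_errors($previous);

foreach ($errors as $error) {
    echo trim($error->message), ' (line ', $error->line, ")\n";
}

// The document is still usable after recovery.
$xpath = new DOMXPath($doc);
$paragraphs = [];
foreach ($xpath->query('//*[@id="vung_doc"]/p') as $p) {
    $paragraphs[] = trim($p->textContent);
}

print_r($paragraphs);
```

Restoring the previous setting afterwards keeps the change from leaking into unrelated code that relies on the default warnings.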
Screenshot of the target div element with outline added:
Change
$query = "//*[@class='vung_doc']";
to
$query = "//*[@id='vung_doc']";

Get part of get html code from file_get_content [duplicate]

This question already has answers here:
How do you parse and process HTML/XML in PHP?
(31 answers)
Closed 7 years ago.
I need to extract part of the HTML returned by file_get_contents(url).
This is what I do:
$variableee = file_get_contents("http://url.com/path/to/file");
echo $variableee;
Now $variableee holds the whole page's HTML.
The part I need is a table with the class name "table".
For example:
<div>text</div>
<span> text </span>
<table class="table">
<tr><td>Text that I need</td></tr>
</table>
How can I get it?
Sorry for my bad English.
If you want the data within PHP itself, use the built-in DOM parser:
<?php
$doc = new DOMDocument();
$doc->loadHTML($variableee);
$arr = $doc->getElementsByTagName("table"); // DOMNodeList Object
foreach ($arr as $item) { // DOMElement Object
    echo $item->nodeValue;
}
?>
EDIT: Parse using the class name with DOMXPath
$doc = new DOMDocument();
$doc->loadHTML($variableee);
$classname = 'table';
$a = new DOMXPath($doc);
$spans = $a->query("//*[contains(concat(' ', normalize-space(@class), ' '), ' $classname ')]");
foreach ($spans as $item) { // DOMElement Object
    echo $item->nodeValue;
}
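If you need the table's markup rather than only its text, a sketch along these lines may help. The string assigned to $variableee stands in for the downloaded page, and $markup and $cells are illustrative names.

```php
<?php
// Stand-in for the downloaded page; the question fetches it with file_get_contents().
$variableee = '<div>text</div><span> text </span>'
    . '<table class="table"><tr><td>Text that I need</td></tr></table>';

$doc = new DOMDocument();
$doc->loadHTML($variableee);
$xpath = new DOMXPath($doc);

// Restrict the class-token match to <table> elements.
$tables = $xpath->query(
    "//table[contains(concat(' ', normalize-space(@class), ' '), ' table ')]"
);

$markup = '';
$cells = [];
if ($tables->length > 0) {
    $table = $tables->item(0);
    // saveHTML() with a node argument serialises only that subtree.
    $markup = $doc->saveHTML($table);
    // A relative query (leading dot) searches inside the table only.
    foreach ($xpath->query('.//td', $table) as $td) {
        $cells[] = trim($td->textContent);
    }
}

echo $markup, "\n";
print_r($cells);
```

Passing the table as the context node keeps cells from other tables on the page out of the result.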

Can not get Xpath to fetch a nodeList [duplicate]

This question already has answers here:
Why does my XPath query (scraping HTML tables) only work in Firebug, but not the application I'm developing?
(2 answers)
Closed 8 years ago.
libxml_use_internal_errors(true);
$url = 'http://thepiratebay.is/browse/200/0/7';
$html = file_get_contents($url);
$dom = new \DOMDocument();
$dom->loadHTML($html);
$x = new \DOMXPath($dom);
$nodeList = $x->query('/html/body/div[2]/div[2]/table/tbody/tr');
foreach ($nodeList as $node) {
die(var_dump($node));
}
Gives me the error:
"Invalid argument supplied for foreach()"
Not sure why xpath doesn't work on that domain?
If I'm right you'd like to get all the titles in that table. I'd suggest an easier, yet more specific XPath query, i.e.
$nodeList = $x->query('//div[@class="detName"]');
See it in action
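One common reason a path copied from browser tools finds nothing, which may or may not be what happened here, is that browsers add a tbody element that is absent from the raw source. The sketch below uses made-up table markup to show the difference, and also guards against query() returning false.

```php
<?php
// Hypothetical table markup without an explicit <tbody>.
$html = '<html><body><table><tr><td>row 1</td></tr><tr><td>row 2</td></tr></table></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$x = new DOMXPath($dom);

// A path copied from browser dev tools typically includes tbody ...
$withTbody = $x->query('/html/body/table/tbody/tr');
// ... while the tree built from the raw source has the rows directly under table.
$withoutTbody = $x->query('/html/body/table/tr');

// query() returns false for an invalid expression, so check before iterating.
foreach ([$withTbody, $withoutTbody] as $nodeList) {
    if ($nodeList === false) {
        echo "query failed\n";
        continue;
    }
    echo $nodeList->length, " row(s)\n";
}
```

Queries that key on a class or id, like the one suggested above, avoid depending on the exact nesting altogether.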

php DOMDocument How to convert node value to string [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
How can I get an element's serialised HTML with PHP's DOMDocument?
PHP + DOMDocument: outerHTML for element?
I am trying to extract all img tags from a string. I am using:
$domimg = new DOMDocument();
@$domimg->loadHTML($body);
$images_all = $domimg->getElementsByTagName('img');
foreach ($images_all as $image) {
// do something
}
I want to put the src= values or even the complete img tags into an array or string.
Use saveXML() or saveHTML() on each node to add it to an array:
$img_links = array();
$domimg = new DOMDocument();
$domimg->loadHTML($body);
$images_all = $domimg->getElementsByTagName('img');
foreach ($images_all as $image) {
    // Append the XML or HTML of each to an array
    $img_links[] = $domimg->saveXML($image);
}
print_r($img_links);
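If only the src values are wanted, reading the attribute is enough. A minimal sketch, assuming a made-up $body and an illustrative $sources array:

```php
<?php
// Hypothetical input; in the question, $body holds the real markup.
$body = '<p>Intro</p><img src="a.png" alt="A"><p>More</p><img src="b.jpg"><img alt="no source">';

// Buffer parser warnings rather than silencing them with @.
$previous = libxml_use_internal_errors(true);
$domimg = new DOMDocument();
$domimg->loadHTML($body);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$sources = [];
foreach ($domimg->getElementsByTagName('img') as $image) {
    // hasAttribute() skips images without a src instead of storing empty strings.
    if ($image->hasAttribute('src')) {
        $sources[] = $image->getAttribute('src');
    }
}

print_r($sources);
```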
You could try a DOM parser like simplexml_load_string. Take a look at a similar answer I posted here:
Needle in haystack with array in PHP
