Retrieve the DOM from a variable with Simple HTML DOM Parser? - php

I'm using Simple HTML DOM Parser to retrieve information from a website with this code:
$html = file_get_html("http://www.example.com/");
$table = $html->find("div[class=table]");
foreach ( $table as $tabella ) {
$title = $tabella->find (".elementTitle");
echo "<h2>" . $title[0] -> plaintext . "</h2>";
$minisito = $tabella->find ("h1[class=elementTitle] a");
echo "<p>" . $minisito[0] -> href . "</p>";
}
Now I need to extract other content from the page behind each of these URLs, i.e. $minisito[0]->href.
How can I call file_get_html on each new URL and store the result in another variable so I can extract data from it?
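One way to approach this is to load a second DOM inside the loop, once per link. With Simple HTML DOM that would presumably mean calling file_get_html($minisito[0]->href) and running find() on the result. The sketch below shows the same pattern with PHP's built-in DOMDocument so that it runs offline. The $pages array, the loadDom() helper, the /one URL and the p.detail selector are all hypothetical stand-ins, not part of the original site.

```php
<?php
// Hypothetical in-memory "site" so the sketch runs without network access.
$pages = [
    'http://www.example.com/' =>
        '<div class="table"><h1 class="elementTitle">'
        . '<a href="http://www.example.com/one">One</a></h1></div>',
    'http://www.example.com/one' =>
        '<html><body><p class="detail">Details for one</p></body></html>',
];

// Hypothetical loader. On a live site, replace the array lookup
// with file_get_contents($url).
function loadDom(string $url, array $pages): DOMXPath
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // keep parser warnings quiet
    $doc->loadHTML($pages[$url]);
    libxml_clear_errors();
    return new DOMXPath($doc);
}

$details = [];
$index = loadDom('http://www.example.com/', $pages);

// First pass: collect the links from the listing page.
foreach ($index->query('//div[@class="table"]//h1[@class="elementTitle"]/a') as $a) {
    $href = $a->getAttribute('href');

    // Second pass: load the linked page into its own DOM and query it.
    $sub = loadDom($href, $pages);
    $node = $sub->query('//p[@class="detail"]')->item(0);
    $details[$href] = $node ? trim($node->textContent) : null;
}

print_r($details);
```

If the href values are relative, they would need to be resolved against the base URL before loading.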

Related

Get td contain table from library simplehtmldom

simple_html_dom does not work in page "https://eldni.com/buscar-por-dni?dni=44626399"
<?php
include_once './simple_html_dom./HtmlWeb.php';
use simplehtmldom\HtmlWeb;
// get DOM from URL or file
$doc = new HtmlWeb();
$html = $doc->load('https://eldni.com/buscar-por-dni?dni=44626399');
foreach($html->find('td') as $e)
echo $e->plaintext . '<br>' . PHP_EOL;
?>
I want the plain text of every td cell in the table.

How to access an HTML attribute and retrieve data from it in PHP?

I'm new to PHP and I would like to know how to retrieve data from an attribute of an HTML element, such as src.
It's very easy to do that in jQuery:
$('img').attr('src');
But I have no idea how to do it in PHP (if it is possible).
Here's an example I'm working on:
I loaded $result into SimpleXMLElement and stored it into $xml:
$xml = simplexml_load_string($result) or die("Error: Cannot create object");
Then used foreach to loop over all elements:
foreach($xml->links->link as $link){
echo 'Image: ' . $link->{'link-code-html'}[0] . '</br>';
// returns something similar to: <a href='....'><img src='....'></a>
}
Inside of the foreach I'm trying to access links (src) in img.
Is there a way to access the src of the img nested inside the a? The nesting is clear when the value is output to the screen:
echo 'Image: ' . $link->{'link-code-html'}[0] . '</br>';
I would do this with the built-in DOMDocument and DOMXPath APIs, and then you can use the getAttribute method on any matching img node:
$doc = new DOMDocument();
// Load some example HTML. If you need to load from file, use ->loadHTMLFile
$doc->loadHTML("<a href='abc.com'><img src='ping1.png'></a>
<a href='def.com'><img src='ping2.png'></a>
<a href='ghi.com'>something else</a>");
$xpath = new DOMXpath($doc);
// Collect the images that are children of anchor elements
$imgs = $xpath->query("//a/img");
foreach($imgs as $img) {
echo "Image: " . $img->getAttribute("src") . "\n";
}
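Applied to the question's loop, the same idea could be wrapped in a small helper that takes one HTML fragment and returns its image sources. This is only a sketch: imgSources() is a hypothetical name, and it assumes that each link-code-html value is a string of HTML markup like the one shown in the question's comment.

```php
<?php
// Hypothetical helper: returns the src of every <img> in an HTML fragment.
function imgSources(string $fragment): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // keep parser warnings quiet
    $doc->loadHTML($fragment);
    libxml_clear_errors();

    $sources = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $sources[] = $img->getAttribute('src');
    }
    return $sources;
}

// Sample fragment shaped like the one in the question.
$fragment = "<a href='abc.com'><img src='ping1.png'></a>";
print_r(imgSources($fragment));
```

Inside the foreach it would be called as imgSources((string) $link->{'link-code-html'}[0]), casting the SimpleXMLElement to a string first.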

PHPHtmlParser getAttribute does not work for custom attributes

I have some HTML with custom attributes and I am trying to parse it with the PHPHtmlParser component. The whole project is built on this component. Here is an example of the problem.
use PHPHtmlParser\Dom;
class Parsemydiv {
function parseAttr()
{
$str='<div otop="20" oleft="20" name="info">
<img src="example.jpg">
</div>';
$dom = new Dom();
$dom->loadStr($str);
$otop = $dom->getAttribute("otop");
$oleft = $dom->getAttribute("oleft");
$name = $dom->getAttribute("name");
echo "Name: " . $name . PHP_EOL;
echo "Top: " . $otop . PHP_EOL;
echo "Left: " . $oleft . PHP_EOL;
}
}
Output is:
Name: info
Top:
Left:
getAttribute cannot get custom attributes.
Why use a 3rd party library to parse the DOM when PHP has built-in support for this? I suggest learning the native functions instead:
$str='<div otop="20" oleft="15" name="info">
<img src="example.jpg">
</div>';
$doc = new DOMDocument();
$doc->loadHTML($str);
$div = $doc->getElementsByTagName('div')[0];
$otop = $div->getAttribute('otop');
$oleft = $div->getAttribute('oleft');
echo "otop=$otop, oleft=$oleft"; //otop=20, oleft=15
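When the attribute names are not known in advance, the element's attributes collection can be iterated instead of calling getAttribute once per name. A minimal sketch using the same sample markup (the $attrs array is just an illustrative name):

```php
<?php
$str = '<div otop="20" oleft="15" name="info"><img src="example.jpg"></div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings quiet
$doc->loadHTML($str);
libxml_clear_errors();

$div = $doc->getElementsByTagName('div')->item(0);

// Collect every attribute on the div, custom or standard, as name => value.
$attrs = [];
foreach ($div->attributes as $attr) {
    $attrs[$attr->nodeName] = $attr->nodeValue;
}

print_r($attrs);
```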

PHP DOM saveHTML changes formatting

I load an external HTML page with loadHTML.
Then I replace two child nodes and remove one.
The saveHTML() method changes the markup, and I do not want that.
It moves the closing </head> tag earlier; on the original page the closing head tag comes further down, after a few more tags.
It also changes the body tag from <body class="something"> to just <body>.
How can I save the page using PHP DOM so that it respects all the positioning and attributes?
Here is the code:
$document = new DOMDocument();
@$document->loadHTML($contents); // @ silences parser warnings on imperfect markup
$login_signup = $document->getElementById('loginBar')->getElementsByTagName('div')->item(1);
$login_signup->removeChild($login_signup->getElementsByTagName('h3')->item(0));
$todays_a = $document->createElement('a', 'Todays Digest');
$todays_a->setAttribute('href', $domain . $digest_newsletter . date('mdy') . '.html');
$previous_a = $document->createElement('a', 'Previous Digest');
$previous_a->setAttribute('href', $domain . $digest_newsletter . date('mdy', strtotime('-1 day')) . '.html');
$todays_div = $document->getElementById('myDiv');
$todays_div->replaceChild($todays_a, $todays_div->getElementsByTagName('script')->item(0));
$previous_div = $document->getElementById('myDiv2');
$previous_div->replaceChild($previous_a, $previous_div->getElementsByTagName('script')->item(0));
$contents = $document->saveHTML($document);

JSON escaping slashes in a file path, won't display image

I'm using PHP and a for loop to prepare data into proper html and output the data using JSON to be appended and displayed on the page. JSON slash escaping is causing the html to be viewed incorrectly by the browser.
This is my PHP for loop:
$json = '<div id="rsec3" class="rsec">';
for($i=0; $i<count($array); $i++)
{
$coverart = $array[$i]['cover'];
if(empty($coverart))
{
$coverart = "nocoverart.gif";
}
$json .= '<div><img="/video/cover/thumbs/' . $cover . '"></div>';
}
$json .= '</div>';
$json = json_encode(array('ok' => 'ok', 'html' => $json));
echo $json;
This is my javascript parsing and appending the json:
$.get('/index_get.php?iid='+this.id,function(data){
$('#indload').hide();
js=jQuery.parseJSON(data);
$('#indr').append(js.html);
});
This is what the browser is displaying, a bunch of useless jargon, and it is appending a </img="> on its own?
<img=" video cover thumbs img.png"></img=">
How can I prevent this from occurring and have the image displayed properly?
I think the problem is the invalid <img> tag in the PHP code. The src attribute name is missing, so the browser treats img=" as the tag name, and the tag is not closed. The loop also stores the file name in $coverart but prints the undefined $cover.
Change the following
$json .= '<div><img="/video/cover/thumbs/' . $cover . '"></div>';
to
$json .= '<div><img src="/video/cover/thumbs/' . $coverart . '" /></div>';
You also need to close the image tag in your string, and it still needs the src attribute:
$json .= '<div><img src="/video/cover/thumbs/' . $coverart . '"/></div>';
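The escaped slashes themselves should not be the cause: json_encode writes / as \/ by default, and any JSON parser turns it back into / when decoding. The sketch below checks that round trip and shows the JSON_UNESCAPED_SLASHES flag for cases where the raw JSON text needs to be easier to read. The sample path and the variable names are illustrative only.

```php
<?php
$html = '<div><img src="/video/cover/thumbs/img.png" /></div>';

// Default encoding: every slash in the JSON text is written as \/ ...
$encoded = json_encode(['ok' => 'ok', 'html' => $html]);
echo $encoded, PHP_EOL;

// ...but decoding restores the original string exactly.
$decoded = json_decode($encoded, true);
var_dump($decoded['html'] === $html);

// Optional: keep slashes unescaped in the JSON text itself.
$unescaped = json_encode(['ok' => 'ok', 'html' => $html], JSON_UNESCAPED_SLASHES);
echo $unescaped, PHP_EOL;
```

Either form decodes to the same HTML on the JavaScript side, so the flag is a readability choice rather than a fix.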
