PHPHtmlParser getAttribute does not work for custom attributes - php

I have some HTML with custom attributes and am trying to parse it with the PHPHtmlParser component. The whole project is built on this component. Here is an example of the problem:
use PHPHtmlParser\Dom;
class Parsemydiv {
    function parseAttr()
    {
        $str = '<div otop="20" oleft="20" name="info">
<img src="example.jpg">
</div>';
        $dom = new Dom();
        $dom->loadStr($str);
        $otop = $dom->getAttribute("otop");
        $oleft = $dom->getAttribute("oleft"); // was missing, so $oleft was undefined
        $name = $dom->getAttribute("name");
        echo "Name: " . $name . PHP_EOL;
        echo "Top: " . $otop . PHP_EOL;
        echo "Left: " . $oleft . PHP_EOL;
    }
}
Output is:
Name: info
Top:
Left:
getAttribute cannot get custom attributes.

Why use a 3rd party library to parse the DOM when PHP has built-in support for this? I suggest learning the native functions instead:
$str='<div otop="20" oleft="15" name="info">
<img src="example.jpg">
</div>';
$doc = new DOMDocument();
$doc->loadHTML($str);
$div = $doc->getElementsByTagName('div')[0];
$otop = $div->getAttribute('otop');
$oleft = $div->getAttribute('oleft');
echo "otop=$otop, oleft=$oleft"; //otop=20, oleft=15
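To check which attributes the native parser actually sees, one option is to walk the element's attribute map. The following is a minimal sketch, assuming the same markup as above; the helper name firstDivAttributes is hypothetical and not part of any library.

```php
<?php
// Hypothetical helper: collect every attribute of the first <div>,
// custom ones included, into a name => value array.
function firstDivAttributes(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $div = $doc->getElementsByTagName('div')->item(0);
    if ($div === null) {
        return []; // no <div> in the input
    }

    $attributes = [];
    foreach ($div->attributes as $attr) {
        $attributes[$attr->name] = $attr->value;
    }
    return $attributes;
}

$str = '<div otop="20" oleft="15" name="info"><img src="example.jpg"></div>';
print_r(firstDivAttributes($str));
```

DOMDocument does not distinguish between standard and custom attribute names here, so otop and oleft come back alongside name.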

Related

How to show image element from XML with php

I've tried what others have posted on Stack Overflow, but it doesn't seem to work for me. Could anyone help, please?
I have this xml document with a structure of:
<surveys>
<survey>
<section>
<page>
<reference>P1</reference>
<image><! [CDATA[<img src="imagepath">]]></image>
</page>
<page>
<reference>P2</reference>
<image><! [CDATA[<img src="imagepath">]]></image>
</page>
</section>
</survey>
</surveys>
Then this is my PHP code to get the image to show up:
function xml($survey) {
    $result = "<surveys></surveys>";
    $xml_surveys = new SimpleXMLExtended($result);
    $xml_survey = $xml_surveys->addChild('survey');
    if ("" != $survey['id']) {
        $xml_survey->addChildData($survey['image']);
    }
}
This is my other file:
$image = "";
if ("" != $image) {
    $image = '<div class="image_holder">' . $image . '</div>';
    echo $image;
}
I'm not sure how to move forward with this, so any help would be appreciated.
It looks like you want to fetch the image for a specific survey id. You can use DOM plus XPath to fetch it directly:
$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXpath($document);
$expression = 'string(
/surveys/survey/section/page[reference="P1"]/image
)';
$imageForSurvey = $xpath->evaluate($expression);
var_dump($imageForSurvey);
Output:
string(22) "<img src="imagepath1">"
The content of the CDATA section inside the image element is a separate HTML fragment. You can output it directly if you trust the source of the XML, or you can parse it as HTML:
$htmlFragment = new DOMDocument();
$htmlFragment->loadHTML($imageForSurvey);
$htmlXpath= new DOMXpath($htmlFragment);
var_dump(
$htmlXpath->evaluate('string(//img/@src)')
);
Output:
string(10) "imagepath1"
Your example-logic is trying to create XML, not load it ;-)
First you need to find the path and/or address to the XML file, like:
$filePath = __DIR__ . '/my-file.xml';
Then load XML:
<?php
$filePath = __DIR__ . '/my-file.xml';
$document = simplexml_load_file($filePath);
$surveyCount = 0;
foreach ($document->survey as $survey)
{
    $surveyCount = $surveyCount + 1;
    echo '<h1>Survey #' . $surveyCount . '</h1>';
    foreach ($survey->section->page as $page)
    {
        echo 'Page reference: ' . $page->reference . '<br>';
        // Decode your image: the CDATA text is an HTML fragment.
        $imageHtml = (string) $page->image;
        $dom = new DOMDocument();
        $dom->loadHTML($imageHtml);
        $xpath = new DOMXpath($dom);
        // Read the src attribute of the first img element.
        $image = $xpath->evaluate('string(//img/@src)');
        if (!empty($image)) {
            echo '<div class="image_holder">' . $image . '</div>';
        }
        echo "<br>";
    }
}
?>
Note that you should replace <! [CDATA[ with <![CDATA[ (no space);
otherwise you will probably get a "StartTag: invalid element name" error.
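If the XML cannot be corrected at its source, the stray space could be repaired in memory before parsing. This is only a sketch under that assumption; the inline sample string stands in for the real file, and the variable names are illustrative.

```php
<?php
// Sample XML with the stray space in the CDATA opener, as in the question.
$raw = '<surveys><survey><section><page>'
     . '<reference>P1</reference>'
     . '<image><! [CDATA[<img src="imagepath">]]></image>'
     . '</page></section></survey></surveys>';

// Repair the malformed opener so the parser sees a real CDATA section.
$fixed = str_replace('<! [CDATA[', '<![CDATA[', $raw);

$document = simplexml_load_string($fixed);
$page = $document->survey->section->page[0];

// Casting the element to string yields the text held in the CDATA section.
$imageHtml = (string) $page->image;
echo $imageHtml . PHP_EOL;
```

A blanket str_replace like this would also rewrite the same character sequence if it ever appeared in real content, so fixing the generator of the XML is the safer long-term option.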

Get the td contents of a table with the simplehtmldom library

simple_html_dom does not work on the page "https://eldni.com/buscar-por-dni?dni=44626399"
<?php
include_once './simple_html_dom./HtmlWeb.php';
use simplehtmldom\HtmlWeb;
// get DOM from URL or file
$doc = new HtmlWeb();
$html = $doc->load('https://eldni.com/buscar-por-dni?dni=44626399');
foreach($html->find('td') as $e)
echo $e->plaintext . '<br>' . PHP_EOL;
?>
I want the plain text of each td in the table.

How to access an HTML attribute and retrieve data from it in PHP?

I'm new to PHP and I would like to know how to retrieve data from an HTML element such as an src?
It's very easy to do that in jQuery:
$('img').attr('src');
But I have no idea how to do it in PHP (if it is possible).
Here's an example I'm working on:
I loaded $result into SimpleXMLElement and stored it into $xml:
$xml = simplexml_load_string($result) or die("Error: Cannot create object");
Then used foreach to loop over all elements:
foreach($xml->links->link as $link){
echo 'Image: ' . $link->{'link-code-html'}[0] . '</br>';
// returns something similar to: <a href='....'><img src='....'></a>
}
Inside of the foreach I'm trying to access links (src) in img.
Is there a way to access the src of the img nested inside the a? The markup is visible when I output it to the screen:
echo 'Image: ' . $link->{'link-code-html'}[0] . '</br>';
I would do this with the built-in DOMDocument and DOMXPath APIs, and then you can use the getAttribute method on any matching img node:
$doc = new DOMDocument();
// Load some example HTML. If you need to load from file, use ->loadHTMLFile
$doc->loadHTML("<a href='abc.com'><img src='ping1.png'></a>
<a href='def.com'><img src='ping2.png'></a>
<a href='ghi.com'>something else</a>");
$xpath = new DOMXpath($doc);
// Collect the images that are children of anchor elements
$imgs = $xpath->query("//a/img");
foreach ($imgs as $img) {
    echo "Image: " . $img->getAttribute("src") . "\n";
}
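To tie this back to the loop in the question, each link-code-html value can be cast to a string and parsed as its own HTML fragment. The sketch below assumes the snippets sit in CDATA sections inside a made-up response root element; the sample XML and the function name imageSources are hypothetical.

```php
<?php
// Hypothetical sample shaped like the question's data: every <link> holds
// an HTML snippet inside <link-code-html>.
$result = <<<XML
<response>
  <links>
    <link><link-code-html><![CDATA[<a href='abc.com'><img src='ping1.png'></a>]]></link-code-html></link>
    <link><link-code-html><![CDATA[<a href='def.com'><img src='ping2.png'></a>]]></link-code-html></link>
  </links>
</response>
XML;

// Hypothetical helper: return the src of every img nested in an anchor.
function imageSources(string $xmlString): array
{
    $xml = simplexml_load_string($xmlString);
    $sources = [];
    foreach ($xml->links->link as $link) {
        $snippet = (string) $link->{'link-code-html'};
        if ($snippet === '') {
            continue; // nothing to parse for this link
        }
        $doc = new DOMDocument();
        $doc->loadHTML($snippet);
        $xpath = new DOMXPath($doc);
        foreach ($xpath->query('//a/img') as $img) {
            $sources[] = $img->getAttribute('src');
        }
    }
    return $sources;
}

print_r(imageSources($result));
```

Parsing each snippet separately keeps the XML structure and the embedded HTML apart, which seems to match how the data is delivered in the question.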

Retrieve the DOM from a variable with Simple HTML DOM Parser?

I'm using Simple HTML DOM Parser to retrieve information from a website with this code:
$html = file_get_html("http://www.example.com/");
$table = $html->find("div[class=table]");
foreach ($table as $tabella) {
    $title = $tabella->find(".elementTitle");
    echo "<h2>" . $title[0]->plaintext . "</h2>";
    $minisito = $tabella->find("h1[class=elementTitle] a");
    echo "<p>" . $minisito[0]->href . "</p>";
}
Now I need to extract other pieces of content from the URL contained in $minisito[0]->href.
How can I create another variable using file_get_html to extract data from these new URLs?

Filter out link address from DOM results

I'm working with a DOM parser that grabs links from a website by the class thumbnail. This returns a list of links. They are then converted to their image state and shown on the page. The problem I'm having is I have 2 different links that are getting returned:
http://i.imgur.com/randomstuffhere
AND
http://imgur.com/randomstuffhere
I need to filter out the links that do NOT contain i.imgur.com. If a link is an imgur link but lacks the i. prefix, it should be filtered out and not shown.
I have this so far and I cannot figure out where I've gone wrong... Any suggestions?
<?php
$html = file_get_contents('http://www.reddit.com/r/funny');
$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$hyperlinks = $xpath->evaluate('//a[@class="thumbnail "]');
foreach ($hyperlinks as $hyperlink) {
    if (preg_match("/http://imgur.com/", $hyperlink->getAttribute('href'))) {
    }
    else {
        echo "<img style='padding-left:30%' width=\"500\" src=\"" . $hyperlink->getAttribute('href') . "\" alt=\"\" />";
        echo "<br />";
    }
}
?>
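You need to escape the // in http:// as \/\/, because / is also the delimiter of your pattern (or pick a different delimiter, such as #).
For a plain substring check you should probably use strpos, though:
if (strpos($hyperlink->getAttribute('href'), 'http://i.imgur.com/') !== FALSE) {
    echo "This is an i.imgur.com link!";
}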
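An alternative to matching on the raw string is to compare the host part of each URL, which also covers https links. This is a minimal sketch; the helper name isDirectImgurLink and the third sample URL are made up for illustration.

```php
<?php
// Hypothetical helper: true only for links whose host is exactly i.imgur.com.
function isDirectImgurLink(string $url): bool
{
    $host = parse_url($url, PHP_URL_HOST);
    // parse_url returns null or false when no host can be extracted.
    return is_string($host) && strtolower($host) === 'i.imgur.com';
}

$links = [
    'http://i.imgur.com/randomstuffhere',
    'http://imgur.com/randomstuffhere',
    'https://example.com/i.imgur.com/picture', // host is example.com, so rejected
];

// Keep only the direct image links and reindex the result.
$direct = array_values(array_filter($links, 'isDirectImgurLink'));
print_r($direct);
```

Inside the foreach loop from the question, the same check could decide whether to echo the img tag for $hyperlink->getAttribute('href').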
