I have some HTML with custom attributes and am trying to parse it with the PHPHtmlParser component. The whole project is built on this component. Here is an example of the problem.
use PHPHtmlParser\Dom;
class Parsemydiv {
function parseAttr()
{
$str='<div otop="20" oleft="20" name="info">
<img src="example.jpg">
</div>';
$dom = new Dom();
$dom->loadStr($str);
$otop = $dom->getAttribute("otop");
$oleft = $dom->getAttribute("oleft");
$name = $dom->getAttribute("name");
echo "Name: " . $name . PHP_EOL;
echo "Top: " . $otop . PHP_EOL;
echo "Left: " . $oleft . PHP_EOL;
}
}
Output is:
Name: info
Top:
Left:
It seems that getAttribute cannot read custom attributes.
Why use a 3rd party library to parse the DOM when PHP has built-in support for this? I suggest learning the native functions instead:
$str='<div otop="20" oleft="15" name="info">
<img src="example.jpg">
</div>';
$doc = new DOMDocument();
$doc->loadHTML($str);
$div = $doc->getElementsByTagName('div')[0];
$otop = $div->getAttribute('otop');
$oleft = $div->getAttribute('oleft');
echo "otop=$otop, oleft=$oleft"; //otop=20, oleft=15
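If you do not know the attribute names in advance, a minimal sketch like the one below lists every attribute on the element, custom ones included. The function name firstElementAttributes is hypothetical and only for illustration; the calls it uses (loadHTML, getElementsByTagName, the attributes property, libxml_use_internal_errors) are part of PHP's DOM and libxml extensions.

```php
<?php
// Hypothetical helper (not part of DOMDocument): returns every attribute of
// the first element with the given tag name as a name => value array.
function firstElementAttributes(string $html, string $tag): array
{
    $doc = new DOMDocument();
    // Collect libxml parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $element = $doc->getElementsByTagName($tag)->item(0);
    if ($element === null) {
        return []; // no such element in the markup
    }

    $attributes = [];
    foreach ($element->attributes as $attribute) {
        $attributes[$attribute->nodeName] = $attribute->nodeValue;
    }
    return $attributes;
}

$str = '<div otop="20" oleft="15" name="info"><img src="example.jpg"></div>';
print_r(firstElementAttributes($str, 'div'));
```

Calling item(0) instead of indexing with [0] is only a stylistic choice here; it returns null when nothing matches, which makes the empty case easy to handle.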
I've tried what others have posted on Stack Overflow, but it doesn't seem to work for me. Could anyone help, please?
I have this xml document with a structure of:
<surveys>
<survey>
<section>
<page>
<reference>P1</reference>
<image><! [CDATA[<img src="imagepath">]]></image>
</page>
<page>
<reference>P2</reference>
<image><! [CDATA[<img src="imagepath">]]></image>
</page>
</section>
</survey>
</surveys>
Then this is my PHP code to get the image to show up:
function xml($survey){
$result = "<surveys></surveys>";
$xml_surveys = new SimpleXMLExtended($result);
$xml_survey = $xml_surveys->addChild('survey');
if ("" != $survey['id']){
$xml_survey->addChildData($survey['image']);
}
This is my other file:
$image = "";
if("" != $image){
$image = '<div class="image_holder">' . $image . '</div>';
echo $image;
}
I'm not sure how to move forward with this, so any help would be appreciated.
It looks like you would like to fetch the image for a specific page reference. You can use DOM and XPath to fetch it directly:
$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXpath($document);
$expression = 'string(
/surveys/survey/section/page[reference="P1"]/image
)';
$imageForSurvey = $xpath->evaluate($expression);
var_dump($imageForSurvey);
Output:
string(22) "<img src="imagepath1">"
The content of the CDATA section inside the image element is a separate HTML fragment. If you trust the source of the XML you can use it directly; otherwise, parse it as HTML:
$htmlFragment = new DOMDocument();
$htmlFragment->loadHTML($imageForSurvey);
$htmlXpath= new DOMXpath($htmlFragment);
var_dump(
$htmlXpath->evaluate('string(//img/@src)')
);
Output:
string(10) "imagepath1"
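The two steps can be combined into one function. This is only a sketch under some assumptions: the name imageSrcForPage is made up for illustration, and the XML is assumed to follow the structure from the question with a well-formed <![CDATA[ marker (no space). It compares the reference in PHP rather than building it into the XPath expression, so a reference containing quotes cannot break the query.

```php
<?php
// Hypothetical helper: returns the src of the <img> stored as CDATA in the
// <image> element of the page with the given reference, or null if absent.
function imageSrcForPage(string $xml, string $reference): ?string
{
    $document = new DOMDocument();
    $document->loadXML($xml);
    $xpath = new DOMXPath($document);

    // Step 1: find the HTML fragment of the matching page.
    $fragment = '';
    foreach ($xpath->query('/surveys/survey/section/page') as $page) {
        if ($xpath->evaluate('string(reference)', $page) === $reference) {
            $fragment = $xpath->evaluate('string(image)', $page);
            break;
        }
    }
    if (trim($fragment) === '') {
        return null; // no such page, or the page has no image
    }

    // Step 2: parse the fragment as HTML and read the src attribute.
    $html = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    $html->loadHTML($fragment);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $src = (new DOMXPath($html))->evaluate('string(//img/@src)');
    return $src === '' ? null : $src;
}

$xml = '<surveys><survey><section>'
     . '<page><reference>P1</reference><image><![CDATA[<img src="imagepath1">]]></image></page>'
     . '<page><reference>P2</reference><image><![CDATA[<img src="imagepath2">]]></image></page>'
     . '</section></survey></surveys>';
var_dump(imageSrcForPage($xml, 'P2'));
```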
Your example-logic is trying to create XML, not load it ;-)
First, determine the path to the XML file, for example:
$filePath = __DIR__ . '/my-file.xml';
Then load XML:
<?php
$filePath = __DIR__ . '/my-file.xml';
$document = simplexml_load_file($filePath);
$surveyCount = 0;
foreach($document->survey as $survey)
{
$surveyCount = $surveyCount + 1;
echo '<h1>Survey #' . $surveyCount . '</h1>';
foreach($survey->section->page as $page)
{
echo 'Page reference: ' . $page->reference . '<br>';
// Decode your image.
$imageHtml = $page->image;
$dom = new DOMDocument();
$dom->loadHTML($imageHtml);
$xpath= new DOMXpath($dom);
$image = $xpath->evaluate('string(//img/@src)');
if(!empty($image)) {
echo '<div class="image_holder">' . $image . '</div>';
}
echo "<br>";
}
}
?>
Note that you should replace <! [CDATA[ with <![CDATA[ (without the space);
otherwise you will probably get a "StartTag: invalid element name" error.
simple_html_dom does not work in page "https://eldni.com/buscar-por-dni?dni=44626399"
<?php
include_once './simple_html_dom./HtmlWeb.php';
use simplehtmldom\HtmlWeb;
// get DOM from URL or file
$doc = new HtmlWeb();
$html = $doc->load('https://eldni.com/buscar-por-dni?dni=44626399');
foreach($html->find('td') as $e)
echo $e->plaintext . '<br>' . PHP_EOL;
?>
I want the plain text of the td cells in the table.
I'm new to PHP and I would like to know how to retrieve data from an HTML element such as an src?
It's very easy to do that in jQuery:
$('img').attr('src');
But I have no idea how to do it in PHP (if it is possible).
Here's an example I'm working on:
I loaded $result into SimpleXMLElement and stored it into $xml:
$xml = simplexml_load_string($result) or die("Error: Cannot create object");
Then used foreach to loop over all elements:
foreach($xml->links->link as $link){
echo 'Image: ' . $link->{'link-code-html'}[0] . '</br>';
// returns something similar to: <a href='....'><img src='....'></a>
}
Inside of the foreach I'm trying to access links (src) in img.
Is there a way to access the src of the img nested inside the a element? The markup is visible when I output it to the screen:
echo 'Image: ' . $link->{'link-code-html'}[0] . '</br>';
I would do this with the built-in DOMDocument and DOMXPath APIs, and then you can use the getAttribute method on any matching img node:
$doc = new DOMDocument();
// Load some example HTML. If you need to load from file, use ->loadHTMLFile
$doc->loadHTML("<a href='abc.com'><img src='ping1.png'></a>
<a href='def.com'><img src='ping2.png'></a>
<a href='ghi.com'>something else</a>");
$xpath = new DOMXpath($doc);
// Collect the images that are children of anchor elements
$imgs = $xpath->query("//a/img");
foreach($imgs as $img) {
echo "Image: " . $img->getAttribute("src") . "\n";
}
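Applied to your case, the same idea can be wrapped in a small function that takes one HTML string and returns the src values it finds. This is a sketch, not a drop-in solution: imageSources is a hypothetical name, and it assumes each link-code-html value is an HTML fragment like the one shown in your comment.

```php
<?php
// Hypothetical helper: returns the src of every <img> that is a direct
// child of an <a> element in the given HTML fragment.
function imageSources(string $html): array
{
    $doc = new DOMDocument();
    // Keep libxml parser warnings out of the page output.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sources = [];
    foreach ((new DOMXPath($doc))->query('//a/img[@src]') as $img) {
        $sources[] = $img->getAttribute('src');
    }
    return $sources;
}

$linkHtml = "<a href='abc.com'><img src='ping1.png'></a>";
print_r(imageSources($linkHtml));
```

Inside your foreach you would presumably pass the element cast to a string, for example imageSources((string) $link->{'link-code-html'}[0]), since loadHTML expects a string rather than a SimpleXMLElement.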
I'm using Simple HTML DOM Parser to retrieve information from a website with this code:
$html = file_get_html("http://www.example.com/");
$table = $html->find("div[class=table]");
foreach ( $table as $tabella ) {
$title = $tabella->find (".elementTitle");
echo "<h2>" . $title[0] -> plaintext . "</h2>";
$minisito = $tabella->find ("h1[class=elementTitle] a");
echo "<p>" . $minisito[0] -> href . "</p>";
}
Now I need to extract further content from the page behind each of these URLs, i.e. the one in $minisito[0] -> href.
How can I create another variable with file_get_html to extract data from these new URLs?
I'm working with a DOM parser that grabs links from a website by the class thumbnail. This returns a list of links. They are then converted to their image state and shown on the page. The problem I'm having is I have 2 different links that are getting returned:
http://i.imgur.com/randomstuffhere
AND
http://imgur.com/randomstuffhere
I need to filter out the links that do not contain i.imgur.com. If a link is an imgur link but lacks the i. prefix, it should not be shown.
I have this so far and I cannot figure out where I've gone wrong... Any suggestions?
<?php
$html = file_get_contents('http://www.reddit.com/r/funny');
$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$hyperlinks = $xpath->evaluate('//a[@class="thumbnail "]');
foreach($hyperlinks as $hyperlink) {
if (preg_match("/http://imgur.com/", $hyperlink->getAttribute('href'))){
}
else{
echo "<img style='padding-left:30%' width=\"500\" src=\"" . $hyperlink->getAttribute('href') . "\" alt=\"\" />";
echo "<br />";
}
}
?>
You need to escape the // in http:// with \/\/.
You should probably use strpos, though.
if(strpos($hyperlink->getAttribute('href'), 'http://i.imgur.com/') !== FALSE){
echo "This is an i.imgur.com link!";
}
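If you want to be stricter than a substring match, you could compare the host name instead. The sketch below uses parse_url; the function name isDirectImgurLink is hypothetical, and it assumes the href values are absolute URLs.

```php
<?php
// Hypothetical helper: true only when the URL's host is exactly i.imgur.com.
function isDirectImgurLink(string $url): bool
{
    // parse_url returns null when there is no host and false on a badly
    // malformed URL, so check the type before comparing.
    $host = parse_url($url, PHP_URL_HOST);
    return is_string($host) && strtolower($host) === 'i.imgur.com';
}

$links = [
    'http://i.imgur.com/randomstuffhere',
    'http://imgur.com/randomstuffhere',
];
foreach ($links as $link) {
    echo $link . ' => ' . (isDirectImgurLink($link) ? 'show' : 'skip') . PHP_EOL;
}
```

Unlike strpos, this does not match a URL that merely contains http://i.imgur.com/ somewhere in its path or query string.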