nested selector failed in using simple html dom parser - php

I want to get the link and scrape its content, but I can't even reach it. What's wrong with my nested selector?
My PHP:
$dom = file_get_html('http://mojim.com/%E5%BF%83%E8%B7%B3.html?t3');
$tables = $dom->find('.iB');
$firstRow = $tables->find('tr',1)->find('td',4);
foreach ($firstRow as $value) {
echo $value;
}
?>
Here is what the DOM looks like:

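The problem is only in how you traverse to the right element. find() without an index returns an array of matches, so you can't chain another find() onto it; pass an index to get a single element at each step.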
Example:
$dom = file_get_html('http://mojim.com/%E5%BF%83%E8%B7%B3.html?t3');
$firstRow = $dom->find('table.iB', 0)->find('tr', 1)->find('td', 3);
$link = $firstRow->find('a', 0);
echo $link->href . '<br/>' . $link->title;
Should output:
/twy100015x34x8.htm
心跳 歌詞 王力宏
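For comparison, the same step-by-step traversal can be sketched with PHP's built-in DOM extension instead of Simple HTML DOM. This is only an illustration: the inline HTML is a made-up stand-in for the real page, and the variable names are assumptions. The point it shows is the same as above: pick exactly one node at each level before descending further.

```php
<?php
// Hypothetical stand-in for the real page: one table with class "iB" whose
// second row has a link in its fourth cell.
$html = <<<'HTML'
<table class="iB">
  <tr><td>h1</td><td>h2</td><td>h3</td><td>h4</td></tr>
  <tr><td>1</td><td>x</td><td>y</td><td><a href="/twy100015x34x8.htm">song</a></td></tr>
</table>
HTML;

libxml_use_internal_errors(true);   // keep parser warnings out of the output
$dom = new DOMDocument();
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);
// query() returns a node list, so select one table before going deeper.
$table = $xpath->query("//table[@class='iB']")->item(0);
$row   = $table->getElementsByTagName('tr')->item(1);   // second row (0-based)
$cell  = $row->getElementsByTagName('td')->item(3);     // fourth cell (0-based)
$link  = $cell->getElementsByTagName('a')->item(0);     // first link in that cell

$href = $link->getAttribute('href');
echo $href . "\n";
```

On a real page each item() call can return null, so production code would check for that before chaining.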

Related

How to get value of onclick= using xpath?

I have a string that has lots of <li> sets of data. I want to get this value:
1: call.php?category=fruits&fruitid=123456
from inside onclick using XPath. My current XPath doesn't return the onclick value, so I can't parse it further to get the data I need. Could anyone tell me the correct XPath to get the value of onclick?
libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($code2);
$xpath = new DOMXPath($dom);
// Empty array to hold all links to return
$result = array();
//Loop through each <li> tag in the dom
foreach($dom->getElementsByTagName('li') as $li) {
//Loop through each <a> tag within the li, then extract the node value
foreach($li->getElementsByTagName('a') as $links){
$result[] = $links->nodeValue;
echo $result[0] . "\n";
}
$onclicks = $xpath->query("//li/a/onclick");
foreach ($onclicks as $onclick) {
echo $onclick->nodeValue . "\n";
}
}
data:
<li><a id="FR123456" onclick="setFood(false);setSeasonFruitID('123456');getit('call.php?category=fruits&fruitid=123456&',detailFruit,false);">mango season</a><img src="http://imagehosting.com/images/fru_123456.png">
</li>
onclick is an attribute, and you use @attribute_name to reference an attribute in XPath:
$onclicks = $xpath->query("//li/a/@onclick");
foreach ($onclicks as $onclick) {
echo $onclick->nodeValue . "\n";
}
Try something like this :
$links = $xpath->query("//li/a");
foreach ($links as $link) {
echo $link->getAttribute('onclick'). "\n";
}
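Putting the two steps together, here is a minimal sketch that reads the onclick attribute with XPath and then pulls the first quoted argument of getit(...) out of it with a regular expression. The sample markup is adapted from the question (the ampersands are written as &amp;amp; so the HTML is valid), and the regex assumes the URL is always the first single-quoted argument of getit.

```php
<?php
// Sample markup adapted from the question; ampersands are escaped as &amp;.
$html = <<<'HTML'
<ul><li><a id="FR123456" onclick="setFood(false);setSeasonFruitID('123456');getit('call.php?category=fruits&amp;fruitid=123456&amp;',detailFruit,false);">mango season</a></li></ul>
HTML;

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$urls = array();
foreach ($xpath->query('//li/a/@onclick') as $attr) {
    // Capture the first single-quoted argument of getit(...).
    if (preg_match("/getit\\('([^']+)'/", $attr->nodeValue, $m)) {
        $urls[] = rtrim($m[1], '&');   // drop the trailing "&"
    }
}
print_r($urls);
```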

How can I get a child element by class using PHP DOMXPath?

I want to get a child element with a specific class from HTML. I have managed to find elements by tag name, but I can't figure out how to get a child element with a specific class.
Here is my code:
<?php
$html = file_get_contents('myfileurl'); //get the html returned from the following url
$pokemon_doc = new DOMDocument();
libxml_use_internal_errors(TRUE); //disable libxml errors
if (!empty($html)) { //if any html is actually returned
$pokemon_doc->loadHTML($html);
libxml_clear_errors(); //remove errors for yucky html
$pokemon_xpath = new DOMXPath($pokemon_doc);
//get all the h2's with an id
$pokemon_row = $pokemon_xpath->query("//li[@class='content']");
if ($pokemon_row->length > 0) {
foreach ($pokemon_row as $row) {
$title = $row->getElementsByTagName('h3');
foreach ($title as $a) {
echo "Title: ";
echo strip_tags($a->nodeValue). '<br>';
}
$links = $row->getElementsByTagName('a');
foreach ($links as $l) {
echo "Link: ";
echo strip_tags($l->nodeValue). '<br>';
}
$desc = $row->getElementsByTagName('span');
//I tried this but it didn't work; I want to get the span with class desc
//$desc = $row->query("//span[@class='desc']");
foreach ($desc as $d) {
echo "DESC: ";
echo strip_tags($d->nodeValue) . '<br><br>';
}
// echo $row->nodeValue . "<br/>";
}
}
}
?>
Please let me know in the comments if this is a duplicate that I couldn't find, or if you think the question is poor or not explained well.
Thanks.
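One possible approach, sketched under assumptions: DOMElement has no query() method, but DOMXPath::query() accepts a context node as its second argument, and a path starting with "." is evaluated relative to that node. The markup below is invented for illustration, since the real page is not shown.

```php
<?php
// Invented sample markup; the real page structure is an assumption.
$html = <<<'HTML'
<ul>
  <li class="content"><h3>First</h3><a href="/a">Link A</a><span class="desc">Desc A</span><span class="other">skip</span></li>
  <li class="content"><h3>Second</h3><a href="/b">Link B</a><span class="desc">Desc B</span></li>
</ul>
HTML;

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

$descriptions = array();
foreach ($xpath->query("//li[@class='content']") as $row) {
    // The leading "." makes the path relative to $row (the context node).
    foreach ($xpath->query(".//span[@class='desc']", $row) as $span) {
        $descriptions[] = trim($span->nodeValue);
    }
}
print_r($descriptions);
```

Note that @class='desc' only matches when the class attribute is exactly "desc"; elements carrying several classes would need a contains() based test instead.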

PHP: How to find an element with particular name attribute in html (from url)

I am currently using PHP's file_get_contents($url) to fetch content from a URL. After getting the contents I need to inspect the HTML, find a 'select' that has a given name attribute, and extract its options with their values and text. I can use PHP's simplehtmldom class to parse the HTML, but how do I get a particular 'select' with the name 'union'?
<span class="d3-box">
<select name='union' class="blockInput" >
<option value="">Select a option</option> ..
The page can have multiple 'select' boxes, so I need to look it up specifically by its name attribute.
<?php
include_once("simple_html_dom.php");
$htmlContent = file_get_contents($url);
foreach($htmlContent->find(byname['union']) as $element)
echo 'option : value';
?>
Any sort of help is appreciated. Thank you in advance.
Try this PHP code:
<?php
require_once dirname(__FILE__) . "/simple_html_dom.php";
$url = "Your link here";
$htmlContent = str_get_html(file_get_contents($url));
foreach ($htmlContent->find("select[name='union'] option") as $element) {
$option = $element->plaintext;
$value = $element->getAttribute("value");
echo $option . ":" . $value . "<br>";
}
?>
How about this:
$htmlContent = file_get_html('your url');
$htmlContent->find('select[name="union"]');
In an object-oriented way:
$html = new simple_html_dom();
$html->load_file('your url');
$html->find('select[name="union"]');
From DOMDocument documentation: http://www.php.net/manual/en/class.domdocument.php
$html = file_get_contents( $url );
$dom = new DOMDocument();
$dom->loadHTML( $html );
$selects = $dom->getElementsByTagName( 'select' );
$select = $selects->item(0);
// Assuming all children are options.
$children = $select->childNodes;
$options_values = array();
for ( $i = 0; $i < $children->length; $i++ )
{
$item = $children->item( $i );
$options_values[] = $item->nodeValue;
}
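The DOMDocument snippet above takes the first select on the page. To pick the one named 'union' specifically, a DOMXPath attribute predicate can be used. This is a sketch on invented sample markup (the second select and the option values are assumptions), not the page from the question.

```php
<?php
// Invented sample markup with two selects; only the one named "union" is wanted.
$html = <<<'HTML'
<span class="d3-box">
  <select name="other"><option value="x">X</option></select>
  <select name="union" class="blockInput">
    <option value="">Select a option</option>
    <option value="1">North</option>
    <option value="2">South</option>
  </select>
</span>
HTML;

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$options = array();
// Match <option> children of the <select> whose name attribute is "union".
foreach ($xpath->query("//select[@name='union']/option") as $option) {
    $options[] = array(
        'value' => $option->getAttribute('value'),
        'text'  => trim($option->nodeValue),
    );
}
print_r($options);
```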

how to display data id, name?

My file xml:
<pasaz:Envelope>
<pasaz:Body>
<loadOffe>
<offe>
<off>
<id>120023</id>
<name>my name John</name>
<name>Test</name>
</off>
</offe>
</loadOffe>
</pasaz:Body>
</pasaz:Envelope>
How can I display the id and name in PHP?
If you're just looking for a simple way to extract the contents of a tag, but don't want to go to all the trouble of parsing the XML properly, you could do something like this:
$xml = ""; // your xml data as a string
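function get_tag_contents($xml, $tagName) {
    // Locate the opening and closing tags; give up if either is missing.
    $openTag = "<" . $tagName . ">";
    $startPosition = strpos($xml, $openTag);
    $endPosition = strpos($xml, "</" . $tagName . ">");
    if ($startPosition === false || $endPosition === false) {
        return null;
    }
    // Skip past the opening tag so only the contents are returned.
    $startPosition += strlen($openTag);
    return substr($xml, $startPosition, $endPosition - $startPosition);
}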
$id = get_tag_contents($xml, "id");
$name = get_tag_contents($xml, "name");
This assumes you haven't assigned any attributes to your tags, and that each tag is unique (in the example you gave us I noted two "name" tags, and if you want both you'll need to make this solution a bit more robust or do proper XML parsing).
How do I get all items?
Example (does not work):
$pliks = simplexml_load_file("file.xml");
foreach ($pliks->children('pasaz', true) as $body)
{
foreach ($body->children() as $loadOffe)
{
if ($loadOffe->offe->off) {
echo "<p>id: $loadOffe->id</p>";
echo "$id->id";
echo "<p>name: <b>$name->name</b></p>";
}
}
// echo $loadOffe->offe->off->id;
}
As Marc B suggested in his comment, you should use DOM: either getElementsByTagName() or DOMXPath. An example with getElementsByTagName():
$dom = new DOMDocument;
$dom->loadXML($xml);
$ids = $dom->getElementsByTagName('id');
// getElementsByTagName() always returns a node list, so check its length.
if ($ids->length === 0) {
    throw new Exception('Id not found');
}
return $ids->item(0);
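To list every id and name rather than only the first id, the same DOM calls can be looped. A sketch, with one loud assumption: the XML in the question uses the pasaz prefix without declaring it, so the namespace URI below is a placeholder added to make the sample well-formed.

```php
<?php
// The xmlns:pasaz URI is a placeholder; the original snippet omits the declaration.
$xml = <<<'XML'
<pasaz:Envelope xmlns:pasaz="http://example.com/pasaz">
  <pasaz:Body>
    <loadOffe>
      <offe>
        <off>
          <id>120023</id>
          <name>my name John</name>
          <name>Test</name>
        </off>
      </offe>
    </loadOffe>
  </pasaz:Body>
</pasaz:Envelope>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$offers = array();
foreach ($dom->getElementsByTagName('off') as $off) {
    // Collect every <name> inside this <off>, since there can be several.
    $names = array();
    foreach ($off->getElementsByTagName('name') as $name) {
        $names[] = $name->nodeValue;
    }
    $offers[] = array(
        'id'    => $off->getElementsByTagName('id')->item(0)->nodeValue,
        'names' => $names,
    );
}
print_r($offers);
```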

PHP: how to count XML elements in the object returned by simplexml_load_file()

I have inherited some PHP code (but I have little PHP experience) and can't find out how to count some elements in the object returned by simplexml_load_file().
The code is something like this:
$xml = simplexml_load_file($feed);
for ($x=0; $x<6; $x++) {
$title = $xml->channel[0]->item[$x]->title[0];
echo "<li>" . $title . "</li>\n";
}
It assumes there will be at least 6 <item> elements but sometimes there are fewer so I get warning messages in the output on my development system (though not on live).
How do I extract a count of <item> elements in $xml->channel[0]?
Here are several options, from my most to least favourite (of the ones provided).
One option is to make use of the SimpleXMLIterator in conjunction with LimitIterator.
$xml = simplexml_load_file($feed, 'SimpleXMLIterator');
$items = new LimitIterator($xml->channel->item, 0, 6);
foreach ($items as $item) {
echo "<li>{$item->title}</li>\n";
}
If that looks too scary, or not scary enough, then another is to throw XPath into the mix.
$xml = simplexml_load_file($feed);
$items = $xml->xpath('/rss/channel/item[position() <= 6]');
foreach ($items as $item) {
echo "<li>{$item->title}</li>\n";
}
Finally, with little change to your existing code, there is also this:
$xml = simplexml_load_file($feed);
for ($x=0; $x<6; $x++) {
// Break out of loop if no more items
if (!isset($xml->channel[0]->item[$x])) {
break;
}
$title = $xml->channel[0]->item[$x]->title[0];
echo "<li>" . $title . "</li>\n";
}
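The easiest way is to use SimpleXMLElement::count(). Call it on the item elements, because calling it on $xml->channel[0] counts every child of the channel (title, link and so on), not just the items:
$xml = simplexml_load_file($feed);
$num = $xml->channel[0]->item->count();
for ($x = 0; $x < $num; $x++) {
    $title = $xml->channel[0]->item[$x]->title[0];
    echo "<li>" . $title . "</li>\n";
}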
Also note that $xml->channel[0]->item is a SimpleXMLElement object. This class implements the Traversable interface, so we can use it directly in a foreach loop over the <item> elements:
$xml = simplexml_load_file($feed);
foreach ($xml->channel[0]->item as $item) {
    $title = $item->title[0];
    echo "<li>" . $title . "</li>\n";
}
You can get the count with count($xml), which counts the child elements of $xml.
I always do it like this:
$xml = simplexml_load_file($feed);
foreach($xml as $key => $one_row) {
echo $one_row->some_xml_child;
}
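To answer the original question directly, here is a small sketch that counts only the item elements and caps the loop at six. The inline feed is invented sample data, and simplexml_load_string() stands in for simplexml_load_file() so the example runs without a real feed URL.

```php
<?php
// Invented sample feed with fewer than six items.
$feed = <<<'XML'
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>One</title></item>
    <item><title>Two</title></item>
    <item><title>Three</title></item>
  </channel>
</rss>
XML;

$xml = simplexml_load_string($feed);

// Counts only the <item> elements of the first channel.
$itemCount = count($xml->channel[0]->item);
// By contrast, this counts every child element of <channel>, including <title>.
$childCount = $xml->channel[0]->count();

$limit = min(6, $itemCount);   // never read past the last item
$titles = array();
for ($x = 0; $x < $limit; $x++) {
    $titles[] = (string) $xml->channel[0]->item[$x]->title;
}

echo $itemCount . " items\n";
print_r($titles);
```

Casting the title to string avoids storing SimpleXMLElement objects in the array.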
