Extract title via XPath if code XPath is present - PHP

I am trying to extract coupon codes and, if a code is present, get the corresponding title too, but I am unable to do so.
In the code below I am able to extract the coupon codes correctly, but how do I get the corresponding title extracted too? As you can see in the link, some titles don't have coupon codes.
<?php
$html = file_get_contents('http://www.grabon.in/abof-coupons/'); //get the html returned from the following url
$mydoc = new DOMDocument();
libxml_use_internal_errors(TRUE); //disable libxml errors
if(!empty($html)){ //if any html is actually returned
$mydoc->loadHTML($html);
libxml_clear_errors(); //remove errors for yucky html
$my_xpath = new DOMXPath($mydoc);
//get all the codes
$my_code = $my_xpath->query('//*[@class="coupon-click"]//a//small');
if($my_code->length > 0){
foreach($my_code as $row){
$my_row = $my_xpath->query('//*[@class="h3_click"]');
echo $row->nodeValue . "<br/>";
}
}
}
?>
Thanks fusion3k, the code works perfectly. However, when I tried your code with a different URL, as below, I get the error "Notice: Trying to get property of non-object".
<?php
$html = file_get_contents('http://official.deals/ebay-coupons?coupon-id=1055981&h=ed68f1b2a5b28471ecf9584734d65742&utm_source=coupon_page&utm_medium=deal_reveal&utm_campaign=od_deal_click#ebay1055981'); //get the html returned from the following url
$mydoc = new DOMDocument();
libxml_use_internal_errors(TRUE); //disable libxml errors
if(empty($html)) die("EMPTY HTML");
$mydoc->loadHTML($html);
libxml_clear_errors(); //remove errors for yucky html
$my_xpath = new DOMXPath($mydoc);
//////////////////////////////////////////////////////
$result = array();
$nodes = $my_xpath->query( '//div[@data-rowtype="1"]' );
foreach( $nodes as $node )
{
$title = $my_xpath->query( 'div[@class="cop-head"]/h4', $node )->item(0)->nodeValue;
$found = $my_xpath->query( 'div[@class="cop-head"]/div/input/value', $node );
$coupon = ( $found->length ) ? $found->item(0)->nodeValue : '' ;
$result[] = compact( 'title', 'coupon' );
}
echo '<pre>';
print_r($result);
echo '</pre>';
?>

If you also want to retrieve boxes without a coupon, you have to proceed in a different way: retrieve all boxes and, for each box, check whether a coupon code exists.
Init an array to store results:
$result = array();
Search for boxes (<li> nodes with class “coupon-list-item ”):
$nodes = $my_xpath->query( '//li[@class="coupon-list-item "]' );
# Pay attention: the class value ends with a trailing space.
Then analyze each node through a foreach loop:
foreach( $nodes as $node )
{
Match titles:
$title = $my_xpath->query( 'div/b[@class="h3_click"]', $node )->item(0)->nodeValue;
# No starting slashes: the pattern is node-relative.
# The optional second parameter of query() defines the search context.
Then search for the coupon, if it exists:
$found = $my_xpath->query( 'div[@class="coupon-actions"]/div/a/small', $node );
$coupon = ( $found->length ) ? $found->item(0)->nodeValue : '' ;
Finally, add a sub-array to $result using the fabulous, underrated compact() function:
$result[] = compact( 'title', 'coupon' );
}
If you want, you can also add the related coupons in a similar way:
$nodes = $my_xpath->query( '//div[@class="related-coupons"]/*/div[@class="col-sm-8"]' );
foreach( $nodes as $node )
{
$title = $my_xpath->query( 'div/div[@class="coupon-title"]', $node )->item(0)->nodeValue;
$found = $my_xpath->query( 'div/div[@class="coupon-click"]/a/small', $node );
$coupon = ( $found->length ) ? $found->item(0)->nodeValue : '' ;
$result[] = compact( 'title', 'coupon' );
}
At the end, $result looks like this:
Array
(
[0] => Array
(
[title] => Upto 80% OFF + Extra Rs. 500 OFF On Rs 1495 - All Users
[coupon] => ABOFBMF500C
)
(...)
[14] => Array
(
[title] => Fresh Arrivals on Women & Men Collection
[coupon] =>
)
(...)
)
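As a rough, self-contained illustration of the approach above, here is a sketch that runs against inline markup instead of the live page. The HTML is a made-up stand-in for the page structure. Two things differ from the answer and are assumptions of this sketch: the class test is whitespace-tolerant rather than an exact match on the trailing space, and the title lookup is guarded, because item(0) returns null when nothing matches (which is what produces "Trying to get property of non-object").

```php
<?php
// Hypothetical markup that mimics the structure described above; not the real page.
$html = '<ul>
  <li class="coupon-list-item "><div><b class="h3_click">Deal A</b></div>
      <div class="coupon-actions"><div><a><small>CODE1</small></a></div></div></li>
  <li class="coupon-list-item "><div><b class="h3_click">Deal B</b></div></li>
</ul>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Whitespace-tolerant class test (an assumption of this sketch, not the answer's exact match).
$boxes = $xpath->query(
    '//li[contains(concat(" ", normalize-space(@class), " "), " coupon-list-item ")]'
);

$result = array();
foreach ($boxes as $node) {
    // Relative paths plus the context node keep each lookup inside one box.
    $titleNode = $xpath->query('div/b[@class="h3_click"]', $node)->item(0);
    $found     = $xpath->query('div[@class="coupon-actions"]/div/a/small', $node);

    // item(0) is null when nothing matched, so check before reading nodeValue.
    $title  = $titleNode ? trim($titleNode->nodeValue) : '';
    $coupon = $found->length ? trim($found->item(0)->nodeValue) : '';
    $result[] = compact('title', 'coupon');
}
print_r($result);
```

The second box has no coupon markup, so its coupon entry ends up as an empty string while its title is still collected.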

Related

Why is the HTML attribute not displayed via XPath in PHP?

<?php
$content = '<div class="keep-me">Keep this div</div><div class="remove-me" id="test">Remove this div</div>';
$badClasses = array('');
$dom = new DOMDocument;
libxml_use_internal_errors(true);
$dom->loadHTML($content);
libxml_clear_errors();
$xPath = new DOMXpath($dom);
foreach($badClasses as $badClass){
$domNodeList = $xPath->query('//div[@class="remove-me"]/@id');
$domElemsToRemove = ''; // container of deleted elements
foreach ( $domNodeList as $domElement ) {
$domElemsToRemove .= $dom->saveHTML($domElement); // concat them
$domElement->parentNode->removeChild($domElement); // then remove
}
}
$content = $dom->saveHTML();
echo htmlentities($domElemsToRemove);
?>
Works: //div[@class="remove-me"] or //div[@class="remove-me"]/text()
Does not work: //div[@class="remove-me"]/@id
Maybe there is an easier way?
The XPath //div[@class="remove-me"]/@id is correct, but you just need to loop over the returned attribute nodes and add each nodeValue to a list of matching IDs:
$xPath = new DOMXpath($dom);
$domNodeList = $xPath->query('//div[@class="remove-me"]/@id');
$ids = []; // container for the matched id values
foreach ( $domNodeList as $domElement ) {
$ids[] = $domElement->nodeValue;
}
print_r($ids);
If the aim is to fetch the ID of any element with class "remove-me", which is how I interpret the question, then perhaps you can try something like this (untested, by the way):
// ... other code before
$xp=new DOMXpath( $dom );
$col= $xp->query( '//*[@class="remove-me"]' );
if( $col->length > 0 ){
foreach($col as $node){
$id=$node->hasAttribute('id') ? $node->getAttribute('id') : 'banana';
echo $id;
}
}
However, the code in the question suggests that you wish to delete nodes. In that case, build an array of nodes (a node list) and iterate through it from the end to the front, i.e. backwards.
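A minimal sketch of that backwards iteration, reusing the markup from the question. It selects the elements themselves (not the @id attribute nodes), records each id, and then removes the element. The variable names are illustrative only.

```php
<?php
$content = '<div class="keep-me">Keep this div</div><div class="remove-me" id="test">Remove this div</div>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($content);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

$nodes   = $xpath->query('//div[@class="remove-me"]');
$removed = array();

// Walk the list from the last item to the first while removing.
for ($i = $nodes->length - 1; $i >= 0; $i--) {
    $node      = $nodes->item($i);
    $removed[] = $node->getAttribute('id');   // empty string if there is no id
    $node->parentNode->removeChild($node);
}

print_r($removed);
echo $dom->saveHTML();
```

Going backwards means that removing one node can never shift the position of a node that has not been visited yet.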

Unable to insertBefore each DOMNodeList item in php. Only last item is updated

I am trying to add an <input type="checkbox"> tag before every li that has the class menu-item-has-children, but the DOM is updated only for the last item, not for all of them. The code is below:
$dom = new DOMDocument();
$dom->loadHTML( $sanitized_menu );
$finder = new DOMXPath( $dom );
$inner_menus = $finder->query( "/html/body//li[ contains( @class, 'menu-item-has-children' ) ]");
// element to be added
$elem = $dom->createElement('input');
$elem_attr = $dom->createAttribute( 'type' );
$elem_attr->value = 'checkbox';
$elem->appendChild( $elem_attr );
$index = 0;
while( $index < $inner_menus->length ) {
$insert_val = $inner_menus->item( $index );
$insert_val->parentNode->insertBefore( $elem, $insert_val);
$index++;
}
$html = $dom->saveHTML();
print_r( $html );
You only create one input and then you append it multiple times.
Since an element can't exist in multiple places at once, that moves it.
Create the element inside the while loop.
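A sketch of that fix, assuming a small made-up menu in place of $sanitized_menu. The only substantive change from the question is that createElement() is called once per matched li.

```php
<?php
// Hypothetical menu markup standing in for $sanitized_menu from the question.
$sanitized_menu = '<ul>'
    . '<li class="menu-item menu-item-has-children">A</li>'
    . '<li class="menu-item">B</li>'
    . '<li class="menu-item-has-children">C</li>'
    . '</ul>';

$dom = new DOMDocument();
$dom->loadHTML($sanitized_menu);
$finder = new DOMXPath($dom);
$inner_menus = $finder->query("/html/body//li[contains(@class, 'menu-item-has-children')]");

foreach ($inner_menus as $li) {
    // A fresh element per iteration, so each <li> gets its own checkbox.
    $elem = $dom->createElement('input');
    $elem->setAttribute('type', 'checkbox');
    $li->parentNode->insertBefore($elem, $li);
}

$html = $dom->saveHTML();
echo $html;
```

Cloning a single template element with cloneNode(true) inside the loop would be an alternative to calling createElement() each time.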

Printing out an array to a file

I'm stuck on a particular task. As you can see, I'm extracting hrefs and titles from a web page, and I need to write this information to a file. How can this array be printed in an order like this: href1 : title1 , href2 : title2 and so on?
<?php
$searched = file_get_contents('http://technologijos.lt');
$xml = new DOMDocument();
@$xml->loadHTML($searched);
foreach($xml->getElementsByTagName('a') as $lnk)
{
$links[] = array(
'href' => $lnk->getAttribute('href'),
'title' => $lnk->getAttribute('title')
);
}
echo '<pre>'; print_r($links); echo '</pre>';
?>
Why not create the array directly in a way that is usable afterwards?
<?php
$searched = file_get_contents('http://technologijos.lt');
$xml = new DOMDocument();
@$xml->loadHTML($searched);
$links = [];
foreach($xml->getElementsByTagName('a') as $lnk) {
$links[] = sprintf(
'%s : %s',
$lnk->getAttribute('href'),
$lnk->getAttribute('title')
);
}
var_dump(implode(', ', $links));
Obviously, the same can be done by using a second loop to iterate over the links array if it is created as shown in your example.
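For the second-loop variant, a sketch along these lines may help. It starts from an array in the shape the question builds (the values here are made up) and also writes the line to a file, which is what the question asks for. The file name links.txt in the temp directory is a hypothetical choice.

```php
<?php
// Sample data in the shape built by the question's loop (values are made up).
$links = array(
    array('href' => '/a', 'title' => 'First'),
    array('href' => '/b', 'title' => 'Second'),
);

$pairs = array();
foreach ($links as $link) {
    $pairs[] = sprintf('%s : %s', $link['href'], $link['title']);
}
$line = implode(' , ', $pairs);

// links.txt in the temp directory is a hypothetical target; use any writable path.
$file = sys_get_temp_dir() . '/links.txt';
file_put_contents($file, $line . PHP_EOL);

echo $line;
```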

Trying to get property of non-object when trying to echo the attribute of element

I keep getting the error "Trying to get property of non-object" on the line
$title = $my_xpath->query when running the script. I have the node and path correct, but it is still not working.
$nodes = $my_xpath->query( '//div[@class="info_coupon"]' );
foreach( $nodes as $node )
{
$title = $my_xpath->query( 'a', $node )->item(0)->nodeValue;
echo $title;
$code = $my_xpath->query( 'a/@data-code', $node );
if( $code->length>0 ) {
$coupon = $code->item(0)->nodeValue ;
echo $coupon;
}
}
Some JavaScript on that page generates the <a> elements. In the initial $html that you fetch, there are none.
Here's a snippet of what you're getting from the initial lines of your code:
$url = "http://zoutons.com/stores/paytm-coupons/";
$html = file_get_contents($url); // this contains the following markup:
<div class="info_coupon">
<span rel="nofollow" data-lnu="aHR0cDovL3RyYWNraW5nLnZjb21taXNzaW9uLmNvbS9hZmZfYz9vZmZlcl9pZD0xMDIyJmFmZl9pZD0yMDYwJnVybD1odHRwcyUzQSUyRiUyRnBheXRtLmNvbSUyRiUzRnV0bV90ZXJtJTNEe2FmZmlsaWF0ZV9pZH0=" href="http://zoutons.com/stores/paytm-coupons/?#cid=31215" class="heading affiliate affiliate_map c_data_31215" data-affiliate="aHR0cDovL3RyYWNraW5nLnZjb21taXNzaW9uLmNvbS9hZmZfYz9vZmZlcl9pZD0xMDIyJmFmZl9pZD0yMDYwJnVybD1odHRwcyUzQSUyRiUyRnBheXRtLmNvbSUyRiUzRnV0bV90ZXJtJTNEe2FmZmlsaWF0ZV9pZH0=" data-id="31215" data-code="NEW50" data-link_type="text" store="Paytm">GET FREE Rs.50/- ON RECHARGE (VALID TILL – APRIL 27)
</span>
So there is really no <a> after all.
But the data you're after is actually inside that <span>:
href="http://zoutons.com/stores/paytm-coupons/?#cid=31215"
data-code="NEW50"
So just get it there:
$nodes = $my_xpath->query( '//div[@class="info_coupon"]' );
foreach( $nodes as $node )
{
$title = $my_xpath->evaluate('string(./span/@href)', $node);
$code = $my_xpath->evaluate('string(./span/@data-code)', $node);
echo $title;
echo $code;
}
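One property of evaluate() with string() worth illustrating: when the path matches nothing, it yields an empty string rather than null, so there is no object to dereference. The sketch below uses a trimmed-down copy of the markup quoted above plus a made-up second box without a code. Reading the span's text as the title is an assumption of this sketch, not part of the answer.

```php
<?php
// Trimmed-down version of the markup quoted above, plus a made-up box without a code.
$html = '<div class="info_coupon"><span href="http://zoutons.com/stores/paytm-coupons/?#cid=31215" data-code="NEW50">GET FREE Rs.50/- ON RECHARGE</span></div>'
      . '<div class="info_coupon"><span>No code here</span></div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

$rows = array();
foreach ($xpath->query('//div[@class="info_coupon"]') as $node) {
    $rows[] = array(
        'text' => $xpath->evaluate('string(./span)', $node),
        'code' => $xpath->evaluate('string(./span/@data-code)', $node), // '' when absent
    );
}
print_r($rows);
```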

How can I get all attributes with PHP xpath?

Given the following HTML string:
<div
class="example-class"
data-caption="Example caption"
data-link="https://www.example.com"
data-image-url="https://example.com/example.jpg">
</div>
How can I use PHP with xpath to output / retrieve an array with all attributes as key / value pairs?
Hoping for output like:
Array
(
[data-caption] => Example caption
[data-link] => https://www.example.com
[data-image-url] => https://example.com/example.jpg
)
// etc etc...
I know how to get individual attributes, but I'm hoping to do it in one fell swoop. Here's what I currently have:
function get_data($html = '') {
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//div/#data-link');
foreach ($nodes as $node) {
var_dump($node);
}
}
Thanks!
In XPath, you can use @* to reference attributes of any name, for example:
$nodes = $xpath->query('//div/@*');
foreach ($nodes as $node) {
echo $node->nodeName ." : ". $node->nodeValue ."<br>";
}
output :
class : example-class
data-caption : Example caption
data-link : https://www.example.com
data-image-url : https://example.com/example.jpg
I think this should do what you want - or at least, give you the basis to proceed.
define('BR','<br />');
$strhtml='<div
class="example-class"
data-caption="Example caption"
data-link="https://www.example.com"
data-image-url="https://example.com/example.jpg">
</div>';
$dom=new DOMDocument;
$dom->loadHTML( $strhtml );
$xpath=new DOMXPath( $dom );
$col=$xpath->query('//div');
if( $col ){
foreach( $col as $node ) if( $node->nodeType==XML_ELEMENT_NODE ) {
foreach( $node->attributes as $attr ) echo $attr->nodeName.' '.$attr->nodeValue.BR;
}
}
$dom = $col = $xpath = null;
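Both answers echo the attributes. To get the key/value array the question asks for, something like the following sketch could work. The starts-with() filter is an addition of this sketch so that only the data-* attributes land in the array, matching the desired output; drop the predicate to collect every attribute, including class.

```php
<?php
$html = '<div
class="example-class"
data-caption="Example caption"
data-link="https://www.example.com"
data-image-url="https://example.com/example.jpg">
</div>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$data = array();
// starts-with() limits the match to data-* attributes; use //div/@* for all of them.
foreach ($xpath->query('//div/@*[starts-with(name(), "data-")]') as $attr) {
    $data[$attr->nodeName] = $attr->nodeValue;
}
print_r($data);
```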
