I've been using DOMXPath and I love it, but I need my queries to be more flexible.
Some clients add extra HTML to their markup, which breaks our parsing.
Example 1:
<div class="Fooen">
<span class="FooTitle">Overdracht</span>
<span class="Foo koopprijs">
<span class="FooName">Vraagprijs</span>
<span class="FooValue">€ 299.000,-</span>
</span>
<span class="Foo aanvaarding">
<span class="FooName">Aanvaarding</span>
<span class="FooValue">In overleg</span>
</span>
</div>
We can get the SPAN name and values fine with this:
$filtered = $domxpath->query("//div[@class='Fooen']/span");
foreach ($filtered as $myItem) {
$temp_name = $domxpath->evaluate("string(descendant::span[@class='FooName'])", $myItem);
$name = strtolower(preg_replace('/\s+/', '', $temp_name));
$value = $domxpath->evaluate("string(descendant::span[@class='FooValue'])", $myItem);
}
But sometimes the client adds extra markup, so the nodes sit deeper in the tree. I cannot find a way to handle this without mapping the full path down to them.
Example 2:
<div class="Fooen">
<div>
<div class="blok-sizer"></div>
<div id="" class="block">
<div class="top">
<div class="center column"></div>
</div>
<div class="middle">
<div class="center column">
<span class="FooTitle">Overdracht</span>
<span class="Foo first transactiestatus">
<span class="FooName">Status</span>
<span class="FooValue">Beschikbaar</span>
</span>
<span class="Foo koopprijs">
<span class="FooName">Vraagprijs</span>
<span class="FooValue">€ 975.000,-</span>
</span>
</div>
</div>
</div>
</div>
</div>
But now, this won't work:
$filtered = $domxpath->query("//div[@class='Fooen']/span");
foreach ($filtered as $myItem) {
$temp_name = $domxpath->evaluate("string(descendant::span[@class='FooName'])", $myItem);
$name = strtolower(preg_replace('/\s+/', '', $temp_name));
$value = $domxpath->evaluate("string(descendant::span[@class='FooValue'])", $myItem);
}
I have tried variations like these:
$domxpath->evaluate("string(descendant::*[@class='FooName'])", $myItem);
$domxpath->evaluate("string(//*[@class='FooName'])", $myItem);
$domxpath->evaluate("string(*[@class='FooName'])", $myItem);
$domxpath->evaluate("string(.//span[@class='FooName'])", $myItem);
Is there a more flexible way to get the string value, even if the node is not in the same place each time?
Edit: here is a ready-to-copy/paste sample I am currently working with. The first part is the working one; the second is the one I'd like to get working from the root down, flexibly rather than with a fixed path. If I knew how to make a fiddle, I would, sorry.
<?php
function getDom($url = "")
{
$str = $url;
$internalErrors = libxml_use_internal_errors(true);
$dom = new \DOMDocument('1.0', 'UTF-8');
$dom->loadHTML($str);
libxml_use_internal_errors($internalErrors);
return $dom;
}
$domcode = '<div class="Fooen">
<span class="FooTitle">Overdracht</span>
<span class="Foo koopprijs">
<span class="FooName">Vraagprijs</span>
<span class="FooValue">€ 299.000,-</span>
</span>
<span class="Foo aanvaarding">
<span class="FooName">Aanvaarding</span>
<span class="FooValue">In overleg</span>
</span>
</div>';
$dom = getDom($domcode);
$html = '';
$domxpath = new \DOMXPath($dom);
$newDom = new \DOMDocument;
$newDom->formatOutput = true;
$filtered = $domxpath->query("//div[@class='Fooen']/span");
foreach ($filtered as $myItem) {
$temp_name = $domxpath->evaluate("string(descendant::span[@class='FooName'])", $myItem);
echo strtolower(preg_replace('/\s+/', '', $temp_name));
echo " = ";
echo $domxpath->evaluate("string(descendant::span[@class='FooValue'])", $myItem);
echo "<br>";
}
echo "<br>";
$domcode = '
<div class="Fooen">
<div>
<div class="blok-sizer"></div>
<div id="" class="block">
<div class="top">
<div class="center column"></div>
</div>
<div class="middle">
<div class="center column">
<span class="FooTitle">Overdracht</span>
<span class="Foo first transactiestatus">
<span class="FooName">Status</span>
<span class="FooValue">Beschikbaar</span>
</span>
<span class="Foo koopprijs">
<span class="FooName">Vraagprijs</span>
<span class="FooValue">€ 975.000,-</span>
</span>
</div>
</div>
</div>
</div>
</div>';
$dom = getDom($domcode);
$html = '';
$domxpath = new \DOMXPath($dom);
$newDom = new \DOMDocument;
$newDom->formatOutput = true;
$filtered = $domxpath->query("//div[@class='center column']/span");
foreach ($filtered as $myItem) {
$temp_name = $domxpath->evaluate("string(descendant::span[@class='FooName'])", $myItem);
echo "<br>";
echo strtolower(preg_replace('/\s+/', '', $temp_name));
echo " = ";
echo $domxpath->evaluate("string(descendant::span[@class='FooValue'])", $myItem);
}
Turns out I had been beating on the wrong line of code all day. Apparently I needed to broaden the filtered search. If there's a less greedy way to do this, I'm all ears. Otherwise, I hope it helps somebody else.
$filtered = $domxpath->query("//div[@class='Fooen']/descendant::span");
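One possibly less greedy variant, sketched under the assumption that every name/value pair sits in a wrapper span whose class list contains the whole token Foo (as in both samples above): select only those wrappers, at any depth, instead of every descendant span. The concat/normalize-space expression is the usual XPath 1.0 idiom for matching a single class token. The helper name extractPairs and the trimmed-down sample markup are made up for this example.

```php
<?php
// Hypothetical helper: collect name => value pairs from wrapper spans
// whose class attribute contains the whole token "Foo", at any depth.
function extractPairs(string $html): array
{
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    // The XML encoding hint is a common workaround so loadHTML reads the input as UTF-8.
    $dom->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $wrappers = $xpath->query(
        "//div[@class='Fooen']//span[contains(concat(' ', normalize-space(@class), ' '), ' Foo ')]"
    );

    $pairs = [];
    foreach ($wrappers as $wrapper) {
        // Relative paths (.//) search only inside the current wrapper.
        $name  = $xpath->evaluate("string(.//span[@class='FooName'])", $wrapper);
        $value = $xpath->evaluate("string(.//span[@class='FooValue'])", $wrapper);
        $pairs[strtolower(preg_replace('/\s+/', '', $name))] = trim($value);
    }
    return $pairs;
}

$nested = '<div class="Fooen"><div><div class="middle"><div class="center column">
<span class="FooTitle">Overdracht</span>
<span class="Foo first transactiestatus">
<span class="FooName">Status</span>
<span class="FooValue">Beschikbaar</span>
</span>
<span class="Foo koopprijs">
<span class="FooName">Vraagprijs</span>
<span class="FooValue">In overleg</span>
</span>
</div></div></div></div>';

$pairs = extractPairs($nested);
print_r($pairs);
```

Because FooTitle, FooName and FooValue do not contain Foo as a separate token, they are not selected as wrappers, so the loop runs once per pair regardless of how deeply the client nests the markup.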
A web development dummy here :)
How do I put a PHP variable inside an HTML tag? For example, here I want to print each product's name, price, and image.
(Also, could you please tell me whether the way I retrieve the image is correct?)
<?php
$doc = new DOMDocument();
$doc->load('database/products.xml');
$products = $doc->getElementsByTagName("fruit");
foreach ($products as $fruit) {
$names = $fruit->getElementsByTagName("name");
$name = $names->item(0)->nodeValue;
$prices = $fruit->getElementsByTagName("price");
$price = $prices->item(0)->nodeValue;
$images = $fruit->getElementsByTagName("image");
$image = $images->item(0)->nodeValue;
echo "<b>$name - $price - $image\n</b><br>";
echo'
<div class="container">
<a href="p3Apples.html">
<img src="img/'.$image.'" class="item-image">
<div class=‘iamge-title’>$name</div>
<div class=‘item-price’> $.$price </div>
<a href=‘shoppingcart.html’ class=‘b-menu’>
<img id=‘test’ src=‘img/addToCart.png’> </a>
</form>
</a>
</div>
';
};
?>
Variables are not interpolated inside single-quoted strings; use double quotes in this case:
$doc = new DOMDocument();
$doc->load('database/products.xml');
$products = $doc->getElementsByTagName("fruit");
foreach ($products as $fruit) {
$names = $fruit->getElementsByTagName("name");
$name = $names->item(0)->nodeValue;
$prices = $fruit->getElementsByTagName("price");
$price = $prices->item(0)->nodeValue;
$images = $fruit->getElementsByTagName("image");
$image = $images->item(0)->nodeValue;
echo "<b>$name - $price - $image\n</b><br>";
echo "
<div class='container'>
<a href='p3Apples.html'>
<img src='img/".$image."' class='item-image'>
<div class='iamge-title'>$name</div>
<div class='item-price'>\${$price}</div>
<a href='shoppingcart.html' class='b-menu'>
<img id='test' src='img/addToCart.png'>
</a>
</form>
</a>
</div>
";
}
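An alternative worth considering is a heredoc, which interpolates variables the same way a double-quoted string does but lets the HTML keep its own double quotes. The sketch below is only an illustration: the function name renderProduct, the class names and the sample values are assumptions, and it escapes the values with htmlspecialchars on the assumption that the XML holds plain text.

```php
<?php
// Hypothetical helper for illustration: renders one product card.
function renderProduct(string $name, string $price, string $image): string
{
    // Escape values taken from the XML before placing them in HTML.
    $name  = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');
    $price = htmlspecialchars($price, ENT_QUOTES, 'UTF-8');
    $image = htmlspecialchars($image, ENT_QUOTES, 'UTF-8');

    // Heredoc strings interpolate variables like double-quoted strings;
    // \$ produces a literal dollar sign in front of the price.
    return <<<HTML
<div class="container">
  <img src="img/{$image}" class="item-image">
  <div class="item-title">{$name}</div>
  <div class="item-price">\${$price}</div>
</div>
HTML;
}

echo renderProduct('Apple', '1.99', 'apple.png');
```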
I have a lot of these in my html document:
<div class="thumb-under">
<p class="title">
text
</p>
<p class="metadata">
<span class="bg">
<span class="first">3A
</span>
<a href="4A">
<span class="name">5A
</span></a>
<span>
<span class="spring"> -
</span> 6A
<span class="spring">something
</span>
</span>
<span class="spring"> -
</span>
</span>
</p>
</div>
So I need to extract the data at positions 1A, 2A, 3A, 4A, 5A, 6A.
I tried this, but I am doing something wrong:
$matches = array();
$dom = new DOMDocument;
$dom->loadHTML($html);
foreach($dom->getElementsByTagName('p') as $tr) {
if ( ! $tr->hasAttribute('class')) {
continue;
}
$class = explode(' ', $tr->getAttribute('class'));
if (in_array('title', $class)) {
$matches[] = $tr->getElementsByTagName('a');
}
}
print_r($matches);
I am totally lost...
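For what it's worth, one likely problem in the attempt above is that it looks for <a> elements inside p.title, while the link lives in p.metadata, and it stores a DOMNodeList rather than text. A minimal sketch with DOMXPath follows. It assumes a cut-down copy of the markup, covers only the title, the 3A, 4A and 5A positions that are visible in the sample, and uses array keys that are made up for illustration.

```php
<?php
// Cut-down sample, assumed to match the structure shown above.
$html = '<div class="thumb-under">
<p class="title">text</p>
<p class="metadata"><span class="bg"><span class="first">3A
</span>
<a href="4A"><span class="name">5A
</span></a></span></p>
</div>';

$previous = libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($dom);
$matches = [];

// One result row per thumb-under block; every inner query is relative to it.
foreach ($xpath->query("//div[@class='thumb-under']") as $block) {
    $matches[] = [
        'title' => trim($xpath->evaluate("string(p[@class='title'])", $block)),
        'first' => trim($xpath->evaluate("string(.//span[@class='first'])", $block)),
        'href'  => $xpath->evaluate("string(.//a/@href)", $block),
        'name'  => trim($xpath->evaluate("string(.//span[@class='name'])", $block)),
    ];
}

print_r($matches);
```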
I have a page that runs off a local webserver and uses SQLite as its database. As it's used locally, I am not worried about listing all results on one page, since they load super fast. I am having an issue with it, though: after 500 results from SQLite3 are displayed, the formatting goes all wonky and the records start stacking on top of each other. Everything before that is fine. It's written in PHP. The info was entered into the database using htmlspecialchars, so I don't believe that is the issue. The code that builds each record in the loop is:
$list = '';
while($row = $results->fetchArray()) {
$id = $row["id"];
$MovieTitle = $row["MovieTitle"];
$MovieYear = $row["MovieDate"];
$MovieRes = $row["MovieRes"];
$FileName = $row["FileName"];
$Summary = $row["Summary"];
$Genres = $row["Genres"];
$PictureLocation = $row["PictureLocation"];
$Rating = $row["Rating"];
$ReleaseDate = $row["ReleaseDate"];
$list .= '<div class="box">
<div class="movie">
<div class="movie-image"><span class="play"><span class="name">'.$MovieTitle.'</span></span><img src="'.$ThumbnailPic.'" alt=""></div>
<div class="rating">
<p>RATING: '.$Rating.'</p>
<div class="stars">
<div class="'.$StarGraphic.'"></div>
</div>
<span class="comments"></span></div>
</div>';
}
and I just echo them in the HTML like this:
<html>
<body>
<div id="main">
<br>
<?php echo $list; ?>
</div>
</body>
</html>
Your HTML is invalid: each record opens six <div> tags but closes only five, so <div class="box"> is never closed and every record ends up nested inside the previous one.
The correct HTML is:
<div class="box">
<div class="movie">
<div class="movie-image">
<span class="play">
<a href="movielist.php?movie='.$FileName.'">
<span class="name">'.$MovieTitle.'</span>
<img src="'.$ThumbnailPic.'" alt="">
</a>
</span>
</div>
<div class="rating">
<p>
RATING: '.$Rating.'
</p>
<div class="stars">
<div class="'.$StarGraphic.'"></div>
</div>
<span class="comments"></span>
</div>
</div>
</div>
Also, your database records may contain tags or quotes, so you should escape your variables before output: http://php.net/manual/en/function.htmlspecialchars.php
Something like this:
$list = '';
while($row = $results->fetchArray()) {
$id = htmlspecialchars($row["id"]);
$MovieTitle = htmlspecialchars($row["MovieTitle"]);
$MovieYear = htmlspecialchars($row["MovieDate"]);
$MovieRes = htmlspecialchars($row["MovieRes"]);
$FileName = htmlspecialchars($row["FileName"]);
$Summary = htmlspecialchars($row["Summary"]);
$Genres = htmlspecialchars($row["Genres"]);
$PictureLocation = htmlspecialchars($row["PictureLocation"]);
$Rating = htmlspecialchars($row["Rating"]);
$ReleaseDate = htmlspecialchars($row["ReleaseDate"]);
$list .= '<div class="box">
<div class="movie">
<div class="movie-image"><span class="play"><span class="name">'.$MovieTitle.'</span></span><img src="'.$ThumbnailPic.'" alt=""></div>
<div class="rating">
<p>RATING: '.$Rating.'</p>
<div class="stars">
<div class="'.$StarGraphic.'"></div>
</div>
<span class="comments"></span></div>
</div>
</div>';
}
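To make the tag balance easier to check, the record markup could be moved into a small function. This is only a sketch: the function name buildMovieBox and its parameters are hypothetical, it needs PHP 7.4+ for the arrow function, and it assumes the stored values are raw text (if they were already run through htmlspecialchars on insert, escaping again would double-encode them).

```php
<?php
// Hypothetical helper: builds one record with every tag closed.
function buildMovieBox(array $row, string $thumbnail, string $starClass): string
{
    // Short alias for escaping a value for HTML output.
    $e = static fn($v): string => htmlspecialchars((string) $v, ENT_QUOTES, 'UTF-8');

    return '<div class="box">'
        . '<div class="movie">'
        . '<div class="movie-image"><span class="play"><span class="name">'
        . $e($row['MovieTitle']) . '</span></span>'
        . '<img src="' . $e($thumbnail) . '" alt=""></div>'
        . '<div class="rating"><p>RATING: ' . $e($row['Rating']) . '</p>'
        . '<div class="stars"><div class="' . $e($starClass) . '"></div></div>'
        . '<span class="comments"></span></div>'
        . '</div>'   // closes .movie
        . '</div>';  // closes .box
}

$box = buildMovieBox(['MovieTitle' => 'Alien <1979>', 'Rating' => '8.5'], 'img/alien.jpg', 'stars-4');

// Quick sanity check: opening and closing div counts should match.
var_dump(substr_count($box, '<div') === substr_count($box, '</div>'));
```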
I need to extract the values "2.5 (0.5)" and "3.5".
My pattern is '/class="match_total_goal_div">.+</s'
but it is not working.
Please help.
<div class="match_total_goal_div">
2.5 (0.5) </div>
<div class="match_half_goal_div hide" ">
</div>
</td>
<td class="text-center corner_goal_range">
<div>
<span class="newlabel">N.A.</span>
</div>
.
.
.
<div class="match_total_goal_div">
3.5 </div>
.
.
.
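First, you need to put parentheses around your .+ to capture the desired data. You also need a question mark to make the quantifier lazy: .+?. Otherwise, with the s modifier, the match runs on to the last < in the input.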
Hope this can help you
$str = '<div class="match_total_goal_div">
2.5 (0.5) </div>
<div class="match_total_goal_div">
3.5 </div>';
$pattern = '/class="match_total_goal_div">(.+?)</s';
preg_match_all($pattern, $str, $matches);
var_dump($matches);
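A small variation, offered as a sketch rather than the only way to do it: a negated character class stops at the next tag on its own, so neither the lazy quantifier nor the s modifier is needed, and the captured group can be trimmed afterwards. The variable name $values is an assumption for this example.

```php
<?php
$str = '<div class="match_total_goal_div">
2.5 (0.5) </div>
<div class="match_total_goal_div">
3.5 </div>';

// [^<]+ matches everything up to the next "<", including newlines.
preg_match_all('/class="match_total_goal_div">([^<]+)</', $str, $matches);

// $matches[1] holds the first capture group; trim the surrounding whitespace.
$values = array_map('trim', $matches[1]);
print_r($values);
```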
Check this code to accomplish your goal
<?php
$html = '<div class="match_total_goal_div">
2.5 (0.5) </div>
<div class="match_half_goal_div hide">
</div>
<td class="text-center corner_goal_range"></td>
<div>
<span class="newlabel">N.A.</span>
</div>
<div class="match_total_goal_div">
3.5 </div>';
$DOM = new DOMDocument();
$DOM->loadHTML($html);
$finder = new DomXPath($DOM);
$classname = 'match_total_goal_div';
$nodes = $finder->query("//*[contains(@class, '$classname')]");
foreach ($nodes as $node) {
echo $node->nodeValue."\n";
}
?>
Live demo : http://sandbox.onlinephpfunctions.com/code/b3e645ac56b9f7bf57d4519abd6b1be90ed87945
I need to find and replace some HTML elements inside a block of HTML (I followed this answer: Getting an element from PHP DOM and changing its value). To do so, I retrieve the content with:
$transport = $observer->getTransport();
$html = $transport->getHtml();
$dom = new Zend_Dom_Query($html);
$document = $dom->getDocument();
and this is the result:
<div class="page-title category-title">
<h1>Title</h1>
</div>
<div class="category-products">
<div class="toolbar">
<div class="pager">
<p class="amount">Items 2 to 2 of 2 total</p>
<div class="limiter">
<label>Show</label>
<select onchange="setLocation(this.value)">
<option value="limit=1" selected="selected">1</option>
</select>per page</div>
<div class="pages"> <strong>Page:</strong>
<ol>
<li>
<a class="previous i-previous" href="p=1" title="Previous">
<img src="skin/frontend/default/default/images/pager_arrow_left.gif" alt="Previous" class="v-middle" />
</a>
</li>
<li>1
</li>
<li class="current">2</li>
</ol>
</div>
</div>
<div class="sorter">
<p class="view-mode">
<label>View as:</label> <strong title="Grid" class="grid">Grid</strong> List </p>
<div class="sort-by">
<label>Sort By</label>
<select onchange="setLocation(this.value)">
<option value="dir=asc&order=position" selected="selected">Position</option>
<option value="dir=asc&order=name">Name</option>
<option value="dir=asc&order=price">Price</option>
</select> <img src="skin/frontend/default/default/images/i_asc_arrow.gif" alt="Set Descending Direction" class="v-middle" />
</div>
</div>
</div>
<ul class="products-grid">
<li class="item first">
<a href="test/a-2.html" title="a" class="product-image">
<img src="media/catalog/product/cache/1/small_image/135x/9df78eab33525d08d6e5fb8d27136e95/images/catalog/product/placeholder/small_image.jpg" width="135" height="135" alt="a" />
</a>
<h2 class="product-name">a</h2>
<div class="price-box"> <span class="regular-price" id="product-price-2">
<span class="price">$1.00</span> </span>
</div>
<div class="actions">
<button type="button" title="Add to Cart" class="button btn-cart" onclick="setLocation('test/a-2.html')"><span><span>Add to Cart</span></span>
</button>
<ul class="add-to-links">
<li>
Add to Wishlist
</li>
<li>
<span class="separator">|</span> Add to Compare
</li>
</ul>
</div>
</li>
</ul>
<script type="text/javascript">
decorateGeneric($$('ul.products-grid'), ['odd', 'even', 'first', 'last'])
</script>
<div class="toolbar-bottom">
<div class="toolbar">
<div class="pager">
<p class="amount">Items 2 to 2 of 2 total</p>
<div class="limiter">
<label>Show</label>
<select onchange="setLocation(this.value)">
<option value="limit=1" selected="selected">1</option>
</select>per page</div>
<div class="pages"> <strong>Page:</strong>
<ol>
<li>
<a class="previous i-previous" href="p=1" title="Previous">
<img src="skin/frontend/default/default/images/pager_arrow_left.gif" alt="Previous" class="v-middle" />
</a>
</li>
<li>1
</li>
<li class="current">2</li>
</ol>
</div>
</div>
<div class="sorter">
<p class="view-mode">
<label>View as:</label> <strong title="Grid" class="grid">Grid</strong> List </p>
<div class="sort-by">
<label>Sort By</label>
<select onchange="setLocation(this.value)">
<option value="dir=asc&order=position" selected="selected">Position</option>
<option value="dir=asc&order=name">Name</option>
<option value="dir=asc&order=price">Price</option>
</select> <img src="skin/frontend/default/default/images/i_asc_arrow.gif" alt="Set Descending Direction" class="v-middle" />
</div>
</div>
</div>
</div>
</div>
To find the elements I use Zend_Dom_Query:
$transport = $observer->getTransport();
$html = $transport->getHtml();
$dom = new Zend_Dom_Query($html);
$document = $dom->getDocument();
if(!is_object($document)){
Mage::log(print_r($document, TRUE), null, 'mylogfile1.log');
$transport->setHtml($html);
exit();
}
$node = $document->createElement("p", "This product isn't available in your country.");
Unfortunately, it always exits at the object check; without that check, it fails with this error:
Fatal error: Call to a member function createElement() on a non-object
EDIT
Full code, if anyone wants to see where I retrieve content (I have added some comments to be more clear):
//retrieve html from observer
$transport = $observer->getTransport();
$html = $transport->getHtml();
//Retrieve other info
$stored = json_decode(Mage::getStoreConfig('razorphyn/country/buttons'));
$theme=trim(Mage::getSingleton('core/design_package')->getTheme('frontend'));
$dom = new Zend_Dom_Query($html);
$document = $dom->getDocument();
//check if $document is an object
if(!is_object($document)){
Mage::log(print_r($document, TRUE), null, 'mylogfile1.log');
$transport->setHtml($html);
exit();
}
//Create the node that will replace the ones found
$node = $document->createElement("p", "This product isn't available in your country.");
$elArray=array();
$productsIds= array();
//Retrieve products id if button and store query results
if($stored[$theme]['isOnClick']){
$queryDom='button'.$stored[$theme]['class'].'[onclick*="/checkout/cart/add/"]';
$results = $dom->query($queryDom);
foreach ($results as $result) {
preg_match("/checkout\/cart\/add.+\/([0-9]+)\//",$result->getAttribute('onclick'),$currentProdId);
$elArray[$currentProdId[0]]=$result;
$productsIds[]=$currentProdId[0];
}
}
//Retrieve products id if form, run another query to find button and store query results
else{
$queryDom='form'.$stored[$theme]['formId'].'[action*="/checkout/cart/add/"]';
$results = $dom->query($queryDom);
foreach ($results as $result) {
preg_match("/checkout\/cart\/add.+\/([0-9]+)\//",$result->getAttribute('action'),$currentProdId);
if($currentProdId[0] && is_numeric($currentProdId[0])){
$productsIds[]=$currentProdId[0];
$formDOM = new Zend_Dom_Query($result);
$formButton = $dom->query('button'.$stored[$theme]['class']);
foreach($formButton as $child){
$elArray[$currentProdId[0]]=$child;
}
}
}
}
//Retrieve info from table
$collection = Mage::getModel('razorphyn_country/product')->getCollection()
->addFieldToFilter('active', 1)
->addAttributeToFilter('productId', array('in' => $productsIds));
$res = $collection->getFirstItem();
$country = Mage::getSingleton('core/session')->getCustomerCountry;
//Replace items
if(isset($res->allowed)){
foreach($collection as $res){
if(isset($res->allowed) && (($res->allowed==0 && strpos($res->country, $country) !== false) || ($res->allowed==1 && strpos($res->country, $country) === false))){
$document = $document->replaceChild($node,$elArray[$res->productId]);
}
}
}
//Return edited html
$html = $document->saveHTML();
$transport->setHtml($html);
You realise you are not selecting an element but creating one with createElement()?
I'm not sure where you want to place the paragraph, but let's say in the div.category-products. So let's try something like this;
$transport = $observer->getTransport();
$html = $transport->getHtml();
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
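$nodes = $xpath->query('//div[@class="category-products"]');
foreach($nodes as $node) {
$newNode = $dom->createElement("p", "This product isn't available in your country.");
// nextSibling is a property, not a method, and the reference node passed to
// insertBefore() must be a child of $node, so insert as the first child instead
$node->insertBefore($newNode, $node->firstChild);
}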
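If the goal is to swap the "Add to Cart" button for the notice rather than add a paragraph to the container, a sketch along the following lines may help. As far as I can tell, Zend_Dom_Query::getDocument() in ZF1 returns the markup string rather than a DOMDocument, which would explain the non-object error, so this example uses DOMDocument directly. The sample markup is a cut-down assumption based on the page in the question, and the btn-cart selector is likewise an assumption.

```php
<?php
// Sample markup is an assumption modelled on the product grid in the question.
$html = '<div class="category-products"><ul class="products-grid"><li class="item first">'
      . '<h2 class="product-name">a</h2>'
      . '<div class="actions"><button type="button" class="button btn-cart">'
      . '<span>Add to Cart</span></button></div>'
      . '</li></ul></div>';

$previous = libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($dom);
$buttons = $xpath->query(
    "//button[contains(concat(' ', normalize-space(@class), ' '), ' btn-cart ')]"
);

// Copy the results to an array as a precaution before modifying the tree.
foreach (iterator_to_array($buttons) as $button) {
    $notice = $dom->createElement('p', 'This product is not available in your country.');
    // replaceChild() must be called on the parent of the node being replaced.
    $button->parentNode->replaceChild($notice, $button);
}

$result = $dom->saveHTML();
echo $result;
```

Note that saveHTML() on a document loaded this way wraps the fragment in a doctype and html/body tags, so the output may need trimming before it is handed back to the transport object.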