PHP Simple HTML DOM Parser: find string with any character - php

I have this html
<div class="price-box">
<p class="old-price">
<span class="price-label">This:</span>
<span class="price" id="old-price-326">
8,69 € </span>
</p>
<p class="special-price">
<span class="price-label">This is:</span>
<span class="price" id="product-price-326">
1,99 € </span> <span style="">/ 6.87 </span>
</p>
</div>
I need to get "1,99 €", but the number in the id 'product-price-326' is randomly generated. How can I find 'product-price-*'? I tried
foreach($preke->find('span[id="product-price-[0-9]"]') as $div)
and
foreach($preke->find('span[id="product-price-"]') as $div)
but it doesn't work.

As per my comment, here's what you need to do:
foreach($preke->find('span[id^="product-price-"]') as $div) {} // note the ^ before the =
^= means starts with.

I am not sure what $preke is, but if it's a DOM object that supports CSS attribute selectors you can use
$preke->find('span[id^="product-price"]')
or
$preke->find('span[id*="product-price"]')
The ^= tells it to look for elements whose ID starts with "product-price", and the *= tells it to look for elements whose ID contains "product-price".
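For readers without simple_html_dom at hand, the same "starts with" idea can be sketched with PHP's built-in DOM extension, where XPath's starts-with() plays the role of ^= and contains() the role of *=. This is a swapped-in technique, not the library used in the question; the variable names and the trimmed-down copy of the HTML are assumptions made for illustration.

```php
<?php
// Trimmed copy of the HTML from the question (assumption: UTF-8 input).
$html = '<div class="price-box">
<p class="old-price"><span class="price" id="old-price-326"> 8,69 € </span></p>
<p class="special-price"><span class="price" id="product-price-326"> 1,99 € </span> <span style="">/ 6.87 </span></p>
</div>';

libxml_use_internal_errors(true);      // keep parser warnings out of the output
$doc = new DOMDocument();
// The XML declaration prefix is a common hint so that libxml reads the input as UTF-8.
$doc->loadHTML('<?xml encoding="UTF-8">' . $html);
$xpath = new DOMXPath($doc);

// starts-with(@id, ...) corresponds to span[id^="product-price-"].
$prices = [];
foreach ($xpath->query('//span[starts-with(@id, "product-price-")]') as $span) {
    $prices[] = trim($span->nodeValue);
}

print_r($prices); // should list only the special price, not the old one
```

Replacing starts-with with contains in the query would mirror the *= variant.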

Try it like this; it might work:
foreach($preke->find('span[id^="product-price-"]') as $div) { /* Code */ }

Why not get it using the class?
echo $preke->find('.special-price', 0)->find('.price', 0)->plaintext;
This will get you 1,99 €.

Related

Get the content from HTML tags

I'm trying to get the content from the html tags
function get_model($html){
return preg_match('!<b>Model:</b>(.*?)<br>!i', $html, $matches) ? $matches[1] : '';
}
But, it returns "" string.
The entire html code looks like:
<div class="prodInfo">
<div class="prodOptions">
<div class="redBtn">
-
<input type="text" class="tnyTxt" value="1" name="quantity"/>
+
</div>
<br/>
<a href="/0-30cb9a-adjustable-pan-connector-p-mw555"
onclick="addToCart(139, $('.tnyTxt').val() ); return false;" class="redBtn"
id="button-cart">Add to Cart</a>
</div>
<p>
<b>Our Price: <span class="price">£5.55</span></b><br/>
<span class="grey">
(Exc. 20% VAT)<br/>
(£6.66 Inc. VAT)
</span>
</p>
<p>
<b>Model:</b> MW555<br/>
<b>Availability:</b> 2 - 3 Days</p>
</div>
I don't quite understand why this happens. Even if I write preg_match('!<b>Model:</b>) it also returns an empty result. Could you help me, please?
Please use PHP Simple HTML DOM Parser for this.
This question also has a duplicate:
How parse HTML in PHP?
I would suggest using phpQuery for this job.
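One visible mismatch in the question is that the pattern expects <br> while the markup shown uses <br/>. A minimal sketch of a more tolerant pattern follows; it assumes the fetched HTML really matches the snippet shown in the question, which may not be the case if even the shorter pattern fails. A DOM parser, as suggested above, remains the more robust route.

```php
<?php
// Same function name as in the question; the pattern now accepts <br>, <br/> and <br />.
function get_model(string $html): string
{
    // \s* skips whitespace around the value; the "s" flag lets . span line breaks.
    $pattern = '!<b>Model:</b>\s*(.*?)\s*<br\s*/?>!is';
    return preg_match($pattern, $html, $matches) ? $matches[1] : '';
}

// Excerpt of the HTML from the question.
$html = "<p>\n<b>Model:</b> MW555<br/>\n<b>Availability:</b> 2 - 3 Days</p>";

echo get_model($html), PHP_EOL; // prints "MW555"
```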

Remove the matching closing </span> when the expression contains more than one

I have a very large piece of code that does not work well with one specific HTML structure.
The HTML looks like this:
<span class="*0">
<span class="*1">TEXT</span>
...
<span class="*2">TEXT</span>
</span>
There is a regular expression:
$mstr = '#<span class="0">(.*?)</span>#';
What is needed:
Cut out the outer span (<span class="*0">) together with its correct closing tag.
My regular expression stops at the first closing tag it finds :(
Here is a solution. I don't know if it fits your needs, but it does the job. It simply looks for all the starting tags and closing tags, stores their substring positions and pairs them. Then it removes the tag with the class you need.
One note: if a tag is not properly closed, this could fail. So I would suggest you build in some safety measures.
$start_pos=stripos($var,'<span class="*0">');
$len=strlen($var);
$str_len=strlen('<span class="*0">');
$offset=0;
do{
$p=stripos($var,'<span',$offset);
if($p===false){break;}
$open_pos[]=$p;
$offset=$p+1;
}while($offset<$len);
$offset=0;
do{
$p=stripos($var,'</span>',$offset);
if($p===false){break;}
$close_pos[]=$p;
$offset=$p+1;
}while($offset<$len);
$t=0;
do{
$change=false;
for($i=0;$i<count($open_pos)-1;$i++){
foreach($close_pos as $k=>$v){
if($open_pos[$i+1]>$v){
if($open_pos[$i]==$start_pos){
$end_pos=$v;
break 3;
}
unset($open_pos[$i],$close_pos[$k]);
$open_pos=array_values($open_pos);
$close_pos=array_values($close_pos);
$change=true;
break 2;
}
}
}
if($open_pos[$i]!=$start_pos){
unset($open_pos[$i],$close_pos[0]);
$open_pos=array_values($open_pos);
$close_pos=array_values($close_pos);
$change=true;
}
else{
$end_pos=$close_pos[0];
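break; // only the outer do-while encloses this line, so a plain break is enough ("break 3" is a fatal error here)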
}
if(count($open_pos)<2)break;
$t++;
}while($t<1000);
$var=substr_replace($var,'###',$end_pos,7);
$var=substr_replace($var,'###',$start_pos,$str_len);
echo $var;
Tested on this beautiful HTML:
$var='<span class="*A">a
<span class="*B">b
<span class="*E">e</span>
<span class="*C">c
<span class="*D">d
<span class="*E">e</span>
<span class="*0">BEFORE THIS ONE
<span class="*F">a</span>
<span class="*G">g
<span class="*H">h
<span class="*J">j</span>
</span>
<span class="*K">k</span>
<span class="*L">l</span>
<span class="*M">m</span>
_GGG</span>
<span class="*N">n</span>
BETWEEN</span>BETWEEN
<span class="*O">o
<span class="*P">p</span>
_OOO</span>
</span>
_CCC</span>
<span class="*Q">q
<span class="*R">r</span>
_RRR</span>
</span>
</span>
';
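As a more compact alternative to pairing positions, the matching closing tag can also be found by counting nesting depth while scanning forward from the target opening tag. The sketch below is not the answer's method; unwrap_span is a hypothetical helper name, and it assumes the spans are properly nested and that the opening tag is passed exactly as it appears in the markup.

```php
<?php
// Hypothetical helper: removes $openTag and its matching </span>, keeping the content in between.
function unwrap_span(string $html, string $openTag): string
{
    $start = stripos($html, $openTag);
    if ($start === false) {
        return $html; // target tag not present
    }

    // Collect every <span ...> and </span> from the target tag onwards, with byte offsets.
    preg_match_all('~<span\b[^>]*>|</span>~i', $html, $m, PREG_OFFSET_CAPTURE, $start);

    $depth = 0;
    foreach ($m[0] as [$tag, $pos]) {
        if ($tag[1] !== '/') {   // opening tag: go one level deeper
            $depth++;
            continue;
        }
        $depth--;                // closing tag: come back up one level
        if ($depth === 0) {
            // Remove the closing tag first so the earlier offset stays valid.
            $html = substr_replace($html, '', $pos, strlen($tag));
            return substr_replace($html, '', $start, strlen($openTag));
        }
    }
    return $html; // unbalanced markup: leave the input untouched
}

$in = '<span class="*0"><span class="*1">A</span><span class="*2">B</span></span>';
echo unwrap_span($in, '<span class="*0">'), PHP_EOL;
// prints: <span class="*1">A</span><span class="*2">B</span>
```

Returning the input unchanged on unbalanced markup is one possible safety measure; raising an error instead would be an equally reasonable choice.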

Store the values of nested DOMNodes in a PHP array

I have the following html structure:
<span class="1">
<span class="name">
</span>
<span class="books">
<span class="english">
</span>
<span class="english">
</span>
</span>
</span>
<span class="2">
<span class="name">
</span>
<span class="books">
<span class="english">
</span>
<span class="english">
</span>
</span>
</span>
...
I am using the following function to retrieve it:
$oDomObject = $oDomXpath->query("//span[number(@class)=number(@class)]");
How can I store the values in a PHP array keeping the nesting order?
foreach ($oDomObject as $oObject) {
..*SOMETHING*..
}
Thank you for your help!
You will want to build a recursive function that resembles the following.
WARNING: Untested and may require some tweaking. But this should put your head in the right place.
foreach ($oDomObject as $oObject) {
$myArray[] = getChildren($oObject);
}
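function getChildren($nodeObj) {
    $retArray = array();
    foreach ($nodeObj->childNodes as $child) {
        if ($child->nodeType !== XML_ELEMENT_NODE) {
            continue; // skip text and whitespace nodes
        }
        // Recurse when the child contains further elements; otherwise store its text.
        $hasElements = $child->getElementsByTagName('*')->length > 0;
        $retArray[] = $hasElements ? getChildren($child) : $child->nodeValue;
    }
    return $retArray;
}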
What it does: If it encounters a node without children, it appends the value to the array. If not, it appends an array of the children's values to the array. This occurs ad nauseam, and as deeply as you can wrap your head around.
Things to think about:
What do I want my array to look like when this finishes? With certain levels of depth, this gets very ridiculous and very annoying to traverse.
Why am I appending to an array, which I am likely to loop through again, instead of handling the desired operation right now?
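To make the first question above concrete, here is a small self-contained sketch that keeps the class name next to each value, so the result is easier to traverse. The helper name nodeToArray, the array layout, and the sample texts ("Ann", "A", "B") are all made up for illustration; the question's HTML has empty spans.

```php
<?php
// Hypothetical helper: turns the element children of $node into a nested array.
function nodeToArray(DOMNode $node): array
{
    $out = [];
    foreach ($node->childNodes as $child) {
        if ($child->nodeType !== XML_ELEMENT_NODE) {
            continue; // ignore whitespace and text nodes between elements
        }
        $hasElements = $child->getElementsByTagName('*')->length > 0;
        $out[] = [
            'class' => $child->getAttribute('class'),
            'value' => $hasElements ? nodeToArray($child) : trim($child->nodeValue),
        ];
    }
    return $out;
}

// Sample markup shaped like the question's structure, with invented text.
$html = '<span class="1"><span class="name">Ann</span><span class="books">'
      . '<span class="english">A</span><span class="english">B</span></span></span>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$result = [];
// number(@class) = number(@class) is only true for numeric class values (NaN never equals NaN).
foreach ($xpath->query('//span[number(@class)=number(@class)]') as $span) {
    $result[$span->getAttribute('class')] = nodeToArray($span);
}

print_r($result);
```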

Find all elements except for those with certain class with simple_html_dom.php

I am using the simple_html_dom parser and I want to fetch data from html code that looks like this:
<pre class="root">
<span class="B bgB"></span>
<span class="B bgB"></span>
<span class="B bgB"></span>
<span class="B bgB"></span>
<span class="W"></span>
<span class="Y DH"> </span>
<span class="Y DH">Some text</span>
</pre>
etc..
But I only want to get the content from the ones without the bgB class. So far I have this code:
$elements = $html->find('pre.root span[class!=bgB]');
But all spans are fetched and later printed, not only the ones without the bgB class. How can I accomplish this?
It can't be done with simple_html_dom, but if you switch to this one you can use the CSS :not pseudo-class:
$html = str_get_html($str);
$elements = $html->find('pre.root span:not(.bgB)');
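If switching libraries is not an option, a comparable filter can be sketched with PHP's built-in DOMXPath. This is a different technique from the one in the answer, shown only as an assumption-laden alternative: the concat/normalize-space idiom matches a whole class token, so "bgB" is excluded without also excluding classes that merely contain those letters.

```php
<?php
// HTML from the question, condensed onto fewer lines.
$str = '<pre class="root"><span class="B bgB"></span><span class="B bgB"></span>'
     . '<span class="W"></span><span class="Y DH"> </span><span class="Y DH">Some text</span></pre>';

$doc = new DOMDocument();
$doc->loadHTML($str);
$xpath = new DOMXPath($doc);

// Spans inside pre.root whose class list does not include the token "bgB".
$query = '//pre[contains(concat(" ", normalize-space(@class), " "), " root ")]'
       . '//span[not(contains(concat(" ", normalize-space(@class), " "), " bgB "))]';

$texts = [];
foreach ($xpath->query($query) as $span) {
    $texts[] = $span->nodeValue;
}

print_r($texts); // three entries; the last one is "Some text"
```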

Capturing Words in HTML tag

I want to know the most optimized regular expression for capturing keywords in a piece of HTML.
Note that I am using PHP.
I have a piece of HTML code like this:
...
<li><span class="fl">
Dish</span>
<div class="oflow">
<span class="1F4446484E1FCB4FC3C21FC04AC6C21E232020211F underline">
pasta</span>
, <span class="1F4446484E1FCB4FC3C21FC04AC6C21E23202A251F underline">
rice</span>
, <span class="1F4446484E1FCB4FC3C21FC04AC6C21E2320202B1F underline">
potatoes</span>
</div>
</li>
...
I want to select the available dishes (pasta, rice and potatoes), knowing that the only word that is always the same is "Dish" and that each keyword I want to retrieve is always wrapped in its own span.
Thank you in advance.
<?php
$aDishes = explode(',', strip_tags($sHtml)); // "var" is only valid inside a class, so it is dropped here
?>
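Note that strip_tags on the whole snippet also keeps the "Dish" label and surrounding whitespace in the first element. A minimal sketch that targets only the keyword spans, using PHP's built-in DOM extension, could look like the following; the shortened class names and the wrapping <ul> are assumptions added to keep the sample small and well-formed.

```php
<?php
// Shortened version of the HTML from the question (class names abbreviated).
$sHtml = '<ul><li><span class="fl">
Dish</span>
<div class="oflow">
<span class="a underline">
pasta</span>
, <span class="b underline">
rice</span>
, <span class="c underline">
potatoes</span>
</div>
</li></ul>';

$doc = new DOMDocument();
$doc->loadHTML($sHtml);
$xpath = new DOMXPath($doc);

// Find the <li> whose label span reads "Dish", then take the spans inside its <div>.
$aDishes = [];
foreach ($xpath->query('//li[span[normalize-space(.)="Dish"]]/div/span') as $span) {
    $aDishes[] = trim($span->nodeValue);
}

print_r($aDishes); // pasta, rice, potatoes
```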
