Exclude a group regular expression - php

I searched many threads on Stack Overflow but didn't find anything relevant to my case.
Here is the source text:
<span class="red"><span>70</span><span style="display:none">1</span><span>,89</span> € TTC<br /></span>
I want to extract 70,89 with a regular expression.
So I tried :
<span class="red"><span>([0-9]+)(<\/span><span style="display:none">1<\/span><span>)(,[0-9]+)<\/span>
which returns an array (with preg_match_all in PHP) with 3 groups :
1/ 70
2/
</span><span style="display:none">1</span><span>
3/ ,89
I would like to exclude group 2 and merge 1 & 3.
So I also tried :
<span class="red"><span>([0-9]+)(?:<\/span><span style="display:none">1<\/span><span>)(,[0-9]+)<\/span>
but it returns :
70
,89
How can I merge the two groups?
Thanks a lot for your answers; I am going crazy searching for this regular expression! :)
Have a good day!

Just match the numbers that are wrapped in a plain <span> (the hidden one carries a style attribute, so it is skipped) and join the matches:
$str = '<span class="red"><span>70</span><span style="display:none">1</span><span>,89</span> € TTC<br /></span>';
if (preg_match_all('#<span>([,\d]+)</span>#', $str, $matches)) {
    echo join('', $matches[1]);
}
// output: 70,89
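If you would rather keep the two capture groups from the question: a regex cannot merge two non-adjacent groups into one, so the usual workaround is to concatenate them in PHP. A minimal sketch, assuming the markup always has exactly the shape shown in the question ($pattern, $m and $price are illustrative names, not part of the original code):

```php
<?php
$str = '<span class="red"><span>70</span><span style="display:none">1</span><span>,89</span> € TTC<br /></span>';

// Same idea as the question's second attempt: the hidden span is matched
// but not captured, so only the two visible parts end up in groups 1 and 2.
$pattern = '#<span class="red"><span>([0-9]+)</span>'
         . '<span style="display:none">1</span>'
         . '<span>(,[0-9]+)</span>#';

$price = null;
if (preg_match($pattern, $str, $m)) {
    // "Merging" the groups is plain string concatenation.
    $price = $m[1] . $m[2];
}

echo $price; // 70,89
```

This is stricter than matching every plain <span>, which may be preferable if the page contains other plain spans with digits.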

Related

How do I echo only the values found inside specific href attributes of a string?

[PHP] I have a variable that stores a very large page source as a string. I want to echo only the interesting parts (dozens of them, which I need for a project), and they sit inside the quotation marks of the href attribute of <a> tags.
I only want to capture the values that start with the letter n (news), for example:
<a href="/news7044449/exclusive_news_sunday_"
That is, I think the match has to start from: <a href="/n
How do I strip everything else from the variable so that the echo shows only those values?
Note that there are other href attributes whose values start with other letters, such as 'p' in href="/profiles... (these do not interest me).
$string = '</div><span class="news-hd-mark">HD</span></div><p>exclusive_news_sunday_</p><p class="metadata"><span class="bg">Czech AV<span class="mobile-hide"> - 5.4M Views</span>
- <span class="duration">7 min</span></span></p></div><script>xv.thumbs.preparenews(7044449);</script>
<div id="news_31720715" class="thumb-block "><div class="thumb-inside"><div class="thumb"><a href="/news31720715/my_sister_running_every_single_morning"><img src="https://static-hw.xnewss.com/img/lightbox/lightbox-blank.gif"';
I imagine something like this:
$removes_everything_except_values_from_the_href_tag_starting_with_the_letter_n = ('/something regex expresion I think /' or preg_match, substring?);
echo $string = str_replace($removes_everything_except_values_from_the_href_tag_starting_with_the_letter_n,'',$string);
expected output: /news7044449/exclusive_news_sunday_
NOTE: the source does not have to be a variable; the text could just as well be read from a .txt file.
Thanks.
I believe this will help.
<?php
$source = file_get_contents("code.html");
preg_match_all("/<a href=\"(\/n(?:.+?))\"[^>]*>/", $source, $results);
var_export( end($results) );
To get just the links out of the $results array from Valdeir's answer, loop over the first capture group ($results itself is an array of arrays):
foreach ($results[1] as $r) {
    echo $r;
    // alt: to display them with an HTML break tag after each one
    // echo $r . "<br>\n";
}
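The href extraction above can be tried without a code.html file. The following is only a sketch: the $source string is a made-up, shortened sample in the spirit of the question, and the pattern uses [^"]* instead of .+? so that a match can never run past the closing quote of the attribute.

```php
<?php
// Hypothetical sample: two /news links and one /profiles link.
$source = '<a href="/news7044449/exclusive_news_sunday_"><img src="a.gif"></a>'
        . '<a href="/profiles/someone">profile</a>'
        . '<a href="/news31720715/my_sister_running_every_single_morning" class="x">x</a>';

// Capture only href values that start with "/n".
preg_match_all('~<a href="(/n[^"]*)"~', $source, $results);

// $results[0] holds the full matches, $results[1] the captured href values.
$links = $results[1];

foreach ($links as $link) {
    echo $link, "\n";
}
```

Because the /profiles link does not start with /n, it is not captured.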

php regex preg_match find multiple values then Preg_match_all [duplicate]

This question already has answers here: How do you parse and process HTML/XML in PHP? (31 answers)
I have to find values between specific tags in an HTML page with a PHP regex, but I only want to run preg_match_all if the page contains multiple values; otherwise, do nothing.
For example, if preg_match finds a block with 4 values, run preg_match_all in the next phase; if it finds a block with only 1 value, do nothing.
<td class="page">
<span class="my-tag">value1</span>
<span class="my-tag">value2</span>
<span class="my-tag">value3</span>
<span class="my-tag">value4</span>
</td>
preg_match('/<td class="page">(.*?)<\/td>/s', $html, $matches);
Now run preg_match_all in the next phase, because the block contains 4 values:
preg_match_all('|\<span class="my-tag"\>(.*?)\</span\>|', $html, $string);
And if the HTML contains only 1 value, like this:
<td class="page">
<span class="my-tag">value1</span>
</td>
So if the HTML contains only 1 value, do nothing.
Basically, from your preg_match, you would get back a $matches array that looks like this:
Array
(
[0] => <td class="page">
<span class="my-tag">value1</span>
<span class="my-tag">value2</span>
<span class="my-tag">value3</span>
<span class="my-tag">value4</span>
</td>
[1] =>
<span class="my-tag">value1</span>
<span class="my-tag">value2</span>
<span class="my-tag">value3</span>
<span class="my-tag">value4</span>
)
With that, we can just go ahead and do the match - regardless of if it only found one match or multiple matches. (Because it's not going to hurt anything to match one item or four items, I am proposing to move the logic down in your code.) Then we can just count how many it found and store that in a variable named $count.
// CHECK TO SEE IF WE FOUND A MATCH
if (isset($matches[1])) {
    // GO AHEAD AND DO THE MATCH ON THE SPANS INSIDE THE CAPTURED BLOCK
    preg_match_all('~<span class="my-tag">(.*?)</span>~s', $matches[1], $span_matches);
    $count = count($span_matches[1]);
    // IF WE FOUND MULTIPLE MATCHES, LIST THEM OUT
    if ($count > 1) {
        print 'COUNT IS: '.$count;
        print_r($span_matches[1]);
    }
    // WE DID NOT MATCH ANY SPAN TAGS
    elseif ($count == 0) {
        print 'COUNT IS ZERO - CRAP';
    }
    // IF WE ONLY FOUND ONE MATCH, WE DON'T NEED TO DO ANYTHING
    else {
        print 'COUNT IS EXACTLY 1 - DO NOTHING';
    }
}
// WE DID NOT FIND AN INITIAL MATCH TO BEGIN WITH
else {
    print 'WE DID NOT FIND A MATCH';
}
From there, it's just a simple if/else statement to do what you want with it.
Here is a working demo:
http://ideone.com/SiPiOx
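The two-step check above can also be wrapped in a small function. This is only a sketch under the assumptions of the question (a single <td class="page"> block containing <span class="my-tag"> elements); the name extractMultipleValues is made up for illustration.

```php
<?php
// Hypothetical helper: returns the span values only when there is more
// than one of them, and an empty array otherwise ("do nothing").
function extractMultipleValues(string $html): array
{
    // Phase 1: isolate the table cell.
    if (!preg_match('~<td class="page">(.*?)</td>~s', $html, $cell)) {
        return [];
    }
    // Phase 2: collect every tagged span inside that cell.
    preg_match_all('~<span class="my-tag">(.*?)</span>~s', $cell[1], $spans);

    return count($spans[1]) > 1 ? $spans[1] : [];
}

$many = extractMultipleValues(
    '<td class="page">
       <span class="my-tag">value1</span>
       <span class="my-tag">value2</span>
     </td>'
);
$single = extractMultipleValues(
    '<td class="page"><span class="my-tag">value1</span></td>'
);

print_r($many);   // two values
print_r($single); // empty array
```

Returning an empty array for both "no match" and "exactly one match" keeps the calling code to a single emptiness check; if the two cases need to be told apart, the counting logic from the answer is the better fit.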
You can combine multiple patterns in a single regex with the pipe character in parentheses:
preg_match('/(cats?|dogs?|re.*tion)/', $string, $matches);

PHP regex preg_match numbers before a multiword string

I am trying to extract the number 203 from this sample.
Here is the sample I am running the regex against:
<span class="crAvgStars" style="white-space:no-wrap;"><span class="asinReviewsSummary" name="B00KFQ04CI" ref="cm_cr_if_acr_cm_cr_acr_pop_" getargs="{"tag":"","linkCode":"sp1"}">
<img src="https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/customer-reviews/ratings/stars-4-5._CB192238104_.gif" width="55" alt="4.3 out of 5 stars" align="absbottom" title="4.3 out of 5 stars" height="12" border="0" /> </span>(203 customer reviews)</span>
Here is the code I am using that does not work
preg_match('/^\D*(\d+)customer reviews.*$/',$results[0], $clean_results);
echo "<pre>";
print_r( $clean_results);
echo "</pre>";
//expecting 203
It is just returning
<pre>array ()</pre>
Your regexp has two problems.
First, there are other numbers in the string before the number of customer reviews (like 4.3 out of 5 stars and height="12"), but \D* prevents matching that -- it only matches if there are no digits anywhere between the beginning of the string and the number of reviews.
Second, you have no space between (\d+) and customer reviews, but the input string has a space there.
There's no need to match any of the string before and after the part that contains the number of customer reviews, just match the part you care about.
preg_match('/(\d+) customer reviews/',$results[0], $clean_results);
$num_reviews = $clean_results[1];
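Note that $clean_results[1] only exists when the pattern matched. A hedged sketch that guards against a missing match and casts the result to an integer (the function name countReviews and the shortened sample string are illustrative, not from the original code):

```php
<?php
// Hypothetical helper: returns the review count, or null when the
// "N customer reviews" text is not present.
function countReviews(string $html): ?int
{
    if (preg_match('/(\d+) customer reviews/', $html, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Shortened copy of the sample from the question; it still contains the
// earlier digits (4.3, 5, 12) that tripped up the \D* version.
$sample = '<img alt="4.3 out of 5 stars" title="4.3 out of 5 stars" height="12" border="0" /> </span>(203 customer reviews)</span>';

$found   = countReviews($sample);
$missing = countReviews('no reviews yet');

var_dump($found);   // int(203)
var_dump($missing); // NULL
```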

PHP preg_replace numeric backreference not working

I am trying to wrap numbers inside a given string in a <span>.
$datitle = "hey hey 13";
$datitle = preg_replace('/[0-9]/', '<span class="title-number">$1</span>', $datitle);
However that returns:
hey hey <span class="title-number"></span><span class="title-number"></span>
With two empty spans, without the number inside.
What I want to get is:
hey hey <span class="title-number">13</span>
How do I use the number matched by preg_replace as a backreference?
First of all, /[0-9]/ matches a single digit from 0 to 9. This means 1 matches your regexp and 3 matches your regexp, but not 13 as a whole.
Second, parts of the pattern that are not wrapped in () are not captured, so $1 is empty. The full match, however, is always available as $0.
So the proper code is:
$datitle = "hey hey 13";
$datitle = preg_replace('/([0-9]+)/', '<span class="title-number">$1</span>', $datitle);
echo $datitle; // hey hey <span class="title-number">13</span>
Or:
$datitle = "hey hey 13";
$datitle = preg_replace('/[0-9]+/', '<span class="title-number">$0</span>', $datitle);
echo $datitle; // hey hey <span class="title-number">13</span>
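One related pitfall: when a backreference is immediately followed by a digit in the replacement string, PHP needs the braced form ${1} to tell group 1 apart from group 10. The sketch below puts the three cases side by side; the variable names and the <b> wrapper are only for illustration.

```php
<?php
$title = 'hey hey 13';

// No capture group: $1 is empty, and each digit is replaced separately.
$noGroup = preg_replace('/[0-9]/', '<b>$1</b>', $title);

// $0 always refers to the whole match, no parentheses needed.
$wholeMatch = preg_replace('/[0-9]+/', '<b>$0</b>', $title);

// ${1}0 means "group 1 followed by a literal 0"; a bare $10 would be
// read as a reference to group 10.
$braced = preg_replace('/([0-9]+)/', '${1}0', $title);

echo $noGroup, "\n";    // hey hey <b></b><b></b>
echo $wholeMatch, "\n"; // hey hey <b>13</b>
echo $braced, "\n";     // hey hey 130
```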

prevent line break after specific word (php)

I have a long text and I would like to prevent a line break after specific keywords, let's say 'Mr.', 'the', 'an'. The only problem is that I do not know which word will follow the keyword.
So if I have a text like:
... there is an elephant in the room ...
the script should change it to:
... there is <span class="no-wrap">an elephant</span> in <span class="no-wrap">the room</span> ...
I know that it should be done with a regular expression of some sort, but I am really bad at those. Any tips on how to do this in PHP?
Capture one of the strings Mr., the, or an together with the following word in a group.
(\b(?:Mr\.|the|an)\h+\S+)
Replacement string:
<span class="no-wrap">$1</span>
Code:
<?php
$string = "... there is an elephant in the room ...";
echo preg_replace('~(\b(?:Mr\.|the|an)\h+\S+)~', '<span class="no-wrap">$1</span>', $string);
?>
Output:
... there is <span class="no-wrap">an elephant</span> in <span class="no-wrap">the room</span> ...
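If the keyword list changes often, the alternation can be built from an array instead of being typed by hand. This is a sketch rather than a drop-in solution: wrapNoWrap is a made-up name, and preg_quote() is used so that characters such as the dot in 'Mr.' are escaped automatically.

```php
<?php
// Hypothetical helper: wraps each keyword plus the following word in a
// no-wrap span. $keywords is assumed to contain plain strings, not regexes.
function wrapNoWrap(string $text, array $keywords): string
{
    // Escape regex metacharacters (and the ~ delimiter) in every keyword.
    $escaped = array_map(function ($k) {
        return preg_quote($k, '~');
    }, $keywords);

    // Keyword, horizontal whitespace, then the next run of non-whitespace.
    $pattern = '~\b(?:' . implode('|', $escaped) . ')\h+\S+~';

    // $0 is the whole match, so no extra capture group is needed.
    return preg_replace($pattern, '<span class="no-wrap">$0</span>', $text);
}

$wrapped = wrapNoWrap('... there is an elephant in the room ...', ['Mr.', 'the', 'an']);
echo $wrapped;
```

With the sample sentence this produces the same output as the hand-written pattern above.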
