Preg_match_all, word with random text - php

I'm having trouble with preg_match: I want to extract a div using preg_match_all.
Div example: <div class="post_message_(RANDOM NUMBER)">
I've tried <div\s+class="post_message_/\w+/"\s*> to detect the number sequence after the keyword and pass it to preg_match_all.
But <div\s+class="post_message_/\w+/"\s*> is not working. Does anyone know how to do this?
Regards,

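Assuming you want a random number (and not a random mix of digits and letters), the pattern below captures the 123456 in $matches[1]. The slashes around \w+ in your attempt are the problem: inside the pattern they either end it early (if / is your delimiter) or match literal slashes. Use a capture group instead: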
$re = "/<div\\s+class=\"post_message_([\\d]+)\"\\s*>/";
$str = "<div class=\"post_message_123456\">";
preg_match($re, $str, $matches);
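Since the question asks about preg_match_all, here is a minimal sketch of the same pattern applied to several divs at once. The sample markup in $html is made up for illustration and is not from the question.

```php
<?php
// Hypothetical sample markup containing two post_message_ divs.
$html = '<div class="post_message_123456">a</div>'
      . '<div class="post_message_789">b</div>';

// Group 1 captures the digits that follow "post_message_".
preg_match_all('/<div\s+class="post_message_(\d+)"\s*>/', $html, $matches);

// $matches[0] holds the full opening tags, $matches[1] the captured numbers.
print_r($matches[1]);
```

If the suffix can contain letters as well, swapping \d+ for \w+ in the group should be enough.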

Related

Can this be solved with a regular expression?

I am trying to extract the numbers that are run together in this string.
110.0046102.005699.0008103.0104....
Each number has exactly 4 digits after the dot (point/period), so I want to get:
110.0046
102.0056
99.0008
103.0104
I was wondering if this is possible with a regular expression or if I should use another way.
// replace the variable $numbers with your numbers
$numbers = "110.0046102.005699.0008103.0104";
preg_match_all("#\d+\.\d{4}#", $numbers, $matches);
var_dump($matches); // outputting all matches
https://regex101.com/r/oG1dK1/1 -> you can see the regex in action here. The numbers are in the box MATCH INFORMATION on the right.
Try this regex:
(\d{1,}\.\d{4})
Demo here: https://regex101.com/r/uJ1wU6/1
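Both answers rely on the same idea: \d+ greedily takes the integer part, and \d{4} stops after exactly four decimals, which is where the next number begins. A small sketch of what ends up in the result array (the variable name $values is just for illustration):

```php
<?php
$numbers = "110.0046102.005699.0008103.0104";

// One or more digits, a literal dot, then exactly four digits.
preg_match_all('/\d+\.\d{4}/', $numbers, $matches);

// With no capture groups, the full matches are in $matches[0].
$values = $matches[0];
print_r($values);
```

This only works because every fractional part has exactly four digits; with a variable number of decimals the string could not be split unambiguously.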

preg_replace with Regex - find number-sequence in URL

I'm a regex newbie, so sorry for this "simple" question:
I've got a URL like the following:
http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx
What I'm trying to achieve is getting the number sequence (aka Job-ID) right before the ".aspx" with preg_replace.
I've already figured out that the regex for finding it could be
(?!.*-).*(?=\.)
Now preg_replace needs the opposite of that regular expression. How can I achieve that? Also worth mentioning:
The URL can have multiple numbers in it. I only need the sequence right before ".aspx". Also, there could be some query parameters after the ".aspx", like "&mobile=true".
Thank you for your answers!
You can use:
$re = '/[^-.]+(?=\.aspx)/i';
preg_match($re, $input, $matches);
//=> 146370543
This matches a run of characters that are neither hyphens nor dots and that is immediately followed by .aspx, which is checked with the lookahead (?=\.aspx).
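To see how the lookahead copes with the two complications from the question (other numbers in the URL and parameters after .aspx), here is a sketch with a made-up URL; example.com and the path are placeholders, not the real site.

```php
<?php
// Hypothetical URL: an extra number in the path and a parameter after .aspx.
$input = 'http://example.com/Job-2-Mainz-146370543.aspx&mobile=true';

// A run without hyphens or dots that is directly followed by ".aspx".
$re = '/[^-.]+(?=\.aspx)/i';

$jobId = null;
if (preg_match($re, $input, $matches)) {
    // The lookahead is not part of the match, so only the ID is returned.
    $jobId = $matches[0];
}
echo $jobId;
```

The earlier "2" is skipped because it is followed by a hyphen rather than .aspx, and the trailing parameter does not matter because the pattern is not anchored to the end of the string.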
You can just use preg_match (you don't need preg_replace, as you don't want to change the original string) and capture the number directly before the .aspx. The simplest way I could think of is:
<?php
$string = "http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx";
// No $ anchor after \.aspx, so parameters such as &mobile=true may follow.
$regex = '/([0-9]+)\.aspx/';
preg_match($regex, $string, $results);
print $results[1]; // 146370543
?>
A short explanation:
$results contains an array of matches. The first element holds the text matched by the complete regex, which is 146370543.aspx in this example. The second element holds the group captured by the parentheses around [0-9]+.
You can get the opposite by using this regex:
(\D*)\d+(.*)
MATCH 1
1. [0-100] `http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-`
2. [109-114] `.aspx`
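If you only want the number from that URL, you can also use this regex, which matches the first run of digits (so it only works when the Job-ID is the only number in the URL):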
(\d+)
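If preg_replace has to be used after all, one option is to match the entire URL and replace it with the captured ID. This is only a sketch; the pattern below is not taken from any of the answers, and the appended &mobile=true is the example parameter from the question.

```php
<?php
$url = 'http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx&mobile=true';

// ^.*-   : everything up to the hyphen in front of the ID
// (\d+)  : the ID itself, captured as group 1
// \.aspx : the literal extension, then .*$ swallows any trailing parameters
$jobId = preg_replace('/^.*-(\d+)\.aspx.*$/', '$1', $url);

echo $jobId;
```

Because the pattern consumes the whole string, the replacement leaves only group 1. If the URL does not match, preg_replace returns it unchanged, so the result may need a check.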

Regular Expressions: Numeric Value before Occurrence PHP

Given the string:
100,000 this is some text 12,000 this is text I want to match.
I need a regular expression that matches 12,000 based on matching
text I want to match
So, we can get a position with:
strpos($haystack, 'text I want to match');
Then, I guess we could use a regular expression to look backwards from that position, but this is where I need help.
If you know that the digits will always precede the text you want to match, capture them with a group:
preg_match('/([\d,]+)\D*text I want to match/', $str, $match);
var_dump($match[1]);
It is simple:
/ ([0-9,]+) this is text I want to match\.$/
Demo:
http://sandbox.onlinephpfunctions.com/code/b288ca9a322c7a5b54c6490334540ab142b6a979
Another solution:
$re = "/([\\d,]+)(?=\\D*text I want to match)/";
$str = "100,000 this is some text 12,000 this is text I want to match.";
preg_match($re, $str, $matches);
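When the text to search for comes from a variable, as in the strpos() call in the question, it can be escaped with preg_quote() before it goes into the pattern. A minimal sketch built on the capture-group approach; the variable names are assumptions for illustration.

```php
<?php
$haystack = "100,000 this is some text 12,000 this is text I want to match.";
$needle   = 'text I want to match';

// preg_quote() escapes regex metacharacters in $needle; the second
// argument also escapes the delimiter used here.
$pattern = '/([\d,]+)\D*' . preg_quote($needle, '/') . '/';

$number = null;
if (preg_match($pattern, $haystack, $match)) {
    // Group 1 is the last run of digits and commas before the needle.
    $number = $match[1];
}
var_dump($number);
```

The \D* between the number and the needle is what rules out 100,000: the gap after it contains further digits, so the match can only succeed at 12,000.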

PHP: Preg_match and replace all

I have text containing hyperlinks in a recognizable format, and I want to replace every one of them with a normal HTML hyperlink.
This works for just one hyperlink:
$string = '<u>\\n\\\\*HYPERLINK \\"http://www.youtube.com/watch?v=A0VUsoeT9aM\\"A Youtube Video</u>';
$pattern = '/http[?.:=\\w\\d\\/]*/';
$namePattern = '/(?:")([\\s\\w]*)</';
preg_match($pattern, $string, $matches);
preg_match($namePattern, $string, $nameMatches);
echo '<a href="'.$matches[0].'">'.$nameMatches[1].'</a>';
But there is more than one hyperlink in the text, so I want to convert all of them:
<?php
$input = 'Blablabla Beginning Text <u>\\n\\\\*HYPERLINK \\"http://www.youtube.com/watch?v=A0VUsoeT9aM\\"1.A Youtube Video</u> blablabla Text Middle <u>\\n\\\\*HYPERLINK \\"http://www.youtube.com/watch?v=A0VUsoeT9aM\\"2. A Youtube Video</u> blabla Text after';
//To become:
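$output = 'Blablabla Beginning Text <a href="http://www.youtube.com/watch?v=A0VUsoeT9aM">1. A Youtube Video</a>blablabla Text Middle <a href="http://www.youtube.com/watch?v=A0VUsoeT9aM">2. A Youtube Video</a> blabla Text after';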
?>
How would I do that?
Since you want to replace the matches, use preg_replace(), which does exactly that. However, you'll run into one obvious problem: currently there are two calls to preg_match(). Should those be replaced by two calls to preg_replace()? No. Combine the patterns.
$pattern = '/http[?.:=\w\d\\/]*/';
$namePattern = '/(?:")([\s\w]*)</';
These can be combined into the following (I added . to the $namePattern part so that it works with the second example text, where the link description contains a dot):
$replacePattern = '/(http[?.:=\w\d\\/]*)\\\\"([\s\w.]*)</';
The \\\\" in the middle is there because link and text are separated by \\" in the original text. I tested the pattern with preg_match_all() and it works. Adding () around the first pattern turns it into a capture group as well:
$replacePattern = '/(http[?.:=\w\d\\/]*)\\\\"([\s\w.]*)</';
//                  ^-group1-----------^     ^-group2-^
These groups can now be used in the replacement string.
$replaceWith = '<a href="\\1">\\2</a><';
Here \\1 refers to the first group and \\2 to the second. The < at the end is necessary because preg_replace() replaces the whole matched pattern (not just the groups), and since the < is at the end of the pattern, we would lose it if it weren't in the replacement.
All you need to do now is call preg_replace() with these parameters:
$output = preg_replace($replacePattern, $replaceWith, $string);
All occurrences of $replacePattern are now replaced with their version of $replaceWith, and the result is saved in the variable $output.
If you want a larger part to be removed, just extend the $replacePattern.
$replacePattern = '/<u>.*?(http[?.:=\w\d\\/]*)\\\\"([\s\w.]+)<\/u>/';
$replaceWith = '<a href="\\1">\\2</a>';
.*? matches any characters but is not greedy, meaning it stops at the first occurrence of whatever comes next (here, http...).
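Put together, the extended pattern can be run against the $input from the question as sketched below. The anchor markup in $replaceWith is an assumption about the desired output, based on the question asking for normal HTML hyperlinks.

```php
<?php
$input = 'Blablabla Beginning Text <u>\\n\\\\*HYPERLINK \\"http://www.youtube.com/watch?v=A0VUsoeT9aM\\"1.A Youtube Video</u> blablabla Text Middle <u>\\n\\\\*HYPERLINK \\"http://www.youtube.com/watch?v=A0VUsoeT9aM\\"2. A Youtube Video</u> blabla Text after';

// Group 1: the URL, group 2: the link text; the <u>...</u> wrapper is consumed.
$replacePattern = '/<u>.*?(http[?.:=\w\d\\/]*)\\\\"([\s\w.]+)<\/u>/';

// Assumed target markup: a plain anchor built from both groups.
$replaceWith = '<a href="\\1">\\2</a>';

// preg_replace() replaces every occurrence, not just the first one.
$output = preg_replace($replacePattern, $replaceWith, $input);
echo $output;
```

If the link text may contain characters with special meaning in HTML, preg_replace_callback() combined with htmlspecialchars() would be a safer variant.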

Convert Notepad++ Regex to PHP Regular Expression

I'm trying to convert a Notepad++ regex to a PHP regular expression that gets the IDs from a list of URLs in this format:
http://www.example.com/category-example/1371937-text-blah-blah-blah-2012.html
http://www.example.com/category-example/1471337-text-blah-blah-2-blah-2010.html
Using the Notepad++ regex function, I get the output I need (a comma-separated list of IDs) in two steps:
(.*)/ replace with space
-(.*) replace with comma
Result:
1371937,1471337
I tried to do something similar with PHP's preg_replace, but I can't figure out the correct regex. The example below removes everything except digits, but it doesn't work as expected because the URLs can also contain numbers that do not belong to the ID.
<?php
$bb = preg_replace('/[^0-9]+/', ',', $_POST['Text']);
?>
Which is the correct structure?
Thanks
If you are matching against:
http://www.example.com/category-example/1371937-text-blah-blah-blah-2012.html
To get:
1371937
You would use:
$url = "http://www.example.com/category-example/1371937-text-blah-blah-blah-2012.html";
preg_match( "/[^\d]+(\d+)-/", $url, $matches );
$code = $matches[1];
This matches a run of non-digit characters, then an unbroken run of digits that is followed by a '-'. The digits are captured in group 1.
If all you want to do is find the ID, then you should use preg_match, not preg_replace.
You've got lots of options for the pattern, the simplest being:
$url = 'http://www.example.com/category-example/1371937-text-blah-blah-blah-2012.html';
preg_match('/\d+/', $url, $matches);
echo $matches[0];
Which simply finds the first bunch of numbers in the URL. This works for the examples.
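To reproduce the two Notepad++ steps in one go (a comma-separated list of IDs from several URLs), preg_match_all can be combined with implode. This is a sketch under the assumption that the ID is always the run of digits at the start of the last path segment; $text stands in for $_POST['Text'].

```php
<?php
// Stand-in for $_POST['Text']: one URL per line.
$text = "http://www.example.com/category-example/1371937-text-blah-blah-blah-2012.html\n"
      . "http://www.example.com/category-example/1471337-text-blah-blah-2-blah-2010.html";

// A slash, the ID digits (group 1), a hyphen, then the rest of the file
// name up to ".html". ~ is the delimiter so that / needs no escaping.
preg_match_all('~/(\d+)-[^/\s]*\.html~', $text, $matches);

// Join the captured IDs with commas.
$ids = implode(',', $matches[1]);
echo $ids;
```

Numbers such as the 2012 or the 2 later in the file name are ignored because they are not directly preceded by a slash.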
