Using preg_replace and regex with varying formats - php

I have three types of strings that I encounter. My goal is to cycle through all of them and just get name.
page_1.name
page_2.name.text
page_1.name.something
The only way I can figure out doing this is to first remove the page_# with the following:
$remove_page = preg_replace("/(page_\d+\.)(\w+)/", "$2", $string);
Then remove the last bit like so:
$get_name = preg_replace("/(\w+)(\.\w+)/", "$1", $remove_page);
Is there a more efficient way to do this? This works, but I feel like I'm only slightly grasping the power of regex.

You can use this regex:
preg_match('/(?<=page_\d\.)[^.]+/', $input, $matches);
(?<=page_\d\.) is a lookbehind that makes sure our match is preceded by page_, a single digit, and a dot. [^.]+ matches one or more characters that are not a dot.
Otherwise, split on the dot and take the second element (index 1):
$arr = explode('.', $input);
$name = $arr[1];
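A lookbehind in PCRE must have a fixed length, so the lookbehind pattern above only covers single-digit page numbers. As a minimal sketch, assuming page numbers can run to several digits, \K can stand in for the lookbehind. The helper name extractName is made up for illustration.

```php
<?php
// Hypothetical helper: returns the segment right after "page_<digits>.",
// or null when the input does not have that shape.
function extractName(string $input): ?string
{
    // \K discards everything matched so far, so only the name is reported.
    if (preg_match('/^page_\d+\.\K[^.]+/', $input, $matches)) {
        return $matches[0];
    }
    return null;
}

foreach (['page_1.name', 'page_2.name.text', 'page_10.name.something'] as $s) {
    echo extractName($s), "\n"; // prints "name" for each input
}
```

Returning null for non-matching input keeps the caller from reading an undefined array index, which the explode variant would do on a string without a dot.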

Related

PHP RegExp match and substitute

I am testing a RegExp with the online regexr.com tool. I will test the string with multiple cases, but I can't get substitution to work.
The RegEx for matching the string is:
/^[0-9]{1,3}[0-9]{6,7}$/
Which matches local mobile number in my country like this:
0921234567
But then I want to substitute the number in this way: add a "+" sign, add my country code "123", add a "." sign, and then finally add the matched number with its leading zero stripped.
Final number will be:
+385.921234567
I have a basic idea of how to insert the matched string, but I am not sure how to prepend characters and strip the zero from the matched string in the following substitution pattern:
\+$&\n\t
I will use PHP preg_replace function.
EDIT:
As someone wisely mentioned, there is a possibility that there will be one, two, or no leading zeros, but I will create separate test cases with a regex just testing the number of zeros. Doing so in one regex seems too complicated for now.
Possible numbers will be:
0921234567
00111921234567
Where 111 is country code. I know that some country codes consist of 2 or 3 digits, but I will create special cases, for most country codes.
You can use this preg_replace to strip optional zeroes from start of your mobile #:
$str = preg_replace('~^0*(\d{7,9})$~', '+385.$1', $str);
^[0-9]([0-9]{1,2}[0-9]{6,7})$
You just need to add groups. Replace by +385.$1. See demo.
https://regex101.com/r/cJ6zQ3/22
$re = "/^[0-9]([0-9]{1,2}[0-9]{6,7})$/m";
$str = "0921234567\n";
$subst = "+385.$1";
$result = preg_replace($re, $subst, $str);
I would use a 2-step solution:
Check if we match the main regex
Replace the number by pre-pending + + country code + . + number without leading zeros.
PHP code:
$re = "/^[0-9]{7,10}$/";
$str = "0921234567";
if (preg_match($re, $str, $match)) {
    echo "+385." . preg_replace('/^0+/', '', $match[0]);
}
Note that splitting the character class in your regex pattern makes no sense when you are not using capture groups: ^[0-9]{7,10}$ is then the same as ^[0-9]{1,3}[0-9]{6,7}$, meaning match 7 to 10 digits from the start to the end of the string.
Leading zeros are easily trimmed from the start with /^0+/ regex.
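The two-step approach can be wrapped in a small function. This is only a sketch: the function name toInternational and the default country code '385' are assumptions taken from the example output, and ltrim is swapped in for the /^0+/ replacement since both drop leading zeros.

```php
<?php
// Hypothetical helper: validate a local number, then prepend "+<code>."
// and drop any leading zeros. Returns null when the input is not 7-10 digits.
function toInternational(string $number, string $countryCode = '385'): ?string
{
    if (!preg_match('/^[0-9]{7,10}$/', $number)) {
        return null;
    }
    return '+' . $countryCode . '.' . ltrim($number, '0');
}

echo toInternational('0921234567'), "\n"; // +385.921234567
var_dump(toInternational('12345'));       // NULL (too short)
```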

Regexp for preg_replace in PHP

I have strings like this (some examples):
F7998FM3213/02F
J442554NM/05
K439459845/34D
I need to use PHP with preg_replace and regular expressions to delete all non-numeric characters in any string, after the forward-slash, '/'.
For example the codes above would look like this afterwards:
F7998FM3213/02
J442554NM/05
K439459845/34
If you're going for readability, something like this would be perfect:
$parts = explode("/",$line,2);
$parts[1] = preg_replace("/\D/","",$parts[1]);
$output = implode("/",$parts);
However, for conciseness and based entirely on the examples you have given, try this:
$output = preg_replace("/\D+$/","",$input);
This will strip any non-numeric characters from the end of the string, which seems to be what you're after based on your examples.
You can use this:
$subject = <<<LOD
F7998FM3213/02F
J442554NM/05
K439459845/34D
K439459845/34D34
LOD;
echo preg_replace('~^[^/]*+/\K|[^\d\n]++~m', '', $subject);
Explanation:
The regex is an alternation between two things:
the beginning of the line up to and including the /, which \K then drops from the match
after the /, any run of one or more characters that are neither digits nor newlines
Since the beginning of the line is tried first, non-digit characters are removed only after the /.
To remove all \D anywhere after a / you could replace:
(?:/\K|\G(?!^))(\d*)\D+
with $1. Like:
preg_replace(',(?:/\K|\G(?!^))(\d*)\D+,', '$1', $str);
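To see how the end-anchored replacement differs from stripping every non-digit after the slash, here is a sketch that runs both on the same input. The function names are invented for this example.

```php
<?php
// Strip only trailing non-digits (the concise variant).
function stripTrailing(string $code): string
{
    return preg_replace('/\D+$/', '', $code);
}

// Strip every non-digit after the first slash (the explode variant).
function stripAfterSlash(string $code): string
{
    $parts = explode('/', $code, 2);
    if (count($parts) === 2) {
        $parts[1] = preg_replace('/\D/', '', $parts[1]);
    }
    return implode('/', $parts);
}

echo stripTrailing('K439459845/34D34'), "\n";   // K439459845/34D34 (unchanged)
echo stripAfterSlash('K439459845/34D34'), "\n"; // K439459845/3434
echo stripAfterSlash('F7998FM3213/02F'), "\n";  // F7998FM3213/02
```

The count check guards against input without a slash, where explode returns a single element.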

PHP preg_replace: How can I match something but not replace it?

For example, if I wanted to preg_replace the title of an HTML element:
$str = preg_replace('/title=\"([^\"]+)\"/', 'foo', $str);
Please do not give me other solutions (non regex) for this specific example, this is merely an example. I need a solution that works for any regular expressions.
If you want to match parts with preg_replace, but only partially replace something else, then there are two options.
Either you just reinsert the matched parts (enclose in capture groups, and then use $1 and $3):
$str = preg_replace('/(title=")([^"]+)(")/', '$1foo$3', $str);
Or you use assertions:
$str = preg_replace('/(?<=title=")([^"]+)(?=")/', 'foo', $str);
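Both options give the same result, which is easy to check. The sketch below runs them side by side on a made-up HTML snippet and also shows \K, a PCRE feature that drops the already-matched prefix from the reported match, as a further alternative.

```php
<?php
$str = '<a href="#" title="old text">link</a>'; // sample input, invented for this demo

// Option 1: capture the surrounding parts and reinsert them.
$a = preg_replace('/(title=")([^"]+)(")/', '$1foo$3', $str);

// Option 2: lookbehind and lookahead assert the context without consuming it.
$b = preg_replace('/(?<=title=")([^"]+)(?=")/', 'foo', $str);

// Alternative: \K resets the start of the match, so the prefix is kept.
$c = preg_replace('/title="\K[^"]+(?=")/', 'foo', $str);

echo $a, "\n"; // <a href="#" title="foo">link</a>
var_dump($a === $b && $b === $c); // bool(true)
```

Unlike a lookbehind, the part before \K may have variable length, which helps when the prefix is not a fixed string.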

PREG_ or Regex Question

I'd like to match the last instance of / (I believe you use [^/]+$) and copy the next four or fewer digits, up to a dash -.
I believe the "right" method to return this number is through preg_split, but I'm not sure. The only other way I know is to explode on /, array_reverse, explode on -, and assign. I'm sure there's a more elegant way, though?
For instance
example.com/12-something // get 12
example.com/996-something // get 996
example.com/12345-no-deal // return nothing
I'm unfortunately not a regex guru like some of you folks though.
Here is an ugly way to do the same thing.
$strip = array_reverse(explode('/', $page));
$strip = $strip[0];
$strip = explode('-', $strip);
$strip = $strip[0];
echo (strlen($strip) <= 4) ? (int)$strip : null;
This should work
$str = "example.com/123-test";
preg_match("/\/([\d]{1,4})-[^\/]+$/", $str, $matches);
echo $matches[1]; // 123
It makes sure that the ###-word part is at the end and that there are only 1-4 digits.
A match on /\/(\d{1,4})-[^\/]+$/ should fit the bill with the number in the first capture var. My apologies, I don't write PHP and I don't want to deal with preg_match's interface, but that's the regex anyhow.
PHP does accept non-slash regex delimiters, so '#/(\d{1,4})-[^/]+$#' is the version with fewer leaning toothpicks (the m prefix in m#...# is Perl syntax and is not used in PHP).
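Put to work on the three sample URLs, the pattern might be wrapped like this. The function name leadingId is hypothetical, and # is used as the delimiter so the slashes need no escaping.

```php
<?php
// Hypothetical helper: returns the 1-4 digit number that starts the last
// path segment, or null when the segment does not match.
function leadingId(string $url): ?int
{
    if (preg_match('#/(\d{1,4})-[^/]+$#', $url, $matches)) {
        return (int) $matches[1];
    }
    return null;
}

var_dump(leadingId('example.com/12-something'));  // int(12)
var_dump(leadingId('example.com/996-something')); // int(996)
var_dump(leadingId('example.com/12345-no-deal')); // NULL
```

Checking the return value of preg_match before reading $matches[1] avoids an undefined-index notice on URLs that do not match.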

using preg_match to strip specified underscore in php

There has always been a confusion with preg_match in php.
I have a string like this:
apsd_01_03s_somedescription
apsd_02_04_somedescription
Can I use preg_match to strip off everything from the 3rd underscore on, including the 3rd underscore itself?
thanks.
Try this:
preg_replace('/^([^_]*_[^_]*_[^_]*).*/', '$1', $str)
This will take only the first three sequences that are separated by _. So everything from the third _ on will be removed.
If you want to strip the "_somedescription" part: preg_replace('/([^_]*)_([^_]*)_([^_]*)_(.*)/', '$1_$2_$3', $str);
I agree with Gumbo's answer, however, instead of using regular expressions, you can use PHP's array functions:
$s = "apsd_01_03s_somedescription";
$parts = explode("_", $s);
echo implode("_", array_slice($parts, 0, 3));
// apsd_01_03s
This method appears to execute similarly in speed, compared to a regular expression solution.
If the third underscore is the last one, you can do this:
preg_replace('/^(.+)_.+?$/', '$1', $str);
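The difference between cutting at the third underscore and cutting at the last one only shows up when the description itself contains underscores. A short sketch, using an invented input with an extra underscore:

```php
<?php
$s = 'apsd_01_03s_some_description'; // invented sample with four underscores

// Keep the first three underscore-separated fields.
$third = preg_replace('/^([^_]*_[^_]*_[^_]*).*/', '$1', $s);

// Cut at the last underscore: the greedy .+ runs as far right as it can.
$last = preg_replace('/^(.+)_.+?$/', '$1', $s);

echo $third, "\n"; // apsd_01_03s
echo $last, "\n";  // apsd_01_03s_some
```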
