Simple php regex question - php

I have a string:
$uri = "start/test/go/";
Basically I need to know which regular expression and PHP function I can use to match the first item, up to and including the forward slash ("/"), and remove it from the string. It should also work if the first item is not start but anything else, which might also contain a space.
So all these combinations should work:
$uri = "start_my_test/test/go/";
$uri2 = "start my test/test/go/";
Then after the RegEx it should always return:
$newUri = "test/go/";
Oh, and the rest of the string could be anything as well. So basically I want to delete everything up to and including the first occurrence of a forward slash.
Cheers

Use strstr to find the first occurrence of a string in PHP.
It returns the remainder of the string starting at that occurrence, so the result still begins with the slash; strip that first character with substr.
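A minimal sketch of the strstr approach, assuming the URI from the question. The fallback for a string without any slash is an assumption, since the question does not say what should happen in that case.

```php
<?php
$uri = 'start my test/test/go/';

// strstr() returns the string from the first "/" onwards, or false if there is none.
$rest = strstr($uri, '/');

// Drop the leading slash; keep the input unchanged when no slash was found (assumed behaviour).
$newUri = ($rest === false) ? $uri : substr($rest, 1);

echo $newUri, "\n"; // test/go/
```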

$result = preg_replace('/^[^\/]*\//' , '', $subject);
This says "start at the beginning of the string" ^, "match any number of characters that are not a forward slash" [^\/]*, then match a single forward slash \/ -- and "replace the whole matched thing with nothing" ''.
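As a rough illustration, the same pattern can be wrapped in a small function and run against the URIs from the question. The name stripFirstSegment is made up for this sketch, and ~ is used as the delimiter so the slashes need no escaping.

```php
<?php
// Hypothetical helper: remove everything up to and including the first "/".
function stripFirstSegment(string $uri): string
{
    // ^ anchors at the start, [^/]* consumes the non-slash characters,
    // and the final / consumes the first slash itself.
    return preg_replace('~^[^/]*/~', '', $uri);
}

echo stripFirstSegment('start_my_test/test/go/'), "\n"; // test/go/
echo stripFirstSegment('start my test/test/go/'), "\n"; // test/go/
echo stripFirstSegment('no-slash-here'), "\n";          // no-slash-here (no match, unchanged)
```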

Regex is too expensive an operation for what you need. Use strpos and substr instead:
// $haystack is the URI, $needle is '/'
$position = strpos($haystack, $needle);
if ( $position !== false ) {
$result = substr($haystack, $position + 1);
}

Related

Replace all occurrences using preg_replace

The code below works perfectly:
$string = '(test1)';
$new = preg_replace('/^\(+.+\)+$/','word',$string);
echo $new;
Output:
word
If the code is this:
$string = '(test1) (test2) (test3)';
How can I generate this output:
word word word
Why doesn't my regex work?
^ and $ are anchors, which means the match must start at the beginning of the string and extend to its end.
. matches anything except a newline, and + means one or more. By default quantifiers are greedy, so .+ matches as much as possible and swallows the whole string, whereas we want to match each ( ) pair separately. So the pattern needs to change a bit.
You can use
\([^)]+\)
$string = '(test1) (test2) (test3)';
$new = preg_replace('/\([^)]+\)/','word',$string);
echo $new;
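The difference between the greedy and the restricted pattern can be seen side by side in a short sketch. The lazy variant with .+? is an extra alternative that is not part of the answer above.

```php
<?php
$string = '(test1) (test2) (test3)';

// Greedy .+ runs to the last ")" so the whole string is a single match.
$greedy = preg_replace('/\(.+\)/', 'word', $string);

// [^)]+ cannot cross a ")" so every parenthesised group is matched on its own.
$negated = preg_replace('/\([^)]+\)/', 'word', $string);

// A lazy quantifier stops at the first ")" as well.
$lazy = preg_replace('/\(.+?\)/', 'word', $string);

echo $greedy, "\n";  // word
echo $negated, "\n"; // word word word
echo $lazy, "\n";    // word word word
```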

Regex rules in an array

Maybe this can't be solved the way I want, but maybe you can help me.
I have a lot of malformed words in the names of my products.
Some of them have a leading ( and a trailing ), or just one of these; the same goes for the / and " signs.
What I do is explode the product name by spaces and examine each word.
So I want to replace them with nothing. But a hard drive could be named 40GB ATA 3.5" hard drive. I need to process every word, but I cannot use the same method for 3.5" as for () or //, because this 3.5" is valid.
So I only need to remove the quotes when there is one at the start of the string AND one at the end of the string.
$cases = [
'(testone)',
'(testtwo',
'testthree)',
'/otherone/',
'/othertwo',
'otherthree/',
'"anotherone',
'anothertwo"',
'"anotherthree"',
];
$patterns = [
'/^\(/',
'/\)$/',
'~^/~',
'~/$~',
//Here is what I can not imagine, how to add the rule for `"`
];
$result = preg_replace($patterns, '', $cases);
This works well, but can it be done in one preg_replace()? If so, can somebody help me with the pattern(s) for the quotes?
Result for quotes should be this:
'"anotherone', //no quote at end, leave the leading one
'anothertwo"', //no quote at start, leave the trailing one
'anotherthree', //quotes at both start and end, so remove them.
You may use another approach: rather than define an array of patterns, use one single alternation based regex:
preg_replace('~^[(/]|[/)]$|^"(.*)"$~s', '$1', $s)
See the regex demo
Details:
^[(/] - a literal ( or / at the start of the string
| - or
[/)]$ - a literal ) or / at the end of the string
| - or
^"(.*)"$ - a " at the start of the string, then any 0+ characters (due to /s option, the . matches a linebreak sequence, too) that are captured into Group 1, and " at the end of the string.
The replacement pattern is $1 that is empty when the first 2 alternatives are matched, and contains Group 1 value if the 3rd alternative is matched.
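A short sketch running this single pattern over the test cases from the question. The extra '3.5"' entry is added here only to show that a lone trailing quote is left alone.

```php
<?php
$cases = [
    '(testone)', '(testtwo', 'testthree)',
    '/otherone/', '/othertwo', 'otherthree/',
    '"anotherone', 'anothertwo"', '"anotherthree"',
    '3.5"', // extra case, not in the original list
];

// preg_replace accepts an array subject and returns an array of results.
$result = preg_replace('~^[(/]|[/)]$|^"(.*)"$~s', '$1', $cases);

print_r($result);
// testone, testtwo, testthree, otherone, othertwo, otherthree,
// "anotherone, anothertwo", anotherthree, 3.5"
```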
Note: In case you need to replace until no match is found, use a preg_match with preg_replace together (see demo):
$s = '"/some text/"';
$re = '~^[(/]|[/)]$|^"(.*)"$~s';
$tmp = '';
while (preg_match($re, $s) && $tmp != $s) {
$tmp = $s;
$s = preg_replace($re, '$1', $s);
}
echo $s;
This works:
preg_replace(['~^[/(]?(.+?)[/)]?$~', '~^"(.+)"$~'], '$1', $string)

Removing all characters and numbers except last variable with dash symbol

Hi, I want to strip characters using preg_replace in PHP. I want to remove everything (letters, digits and other characters) except the trailing part that consists of a dash (-) followed by digits. Here's my code:
echo preg_replace('/(.+)(?=-[0-9])|(.+)/','','asdf1245-10');
I expect the result will be
-10
The problem is that the above does not work very well. I checked the pattern using http://www.regextester.com/ and it seems to work, but on http://www.phpliveregex.com/ it doesn't work at all. I don't know why; can anyone help me figure it out?
Thanks a lot
Here is a way to go:
echo preg_replace('/^.+?(-[0-9]+)?$/','$1','asdf1245-10');
Output:
-10
and
echo preg_replace('/^.+?(-[0-9]+)?$/','$1','asdf124510');
Output:
<nothing>
My first thought is to use explode in this case. Keep it simple, like the following code:
$string = 'asdf1245-10';
$array = explode('-', $string);
// end() returns the last element, i.e. the part after the final dash
$result = '-' . end($array);
// $result is '-10'
Another way:
$result = preg_match('~\A.*\K-\d+\z~', $str, $m) ? $m[0] : '';
pattern details:
\A # start of the string anchor
.* # zero or more characters
\K # discard all on the left from match result
-\d+ # the dash and the digits
\z # end of the string anchor
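A minimal sketch of the \K pattern in use; the function name trailingDashNumber is invented for the example.

```php
<?php
// Hypothetical helper: return the trailing "-digits" part, or '' if there is none.
function trailingDashNumber(string $str): string
{
    // .* is greedy, so it backtracks from the end until "-digits" fits before \z;
    // \K then drops everything matched so far from the reported match.
    return preg_match('~\A.*\K-\d+\z~', $str, $m) ? $m[0] : '';
}

var_dump(trailingDashNumber('asdf1245-10')); // string(3) "-10"
var_dump(trailingDashNumber('asdf124510'));  // string(0) ""
var_dump(trailingDashNumber('a-1-22'));      // string(3) "-22"
```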
echo preg_replace('/(\w+)(-\w+)/','$2', 'asdf1245-10');

preg_match_all regex issue for url routing

For URL routing I have
Pattern:
/^\/stuff\/other-stuff\/(?:([^\/]\+?))$/i
Subject:
/stuff/other-stuff/foo-AB123456.html
Why is $num_matches equal to 0?
$num_matches = preg_match_all($patern, $subject, $matches);
Any help would be greatly appreciated :)
Because of this:
[^\/]\+?
The escaped \+ is a literal plus sign, so this part matches exactly one non-slash character optionally followed by a literal +, which cannot cover foo-AB123456.html. The + must not be escaped when it is meant as a quantifier; escape it only when you want a literal match.
The corrected regex should be:
^\/stuff\/other-stuff\/(?:(.+?))$
demo here : http://regex101.com/r/aV9cR0
will match foo-AB123456.html in the first capture
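The effect of the escaped plus can be checked directly. In this sketch $fixed keeps the original negated class and only removes the backslash before the +, which is a slightly different correction from the .+? version shown above.

```php
<?php
$subject = '/stuff/other-stuff/foo-AB123456.html';

// Original pattern: \+ is a literal plus sign, not a quantifier.
$broken = '/^\/stuff\/other-stuff\/(?:([^\/]\+?))$/i';
// Same pattern with the plus left unescaped so it quantifies [^\/].
$fixed  = '/^\/stuff\/other-stuff\/(?:([^\/]+?))$/i';

$brokenHits = preg_match($broken, $subject);             // 0
$fixedHits  = preg_match($fixed, $subject, $m);          // 1, $m[1] is the file name
$literal    = preg_match($broken, '/stuff/other-stuff/f+'); // 1: one character plus a literal "+"

var_dump($brokenHits, $fixedHits, $literal);
echo $m[1], "\n"; // foo-AB123456.html
```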
$patern= "#^/stuff/other-stuff/([^/]+)$#i";
$subject = "/stuff/other-stuff/foo-AB123456.html";
preg_match_all($patern, $subject, $matches);
print_r($matches[1]);
It looks to me like your regex could be simplified to something like:
(?i)^/stuff/other-stuff/[\w.-]+$
It would work like this:
<?php
$regex='~(?i)^/stuff/other-stuff/([\w./-]+)$~';
$string = "/stuff/other-stuff/foo-AB123456.html";
$hit = preg_match($regex,$string,$m);
echo $m[0]."<br />";
echo $m[1]."<br />";
?>
Output:
/stuff/other-stuff/foo-AB123456.html
foo-AB123456.html
Note that this could be done in a number of different ways.
Here are some details about the regex.
The ~ delimiter is nicer than the original / because you don't have to escape the slashes.
The parentheses in ([\w./-]+) capture the end of the url into Group 1. This is why $m[1] yields foo-AB123456.html
After the final slash, [\w./-]+ matches one or more letters, digits, underscores, dots, forward slashes and dashes. The dash is placed last in the class so that it is taken literally; written as [\w-.] it is rejected as an invalid range by PHP 7.3 and later. This is a "mini-spec" for what characters we expect there. If you want to allow anything at all, you could go with a simple dot.

Identifying a random repeating pattern in a structured text string

I have a string that has the following structure:
ABC_ABC_PQR_XYZ
Where PQR has the structure:
ABC+JKL
and
ABC itself is a string that can contain alphanumeric characters and a few other characters like "_", "-", "+", "." and follows no set structure:
eg.qWe_rtY-asdf or pkl123
so, in effect, the string can look like this:
qWe_rtY-asdf_qWe_rtY-asdf_qWe_rtY-asdf+JKL_XYZ
My goal is to find out what string constitutes ABC.
I was initially just using
$arrString = explode("_",$string);
to return $arrString[0] before I was made aware that ABC ($arrString[0]) itself can contain underscores, thus rendering it incorrect.
My next attempt was exploding it on "_" anyway and then comparing each of the exploded parts with the first part until I get a semblance of a pattern:
function getPatternABC($string)
{
$count = 0;
$pattern ="";
$arrString = explode("_", $string);
foreach($arrString as $expString)
{
if(strcmp($expString,$arrString[0])!==0 || $count==0)
{
$pattern = $pattern ."_". $arrString[$count];
$count++;
}
else break;
}
return substr($pattern,1);
}
This works great - but I wanted to know if there was a more elegant way of doing this using regular expressions?
Here is the regex solution:
'^([a-zA-Z0-9_+-]+)_\1_\1\+'
What this does is match (starting from the beginning of the string) the longest possible sequence consisting of the characters inside the square brackets (edit that per your spec). The sequence must appear exactly twice, each time followed by an underscore, and then must appear once more followed by a plus sign (this is actually the first half of PQR with the delimiter before JKL). The rest of the input is ignored.
You will find ABC captured as capture group 1.
So:
$input = 'qWe_rtY-asdf_qWe_rtY-asdf_qWe_rtY-asdf+JKL_XYZ';
$result = preg_match('/^([a-zA-Z0-9_+-]+)_\1_\1\+/', $input, $matches);
if ($result) {
echo $matches[1];
}
See it in action.
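A small sketch of the backreference approach wrapped in a function. The name findAbc is hypothetical, and the character class here also includes the dot, which the question lists as allowed.

```php
<?php
// Hypothetical helper: return ABC, or null if the string does not follow ABC_ABC_ABC+...
function findAbc(string $input): ?string
{
    // \1 must repeat exactly what group 1 captured, so the engine backtracks
    // the greedy group until the three copies line up.
    $pattern = '/^([a-zA-Z0-9_+.-]+)_\1_\1\+/';
    return preg_match($pattern, $input, $m) ? $m[1] : null;
}

var_dump(findAbc('qWe_rtY-asdf_qWe_rtY-asdf_qWe_rtY-asdf+JKL_XYZ')); // "qWe_rtY-asdf"
var_dump(findAbc('pkl123_pkl123_pkl123+JKL_XYZ'));                    // "pkl123"
var_dump(findAbc('abc_def_ghi+JKL_XYZ'));                             // NULL
```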
Sure, just make a regular expression that matches your pattern. In this case, something like this:
preg_match('/^([a-zA-Z0-9_+.-]+)_\1_\1\+JKL_XYZ$/', $string, $match);
Your ABC is in $match[1].
If underscores rarely occur in these strings, it may be worth checking whether a simple explode() will do before bothering with regex.
<?php
$str = 'ABC_ABC_PQR_XYZ';
// Exactly three underscores means ABC itself contains none
if(substr_count($str, '_') == 3)
// index the array directly; reset() needs a variable, not a function result
$abc = explode('_', $str)[0];
else
$abc = regexy_function($str); // placeholder for the regex-based approach
?>
