Trouble With Regexp - php

I have to replace matches of patterns like <something:any-char> within a URL.
For example, a URL like this:
http://some-site.com/some-acion/pippo:1/mypar:asdasd/pippo2:sdd/ .....
should become:
http://some-site.com/some-acion/pippo:1/pippo2:sdd/ .....
In other words, I have to filter out any occurrence of mypar: from the URL.
I will use php for that.
I tried with RegExp:
.*[\/]+(sh:.*)[\/]?.*$
But it matches only strings like /pippo:3/mypar:wdfds. Strings like /pippo:2/mypar:asa/7pippo:1/ are not matched.
Any hint appreciated.

You could do this:
$url = "/pippo:2/mypar:asa/7pippo:1/";
$stripped = preg_replace("/\/mypar:.*?(\/|$)/", "$1", $url);

A lazy .*? followed by a check for either a / or the end of the string (a group such as (\/|$) or a positive lookahead (?=/|$)) can be replaced with a simple negated character class, [^/]*, which matches zero or more characters other than /:
'~/mypar:[^/]*~'
The ~ delimiter makes it possible to use / in the pattern without escaping.
Pattern details:
/ - a forward slash
mypar: - a sequence of literal characters
[^/]* - zero or more characters other than / character
See PHP demo:
$re = '~/mypar:[^/]*~';
$str = "/pippo:2/mypar:asa/7pippo:1/";
$result = preg_replace($re, '', $str, 1);
echo $result;
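
To filter out every occurrence rather than only the first, the same pattern can be wrapped in a small function. This is a minimal sketch: the function name strip_segment is hypothetical, and it assumes the parameter always appears as a whole /name:value path segment.

```php
<?php
// Hypothetical helper: remove every "/name:value" segment from a URL path.
function strip_segment(string $url, string $name): string
{
    // preg_quote() escapes regex metacharacters in $name, plus the ~ delimiter.
    $pattern = '~/' . preg_quote($name, '~') . ':[^/]*~';
    // No limit argument is passed, so all occurrences are removed.
    return preg_replace($pattern, '', $url);
}

echo strip_segment('http://some-site.com/some-acion/pippo:1/mypar:asdasd/pippo2:sdd/', 'mypar'), "\n";
// http://some-site.com/some-acion/pippo:1/pippo2:sdd/
```

Because the pattern starts with a literal /, a segment such as /notmypar:1 is left alone.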

Related

A simple pattern works fine with preg_match_all; how do I use it with preg_replace?

Thanks for your help.
My goal is to use preg_replace with a pattern to remove some very simple strings.
Using only preg_replace, on this string or others, I need to remove any content between <tag and the next > symbol. The pattern is simple:
$x = '#<\w+(\s+[^>]*)>#is';
$s = 'DATA<td class="td1">111</td><td class="td2">222</td>DATA';
preg_match_all($x, $s, $Q);
print_r($Q[1]);
[1] => Array
(
[0] => class="td1"
[1] => class="td2"
)
This works great!
Now I try to remove the strings using the same pattern:
$new_string = '';
$Q = preg_replace($x, "\\1$new_string", $s);
print_r($Q);
The result is completely different.
What is wrong with my use of preg_replace?
Using only preg_replace(), how can I remove these strings?
(We could use foreach(...) to remove each string, but where is the error in my code?)
The result I expect for this input:
$s = 'DATA<td class="td1">111</td><td class="td2">222</td>DATA';
is this output:
$Q = 'DATA<td>111</td><td>222</td>DATA';
Let's break down your RegEx, #<\w+(\s+[^>]*)>#is, and see if that helps.
# // Start delimiter
< // Literal `<` character
\w+ // One or more word-characters, a-z, A-Z, 0-9 or _
( // Start capturing group
\s+ // One or more spaces
[^>]* // Zero or more characters that are not the literal `>`
) // End capturing group
> // Literal `>` character
# // End delimiter
is // Ignore case and `.` matches all characters including newline
Given the input DATA<td class="td1">DATA this matches <td class="td1"> and captures class="td1". The difference between match and capture is very important.
When you use preg_match you'll see the entire match at index 0, and any subsequent captures at incrementing indexes.
When you use preg_replace the entire match will be replaced. You can use the captures, if you so choose, but you are replacing the match.
I'm going to say that again: whatever you pass as the replacement string will replace the entirety of the found match. If you say $1 or \\1, you are saying replace the entire match with just the capture.
Going back to the sample after the breakdown, using $1 is the equivalent of calling:
str_replace('<td class="td1">', ' class="td1"', $string);
which you can see here: https://3v4l.org/ZkPFb
To your question "how to change [0] by $new_string", you are doing it correctly, it is your RegEx itself that is wrong. To do what you are trying to do, your pattern must capture the tag itself so that you can say "replace the HTML tag with all of the attributes with just the tag".
As one of my comments noted, this is where you'd invert the capturing. You aren't interested in capturing the attributes; you are throwing those away. Instead, you are interested in capturing the tag itself:
$string = 'DATA<td class="td1">DATA';
$pattern = '#<(\w+)\s+[^>]*>#is';
echo preg_replace($pattern, '<$1>', $string);
Demo: https://3v4l.org/oIW7d
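
To make the match-versus-capture distinction concrete, here is a short sketch that runs the inverted pattern against the full input string from the question. It assumes simple, well-formed opening tags; a regex is not a general HTML parser.

```php
<?php
$s = 'DATA<td class="td1">111</td><td class="td2">222</td>DATA';

// Group 1 is the tag name; the attributes are matched but not captured.
$pattern = '#<(\w+)\s+[^>]*>#is';

preg_match_all($pattern, $s, $m);
print_r($m[0]); // full matches: <td class="td1"> and <td class="td2">
print_r($m[1]); // captures: td and td

// Each full match is replaced by "<" + captured tag name + ">".
$clean = preg_replace($pattern, '<$1>', $s);
echo $clean, "\n"; // DATA<td>111</td><td>222</td>DATA
```

The closing tags are untouched because \w+ cannot match the / that follows < in </td>.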

Exclude links that start with a given character from preg_replace

This code converts any URL to a clickable link:
$str = preg_replace('/(http[s]?:\/\/[^\s]*)/i', '<a href="$1">$1</a>', $str);
How can I make it skip the conversion when the URL is preceded by a [ character? Like this:
[http://google.com
Use a negative lookbehind:
$str = preg_replace('/(?<!\[)(http[s]?:\/\/[^\s]*)/i', '<a href="$1">$1</a>', $str);
                      ^^^^^^^
Then an http... substring that is preceded by [ won't be matched.
You may simplify the pattern to
preg_replace('/(?<!\[)https?:\/\/\S*/i', '<a href="$0">$0</a>', $str);
that is, remove the ( and ) (the capturing group) and change the backreference in the replacement from $1 to $0 (the whole match). Note that [^\s] is equivalent to the shorter \S, and [s]? to s?.
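
A quick way to check the lookbehind is to run it on text containing both kinds of URL. This is only a sketch: the function name linkify is hypothetical, and the <a> markup in the replacement is an assumption about the desired output.

```php
<?php
// Hypothetical helper: wrap URLs in links unless they are preceded by "[".
function linkify(string $text): string
{
    // (?<!\[) succeeds only if the character before "http" is not "[".
    return preg_replace('/(?<!\[)https?:\/\/\S*/i', '<a href="$0">$0</a>', $text);
}

echo linkify('see http://example.com and [http://google.com'), "\n";
// see <a href="http://example.com">http://example.com</a> and [http://google.com
```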

PHP regex to match between a slash and a hyphen

Hello, I need a regex to get the string "trkfixo" from
SIP/trkfixo-000072b6
I tried explode, but I would prefer a regex solution.
$ex = explode("/",$sip);
$ex2 = explode("-",$ex[1]);
echo $ex2[0];
You may use '~/([^-]+)~':
$re = '~/([^-]+)~';
$str = "SIP/trkfixo-000072b6";
preg_match($re, $str, $match);
echo $match[1]; // => trkfixo
Pattern details:
/ - matches a /
([^-]+) - Group 1, capturing one or more (+) characters other than - ([^-] is a negated character class: it matches any character that is not listed inside it).
Or, matching letters only:
preg_match('/\/([a-zA-Z]+)-/', "SIP/trkfixo-000072b6", $match);
echo $match[1]; // => trkfixo
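
Since preg_match returns 1, 0, or false rather than the matched text, it is worth checking the return value before reading the capture. A minimal sketch, with a hypothetical function name and the assumption that the wanted text always sits between a / and the next -:

```php
<?php
// Hypothetical helper: return the text between "/" and the next "-", or null.
function channel_name(string $sip): ?string
{
    // preg_match() returns 1 on a match; the capture is then in $m[1].
    if (preg_match('~/([^-]+)-~', $sip, $m) === 1) {
        return $m[1];
    }
    return null;
}

var_dump(channel_name('SIP/trkfixo-000072b6')); // string(7) "trkfixo"
var_dump(channel_name('no-slash-here'));        // NULL
```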

How can I remove a specific format from string with RegEx?

I have a list of string like this
$16,500,000(#$2,500)
$34,000(#$11.00)
$214,000(#$18.00)
$12,684,000(#$3,800)
How can I remove all the symbols and the (#$xxxx) part from these strings so that they look like this:
16500000
34000
214000
12684000
\(.*?\)|\$|,
Try this. Replace with an empty string. See the demo.
https://regex101.com/r/vD5iH9/42
$re = "/\\(.*?\\)|\\$|,/m";
$str = "\$16,500,000(#\$2,500)\n\$34,000(#\$11.00)\n\$214,000(#\$18.00)\n\$12,684,000(#\$3,800)";
$subst = "";
$result = preg_replace($re, $subst, $str);
To remove the trailing (#$xxxx) part, you could use the regex:
\(\#\$.+\)
and replace it with nothing:
preg_replace('/\(\#\$.+\)/', '', $myStringToReplaceWith);
PHP has no g (global) modifier; preg_replace already replaces every match unless you pass a limit. Use single quotes so that PHP does not turn \$ into a bare $ before the regex engine sees it.
Here's a breakdown of that regex:
\( matches the ( character literally
\# matches the # character literally
\$ matches the $ character literally
.+ matches any character 1 or more times
\) matches the ) character literally
In order to remove all of these characters:
$ , ( ) # .
From a string, you could use the regex:
\$|\,|\(|\)|#|\.
Which will match all of the characters above.
The | character above is the regex or operator, effectively making it so
$ OR , OR ( OR ) OR # OR . will be matched.
Next, you could replace them with nothing using preg_replace, which replaces every match by default (no g modifier is needed, and PHP would reject one):
preg_replace('/\$|\,|\(|\)|#|\./', '', $myStringToReplaceWith);
So in the end, your code could look like this:
$str = preg_replace('/\(\#\$.+\)/', '', $str);
$str = preg_replace('/\$|\,|\(|\)|#|\./', '', $str);
Although it isn't in one regex, it does not use any look-ahead, or look-behind (both of which are not bad, by the way).
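
Both answers can be checked against the sample data from the question. The sketch below uses a single pass; the function name money_to_int is hypothetical, and it assumes the amount before the parenthesis is always a whole number.

```php
<?php
// Hypothetical helper: "$16,500,000(#$2,500)" -> 16500000
function money_to_int(string $s): int
{
    // Remove any parenthesised part, then every remaining $ and comma.
    // Single quotes keep PHP from interpreting the $ in the pattern.
    $digits = preg_replace('/\(.*?\)|[$,]/', '', $s);
    return (int) $digits;
}

$lines = ['$16,500,000(#$2,500)', '$34,000(#$11.00)', '$214,000(#$18.00)', '$12,684,000(#$3,800)'];
foreach ($lines as $line) {
    echo money_to_int($line), "\n";
}
// 16500000
// 34000
// 214000
// 12684000
```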

Regex extracting characters between last and second last occurrence of character

I am trying to extract the word between the last / and the second-to-last /, i.e. food in the following PHP example.
$string = https://ss1.xxx/img/categories_v2/FOOD/fastfood (I would like $string to become food)
$string = https://ss1.xxx/img/categories_v2/SHOPS/barbershop (I would like $string to become shops)
I am new to regex and tried /[^/]*$, but that returns everything after the last /. Any help would be appreciated. Thanks!
I am using PHP.
Use:
preg_match('#/([^/]*)/[^/]*$#', $string, $match);
echo $match[1];
You could also use:
$words = explode('/', $string);
echo $words[count($words)-2];
You can use this:
$result = preg_replace_callback('~(?<=/)[^/]+(?=/[^/]*$)~', function ($m) {
return strtolower($m[0]); }, $string);
Pattern details:
~ # pattern delimiter
(?<=/) # zero width assertion (lookbehind): preceded by /
[^/]+ # all characters except / one or more times
(?=/[^/]*$) # zero width assertion (lookahead): followed by /,
# all that is not a / zero or more times, and the end of the string
~ # pattern delimiter
Regex:
(\w+)(/[^/]+)$
PHP code:
<?php
$string = "https://ss1.xxx/img/categories_v2/FOOD/fastfood";
echo preg_replace("#(\w+)(/[^/]+)$#", "food$2", $string);
$string = "https://ss1.xxx/img/categories_v2/SHOPS/barbershop";
echo preg_replace("#(\w+)(/[^/]+)$#", "shops$2", $string);
?>
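
If the goal is to obtain the lowercased word itself rather than hardcode it, the capturing pattern from the first answer can be combined with strtolower. A minimal sketch with a hypothetical function name:

```php
<?php
// Hypothetical helper: return the second-to-last path segment, lowercased.
function category_from_url(string $url): ?string
{
    // Group 1 is the segment between the last two slashes.
    if (preg_match('#/([^/]*)/[^/]*$#', $url, $m) === 1) {
        return strtolower($m[1]);
    }
    return null;
}

echo category_from_url('https://ss1.xxx/img/categories_v2/FOOD/fastfood'), "\n";    // food
echo category_from_url('https://ss1.xxx/img/categories_v2/SHOPS/barbershop'), "\n"; // shops
```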
