Regex to remove non-alphanumeric characters and all characters after dot? - php

I need a regular expression (PHP) to remove the forward slash, the dot, and everything after the dot in my string, so that
$str = "ab/12c.3de";
becomes
$newstr = "ab12c";

You can use alternation in regex:
$str = "ab/12c.3de";
$newstr = preg_replace('~/|\..*~', '', $str);
//=> ab12c
Regex: /|\..*
/ matches literal /
| OR (alternation)
\..* matches a dot and everything after it
The replacement is an empty string.
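As a quick check, the pattern above can be wrapped in a small helper and run against the sample string. This is only a sketch: the function name stripSlashesAndSuffix is made up for illustration and is not part of the original answer.

```php
<?php
// Hypothetical helper (name chosen for illustration only).
// Removes every "/" and, from the first "." onward, the rest of the string.
function stripSlashesAndSuffix(string $str): string
{
    // "~" is the delimiter, so "/" needs no escaping inside the pattern.
    return preg_replace('~/|\..*~', '', $str);
}

echo stripSlashesAndSuffix('ab/12c.3de'), "\n"; // ab12c
```

Without the s modifier, . does not match a newline, so on multi-line input only the rest of the current line would be removed.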

Related

PHP - How to modify matched pattern and replace

I have a string that contains spaces inside its HTML tags:
$mystr = "&lt; h3&gt; hello mom ?&lt; / h3&gt;";
So I wrote a regular expression to detect those spaces:
$pattern = '/(?<=&lt;)\s\w+|\s\/\s\w+|\s\/(?=&gt;)/mi';
Next, I want to modify the matches by removing the spaces from them, so that the string is fixed to
"&lt;h3&gt; hello mom ?&lt;/h3&gt;"
I know there is the PHP function preg_replace, but I am not sure how to modify the matches:
$result = preg_replace( $pattern, $replace , $mystr );
For the specific tags you showed, you can use
$result = preg_replace_callback('/&lt;(?:\s*\/)?\s*\w+\s*&gt;/ui', function($m) {
    return preg_replace('/\s+/u', '', $m[0]);
}, $mystr);
The regex - note the u flag to deal with Unicode chars in the string - matches
&lt; - a literal string
(?:\s*\/)? - an optional sequence of zero or more whitespaces and a / char
\s* - zero or more whitespaces
\w+ - one or more word chars
\s* - zero or more whitespaces
&gt; - a literal string.
The preg_replace('/\s+/u', '', $m[0]) call in the anonymous callback function removes every run of whitespace from the matched tag (including non-breaking spaces, thanks to the u flag).
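To see the callback approach end to end, here is a small runnable sketch using the sample string from the question. The variable name $fixed is chosen for illustration only.

```php
<?php
$mystr = '&lt; h3&gt; hello mom ?&lt; / h3&gt;';

// Match each entity-encoded tag, then strip all whitespace inside the match only.
$fixed = preg_replace_callback(
    '/&lt;(?:\s*\/)?\s*\w+\s*&gt;/ui',
    function (array $m): string {
        // $m[0] is the whole matched tag, e.g. "&lt; / h3&gt;"
        return preg_replace('/\s+/u', '', $m[0]);
    },
    $mystr
);

echo $fixed, "\n"; // &lt;h3&gt; hello mom ?&lt;/h3&gt;
```

The text between the tags is left untouched because only the matched tag is passed to the callback.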
You could keep it simple and do:
$output = str_replace(['&lt; / ', '&lt; ', '&gt; '],
['&lt;/', '&lt;', '&gt;'], $input);
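One thing worth checking with the str_replace approach is how it treats spacing it was not written for. The following sketch (the second input is made up for illustration) suggests that it also drops the space after the opening tag's &gt;, and that a tag with more than one space is only partly fixed:

```php
<?php
$search  = ['&lt; / ', '&lt; ', '&gt; '];
$replace = ['&lt;/', '&lt;', '&gt;'];

// Sample string from the question.
$a = str_replace($search, $replace, '&lt; h3&gt; hello mom ?&lt; / h3&gt;');
echo $a, "\n"; // &lt;h3&gt;hello mom ?&lt;/h3&gt;  (the space before "hello" is gone too)

// Made-up input with two spaces after "&lt;": only one of them is removed.
$b = str_replace($search, $replace, '&lt;  h3&gt;');
echo $b, "\n"; // &lt; h3&gt;
```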

Exclude link starts with a character from PREG_REPLACE

This code converts any URL into a clickable link:
$str = preg_replace('/(http[s]?:\/\/[^\s]*)/i', '<a href="$1">$1</a>', $str);
How can I make it skip a URL that is preceded by a [ character? Like this:
[http://google.com
Use a negative lookbehind:
$str = preg_replace('/(?<!\[)(http[s]?:\/\/[^\s]*)/i', '<a href="$1">$1</a>', $str);
                      ^^^^^^^
Then an http... substring that is preceded by [ won't be matched.
You may also simplify the pattern to
preg_replace('/(?<!\[)https?:\/\/\S*/i', '<a href="$0">$0</a>', $str);
that is, remove the capturing group (the ( and )) and replace the $1 backreferences with $0 in the replacement pattern. Note that [^\s] is equivalent to the shorter \S, and [s]? to s?.
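A short sketch of the lookbehind in action. The sample text and the anchor markup in the replacement are assumptions made for the demonstration, not taken verbatim from the answer.

```php
<?php
// Sample text (made up): the second URL is preceded by "[" and should be left alone.
$str = 'see http://example.com and [http://google.com';

// (?<!\[) asserts that the character just before "http" is not "[".
// The <a> markup in the replacement is an assumed example of a "clickable link".
$linked = preg_replace('/(?<!\[)https?:\/\/\S*/i', '<a href="$0">$0</a>', $str);

echo $linked, "\n";
// see <a href="http://example.com">http://example.com</a> and [http://google.com
```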

Trouble With Regexp

I have to replace matches of patterns like <something:any-char> within a URL.
For example, a URL like this:
http://some-site.com/some-acion/pippo:1/mypar:asdasd/pippo2:sdd/ .....
should become:
http://some-site.com/some-acion/pippo:1/pippo2:sdd/ .....
In other words, I have to filter out any occurrence of mypar: from the URL.
I will use php for that.
I tried with RegExp:
.*[\/]+(sh:.*)[\/]?.*$
But it matches only strings like /pippo:3/mypar:wdfds. Strings like /pippo:2/mypar:asa/7pippo:1/ are not matched.
Any hint appreciated.
You could do this:
$url = "/pippo:2/mypar:asa/7pippo:1/";
$stripped = preg_replace("/\/mypar:.*?(\/|$)/", "$1", $url);
The combination of lazy dot matching .*? and a positive lookahead (?=/|$) (either a / or the end of the string) can be replaced with [^/]*, which simply matches zero or more characters other than /:
'~/mypar:[^/]*~'
The ~ delimiter makes it possible to use / in the pattern without escaping.
Pattern details:
/ - a forward slash
mypar: - a sequence of literal characters
[^/]* - zero or more characters other than / character
See PHP demo:
$re = '~/mypar:[^/]*~';
$str = "/pippo:2/mypar:asa/7pippo:1/";
$result = preg_replace($re, '', $str, 1);
echo $result;
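The demo above passes a limit of 1, so only the first /mypar:... segment is removed. Since the question asks to filter out any occurrence, a sketch without the limit may be closer to what is needed (the second sample path below is made up for illustration):

```php
<?php
$re = '~/mypar:[^/]*~';

// URL from the question: one mypar segment in the middle.
$url = 'http://some-site.com/some-acion/pippo:1/mypar:asdasd/pippo2:sdd/';
echo preg_replace($re, '', $url), "\n";
// http://some-site.com/some-acion/pippo:1/pippo2:sdd/

// Made-up path with two mypar segments, the last one at the end of the string.
$path = '/pippo:2/mypar:asa/7pippo:1/mypar:x';
echo preg_replace($re, '', $path), "\n";    // /pippo:2/7pippo:1
echo preg_replace($re, '', $path, 1), "\n"; // /pippo:2/7pippo:1/mypar:x
```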

How can I remove a specific format from string with RegEx?

I have a list of strings like this:
$16,500,000(#$2,500)
$34,000(#$11.00)
$214,000(#$18.00)
$12,684,000(#$3,800)
How can I remove all symbols and the (#$xxxx) part from these strings, so that they look like
16500000
34000
214000
12684000
\(.*?\)|\$|,
Try this. Replace with an empty string. See the demo:
https://regex101.com/r/vD5iH9/42
$re = "/\\(.*?\\)|\\$|,/m";
$str = "\$16,500,000(#\$2,500)\n\$34,000(#\$11.00)\n\$214,000(#\$18.00)\n\$12,684,000(#\$3,800)";
$subst = "";
$result = preg_replace($re, $subst, $str);
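If the values are held in an array rather than in one newline-separated string, preg_replace can be applied to the whole array at once, since it accepts an array as the subject. A minimal sketch, with the names $prices and $clean chosen for illustration:

```php
<?php
// Sample values from the question; single quotes keep "$" literal in PHP.
$prices = [
    '$16,500,000(#$2,500)',
    '$34,000(#$11.00)',
    '$214,000(#$18.00)',
    '$12,684,000(#$3,800)',
];

// Remove the parenthesised part, every "$", and every ",".
$clean = preg_replace('/\(.*?\)|\$|,/', '', $prices);

print_r($clean); // 16500000, 34000, 214000, 12684000
```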
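To remove the trailing (#$xxxx) part, you could use the regex:
\(\#\$.+\)
and replace it with nothing:
preg_replace('/\(\#\$.+\)/', '', $myStringToReplaceWith);
No g (global) modifier is needed: PHP's preg_replace replaces every match by default, and g is not a valid pattern modifier in PHP. The pattern is single-quoted so that PHP does not process the \$ escape before the regex engine sees it.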
Here's a breakdown of that regex:
\( matches the ( character literally
\# matches the # character literally
\$ matches the $ character literally
.+ matches any character 1 or more times
\) matches the ) character literally
In order to remove all of these characters:
$ , ( ) # .
From a string, you could use the regex:
\$|\,|\(|\)|#|\.
Which will match all of the characters above.
The | character above is the regex OR (alternation) operator, so the pattern matches
$ OR , OR ( OR ) OR # OR .
Next, you could replace every match with nothing using preg_replace, which already replaces all matches by default (PHP has no g modifier):
preg_replace('/\$|\,|\(|\)|#|\./', '', $myStringToReplaceWith);
So in the end, your code could look like this:
$str = preg_replace('/\(\#\$.+\)/', '', $str);
$str = preg_replace('/\$|\,|\(|\)|#|\./', '', $str);
Although this is not a single regex, it avoids look-ahead and look-behind (neither of which is bad, by the way).
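Because every alternative in the second pattern is a single character, the alternation can also be written as a character class. The sketch below combines both steps in one helper; the function name cleanAmount is hypothetical, and the patterns are single-quoted so that PHP passes \$ through to the regex engine unchanged.

```php
<?php
// Hypothetical helper (not from the original answer).
function cleanAmount(string $s): string
{
    // Step 1: drop the "(#$...)" part.
    $s = preg_replace('/\(#\$.+\)/', '', $s);
    // Step 2: drop the remaining symbols; inside [...] none of them needs escaping.
    return preg_replace('/[$,()#.]/', '', $s);
}

echo cleanAmount('$34,000(#$11.00)'), "\n";     // 34000
echo cleanAmount('$12,684,000(#$3,800)'), "\n"; // 12684000
```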

Selective string reduction

I would like to know how to strip all non-alphanumeric characters from a string except for underscores and dashes in PHP.
Use preg_replace with /[^a-zA-Z0-9_\-]/ as the pattern and '' as the replacement.
$string = preg_replace('/[^a-zA-Z0-9_\-]/', '', $string);
EDIT
As skippy said, you can use the i modifier for case insensitivity:
$string = preg_replace('/[^a-z0-9_\-]/i', '', $string);
Use preg_replace:
$str = preg_replace('/[^\w-]/', '', $str);
The first argument to preg_replace is a regular expression. This one contains:
/ - starting delimiter -- start the regex
[ - start character class -- define characters that can be matched
^ - negative -- make the character class match only characters that don't match the selection that follows
\w - word character -- so don't match word characters. These are A-Za-z0-9 and _ (underscore)
- - hyphen -- don't match hyphens either
] - close the character class
/ - ending delimiter -- close the regex
Note that this only matches hyphens (i.e. -). It does not match genuine dash characters (– or —).
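A brief sketch of this pattern on an ASCII sample string (the input is made up for illustration):

```php
<?php
$str = 'Hello, World! foo_bar-baz 123';

// Keep word characters and hyphens; drop everything else, including spaces.
$clean = preg_replace('/[^\w-]/', '', $str);

echo $clean, "\n"; // HelloWorldfoo_bar-baz123
```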
Accepts a-z, A-Z, 0-9, '-', '_' and whitespace:
$str = preg_replace("/[^a-z0-9\s_-]+/i", '', $str);
No whitespace:
$str = preg_replace("/[^a-z0-9_-]+/i", '', $str);
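The two variants side by side, on a made-up input string (the variable names are chosen for illustration):

```php
<?php
$input = 'My file (v2)_final-copy!.txt';

// Variant that keeps whitespace.
$withSpaces = preg_replace('/[^a-z0-9\s_-]+/i', '', $input);
// Variant that removes whitespace as well.
$noSpaces = preg_replace('/[^a-z0-9_-]+/i', '', $input);

echo $withSpaces, "\n"; // My file v2_final-copytxt
echo $noSpaces, "\n";   // Myfilev2_final-copytxt
```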
