Removing non-alphanumeric characters from a string - php

I have a string in PHP and I want it to match the regex [A-Za-Z0-9]. How can I do this?

I'm assuming you meant a-z rather than a-Z inside your regex. Either way, you can use preg_replace:
$new_string = preg_replace("/[^a-zA-Z0-9\s]/", "", $string);
It takes as arguments the pattern (/[^a-zA-Z0-9\s]/), the replacement ("") and the subject ($string), and returns the new string ($new_string). The \s inside the class keeps whitespace; drop it if spaces should be removed as well.

$string = preg_replace('/[^a-zA-Z0-9]/', '', $string);

\W is a shortcut for [^a-zA-Z0-9_]. It may not be extremely helpful here, since it leaves underscores in place too, but I thought I'd let you know.
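To make the underscore difference concrete, here is a small sketch comparing the explicit class with \W. The sample string and the variable names are made up for illustration.

```php
<?php
// Hypothetical sample input, chosen only to show the difference.
$input = 'foo_bar-baz 42!';

// Explicit class: removes everything except ASCII letters and digits.
$strict = preg_replace('/[^a-zA-Z0-9]/', '', $input);

// \W: removes non-word characters, so the underscore survives.
$word = preg_replace('/\W/', '', $input);

echo $strict, "\n"; // foobarbaz42
echo $word, "\n";   // foo_barbaz42
```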

Related

Strip non Alphanumeric chars from filename - php [duplicate]

I'd like a regexp or other string which can replace everything except alphanumeric chars (a-z and 0-9) from a string. All things such as ,##$(#*810 should be stripped. Any ideas?
Edit: I now need this to strip everything but allow dots, so everything but a-z, 0-9 and the dot. Ideas?
$string = preg_replace("/[^a-z0-9.]+/i", "", $string);
This matches one or more characters that are not a-z, 0-9 (case-insensitive) or ".", and replaces them with "".
I like using [^[:alnum:]] for this, less room for error.
preg_replace('/[^[:alnum:]]/', '', "(ABC)-[123]"); // returns 'ABC123'
Try:
$string = preg_replace ('/[^a-z0-9]/i', '', $string);
/i stands for case insensitivity (if you need it, of course).
/[^a-z0-9.]/
should do the trick
This also works to replace anything not a digit, a word character, or a period with an underscore. Useful for filenames.
$clean = preg_replace('/[^\d\w.]+/', '_', $string);
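As a rough sketch of the two filename approaches above (stripping versus replacing with an underscore), assuming a made-up filename. Since \w already covers digits, [^\w.] should behave like the [^\d\w.] class shown above.

```php
<?php
// Hypothetical filename, used only for illustration.
$name = 'My Report (final)#2.pdf';

// Remove everything except letters, digits and dots.
$stripped = preg_replace('/[^a-z0-9.]+/i', '', $name);

// Replace each run of other characters with a single underscore instead.
$underscored = preg_replace('/[^\w.]+/', '_', $name);

echo $stripped, "\n";    // MyReportfinal2.pdf
echo $underscored, "\n"; // My_Report_final_2.pdf
```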

PHP Regex - Replace substring within a string

If I have the following string for example:
"Value1 *|VALUE_2|* Value3"
I want to be able to remove any substrings from within the string that begin with the characters | and end in |. I will not know what will be between these characters, but this is irrelevant as I want to remove whatever is between them.
Basically, I am just unsure of what pattern to use in the code below.
preg_replace("*|PATTERN|*", "", $str);
| is a special character that needs to be escaped in regex:
preg_replace('/\|[^|]+\|/', '', $str);
preg_replace('/\|[^|]+\|/', '', $str);
This should do the trick.
preg_replace("/\|.*\|/U", "", $str);
I think this should do the trick..
The /U makes the pattern ungreedy, so that in Value1 |VALUE_2| Value3|test the match will be |VALUE_2| instead of |VALUE_2| Value3|
If you only want the contents between the |'s removed, you can do:
preg_replace("/\|.*\|/U", "||", $str);
There are other ways, but this would be the easiest one, imho.
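For what it's worth, here is a quick sketch of the greedy versus ungreedy behaviour described above, using the sample string from that explanation. The variable names are only for illustration.

```php
<?php
// Sample input taken from the explanation above.
$str = 'Value1 |VALUE_2| Value3|test';

// Greedy: .* runs to the last pipe in the string.
$greedy = preg_replace('/\|.*\|/', '', $str);

// Ungreedy via the U modifier: stops at the first closing pipe.
$ungreedy = preg_replace('/\|.*\|/U', '', $str);

// Negated class: same result here without needing the U modifier.
$negated = preg_replace('/\|[^|]+\|/', '', $str);

echo $greedy, "\n";   // Value1 test
echo $ungreedy, "\n"; // Value1  Value3|test
echo $negated, "\n";  // Value1  Value3|test
```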

Regex to remove non alphanumeric characters from UTF8 strings

How can I remove characters, like punctuation, commas, dashes etc from a string, in a multibyte safe manner?
I will be working with input from many different languages and I am wondering if there is something that can help me with this
Thanks
There are the unicode character class thingys that you can use:
http://www.regular-expressions.info/unicode.html
http://php.net/manual/en/regexp.reference.unicode.php
To match any non-letter symbols you can just use \PL+, the negation of \p{L}. To not remove spaces, use a charclass like [^\pL\s]+. Or really just remove punctuation with \pP+
Well, and obviously don't forget the regex /u modifier.
I used this:
$clean = preg_replace( "/[^\p{L}\p{N}]+/u", " ", $raw );
$clean = preg_replace( "/[\p{Z}]{2,}/u", " ", $clean );
Similar post
Remove non-utf8 characters from string
I'm not sure if this covers all characters though.
According to this post on the dreamincode forum
http://www.dreamincode.net/forums/topic/78179-regular-expression-to-remove-non-ascii-characters/
this should work
/[^\x{21}-\x{7E}\s\t\n\r]/
Maybe this will be useful?
$newstring = preg_replace('/[^0-9a-zA-Z\s]/', '', $oldstring);

PHP regex, replace all trash symbols

I can't get my head around a solid RegEx for doing this, still very new at all this RegEx magic. I had some limited success, but I feel like there is a simpler, more efficient way.
I would like to purify a string of all non-alphanumeric characters, and turn all those invalid subsets into one single underscore, but trim them at the edges. For example, the string <<+ćThis?//String_..! should be converted to This_String
Any thoughts on doing this all in one RegEx? I did it with regular str_replace, and then regexed the multi-underscores out of the way, and then trimmed the last underscores from the edges, but it seems like overkill and like something RegEx could do in one go. Kind of going for max speed/efficiency here, even if it is milliseconds I'm dealing with.
$string = trim(preg_replace('<\W+>', "_", $string), "_");
The uppercase \W escape here matches "non-word" characters, meaning everything but letters, digits and the underscore. To remove the leftover outer underscores I would still use trim.
Yes, you could do this:
preg_replace("/[^a-zA-Z0-9]+/", "_", $myString);
Then you would trim leading and trailing underscores, maybe by doing this:
preg_replace("/^_+|_+$/", "", $myReplacedString);
It's not one regex, but it's cleaner than str_replace and a bunch of regex.
$output = preg_replace('/([^0-9a-z])/i', ' ', '<<+ćThis?//String_..!');
$output = preg_replace('!\s+!', '_', trim($output));
echo $output;
This_String
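If a single call is the goal, one option is to pass arrays to preg_replace, which applies the patterns in order. It is still two regexes under the hood, so treat this as a sketch of the idea rather than a true one-regex solution. The input is the example string from the question.

```php
<?php
// Example input from the question.
$string = '<<+ćThis?//String_..!';

// Patterns are applied in order: first collapse each run of
// non-alphanumeric characters into one underscore, then strip
// underscores at the start and end of the result.
$result = preg_replace(
    ['/[^a-zA-Z0-9]+/', '/^_+|_+$/'],
    ['_', ''],
    $string
);

echo $result, "\n"; // This_String
```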

stripping out all characters from a string, leaving numbers

Hey, I have a string like this:
v8gn5.8gnr4nggb58gng.g95h58g.n48fn49t.t8t8t57
I want to strip out all the characters leaving just numbers (and .s)
Any ideas how to do this? Is there a function prebuilt?
thanks
$str = preg_replace('/[^0-9.]+/', '', $str);
This replaces every run of characters that are not digits or . with nothing.
Here's how it works:
preg_replace is a PHP function that searches a string for a pattern and replaces it with a given replacement string.
The first parameter in preg_replace is the regular expression pattern to search for. In this case, the pattern is '/[^0-9.]+/', which matches any character that is not a digit or a dot. The ^ character inside square brackets means "not", so [^0-9.] means any character that is not a digit or a dot. The + sign means one or more occurrences of the previous character or character group, in this case [^0-9.].
The second parameter in preg_replace is the replacement string. In this case, the replacement string is an empty string ''. So any character that matches the pattern in the first parameter will be replaced with an empty string.
The third parameter in preg_replace is the input string to search and modify. In this case, the input string is represented by the variable $str.
So, this line of code will remove any character from the input string $str that is not a digit or a dot, and return the modified string with only digits and dots.
preg_replace('/[^0-9.]/', '', $string);
$input = 'some str1ng 234';
$newString = preg_replace("/[^0-9.]/", '', $input);
To satisfy my curiosity I asked about the speed of the proposed answers, and as shown in "preg_replace speed optimisation" it is (much) faster to use str_replace() than preg_replace().
So you might want to use str_replace() instead.
Here is the shortest one:
$str = preg_replace('/\D/', '', $str);
\D = all non-digits. Note that this also removes the dots, which the question wanted to keep.
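To compare the two patterns on the string from the question, a quick sketch (variable names are just for illustration):

```php
<?php
// Input string from the question.
$str = 'v8gn5.8gnr4nggb58gng.g95h58g.n48fn49t.t8t8t57';

// Keep digits and dots.
$digitsAndDots = preg_replace('/[^0-9.]+/', '', $str);

// Keep digits only; \D also removes the dots.
$digitsOnly = preg_replace('/\D/', '', $str);

echo $digitsAndDots, "\n"; // 85.8458.9558.4849.8857
echo $digitsOnly, "\n";    // 858458955848498857
```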
