I'm having trouble finding a solution to this. How can I avoid losing the period in this regex?
$text = preg_replace('~[^\\pL\d]+~u', '-', $text);
$text = preg_replace('#[^0-9a-z\.]+#i', '-', $text);
This replaces anything that isn't 0-9, a-z, or a period, in a case-insensitive manner.
Just add the dot to your character class:
$text = preg_replace('~[^\\pL\d.]+~u', '-', $text);
You are using a negated character class (the [^ part), so anything that does not match one of the characters in that class gets replaced.
By the way, your question title does not match your regex.
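To illustrate the negated character class described above, here is a minimal sketch. The function name slug_keep_dots and the final trim() are assumptions added for the example; they are not part of the original answer.

```php
<?php
// Hypothetical helper: the name and the trailing trim() are illustrative only.
function slug_keep_dots(string $text): string
{
    // [^\pL\d.] = any character that is NOT a letter, a digit, or a period.
    // The + turns a whole run of such characters into a single hyphen.
    $text = preg_replace('~[^\pL\d.]+~u', '-', $text);

    // Drop hyphens left over at either end of the string.
    return trim($text, '-');
}

echo slug_keep_dots('Hello, World v1.2!'), "\n"; // Hello-World-v1.2
```

Because the period is listed inside the class, "v1.2" survives intact while the comma, spaces, and exclamation mark are replaced.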
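What is "\\pL" meant to do? In a single-quoted PHP string it reaches the regex engine as \pL, the Unicode "letter" property, which only behaves as intended together with the u modifier.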
Is this what you mean?
<?php
echo preg_replace('/[^a-z0-9.]+/ui', '-', 'abc093.-23.-2ªıØẞÆ.23.OAIFJ→øæł¶iwoeweo');
?>
Result:
abc093.-23.-2-.23.OAIFJ-iwoeweo
Don't double-escape the backslash, and to be fully Unicode-compatible, match numeric characters with \pN:
$text = preg_replace('~[^\pL\pN]+~u', '-', $text);
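As a quick check of the escaping question, the sketch below compares the two spellings and then applies a variant of the pattern that also keeps the period the question asked about. The sample text and the added dot are assumptions for illustration.

```php
<?php
// In a single-quoted PHP string, \\ is an escaped backslash and \p is left
// untouched, so both literals hold the same three characters: \pL
var_dump('\\pL' === '\pL'); // bool(true)

// Hypothetical variant: letters, numbers, and the period are kept.
$text = 'Straße 12.5 / 東京';
$slug = preg_replace('~[^\pL\pN.]+~u', '-', $text);
echo $slug, "\n"; // Straße-12.5-東京
```

So the double escape is harmless in single quotes, just unnecessary.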
Related
I'm making a function that detects and removes all trailing special characters from a string. It should convert strings like:
"hello-world"
"hello-world/"
"hello-world--"
"hello-world/%--+..."
into "hello-world".
Does anyone know a trick for this that doesn't require a lot of code?
Just for fun
[^a-z\s]+
Explanation:
[^x]: One character that is not x
\s: "whitespace character": space, tab, newline, carriage return, vertical tab
+: One or more
PHP:
$re = "/[^a-z\\s]+/i";
$str = "Hello world\nhello world/\nhello world--\nhellow world/%--+...";
$subst = "";
$result = preg_replace($re, $subst, $str);
Try this:
$string = preg_replace('/[^A-Za-z0-9\-]/', '', $string); // Removes special chars.
Or, to keep apostrophes as well:
$string = preg_replace('/[^A-Za-z0-9\-\']/', '', $string); // Also keeps apostrophes.
You could use a regex like this, depending on your definition of "special characters":
function clean_string($input) {
return preg_replace('/\W+$/', '', $input);
}
It replaces any run of non-word characters (\W) at the end of the string ($) with nothing. \W is equivalent to [^a-zA-Z0-9_], so anything that is not a letter, digit, or underscore gets removed. To specify exactly which characters count as special, use a regex like this, listing them inside the [] brackets:
function clean_string($input) {
return preg_replace('/[\/%.+-]+$/', '', $input);
}
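The \W+$ approach can be checked against the four sample strings from the question. This is only a sketch; the wrapper name strip_trailing_specials is hypothetical and chosen for this example.

```php
<?php
// Hypothetical wrapper name; the pattern is the \W+$ idea described above.
function strip_trailing_specials(string $input): string
{
    // \W+ = one or more non-word characters, $ = anchored at the end.
    return preg_replace('/\W+$/', '', $input);
}

$samples = ['hello-world', 'hello-world/', 'hello-world--', 'hello-world/%--+...'];
foreach ($samples as $sample) {
    echo strip_trailing_specials($sample), "\n"; // hello-world each time
}
```

The hyphen in the middle is untouched because the run of non-word characters has to reach the end of the string to match.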
This is what you are looking for:
([^\n\w\d \"]*)$
It removes any trailing run of characters that are not word characters (letters, digits, underscore), spaces, double quotes, or newlines.
A simplified version, using \s for any whitespace, can be called like this:
preg_replace('/([^\n\w\s]*)$/', '', $string);
I need a regex to remove all non-alphanumeric and space characters. I have this:
$page_title = preg_replace("/[^A-Za-z0-9 ]/", "", $page_title);
but it doesn't remove space characters and replaces some non-alphanumeric characters with numbers.
I need the special characters like punctuation and spaces removed.
If all you want is to keep the alphanumeric bits, you would use this:
(\W)+
Here is some test code:
$original = "Match spaces and {!}#";
echo $original ."<br>";
$altered = preg_replace("/(\W)+/", "", $original);
echo $altered;
Here is the output:
Match spaces and {!}#
Matchspacesand
Here is the explanation:
1st Capturing group: (\W) matches any non-word character [^a-zA-Z0-9_]
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
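One detail worth seeing in practice: since \W is [^a-zA-Z0-9_], underscores are kept. The sketch below uses a slightly altered test string (an underscore added as an assumption) to show the difference.

```php
<?php
$original = 'Match_spaces and {!}#';

// \W treats the underscore as a word character, so it survives.
$keepsUnderscore = preg_replace('/\W+/', '', $original);
echo $keepsUnderscore, "\n"; // Match_spacesand

// Adding _ to a character class removes underscores as well.
$dropsUnderscore = preg_replace('/[\W_]+/', '', $original);
echo $dropsUnderscore, "\n"; // Matchspacesand
```

The capturing group from the answer is not needed for a plain removal, so it is left out here.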
I need the special characters like punctuation and spaces removed.
Then use:
$page_title = preg_replace('/[\p{P}\p{Zs}]+/u', "", $page_title);
\p{P} matches any punctuation character
\p{Zs} matches any space character
/u - To support unicode
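A short sketch of how this behaves on a sample title (the sample string is an assumption). Note that symbols such as +, = and $ belong to the Unicode symbol categories rather than punctuation, so \p{P} leaves them alone; adding \p{S} is one way to remove them too.

```php
<?php
$title = '¿Qué tal, señor? a+b=$5!';

// Punctuation and space separators are removed; symbols stay.
$noPunct = preg_replace('/[\p{P}\p{Zs}]+/u', '', $title);
echo $noPunct, "\n"; // Quétalseñora+b=$5

// Also removing the symbol categories (\p{S}) strips +, = and $.
$noSymbols = preg_replace('/[\p{P}\p{S}\p{Zs}]+/u', '', $title);
echo $noSymbols, "\n"; // Quétalseñorab5
```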
Try this:
preg_replace('/[^[:alnum:]]/', '', $page_title);
[:alnum:] matches alphanumeric characters
This works well for me in Sublime and PHP Regex Tester:
$page_title = preg_replace("/[^A-Za-z0-9]/", "", $page_title);
How can I use PHP to strip out all characters that are NOT letters, numbers, spaces, or punctuation marks?
I've tried the following, but it strips punctuation.
preg_replace("/[^a-zA-Z0-9\s]/", "", $str);
preg_replace("/[^a-zA-Z0-9\s\p{P}]/", "", $str);
Example:
php > echo preg_replace("/[^a-zA-Z0-9\s\p{P}]/", "", "⟺f✆oo☃. ba⟗r!");
foo. bar!
\p{P} matches all Unicode punctuation characters (see Unicode character properties). If you only want to allow specific punctuation, simply add those characters to the negated character class. For example:
preg_replace("/[^a-zA-Z0-9\s.?!]/", "", $str);
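The patterns above carry no u modifier. As far as I can tell, PCRE then examines the subject byte by byte, so multi-byte punctuation such as curly quotes would not be recognized as punctuation. The sketch below adds u (my addition, not part of the original answer) and uses a sample string with curly quotes to show them being kept.

```php
<?php
$str = '⟺f✆oo☃. “ba⟗r”!';

// With the u modifier the subject is read as UTF-8, so \p{P} also
// recognizes non-ASCII punctuation such as the curly quotes.
$clean = preg_replace('/[^a-zA-Z0-9\s\p{P}]/u', '', $str);
echo $clean, "\n"; // foo. “bar”!
```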
You can also list the punctuation explicitly, since there is no single-letter shorthand for it (the way \s is shorthand for whitespace characters):
preg_replace('/[^a-zA-Z0-9\s\-=+\|!##$%^&*()`~\[\]{};:\'",<.>\/?]/', '', $str);
$str = trim($str);
$str = trim($str, "\x00..\x1F");
$str = str_replace(array('"', "'", "&", "<", ">"), ' ', $str);
$str = preg_replace('/[^0-9a-zA-Z-]/', ' ', $str);
$str = preg_replace('/\s\s+/', ' ', $str);
$str = trim($str);
$str = preg_replace('/[ ]/', '-', $str);
Hope this helps.
Let's build a multibyte-safe/unicode-safe pattern for this task.
From https://www.regular-expressions.info/unicode.html:
\p{L} or \p{Letter}: any kind of letter from any language.
\p{Z} or \p{Separator}: any kind of whitespace or invisible separator.
\p{N} or \p{Number}: any kind of numeric character in any script.
\p{P} or \p{Punctuation}: any kind of punctuation character.
[^ ... ] is a negated character class that matches any character not in the list.
+ is a "one or more" quantifier.
u This modifier turns on additional functionality of PCRE that is incompatible with Perl. Pattern and subject strings are treated as UTF-8. An invalid subject will cause the preg_* function to match nothing; an invalid pattern will trigger an error of level E_WARNING. Five and six octet UTF-8 sequences are regarded as invalid.
Code:
echo preg_replace('/[^\p{L}\p{Z}\p{N}\p{P}]+/u', '', $string);
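For a concrete feel of what this pattern keeps and drops, here is a small sketch with an assumed sample string. Accented letters, digits, spaces, and the percent sign (punctuation) stay; the snowman and trademark sign are symbols and are removed.

```php
<?php
$string = 'Déjà vu ☃ 100% sûr™';

$clean = preg_replace('/[^\p{L}\p{Z}\p{N}\p{P}]+/u', '', $string);

// The two spaces that surrounded the snowman are both kept.
var_dump($clean === 'Déjà vu  100% sûr'); // bool(true)
```

If doubled spaces like this are unwanted, a follow-up replacement of repeated whitespace would be needed; that step is outside what the answer covers.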
I'm wondering how I can replace all special chars in a string like: hello this is a test!
I've written this code:
$text = preg_replace("/[^A-Za-z0-9]/", ' ', $text);
This works, but I need more flexibility: it should allow special chars like áéíóú... and remove only certain chars like: :!"#$%&/()=?¿¡...
Any ideas?
Use $text = preg_replace("/[^\p{L}\p{N}]/u", ' ', $text);
This will match all characters that are not letters or numbers and will treat Unicode letters appropriately.
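A minimal sketch of this on an assumed sample sentence. The + quantifier and the trim() call are my additions so that runs of removed characters collapse into a single space; the answer's pattern replaces each character individually.

```php
<?php
$text = '¡Hola, señor! ¿Qué tal?';

// + collapses each run of non-letter, non-number characters into one space;
// trim() then drops the spaces left at the start and end.
$clean = trim(preg_replace('/[^\p{L}\p{N}]+/u', ' ', $text));
echo $clean, "\n"; // Hola señor Qué tal
```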
I've got text from which I want to remove all characters that ARE NOT the following.
desired_characters =
0123456789!&',-./abcdefghijklmnopqrstuvwxyz\n
The last is a \n (newline) that I do want to keep.
To match all characters except the listed ones, use an inverted character set [^…]:
$chars = "0123456789!&',-./abcdefghijklmnopqrstuvwxyz\n";
$pattern = "/[^".preg_quote($chars, "/")."]/";
Here preg_quote is used to escape certain special characters so that they are interpreted as literal characters.
You could also use character ranges to express the listed characters:
$pattern = "/[^0-9!&',-.\\/a-z\n]/";
In this case it doesn’t matter whether the literal - in ,-. is escaped or not, because ,-. is interpreted as a character range from , (0x2C) to . (0x2E), which already contains the - (0x2D) in between.
Then you can remove those characters that are matched with preg_replace:
$output = preg_replace($pattern, "", $str);
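To check that the preg_quote version and the range version behave the same, the sketch below runs both on one sample string. The variable names $quoted and $ranged and the sample text are assumptions for the example.

```php
<?php
$chars  = "0123456789!&',-./abcdefghijklmnopqrstuvwxyz\n";
$quoted = "/[^" . preg_quote($chars, "/") . "]/"; // built from the literal list
$ranged = "/[^0-9!&',-.\\/a-z\n]/";               // same set, written with ranges

$str = "This is an Example:\n1,2-3.4/5 & 'six'!";

$fromQuoted = preg_replace($quoted, "", $str);
$fromRanged = preg_replace($ranged, "", $str);

var_dump($fromQuoted === $fromRanged); // bool(true)
echo $fromQuoted, "\n";
// hisisanxample
// 1,2-3.4/5&'six'!
```

Capital letters, spaces, and the colon are removed, while the newline and the listed punctuation are kept.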
$string = 'This is anexample $tring! :)';
$string = preg_replace('/[^0-9!&\',\-.\/a-z\n]/', '', $string);
echo $string; // hisisanexampletring!
^ This is case-sensitive, hence the capital T is removed from the string. To allow capital letters as well, use $string = preg_replace('/[^0-9!&\',\-.\/A-Za-z\n]/', '', $string);
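Another option, sketched here as an alternative rather than what the answer proposes, is to leave the class as a-z and add the i modifier so that matching ignores case.

```php
<?php
$string = 'This is anexample $tring! :)';

// The i modifier makes a-z match capital letters too, so the T is kept.
$result = preg_replace('/[^0-9!&\',\-.\/a-z\n]/i', '', $string);
echo $result, "\n"; // Thisisanexampletring!
```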