replace all combinations of non-alphanumeric characters - php

I need to to create url from various strings.
So trying to replace all non-alphanumeric characters and all of their combinations - with a hyphen character (-)
$string = "blue - sky";
$string = preg_replace("/[^A-Za-z0-9 ]/", '-', $string);
echo $string;
result - blue---sky
expected - blue-sky.

use the + sign to replace more than one character with one replacement character:
string = preg_replace("/[^A-Za-z0-9]+/", '-', $string);

Related

PHP function to keep only a-z 0-9 and replace spaces with "-" (including regular expression)

I want to write a PHP function that keeps only a-z (keeps all letters as lowercase) 0-9 and "-", and replace spaces with "-".
Here is what I have so far:
...
$s = strtolower($s);
$s = str_replace(' ', '-', $s);
$s = preg_replace("/[^a-z0-9]\-/", "", $s);
But I noticed that it keeps "?" (question marks) and I'm hoping that it doesn't keep other characters that I haven't noticed.
How could I correct it to obtain the expected result?
(I'm not super comfortable with regular expressions, especially when switching languages/tools.)
$s = strtolower($s);
$s = str_replace(' ', '-', $s);
$s = preg_replace("/[^a-z0-9\-]+/", "", $s);
You did not have the \- in the [] brackets.
It also seems you can use - instead of \-, both worked for me.
You need to add multiplier of the searched characters.
In this case, I used +.
The plus sign indicates one or more occurrences of the preceding element.

Php make spaces in a word with a dash

I have the following string:
$thetextstring = "jjfnj 948"
At the end I want to have:
echo $thetextstring; // should print jjf-nj948
So basically what am trying to do is to join the separated string then separate the first 3 letters with a -.
So far I have
$string = trim(preg_replace('/s+/', ' ', $thetextstring));
$result = explode(" ", $thetextstring);
$newstring = implode('', $result);
print_r($newstring);
I have been able to join the words, but how do I add the separator after the first 3 letters?
Use a regex with preg_replace function, this would be a one-liner:
^.{3}\K([^\s]*) *
Breakdown:
^ # Assert start of string
.{3} # Match 3 characters
\K # Reset match
([^\s]*) * # Capture everything up to space character(s) then try to match them
PHP code:
echo preg_replace('~^.{3}\K([^\s]*) *~', '-$1', 'jjfnj 948');
PHP live demo
Without knowing more about how your strings can vary, this is working solution for your task:
Pattern:
~([a-z]{2}) ~ // 2 letters (contained in capture group1) followed by a space
Replace:
-$1
Demo Link
Code: (Demo)
$thetextstring = "jjfnj 948";
echo preg_replace('~([a-z]{2}) ~','-$1',$thetextstring);
Output:
jjf-nj948
Note this pattern can easily be expanded to include characters beyond lowercase letters that precede the space. ~(\S{2}) ~
You can use str_replace to remove the unwanted space:
$newString = str_replace(' ', '', $thetextstring);
$newString:
jjfnj948
And then preg_replace to put in the dash:
$final = preg_replace('/^([a-z]{3})/', '\1-', $newString);
The meaning of this regex instruction is:
from the beginning of the line: ^
capture three a-z characters: ([a-z]{3})
replace this match with itself followed by a dash: \1-
$final:
jjf-nj948
$thetextstring = "jjfnj 948";
// replace all spaces with nothing
$thetextstring = str_replace(" ", "", $thetextstring);
// insert a dash after the third character
$thetextstring = substr_replace($thetextstring, "-", 3, 0);
echo $thetextstring;
This gives the requested jjf-nj948
You proceeding is correct. For the last step, which consists in inserting a - after the third character, you can use the substr_replace function as follows:
$thetextstring = 'jjfnj 948';
$string = trim(preg_replace('/\s+/', ' ', $thetextstring));
$result = explode(' ', $thetextstring);
$newstring = substr_replace(implode('', $result), '-', 3, false);
If you are confident enough that your string will always have the same format (characters followed by a whitespace followed by numbers), you can also reduce your computations and simplify your code as follows:
$thetextstring = 'jjfnj 948';
$newstring = substr_replace(str_replace(' ', '', $thetextstring), '-', 3, false);
Visit this link for a working demo.
Oldschool without regex
$test = "jjfnj 948";
$test = str_replace(" ", "", $test); // strip all spaces from string
echo substr($test, 0, 3)."-".substr($test, 3); // isolate first three chars, add hyphen, and concat all characters after the first three

Remove All these characters [duplicate]

How can I use PHP to strip out all characters that are NOT letters, numbers, spaces, or punctuation marks?
I've tried the following, but it strips punctuation.
preg_replace("/[^a-zA-Z0-9\s]/", "", $str);
preg_replace("/[^a-zA-Z0-9\s\p{P}]/", "", $str);
Example:
php > echo preg_replace("/[^a-zA-Z0-9\s\p{P}]/", "", "⟺f✆oo☃. ba⟗r!");
foo. bar!
\p{P} matches all Unicode punctuation characters (see Unicode character properties). If you only want to allow specific punctuation, simply add them to the negated character class. E.g:
preg_replace("/[^a-zA-Z0-9\s.?!]/", "", $str);
You're going to have to list the punctuation explicitly as there is no shorthand for that (eg \s is shorthand for white space characters).
preg_replace('/[^a-zA-Z0-9\s\-=+\|!##$%^&*()`~\[\]{};:\'",<.>\/?]/', '', $str);
$str = trim($str);
$str = trim($str, "\x00..\x1F");
$str = str_replace(array( ""","'","&","<",">"),' ',$str);
$str = preg_replace('/[^0-9a-zA-Z-]/', ' ', $str);
$str = preg_replace('/\s\s+/', ' ', $str);
$str = trim($str);
$str = preg_replace('/[ ]/', '-', $str);
Hope this helps.
Let's build a multibyte-safe/unicode-safe pattern for this task.
From https://www.regular-expressions.info/unicode.html:
\p{L} or \p{Letter}: any kind of letter from any language.
\p{Z} or \p{Separator}: any kind of whitespace or invisible separator.
\p{N} or \p{Number}: any kind of numeric character in any script.
\p{P} or \p{Punctuation}: any kind of punctuation character.
[^ ... ] is a negated character class that matches any character not in the list.
+ is a "one or more" quantifier.
u This modifier turns on additional functionality of PCRE that is incompatible with Perl. Pattern and subject strings are treated as UTF-8. An invalid subject will cause the preg_* function to match nothing; an invalid pattern will trigger an error of level E_WARNING. Five and six octet UTF-8 sequences are regarded as invalid.
Code: (Demo)
echo preg_replace('/[^\p{L}\p{Z}\p{N}\p{P}]+/u', '', $string);

Remove all non-matching characters in PHP string?

I've got text from which I want to remove all characters that ARE NOT the following.
desired_characters =
0123456789!&',-./abcdefghijklmnopqrstuvwxyz\n
The last is a \n (newline) that I do want to keep.
To match all characters except the listed ones, use an inverted character set [^…]:
$chars = "0123456789!&',-./abcdefghijklmnopqrstuvwxyz\n";
$pattern = "/[^".preg_quote($chars, "/")."]/";
Here preg_quote is used to escape certain special characters so that they are interpreted as literal characters.
You could also use character ranges to express the listed characters:
$pattern = "/[^0-9!&',-.\\/a-z\n]/";
In this case it doesn’t matter if the literal - in ,-. is escaped or not. Because ,-. is interpreted as character range from , (0x2C) to . (0x2E) that already contains the - (0x2D) in between.
Then you can remove those characters that are matched with preg_replace:
$output = preg_replace($pattern, "", $str);
$string = 'This is anexample $tring! :)';
$string = preg_replace('/[^0-9!&\',\-.\/a-z\n]/', '', $string);
echo $string; // hisisanexampletring!
^ This is case sensitive, hence the capital T is removed from the string. To allow capital letters as well, $string = preg_replace('/[^0-9!&\',\-.\/A-Za-z\n]/', '', $string)

How to remove all non-alphanumeric and non-space characters from a string in PHP?

I want to remove all non-alphanumeric and space characters from a string. So I do want spaces to remain. What do I put for a space in the below function within the [ ] brackets:
ereg_replace("[^A-Za-z0-9]", "", $title);
In other words, what symbol represents space, I know \n represents a new line, is there any such symbol for a single space.
Just put a plain space into your character class:
[^A-Za-z0-9 ]
For other whitespace characters (tabulator, line breaks, etc.) use \s instead.
You should also be aware that the PHP’s POSIX ERE regular expression functions are deprecated and will be removed in PHP 6 in favor of the PCRE regular expression functions. So I recommend you to use preg_replace instead:
preg_replace("/[^A-Za-z0-9 ]/", "", $title)
If you want only a literal space, put one in. the group for 'whitespace characters' like tab and newlines is \s
The accepted answer does not remove spaces.
Consider the following
$string = 'tD 13827$2099';
$string = preg_replace("/[^A-Za-z0-9 ]/", "", $string);
echo $string;
> tD 138272099
Now if we str_replace spaces, we get the desired output
$string = 'tD 13827$2099';
$string = preg_replace("/[^A-Za-z0-9 ]/", "", $string);
// remove the spaces
$string = str_replace(" ", "", $string);
echo $string;
> tD138272099

Categories