I want to convert all the non-alphanumerical characters to hyphens (-) (dashes) for an elegant URL. For this purpose I am using the following method:
$title = 'Any Authentic PHP Script / Third Party & # 10 $ tool to';
$title .= 'Convert HTML to BBcode, BBcode to HTML';
$url = preg_replace("/[^0-9a-zA-Z ]/m", "", $title );
$url = preg_replace("/ /", "-", $url);
It outputs the following:
Any-Authentic-PHP-Script--Third-Party---10--tool-to-Convert-HTML-to-BBcode-BBcode-to-HTML
But, as you will have noticed, there are some unwanted double hyphens (--) and some triple hyphens (---) in the output. I want only one hyphen between each word. How can I achieve this?
For your code, just replace
$url = preg_replace("/ /", "-", $url);
with
$url = preg_replace("/\s+/", "-", $url);
This converts each run of spaces (and tabs, newlines, and so on) into a single hyphen. \s matches any whitespace character, and + means one or more of the preceding token.
However, you can do better: replace both of your regexes with a single preg_replace call:
$url = preg_replace("/\W+/m", "-", $title );
...because \W matches any non-word character, that is, anything other than a letter, a digit, or an underscore.
In addition, if you also don't want underscores (_) in your result, use
$url = preg_replace("/[\W_]+/m", "-", $title );
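To make the difference between the two patterns concrete, here is a small sketch. The input string is made up for illustration (it is not from the question), and the /m flag is left out because it only affects ^ and $, which these patterns do not use.

```php
<?php
// Hypothetical input containing an underscore, to show the difference.
$title = 'Third Party & # 10 $ my_tool';

// \W+ treats the underscore as a word character and keeps it.
$withUnderscore = preg_replace('/\W+/', '-', $title);

// [\W_]+ replaces the underscore as well.
$withoutUnderscore = preg_replace('/[\W_]+/', '-', $title);

echo $withUnderscore, "\n";    // Third-Party-10-my_tool
echo $withoutUnderscore, "\n"; // Third-Party-10-my-tool
```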
As a side note, next time if you genuinely want to do
preg_replace("/ /", "-", $url);
please do this instead
str_replace(" ", "-", $url);
str_replace is faster than preg_replace for a fixed string like this, and the PHP docs even recommend it:
http://php.net/manual/en/function.str-replace.php
If you don't need fancy replacing rules (like regular expressions), you should always use this function instead of preg_replace().
This happens because all the non-alphanumeric characters are removed first, so your string becomes
Any Authentic PHP Script  Third Party   10  tool to
As you can see, that leaves double (and triple) spaces in some places, and each of those spaces then becomes its own hyphen.
Just do this:
preg_replace("/[^a-zA-Z0-9]+/", "-", $subject);
It replaces every run of one or more non-alphanumeric characters with a single dash.
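As a rough sketch of how this could be wrapped up for URL slugs: the function name slugify() is hypothetical, and trimming the leading and trailing hyphens is an extra step that the answers above do not include.

```php
<?php
// slugify() is a hypothetical helper name, not part of PHP.
function slugify(string $title): string
{
    // Collapse every run of non-alphanumeric characters into one hyphen.
    $slug = preg_replace('/[^a-zA-Z0-9]+/', '-', $title);

    // Drop a hyphen left over at either end of the string.
    return trim($slug, '-');
}

echo slugify('Any Authentic PHP Script / Third Party & # 10 $ tool'), "\n";
// Any-Authentic-PHP-Script-Third-Party-10-tool

echo slugify('  Convert HTML to BBcode, BBcode to HTML!'), "\n";
// Convert-HTML-to-BBcode-BBcode-to-HTML
```

Without the trim() call, the second title would end in a stray hyphen because of the final "!".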
Related
I want to write a PHP function that keeps only a-z (keeps all letters as lowercase) 0-9 and "-", and replace spaces with "-".
Here is what I have so far:
...
$s = strtolower($s);
$s = str_replace(' ', '-', $s);
$s = preg_replace("/[^a-z0-9]\-/", "", $s);
But I noticed that it keeps "?" (question marks) and I'm hoping that it doesn't keep other characters that I haven't noticed.
How could I correct it to obtain the expected result?
(I'm not super comfortable with regular expressions, especially when switching languages/tools.)
$s = strtolower($s);
$s = str_replace(' ', '-', $s);
$s = preg_replace("/[^a-z0-9\-]+/", "", $s);
You did not have the \- in the [] brackets.
An unescaped - also works instead of \- here, because a hyphen placed at the end of a character class is taken literally; both worked for me.
You need to add a quantifier after the character class.
In this case, I used +.
The plus sign indicates one or more occurrences of the preceding element.
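To see why the original pattern kept the question mark, it may help to run both patterns side by side. This is only a sketch: toSlug() is a hypothetical name and the sample strings are made up.

```php
<?php
// toSlug() is a hypothetical helper name used for illustration.
function toSlug(string $s): string
{
    $s = strtolower($s);
    $s = str_replace(' ', '-', $s);

    // Remove every run of characters that are not a-z, 0-9, or a hyphen.
    return preg_replace('/[^a-z0-9-]+/', '', $s);
}

// The original pattern only matches a disallowed character that is
// immediately followed by a hyphen, so a trailing "?" survives.
$broken = preg_replace('/[^a-z0-9]\-/', '', 'why?');

echo $broken, "\n";                    // why?
echo toSlug('Is this OK? Yes!'), "\n"; // is-this-ok-yes
```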
Ok so I am taking a string, querying a database and then must provide a URL back to the page. There are multiple special characters in the input and I am stripping all special characters and spaces out using the following code and replacing with HTML "%25" so that my legacy system correctly searches for the value needed. What I need to do however is cut down the number of "%25" that show up.
My current code would replace something like
"Hello. / there Wilbur" with "Hello%25%25%25%25there%25Wilbur"
but I would like it to return
"Hello%25there%25Wilbur"
replacing multiples of the "%25" with only one instance
$string = str_replace(' ', '-', $string); // Replaces all spaces with hyphens.
return preg_replace('/[^A-Za-z0-9]/', '%25', $string); // Replaces special chars.
Just add a + after selecting a non-alphanumeric character.
$string = "Hello. / there Wilbur";
$string = str_replace(' ', '-', $string);
// Just add a '+'. It replaces each run of one or more consecutive illegal
// characters with a single '%25'.
return preg_replace('/[^A-Za-z0-9]+/', '%25', $string);
Sample input: Hello. / there Wilbur
Sample output: Hello%25there%25Wilbur
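Since a space is itself a non-alphanumeric character, the str_replace step is arguably not needed once the + is in place. A minimal sketch, assuming the code lives in a function (the name encodeForLegacySearch() is hypothetical):

```php
<?php
// encodeForLegacySearch() is a hypothetical name for the asker's function.
function encodeForLegacySearch(string $string): string
{
    // Each run of non-alphanumeric characters (spaces included)
    // becomes a single '%25'.
    return preg_replace('/[^A-Za-z0-9]+/', '%25', $string);
}

echo encodeForLegacySearch('Hello. / there Wilbur'), "\n";
// Hello%25there%25Wilbur
```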
This will work:
while (strpos($str, '%25%25') !== false)
    $str = str_replace('%25%25', '%25', $str);
Or using a regexp:
$string_to_replace_in = preg_replace('#(?:%25){2,}#', '%25', $string_to_replace_in);
The regex needs no while loop: it collapses each run in a single pass, so the longer the runs of consecutive '%25', the better preg_replace should compare with the loop.
Cf PHP doc:
http://fr2.php.net/manual/en/function.preg-replace.php
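Both approaches can be checked against the string from the question. In this sketch the variable names are made up; note that strpos() takes the haystack first and the needle second.

```php
<?php
$input = 'Hello%25%25%25%25there%25Wilbur';

// Loop version: keep halving runs of '%25' until no doubled one is left.
$looped = $input;
while (strpos($looped, '%25%25') !== false) {
    $looped = str_replace('%25%25', '%25', $looped);
}

// Regex version: collapse each run of two or more '%25' in one pass.
$collapsed = preg_replace('/(?:%25){2,}/', '%25', $input);

echo $looped, "\n";    // Hello%25there%25Wilbur
echo $collapsed, "\n"; // Hello%25there%25Wilbur
```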
I know this question has been asked several times for sure, but I have my problems with regular expressions... So here is the (simple) thing I want to do in PHP:
I want to make a function which replaces unwanted characters of strings. Accepted characters should be:
a-z A-Z 0-9 _ - + ( ) { } # äöü ÄÖÜ space
I want all other characters to change to a "_". Here is some sample code, but I don't know what to fill in for the ?????:
<?php
// sample strings
$string1 = 'abd92 s_öse';
$string2 = 'ab! sd$ls_o';
// Replace unwanted chars in string by _
$string1 = preg_replace(?????, '_', $string1);
$string2 = preg_replace(?????, '_', $string2);
?>
Output should be:
$string1: abd92 s_öse (the same)
$string2: ab_ sd_ls_o
I was able to make it work for a-z, 0-9 but it would be nice to allow those additional characters, especially äöü. Thanks for your input!
To allow only the exact characters you described:
$str = preg_replace("/[^a-zA-Z0-9_+(){}#äöüÄÖÜ -]/u", "_", $str);
The u modifier makes PCRE treat the pattern and the subject as UTF-8, so each umlaut is matched as one character rather than as separate bytes.
To allow all whitespace, not just the space character:
$str = preg_replace("/[^a-zA-Z0-9_+(){}#äöüÄÖÜ\s-]/u", "_", $str);
To allow letters from other alphabets as well (not just the specific ones you mentioned, but also Russian, Greek, or other accented letters):
$str = preg_replace("/[^\w+(){}#\s-]/u", "_", $str);
With the u modifier, \w also matches Unicode letters and digits; without it, \w covers only ASCII by default.
If I were you, I'd go with the last one. Not only is it shorter and easier to read, but it's less restrictive, and there's no particular advantage to blocking stuff like и if äöüÄÖÜ are all fine.
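A Unicode-aware variant can be sketched as follows. It swaps \w for the explicit property classes \p{L} (any letter) and \p{N} (any number) plus the underscore, which is an alternative spelling of the same idea rather than the pattern given above. It assumes the script file and the input strings are UTF-8 encoded; replaceUnwanted() is a hypothetical name.

```php
<?php
// replaceUnwanted() is a hypothetical helper name.
// Assumes this file and the input strings are UTF-8 encoded.
function replaceUnwanted(string $str): string
{
    // \p{L} = any letter, \p{N} = any number; the u modifier enables
    // UTF-8 mode so multi-byte characters are matched as a whole.
    return preg_replace('/[^\p{L}\p{N}_+(){}#\s-]/u', '_', $str);
}

echo replaceUnwanted('abd92 s_öse'), "\n"; // abd92 s_öse
echo replaceUnwanted('ab! sd$ls_o'), "\n"; // ab_ sd_ls_o
```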
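Replace [^a-zA-Z0-9_\-+(){}#äöüÄÖÜ ] with _.
$string1 = preg_replace('/[^a-zA-Z0-9_\-+(){}#äöüÄÖÜ ]/u', '_', $string1);
This replaces every character except the ones listed after ^ in the [character set]. The u modifier is needed so that UTF-8 umlauts are matched as whole characters rather than byte by byte.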
Edit: escaped the - dash.
I've got text from which I want to remove all characters that ARE NOT the following.
desired_characters =
0123456789!&',-./abcdefghijklmnopqrstuvwxyz\n
The last is a \n (newline) that I do want to keep.
To match all characters except the listed ones, use an inverted character set [^…]:
$chars = "0123456789!&',-./abcdefghijklmnopqrstuvwxyz\n";
$pattern = "/[^".preg_quote($chars, "/")."]/";
Here preg_quote is used to escape certain special characters so that they are interpreted as literal characters.
You could also use character ranges to express the listed characters:
$pattern = "/[^0-9!&',-.\\/a-z\n]/";
In this case it doesn't matter whether the literal - in ,-. is escaped or not, because ,-. is interpreted as a character range from , (0x2C) to . (0x2E), which already contains - (0x2D).
Then you can remove those characters that are matched with preg_replace:
$output = preg_replace($pattern, "", $str);
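Put together, the preg_quote approach might look like this. The sample input is made up for illustration; everything else follows the steps above.

```php
<?php
// The characters to keep; "\n" in double quotes is a real newline.
$chars = "0123456789!&',-./abcdefghijklmnopqrstuvwxyz\n";

// preg_quote() escapes regex-special characters such as '-', '.', '!'
// and the '/' delimiter, so they are safe inside the character class.
$pattern = '/[^' . preg_quote($chars, '/') . ']/';

// Hypothetical sample input.
$str = "Hello, World! 123\nbye-bye.";
$output = preg_replace($pattern, '', $str);

echo $output, "\n";
// ello,orld!123
// bye-bye.
```

The capital letters and the spaces disappear because neither is in the list of desired characters.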
$string = 'This is an example $tring! :)';
$string = preg_replace('/[^0-9!&\',\-.\/a-z\n]/', '', $string);
echo $string; // hisisanexampletring!
^ This is case sensitive, hence the capital T is removed from the string. To allow capital letters as well:
$string = preg_replace('/[^0-9!&\',\-.\/A-Za-z\n]/', '', $string);
I want to remove all non-alphanumeric and space characters from a string. So I do want spaces to remain. What do I put for a space in the below function within the [ ] brackets:
ereg_replace("[^A-Za-z0-9]", "", $title);
In other words, what symbol represents space, I know \n represents a new line, is there any such symbol for a single space.
Just put a plain space into your character class:
[^A-Za-z0-9 ]
For other whitespace characters (tabulator, line breaks, etc.) use \s instead.
You should also be aware that PHP's POSIX ERE regular expression functions (ereg_replace and its relatives) were deprecated in PHP 5.3 and removed in PHP 7.0 in favor of the PCRE regular expression functions. So I recommend using preg_replace instead:
preg_replace("/[^A-Za-z0-9 ]/", "", $title)
If you want only a literal space, put one in. The class for whitespace characters such as tabs and newlines is \s.
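The difference between a literal space and \s only shows up when the input contains other whitespace. A small sketch with a made-up title that includes a tab:

```php
<?php
// Hypothetical input: two spaces' worth of separators and one tab.
$title = "PHP & MySQL:\tTips #1";

// A literal space in the class keeps spaces only; the tab is removed.
$spacesOnly = preg_replace('/[^A-Za-z0-9 ]/', '', $title);

// \s keeps every whitespace character, including the tab.
$anyWhitespace = preg_replace('/[^A-Za-z0-9\s]/', '', $title);

var_dump($spacesOnly);    // "PHP  MySQLTips 1"
var_dump($anyWhitespace); // "PHP  MySQL\tTips 1" (with a real tab)
```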
The accepted answer does not remove spaces.
Consider the following
$string = 'tD 13827$2099';
$string = preg_replace("/[^A-Za-z0-9 ]/", "", $string);
echo $string;
> tD 138272099
Now if we str_replace spaces, we get the desired output
$string = 'tD 13827$2099';
$string = preg_replace("/[^A-Za-z0-9 ]/", "", $string);
// remove the spaces
$string = str_replace(" ", "", $string);
echo $string;
> tD138272099
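If the spaces should go too, the two steps can probably be folded into one by leaving the space out of the character class. A minimal sketch using the same sample string ($clean is a made-up variable name):

```php
<?php
$string = 'tD 13827$2099';

// With no space in the class, spaces are removed along with
// every other non-alphanumeric character.
$clean = preg_replace('/[^A-Za-z0-9]/', '', $string);

echo $clean, "\n"; // tD138272099
```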