PHP - Removing Brackets and Special Characters - php

I'm looking to modify a PHP string so I can use it as an anchor tag.
I used the method found here: Remove all special characters from a string
It worked well to remove ampersands from my strings, but it doesn't seem to be removing or affecting the brackets or punctuation.
Here's what I'm currently using:
$name_clean = preg_replace('/ [^A-Za-z0-9\-]/', '', $name); // REMOVES SPECIAL CHARACTERS
$name_slug = str_replace(' ', '-', $name_clean); // REPLACES SPACES WITH DASHES IN TITLE
$link = strtolower( $name_slug ); // CREATES LOWERCASE SLUG VERSION OF TITLE_SLUG
My string (in this case $name) = St. John's (Newfoundland).
The output I get = #st.-john'snewfoundland)
I'd like to remove the periods, apostrophes and brackets altogether.
Any help would be greatly appreciated!

Your regex pattern / [^A-Za-z0-9\-]/ appears to contain a space after the opening /. This pattern will only match a special character that comes after a space. Removing that space should get the result you want.
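One caveat, sketched here on the assumption that the anchor you want is st-johns-newfoundland: once the stray space is gone from the pattern, the character class strips the spaces in $name as well, so str_replace has nothing left to turn into dashes. Listing a literal space inside the class keeps them. Variable names follow the question; $no_spaces is only there for comparison.

```php
<?php
$name = "St. John's (Newfoundland)";

// Pattern with the leading space removed: everything that is not a letter,
// digit or dash is stripped, and that includes the spaces.
$no_spaces = strtolower(preg_replace('/[^A-Za-z0-9\-]/', '', $name));

// Variant that keeps spaces by listing a space inside the character class.
$name_clean = preg_replace('/[^A-Za-z0-9 \-]/', '', $name);
$name_slug  = str_replace(' ', '-', $name_clean);
$link       = strtolower($name_slug);

echo $no_spaces, "\n"; // stjohnsnewfoundland
echo $link, "\n";      // st-johns-newfoundland
```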

Related

PHP Regular Expression: could not convert space into glue

I have a PHP function that should do the following tasks:
The function will take 2 params: the string and the glue (defaults to "-").
for a given string,
-- remove any special characters
-- make it lowercase
-- remove multiple spaces
-- replace spaces with glue (-).
The function takes $input as parameter. The code I have used for that is below:
//make all the characters lowercase
$low = strtolower($input);
//remove special characters and multiple spaces
$nospecial = preg_replace('/[^a-zA-Z0-9\s+]/', '', $low);
//replace the spaces with glue (-). here is the problem.
$converted = preg_replace('/\s/', '-', $nospecial);
return $converted;
I did not find anything wrong with this code, but it shows multiple glues in the output. But I have already removed multiple spaces in the second line of the code, so why does it show multiple glues? Does anyone have a solution?
"But I have already removed multiple spaces in the second line of the code"
No, you haven't removed the spaces. The second line of code keeps in $nospecial the letters, the digits, the spaces and the plus sign (+).
A character class matches a single character in the subject. \s+ in a character class doesn't mean "one or more space characters". It means either a whitespace character (\s) or a plus sign (+). If it meant what you intended, $nospecial wouldn't contain any space characters at all.
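A small check of that point, using a made-up input string (not from the question), shows both the plus sign and the double space surviving the first replace:

```php
<?php
$input = 'a+b  c!';

// Inside [...] the + is a literal plus sign, not a quantifier,
// so the class keeps letters, digits, whitespace and "+".
$nospecial = preg_replace('/[^a-zA-Z0-9\s+]/', '', $input);

// Each remaining whitespace character becomes its own glue.
$converted = preg_replace('/\s/', '-', $nospecial);

echo $nospecial, "\n"; // a+b  c
echo $converted, "\n"; // a+b--c
```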
I suggest you split the second processing step in two: first remove all the special characters (keep letters, digits and spaces) then compact the spaces (there is no way to do both of them in a single replace).
The compacting can be then combined with the replacement of the spaces with the glue in a single operation:
// Make all the characters lowercase
// Trim the whitespace first so the final result has no stray hyphens on the sides
$low = strtolower(trim($input));
// Remove special characters (keep letters, digits and spaces)
$nospecial = preg_replace('/[^a-z0-9\s]/', '', $low);
// Compact the spaces and replace them with the glue
$converted = preg_replace('/\s+/', '-', $nospecial);
return $converted;
Update: added trimming of the input string before any processing, to avoid getting a result that starts or ends with the glue. This is not required by the question; it was suggested by @niet-the-dark-absol in a comment and I also think it's a good thing. Most probably, the string generated this way is used as a file name by the question's author.
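Wrapped up as the two-parameter function the question describes, the steps might look like the sketch below. The name make_slug is hypothetical, and the sketch assumes the glue contains no "$" or "\" (preg_replace would treat those as backreference syntax in the replacement). Unlike the snippet above, it trims after stripping the special characters, so an input such as "! hi" does not leave a leading glue.

```php
<?php
// Hypothetical helper name; the question does not name its function.
function make_slug(string $input, string $glue = '-'): string
{
    // Lowercase first so the character class only needs a-z
    $low = strtolower($input);
    // Remove special characters (keep letters, digits and whitespace),
    // then trim so no whitespace is left on either side
    $nospecial = trim(preg_replace('/[^a-z0-9\s]/', '', $low));
    // Compact each run of whitespace into a single glue
    return preg_replace('/\s+/', $glue, $nospecial);
}

echo make_slug("  Hello,   World!  "), "\n"; // hello-world
echo make_slug('PHP is fun', '_'), "\n";     // php_is_fun
```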

Regex replace special characters with hyphen except first and last

I'm using this regular expression to format song names to url friendly aliases. It replaces 1 or more consecutive instances of special characters and white space with a hyphen.
$alias = preg_replace("/([^a-zA-Z0-9]+)/", "-", trim(strtolower($name)));
This works fine so if the song name was This is *is* a (song) NAME it would create the alias this-is-a-song-name. But if the name has special characters at the beginning or end *This is a song (name) it would create -this-is-a-song-name- creating hyphens at each end.
How can I modify the regex so that it replaces as it does above, but removes the special characters at the very beginning and end instead of turning them into hyphens?
You need to do a double replacement.
$str = '*This is a song (name)';
$s = preg_replace('~^[^a-zA-Z0-9]+|[^a-zA-Z0-9]+$~', '', $str);
echo preg_replace('~[^a-zA-Z0-9]+~', '-', $s);
You can perform that in 2 steps:
In the first step, you remove special characters at the beginning and at the end of your string
In the second step, you reuse your regex as it is now.
The code could look like this:
$alias = preg_replace("/(^[^a-zA-Z0-9]+|[^a-zA-Z0-9]+$)/", "", trim(strtolower($name)));
$alias = preg_replace("/([^a-zA-Z0-9]+)/", "-", $alias);
(I haven't tested it but you may get the idea)
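A different route to the same result, not taken from the answers above, is to keep the original single replace and then strip the hyphens from both ends with trim. A minimal sketch, using the song name from the question:

```php
<?php
$name = '*This is a song (name)';

// Same replace as in the question: runs of non-alphanumerics become one hyphen
$alias = preg_replace('/[^a-z0-9]+/', '-', strtolower($name));
// Drop the hyphens left at the start and end
$alias = trim($alias, '-');

echo $alias, "\n"; // this-is-a-song-name
```

This works because any leading or trailing special characters have already been collapsed into a single hyphen by the first step.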

How to use RegEx to strip specific leading and trailing punctuation in PHP

We're scrubbing a ridiculous amount of data, and are finding many examples of otherwise clean data that are left with irrelevant punctuation at the beginning and end of the final string. Quotes and double quotes are fine, but leading/trailing dashes, commas, etc. need to be removed.
I've studied the answer at How can I remove all leading and trailing punctuation?, but am unable to find a way to accomplish the same in PHP.
- some text.                (dash and period should be removed)
"Some Other Text".          (period should be removed)
it's a matter of opinion    (apostrophe should be kept)
/ some more text?           (slash should be removed and question mark kept)
In short,
Certain punctuation occurring BEFORE the first AlphaNumeric character must be removed
Certain punctuation occurring AFTER the last AlphaNumeric character must be removed
How can I accomplish this with PHP? The few examples I've found surpass my RegEx/JS abilities.
This is an answer without regex.
You can use the function trim (or a combination of ltrim/rtrim) to specify all the characters you want to remove. For your example:
$str = trim($str, " \t\n\r\0\x0B-.");
(As I suppose you also want to remove spacing and newlines at the beginning/end, I left the default mask in.)
See also rtrim and ltrim if you don't want to remove the same charlist at the beginning and the end of your strings.
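Applied to the four samples from the question, the trim approach might look like this. The mask is an assumption: it extends the one above with "/" and "," because the question also wants slashes and commas gone.

```php
<?php
$samples = array(
    '- some text.',
    '"Some Other Text".',
    "it's a matter of opinion",
    '/ some more text?',
);

// Default whitespace mask plus dash, period, slash and comma.
// Quotes, apostrophes and question marks are not listed, so they stay.
$mask = " \t\n\r\0\x0B-./,";

$cleaned = array();
foreach ($samples as $sample) {
    $cleaned[] = trim($sample, $mask);
}

print_r($cleaned);
// some text
// "Some Other Text"
// it's a matter of opinion
// some more text?
```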
You can modify the pattern to include other characters.
$array = array(
    '- some text.',
    '"Some Other Text".',
    'it\'s a matter of opinion',
    '/ some more text?'
);
foreach($array as $key => $string){
    $array[$key] = preg_replace(array(
        '/^[\.\-\/]*/',
        '/[\.\-\/]*$/'
    ), array('', ''), $string);
}
print_r($array);
If the punctuation could be more than one character, you could do this
function trimFormatting($str){ // trim
    $osl = 0;
    $pat = '(<br>|,|\s+)';
    while($osl!==strlen($str)){
        $osl = strlen($str);
        $str = preg_replace('/^'.$pat.'|'.$pat.'$/i','',$str);
    }
    return $str;
}
echo trimFormatting('<BR>,<BR>Hello<BR>World<BR>, <BR>');
// will give "Hello<BR>World"
The routine checks for "<BR>", "," and one or more spaces ("\s+"). The "|" is the OR operator, used three times in the routine. It trims at both the start ("^") and the end ("$") at the same time. It keeps looping until no more matches are trimmed off (i.e. there is no further reduction in string length).

How to replace one or more consecutive spaces with one single character?

I want to generate a string like an SEO-friendly URL. I want multiple blank spaces to be eliminated, each single space to be replaced by a hyphen (-), then strtolower applied, and no special chars should be allowed.
For that I am currently using code like this:
$string = htmlspecialchars("This Is The String");
$string = strtolower(str_replace(' ', '-', $string));
The above code will generate multiple hyphens. I want to eliminate that multiple space and replace it with only one space. In short, I am trying to achieve the SEO friendly URL like string. How do I do it?
You can use preg_replace to replace any sequence of whitespace chars with a dash...
$string = preg_replace('/\s+/', '-', $string);
The outer slashes are delimiters for the pattern - they just mark where the pattern starts and ends
\s matches any whitespace character
+ causes the previous element to match 1 or more times. By default, this is 'greedy' so it will eat up as many consecutive matches as it can.
See the manual page on PCRE syntax for more details
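To see what the + changes, here is a short comparison on a made-up input with runs of spaces (the sample string is an assumption, chosen to show the difference):

```php
<?php
$string = "This   Is  The String";

// With +, each run of whitespace collapses into one dash
$collapsed = strtolower(preg_replace('/\s+/', '-', $string));

// Without +, every whitespace character gets its own dash
$not_collapsed = strtolower(preg_replace('/\s/', '-', $string));

echo $collapsed, "\n";     // this-is-the-string
echo $not_collapsed, "\n"; // this---is--the-string
```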
echo preg_replace('~(\s+)~', '-', $yourString);
What you want is to "slugify" a string. Try a search on SO or Google for "php slugify" or "php slug".

Replace all spaces and special symbols with dash in URL using PHP language

How to replace spaces and dashes when they appear together with only dash in PHP?
e.g below is my URL
http://kjd.case.150/1 BHK+Balcony- 700+ sqft. spacious apartmetn Bandra Wes
In this I want to replace all special characters with dash in PHP. In the URL there is already one dash after "balcony". If I replace the dash with a special character, then it becomes two dashes because there's already one dash in the URL and I want only 1 dash.
I'd say you may want it the other way around: replace not just spaces but every non-alphanumeric character.
That is because there can be other characters that are not allowed in a URL (the + sign, for example, which is used as a space replacement).
So, to make a valid URL from free-form text:
$url = preg_replace("![^a-z0-9]+!i", "-", $url);
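Run against the title part of the URL from the question, that replace gives a single dash wherever a space, plus sign, period or existing dash (or any mix of them) sat between words. The lowercasing and the final trim are additions assumed here for a tidier slug, not part of the answer:

```php
<?php
// Only the free-form part of the URL from the question
$title = "1 BHK+Balcony- 700+ sqft. spacious apartmetn Bandra Wes";

// Each run of non-alphanumeric characters becomes exactly one dash
$slug = preg_replace("![^a-z0-9]+!i", "-", $title);
// Assumed extras: lowercase, and no dash at either end
$slug = trim(strtolower($slug), '-');

echo $slug, "\n"; // 1-bhk-balcony-700-sqft-spacious-apartmetn-bandra-wes
```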
If there could be max one space surrounding the hyphen you can use the answer by John. If there could be more than one space you can try using preg_replace:
$str = preg_replace('/\s*-\s*/','-',$str);
This would replace even a - not surrounded with any spaces with - !!
To make it a bit more efficient you could do:
$str = preg_replace('/\s+-\s*|\s*-\s+/','-',$str);
Now this ensures that a - is only replaced when it has at least one space next to it.
This should do it for you
strtolower(str_replace(array(' ', ' '), '-', preg_replace('/[^a-zA-Z0-9\s]/', '', trim($string))));
Apply the regular expression /[^a-zA-Z0-9]/ with the replacement '-', which replaces every non-alphanumeric character with -. Store the result in a variable, then apply /\-$/ with the replacement '', which strips a trailing dash.
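It's an old thread, but to help someone, use this function:
function urlSafeString($str)
{
    // eregi_replace() was removed in PHP 7; preg_replace() with the i flag does the same job
    // Turn dashes into spaces, then drop everything except letters, digits and spaces
    $str = preg_replace('/[^a-z0-9 ]/i', '', str_replace('-', ' ', $str));
    // Collapse each run of spaces into a single dash
    $str = preg_replace('/ +/', '-', trim($str));
    return $str;
}
It returns a URL-safe string.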
