preg_replace vs trim PHP - php

I am working with a slug function and I don't fully understand parts of it, so I was hoping someone could explain.
My first question is about this line in my slug function: $string = preg_replace('# +#', '-', $string); I understand that this replaces all spaces with a '-'. What I don't understand is what the + sign is for, which comes after the space between the # characters.
That leads to my next problem. I want a trim function that gets rid of spaces, but only the spaces after the entered value. For example, someone accidentally entered "Arizona " with two spaces after the final a, and it broke the pages linked to Arizona.
So, basically, I want to figure out how to use a trim to get rid of accidental spaces while still having preg_replace insert '-' between words.
Example: "Sun City West " = "sun-city-west"
This is my full slug function:
function getSlug($string){
    if(isset($string) && $string <> ""){
        $string = strtolower($string);
        //var_dump($string); echo "<br>";
        $string = preg_replace('#[^\w ]+#', '', $string);
        //var_dump($string); echo "<br>";
        $string = preg_replace('# +#', '-', $string);
    }
    return $string;
}

You can try this:
function getSlug($string) {
return preg_replace('#\s+#', '-', trim($string));
}
It first trims whitespace from the beginning and end of the string, and then replaces each remaining run of whitespace with the - character.
Here your regex is:
#\s+#
which is:
# = regex delimiter
\s = any space character
+ = match the previous character or group one or more times
# = regex delimiter again
so the regex here means: "match any sequence of one or more whitespace characters"

The + means one or more of the preceding character, so ' +' matches a run of one or more spaces. The # signs are delimiters, one of several ways to mark the start and end of a regular expression pattern.
For a trim function, PHP handily provides trim() which removes all leading and trailing whitespace.
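Putting the answers together with the original function, a minimal sketch might look like the following. The name makeSlug is hypothetical, and the sketch assumes ASCII input, since \w without the u modifier only matches ASCII word characters. It strips unwanted characters before trimming so that a stray symbol at the end cannot leave a trailing hyphen.

```php
<?php
// Hypothetical helper combining trim() with the two preg_replace() calls from the question.
function makeSlug(string $string): string
{
    $string = strtolower($string);
    // Drop everything that is not a word character or whitespace.
    $string = preg_replace('#[^\w\s]+#', '', $string);
    // Remove accidental leading and trailing whitespace.
    $string = trim($string);
    // Collapse each remaining run of whitespace into a single hyphen.
    return preg_replace('#\s+#', '-', $string);
}

echo makeSlug("Sun City West  "), "\n"; // prints "sun-city-west"
```

Trimming before the final replacement is the important part: otherwise the trailing spaces would themselves be turned into a hyphen.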

Related

How to replace all occurrences of a character except the first one in PHP using a regular expression?

Given an address stored as a single string with newlines delimiting its components like:
1 Street\nCity\nST\n12345
The goal would be to replace all newline characters except the first one with spaces in order to present it like:
1 Street
City ST 12345
I have tried methods like:
[$street, $rest] = explode("\n", $input, 2);
$output = "$street\n" . preg_replace('/\n+/', ' ', $rest);
I have been trying to achieve the same result using a one liner with a regular expression, but could not figure out how.
I would suggest not solving this with complicated regex but keeping it simple like below. You can split the string with a \n, pop out the first split and implode the rest with a space.
<?php
$input = explode("\n","1 Street\nCity\nST\n12345");
$input = array_shift($input) . PHP_EOL . implode(" ", $input);
echo $input;
You could use a regex trick here by reversing the string, and then replacing every occurrence of \n provided that we can lookahead and find at least one other \n:
$input = "1 Street\nCity\nST\n12345";
$output = strrev(preg_replace("/\n(?=.*\n)/", " ", strrev($input)));
echo $output;
This prints:
1 Street
City ST 12345
You can use a lookbehind pattern to ensure that the matching line is preceded with a newline character. Capture the line but not the trailing newline character and replace it with the same line but with a trailing space:
preg_replace('/(?<=\n)(.*)\n/', '$1 ', $input)
Demo: https://onlinephp.io/c/5bd6d
You can use an alternation pattern that matches either the first two lines or a newline character, capture the first two lines without the trailing newline character, and replace the match with what's captured and a space:
preg_replace('/(^.*\n.*)\n|\n/', '$1 ', $input)
Demo: https://onlinephp.io/c/2fb2f
Here is another method that does not need a regex. It works as long as the address always has exactly four lines:
$string = explode("\n", "1 Street\nCity\nST\n12345");
echo $string[0] . "<br>";
echo $string[1] . " " . $string[2] . " " . $string[3];
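As a quick sanity check, the sketch below runs the split-based approach from the question next to the lookbehind and alternation patterns from the answers and compares the results. The variable names are illustrative only, and the input is assumed to contain at least one newline.

```php
<?php
$input = "1 Street\nCity\nST\n12345";

// Split once at the first newline, then flatten the remainder.
[$street, $rest] = explode("\n", $input, 2);
$viaExplode = $street . "\n" . str_replace("\n", ' ', $rest);

// Lookbehind variant: only lines preceded by a newline lose their trailing newline.
$viaLookbehind = preg_replace('/(?<=\n)(.*)\n/', '$1 ', $input);

// Alternation variant: keep the first two lines intact, replace every other newline.
$viaAlternation = preg_replace('/(^.*\n.*)\n|\n/', '$1 ', $input);

// Both comparisons should be true for this input.
var_dump($viaExplode === $viaLookbehind, $viaExplode === $viaAlternation);
```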

How to match alphanumeric and symbols using PHP?

I'm working with text content in UTF8 encoding stored in variable $title.
Using preg_replace, how do I append an extra space if the $title string is ending with:
upper/lower case character
digit
symbol, eg. ? or !
This should do the trick:
preg_replace('/^(.*[\w?!])$/', "$1 ", $string);
In essence, if the string ends in one of the listed characters, it appends a single space.
If the string doesn't match the pattern, preg_replace() returns the original string, so you're still good.
If you need to expand the list of endings, you can just add them to the character class [\w?!].
Use a positive lookbehind before the end of the line, and replace the matched position with a space:
$title = preg_replace('/(?<=[A-Za-z0-9?!])$/',' ', $title);
You may want to try this Pattern Matching below to see if that does it for you.
<?php
// THE REGEX BELOW MATCHES THE ENDING LOWER & UPPER-CASED CHARACTERS, DIGITS
// AND SYMBOLS LIKE "?" AND "!" AND EVEN A DOT "."
// HOWEVER YOU CAN IMPROVISE ON YOUR OWN
$rxPattern = "#([\!\?a-zA-Z0-9\.])$#";
$title = "What is your name?";
var_dump($title);
// AND HERE, YOU APPEND A SINGLE SPACE AFTER THE MATCHED STRING
$title = preg_replace($rxPattern, "$1 ", $title);
var_dump($title);
// THE FIRST var_dump($title) PRODUCES:
// 'What is your name?' (length=18)
// AND THE SECOND var_dump($title) PRODUCES
// 'What is your name? ' (length=19) <== NOTICE THE LENGTH FROM ADDED SPACE.
Cheers...
You need
$title=preg_replace("/.*[\w?!]$/", "\\0 ", $title);
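Since the question mentions UTF-8 content, here is a hedged variant of the lookbehind approach that uses Unicode properties. The function name appendSpace is hypothetical. It assumes that "character" means any Unicode letter, and it uses \z instead of $ so that a trailing newline is not treated as the end of the text.

```php
<?php
// Hypothetical helper: append one space when the title ends with a letter, digit, ? or !.
function appendSpace(string $title): string
{
    // With the u modifier, \p{L} and \p{N} match Unicode letters and numbers.
    // \z anchors at the very end of the string.
    // preg_replace() returns null on invalid UTF-8, so fall back to the input.
    return preg_replace('/(?<=[\p{L}\p{N}?!])\z/u', ' ', $title) ?? $title;
}

var_dump(appendSpace("What is your name?")); // a space is appended
var_dump(appendSpace("Done."));              // unchanged, "." is not in the list
```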

PHP rtrim all trailing special characters

I'm making a function that detects and removes all trailing special characters from a string. It should convert strings like:
"hello-world"
"hello-world/"
"hello-world--"
"hello-world/%--+..."
into "hello-world".
Does anyone know a trick for this without writing a lot of code?
Just for fun
[^a-z\s]+
Explanation:
[^x]: one character that is not x
\s: "whitespace character": space, tab, newline, carriage return, vertical tab
+: one or more
PHP:
$re = "/[^a-z\\s]+/i";
$str = "Hello world\nhello world/\nhello world--\nhellow world/%--+...";
$subst = "";
$result = preg_replace($re, $subst, $str);
Try this:
$string = preg_replace('/[^A-Za-z0-9\-]/', '', $string); // Removes special chars.
or, to keep apostrophes as well:
preg_replace('/[^A-Za-z0-9\-\']/', '', $string); // Also keeps apostrophes.
Note that both remove special characters anywhere in the string, not only trailing ones.
You could use a regex like this, depending on your definition of "special characters":
function clean_string($input) {
return preg_replace('/\W+$/', '', $input);
}
It replaces any characters that are not a word character (\W) at the end of the string $ with nothing. \W will match [^a-zA-Z0-9_], so anything that is not a letter, digit, or underscore will get replaced. To specify which characters are special chars, use a regex like this, where you put all your special chars within the [] brackets:
function clean_string($input) {
return preg_replace('/[\/%.+-]+$/', '', $input);
}
This one is what you are looking for:
([^\n\w\d \"]*)$
It removes any trailing run of characters other than letters, digits, underscores, spaces, double quotes, and newlines.
Just call it like this:
preg_replace('/([^\n\w\s]*)$/', '', $string);
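To tie the answers back to the four sample strings in the question, here is a minimal sketch. The helper name stripTrailingSpecials is hypothetical, and it assumes "special" means anything other than an ASCII letter or digit.

```php
<?php
// Hypothetical helper: strip every trailing character that is not an ASCII letter or digit.
function stripTrailingSpecials(string $input): string
{
    // \z anchors the run of unwanted characters to the very end of the string.
    return preg_replace('/[^A-Za-z0-9]+\z/', '', $input);
}

$samples = ['hello-world', 'hello-world/', 'hello-world--', 'hello-world/%--+...'];
foreach ($samples as $sample) {
    echo stripTrailingSpecials($sample), "\n"; // each line prints "hello-world"
}
```

For a fixed list of characters, rtrim($input, '/%+.-') should give the same result on these inputs. Be aware that ".." inside rtrim's character list denotes a range, so the order of characters in that list matters.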

PHP Regex: Remove words less than 3 characters

I'm trying to remove all words of less than 3 characters from a string, specifically with RegEx.
The following doesn't work because it is looking for double spaces. I suppose I could convert all spaces to double spaces beforehand and then convert them back after, but that doesn't seem very efficient. Any ideas?
$text='an of and then some an ee halved or or whenever';
$text=preg_replace('# [a-z]{1,2} #',' ',' '.$text.' ');
echo trim($text);
Removing the Short Words
You can use this:
$replaced = preg_replace('~\b[a-z]{1,2}\b~', '', $yourstring);
Explanation
\b is a word boundary: it matches a position where one side is a word character (a letter, digit, or underscore) and the other side is not (for instance a space character, or the beginning of the string)
[a-z]{1,2} matches one or two letters
\b another word boundary
Replace with the empty string.
Option 2: Also Remove Trailing Spaces
If you also want to remove the spaces after the words, we can add \s* at the end of the regex:
$replaced = preg_replace('~\b[a-z]{1,2}\b\s*~', '', $yourstring);
You can use the word boundary anchor \b:
Replace: \b[a-z]{1,2}\b with ''
Use this
preg_replace('/(\b.{1,2}\s)/','',$your_string);
Some of the solutions here worked, but they had a problem with my language's "multichar characters", such as "ch". A simple explode and implode worked for me.
$maxWordLength = 3;
$string = "my super string";
$exploded = explode(" ", $string);
foreach($exploded as $key => $word) {
if(mb_strlen($word) < $maxWordLength) unset($exploded[$key]);
}
$string = implode(" ", $exploded);
echo $string;
// outputs "super string"
For me, this works fine with most PHP versions:
$string2 = preg_replace("/\b[a-zA-Z0-9]{1,2}\b/i", "", trim($string1));
where [a-zA-Z0-9] is the accepted range of letters and digits.
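The answers above can be combined into one small function. This is only a sketch: the name removeShortWords and the $minLength parameter are hypothetical, and a "word" is assumed to be a run of Unicode letters or digits. Lookarounds are used instead of \b so that the behaviour does not depend on how \b treats non-ASCII letters.

```php
<?php
// Hypothetical helper: drop words shorter than $minLength characters.
function removeShortWords(string $text, int $minLength = 3): string
{
    if ($minLength < 2) {
        return trim($text); // nothing can be shorter than one character
    }
    $max = $minLength - 1;
    // A short word is 1..$max letters or digits with no letter or digit on either side.
    // The trailing \s* also removes the whitespace that followed the word.
    $pattern = '/(?<![\p{L}\p{N}])[\p{L}\p{N}]{1,' . $max . '}(?![\p{L}\p{N}])\s*/u';
    $result = preg_replace($pattern, '', $text);
    // preg_replace() returns null on invalid UTF-8, so fall back to the input.
    return trim($result ?? $text);
}

echo removeShortWords('an of and then some an ee halved or or whenever'), "\n";
// prints "and then some halved whenever"
```

One limitation to keep in mind: punctuation splits words, so in a word like "don't" the lone "t" counts as a short word and is removed.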

Replace the leading space with &nbsp; the same number of times using a PHP regular expression

I want to replace each leading space with &nbsp;, keeping the same number of occurrences.
Explanation:
If one leading space exists in the input, it should be replaced with one &nbsp;.
If two leading spaces exist in the input, they should be replaced with two &nbsp;s.
If n leading spaces exist in the input, they should be replaced with exactly n &nbsp;s.
Example 1:
 My name is XYZ
Output:
&nbsp;My name is XYZ
Example 2:
  My name is XYZ
Output:
&nbsp;&nbsp;My name is XYZ
How can I replace only leading spaces, using a PHP regular expression?
preg_replace('/\G /', '&nbsp;', $str);
\G matches the position where the last match ended, or the beginning of the string if there isn't any previous match.
Actually, in PHP it matches where the next match is supposed to begin. That isn't necessarily the same as where the previous match ended.
$len_before = strlen($str);
$str = ltrim($str, ' ');
$len_after = strlen($str);
$str = str_repeat('&nbsp;', $len_before - $len_after) . $str;
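Using preg_replace there is also
$str = preg_replace('/^( +)/e', 'str_repeat("&nbsp;", strlen("$1"))', $str);
but note that it uses the /e flag, which was deprecated in PHP 5.5 and removed in PHP 7.0; preg_replace_callback() is its replacement.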
See http://www.ideone.com/VWNKZ for the result.
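On current PHP versions the same idea can be written with preg_replace_callback(). The sketch below assumes the intended replacement is the HTML entity &nbsp;, and the function name leadingSpacesToNbsp is hypothetical.

```php
<?php
// Hypothetical helper: replace each leading space with one &nbsp; entity.
function leadingSpacesToNbsp(string $str): string
{
    return preg_replace_callback(
        '/^ +/',
        function (array $m): string {
            // $m[0] holds the whole run of leading spaces.
            return str_repeat('&nbsp;', strlen($m[0]));
        },
        $str
    ) ?? $str; // fall back to the input if the regex engine reports an error
}

echo leadingSpacesToNbsp("  My name is XYZ"), "\n"; // prints "&nbsp;&nbsp;My name is XYZ"
```

Only the run at the very start of the string is touched, so spaces between words stay as they are.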
Use:
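preg_replace('/^ +/m', '&nbsp;', $str);
Note that this replaces each run of leading spaces with a single &nbsp;, so it does not preserve the number of spaces.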
Use preg_match with PREG_OFFSET_CAPTURE flag set. The offset is the length of the "spaces". Then use str_repeat with the offset.
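A minimal sketch of that last suggestion follows. The function name nbspIndent is hypothetical, and the replacement is again assumed to be &nbsp;. It matches the first character that is not a space; with PREG_OFFSET_CAPTURE, the byte offset of that match equals the number of leading spaces.

```php
<?php
// Hypothetical helper built on preg_match() with PREG_OFFSET_CAPTURE.
function nbspIndent(string $str): string
{
    if (preg_match('/[^ ]/', $str, $m, PREG_OFFSET_CAPTURE)) {
        // $m[0][1] is the byte offset of the first non-space character.
        $count = $m[0][1];
    } else {
        // Empty string or spaces only: every character is a leading space.
        $count = strlen($str);
    }
    return str_repeat('&nbsp;', $count) . substr($str, $count);
}

echo nbspIndent("   indented"), "\n"; // prints "&nbsp;&nbsp;&nbsp;indented"
```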
