REGEX (in PHP): remove non-alphanumeric characters at ends?

$test = '!!!   sdfsdf   sd$$$fdf   ___'; // Single quotes, so $fdf is not interpolated as a variable.
$test = str_replace(' ', '_', $test); // Turn all spaces into underscores.
echo $test."<br />"; // Output: !!!___sdfsdf___sd$$$fdf______
$test = preg_replace('/[^a-zA-Z0-9_-]/', '-', $test); // Replace anything that isn't alphanumeric, or _-, with a hyphen.
echo $test."<br />"; // Output: !!!___sdfsdf___sd---fdf______
$test = preg_replace('/([_-])\1+/', '$1', $test); // Reduce multiple _- in a row to just one.
echo $test."<br />"; // Output: !_sdfsdf_sd-fdf_
The above code is what I currently have. What I'm trying to figure out is the regex to cut any non-alphanumeric characters off the ends, turning the final output from "!_sdfsdf_sd-fdf_" into "sdfsdf_sd-fdf".

$clean = preg_replace('~(^[^a-z0-9]+)|([^a-z0-9]+$)~i', '', $str);

You can use trim():
$test = trim($test, '_-');
echo $test;
The "!" won't make it past the first regular expression.
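As a quick check of the two suggestions above, here is a minimal sketch that runs the question's three steps and then strips the ends both ways. It assumes the sample input had runs of three spaces (as the output comments in the question suggest) and uses single quotes so that $fdf is not treated as a variable; the names $viaRegex and $viaTrim are only illustrative.

```php
<?php
// Assumed sample input: three spaces between the parts, single-quoted.
$test = '!!!   sdfsdf   sd$$$fdf   ___';

$test = str_replace(' ', '_', $test);                 // spaces -> underscores
$test = preg_replace('/[^a-zA-Z0-9_-]/', '-', $test); // anything else -> hyphen (this includes "!")
$test = preg_replace('/([_-])\1+/', '$1', $test);     // collapse runs of the same character

echo $test, "\n"; // -_sdfsdf_sd-fdf_

// Option 1: a regex that strips non-alphanumerics from both ends.
$viaRegex = preg_replace('~(^[^a-z0-9]+)|([^a-z0-9]+$)~i', '', $test);

// Option 2: trim() with the only two characters that can remain at the ends.
$viaTrim = trim($test, '_-');

echo $viaRegex, "\n"; // sdfsdf_sd-fdf
echo $viaTrim, "\n";  // sdfsdf_sd-fdf
```

Both give the same result here because, after the second step, only letters, digits, underscores and hyphens are left in the string.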

You can replace your whole code with this:
$test = preg_replace('/[^a-zA-Z0-9]+/', '_', $test);
$test = trim($test, '_');
The first will replace every run of one or more illegal characters with a single _, and the second will remove any remaining _ from the start and end.
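A minimal sketch of this shorter version, assuming a sample input with runs of three spaces and single quotes around it. Note that it does not reproduce the question's target exactly: every run of non-alphanumeric characters becomes an underscore, so the $$$ turns into _ rather than -. The variable names are only illustrative.

```php
<?php
// Assumed sample input (single-quoted so "$fdf" is not interpolated).
$input = '!!!   sdfsdf   sd$$$fdf   ___';

// Each run of non-alphanumeric characters becomes one underscore.
$collapsed = preg_replace('/[^a-zA-Z0-9]+/', '_', $input);
echo $collapsed, "\n"; // _sdfsdf_sd_fdf_

// Strip the underscores left at the start and end.
$clean = trim($collapsed, '_');
echo $clean, "\n"; // sdfsdf_sd_fdf
```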

[a-zA-Z0-9].*[a-zA-Z0-9]
Meaning: Read any alphanumeric character, then read as much of anything as we can, making sure we can get at least one alphanumeric character at the end.
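Since this pattern describes what to keep rather than what to remove, it probably fits preg_match() better than preg_replace(). A minimal sketch, assuming the input is the question's intermediate result; the variable names are illustrative. Two caveats: the pattern needs at least two alphanumeric characters to match at all, and . does not cross newlines unless the s modifier is added.

```php
<?php
$test = '-_sdfsdf_sd-fdf_';

// Keep everything from the first to the last alphanumeric character.
if (preg_match('/[a-zA-Z0-9].*[a-zA-Z0-9]/', $test, $m)) {
    $clean = $m[0];
} else {
    $clean = ''; // fewer than two alphanumeric characters: nothing to keep
}

echo $clean, "\n"; // sdfsdf_sd-fdf
```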

Related

Php make spaces in a word with a dash

I have the following string:
$thetextstring = "jjfnj 948";
At the end I want to have:
echo $thetextstring; // should print jjf-nj948
So basically what I am trying to do is join the separated parts of the string and then put a - after the first 3 letters.
So far I have
$string = trim(preg_replace('/\s+/', ' ', $thetextstring));
$result = explode(" ", $thetextstring);
$newstring = implode('', $result);
print_r($newstring);
I have been able to join the words, but how do I add the separator after the first 3 letters?
Use a regex with preg_replace function, this would be a one-liner:
^.{3}\K([^\s]*) *
Breakdown:
^ # Assert start of string
.{3} # Match 3 characters
\K # Reset match
([^\s]*) * # Capture everything up to space character(s) then try to match them
PHP code:
echo preg_replace('~^.{3}\K([^\s]*) *~', '-$1', 'jjfnj 948');
Without knowing more about how your strings can vary, this is a working solution for your task:
Pattern:
~([a-z]{2}) ~ // 2 letters (contained in capture group1) followed by a space
Replace:
-$1
Code:
$thetextstring = "jjfnj 948";
echo preg_replace('~([a-z]{2}) ~','-$1',$thetextstring);
Output:
jjf-nj948
Note that this pattern can easily be expanded to allow characters other than lowercase letters before the space: ~(\S{2}) ~
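A minimal sketch of the expanded pattern, with made-up inputs beyond the question's own. It also shows the limitation hinted at above: the dash is placed two characters before the space, not after the third character, so the two only coincide when the first part is five characters long.

```php
<?php
$pattern = '~(\S{2}) ~'; // two non-space characters followed by a space

$a = preg_replace($pattern, '-$1', 'jjfnj 948'); // the question's input
$b = preg_replace($pattern, '-$1', 'AB1cd 77');  // hypothetical input with uppercase and a digit
$c = preg_replace($pattern, '-$1', 'abcdefg 1'); // hypothetical longer first part

echo $a, "\n"; // jjf-nj948
echo $b, "\n"; // AB1-cd77
echo $c, "\n"; // abcde-fg1
```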
You can use str_replace to remove the unwanted space:
$newString = str_replace(' ', '', $thetextstring);
$newString:
jjfnj948
And then preg_replace to put in the dash:
$final = preg_replace('/^([a-z]{3})/', '\1-', $newString);
The meaning of this regex instruction is:
from the beginning of the line: ^
capture three a-z characters: ([a-z]{3})
replace this match with itself followed by a dash: \1-
$final:
jjf-nj948
$thetextstring = "jjfnj 948";
// replace all spaces with nothing
$thetextstring = str_replace(" ", "", $thetextstring);
// insert a dash after the third character
$thetextstring = substr_replace($thetextstring, "-", 3, 0);
echo $thetextstring;
This gives the requested jjf-nj948
Your approach is correct. For the last step, which consists of inserting a - after the third character, you can use the substr_replace function as follows:
$thetextstring = 'jjfnj 948';
$string = trim(preg_replace('/\s+/', ' ', $thetextstring)); // collapse whitespace runs to single spaces
$result = explode(' ', $string);
$newstring = substr_replace(implode('', $result), '-', 3, 0); // length 0 inserts without overwriting
If you are confident enough that your string will always have the same format (characters followed by a whitespace followed by numbers), you can also reduce your computations and simplify your code as follows:
$thetextstring = 'jjfnj 948';
$newstring = substr_replace(str_replace(' ', '', $thetextstring), '-', 3, 0);
Oldschool without regex
$test = "jjfnj 948";
$test = str_replace(" ", "", $test); // strip all spaces from string
echo substr($test, 0, 3)."-".substr($test, 3); // isolate first three chars, add hyphen, and concat all characters after the first three

PHP preg_replace odd characters not working

I have the following code, but for some reason, the characters are not replaced....
test.php
<?php
$s = 'AABBCC����ˮ��������Ƽ���� �˾XXYYZZ';
$softwareVersion = preg_replace('[^a-zA-Z\d\s\.]', '', $s);
echo $softwareVersion . "\n";
what I am getting
jeffreylroberts:~$ php test.php
AABBCC����ˮ��������Ƽ���� �˾XXYYZZ
jeffreylroberts:~$
what I am expecting
jeffreylroberts:~$ php test.php
AABBCC XXYYZZ
jeffreylroberts:~$
Any ideas on how to preg_replace those characters?
You forgot to add a leading and a trailing forward slash to the regex. This will give you the output you need:
$softwareVersion = preg_replace('/[^a-zA-Z0-9\d\s\.]/', '', $s);
You can also do it this way, which removes all characters except alphanumerics and the underscore (note that this removes the spaces as well):
$softwareVersion = preg_replace('/\W/', '', $s);
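One likely reason the original call ran without a delimiter error yet replaced nothing: PHP also accepts bracket pairs such as [ ] as pattern delimiters, so '[^a-zA-Z\d\s\.]' was read as the delimited pattern ^a-zA-Z\d\s\. rather than as a character class. A minimal sketch with a made-up input shows the difference:

```php
<?php
$input = 'abc a b c';

// Brackets act as delimiters here: the pattern is the literal text "abc".
$bracketDelimited = preg_replace('[abc]', 'X', $input);

// Slashes act as delimiters here: the pattern is the character class [abc].
$slashDelimited = preg_replace('/[abc]/', 'X', $input);

echo $bracketDelimited, "\n"; // X a b c
echo $slashDelimited, "\n";   // XXX X X X
```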
A few things to tweak:
Use a pattern delimiter. / is the most common one.
Reduce your pattern length by only writing a-z in the character class and use the i modifier/flag at the end of your pattern.
Escaping the dot is not necessary in the character class.
Use the + "one or more" quantifier to improve efficiency. It will match consecutive occurrences of unwanted characters and replace the whole multi-character substring in one shot.
Code:
$s='AABBCC����ˮ��������Ƽ���� �˾XXYYZZ';
$softwareVersion = preg_replace('/[^a-z\d\s.]+/i','',$s);
echo $softwareVersion . "\n";
Output:
AABBCC XXYYZZ

PHP Regex: Remove words less than 3 characters

I'm trying to remove all words of less than 3 characters from a string, specifically with RegEx.
The following doesn't work because it is looking for double spaces. I suppose I could convert all spaces to double spaces beforehand and then convert them back after, but that doesn't seem very efficient. Any ideas?
$text='an of and then some an ee halved or or whenever';
$text=preg_replace('# [a-z]{1,2} #',' ',' '.$text.' ');
echo trim($text);
Removing the Short Words
You can use this:
$replaced = preg_replace('~\b[a-z]{1,2}\b~', '', $yourstring);
Explanation
\b is a word boundary: it matches a position where one side is a word character (a letter, digit or underscore) and the other side is not (for instance a space character, or the beginning of the string)
[a-z]{1,2} matches one or two letters
\b another word boundary
Replace with the empty string.
Option 2: Also Remove Trailing Spaces
If you also want to remove the spaces after the words, we can add \s* at the end of the regex:
$replaced = preg_replace('~\b[a-z]{1,2}\b\s*~', '', $yourstring);
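To compare the two options, here is a minimal sketch that runs both on the sample text from the question. The variable names are illustrative, and the trim() call is an addition that covers the case where a short word comes last and leaves a space behind.

```php
<?php
$text = 'an of and then some an ee halved or or whenever';

// Option 1: remove only the short words; their surrounding spaces stay.
$option1 = preg_replace('~\b[a-z]{1,2}\b~', '', $text);

// Option 2: also remove the whitespace that follows each short word.
$option2 = trim(preg_replace('~\b[a-z]{1,2}\b\s*~', '', $text));

var_dump($option1); // leftover spaces are visible in the dump
echo $option2, "\n"; // and then some halved whenever
```

With option 1 the result still needs its repeated spaces collapsed, for example with preg_replace('~\s+~', ' ', trim($option1)).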
Reference
Word Boundaries
You can use the word boundary tag: \b:
Replace: \b[a-z]{1,2}\b with ''
Use this
preg_replace('/(\b.{1,2}\s)/','',$your_string);
Although some of the solutions here worked, they had a problem with my language's multi-character letters, such as "ch". A simple explode and implode worked for me.
$minWordLength = 3; // words shorter than this are removed
$string = "my super string";
$exploded = explode(" ", $string);
foreach ($exploded as $key => $word) {
    // mb_strlen counts characters rather than bytes
    if (mb_strlen($word) < $minWordLength) {
        unset($exploded[$key]);
    }
}
$string = implode(" ", $exploded);
echo $string;
// outputs "super string"
To me, it seems that this works fine with most PHP versions:
$string2 = preg_replace("/\b[a-zA-Z0-9]{1,2}\b/i", "", trim($string1));
where [a-zA-Z0-9] is the accepted range of letters and digits.

preg_replace vs trim PHP

I am working with a slug function and I don't fully understand some of it, so I was looking for some help explaining it.
My first question is about this line in my slug function: $string = preg_replace('# +#', '-', $string); I understand that this replaces all spaces with a '-'. What I don't understand is what the + sign is there for, after the space between the # characters.
Which leads to my next problem. I want a trim function that will get rid of spaces but only the spaces after they enter the value. For example someone accidentally entered "Arizona " with two spaces after the a and it destroyed the pages linked to Arizona.
So after all my rambling I basically want to figure out how I can use a trim to get rid of accidental spaces but still have the preg_replace insert '-' in between words.
ex.. "Sun City West " = "sun-city-west"
This is my full slug function-
function getSlug($string){
    if(isset($string) && $string <> ""){
        $string = strtolower($string);
        //var_dump($string); echo "<br>";
        $string = preg_replace('#[^\w ]+#', '', $string);
        //var_dump($string); echo "<br>";
        $string = preg_replace('# +#', '-', $string);
    }
    return $string;
}
You can try this:
function getSlug($string) {
    return preg_replace('#\s+#', '-', trim($string));
}
It first trims extra spaces at the beginning and end of the string, and then replaces each remaining run of whitespace with a - character.
Here your regex is:
#\s+#
which is:
# = regex delimiter
\s = any whitespace character
+ = match the previous character or group one or more times
# = regex delimiter again
so the regex here means: "match any sequence of one or more whitespace characters"
The + means at least one of the preceding character, so it matches one or more spaces. The # signs are one of the ways of marking the start and end of a regular expression's pattern block.
For a trim function, PHP handily provides trim() which removes all leading and trailing whitespace.
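Putting these pieces together, a minimal sketch of the slug function with a trim() step added. The name getSlugSketch is hypothetical. Trimming after the character filter (rather than before) is a design choice here, so that a space left behind by removed punctuation at the end does not turn into a trailing hyphen.

```php
<?php
// Hypothetical variant of the question's getSlug(), with trimming added.
function getSlugSketch(string $string): string
{
    $string = strtolower($string);
    $string = preg_replace('#[^\w ]+#', '', $string); // drop everything except word characters and spaces
    $string = trim($string);                          // remove accidental leading/trailing spaces
    return preg_replace('# +#', '-', $string);        // each run of spaces becomes one hyphen
}

echo getSlugSketch('Sun City West  '), "\n"; // sun-city-west
echo getSlugSketch('Arizona  '), "\n";       // arizona
```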

remove special characters before and after string

I'm trying to remove the hyphens that come before the first letter and after the last letter of a string, without losing the hyphens in between.
Example 1
This is the string I have:
---this-is-my-page--
output: this-is-my-page
Note: the number of hyphens differs on each request, and there may be many.
Example 2
How can I do this:
---this-is-page---
I need to replace the hyphens inside the string with spaces, without losing the hyphens at the start and end.
Use the trim function; it will work for any number of - (hyphens) at the start or end of your string:
$str = "---this-is-my-page---";
echo $str = trim($str,"-");
Edit:
And then use str_replace:
$str = str_replace("-"," ",$str);
Use trim($string, $trimCharacters):
trim — Strip whitespace (or other characters) from the beginning and end of a string
<?php
$str = '---this-is-my-page---';
var_dump( trim($str, '-') ); //string(15) "this-is-my-page"
?>
If you only want to replace the hyphens inside the string (and not in the start/end) you can use regex:
/^(-+)(.*?)(-+)$/
..and replace it with (first group)(second group with hyphens replaced)(third group).
In code:
<?php
$str = '---this-is-my-page---';
$str = preg_replace_callback('/^(-+)(.*?)(-+)$/', function($matches) {
    return $matches[1] . str_replace('-', ' ', $matches[2]) . $matches[3];
}, $str);
var_dump( $str ); //string(21) "---this is my page---"
?>
echo trim( "---this-is-my-page---","-");
trim() removes the given characters from the beginning and end of the string.
