Don't show white spaces in regex output


I need to match everything except the whitespace.
For example, given this string: 16790 - 140416 / 3300
I want regex (in PHP) to produce the following result, without whitespace: 140416/3300
I used \s+\d+\s+-\s+(\d+\s+\/\s+\d+), and of course the captured group still contains the whitespace:
140416 / 3300
How can I get a group 1 match without whitespace?
Thank you

$subject = "16790 - 140416 / 3300";
$result = preg_replace('%.*?(\d+)\s+/\s+(\d+)%', '$1/$2', $subject);
echo $result;
// 140416/3300
http://regex101.com/r/oV4hN0

If you're just removing an unknown number of spaces and tabs from a string, you can use
$result = preg_replace('~\s+~', '', $subject);
If you're matching the whole pattern you gave (not just the division), you can use this:
$result = preg_replace('~(\d+)\s*-\s*(\d+)\s*/\s*(\d+)~', '\1-\2/\3', $subject);
Finally, and this might be the best option, if you'd like to remove spaces around operators such as +, *, - and / when they sit between digits, you can use this:
$result = preg_replace('~(\d)\s*([+*/-])\s*(\d)~', '\1\2\3', $subject);
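To compare the three approaches above side by side, here is a minimal sketch run against the question's sample string. The variable names ($noSpaces, $wholePattern, $aroundOperators, $division) are illustrative only, and the final preg_match() step is an assumption about one way to get just the 140416/3300 part, by capturing the two numbers separately and joining them.

```php
<?php
$subject = "16790 - 140416 / 3300";

// 1. Strip every run of whitespace.
$noSpaces = preg_replace('~\s+~', '', $subject);

// 2. Match the whole "a - b / c" shape and rebuild it without spaces.
$wholePattern = preg_replace('~(\d+)\s*-\s*(\d+)\s*/\s*(\d+)~', '\1-\2/\3', $subject);

// 3. Remove spaces only around an operator that sits between two digits.
$aroundOperators = preg_replace('~(\d)\s*([+*/-])\s*(\d)~', '\1\2\3', $subject);

echo $noSpaces, "\n";        // 16790-140416/3300
echo $wholePattern, "\n";    // 16790-140416/3300
echo $aroundOperators, "\n"; // 16790-140416/3300

// Capture the two numbers of the division separately and join them,
// so the whitespace never ends up inside a capture group.
$division = null;
if (preg_match('~(\d+)\s*/\s*(\d+)~', $subject, $m)) {
    $division = $m[1] . '/' . $m[2];
}
echo $division, "\n";        // 140416/3300
```

The third pattern leaves text such as "a - b" alone, because it only matches when both neighbours of the operator are digits.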

Related

how to clean a dirty csv string using php regex

my string may be like this:
# *lorem.jpg,,, ip sum.jpg,dolor ..jpg,-/ ?
in fact - it is a dirty csv string - having names of jpg images
I need to remove any non-alphanum chars - from both sides of the string
then - inside the resulting string - remove the same - except commas and dots
then - remove duplicates commas and dots - if any - replace them with single ones
so the final result should be:
lorem.jpg,ipsum.jpg,dolor.jpg
I firstly tried to remove any white space - anywhere
$str = str_replace(" ", "", $str);
then I used various forms of trim functions - but it is tedious and a lot of code
the additional problem is - duplicates commas and dots may have one or more instances - for example - .. or ,,,,
is there a way to solve this using regex, pls ?

List of modeled steps following your words:
Step 1
"remove any non-alphanum chars from both sides of the string"
translated: remove leading and trailing consecutive [^a-zA-Z0-9] characters
regex: replace ^[^a-zA-Z0-9]*(.*?)[^a-zA-Z0-9]*$ with $1
Step 2
"inside the resulting string - remove the same - except commas and dots"
translated: remove any [^a-zA-Z0-9.,]
regex: replace [^a-zA-Z0-9.,] with empty string
Step 3
"remove duplicates commas and dots - if any - replace them with single ones"
translated: replace consecutive [,.] with a single instance
regex: replace (\.{2,}) with .
regex: replace (,{2,}) with ,
PHP Demo:
https://onlinephp.io/c/512e1
<?php
$subject = " # *lorem.jpg,,, ip sum.jpg,dolor ..jpg,-/ ?";
$firstStep = preg_replace('/^[^a-zA-Z0-9]*(.*?)[^a-zA-Z0-9]*$/', '$1', $subject);
$secondStep = preg_replace('/[^a-zA-Z0-9.,]/', '', $firstStep);
$thirdStepA = preg_replace('(\.{2,})', '.', $secondStep);
$thirdStepB = preg_replace('(,{2,})', ',', $thirdStepA);
echo $thirdStepB; //lorem.jpg,ipsum.jpg,dolor.jpg
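The steps can also be folded into one preg_replace() call, since the function accepts arrays of patterns and replacements and applies them in order. This is only a sketch and not the answer's original code: the function name cleanCsv is hypothetical, and trim() with a character list stands in for step 1 by stripping leftover commas and dots from both ends once the other characters are gone.

```php
<?php
// Hypothetical helper: cleans a "dirty" comma-separated list of file names.
function cleanCsv(string $dirty): string
{
    $cleaned = preg_replace(
        [
            '/[^a-zA-Z0-9.,]+/', // drop everything except alphanumerics, dots, commas
            '/\.{2,}/',          // collapse repeated dots
            '/,{2,}/',           // collapse repeated commas
        ],
        ['', '.', ','],
        $dirty
    );

    // Remove any commas or dots left at either end of the string.
    return trim($cleaned, '.,');
}

$clean = cleanCsv(' # *lorem.jpg,,, ip sum.jpg,dolor ..jpg,-/ ?');
echo $clean, "\n"; // lorem.jpg,ipsum.jpg,dolor.jpg
```

Mixed runs such as ",." are left untouched by this sketch; handling them would need an extra rule.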

Have a look at
https://www.php.net/manual/en/function.preg-replace.php
It replaces anything inside a string that matches a pattern. \s represents any whitespace character, but beware of NBSP (the non-breaking space), which \s does not match by default (with the u modifier on UTF-8 input, \h matches it).
Example 4
$str = preg_replace('/\s\s+/', '', $str);
It will be something like that.
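To make the NBSP remark concrete, here is a small sketch that assumes UTF-8 input and uses the u modifier, under which \s and \h also match the non-breaking space. The sample strings and variable names are made up for illustration, and replacing each run with a single space (rather than nothing) is a choice of this sketch.

```php
<?php
// "\u{00A0}" is a UTF-8 encoded non-breaking space (PHP 7.0+ escape syntax).
$str = "lorem\u{00A0}\u{00A0}ipsum \t dolor";

// With the u modifier, \s is Unicode-aware and also matches NBSP.
$collapsed = preg_replace('/\s+/u', ' ', $str);

// \h matches horizontal whitespace only (space, tab, NBSP), so the
// newline in this string is kept.
$horizontal = preg_replace('/\h+/u', ' ', "a\u{00A0}b\nc");

echo $collapsed, "\n"; // lorem ipsum dolor
```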

Can you try this:
$string = ' # *lorem.jpg,,,, ip sum.jpg,dolor .jpg,-/ ?';
// keep only alphanumerics, commas and dots
$result = preg_replace("/[^A-Za-z0-9,.]/", '', $string);
// collapse repeated commas and dots into single ones
$result = preg_replace('/,+/', ',', $result);
$result = preg_replace('/\.+/', '.', $result);
// remove commas, semicolons, dots and spaces from the end
$result = preg_replace("/[ ,;.]*$/", '', $result);

Regex rules in an array

Maybe this cannot be solved the way I want, but perhaps you can help me.
I have a lot of malformed words in my product names.
Some of them have a leading ( and a trailing ), or only one of these; the same goes for the / and " signs.
What I do is explode the product name by spaces and examine each word.
I want to replace these characters with nothing. But a hard drive could be named 40GB ATA 3.5" hard drive. I need to process every word, but I cannot use the same method for 3.5" as for () or //, because 3.5" is valid.
So I only need to remove the quotes when there is one at the start of the string AND one at the end of the string.
$cases = [
'(testone)',
'(testtwo',
'testthree)',
'/otherone/',
'/othertwo',
'otherthree/',
'"anotherone',
'anothertwo"',
'"anotherthree"',
];
$patterns = [
'/^\(/',
'/\)$/',
'~^/~',
'~/$~',
//Here is what I can not imagine, how to add the rule for `"`
];
$result = preg_replace($patterns, '', $cases);
This works well, but can it be done with a single regex in preg_replace()? If so, can somebody help me with the pattern(s) for the quotes?
Result for quotes should be this:
'"anotherone', //no quote at end leave the leading
'anothertwo"', //no quote at start leave the trailing
'anotherthree', //there are quotes on start and end so remove them.

You may use another approach: rather than define an array of patterns, use one single alternation based regex:
preg_replace('~^[(/]|[/)]$|^"(.*)"$~s', '$1', $s)
See the regex demo
Details:
^[(/] - a literal ( or / at the start of the string
| - or
[/)]$ - a literal ) or / at the end of the string
| - or
^"(.*)"$ - a " at the start of the string, then any 0+ characters (due to /s option, the . matches a linebreak sequence, too) that are captured into Group 1, and " at the end of the string.
The replacement pattern is $1 that is empty when the first 2 alternatives are matched, and contains Group 1 value if the 3rd alternative is matched.
Note: In case you need to replace until no match is found, use a preg_match with preg_replace together (see demo):
$s = '"/some text/"';
$re = '~^[(/]|[/)]$|^"(.*)"$~s';
$tmp = '';
while (preg_match($re, $s) && $tmp != $s) {
$tmp = $s;
$s = preg_replace($re, '$1', $s);
}
echo $s;
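As a quick check of the single-pass pattern, the sketch below runs it over the $cases array from the question. preg_replace() accepts an array subject and returns an array; the $expected list is derived from the behaviour described in the question, not taken from the answer.

```php
<?php
$cases = [
    '(testone)', '(testtwo', 'testthree)',
    '/otherone/', '/othertwo', 'otherthree/',
    '"anotherone', 'anothertwo"', '"anotherthree"',
];

// One alternation: leading ( or /, trailing ) or /, or a fully quoted string.
$re = '~^[(/]|[/)]$|^"(.*)"$~s';

// $1 is empty for the first two alternatives and holds the inner text
// for the quoted alternative.
$result = preg_replace($re, '$1', $cases);

$expected = [
    'testone', 'testtwo', 'testthree',
    'otherone', 'othertwo', 'otherthree',
    '"anotherone', 'anothertwo"', 'anotherthree',
];

print_r($result);
```

A lone quote at only one end is left in place, so a word like 3.5" passes through unchanged.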

This should work:
preg_replace('~^"(.*)"$|^[/(]?(.*?)[/)]?$~', '$1$2', $string)

Remove empty space and plus sign from the beginning of a string

I have a string that begins with an empty space and a + sign :
$s = ' +This is a string[...]';
I can't figure out how to remove the first + sign using PHP. I've tried ltrim and preg_replace with several patterns, including escaping the + sign, and I've also tried substr and str_replace. None of them removes the plus sign at the beginning of the string: either nothing is replaced, or the whole string is replaced/removed. Any help will be highly appreciated!
Edit: After further investigation, it seems that it's not really a plus sign. It looks 100% like a + sign, but I think it isn't. Any ideas on how to decode/convert it?
Edit 2: There's one white space before the + sign. I'm using the get_the_excerpt WordPress function to get the string.
Edit 3: After successfully removing the empty space and the + with substr($s, 2);, here's what I get now:
$s == '#43;This is a string[...]'
Wiki : I had to remove 6 characters, I've tried substr($s, 6); and it's working well now. Thanks for your help guys.
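The #43; left over in Edit 3 suggests that the excerpt starts with the HTML entity &#43; (the numeric character reference for +) rather than a literal plus sign, which would also explain why six characters had to be removed. That is an assumption about the data, not something confirmed in the question. Under that assumption, a sketch that decodes entities first and then trims:

```php
<?php
// Assumed raw value: a space followed by the entity "&#43;" (6 characters).
$s = ' &#43;This is a string[...]';

// Turn entities such as &#43; back into the characters they stand for.
$decoded = html_entity_decode($s, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Now the leading space and plus sign are ordinary characters.
$clean = ltrim($decoded, '+ ');

echo $clean, "\n"; // This is a string[...]
```

Decoding first avoids hard-coding the number of characters to cut, which would break if the excerpt ever starts differently.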

ltrim has second parameter
$s = ltrim($s,'+');
edit:
If it is not working, it means that there is something else at the beginning of that string, e.g. white space. You can check by using var_dump($s);, which shows you exactly what you have there.

You can use explode like this:
$result = explode('+', $s, 2)[1];
explode splits the string wherever the delimiter you pass as the first argument is found, drops the delimiter, and returns the resulting pieces in an array.
It's mostly used with multiple occurrences of a delimiter, but it will work in your case too.
For example:
$string = "This,is,a,string";
$results = explode(',', $string);
var_dump($results); //prints ['This', 'is', 'a', 'string' ]
In your case the plus sign comes right after the leading space, so index 0 of the returned array holds only that space and index 1 holds your string. The limit of 2 keeps any later + signs inside the text from splitting it further.

Here's a couple of different ways I can think of
str_replace
$string = str_replace('+', '', $string);
preg_replace
$string = preg_replace('/^\+/', '', $string);
ltrim
$string = ltrim($string, '+');
substr
$string = substr($string, 1);
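These four options are not interchangeable. The sketch below, using a made-up input with several plus signs, shows how their results differ; the variable names are illustrative.

```php
<?php
$string = '++1+2';

// Removes every "+" anywhere in the string.
$viaStrReplace = str_replace('+', '', $string);

// Removes a single "+" only if it is the very first character.
$viaPregReplace = preg_replace('/^\+/', '', $string);

// Removes all consecutive "+" characters from the left edge.
$viaLtrim = ltrim($string, '+');

// Drops the first character, whatever it is.
$viaSubstr = substr($string, 1);

echo $viaStrReplace, "\n";  // 12
echo $viaPregReplace, "\n"; // +1+2
echo $viaLtrim, "\n";       // 1+2
echo $viaSubstr, "\n";      // +1+2
```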

try this
<?php
$s = '+This is a string';
echo ltrim($s,'+');
?>

You can use ltrim() or substr().
For example :
$output = ltrim($string, '+');
or you can use
$output = substr($string, 1);

You can remove multiple characters with trim. Perhaps you were not re-assigning the outcome of your trim function.
<?php
$s = ' +This is a string[...]';
$s = ltrim($s, '+ ');
print $s;
Outputs:
This is a string[...]
ltrim in the above example removes all spaces and plus signs from the left-hand side of the original string.

PHP Regex: Remove words less than 3 characters

I'm trying to remove all words of less than 3 characters from a string, specifically with RegEx.
The following doesn't work because it is looking for double spaces. I suppose I could convert all spaces to double spaces beforehand and then convert them back after, but that doesn't seem very efficient. Any ideas?
$text='an of and then some an ee halved or or whenever';
$text=preg_replace('# [a-z]{1,2} #',' ',' '.$text.' ');
echo trim($text);

Removing the Short Words
You can use this:
$replaced = preg_replace('~\b[a-z]{1,2}\b~', '', $yourstring);
In the demo, see the substitutions at the bottom.
Explanation
\b is a word boundary that matches a position where one side is a word character (a letter, digit or underscore) and the other side is not (for instance a space character, or the beginning of the string)
[a-z]{1,2} matches one or two letters
\b another word boundary
Replace with the empty string.
Option 2: Also Remove Trailing Spaces
If you also want to remove the spaces after the words, we can add \s* at the end of the regex:
$replaced = preg_replace('~\b[a-z]{1,2}\b\s*~', '', $yourstring);
Reference
Word Boundaries
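Here is a short sketch of Option 2 applied to the question's sample text. The second call is an assumption beyond the answer: it swaps [a-z] for \p{L} and adds the u modifier so that non-ASCII letters count as letters on UTF-8 input. The French sample sentence is made up for illustration.

```php
<?php
$text = 'an of and then some an ee halved or or whenever';

// Remove words of one or two lowercase letters plus the spaces after them.
$replaced = preg_replace('~\b[a-z]{1,2}\b\s*~', '', $text);
echo $replaced, "\n"; // and then some halved whenever

// Unicode-aware variant: \p{L} matches any letter, u enables UTF-8 mode.
$unicode = preg_replace('~\b\p{L}{1,2}\b\s*~u', '', 'je suis à la plage');
echo $unicode, "\n"; // suis plage
```

The [a-z] class is case-sensitive, so a capitalised word such as "An" is kept unless the i modifier is added.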

You can use the word boundary tag: \b:
Replace: \b[a-z]{1,2}\b with ''

Use this
preg_replace('/\b\w{1,2}\b\s?/', '', $your_string);

Although some of the solutions here worked, they had a problem with my language's "multichar characters", such as "ch". A simple explode and implode worked for me.
$maxWordLength = 3;
$string = "my super string";
$exploded = explode(" ", $string);
foreach($exploded as $key => $word) {
if(mb_strlen($word) < $maxWordLength) unset($exploded[$key]);
}
$string = implode(" ", $exploded);
echo $string;
// outputs "super string"

To me, it seems that this works fine with most PHP versions:
$string2 = preg_replace("/\b[a-zA-Z0-9]{1,2}\b/", "", trim($string1));
Where [a-zA-Z0-9] is the accepted range of letters and digits.

preg_replace vs trim PHP

I am working with a slug function and I don't fully understand some of it, so I'm looking for help with an explanation.
My first question is about this line in my slug function: $string = preg_replace('# +#', '-', $string); I understand that this replaces all spaces with a '-'. What I don't understand is what the + sign is there for; it comes after the white space between the # signs.
That leads to my next problem. I want a trim function that gets rid of spaces, but only the spaces after the value that was entered. For example, someone accidentally entered "Arizona " with two spaces after the final a, and it broke the pages linked to Arizona.
So, after all my rambling, I basically want to figure out how I can use trim to get rid of accidental spaces but still have preg_replace insert '-' between words.
e.g. "Sun City West " = "sun-city-west"
This is my full slug function:
function getSlug($string){
    if(isset($string) && $string <> ""){
        $string = strtolower($string);
        //var_dump($string); echo "<br>";
        $string = preg_replace('#[^\w ]+#', '', $string);
        //var_dump($string); echo "<br>";
        $string = preg_replace('# +#', '-', $string);
    }
    return $string;
}

You can try this:
function getSlug($string) {
return preg_replace('#\s+#', '-', trim($string));
}
It first trims extra spaces at the beginning and end of the string, and then replaces all the other with the - character.
Here your regex is:
#\s+#
which is:
# = regex delimiter
\s = any space character
+ = match the previous character or group one or more times
# = regex delimiter again
so the regex here means: "match any sequence of one or more whitespace characters"

The + means at least one of the preceding character, so it matches one or more spaces. The # signs are one of the ways of marking the start and end of a regular expression's pattern block.
For a trim function, PHP handily provides trim() which removes all leading and trailing whitespace.
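Putting both points together, here is a minimal sketch of a slug function that trims first and then keeps the question's two replacements. The name makeSlug and the type declarations are illustrative, not part of the original code.

```php
<?php
// Hypothetical variant of the question's getSlug() with trim() added.
function makeSlug(string $string): string
{
    // Drop accidental leading/trailing whitespace, then lowercase.
    $string = strtolower(trim($string));

    // Remove everything that is not a word character or a space.
    $string = preg_replace('#[^\w ]+#', '', $string);

    // Turn each run of one or more spaces into a single hyphen.
    return preg_replace('# +#', '-', $string);
}

echo makeSlug('Sun City West  '), "\n"; // sun-city-west
echo makeSlug('  Arizona  '), "\n";     // arizona
```

Because + makes the pattern match a whole run of spaces, two spaces between words still produce a single hyphen.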
