Remove excess whitespace from within a string - php

I receive a string from a database query, then I remove all HTML tags, carriage returns and newlines before I put it in a CSV file. Only thing is, I can't find a way to remove the excess white space from between the strings.
What would be the best way to remove the inner whitespace characters?

Not sure exactly what you want but here are two situations:
If you are just dealing with excess whitespace on the beginning or end of the string you can use trim(), ltrim() or rtrim() to remove it.
If you are dealing with extra spaces within a string consider a preg_replace of multiple whitespaces " "* with a single whitespace " ".
Example:
$foo = preg_replace('/\s+/', ' ', $foo);

$str = str_replace(' ','',$str);
Or, replace with underscore, & nbsp; etc etc.

none of other examples worked for me, so I've used this one:
trim(preg_replace('/[\t\n\r\s]+/', ' ', $text_to_clean_up))
this replaces all tabs, new lines, double spaces etc to simple 1 space.

$str = trim(preg_replace('/\s+/',' ', $str));
The above line of code will remove extra spaces, as well as leading and trailing spaces.

If you want to replace only multiple spaces in a string, for Example: "this string have lots of space . "
And you expect the answer to be
"this string have lots of space", you can use the following solution:
$strng = "this string have lots of space . ";
$strng = trim(preg_replace('/\s+/',' ', $strng));
echo $strng;

There are security flaws to using preg_replace(), if you get the payload from user input [or other untrusted sources]. PHP executes the regular expression with eval(). If the incoming string isn't properly sanitized, your application risks being subjected to code injection.
In my own application, instead of bothering sanitizing the input (and as I only deal with short strings), I instead made a slightly more processor intensive function, though which is secure, since it doesn't eval() anything.
function secureRip(string $str): string { /* Rips all whitespace securely. */
$arr = str_split($str, 1);
$retStr = '';
foreach ($arr as $char) {
$retStr .= trim($char);
}
return $retStr;
}

$str = preg_replace('/[\s]+/', ' ', $str);

You can use:
$str = trim(str_replace(" ", " ", $str));
This removes extra whitespaces from both sides of string and converts two spaces to one within the string. Note that this won't convert three or more spaces in a row to one!
Another way I can suggest is using implode and explode that is safer but totally not optimum!
$str = implode(" ", array_filter(explode(" ", $str)));
My suggestion is using a native for loop or using regex to do this kind of job.

To expand on Sandip’s answer, I had a bunch of strings showing up in the logs that were mis-coded in bit.ly. They meant to code just the URL but put a twitter handle and some other stuff after a space. It looked like this
? productID =26%20via%20#LFS
Normally, that would‘t be a problem, but I’m getting a lot of SQL injection attempts, so I redirect anything that isn’t a valid ID to a 404. I used the preg_replace method to make the invalid productID string into a valid productID.
$productID=preg_replace('/[\s]+.*/','',$productID);
I look for a space in the URL and then remove everything after it.

I wrote recently a simple function which removes excess white space from string without regular expression implode(' ', array_filter(explode(' ', $str))).

Laravel 9.7 intruduced the new Str::squish() method to remove extraneous whitespaces including extraneous white space between words: https://laravel.com/docs/9.x/helpers#method-str-squish

$str = "I am a PHP Developer";
$str_length = strlen($str);
$str_arr = str_split($str);
for ($i = 0; $i < $str_length; $i++) {
if (isset($str_arr[$i + 1]) && $str_arr[$i] == ' ' && $str_arr[$i] == $str_arr[$i + 1]) {
unset($str_arr[$i]);
}
else {
continue;
}
}
echo implode("", $str_arr);

Related

Remove s or 's from all words in a string with PHP

I have a string in PHP
$string = "Dogs are Jonny's favorite pet";
I want to use regex or some method to remove s or 's from the end of all words in the string.
The desired output would be:
$revisedString = "Dog are Jonny favorite pet";
Here is my current approach:
<?php
$string = "Dogs are Jonny's favorite pet";
$stringWords = explode(" ", $string);
$counter = 0;
foreach($stringWords as $string) {
if(substr($string, -1) == s){
$stringWords[$counter] = trim($string, "s");
}
if(strpos($string, "'s") !== false){
$stringWords[$counter] = trim($string, "'s");
}
$counter = $counter + 1;
}
print_r($stringWords);
$newString = "";
foreach($stringWords as $string){
$newString = $newString . $string . " ";
}
echo $newString;
}
?>
How would this be achieved with REGEX?
For general use, you must leverage much more sophisticated technique than an English-ignorant regex pattern. There may be fringe cases where the following pattern fails by removing an s that it shouldn't. It could be a name, an acronym, or something else.
As an unreliable solution, you can optionally match an apostrophe then match a literal s if it is not immediately preceded by another s. Adding a word boundary (\b) on the end improves the accuracy that you are matching the end of words.
Code: (Demo)
$string = "The bass can access the river's delta from the ocean. The fishermen, assassins, and their friends are happy on the banks";
var_export(preg_replace("~'?(?<!s)s\b~", '', $string));
Output:
'The bass can access the river delta from the ocean. The fishermen, assassin, and their friend are happy on the bank'
PHP Live Regex always helped me a lot in such moments. Even already knowing how REGEX works, I still use it just to be sure some times.
To make use of REGEX in your case, you can use preg_replace().
<?php
// Your string.
$string = "Dogs are Jonny's favorite pet";
// The vertical bar means "or" and the backslash
// before the apostrophe is needed so you don't end
// your pattern string since we're using single quotes
// to delimit it. "\s" means a single space.
$regex_pattern = '/\'s\s|s\s|s$/';
// Fill the preg_replace() with the pattern, the replacement
// (a single space in this case), your string, -1 (so preg_replace()
// will replace all the matches) and a variable of your desire
// to be the "counter" (preg_replace() will automatically
// fill it).
$newString = preg_replace($regex_pattern, ' ', $string, -1, $counter);
// Use the rtrim() to remove spaces at the right of the sentence.
$newString = rtrim($newString, " ");
echo "New string: " . $newString . ". ";
echo "Replacements: " . $counter . ".";
?>
In this case, the function will identify any "'s" or "s" with spaces (\s) after them and then replace them with a single space.
The preg_replace() will also count all the replacements and register them automatically on $counter or any variable you place there instead.
Edit:
Phil's comment is right and indeed my previous REGEX would lose a "s" at the end of the string. Adding "|s$" will solve it. Again, "|" means "or" and the "$" means that the "s" must be at the end of the string.
In attention to mickmackusa's comment, my solution is meant only to remove "s" characters at the end of words inside the string as this was Sparky Johnson' request here. Removing plurals would require a complex code since not only we need to remove "s" characters from plural only words but also change verbs and other stuff.

Remove empty space and plus sign from the beginning of a string

I have a string that begins with an empty space and a + sign :
$s = ' +This is a string[...]';
I can't figure out how to remove the first + sign using PHP. I've tried ltrim, preg_replace with several patterns and with trying to escape the + sign, I've also tried substr and str_replace. None of them is removing the plus sign at the beginning of the string. Either it doesn't replace it or it remplace/remove the totality of the string. Any help will be highly appreciated!
Edit : After further investigation, it seems that it's not really a plus sign, it looks 100% like a + sign but I think it's not. Any ideas for how to decode/convert it?
Edit 2 : There's one white space before the + sign. I'm using get_the_excerpt Wordpress function to get the string.
Edit 3 : After successfully removing the empty space and the + with substr($s, 2);, Here's what I get now :
$s == '#43;This is a string[...]'
Wiki : I had to remove 6 characters, I've tried substr($s, 6); and it's working well now. Thanks for your help guys.
ltrim has second parameter
$s = ltrim($s,'+');
edit:
if it is not working it means that there is sth else at the beginning of that string, eg. white spaces. You can check it by using var_dump($s); which shows you exactly what you have there.
You can use explode like this:
$result = explode('+', $s)[0];
What this function actually does is, it removes the delimeter you specify as a first argument and breaks the string into smaller strings whenever that delimeter is found and places those strings in an array.
It's mostly used with multiple ocurrences of a certain delimeter but it will work in your case too.
For example:
$string = "This,is,a,string";
$results = explode(',', $string);
var_dump($results); //prints ['This', 'is', 'a', 'string' ]
So in your case since the plus sign appears ony once the result is in the zero index of the returned array (that contains only one element, your string obviously)
Here's a couple of different ways I can think of
str_replace
$string = str_replace('+', '', $string);
preg_replace
$string = preg_replace('/^\+/', '', $string);
ltrim
$string = ltrim($string, '+');
substr
$string = substr($string, 1);
try this
<?php
$s = '+This is a string';
echo ltrim($s,'+');
?>
You can use ltrim() or substr().
For example :
$output = ltrim($string, '+');
or you can use
$output = substr($string, 1);
You can remove multiple characters with trim. Perhaps you were not re-assigning the outcome of your trim function.
<?php
$s = ' +This is a string[...]';
$s = ltrim($s, '+ ');
print $s;
Outputs:
This is a string[...]
ltrim in the above example removes all spaces and addition characters from the left hand side of the original string.

Get rid of multiple white spaces in php or mysql

I have a form which takes user inputs; Recently, I have come across many user inputs with multiple white spaces.
Eg.
"My tests are working fine!"
Is there any way I can get rid of these white spaces at PHP level or MySQL level?
Clearly trim doesn't work here.
I was thinking of using Recursive function but not sure if there's an easy and fast way of doing this.
my code so far is as below:
function noWhiteSpaces($string) {
if (!empty($string)) {
$string = trim($string);
$new_str = str_replace(' ', ' ', $string);
} else {
return false;
}
return $new_str;
}
echo noWhiteSpaces("My tests are working fine here !");
If the input is actual whitespaces and you want to replace them with a single space, you could use a regular expression.
$stripped = preg_replace('/\s+/', ' ', $input);
\s means 'whitespace' character. + means one or more. So, this replaces every instance of one or more whitespace characters' in $input with a single space. See the preg_replace() docs, and a nice tutorial on regexes.
If you're not looking to replace real whitespace but rather stuff like , you could do the same, but not with \s. Use this instead:
$stripped = preg_replace('/( )+/', ' ', $input);
Note how the brackets enclose .

An explode() function that ignores characters inside quotes?

Does somebody know a quick and easy explode() like function that can ignore splitter characters that are enclosed in a pair of arbitrary characters (e.g. quotes)?
Example:
my_explode(
"/",
"This is/a string/that should be/exploded.//But 'not/here',/and 'not/here'"
);
should result in an array with the following members:
This is
a string
that should be
exploded.
But 'not/here',
and 'not/here'
the fact that the characters are wrapped in single quotes would spare them from being splitters.
Bonus points for a solution that can deal with two wrapper characters
(not/here)
A native PHP solution would be preferred, but I don't think such a thing exists!
str_getcsv($str, '/')
There's a recipe for <5.3 on the linked page.
This is near-impossible with preg_split, because you can't tell from the middle of the string whether you're between quotes or not. However, preg_match_all can do the job.
Simple solution for a single type of quote:
function quoted_explode($subject, $delimiter = ',', $quote = '\'') {
$regex = "(?:[^$delimiter$quote]|[$quote][^$quote]*[$quote])+";
preg_match_all('/'.str_replace('/', '\\/', $regex).'/', $subject, $matches);
return $matches[0];
}
That function will have all kinds of problems if you pass it certain special characters (\^-], according to http://www.regular-expressions.info/reference.html), so you'll need to escape those. Here's a general solution that escapes special regex characters and can track multiple kinds of quotes separately:
function regex_escape($subject) {
return str_replace(array('\\', '^', '-', ']'), array('\\\\', '\\^', '\\-', '\\]'), $subject);
}
function quoted_explode($subject, $delimiters = ',', $quotes = '\'') {
$clauses[] = '[^'.regex_escape($delimiters.$quotes).']';
foreach(str_split($quotes) as $quote) {
$quote = regex_escape($quote);
$clauses[] = "[$quote][^$quote]*[$quote]";
}
$regex = '(?:'.implode('|', $clauses).')+';
preg_match_all('/'.str_replace('/', '\\/', $regex).'/', $subject, $matches);
return $matches[0];
}
(Note that I keep all of the variables between square brackets to minimize what needs escaping - outside of square brackets, there are about twice as many special characters.)
If you wanted to use ] as a quote, then you probably wanted to use [ as the corresponding quote, but I'll leave adding that functionality as an exercise for the reader. :)
Something very near with preg_split : http://fr2.php.net/manual/en/function.preg-split.php#92632
It handles multiple wrapper characters AND multiple delimiter characters.

Remove newline character from a string using PHP regex

How can I remove a new line character from a string using PHP?
$string = str_replace(PHP_EOL, '', $string);
or
$string = str_replace(array("\n","\r"), '', $string);
$string = str_replace("\n", "", $string);
$string = str_replace("\r", "", $string);
To remove several new lines it's recommended to use a regular expression:
$my_string = trim(preg_replace('/\s\s+/', ' ', $my_string));
Better to use,
$string = str_replace(array("\n","\r\n","\r"), '', $string).
Because some line breaks remains as it is from textarea input.
Something a bit more functional (easy to use anywhere):
function strip_carriage_returns($string)
{
return str_replace(array("\n\r", "\n", "\r"), '', $string);
}
stripcslashes should suffice (removes \r\n etc.)
$str = stripcslashes($str);
Returns a string with backslashes stripped off. Recognizes C-like \n,
\r ..., octal and hexadecimal representation.
Try this out. It's working for me.
First remove n from the string (use double slash before n).
Then remove r from string like n
Code:
$string = str_replace("\\n", $string);
$string = str_replace("\\r", $string);
Let's see a performance test!
Things have changed since I last answered this question, so here's a little test I created. I compared the four most promising methods, preg_replace vs. strtr vs. str_replace, and strtr goes twice because it has a single character and an array-to-array mode.
You can run the test here:
        https://deneskellner.com/stackoverflow-examples/1991198/
Results
251.84 ticks using preg_replace("/[\r\n]+/"," ",$text);
81.04 ticks using strtr($text,["\r"=>"","\n"=>""]);
11.65 ticks using str_replace($text,["\r","\n"],["",""])
4.65 ticks using strtr($text,"\r\n"," ")
(Note that it's a realtime test and server loads may change, so you'll probably get different figures.)
The preg_replace solution is noticeably slower, but that's okay. They do a different job and PHP has no prepared regex, so it's parsing the expression every single time. It's simply not fair to expect them to win.
On the other hand, in line 2-3, str_replace and strtr are doing almost the same job and they perform quite differently. They deal with arrays, and they do exactly what we told them - remove the newlines, replacing them with nothing.
The last one is a dirty trick: it replaces characters with characters, that is, newlines with spaces. It's even faster, and it makes sense because when you get rid of line breaks, you probably don't want to concatenate the word at the end of one line with the first word of the next. So it's not exactly what the OP described, but it's clearly the fastest. With long strings and many replacements, the difference will grow because character substitutions are linear by nature.
Verdict: str_replace wins in general
And if you can afford to have spaces instead of [\r\n], use strtr with characters. It works twice as fast in the average case and probably a lot faster when there are many short lines.
Use:
function removeP($text) {
$key = 0;
$newText = "";
while ($key < strlen($text)) {
if(ord($text[$key]) == 9 or
ord($text[$key]) == 10) {
//$newText .= '<br>'; // Uncomment this if you want <br> to replace that spacial characters;
}
else {
$newText .= $text[$key];
}
// echo $k . "'" . $t[$k] . "'=" . ord($t[$k]) . "<br>";
$key++;
}
return $newText;
}
$myvar = removeP("your string");
Note: Here I am not using PHP regex, but still you can remove the newline character.
This will remove all newline characters which are not removed from by preg_replace, str_replace or trim functions

Categories