Removing various sorts of whitespace in PHP - php

I am looking to remove multiple line breaks using regular expression. Say I have this text:
"On the Insert tab\n \n\nthe galleries include \n\n items that are designed"
then I want to replace it with
"On the Insert tab\nthe galleries include\nitems that are designed"
So my requirement is:
It will replace every run of multiple newlines with one newline
It will replace every run of multiple spaces with one space
Spaces will be trimmed as well
I have searched a lot but couldn't find a solution; the closest I got was this one: Removing redundant line breaks with regular expressions.

Use this :
echo trim(preg_replace('#(\s)+#',"$1",$string));

$text = str_replace("\r\n", "\n", $text); // converts Windows line endings to Unix ones
while (strpos($text, "\n\n") !== false) // strict comparison, so a match at position 0 still counts
{
    $text = str_replace("\n\n", "\n", $text);
}
That will sort out newline characters.

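$text = trim($text);
// Collapse every whitespace run that contains a line break into a single newline.
$text = preg_replace('/\s*(?:\r\n|\r|\n)\s*/', "\n", $text);
// Collapse the remaining runs of spaces and tabs into a single space.
$text = preg_replace('/[ \t]+/', ' ', $text);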
Thanks to Removing redundant line breaks with regular expressions
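The three requirements from the question can be sketched as one small function. This is only an illustration: the name normalize_whitespace is made up for this example, and it assumes that any whitespace run containing a line break should collapse to exactly one "\n".

```php
<?php
// Hypothetical helper (the name is an assumption, not from the answers above).
function normalize_whitespace(string $text): string
{
    // A whitespace run that contains at least one line break becomes one "\n".
    $text = preg_replace('/\s*\R\s*/', "\n", $text);
    // Any remaining run of horizontal whitespace becomes one space.
    $text = preg_replace('/\h+/', ' ', $text);
    // Drop leading and trailing whitespace.
    return trim($text);
}

$input = "On the Insert tab\n \n\nthe galleries include \n\n items that are designed";
$output = normalize_whitespace($input);
echo $output, "\n";
```

Doing the line-break pass first matters: a plain \s+ replacement run beforehand would turn every newline into a space.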

Related

PHP Remove extra spaces between break lines

I have a string in PHP. I'm able to remove multiple consecutive line breaks and multiple spaces, but I'm still not able to remove multiple line breaks when there is a space between them.
For example:
Text \r\n \r\n extra text
I would like to clean this text as:
Text \r\nextra text
Could also be too:
Text \r\n \r\nextra text
There doesn't need to be an extra space after the line break.
What I have right now is:
function clearHTML($text){
$text = strip_tags($text);
$text = str_replace("&nbsp;", " ", $text);
$text = preg_replace("/[[:blank:]]+/"," ",$text);
$text = preg_replace("/([\r\n]{4,}|[\n]{2,}|[\r]{2,})/", "\r\n", $text);
$text = trim($text);
return $text;
}
Any suggestions?
To remove extra whitespace between lines, you can use
preg_replace('~\h*(\R)\s*~', '$1', $text)
The regex matches:
\h* - 0 or more horizontal whitespaces
(\R) - Group 1: any line ending sequence (the replacement is $1, just this group value)
\s* - zero or more whitespaces
The whitespace shrinking part can be merged into a single preg_replace call with the (?:&nbsp;|\h)+ regex, which matches one or more occurrences of an &nbsp; string or a horizontal whitespace.
NOTE: If you have Unicode texts, you will need the u flag.
The whole cleaning function can look like
function clearHTML($text){
$text = strip_tags($text);
$text = preg_replace("~(?:&nbsp;|\h)+~u", " ", $text);
$text = preg_replace('~\h*(\R)\s*~u', '$1', $text);
return trim($text);
}
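As a quick check of the line-break pattern on its own, here is a minimal sketch that applies it to the sample string from the question. The variable names are arbitrary.

```php
<?php
// Sample input from the question: two CRLF breaks with stray spaces around them.
$dirty = "Text \r\n \r\n extra text";

// \h* eats the spaces before the first break, (\R) captures that break,
// and \s* eats every following space and line break.
$clean = preg_replace('~\h*(\R)\s*~u', '$1', $dirty);

var_dump($clean === "Text\r\nextra text"); // bool(true)
```

Because only the first line ending of each run is captured, the original style of line break (here \r\n) is preserved in the result.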

Php display as a html text the new line \n

I'm using echo to display a result, but the result contains the control characters \n, \t and \r.
I want to know whether the result contains \n, \t or \r, and how many of each. I need to know so I can replace them with an HTML tag like <p> or <div>.
The result is coming from another website.
In pattern CreditTransaction/CustomerData:
Email does not contain any text
In pattern RecurUpdate/CustomerData:
Email does not contain any text
In pattern AccountInfo:
I want like this.
In pattern CreditTransaction/CustomerData:
\n
\n
\n
\n\tEmail does not contain any text
\n
In pattern RecurUpdate/CustomerData:
\n
\n
\n
\n\tEmail does not contain any text
\n\tIn pattern AccountInfo:
Your question is quite unclear but I'll do my best to provide an answer.
If you want to make \n, \r, and \t visible in the output you could just manually escape them:
str_replace("\n", '\n', str_replace("\r", '\r', str_replace("\t", '\t', $string)));
Or, if you want to escape all control characters (addslashes only escapes quotes, backslashes and NUL bytes, so use addcslashes with a character range instead):
addcslashes($string, "\0..\37");
To count how many times a specific character/substring occurs:
substr_count($string, $character_or_substring);
To check if the string contains a specific character/substring:
if (substr_count($string, $character_or_substring) > 0) {
// your code
}
Or:
if (strpos($string, $character_or_substring) !== false) { // notice the !==
// your code
}
As mentioned earlier by someone else in a comment, if you want to convert the newlines to br tags:
nl2br($string);
If you want tabs to show up as indentation you could replace all tabs with &nbsp;:
str_replace("\t", '&nbsp;', $string);
Use double quotes to find newline and tab characters.
$s = "In pattern CreditTransaction/CustomerData:
Email does not contain any text
In pattern RecurUpdate/CustomerData: ";
echo str_replace("\t", "*", $s); // Replace all tabs with '*'
echo str_replace("\n", "*", $s); // Replace all newlines with '*'
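Putting the counting and the escaping together, a minimal sketch might look like this. The sample string and variable names are made up for the illustration; addcslashes and substr_count are standard PHP functions.

```php
<?php
// Made-up sample containing one tab, two line feeds and one carriage return.
$s = "A:\n\tB\r\n";

// Count each control character separately.
$counts = [
    'LF'  => substr_count($s, "\n"),
    'TAB' => substr_count($s, "\t"),
    'CR'  => substr_count($s, "\r"),
];

// Turn the three control characters into their visible C-style escapes.
$visible = addcslashes($s, "\n\r\t");

print_r($counts);
echo $visible, PHP_EOL;
```

When the result is echoed into an HTML page, it is usually worth passing it through htmlspecialchars first, since the text comes from another website.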

PHP - Remove excess Whitespace but not new lines

I was looking for a way to remove excess whitespace from within a string (that is, if two or more spaces are next to each other, leave only one and remove the others). I found Remove excess whitespace from within a string and wanted to use this solution:
$foo = preg_replace( '/\s+/', ' ', $foo );
but this removes newlines as well, while I want to keep them.
Is there any way to keep newlines while removing excess whitespace?
http://www.php.net/manual/en/regexp.reference.escape.php
defines \h
any horizontal whitespace character (since PHP 5.2.4)
so probably you are looking for
$foo = preg_replace( '/\h+/', ' ', $foo );
example: http://ideone.com/NcOiKW
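If some of your characters are converted to � after preg_replace (for example the Cyrillic capital letter Р, whose UTF-8 encoding contains the byte 0xA0, which \h treats as a non-breaking space in byte mode), add the u flag so the pattern works on UTF-8 characters instead of single bytes:
$value = preg_replace('/\h+/u', ' ', $value);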
if you want to remove excess of only-spaces (not tabs, new-lines, etc) you could use HEX code to be more specific:
$text = preg_replace('/\x20+/', ' ', $text);
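A small sketch of the difference between the two patterns, using a made-up sample string: \s+ flattens the line breaks, while \h+ leaves them alone.

```php
<?php
// Made-up sample: mixed spaces and a tab, then a blank line, then indentation.
$foo = "a  \t b\n\n  c";

// \s+ also matches "\n", so the line structure is lost.
$flattened = preg_replace('/\s+/', ' ', $foo);

// \h+ matches only horizontal whitespace, so both newlines survive.
$kept = preg_replace('/\h+/u', ' ', $foo);

var_dump($flattened === "a b c");    // bool(true)
var_dump($kept === "a b\n\n c");     // bool(true)
```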

regex: change html before saving in database

Before saving into the database I need to
delete all tags
replace every run of more than one whitespace character with a single one
replace every run of more than one newline with a single one
For that I do the following:
$content = preg_replace('/<[^>]+>/', "", $content);
$content = preg_replace('/\n/', "NewLine", $content); // so the newlines are not lost when squeezing whitespace below
$content = preg_replace('/(\&nbsp\;){1,}/', " ", $content);
$content = preg_replace('/[\s]{2,}/', " ", $content);
and finally I must delete the repeated "NewLine" words.
After the first two steps I get text in this format:
NewLineWordOfText
NewLine
NewLine
NewLine NewLine WordOfText "WordOfText WordOfText" WordOfText NewLine"WordOfText
...
How do I delete more than one newline from such content?
Thanks
First of all, while HTML is not regular and thus it is a bad idea to use regular expressions to parse it, PHP has a function that will remove tags for you: strip_tags
To squeeze spaces while preserving newlines:
$content = preg_replace('/[^\n\S]{2,}/', " ", $content);
$content = preg_replace('/\n{2,}/', "\n", $content);
The first line will squeeze all whitespace other than \n ([^\n\S] means all characters that aren't \n and not a non-whitespace character) into one space. The second will squeeze multiple newlines into a single newline.
Why don't you use nl2br(), then preg_replace every <br /><br /> with a single <br />, and finally turn the remaining <br /> tags back into \n?
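Combining strip_tags with the two squeezing patterns from the answer above gives a pipeline along these lines. This is a sketch only: the function name clean_for_storage and the sample HTML are assumptions made for the example.

```php
<?php
// Hypothetical helper (the name is an assumption); it chains the steps discussed above.
function clean_for_storage(string $html): string
{
    $text = strip_tags($html);                           // 1. drop all tags
    $text = str_replace('&nbsp;', ' ', $text);           // 2. treat &nbsp; as a plain space
    $text = preg_replace('/[^\n\S]{2,}/', ' ', $text);   // 3. squeeze whitespace except "\n"
    $text = preg_replace('/\n{2,}/', "\n", $text);       // 4. squeeze repeated newlines
    return trim($text);
}

$html = "<p>Hello&nbsp;&nbsp; world</p>\n\n\n<p>Second   line</p>";
$stored = clean_for_storage($html);
echo $stored, "\n";
```

This avoids the "NewLine" placeholder entirely, because the whitespace pattern never touches \n in the first place.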

Remove newline character from a string using PHP regex

How can I remove a new line character from a string using PHP?
$string = str_replace(PHP_EOL, '', $string);
or
$string = str_replace(array("\n","\r"), '', $string);
$string = str_replace("\n", "", $string);
$string = str_replace("\r", "", $string);
To remove several new lines it's recommended to use a regular expression:
$my_string = trim(preg_replace('/\s\s+/', ' ', $my_string));
Better to use,
$string = str_replace(array("\n","\r\n","\r"), '', $string);
because some line breaks from textarea input otherwise remain as they are.
Something a bit more functional (easy to use anywhere):
function strip_carriage_returns($string)
{
return str_replace(array("\n\r", "\n", "\r"), '', $string);
}
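stripcslashes may help if the string contains escaped sequences (a literal backslash followed by n or r) rather than real line breaks; it converts those sequences back into the actual characters instead of removing them: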
$str = stripcslashes($str);
Returns a string with backslashes stripped off. Recognizes C-like \n,
\r ..., octal and hexadecimal representation.
Try this out. It's working for me.
First remove the literal \n sequence from the string (use a double backslash before n).
Then remove the literal \r sequence the same way.
Code:
$string = str_replace("\\n", '', $string);
$string = str_replace("\\r", '', $string);
Let's see a performance test!
Things have changed since I last answered this question, so here's a little test I created. I compared the four most promising methods, preg_replace vs. strtr vs. str_replace, and strtr goes twice because it has a single character and an array-to-array mode.
You can run the test here:
        https://deneskellner.com/stackoverflow-examples/1991198/
Results
251.84 ticks using preg_replace("/[\r\n]+/"," ",$text);
81.04 ticks using strtr($text,["\r"=>"","\n"=>""]);
11.65 ticks using str_replace(["\r","\n"],["",""],$text)
4.65 ticks using strtr($text,"\r\n","  ") (two spaces, because strtr ignores any from-characters that have no counterpart in the to-string)
(Note that it's a realtime test and server loads may change, so you'll probably get different figures.)
The preg_replace solution is noticeably slower, but that's okay. It does a different job, and a regex engine carries more overhead than plain string replacement even though PHP caches compiled patterns. It's simply not fair to expect it to win.
On the other hand, in line 2-3, str_replace and strtr are doing almost the same job and they perform quite differently. They deal with arrays, and they do exactly what we told them - remove the newlines, replacing them with nothing.
The last one is a dirty trick: it replaces characters with characters, that is, newlines with spaces. It's even faster, and it makes sense because when you get rid of line breaks, you probably don't want to concatenate the word at the end of one line with the first word of the next. So it's not exactly what the OP described, but it's clearly the fastest. With long strings and many replacements, the difference will grow because character substitutions are linear by nature.
Verdict: str_replace wins in general
And if you can afford to have spaces instead of [\r\n], use strtr with characters. It works twice as fast in the average case and probably a lot faster when there are many short lines.
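To reproduce the comparison locally, a rough timing harness could look like the sketch below. It is not the original test: the sample text, the iteration count and the variable names are assumptions, and it needs PHP 7.4 or later for arrow functions (hrtime itself is available from PHP 7.3).

```php
<?php
// Made-up sample text with both CRLF and LF line endings.
$text = str_repeat("line one\r\nline two\n", 1000);

// Three ways of removing line breaks; all should produce the same string.
$candidates = [
    'preg_replace'  => fn(string $t): string => preg_replace('/[\r\n]+/', '', $t),
    'strtr (array)' => fn(string $t): string => strtr($t, ["\r" => '', "\n" => '']),
    'str_replace'   => fn(string $t): string => str_replace(["\r", "\n"], '', $t),
];

$results = [];
foreach ($candidates as $name => $fn) {
    $start = hrtime(true);                      // monotonic clock, in nanoseconds
    for ($i = 0; $i < 100; $i++) {
        $out = $fn($text);
    }
    $elapsedMs = (hrtime(true) - $start) / 1e6; // nanoseconds to milliseconds
    $results[$name] = $out;
    printf("%-14s %8.2f ms\n", $name, $elapsedMs);
}
```

The absolute numbers depend on the machine and PHP version, so only the relative ordering is worth reading from the output.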
Use:
function removeP($text) {
    $key = 0;
    $newText = "";
    while ($key < strlen($text)) {
        if (ord($text[$key]) == 9 or
            ord($text[$key]) == 10) {
            //$newText .= '<br>'; // Uncomment this if you want <br> to replace those special characters
        }
        else {
            $newText .= $text[$key];
        }
        // echo $key . "'" . $text[$key] . "'=" . ord($text[$key]) . "<br>";
        $key++;
    }
    return $newText;
}
$myvar = removeP("your string");
Note: this does not use a PHP regex, but it still removes the newline characters.
It drops every tab (ASCII 9) and line feed (ASCII 10), including ones left behind by preg_replace, str_replace or trim. Carriage returns (ASCII 13) are kept unless you add them to the condition.
