Problem Replacing Literal String \r\n With Line Break in PHP - php

I have a text file that has the literal string \r\n in it. I want to replace this with an actual line break (\n).
I know that the regex /\\r\\n/ should match it (I have tested it in Reggy), but I cannot get it to work in PHP.
I have tried the following variations:
preg_replace("/\\\\r\\\\n/", "\n", $line);
preg_replace("/\\\\[r]\\\\[n]/", "\n", $line);
preg_replace("/[\\\\][r][\\\\][n]/", "\n", $line);
preg_replace("/[\\\\]r[\\\\]n/", "\n", $line);
If I just try to replace the backslash, it works properly. As soon as I add an r, it finds no matches.
The file I am reading is encoded as UTF-16.
Edit:
I have also already tried using str_replace().
I now believe that the problem here is the character encoding of the file. I tried the following, and it did work:
$testString = "\\r\\n";
echo preg_replace("/\\\\r\\\\n/", "\n", $testString);
but it does not work on lines I am reading in from my file.

Save yourself the effort of figuring out the regex and try str_replace() instead:
str_replace('\r\n', "\n", $string);

Save yourself the effort of figuring out the regex and the escaping within double quotes:
$fixed = str_replace('\r\n', "\n", $line);
For what it is worth, preg_replace("/\\\\r\\\\n/", "\n", $line); should be fine. As a demonstration:
var_dump(preg_replace("/\\\\r\\\\n/", "NL", 'Cake is yummy\r\n\r\n'));
Gives: string(17) "Cake is yummyNLNL"
Also fine is: '/\\\r\\\n/' and '/\\\\r\\\\n/'
Important - if the above doesn't work, are you even sure literal \r\n is what you're trying to match?..
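To see why four backslashes are needed in a double-quoted pattern, it can help to print the string PHP actually hands to the regex engine. The following is only a minimal sketch; the variable names are illustrative and not from the original post.

```php
<?php
// PHP string escaping runs first: each "\\\\" becomes two backslashes.
$pattern = "/\\\\r\\\\n/";
echo $pattern, PHP_EOL;   // shows the pattern as the regex engine receives it

// The regex engine then reads \\ as one literal backslash,
// so the pattern matches the four characters: backslash, r, backslash, n.
$subject = 'Cake is yummy\r\n';   // single quotes: \r\n stays literal text
$result  = preg_replace($pattern, "\n", $subject);

var_dump($result === "Cake is yummy\n");
```

If this works on a test string but not on lines read from the file, the pattern is probably not the problem and the file contents are worth inspecting.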

UTF-16 is the problem. If you're just working with the raw bytes, you can use the full byte sequences for replacing:
$out = str_replace("\x00\x5c\x00\x72\x00\x5c\x00\x6e", "\x00\x0a", $in);
This assumes big-endian UTF-16; for little-endian, swap the zero bytes so they come after the non-zero bytes:
$out = str_replace("\x5c\x00\x72\x00\x5c\x00\x6e\x00", "\x0a\x00", $in);
If that doesn't work, please post a byte-dump of your input file so we can see what it actually contains.
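An alternative to byte-level replacement is to decode the text, do the replacement on the decoded string, and encode it again. The sketch below assumes the mbstring extension is available and that the file is little-endian UTF-16; the helper name replaceLiteralCrlfUtf16 is hypothetical.

```php
<?php
// Hypothetical helper: decode UTF-16, replace the literal text \r\n, re-encode.
// Assumes ext-mbstring is loaded; pass 'UTF-16BE' for big-endian input.
function replaceLiteralCrlfUtf16(string $bytes, string $encoding = 'UTF-16LE'): string
{
    $utf8 = mb_convert_encoding($bytes, 'UTF-8', $encoding);
    // Single quotes keep \r\n as four literal characters.
    $utf8 = str_replace('\r\n', "\n", $utf8);
    return mb_convert_encoding($utf8, $encoding, 'UTF-8');
}

// Build a UTF-16LE sample that contains the literal text \r\n.
$in  = mb_convert_encoding('Cake\r\nis yummy', 'UTF-16LE', 'UTF-8');
$out = replaceLiteralCrlfUtf16($in);

// Hex dump of the result, to check the bytes by eye.
echo bin2hex($out), PHP_EOL;
```

Working on decoded text avoids having to spell out byte sequences by hand, at the cost of two conversions per string.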

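$result = preg_replace('/\\\\r\\\\n/', "\n", $subject);
The regex above matches the literal four-character text \r\n (backslash, r, backslash, n) and replaces it with a real line feed. The replacement must be in double quotes; in single quotes, '\n' would insert a backslash followed by n rather than a line break.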

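I keep searching for this topic, and I always come back to a line I wrote myself.
It looks neat and it's based on regex:
"/[\n\r]/"
PHP
preg_replace("/[\n\r]/", "\n", $string)
or
preg_replace("/[\n\r]/", $replaceStr, $string)
Note that this matches real CR and LF characters, not the literal text \r\n from the question, and that the replacement needs double quotes to produce an actual line break.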

Related

Change string read from file into special chars (e.g. "\n")

My script reads a file containing a replacement string and then makes a preg_replace of spaces in some text with the replacement. The idea is that the replacement file should contain any valid regex replacement.
When the replacement file contains a simple string like e.g. "xyz", it works fine. But when it contains "\n", I would like to treat it as a new line, but it doesn't work. The spaces in text are replaced literally by "\n". Here is the script:
$c = file_get_contents('replacements.txt');
$s = preg_replace('/ /', $c, 'some text');
file_put_contents('output.txt', $s);
The output.txt contains "some\ntext" when viewed in text editor.
So I added a simple if statement:
if ($c == '\n') {
$c = "\n";
}
And now it works. But is there a more general way to deal with this problem, i.e. get the replacement string from file interpreted as a real regex replacement? Because in the future it might be a more complicated replacement.
You may indeed have similar issues with other escape sequences, such as \t, \r, \x10, etc.
I would suggest this solution, which turns the string into a version that has these sequences interpreted.
$c = json_decode('"'.str_replace('"', '\"', $c).'"');
... then the replace will work as intended.
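Another option that may be worth trying is PHP's built-in stripcslashes(), which interprets C-style escapes such as \n, \t and \x10. This is only a sketch under two assumptions: the file may end with a real trailing line break that should be dropped, and backreferences in the replacement are written as $1 rather than \1, since \1 would be read as an octal escape.

```php
<?php
// Stands in for file_get_contents('replacements.txt'):
// a backslash followed by n, plus a trailing newline added by an editor.
$c = '\n' . "\n";

// Drop real trailing line breaks, then interpret C-style escapes.
$replacement = stripcslashes(rtrim($c, "\r\n"));

$s = preg_replace('/ /', $replacement, 'some text');
var_dump($s === "some\ntext");
```

The rtrim() step matters for the json_decode() approach as well, because a raw line break inside a JSON string is not valid JSON.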

How to remove every second occurrence within a string?

Basically, I have a string that I need to search through and remove every SECOND occurrence within it.
Here is what my string looks like ($s):
question1,answer1,answer2,answer3,answer4

question2,answer1,answer2,answer3,answer4

question3,answer1,answer2,answer3,answer4
Here is what my code currently looks like:
$toRemove = array("\n");
$finalString = str_replace($toRemove, "", $s);
As you can see, the lines in my string $s are separated by two \n characters. I would like to search through my string and replace only every SECOND \n so that my string ends up being:
question1,answer1,answer2,answer3,answer4
question2,answer1,answer2,answer3,answer4
question3,answer1,answer2,answer3,answer4
Is this possible? If so, how can I do it?
In your specific case, you may want to just replace two newlines with one newline:
$string = str_replace("\n\n", "\n", $string);
More complicated regex solutions could collapse any number of consecutive newlines:
preg_replace("/\n+/", "\n", "foo\n\nbar\n\n\n\n\nblee\nnope");
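Adam's answer is correct for UNIX-like systems, but on Windows you can have different line endings. My regex is a little bit rusty, but I think this should work for both UNIX and Windows. The replacement has to be in double quotes to produce a real line break.
$string = preg_replace('/[\n\r]{2}/', "\n", $string); // replace exactly 2 line-ending characters
$string = preg_replace('/[\n\r]+/', "\n", $string); // replace 1 or more line-ending characters
Note that [\n\r]{2} also matches a single Windows \r\n pair, so the first variant does not collapse a doubled Windows line break.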
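For input with mixed line endings, PCRE's \R escape treats \r\n as a single line break, which makes "two or more line breaks in a row" easier to express. A minimal sketch with made-up sample data:

```php
<?php
// Sample data: Windows and UNIX line endings mixed, with blank lines between rows.
$s = "question1,answer1\r\n\r\nquestion2,answer1\n\nquestion3,answer1";

// \R matches one line break of any style; {2,} requires at least two in a row.
$collapsed = preg_replace('/\R{2,}/', "\n", $s);

var_dump($collapsed === "question1,answer1\nquestion2,answer1\nquestion3,answer1");
```

A single line break is left untouched by this pattern, so rows that are already separated by one line break stay as they are.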

Trying to remove new lines and spaces using regex

I am attempting to remove some line breaks and spaces from a multiline string I have, such as the following:
Toronto (YTZ)
to
Montreal (YUL)
I tried doing:
$matched = preg_replace('/[\n]/', '', $string);
var_dump($matched);
but all it returns is:
Montreal (YUL)
I've tried all sorts of combinations of regular expressions, but it only ever seems to find what I specify, replace it, and display anything AFTER the matched expression.
I'm sure it's something simple, but I can't seem to figure it out.
Thanks in advance!
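PHP itself only turns \n into a newline character inside double quotes ("\n"). Inside single quotes the backslash sequence is passed unchanged to the regex engine, which interprets \n on its own, so '/[\n]/' and "/[\n]/" match the same thing.
Anyway, don't use a regular expression for that; use str_replace("\n", '', $string) instead. It's faster.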
As Kash already pointed out, the newline sequence can differ between operating systems.
That's what the PHP_EOL constant is for. It is defined according to the OS PHP is running on.
$string = str_replace(PHP_EOL, '', $string);
If the string could have been created on a different machine, it is better to replace "\r" and "\n" separately:
$string = str_replace(array("\r", "\n"), '', $string);
$str = preg_replace('/\n+(?=.)/', " ",
    preg_replace('/^\s*/m', "", $str));
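The ideas in the answers above can be combined into one call that replaces each line break, together with the whitespace around it, by a single space. This is a sketch only; the sample string assumes Windows line endings and indented lines, which may not match the original input.

```php
<?php
// Assumed input: CRLF line endings and leading spaces on the continuation lines.
$string = "Toronto (YTZ)\r\n    to\r\n    Montreal (YUL)\r\n";

// \R matches any style of line break; \s* also swallows the surrounding spaces.
$matched = preg_replace('/\s*\R\s*/', ' ', trim($string));

var_dump($matched === "Toronto (YTZ) to Montreal (YUL)");
```

If the input really does contain \r characters, removing only \n leaves them behind, and a terminal may then overwrite earlier text on the same line when the result is printed.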

explode error \r\n and \n in windows and linux server

I use the explode function to split a textarea's content into an array of lines. When I run this on my localhost (WampServer 2.1), it works perfectly with this code:
$arr=explode("\r\n",$getdata);
When I upload it to my Linux server, I have to change the code above every time to:
$arr=explode("\n",$getdata);
What would be a permanent solution? Which code will work on both servers?
Thank you
The constant PHP_EOL contains the platform-dependent linefeed, so you can try this:
$arr = explode(PHP_EOL, $getdata);
But it is even better to normalize the text, because you never know what OS your visitors use. This is one way to normalize to only \n as the linefeed (but also see Alex's answer, since his regex will handle all types of linefeeds):
$getdata = str_replace("\r\n", "\n", $getdata);
$arr = explode("\n", $getdata);
As far as I know the best way to split a string by newlines is preg_split and \R:
preg_split('~\R~', $str);
\R matches any Unicode Newline Sequence, i.e. not only LF, CR, CRLF, but also more exotic ones like VT, FF, NEL, LS and PS.
If that behavior isn't wanted (why?), you could specify the BSR_ANYCRLF option:
preg_split('~(*BSR_ANYCRLF)\R~', $str);
This will match the "classic" newline sequences only.
Well, the best approach would be to normalize your input data to just use \n, like this:
$input = preg_replace('~\r[\n]?~', "\n", $input);
Since:
Unix uses \n.
Windows uses \r\n.
(Old) Mac OS uses \r.
Nonetheless, exploding by \n should get you the best results (if you don't normalize).
The PHP_EOL constant contains the character sequence of the host operating system's newline.
$arr=explode(PHP_EOL,$getdata);
You could use preg_split() which will allow it to work regardless:
$arr = preg_split('/\r?\n/', $getdata);
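As a quick check of the preg_split() suggestions, here is a small sketch that splits one string containing all three classic line-ending styles. The sample data is made up for illustration.

```php
<?php
// One string with Windows (\r\n), UNIX (\n) and old Mac (\r) line endings.
$getdata = "first\r\nsecond\nthird\rfourth";

// (*BSR_ANYCRLF) limits \R to CR, LF and CRLF.
$arr = preg_split('~(*BSR_ANYCRLF)\R~', $getdata);

print_r($arr);
```

Unlike explode(PHP_EOL, ...), the result does not depend on the operating system the script runs on, only on the data itself.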

newline question

I want to detect a carriage return or a newline character when a user enters data into a textarea. What is the best way to handle this? I've tried str_replace with escape characters but carriage returns and newlines are not detected.
OK, say I type the following into a textarea:
The summer was hot this year
but next year is supposed to be cooler.
I want to detect the CRs. In this case, there is one.
Newlines could be \r, \r\n, or \n, depending on the client.
$input = preg_replace('/\r\n?/', "\n", $input);
will standardize all of your newlines to "\n" regardless of where they came from.
You can do it like this with str_replace:
function replace_newline($string) {
    // "\r\n" is listed first so the pair is handled as one unit
    return (string)str_replace(array("\r\n", "\r", "\n"), '', $string);
}
There are several ways a newline can be stored.
Some systems use only "\n", some "\r", and some both ("\r\n"). You need to check for both "\r" and "\n".
Try the following. It's always worked a charm for me.
You need to replace \n AND \r, because Linux and Windows systems use different characters for newlines.
$input = str_replace(array("\n","\r"),'',$input);
Or check for chr(10) and replace on that
Have you tried preg_replace? It handles regex replacements, so you can match \n, \r, or any combination you require, although I believe str_replace should also work fine.
function replace_newlines($string) {
    return preg_replace('/\r\n|\r|\n/', '', $string);
}
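Since the question asks about detecting line breaks rather than removing them, counting the matches may be closer to what is wanted. A minimal sketch; the helper name countLineBreaks is hypothetical, and the sample assumes the browser submitted a Windows-style line break.

```php
<?php
// Hypothetical helper: counts line breaks of any common style.
function countLineBreaks(string $input): int
{
    // \r\n comes first so a Windows line break counts once, not twice.
    return (int) preg_match_all('/\r\n|\r|\n/', $input);
}

// Stands in for the submitted textarea value.
$input = "The summer was hot this year\r\nbut next year is supposed to be cooler.";

echo countLineBreaks($input), PHP_EOL;   // prints 1
```

A result greater than zero means the text contains at least one line break, whichever client produced it.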
