How to remove every second occurrence within a string? - php

Basically, I have a string that I need to search through and remove every SECOND occurrence within it.
Here is what my string looks like ($s):
question1,answer1,answer2,answer3,answer4

question2,answer1,answer2,answer3,answer4

question3,answer1,answer2,answer3,answer4
Here is what my code currently looks like:
$toRemove = array("\n");
$finalString = str_replace($toRemove, "", $s);
As you can see, the lines in my $s string are separated by two \n characters. I would like to search through my string and remove only every SECOND \n so that my string ends up being:
question1,answer1,answer2,answer3,answer4
question2,answer1,answer2,answer3,answer4
question3,answer1,answer2,answer3,answer4
Is this possible? If so, how can I do it?

In your specific case, you may want to just replace two newlines with one newline:
$string = str_replace("\n\n", "\n", $string);
A regex solution can collapse any number of consecutive newlines:
preg_replace("/\n+/", "\n", "foo\n\nbar\n\n\n\n\nblee\nnope");
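If you literally need to drop every second occurrence of a substring, rather than collapse pairs, a counter inside preg_replace_callback() is one way to do it. The following is a minimal sketch; the function name removeEverySecond is hypothetical and not part of PHP.

```php
<?php
// Hypothetical helper: removes every second occurrence of $needle in $haystack.
function removeEverySecond(string $haystack, string $needle): string
{
    $count = 0;
    return preg_replace_callback(
        '/' . preg_quote($needle, '/') . '/',
        function (array $m) use (&$count): string {
            $count++;
            // Keep odd-numbered matches, drop even-numbered ones.
            return ($count % 2 === 0) ? '' : $m[0];
        },
        $haystack
    );
}

$s = "question1,answer1\n\nquestion2,answer1\n\nquestion3,answer1";
echo removeEverySecond($s, "\n"), PHP_EOL;
```

Unlike the str_replace() approach, this counts occurrences across the whole string, so it also works when the occurrences are not adjacent.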

Adam's answer is correct for UNIX-like systems, but on Windows you can have different line endings. My regex is a little bit rusty, but I think this should work for both UNIX and Windows. Note that the replacement has to be in double quotes; '\n' in single quotes would insert a literal backslash followed by n.
$string = preg_replace('/[\n\r]{2}/', "\n", $string); // Replace exactly 2 line-ending characters
$string = preg_replace('/[\n\r]+/', "\n", $string); // Replace 1 or more line-ending characters
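As a variation on the patterns above, PCRE's \R escape treats \r\n as a single line break, which avoids counting a Windows line ending as two characters. This is only a sketch under the assumption that blank lines should be collapsed and single line breaks left as they are; the sample data is made up.

```php
<?php
// Mixed Windows (\r\n) and UNIX (\n) line endings, each line followed by a blank line.
$s = "question1,answer1\r\n\r\nquestion2,answer1\n\nquestion3,answer1";

// \R matches one line break of any style; {2,} means two or more in a row.
$collapsed = preg_replace('/\R{2,}/', "\n", $s);

echo $collapsed, PHP_EOL;
```

A single \r\n is not touched by this pattern, so lines that were never doubled keep their original line ending.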

Related

PHP preg_replace() pattern

I have about 500 lines stored in a text file.
Each line looks like Filename_662344.xlsx , 324.
I would like to count the numbers at the end of each line (324 in this case).
My idea is to delete part of each line using the preg_replace() function.
I tried preg_replace("/[^0-9]/", '', $line); , but the result was 662344324.
What should the pattern look like if I want to delete the filename (including its numbers)?
Thanks!
You can try this regex.
$line = 'Filename_662344.xlsx , 324';
$line = preg_replace("/^(.*?), /", '', $line);
echo $line;
\d+$ will match the numbers at the end of a string.
If you are trying to remove everything except the numbers, you can just use (^.*, ). This starts at the beginning of the line and selects everything up to and including the comma and the space after it.
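Instead of deleting the filename, the trailing number can also be captured directly. The sketch below assumes that "count" in the question means adding the numbers up, and it uses two made-up sample lines in place of the 500-line file.

```php
<?php
// Sample lines standing in for the contents of the text file (assumed data).
$lines = [
    'Filename_662344.xlsx , 324',
    'Filename_100001.xlsx , 76',
];

$total = 0;
foreach ($lines as $line) {
    // Capture the digits at the very end of the line, ignoring trailing whitespace.
    if (preg_match('/(\d+)\s*$/', $line, $m)) {
        $total += (int) $m[1];
    }
}

echo $total, PHP_EOL; // 400
```

With a real file, file('path', FILE_IGNORE_NEW_LINES) would be one way to obtain the $lines array.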

Explode text into array as per paragraph

I have the following text:
$test = 'Test This is first line
Test:123
This is Test';
I want to explode this string to an array of paragraphs. I wrote the following code but it is not working:
$array = explode('\n\n', $test);
Any idea what I'm missing here?
You might be on Windows which uses \r\n instead of \n. You could use a regex to make it universal with preg_split():
$array = preg_split('#(\r\n?|\n)+#', $test);
Pattern explanation:
( : start matching group 1
\r\n?|\n : match \r\n, \r or \n
) : end matching group 1
+ : repeat one or more times
If you want to split by 2 newlines, then replace + by {2,}.
Update: you might use:
$array = preg_split('#\R+#', $test);
\R matches any Unicode newline sequence (\r\n, \n, \r and a few others). Note that this is only supported in PCRE/Perl. So in a sense, it's less cross-flavour compatible.
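Combining the two suggestions above, \R with {2,} splits on blank lines regardless of the line-ending style. A small sketch, assuming the paragraphs in $test are separated by at least one blank line:

```php
<?php
// Paragraphs separated by a blank line, with mixed line-ending styles.
$test = "Test This is first line\r\n\r\nTest:123\n\nThis is Test";

// Split wherever two or more line breaks occur in a row; skip empty pieces.
$array = preg_split('/\R{2,}/', $test, -1, PREG_SPLIT_NO_EMPTY);

print_r($array);
```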
Your code
$array = explode('\n\n', $test);
should have \n\n enclosed in double quotes:
$array = explode("\n\n", $test);
Using single quotes, it looks through the variable $test for a literal \n\n (a backslash followed by n, twice). With double quotes, it looks for the evaluated value of \n\n, which is two newline (line feed) characters.
Also, note that the end of line depends on the host operating system. Windows uses \r\n instead of \n. You can get the end of line for the operating system by using the predefined constant PHP_EOL.
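The difference between the two quoting styles is easy to check with strlen(). The snippet below is only an illustration; the sample text assumes the paragraphs are separated by "\n\n".

```php
<?php
$single = '\n'; // two characters: a backslash and the letter n
$double = "\n"; // one character: a line feed

var_dump(strlen($single)); // int(2)
var_dump(strlen($double)); // int(1)

$test = "Test This is first line\n\nTest:123\n\nThis is Test";

$bad  = explode('\n\n', $test); // delimiter never found: one element
$good = explode("\n\n", $test); // three paragraphs

var_dump(count($bad));  // int(1)
var_dump(count($good)); // int(3)
```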
Try double quotes
$array = explode("\n\n", $test);
Have you tried this?
$array = explode("\n", $test);
The easiest way to get this text into an array like you describe would be:
preg_match_all('/.+/',$string, $array);
Since /./ matches any char, except for line terminators, and the + is greedy, it'll match as many chars as possible, until a new-line is encountered.
Using preg_match_all ensures this is repeated for each line, too. When I tried this, the output looked like this:
array (
0 =>
array (
0 => '$test = \'Test This is first line',
1 => 'Test:123',
2 => 'This is Test\';',
),
)
Also note that line-feeds are different, depending on the environment (\n for *NIX systems, compared to \r\n for windows, or in some cases a simple \r). Perhaps you might want to try explode(PHP_EOL, $text);, too
You need to use double quotes in your code, so that the \n\n is actually evaluated as two newlines. Look below:
'Paragraph 1\n\nParagraph 2' =
Paragraph 1\n\nParagraph 2
Whereas:
"Paragraph 1\n\nParagraph 2" =
Paragraph 1
Paragraph 2
Also, Windows systems use \r\n\r\n instead of \n\n. The line ending of the system PHP is running on is available as the predefined constant:
PHP_EOL
So, your final code would be:
$paragraphs = explode(PHP_EOL . PHP_EOL, $text);

Trying to remove new lines and spaces using regex

I am attempting to remove some line breaks and spaces from a multiline string I have, such as the following:
Toronto (YTZ)
to
Montreal (YUL)
I tried doing:
$matched = preg_replace('/[\n]/', '', $string);
var_dump($matched);
but all it returns is:
Montreal (YUL)
I've tried all sorts of combinations of regular expressions, but it only ever seems to find what I specify, replace it, and display anything AFTER the matched expression.
I'm sure it's something simple, but I can't seem to figure it out.
Thanks in advance!
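In a PHP string literal, \n is only turned into a newline character inside double quotes ("\n"). In a regex pattern it also works inside single quotes, because PCRE interprets \n itself, so '/[\n]/' and "/[\n]/" match the same thing.
Anyway, you don't need a regular expression for that; use str_replace("\n",'',$string) instead. It's faster.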
As Kash already pointed out, the newline sequence can differ between operating systems.
That's where the PHP_EOL constant comes in. Its value depends on the OS PHP is running on.
$string = str_replace(PHP_EOL, '', $string);
If the string could have been created on a different machine, it is better to replace "\r" and "\n" separately:
$string = str_replace(array("\r", "\n"), '', $string);
$str = preg_replace('/\n+(?=.)/', " ",
preg_replace('/^\s*/m', "",
$str));
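If the goal is a single line, collapsing every run of whitespace is another option. The sketch below assumes the input uses Windows line endings and indented continuation lines; that is a guess about the original data. It may also explain the symptom in the question: removing only \n leaves the \r behind, and a terminal then moves the cursor back to the start of the line, so later text overwrites earlier text.

```php
<?php
// Assumed input: Windows line endings plus indentation on the following lines.
$string = "Toronto (YTZ)\r\n    to\r\n    Montreal (YUL)";

// \s covers spaces, tabs, \r and \n; each run becomes a single space.
$matched = preg_replace('/\s+/', ' ', trim($string));

var_dump($matched); // string(31) "Toronto (YTZ) to Montreal (YUL)"
```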

Removing newlines in php

Following is the syntax for preg_replace() function in php:
$new_string = preg_replace($pattern_to_match, $replacement_string, $original_string);
If a text file has both Windows (\r\n) and Linux (\n) end-of-line (EOL) characters, i.e. line breaks,
then which of the following is the correct order of applying preg_replace() to get rid of all end-of-line characters?
Remove CRLF first:
$string = preg_replace('|\r\n|','',$string);
$string = preg_replace('|\n|','',$string);
Remove plain LF first:
$string = preg_replace('|\n|','',$string);
$string = preg_replace('|\r\n|','',$string);
I would recommend to use: (Windows, Unix and Mac EOL characters)
$string = preg_replace('/\r|\n/m','',$string);
Note the m (multiline) modifier; it only affects the ^ and $ anchors, so it is optional for this pattern.
which of the following is the correct order of applying preg_replace to get rid of all end of line characters?
$string = preg_replace("!\r|\n!m",'',$string);
Using the power of regular expressions, you could specify something like
'|[\r]?[\n]|'
which specifically means 0 or 1 '\r', then a '\n', and which matches the end of a line under both Linux and Windows.
EDIT:
Using the built-in function trim would achieve the same result in an even better manner, but only if the newline characters are located at the beginning or end of the string.
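To illustrate the question about ordering: when every line-ending character is simply deleted, the order does not change the result, but it does matter when line endings are normalised to "\n". The sketch below uses str_replace() with arrays, which applies the searches in the order given; the sample string is made up.

```php
<?php
$s = "a\r\nb\rc\nd"; // Windows, old Mac and UNIX line endings mixed

// Deleting: both orders give the same result.
$removedA = str_replace(["\r\n", "\r", "\n"], '', $s);
$removedB = str_replace(["\n", "\r", "\r\n"], '', $s);

// Normalising to "\n": "\r\n" has to be handled before the lone "\r".
$normalised = str_replace(["\r\n", "\r"], "\n", $s);
$doubled    = str_replace(["\r", "\r\n"], "\n", $s); // turns "\r\n" into "\n\n"

var_dump($removedA === $removedB);         // bool(true)
var_dump($normalised === "a\nb\nc\nd");    // bool(true)
var_dump($doubled === "a\n\nb\nc\nd");     // bool(true)
```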

Problem Replacing Literal String \r\n With Line Break in PHP

I have a text file that has the literal string \r\n in it. I want to replace this with an actual line break (\n).
I know that the regex /\\r\\n/ should match it (I have tested it in Reggy), but I cannot get it to work in PHP.
I have tried the following variations:
preg_replace("/\\\\r\\\\n/", "\n", $line);
preg_replace("/\\\\[r]\\\\[n]/", "\n", $line);
preg_replace("/[\\\\][r][\\\\][n]/", "\n", $line);
preg_replace("/[\\\\]r[\\\\]n/", "\n", $line);
If I just try to replace the backslash, it works properly. As soon as I add an r, it finds no matches.
The file I am reading is encoded as UTF-16.
Edit:
I have also already tried using str_replace().
I now believe that the problem here is the character encoding of the file. I tried the following, and it did work:
$testString = "\\r\\n";
echo preg_replace("/\\\\r\\\\n/", "\n", $testString);
but it does not work on lines I am reading in from my file.
Save yourself the effort of figuring out the regex and try str_replace() instead:
str_replace('\r\n', "\n", $string);
Save yourself the effort of figuring out the regex and the escaping within double quotes:
$fixed = str_replace('\r\n', "\n", $line);
For what it is worth, preg_replace("/\\\\r\\\\n/", "\n", $line); should be fine. As a demonstration:
var_dump(preg_replace("/\\\\r\\\\n/", "NL", 'Cake is yummy\r\n\r\n'));
Gives: string(17) "Cake is yummyNLNL"
Also fine is: '/\\\r\\\n/' and '/\\\\r\\\\n/'
Important - if the above doesn't work, are you even sure literal \r\n is what you're trying to match?..
UTF-16 is the problem. If you're just working with the raw bytes, then you can use the full byte sequences for replacing:
$out = str_replace("\x00\x5c\x00\x72\x00\x5c\x00\x6e", "\x00\x0a", $in);
This assumes big-endian UTF-16, else swap the zero bytes to come after the non zeros:
$out = str_replace("\x5c\x00\x72\x00\x5c\x00\x6e\x00", "\x0a\x00", $in);
If that doesn't work, please post a byte-dump of your input file so we can see what it actually contains.
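An alternative to matching raw byte sequences is to decode the line to UTF-8 first and then do an ordinary replacement. The sketch below assumes little-endian UTF-16 without a byte order mark and that the iconv extension is available; the sample line is simulated rather than read from a file.

```php
<?php
// Simulate one line of the file: UTF-16LE text containing the literal characters \r\n.
$raw = iconv('UTF-8', 'UTF-16LE', 'Cake is yummy\r\n');

// A plain replacement on the raw bytes finds nothing, because every
// character is followed by a zero byte.
var_dump(str_replace('\r\n', "\n", $raw) === $raw); // bool(true)

// Decode to UTF-8 first, then replace the literal \r\n with a real line feed.
$utf8  = iconv('UTF-16LE', 'UTF-8', $raw);
$fixed = str_replace('\r\n', "\n", $utf8);

var_dump($fixed === "Cake is yummy\n"); // bool(true)
```

If the result has to be written back as UTF-16, the same iconv() call with the encodings swapped converts it again.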
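$result = preg_replace('/\\\\r\\\\n/', "\n", $subject);
The regex above replaces the literal four-character text \r\n (backslash, r, backslash, n) with an actual line feed. The replacement has to be in double quotes; '\n' in single quotes would write the two characters \ and n back out.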
I keep searching for this topic, and I always come back to a one-liner I wrote myself.
It looks neat and it's based on regex:
"/[\n\r]/"
PHP
preg_replace("/[\n\r]/",'\n', $string )
or
preg_replace("/[\n\r]/",$replaceStr, $string )
