The following is the syntax of the preg_replace() function in PHP:
$new_string = preg_replace($pattern_to_match, $replacement_string, $original_string);
If a text file has both Windows (\r\n) and Linux (\n) end-of-line (EOL) characters, i.e. line breaks,
which of the following is the correct order of applying preg_replace() to get rid of all end-of-line characters?
Remove CR+LF first:
$string = preg_replace('|\r\n|','',$string);
$string = preg_replace('|\n|','',$string);
Remove plain LF first:
$string = preg_replace('|\n|','',$string);
$string = preg_replace('|\r\n|','',$string);
I would recommend using a single pattern that covers Windows, Unix and Mac EOL characters:
$string = preg_replace('/\r|\n/m','',$string);
The m (multiline) modifier is harmless here but not required: it only changes how ^ and $ match.
which of the following is the correct order of applying preg_replace to get rid of all end of line characters?
$string = preg_replace("!\r|\n!m",'',$string);
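To make the ordering question concrete, here is a small sketch. The sample string and variable names are made up for illustration. It suggests that the order does matter when the two calls are applied separately: once the bare \n is gone, the \r\n pattern can no longer match and a stray \r stays behind.

```php
<?php
// Sample text with one Windows (\r\n) and one Unix (\n) line ending.
$text = "one\r\ntwo\nthree";

// Option 1: remove \r\n first, then any remaining \n.
$crlfFirst = preg_replace('|\r\n|', '', $text);
$crlfFirst = preg_replace('|\n|', '', $crlfFirst);

// Option 2: remove \n first; the \r\n pattern then finds nothing to match.
$lfFirst = preg_replace('|\n|', '', $text);
$lfFirst = preg_replace('|\r\n|', '', $lfFirst);

// Single pass that does not depend on ordering.
$singlePass = preg_replace('/\r|\n/', '', $text);

var_dump($crlfFirst === 'onetwothree');   // bool(true)
var_dump($lfFirst === "one\rtwothree");   // bool(true): a stray \r is left
var_dump($singlePass === 'onetwothree');  // bool(true)
```

The single-pass pattern avoids the question altogether, which is probably why both answers suggest it.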
Using the power of regular expressions, you could specify something like
'|\r?\n|'
which specifically means an optional '\r' followed by a '\n', and so matches the end of a line under both Linux and Windows.
EDIT:
Using the built-in function trim() would achieve the same result more simply, but only if the newline characters are located at the beginning or end of the string.
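A short illustration of that difference, using a made-up sample string: trim() only cleans the two ends, while the regex removes every EOL character, including the ones in the middle.

```php
<?php
// Sample string with line breaks at both ends and one in the middle.
$text = "\r\nfirst line\nsecond line\r\n";

// trim() strips whitespace (including \r and \n) from the ends only.
$trimmed = trim($text);

// The regex removes every \r and \n, wherever they occur.
$stripped = preg_replace('/\r|\n/', '', $text);

var_dump($trimmed === "first line\nsecond line");  // bool(true)
var_dump($stripped === 'first linesecond line');   // bool(true)
```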
Basically, I have a string that I need to search through and remove every SECOND occurrence within it.
Here is what my string looks like ($s):
question1,answer1,answer2,answer3,answer4

question2,answer1,answer2,answer3,answer4

question3,answer1,answer2,answer3,answer4
Here is what my code currently looks like:
$toRemove = array("\n");
$finalString = str_replace($toRemove, "", $s);
As you can see, the lines within my $s string have two \n between them. I would like to search through my string and only replace every SECOND \n so that my string ends up being:
question1,answer1,answer2,answer3,answer4
question2,answer1,answer2,answer3,answer4
question3,answer1,answer2,answer3,answer4
Is this possible? If so, how can I do it?
In your specific case, you may want to just replace two newlines with one newline:
$string = str_replace("\n\n", "\n", $string);
More complicated regex solutions could collapse any number of consecutive newlines:
preg_replace("/\n+/", "\n", "foo\n\nbar\n\n\n\n\nblee\nnope");
Adam's answer is correct for UNIX-like systems, but on Windows you can have different line endings. My regex is a little bit rusty, but I think this should work for UNIX and Windows. Note that the replacement must be in double quotes ("\n"); in single quotes it would insert a literal backslash and n.
$string = preg_replace('/(?:\r\n|\r|\n){2}/', "\n", $string); // Replace exactly 2 line endings
$string = preg_replace('/(?:\r\n|\r|\n)+/', "\n", $string); // Replace 1 or more line endings
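As a rough sketch of collapsing blank lines regardless of platform, the helper below treats \r\n, \r and \n each as one line ending and squeezes any run of them into a single "\n". The function name collapseLineBreaks is invented for this example.

```php
<?php
// Hypothetical helper: collapse any run of line endings (\r\n, \r or \n)
// into a single "\n". \r\n is listed first so it counts as one ending.
function collapseLineBreaks(string $text): string
{
    return preg_replace('/(?:\r\n|\r|\n)+/', "\n", $text);
}

// Mixed input: doubled Windows endings, tripled Unix endings, one old-Mac ending.
$s = "question1,answer1\r\n\r\nquestion2,answer1\n\n\nquestion3,answer1\rquestion4,answer1";

echo collapseLineBreaks($s);
```

Because the replacement is always "\n", the result also ends up with uniform Unix line endings, which may or may not be what you want.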
I have strings like this (some examples):
F7998FM3213/02F
J442554NM/05
K439459845/34D
I need to use PHP with preg_replace and regular expressions to delete all non-numeric characters in any string, after the forward-slash, '/'.
For example the codes above would look like this afterwards:
F7998FM3213/02
J442554NM/05
K439459845/34
If you're going for readability, something like this would be perfect:
$parts = explode("/",$line,2);
$parts[1] = preg_replace("/\D/","",$parts[1]);
$output = implode("/",$parts);
However, for conciseness and based entirely on the examples you have given, try this:
$output = preg_replace("/\D+$/","",$input);
This will strip any non-numeric characters from the end of the string, which seems to be what you're after based on your examples.
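The two approaches only agree when the non-digits sit at the very end. A small comparison, with both snippets wrapped in functions whose names (stripAfterSlash, stripTrailingNonDigits) are made up here:

```php
<?php
// Hypothetical wrapper around the explode/implode version:
// removes every non-digit after the first "/".
function stripAfterSlash(string $code): string
{
    $parts = explode('/', $code, 2);
    if (isset($parts[1])) {
        $parts[1] = preg_replace('/\D/', '', $parts[1]);
    }
    return implode('/', $parts);
}

// Hypothetical wrapper around the concise version:
// removes only a trailing run of non-digits.
function stripTrailingNonDigits(string $code): string
{
    return preg_replace('/\D+$/', '', $code);
}

// Same result when the letters are at the end...
var_dump(stripAfterSlash('F7998FM3213/02F'));         // string(14) "F7998FM3213/02"
var_dump(stripTrailingNonDigits('F7998FM3213/02F'));  // string(14) "F7998FM3213/02"

// ...but different results when a letter sits between digits.
var_dump(stripAfterSlash('K439459845/34D34'));        // string(15) "K439459845/3434"
var_dump(stripTrailingNonDigits('K439459845/34D34')); // string(16) "K439459845/34D34"
```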
you can use this:
$subject = <<<LOD
F7998FM3213/02F
J442554NM/05
K439459845/34D
K439459845/34D34
LOD;
echo preg_replace('~^[^/]*+/\K|[^\d\n]++~m', '', $subject);
explanation:
The regex is an alternation between two things:
The first branch matches from the beginning of each line up to and including the /; \K then drops that part from the match, so it is kept in the output.
The second branch matches one or more characters that are neither digits nor newlines.
Since the beginning of each line is tried first, non-digit characters are removed only after the /.
To remove all \D anywhere after a / you could replace:
(?:/\K|\G(?!^))(\d*)\D+
with $1. Like:
preg_replace(',(?:/\K|\G(?!^))(\d*)\D+,', '$1', $str);
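A quick way to try this pattern is to run it on one code at a time. That is an assumption of this sketch: \D+ also matches newlines, so a multi-line subject would need to be split first. The sample codes come from the question plus the extra one from the previous answer.

```php
<?php
// The \G-based pattern from above, with "," as the delimiter.
$pattern = ',(?:/\K|\G(?!^))(\d*)\D+,';

// Apply it to each code separately (one code per string).
$codes = ['F7998FM3213/02F', 'J442554NM/05', 'K439459845/34D34'];
$cleaned = [];
foreach ($codes as $code) {
    $cleaned[] = preg_replace($pattern, '$1', $code);
}

print_r($cleaned);
// Array
// (
//     [0] => F7998FM3213/02
//     [1] => J442554NM/05
//     [2] => K439459845/3434
// )
```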
So evidently:
\n = LF (Line Feed) // Used as a new line character in Unix
\r = CR (Carriage Return) // Used as a new line character in classic Mac OS
\r\n = CR + LF // Used as a new line character in Windows
(char)13 = \r = CR // Same as \r
But then I also heard that when an HTML textarea is submitted and parsed by a PHP script, all new lines are converted to \r\n regardless of the platform.
Is this true and can I rely on it, or am I completely mistaken?
I.e. if I want to explode() on new lines, can I use "\r\n" as the delimiter regardless of whether the user is on a Mac, a PC, etc.?
According to the spec, all newlines in submitted form data should be converted to \r\n.
So you could indeed do a simple explode("\r\n", $theContent) no matter the platform used.
P.S.
\r is only used on old(er) Macs. Nowadays Macs also use the *nix style line breaks (\n).
You could try preg_split, which will use a regular expression to split up the string. Within this regular expression you can match on all 3 new line variants.
$ArrayOfResults = preg_split( '/\r\n|\r|\n/', $YourStringToExplode );
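For instance, with a made-up input that mixes all three variants (putting \r\n first in the alternation keeps a Windows line break from being split twice):

```php
<?php
// Input mixing Windows (\r\n), old Mac (\r) and Unix (\n) line breaks.
$input = "first\r\nsecond\rthird\nfourth";

// \r\n is tried before \r and \n, so it is consumed as a single break.
$lines = preg_split('/\r\n|\r|\n/', $input);

print_r($lines);
// Array
// (
//     [0] => first
//     [1] => second
//     [2] => third
//     [3] => fourth
// )
```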
It depends on what you want to achieve. If you are eventually going to display or format it as HTML, you can just as well use the nl2br() function, or use str_replace() like this (with "\r\n" listed first, so that a Windows line break is not replaced twice):
$val = str_replace( array("\r\n","\n","\r"), '<br />', $val );
In case you just want an array of all lines, I would suggest handling all 3 variants ("\r\n", "\r", "\n"). Since explode() takes a single delimiter, normalize the line endings first or use preg_split().
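A minimal sketch of both ideas, assuming a made-up input string. str_replace() works through the search array in order, so "\r\n" has to come before the single characters.

```php
<?php
// Input mixing all three line-ending styles.
$val = "a\r\nb\rc\nd";

// Normalize everything to "\n" first, then explode on that one delimiter.
$normalized = str_replace(["\r\n", "\r"], "\n", $val);
$lines = explode("\n", $normalized);

// For HTML output: "\r\n" first, otherwise it would become two <br /> tags.
$html = str_replace(["\r\n", "\r", "\n"], '<br />', $val);

var_dump($lines === ['a', 'b', 'c', 'd']);          // bool(true)
var_dump($html === 'a<br />b<br />c<br />d');       // bool(true)
```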
I tried using the preg_replace() function to replace everything matching a regular expression, but I am getting the error message
"Warning: preg_replace(): No ending delimiter '_' found"
$oldString = "";
$newString = preg_replace("/[^a-z0-9_]/ig", "", $oldString);
Here I am trying to remove all characters other than letters, numbers and underscores.
The g (global) modifier is not supported in PHP; removing it will do.
See the list of supported pattern modifiers in the PHP manual.
I think PHP doesn't like the g after your trailing /. I've been having trouble with this as well, and removing the g seems to help. preg_replace() takes optional parameters after the subject string that limit how many replacements are made; the replacement is global by default.
The manual says you set the limit with the 4th parameter (limit), and you can optionally pass a 5th parameter (count), which will receive the number of replacements made.
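A small sketch of those two optional parameters, using a made-up input string: a limit of -1 (the default) replaces every match, and the count variable is filled in by reference.

```php
<?php
$oldString = 'user-name!_01';

// Default behaviour: limit -1 means "replace all matches" (already global).
$all = preg_replace('/[^a-z0-9_]/i', '', $oldString, -1, $count);

// Limit of 1: only the first match is replaced.
$firstOnly = preg_replace('/[^a-z0-9_]/i', '', $oldString, 1, $firstCount);

var_dump($all, $count);             // string(11) "username_01", int(2)
var_dump($firstOnly, $firstCount);  // string(12) "username!_01", int(1)
```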
For my money this is just another thing that PHP does 1/2 right, which all adds up to it being just about a perfectly 1/2 assed language. But that's neither here nor there :)
Oh, and welcome to Stack! :)
First of all there is no modifier g for preg_replace.
$oldString = "";
$newString = preg_replace("/[^a-z0-9_]+/i", "", $oldString);
Second, try putting a quantifier after your character class so that each match covers a run of characters rather than a single one.
In RegEx \W means any non-alpha-numeric-underscore characters. Keep in mind this will also replace spaces.
$oldString = "This, is not _all_ alpha-numeric";
$newString = preg_replace("/\W+/", "", $oldString);
# Gives "Thisisnot_all_alphanumeric"
$newString = preg_replace("/[^\w ]+/", "", $oldString);
# Gives "This is not _all_ alphanumeric"
I have a text file that has the literal string \r\n in it. I want to replace this with an actual line break (\n).
I know that the regex /\\r\\n/ should match it (I have tested it in Reggy), but I cannot get it to work in PHP.
I have tried the following variations:
preg_replace("/\\\\r\\\\n/", "\n", $line);
preg_replace("/\\\\[r]\\\\[n]/", "\n", $line);
preg_replace("/[\\\\][r][\\\\][n]/", "\n", $line);
preg_replace("/[\\\\]r[\\\\]n/", "\n", $line);
If I just try to replace the backslash, it works properly. As soon as I add an r, it finds no matches.
The file I am reading is encoded as UTF-16.
Edit:
I have also already tried using str_replace().
I now believe that the problem here is the character encoding of the file. I tried the following, and it did work:
$testString = "\\r\\n";
echo preg_replace("/\\\\r\\\\n/", "\n", $testString);
but it does not work on lines I am reading in from my file.
Save yourself the effort of figuring out the regex and try str_replace() instead:
str_replace('\r\n', "\n", $string);
Save yourself the effort of figuring out the regex and the escaping within double quotes:
$fixed = str_replace('\r\n', "\n", $line);
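The quoting is the whole trick here, so a short demonstration may help: in single quotes '\r\n' is four literal characters, while in double quotes "\r\n" is a real CR+LF pair. The sample line is borrowed from another answer in this thread.

```php
<?php
$literal = '\r\n';   // backslash, r, backslash, n
$actual  = "\r\n";   // carriage return, line feed

var_dump(strlen($literal));  // int(4)
var_dump(strlen($actual));   // int(2)

// Replace the literal text \r\n with a real line break.
$line  = 'Cake is yummy\r\n';
$fixed = str_replace('\r\n', "\n", $line);

var_dump($fixed === "Cake is yummy\n");  // bool(true)
```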
For what it is worth, preg_replace("/\\\\r\\\\n/", "\n", $line); should be fine. As a demonstration:
var_dump(preg_replace("/\\\\r\\\\n/", "NL", 'Cake is yummy\r\n\r\n'));
Gives: string(17) "Cake is yummyNLNL"
Also fine is: '/\\\r\\\n/' and '/\\\\r\\\\n/'
Important - if the above doesn't work, are you even sure literal \r\n is what you're trying to match?..
UTF-16 is the problem. If you're just working with the raw bytes, then you can use the full byte sequences for replacing:
$out = str_replace("\x00\x5c\x00\x72\x00\x5c\x00\x6e", "\x00\x0a", $in);
This assumes big-endian UTF-16, else swap the zero bytes to come after the non zeros:
$out = str_replace("\x5c\x00\x72\x00\x5c\x00\x6e\x00", "\x0a\x00", $in);
If that doesn't work, please post a byte-dump of your input file so we can see what it actually contains.
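An alternative to matching raw bytes is to convert to UTF-8, do the replacement there, and convert back. This is only a sketch under two assumptions: the input is little-endian UTF-16 without a BOM, and the iconv extension is available. The function name is invented for the example.

```php
<?php
// Hypothetical helper. Assumes $in is UTF-16LE without a BOM and that
// the iconv extension is loaded.
function replaceLiteralCrLfUtf16le(string $in): string
{
    $utf8 = iconv('UTF-16LE', 'UTF-8', $in);
    if ($utf8 === false) {
        throw new RuntimeException('Input is not valid UTF-16LE');
    }

    // In UTF-8 the literal text \r\n is plain ASCII, so str_replace works.
    $utf8 = str_replace('\r\n', "\n", $utf8);

    $out = iconv('UTF-8', 'UTF-16LE', $utf8);
    if ($out === false) {
        throw new RuntimeException('Could not convert back to UTF-16LE');
    }
    return $out;
}

// "A\r\nB" (with a literal backslash-r-backslash-n) encoded as UTF-16LE.
$in  = "A\x00\x5c\x00r\x00\x5c\x00n\x00B\x00";
$out = replaceLiteralCrLfUtf16le($in);

var_dump($out === "A\x00\x0a\x00B\x00");  // bool(true)
```

Working on decoded text also avoids accidental matches that straddle two characters, which a raw byte search cannot rule out.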
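$result = preg_replace('/\\\\r\\\\n/', "\n", $subject);
The regex above replaces the literal text \r\n (the escaped form of a Windows line break) with an actual Linux line break (\n). Note the double quotes around the replacement: in single quotes, '\n' would insert a backslash followed by n.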
I keep searching for this topic, and I always come back to a line I wrote myself.
It looks neat and it's based on a regex:
"/[\n\r]/"
PHP
preg_replace("/[\n\r]/", "\n", $string)
or
preg_replace("/[\n\r]/",$replaceStr, $string )