I'm trying to work with some regex in PHP, but there is something I don't understand.
Here is my text:
# fhzmvbzmvbzmb##!
# blabla
# test
sbsfzzbg
And let's say I want to emphasise it as in Markdown. Why does the following call apply to my second line only? I would expect it to apply to the third line as well.
preg_replace("/\n(.*)\n/", "<h1>$1</h1>", $input_lines);
Also, I want to catch the first line. Is there a way to say that the expression I am trying to catch may or may not be at the beginning of the string? I tried the following, but it doesn't seem to work:
preg_replace("/(^|\n)(.*)\n/", "<h1>$1</h1>", $input_lines);
Thank you very much.
Pierrick
By using the m modifier, you can have ^ and $ apply to every line:
http://www.phpliveregex.com/p/4eb
From the documentation:
By default, PCRE treats the subject string as consisting of a single "line" of characters (even if it actually contains several newlines). The "start of line" metacharacter (^) matches only at the start of the string, while the "end of line" metacharacter ($) matches only at the end of the string, or before a terminating newline (unless D modifier is set). This is the same as Perl. When this modifier is set, the "start of line" and "end of line" constructs match immediately following or immediately before any newline in the subject string, respectively, as well as at the very start and end. This is equivalent to Perl's /m modifier. If there are no "\n" characters in a subject string, or no occurrences of ^ or $ in a pattern, setting this modifier has no effect.
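To make the quoted behaviour concrete, here is a minimal sketch that counts how often ^ can anchor a match with and without the m modifier. The variable names are only illustrative, and the sample text is the one from the question, assumed to use plain \n line endings.

```php
<?php
// Sample text from the question, with \n line endings (an assumption).
$text = "# fhzmvbzmvbzmb##!\n# blabla\n# test\nsbsfzzbg";

// Without /m, ^ can only match at the very start of the subject.
$withoutM = preg_match_all('/^#/', $text);

// With /m, ^ also matches immediately after every newline.
$withM = preg_match_all('/^#/m', $text);

printf("without m: %d, with m: %d\n", $withoutM, $withM); // without m: 1, with m: 3
```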
To do the replacement, you can do something like this with lookaheads and lookbehinds to match the newlines. I'm not sure how you'd go about capturing the first line within the same expression you're using to replace. Here's what I came up with:
$input_lines = '# fhzmvbzmvbzmb##!
# blabla
# test
sbsfzzbg';
// REPLACE
$data = preg_replace("/(?<=\n)(.*)(?=\n)/m", "<h1>$1</h1>", $input_lines);
print $data;
// GET THE FIRST LINE
preg_match('/^(.*)\n/', $input_lines, $first_line_matches);
print "\n\nFirst Line: ".$first_line_matches[1];
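This outputs the following (each </h1> lands on its own line, most likely because the tested input used \r\n line endings, so .* also captured the trailing \r):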
# fhzmvbzmvbzmb##!
<h1># blabla
</h1>
<h1># test
</h1>
sbsfzzbg
First Line: # fhzmvbzmvbzmb##!
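For what it's worth, the first line can probably be handled in the same call. The original /\n(.*)\n/ skips the third line because each match consumes the trailing \n, so the next line has no leading \n left to match. Anchoring with ^ and $ under the m modifier consumes no newlines at all. The sketch below assumes that only lines starting with "# " should become headings, that the "# " marker should be dropped, and that the input uses \n line endings; none of that is stated in the question.

```php
<?php
// Same sample text as above, assumed to use \n line endings.
$input_lines = "# fhzmvbzmvbzmb##!\n# blabla\n# test\nsbsfzzbg";

// ^ and $ are zero-width, so no newline is consumed and every
// line, including the first, gets its own chance to match.
$data = preg_replace('/^# (.*)$/m', '<h1>$1</h1>', $input_lines);

print $data;
// <h1>fhzmvbzmvbzmb##!</h1>
// <h1>blabla</h1>
// <h1>test</h1>
// sbsfzzbg
```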
Related
I use a regex pattern in the PHP preg_match function. The pattern is, let's say, '/abc$/'. It matches both strings:
'abc'
and
'abc
'
The second one has a line break at its end. What would be the pattern that matches only the first string?
'abc'
The reason why /abc$/ matches both "abc\n" and "abc" is that $ matches the location at the end of the string, or (even without /m modifier) the position before the newline that is at the end of the string.
You need the following regex:
/abc\z/
where \z is the unambiguous very end of the string, or
/abc$/D
where the /D modifier will make $ behave the same way as \z. See PHP.NET:
The meaning of dollar can be changed so that it matches only at the very end of the string, by setting the PCRE_DOLLAR_ENDONLY option at compile or matching time.
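A quick way to check the three variants side by side is a sketch like the following. The helper array names are just for illustration.

```php
<?php
$subjects = ['abc', "abc\n"];
$patterns = ['/abc$/', '/abc\z/', '/abc$/D'];

$results = [];
foreach ($patterns as $pattern) {
    foreach ($subjects as $subject) {
        // preg_match returns 1 on a match and 0 otherwise.
        $results[$pattern][] = preg_match($pattern, $subject);
    }
}

foreach ($results as $pattern => $r) {
    printf("%-10s 'abc' => %d, 'abc\\n' => %d\n", $pattern, $r[0], $r[1]);
}
// /abc$/     'abc' => 1, 'abc\n' => 1
// /abc\z/    'abc' => 1, 'abc\n' => 0
// /abc$/D    'abc' => 1, 'abc\n' => 0
```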
I want to match a $ only at the end.
Why does this not work:
<?php
$reg = '{$$}';
$str= 'helloc$a';
print preg_match($reg,$str);
It prints 1 -- matched. But I want it to match for example inputs like abc$ or zzz$ only.
$ is a metacharacter in regular expressions with a special meaning: it asserts the position at the end of the string (or at the end of a line in multiline mode). When you want to match a literal $, you'll need to escape it, i.e. use \$ instead of $:
$reg = '{\$$}';
As Casmir notes in the comments section below the answer, this pattern will also match when the last $ is immediately followed by a newline. To prevent this, you can use the following pattern instead:
$reg = '{\$$}D';
With the D modifier set, a dollar metacharacter in the pattern matches only at the end of the given string. If this modifier is not set, $ also matches immediately before the final character if it is a newline character.
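As a rough check of the difference, the sketch below runs both patterns against a few subjects. The subjects and variable names are made up for illustration. The patterns are single-quoted so that PHP itself leaves the backslash and the dollar signs alone.

```php
<?php
$loose  = '{\$$}';   // literal $, then end of string or just before a final newline
$strict = '{\$$}D';  // literal $, then the very end of the string only

$subjects = ['abc$', 'helloc$a', 'zzz$' . "\n"];

$looseHits  = [];
$strictHits = [];
foreach ($subjects as $s) {
    $looseHits[]  = preg_match($loose, $s);
    $strictHits[] = preg_match($strict, $s);
}

print_r($looseHits);   // 1, 0, 1
print_r($strictHits);  // 1, 0, 0
```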
$ is a special character in regular expressions. You should add a \ before it. Try this:
$reg = '/\$$/';
I have a text file with some configuration values. In it, a comment starts with a #.
I am trying to find a regular expression pattern that will find all the lines that start with a #.
So, sample file:
1st line
#test line this
line #new line
aaaa #aaaa
bbbbbbbbbbb#
cccccccccccc
#ddddddddd
I want to find
#test line this
#ddddddddd
because only these two lines start with #
I tried the following code:
preg_match_all("/^#(.*)$/siU",$text,$m);
var_dump($m);
But it always outputs an empty array. Can anyone help?
You forgot the multiline modifier (and you should not use the singleline modifier; also the case-insensitive modifier is unnecessary as well as the ungreedy modifier):
preg_match_all("/^#(.*)$/m",$text,$m);
Explanation:
/m allows the ^ and $ to match at the start/end of lines, not just the entire string (which you need here)
/s allows the dot to match newlines (which you don't want here)
/i turns on case-insensitive matching (which you don't need here)
/U turns on ungreedy matching (which doesn't make a difference here because of the anchors)
A PHP code demo:
$text = "1st line\n#test line this \nline #new line\naaaa #aaaa\nbbbbbbbbbbb#\ncccccccccccc\n#ddddddddd";
preg_match_all("/^#(.*)$/m",$text,$m);
print_r($m[0]);
Results:
[0] => #test line this
[1] => #ddddddddd
You can simply write:
preg_match_all('~^#.*~m', $text, $m);
Since the quantifier is greedy by default and the dot doesn't match newlines by default, you will obtain what you want.
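To see why the singleline modifier is the one to drop, here is a small sketch comparing /m with /ms on the sample text from the demo above. With s set, the greedy dot runs across newlines, so the first match swallows everything up to the end of the string. The variable names are only illustrative.

```php
<?php
$text = "1st line\n#test line this \nline #new line\naaaa #aaaa\nbbbbbbbbbbb#\ncccccccccccc\n#ddddddddd";

// /m only: the dot stops at each newline, so each comment line is its own match.
$countM = preg_match_all('/^#(.*)$/m', $text, $perLine);

// /ms: the dot also matches newlines, so the greedy .* runs to the end of the text.
$countMs = preg_match_all('/^#(.*)$/ms', $text, $dotAll);

printf("m: %d matches, ms: %d match\n", $countM, $countMs); // m: 2 matches, ms: 1 match
var_dump(strpos($dotAll[0][0], "\n") !== false);            // bool(true): the single match spans lines
```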
Let's say I have the following string:
Some Text Here }
}
How can I do a preg_replace so that only the "}" on the line by itself gets replaced?
I would expect the following to work, but it doesn't:
preg_replace('/^(\s*)(\})(\s*)/', etc);
The following should work:
preg_replace('/^\s*\}\s*$/m', $replacement, $subject);
The s* means any number of the character s. What you probably mean is \s*, any number of whitespace characters.
You need to enable multiline mode for the ^ anchor to work on a per line basis; the default setting is that ^ is the beginning and $ the end of the entire string, not a single line.
Remember the $ anchor, otherwise something like }hello would also get matched.
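Put together, the three points above might look like this on the string from the question. The replacement text [CLOSED] is just a placeholder chosen for the example.

```php
<?php
$subject = "Some Text Here }\n}";

// ^ and $ anchor to each line because of /m, so only a line that
// consists of nothing but optional whitespace and a } can match.
$result = preg_replace('/^\s*\}\s*$/m', '[CLOSED]', $subject);

print $result;
// Some Text Here }
// [CLOSED]
```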
^ and $ match the beginning and end of a string. You need the m modifier to make them match the beginning and end of a line.
Your RE will not work as expected. s* matches zero or more occurrences of s. It's very likely that you wanted to use \s* instead, to match white space.
preg_replace('/^(\s*)(\})(\s*)$/m', $replacement, $subject);
A version that does not rely on the multiline modifier, and could therefore be used inside a larger regex that needs to span lines:
/(^|\n)([^\S\n]*\}[^\S\n]*)(?=\n|$)/
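One practical difference between the two styles: \s also matches newlines, so the \s* version can absorb blank lines around the brace, while [^\S\n] (whitespace except newline) stays on one line. A small sketch of that, using a made-up subject and a placeholder replacement; ${1} puts back the newline captured by the first group.

```php
<?php
// Hypothetical subject: a lone } with a blank line on either side.
$subject = "a\n\n}\n\nb";

// \s* may cross line boundaries, so one neighbouring newline on each side is consumed.
$viaS = preg_replace('/^\s*\}\s*$/m', '[CLOSED]', $subject);

// [^\S\n]* cannot match a newline; ${1} re-inserts the leading \n that (^|\n) captured.
$viaNoNewline = preg_replace('/(^|\n)([^\S\n]*\}[^\S\n]*)(?=\n|$)/', '${1}[CLOSED]', $subject);

var_dump($viaS);         // "a\n[CLOSED]\nb"       (blank lines gone)
var_dump($viaNoNewline); // "a\n\n[CLOSED]\n\nb"   (blank lines kept)
```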