Given a literal string such as:
Hello\n\n\n\n\n\n\n\n\n\n\n\nWorld
I would like to reduce the repeated \n's to a single \n.
I'm using PHP, and been playing around with a bunch of different regex patterns. So here's a simple example of the code:
$testRegex = '/(\\n){2,}/';
$test = 'Hello\n\n\n\n\n\n\n\n\nWorld';
$test2 = preg_replace($testRegex ,'\n',$test);
echo "<hr/>test regex<hr/>".$test2;
I'm new to PHP, not that new to regex, but it seems '\n' conforms to special rules. I'm still trying to nail those down.
Edit: I've placed the literal code I have in my php file here, if I do str_replace() I can get good things to happen, but that's not a complete solution obviously.
To match a literal \n with regex, your string literal needs four backslashes to produce a string with two backlashes that’s interpreted by the regex engine as an escape for one backslash.
$testRegex = '/(\\\\n){2,}/';
$test = 'Hello\n\n\n\n\n\n\n\n\n\n\n\nWorld';
$test2 = preg_replace($testRegex, '\n', $test);
Perhaps you need to double up the escape in the regular expression?
$pattern = "/\\n+/"
$awesome_string = preg_replace($pattern, "\n", $string);
Edit: Just read your comment on the accepted answer. Doesn't apply, but is still useful.
If you're intending on expanding this logic to include other forms of white-space too:
$output = echo preg_replace('%(\s)*%', '$1', $input);
Reduces all repeated white-space characters to single instances of the matched white-space character.
it indeed conforms to special rules, and you need to add the "multiline"-modifier, m. So your pattern would look like
$pattern = '/(\n)+/m'
which should provide you with the matches. See the doc for all modifiers and their detailed meaning.
Since you're trying to reduce all newlines to one, the pattern above should work with the rest of your code. Good luck!
Try this regular expression:
/[\n]*/
Related
Lately I've been studying (more in practice to tell the truth) regex, and I'm noticing his power. This demand made by me (link), I am aware of 'backreference'. I think I understand how it works, it works in JavaScript, while in PHP not.
For example I have this string:
[b]Text B[/b]
[i]Text I[/i]
[u]Text U[/u]
[s]Text S[/s]
And use the following regex:
\[(b|i|u|s)\]\s*(.*?)\s*\[\/\1\]
This testing it on regex101.com works, the same for JavaScript, but does not work with PHP.
Example of preg_replace (not working):
echo preg_replace(
"/\[(b|i|u|s)\]\s*(.*?)\s*\[\/\1\]/i",
"<$1>$2</$1>",
"[b]Text[/b]"
);
While this way works:
echo preg_replace(
"/\[(b|i|u|s)\]\s*(.*?)\s*\[\/(b|i|u|s)\]/i",
"<$1>$2</$1>",
"[b]Text[/b]"
);
I can not understand where I'm wrong, thanks to everyone who helps me.
It is because you use a double quoted string, inside a double quoted string \1 is read as the octal notation of a character (the control character SOH = start of heading), not as an escaped 1.
So two ways:
use single quoted string:
'/\[(b|i|u|s)\]\s*(.*?)\s*\[\/\1\]/i'
or escape the backslash to obtain a literal backslash (for the string, not for the pattern):
"/\[(b|i|u|s)\]\s*(.*?)\s*\[\/\\1\]/i"
As an aside, you can write your pattern like this:
$pattern = '~\[([bius])]\s*(.*?)\s*\[/\1]~i';
// with oniguruma notation
$pattern = '~\[([bius])]\s*(.*?)\s*\[/\g{1}]~i';
// oniguruma too but relative:
// (the second group on the left from the current position)
$pattern = '~\[([bius])]\s*(.*?)\s*\[/\g{-2}]~i';
I have a string that looks like this
../Clean_Smarty_Projekt/tpl/templates_c\.
../Clean_Smarty_Projekt/tpl/templates_c\..
I want to replace ../, \. and \.. with a regulare expression.
Before, I did this like this:
$result = str_replace(array("../","\..","\."),"",$str);
And there it (pattern) has to be in this order because changing it makes the output a little buggy. So I decided to use a regular expression.
Now I came up with this pattern
$result = preg_replace('/(\.\.\/)|(\\[\.]{1,2})/',"",$str);
What actually returns only empty strings...
Reason: (\\[\.]{1,2})
In Regex101 its all ok. (Took me a couple of minutes to realize that I don't need the /g in preg_replace)
If I use this pattern in preg_replace I have to do (\\\\[\.]{1,2}) to get it to work. But that's obviously wrong because im not searching for two slashes.
Of course I know the escaping rulse (escaping slashes).
Why doesn't this match correctly ?
I suggest you to use a different php delimiter. Within the / delimiter, you need to use three \\\ or four \\\\ backslashes to match a single backslash.
$string = '../Clean_Smarty_Projekt/tpl/templates_c\.'."\n".'../Clean_Smarty_Projekt/tpl/templates_c\..';
echo preg_replace('~\.\./|\\\.{1,2}~', '', $string)
Output:
Clean_Smarty_Projekt/tpl/templates_c
Clean_Smarty_Projekt/tpl/templates_c
I've been looking all over the internet for some useful information and I think I found too much. I'm trying to understand regular expressions but don't get it.
Lets for instance say $data="A bunch of text [link=123] another bunch of text.", and it should get replaced with "< a href=\"123.html\">123< /a>".
I've been trying around a lot with code similar to this:
$find = "/[link=[([0-9])]/";
$replace = "< a href=\"$1\">$1< /a>";
echo preg_replace ($find, $replace, $data);
but the output is always the same as the original $data.
I think I have to see something relevent to my problem understand the basics.
Remove the extra [] around the (), and add + after the [0-9] to quantify it. Also, escape the [] that make up the tag itself.
$find = "/\[link=(\d+)\]/"; // "\d" is equivalent to "[0-9]"
$replace = "$1";
echo preg_replace($find,$replace,$data);
The regex would be \[link=([\d]+)\]
A good source for an quick overview of regular expression can you find here http://www.regular-expressions.info/
When you really interested in the power of regular expression, you should buy this book: Mastering Regular Expressions
A good Programm to test your RexEx on a Windows Client is: RegEx-Trainer
You are missing the + quantifier and as a result of this your pattern matches if there is a single digit following link=.
And there is an extra pair of [..] as a result of this the outer [...] will be treated as the character class.
You also forgot the escape the closing ].
Solution:
$find = "/[link=([0-9]+)\]/";
<?php
$data= "A bunch of text [link=123] another bunch of text.";
$find = '/\[link=([0-9]+?)\]/';
echo preg_replace($find, "$1", $data);
I am trying to parse a badly formed html table:
A couple of lines of this are:
Food:</b> Yes<b><br>
Pool: </b>Beach<b></b><b><br>
Centre:</b> Yes<b><br>
After spending a lot of time on this with Xpath, I think it is probably better to split the above text into lines use preg_split and parse from there.
The pattern I think would work uses:
<\b><\br>*: <\b>
my code is as follows:
$pattern='</b></br>*:</b>';
$pattern=preg_quote($pattern,'#');
$chars = preg_split($pattern, $output);
print_r($chars);
I am getting the following error:
Delimiter must not be alphanumeric or backslash
What I am doing wrong?
Try this:
$pattern='</b></br>*:</b>';
$pattern=preg_quote($pattern,'#');
$chars = preg_split('#'.$pattern.'#', $output);
print_r($chars);
The preg_quote function just makes it safely escaped, it doesn't actually add the delimiters for you.
As other people will surely point out, using regular expressions is not a good way to parse HTML :)
Your regular expression is also not going to match what you hope. Here's a version that will probably work for your input:
$in = " Pool: </b>Beach<b></b><b><br>";
$out = explode(':', strip_tags($in));
$key = trim($out[0]);
$value = trim($out[1]);
echo "$key = $value\n";
This removes all the HTML, then splits on the colon, and then removes any surrounding whitespace.
Your pattern needs to start and end with a delimiter; looks like you're using # if I'm reading this correctly, so you should have $pattern = '#</b></br>.*:</b>#';.
Also, you're mixing things up; * is not a simple wildcard in regex. If you mean "any number of any characters," the pattern you need is .*. I've included this above.
OK,I know that I should use a DOM parser, but this is to stub out some code that's a proof of concept for a later feature, so I want to quickly get some functionality on a limited set of test code.
I'm trying to strip the width and height attributes of chunks HTML, in other words, replace
width="number" height="number"
with a blank string.
The function I'm trying to write looks like this at the moment:
function remove_img_dimensions($string,$iphone) {
$pattern = "width=\"[0-9]*\"";
$string = preg_replace($pattern, "", $string);
$pattern = "height=\"[0-9]*\"";
$string = preg_replace($pattern, "", $string);
return $string;
}
But that doesn't work.
How do I make that work?
PHP is unique among the major languages in that, although regexes are specified in the form of string literals like in Python, Java and C#, you also have to use regex delimiters like in Perl, JavaScript and Ruby.
Be aware, too, that you can use single-quotes instead of double-quotes to reduce the need to escape characters like double-quotes and backslashes. It's a good habit to get into, because the escaping rules for double-quoted strings can be surprising.
Finally, you can combine your two replacements into one by means of a simple alternation:
$pattern = '/(width|height)="[0-9]*"/i';
Your pattern needs the start/end pattern character. Like this:
$pattern = "/height=\"[0-9]*\"/";
$string = preg_replace($pattern, "", $string);
"/" is the usual character, but most characters would work ("|pattern|","#pattern#",whatever).
I think you're missing the parentheses (which can be //, || or various other pairs of characters) that need to surround a regular expression in the string. Try changing your $pattern assignments to this form:
$pattern = "/width=\"[0-9]*\"/";
...if you want to be able to do a case-insensitive comparison, add an 'i' at the end of the string, thus:
$pattern = "/width=\"[0-9]*\"/i";
Hope this helps!
David