str_replace on string only if it INCLUDES a whitespace - php

I have a substitution code that replaces all instances of XD with an XD smiley face... thing is, should a link include the string 'XD', it then breaks the link.
I want it to only replace the XD, if it is followed by a whitespace, as in 'XD ', except I can't seem to get it to work (tried &nbsp, /\s/ and as in 'XD&nbsp')
Chances are I'm getting something really obvious wrong, but I can't find any help (all of it seems to be about removing whitespace, not requiring it), so I'm hoping someone can help me.
Here's the code for reference:
function BB_CODE($content) {
$content = str_replace("XD", "<img src=\"images/smilies/icon_xd.gif\" alt=\"XD\">", $content);
}
The content is user input. Thanks for any help!

You should surround "XD" with %
$content= str_replace("%XD%", "<img src=\"images/smilies/icon_xd.gif\" alt=\"XD\">", $content);
EDIT :
Or using preg_replace
preg_replace("/XD/", "<img src=\"images/smilies/icon_xd.gif\" alt=\"XD\">", $content);

So you want to replace XD only if it's alone?
preg_replace('/\bXD\b/', '(ಠ‿ಠ)', "Then I was like XD")
Use \b to watch for word boundaries instead of \s. This means it works at the beginning and the end of string too, like in my example.
With preg_replace() there are two common gotchas:
The separator char, I used / here by convention but in my own code I prefer %. You could write the regex as '%\bXD\b%' with the same meaning.
Escaping the backslashes, I used a single quoted string so I don't have to escape the backslash in \b. If you use double qoutes, you have to escape it, like so: "/\\bXD\\b/"

Related

Using regex replace a string with the same string with quotes (") removed using php

I have a multi-line string shown below -
?a="text1
?bc="text23
I need to identify a pattern like using below regex
'/[?][a-z^A-Z]+[=]["]/'
and replace my string by just remove the double quote (") in it, expected output is shown below
?a=text1
?b=text23
Please help in solving the above issue using php.
Capture everything except the quote in a capture group () and replace:
$string = preg_replace('/([?][a-z^A-Z]+[=])["]/', '$1', $string);
But you really don't need all those character classes []:
/(\?[a-z^A-Z]+=)"/
I will give another solution because i see the php tag also. So let's say you have these:
$a='"text1';
$b='"text2';
if i echo them i get
"text1
"text2
in order to get rid of double quote there is a function trim in php that you can use like that:
echo trim($a,'"');
echo trim($b,'"');
the results will be
text1
text2
I dont think you need regex in this occasion as long as you use php. Php can take care of those small things without bother with complex regex expressions.

Regex extra spaces in string not in double or single quotes - PHP

I would like to replace extra spaces (instances of consecutive whitespace characters) with one space, as long as those extra spaces are not in double or single quotes (or any other enclosures I may want to include).
I saw some similar questions, but I could not find a direct response to my needs above. Thank you!
Hope you're still looking, or come back to check! This seems to work for me:
'/\s+((["\']).*?(?=\2)\2)|\s\s+/'
...and replace with $1
EDIT
Also, if you need to allow for escaped quotes like \" or \', you could use this expression:
'/\s+((["\'])(\\\\\2|(?!\2).)*?(?=\2)\2)|\s\s+/'
It gets a bit stickier if you want to add support for "balanced" quotes like brackets (e.g. () or {})
END EDIT
Let me know if you find problems or would like some explanation!
HOPEFULLY FINAL EDIT AND WARNINGS
Potential problem: If a quoted string starts at the beginning of the string variable (or file), it will either not count as a quoted string (and have any whitespace reduced) or it will throw off the whole thing, making anything NOT in quotes get treated as though it was in quotes and vice versa -
A potential change that might remedy this is to use the following match expression
/(?:^|\s+)((["\'])(\\\\\2|(?!\2).)*?(?=\2)\2)|\s\s+/
this replaces \s+ with (?:^|\s+) at the beginning of the expression
this will add a space at the beginning of the variable if the string starts with a quote - just trim() or remove that whitespace to continue
I seem to have used the "line by line" approach (like sed, if I'm not mistaken) to reach my original results - if you use the "whole file" or "whole string" setting or approach, carriage-return-line-feed seems to count as two whitespace characters (can't imagine why...), thus turning any newlines into single spaces (unless they are inside quotes and "dot-matches-newline" is used, of course)
this could be resolved by replacing the . and \s shorthand character classes with the specific characters you want to match, like the following:
/(?:^|[ \t]+)((["\'])(\\\\\2|(?!\2)[\s\S])*?(?=\2)\2)|[ \t]{2,}/
this does not require the dot-matches-newline switch and only replaces multiple spaces or tabs - not newlines - with a single space (and of course, only if they are not quoted)
EXAMPLE
This link shows an example of the first expression and last expression in use on sample text on http://codepad.viper-7.com
You could do it in several steps. Consider the following example:
$str = 'This is a string with "Bunch of extra spaces". Leave them "untouched !".';
$id = 0;
$buffer = array();
$str = preg_replace_callback('|".*?"|', function($m) use (&$id, &$buffer) {
$buffer[] = $m[0];
return '__' . $id++;
}, $str);
$str = preg_replace('|\s+|', ' ', $str);
$str = preg_replace_callback('|__(\d+)|', function($m) use ($buffer) {
return $buffer[$m[1]];
}, $str);
echo $str;
This will output the string:
This is a string with "Bunch of extra spaces". Leave them "untouched !".
Although this is is not the prettiest solution.

Regex pattern matching literal repeated \n

Given a literal string such as:
Hello\n\n\n\n\n\n\n\n\n\n\n\nWorld
I would like to reduce the repeated \n's to a single \n.
I'm using PHP, and been playing around with a bunch of different regex patterns. So here's a simple example of the code:
$testRegex = '/(\\n){2,}/';
$test = 'Hello\n\n\n\n\n\n\n\n\nWorld';
$test2 = preg_replace($testRegex ,'\n',$test);
echo "<hr/>test regex<hr/>".$test2;
I'm new to PHP, not that new to regex, but it seems '\n' conforms to special rules. I'm still trying to nail those down.
Edit: I've placed the literal code I have in my php file here, if I do str_replace() I can get good things to happen, but that's not a complete solution obviously.
To match a literal \n with regex, your string literal needs four backslashes to produce a string with two backlashes that’s interpreted by the regex engine as an escape for one backslash.
$testRegex = '/(\\\\n){2,}/';
$test = 'Hello\n\n\n\n\n\n\n\n\n\n\n\nWorld';
$test2 = preg_replace($testRegex, '\n', $test);
Perhaps you need to double up the escape in the regular expression?
$pattern = "/\\n+/"
$awesome_string = preg_replace($pattern, "\n", $string);
Edit: Just read your comment on the accepted answer. Doesn't apply, but is still useful.
If you're intending on expanding this logic to include other forms of white-space too:
$output = echo preg_replace('%(\s)*%', '$1', $input);
Reduces all repeated white-space characters to single instances of the matched white-space character.
it indeed conforms to special rules, and you need to add the "multiline"-modifier, m. So your pattern would look like
$pattern = '/(\n)+/m'
which should provide you with the matches. See the doc for all modifiers and their detailed meaning.
Since you're trying to reduce all newlines to one, the pattern above should work with the rest of your code. Good luck!
Try this regular expression:
/[\n]*/

regex with special characters?

i am looking for a regex that can contain special chracters like / \ . ' "
in short i would like a regex that can match the following:
may contain lowercase
may contain uppercase
may contain a number
may contain space
may contain / \ . ' "
i am making a php script to check if a certain string have the above or not, like a validation check.
The regular expression you are looking for is
^[a-z A-Z0-9\/\\.'"]+$
Remember if you are using PHP you need to use \ to escape the backslashes and the quotation mark you use to encapsulate the string.
In PHP using preg_match it should look like this:
preg_match("/^[a-z A-Z0-9\\/\\\\.'\"]+$/",$value);
This is a good place to find the regular expressions you might want to use.
http://regexpal.com/
You can always escape them by appending a \ in front of the special characters.
try this:
preg_match("/[A-Za-z0-9\/\\.'\"]/", ...)
NikoRoberts is 100% correct.
I would only add the following suggestion: When creating a PHP regex pattern string, always use: single-quotes. There are far fewer chars which need to be escaped (i.e. only the single quote and the backslash itself needs to be escaped (and the backslash only needs to be escaped if it appears at the end of the string)).
When dealing with backslash soup, it helps to print out the (interpreted) regex string. This shows you exactly what is being presented to the regex engine.
Also, a "number" might have an optional sign? Yes? Here is my solution (in the form of a tested script):
<?php // test.php 20110311_1400
$data_good = 'abcdefghijklmnopqrstuvwxyzABCDE'.
'FGHIJKLMNOPQRSTUVWXYZ0123456789+- /\\.\'"';
$data_bad = 'abcABC012~!###$%^&*()';
$re = '%^[a-zA-Z0-9+\- /\\\\.\'"]*$%';
echo($re ."\n");
if (preg_match($re, $data_good)) {
echo("CORRECT: Good data matches.\n");
} else {
echo("ERROR! Good data does NOT match.\n");
}
if (preg_match($re, $data_bad)) {
echo("ERROR! Bad data matches.\n");
} else {
echo("CORRECT: Bad data does NOT match.\n");
}
?>
The following regex will match a single character that fits the description you gave:
[a-zA-Z0-9\ \\\/\.\'\"]
If your point is to insure that ONLY characters in this range of characters are used in your string, then you can use the negation of this which would be:
[^a-zA-Z0-9\ \\\/\.\'\"]
In the second case, you could use your regex to find the bad stuff (that you don't want to be included), and if it didn't find anything then your string pattern must be kosher, because I'm assuming that if you find one character that is not in the proper range, then your string is not valid.
so to put it in PHP syntax:
$regex = "[^a-zA-Z0-9\ \\\/\.\'\"]"
if preg_match( $regex, ... ) {
// handle the bad stuff
}
Edit 1:
I've completely ignored the fact that backslashes are special in php double-quoted strings, so here is a correcting to the above code:
$regex = "[^a-zA-Z0-9\\ \\\\\\/\\.\\'\\\"]"
If that doesn't work it shouldn't take too much for someone to debug how many of the backslashes need to be escaped with a backslash, and what other characters need also to be escaped....

PHP Regex No Backslash

So I hadn't done any regexps for a while, so I thought I'd brush up on my memory. I'm trying to convert a string like a*b*c into a<b>b</b>c. I've already gotten that working, but now I want to keep a string like a\*b\*c from turning into a\<b>b\</b>c, but rather, into a*b*c. Here's the code I'm using now:
$string = preg_replace("/\*([\s\S]*?)\*/", "<b>$1</b>", $input);
I've tried putting this \\\\{0} in before the asterisks, and that didn't work. Neither did [^\\\\].
Try negative lookbehind:
"/(?<!\\\\)\*([\s\S]*?)(?<!\\\\)\*/"
This only matches a * if it's not preceded by a \.
This is brittle, though; it would also fail if the string is escaped backslash \\*bold* text.

Categories