I want to replace text using preg_replace, but my search string contains a /, which causes a problem.
How can I solve it?
$search='r/trtrt';
echo preg_replace('/\b'.addslashes($search).'\b/', 'ERTY', 'TG FRT');
I am getting the error preg_replace(): Unknown modifier 'T'
Use a different delimiter and don't use addslashes. addslashes escapes quotes, backslashes and NUL bytes, which is a mix of regex and non-regex special characters, so most of the time it is the wrong tool for building a pattern.
$search='r/trtrt';
echo preg_replace('~\b'. $search.'\b~', 'ERTY', 'TG FRT');
You could use preg_quote as an alternative. Just changing the delimiter is the easiest solution though.
Use ~ as the delimiter:
$search='r/trtrt';
echo preg_replace('~\b'.addslashes($search).'\b~', 'ERTY', 'TG FRT');
I always use ~ because it is one of the least used characters in strings, but you can use any non-alphanumeric, non-backslash, non-whitespace character as the delimiter, and then you won't need to escape the / in your pattern.
You don't need addslashes() in your case, but if you have a more complex regexp and want to escape its special characters, you should use preg_quote($search).
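Changing the delimiter only removes the clash with /. If the search string can also contain regex metacharacters, they still need quoting. A minimal sketch, assuming a made-up search string and subject (neither is from the question):

```php
<?php
// Hypothetical values for illustration only.
$search  = 'a.b';
$subject = 'axb a.b';

// Delimiter changed to ~, but the dot still means "any character".
$unquoted = preg_replace('~\b' . $search . '\b~', 'X', $subject);

// preg_quote() escapes the dot (and the ~ delimiter, if it occurred).
$quoted = preg_replace('~\b' . preg_quote($search, '~') . '\b~', 'X', $subject);

echo $unquoted, "\n"; // X X
echo $quoted, "\n";   // axb X
```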
Why not escape it the way it is meant to be done?
$search='r/trtrt';
echo preg_replace('/\b'.preg_quote($search, '/').'\b/', 'ERTY', 'TG FRT');
http://php.net/manual/en/function.preg-quote.php
preg_quote() takes str and puts a backslash in front of every
character that is part of the regular expression syntax. This is
useful if you have a run-time string that you need to match in some
text and the string may contain special regex characters
delimiter
If the optional delimiter is specified, it will also be escaped.
This is useful for escaping the delimiter that is required by the PCRE
functions. The / is the most commonly used delimiter.
addslashes is not the function to use here. It escapes quotes, backslashes and NUL bytes; it does nothing for the other special characters in a regex.
The special regular expression characters are: . \ + * ? [ ^ ] $ ( ) { } = ! < > | : -
Using the proper functions promotes readability. If at some later point you or another coder see the ~ delimiter, they may just think it's part of a personal "style" or pay it little attention. Seeing the input properly escaped, however, tells any experienced coder that the input could contain characters that conflict with regular expressions.
Personally, readability is at the top of my list whenever I write code. If you can't understand it at a glance, what good is it?
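To make the difference concrete, here is a small sketch comparing what the two functions return for the search string from the question. The subject string is made up so that a match actually occurs.

```php
<?php
$search = 'r/trtrt';

// addslashes() only touches quotes, backslashes and NUL bytes,
// so the slash is left alone and would still end the pattern early.
var_dump(addslashes($search));   // string(7) "r/trtrt"

// preg_quote() with '/' as the delimiter argument escapes the slash.
$quoted = preg_quote($search, '/');
var_dump($quoted);               // string(8) "r\/trtrt"

// Hypothetical subject containing the search string as a whole word.
$result = preg_replace('/\b' . $quoted . '\b/', 'ERTY', 'ab r/trtrt cd');
echo $result, "\n";              // ab ERTY cd
```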
Related
I'm a noob at regular expressions.
I would like to restrict the special characters allowed in a form.
The authorized characters are:
^#{}()<>|_æ+#%.,?:;"~\/=*$£€!
I wrote a preg_match rule, but it causes problems:
if(preg_match("#[^#{}()<>|_æ+#%.,?:;"~\/=*$£€!]+#",$input)) $error=1;
I know that I should escape the special characters, but I don't know how to achieve this.
Can you help me, please?
Thanks in advance.
You can use
preg_match('/[^#{}()<>|_æ+#%.,?:;"~\/=*$£€!]+/u', $input)
Note:
Using double quotation marks inside a single-quoted string literal avoids extra escaping.
When you use a specific char as a regex delimiter char, here you used #, you must escape this char inside the pattern.
Note that # is always safe to escape, since it is a regex metacharacter when the x flag enables comment/verbose/free-spacing mode (the name varies across regex references and libraries).
Also, since you are using characters outside ASCII, it is a good idea to add the u flag so that the pattern and the input are treated as UTF-8.
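As a rough sketch of the delimiter note, the same character class can be written with either delimiter; only the character that needs a backslash changes. The input strings are made up for illustration.

```php
<?php
// Matches any run of characters that are NOT in the listed set.
$slashDelim = '/[^#{}()<>|_æ+#%.,?:;"~\/=*$£€!]+/u';
// Same class with # as the delimiter: now # is escaped and / is not.
$hashDelim  = '#[^\#{}()<>|_æ+\#%.,?:;"~/=*$£€!]+#u';

// Hypothetical inputs.
$onlyListed = '#{}€æ!';
$hasOther   = 'abc#';

var_dump(preg_match($slashDelim, $onlyListed)); // int(0)
var_dump(preg_match($hashDelim, $onlyListed));  // int(0)
var_dump(preg_match($slashDelim, $hasOther));   // int(1)
var_dump(preg_match($hashDelim, $hasOther));    // int(1)
```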
I've been trying to write a few functions based on regex, and most of them use \Q and \E because part of the pattern is user input.
So, let's say hypothetically that we're using the delimiter / and want to match a literal /; the function would construct something along the lines of /\Q/\E/.
I'm not sure why /\Q/\E/ doesn't match /, while it works with every other delimiter, unless the input contains that same delimiter.
Maybe it treats the delimiter as the end of the pattern even though it is inside a literal-only block, and treats the escape as a literal. I'm not sure; I've tried a bunch of things.
Hopefully someone can push me in the right direction as to what workarounds there are for this issue.
It helps to understand that / is not a regex metacharacter, like * or (. It's special because you're using it to delimit the regex itself, and the only way to escape the regex delimiter is with a backslash (\/).
But you shouldn't need to use \Q and \E. The preg_quote() function takes a delimiter argument, so it correctly adds backslashes everywhere they're needed.
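A minimal sketch of that approach follows. The helper name literalPattern is made up for illustration; it simply wraps preg_quote() so that user input is matched literally whatever delimiter is in use.

```php
<?php
// Hypothetical helper: quotes arbitrary input, including the delimiter,
// and wraps it in that delimiter.
function literalPattern(string $input, string $delimiter = '/'): string
{
    return $delimiter . preg_quote($input, $delimiter) . $delimiter;
}

echo literalPattern('/'), "\n";                      // /\//
var_dump(preg_match(literalPattern('/'), 'a/b'));    // int(1)
var_dump(preg_match(literalPattern('a.b'), 'axb'));  // int(0)
var_dump(preg_match(literalPattern('a.b'), 'a.b'));  // int(1)
```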
As preface, I am new to (and really bad at) writing regular expressions.
I am trying to use a regular expression in the PHP function preg_split, and I want to split on
*
**
`
I'm having trouble because these characters have a special meaning in regular expressions. How can I write a regular expression to do this?
For PCRE and other so-called compatible flavors, you must escape these characters outside character classes:
. ^ $ * + ? () [ { \ |
The backtick has no special meaning, so you don't need to escape it.
preg_split('/\*{1,2}|`/', $text);
Note: For future reference, you may want to look into using preg_quote()
preg_quote() takes str and puts a backslash in front of every character that is part of the regular expression syntax. This is useful if you have a run-time string that you need to match in some text and the string may contain special regex characters.
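As a sketch of how preg_quote() could be used here, the split pattern can be built from a plain list of delimiters. The input string is made up for illustration.

```php
<?php
// Longer delimiters come first so that ** is consumed as one token.
$delimiters = ['**', '*', '`'];

// Quote each delimiter for use inside a /.../ pattern.
$quoted = array_map(function ($d) {
    return preg_quote($d, '/');
}, $delimiters);

$pattern = '/' . implode('|', $quoted) . '/';
echo $pattern, "\n"; // /\*\*|\*|`/

// Hypothetical input string.
$parts = preg_split($pattern, 'a*b**c`d');
print_r($parts);     // a, b, c, d
```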
preg_split('/(?:\*{1,2}|`)/', $string);
Ok so I managed to solve a problem at work with regex, but the solution is a bit of a monster.
The string to be validated must be:
zero or more: A-Z a-z 0-9, spaces, or these symbols: . - = + ' , : ( ) /
But, the first and/or last characters must not be a forward slash /
This was my solution (used preg_match php function):
"/^[a-z\d\s\.\-=\+\',:\(\)][a-z\d\s\.\-=\+\',\/:\(\)]*[a-z\d\s\.\-=\+\',:\(\)]$|^[a-z\d\s\.\-=\+\',:\(\)]$/i"
A colleague thinks this is too big and complicated. Well it works, so is it really that bad? Anyone in the mood for some regex-golf?
You can simplify your expression to this:
/^(?:[a-z\d\s.\-=+',:()]+(?:\/+[a-z\d\s.\-=+',:()]+)*)?$/i
The outer (?:…)? allows an empty string. [a-z\d\s.\-=+',:()]+ requires the string to start with one or more of the allowed characters other than /. If one or more / follow, they must in turn be followed by one or more of the other allowed characters ((?:\/+[a-z\d\s.\-=+',:()]+)*), so the string can neither start nor end with a slash. Note that the / has to be escaped here because it is also the delimiter.
Furthermore, inside a character set, you only need to escape the characters \, ], and depending on the position also ^ and -.
Try something like this instead
function validate($string) {
    // Anchored so the whole string is checked; # is the delimiter, so / needs no escaping.
    return preg_match('#^[a-zA-Z0-9 .\-=+\',:()/]*$#', $string) === 1
        && substr($string, 0, 1) !== '/'
        && substr($string, -1) !== '/';
}
It's a lot simpler to check the first and last character specifically. Otherwise you're left dealing with a lot of overhead when it comes to empty strings and such. Your regex, for example, requires the string to be at least one character long, even though "" fits your criteria.
'#^(?!/)[a-z\d .=+\',:()/-]*$(?<!/)#i'
As others have observed, most of those characters don't need to be escaped inside a character class. Additionally, the hyphen doesn't need to be escaped if it's the last thing listed, and the slash doesn't need to be escaped if you use a different character as the regex delimiter (# in this case, but ~ is a popular choice, too).
I also ditched the double-quotes in favor of single-quotes, which meant I had to escape the single-quote in the regex. That's worth it because single-quoted strings are so much simpler to work with: no $variable interpolation, no embedded executable {code}, and the only characters you have to escape for them are the single-quote and the backslash.
But the main innovation here is the use of lookahead and lookbehind to exclude the slash as the first or last character. That's not just a code-golf tactic, either; I would write the regex this way anyway, because it expresses my intent so much better. Why force the next guy to parse those almost-identical character classes, when you can just say what you mean? "...but the first and last character can't be slashes."
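A quick way to check the lookaround version is to run it against a handful of inputs. The test strings below are made up; the pattern is the one given above.

```php
<?php
$pattern = '#^(?!/)[a-z\d .=+\',:()/-]*$(?<!/)#i';

// Hypothetical inputs mapped to the expected preg_match() result.
$cases = [
    'abc/def' => 1, // slash in the middle is fine
    ''        => 1, // empty string is allowed
    '/abc'    => 0, // rejected by the lookahead
    'abc/'    => 0, // rejected by the lookbehind
    'a_b'     => 0, // underscore is not in the class
];

foreach ($cases as $input => $expected) {
    $actual = preg_match($pattern, $input);
    printf("%-10s %d\n", var_export($input, true), $actual);
}
```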
^([a-zA-Z0-9!@#$%^&*|()_\-+=\[\]{}:;\"',<.>?\/~`]{4,})$
Would this regular expression work for these rules?
Must be at least 4 characters
Characters can be a mix of letters (upper or lower case), digits, and the following characters: ! @ # $ % ^ & * ( ) _ - + = | [ { } ] ; : ' " , < . > ? /
It's intended to be a password validator. The language is PHP.
Yes?
Honestly, what are you asking for? Why don't you test it?
If, however, you want suggestions on improving it, some questions:
What is this regex checking for?
Why do you have such a large set of allowed characters?
Why don't you use /\w/ instead of /0-9a-zA-Z_/?
Why do you have the whole thing in ()s? You don't need to capture the whole thing, since you already have the whole thing, and they aren't needed to group anything.
What I would do is check the length separately, and then check against a regex to see if it has any bad characters. Your list of good characters seems to be sufficiently large that it might just be easier to do it that way. But it may depend on what you're doing it for.
EDIT: Now that I know this is PHP-centric, /\w/ is safe because PHP uses the PCRE library, which is not exactly Perl, and in PCRE, \w will not match Unicode word characters. Thus, why not check for length and ensure there are no invalid characters:
if (strlen($string) >= 4 && preg_match('/[\s~\\\\]/', $string) === 0) {
# valid password
}
Alternatively, use the little-used POSIX character class [[:graph:]]. It should work pretty much the same in PHP as it does in Perl. [[:graph:]] matches any alphanumeric or punctuation character, which sounds like what you want, and [[:^graph:]] should match the opposite. To test if all characters match graph:
preg_match('/^[[:graph:]]+$/', $string) == 1
To test if any characters don't match graph:
preg_match('/[[:^graph:]]/', $string) == 0
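Put together, the length check and the [[:graph:]] test might look like the sketch below. The function name isValidPassword is made up, and the example assumes ASCII input without the u flag.

```php
<?php
// Hypothetical helper combining the length check with [[:graph:]].
function isValidPassword(string $password): bool
{
    // Valid when long enough and no character falls outside [[:graph:]].
    return strlen($password) >= 4
        && preg_match('/[[:^graph:]]/', $password) === 0;
}

var_dump(isValidPassword('ab!9')); // bool(true)
var_dump(isValidPassword('ab 9')); // bool(false), contains a space
var_dump(isValidPassword('ab!'));  // bool(false), too short
```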
You forgot the comma (,) and full stop (.) and added the tilde (~) and grave accent (`) that were not part of your specification. Additionally just a few characters inside a character set declaration have to be escaped:
^([a-zA-Z0-9!@#$%^&*()|_\-+=[\]{}:;"',<.>?/~`]{4,})$
And that as a PHP string declaration for preg_match:
'/^([a-zA-Z0-9!@#$%^&*()|_\\-+=[\\]{}:;"\',<.>?\\/~`]{4,})$/'
I noticed that you essentially have all of ASCII, except for backslash, space and the control characters at the start, so what about this one, instead?
^([!-\[\]-~]{4,})$
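To see what the two ranges cover, here is a short sketch with made-up inputs. The capture group is dropped and / delimiters are added so the pattern can be passed to preg_match.

```php
<?php
// ! (0x21) through [ (0x5B), then ] (0x5D) through ~ (0x7E):
// printable ASCII without space and backslash.
$pattern = '/^[!-\[\]-~]{4,}$/';

var_dump(preg_match($pattern, 'a@b[c]')); // int(1)
var_dump(preg_match($pattern, 'ab cd'));  // int(0), contains a space
var_dump(preg_match($pattern, 'ab\\cd')); // int(0), contains a backslash
var_dump(preg_match($pattern, 'abc'));    // int(0), too short
```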
You are escaping more than necessary and aren't using some of the predefined character classes (such as \w, or at least \d).
Apart from that, and given that you are anchoring at both the start and the end (so the regex only matches if the whole string consists of allowed characters), it looks correct:
^([a-zA-Z\d\-!$@#$%^&*()|_+=\[\]{};,."'<>?/~`]{4,})$
If you really mean to use this as a password validator, it reeks of insecurity:
Why are you allowing 4-character passwords?
Why are you forbidding some characters? Can PHP not handle some of them? Why would you care? Let users enter whatever characters they please; after all, you'll just end up storing a salted hash of the password.
No. That regular expression would not work for the rules you state, for the simple reason that $ by default matches before the final character if it is a newline. You are allowing password strings like "1234\n".
The solution is simple. Either use \z instead of $, or apply the D modifier to the regex.
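The difference is easy to demonstrate. The sketch below uses a deliberately simplified character class (an assumption, not the full class from the question) so that only the anchor varies.

```php
<?php
$password = "1234\n"; // note the trailing newline

// Plain $ also matches just before a final newline, so this is accepted.
var_dump(preg_match('/^[a-z\d]{4,}$/', $password));  // int(1)

// \z only matches at the very end of the string.
var_dump(preg_match('/^[a-z\d]{4,}\z/', $password)); // int(0)

// The D modifier makes $ behave like \z.
var_dump(preg_match('/^[a-z\d]{4,}$/D', $password)); // int(0)
```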