I want to use preg_match and regular expression in PHP to check that a string starts with either "+44" or "0", but how can I do this without the + being read as matching the preceding character once or more? Would (+44|0) work?
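Use ^ to anchor the match at the start of the string and a backslash \ to escape the + character. So you'd check for
^\+44|^0
Note that there are no spaces around the |: whitespace in a pattern is matched literally unless the x modifier is set.
In PHP, you don't need to double the backslash when storing the regexp in a string if you use single quotes (preg_match also needs delimiters around the pattern, here /):
$regexp = '/^\+44|^0/';
In fact, double quotes work here too, because PHP leaves the unrecognised escape sequence \+ untouched:
$regexp = "/^\+44|^0/";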
The backslash is the default escape character for regular expressions. You may have to escape the backslash itself as well if it is used in a PHP string, so you'd use something like "(\\+44|0)" as string constant. The regular expression itself would then be (\+44|0).
You can do it in several ways; I know of two.
One is escaping the + with the escape character (i.e. a backslash):
^(\+44|0)
The other is placing the + inside a character class [], where it is taken literally:
^([+]44|0)
^ is the anchor that matches the start of the string, or the start of each line if the m modifier is set.
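Putting the answers above together, a minimal sketch might look like the following. The function name startsWithUkPrefix and the sample numbers are made up for illustration; only the pattern comes from the answers.

```php
<?php
// Hypothetical helper name; the pattern is the one suggested above.
function startsWithUkPrefix(string $number): bool
{
    // Single quotes keep the backslash intact; / is the pattern delimiter.
    return preg_match('/^(\+44|0)/', $number) === 1;
}

var_dump(startsWithUkPrefix('+447911123456')); // bool(true)
var_dump(startsWithUkPrefix('07911123456'));   // bool(true)
var_dump(startsWithUkPrefix('447911123456'));  // bool(false)
```

The === 1 comparison is deliberate: preg_match returns 1 on a match, 0 on no match, and false on error, so a loose truthiness check would hide a broken pattern.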
I am trying to escape a string for use in a regular expression in PHP. So far I tried:
preg_quote(addslashes($string));
I thought I need addslashes in order to properly account for any quotes that are in the string. Then preg_quote escapes the regular expression characters.
However, the problem is that quotes are escaped with backslash, e.g. \'. But then preg_quote escapes the backslash with another one, e.g. \\'. So this leaves the quote unescaped once again. Switching the two functions does not work either because that would leave an unescaped backslash which is then interpreted as a special regular expression character.
Is there a function in PHP to accomplish the task? Or how would one do it?
The proper way is to use preg_quote and specify the used pattern delimiter.
preg_quote() takes str and puts a backslash in front of every character that is part of the regular expression syntax... characters are: . \ + * ? [ ^ ] $ ( ) { } = ! < > | : -
Trying to use a backslash as delimiter is a bad idea. Usually you pick a character, that's not used in the pattern. Commonly used is slash /pattern/, tilde ~pattern~, number sign #pattern# or percent sign %pattern%. It is also possible to use bracket style delimiters: (pattern)
Your regex, with the modifications mentioned in the comments by @CasimiretHippolyte and @anubhava:
$pattern = '/(?<![a-z])' . preg_quote($string, "/") . '/i';
Maybe you wanted to use the \b word boundary instead of the lookbehind. No additional escaping is needed.
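As a rough illustration of why the delimiter argument matters, here is a small sketch. The value of $string is a made-up sample standing in for arbitrary run-time input.

```php
<?php
// $string stands in for arbitrary run-time input (hypothetical value).
$string = 'a.b/c';

// The second argument tells preg_quote() to escape the delimiter as well.
$quoted = preg_quote($string, '/');
echo $quoted, "\n"; // a\.b\/c

$pattern = '/(?<![a-z])' . $quoted . '/i';
var_dump(preg_match($pattern, 'see a.b/c here')); // int(1)
var_dump(preg_match($pattern, 'see axb/c here')); // int(0): the dot is literal
```

Without the second argument the / in the input would be left unescaped and would end the pattern early.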
I would like to get everything between two stars, except if they have a leading backslash.
So for example:
*hello* world
should return "hello", but
*hello \* world*
should return "hello * world"
I tried the following regex:
/(?<!\\)\*(.+?)(?<!\\)\*/s
which works perfectly on http://regex101.com/ but PHP returns:
Warning: preg_replace(): Compilation failed: missing ) at offset 21
What am I doing wrong?
--
EDIT 1:
Here's my PHP-Code for that:
var_dump(preg_replace('/(?<!\\)\*(.+?)(?<!\\)\*/s', '<strong>$1</strong>', '*hello world*'));
You are not escaping the backslashes correctly which results in escaping the ) character.
To match a \ in PHP you need 4 backslashes
/(?<!\\\\)\*(.+?)(?<!\\\\)\*/s
It must be done like this because every backslash in a C-like string
must be escaped by a backslash. That would give us a regular
expression with 2 backslashes, as you might have assumed at first.
However, each backslash in a regular expression must be escaped by a
backslash, too. This is the reason that we end up with 4 backslashes.
A character class does not save any backslashes here, because in PCRE the
backslash is still an escape character inside a character class
/(?<![\\\\])\*(.+?)(?<![\\\\])\*/s
With only two backslashes in the PHP string the regex engine receives
[\], where \] is an escaped bracket, so the class is never closed and
compilation fails with a "missing terminating ]" error. The backslash
therefore has to be escaped for the regex engine inside the class as
well, and once more for the PHP string.
Edit Found this article which explains why this is necessary. Also, added explanations.
You can use this regex:
\*(.*?(?<!\\))\*
Working demo
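For reference, a short sketch of the four-backslash version in a single-quoted PHP string, using the two sample inputs from the question:

```php
<?php
// Four backslashes in the PHP literal become two in the regex,
// which the regex engine reads as one literal backslash.
$pattern = '/(?<!\\\\)\*(.+?)(?<!\\\\)\*/s';

echo preg_replace($pattern, '<strong>$1</strong>', '*hello* world'), "\n";
// <strong>hello</strong> world

echo preg_replace($pattern, '<strong>$1</strong>', '*hello \* world*'), "\n";
// <strong>hello \* world</strong>
```

Note that the escaped star keeps its backslash in the output; removing it would be a separate step.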
I am trying to learn regex in PHP and am stuck here. My question may appear silly, but please do explain.
I went through a link:
Extra backslash needed in PHP regexp pattern
But I just could not understand something:
In the answer he mentions two statements:
2 backslashes are used for unescaping in a string ("\\\\" -> \\)
1 backslash is used for unescaping in the regex engine (\\ -> \)
My ques:
What does the word "unescaping" actually mean? What is the purpose of unescaping?
Why do we need 4 backslashes to include it in the regex?
The backslash has a special meaning in both regexen and PHP. In both cases it is used as an escape character. For example, if you want to write a literal quote character inside a PHP string literal, this won't work:
$str = ''';
PHP would get "confused" which ' ends the string and which is part of the string. That's where \ comes in:
$str = '\'';
It escapes the special meaning of ', so instead of terminating the string literal, it is now just a normal character in the string. There are more escape sequences like \n as well.
This now means that \ is a special character with a special meaning. To escape this conundrum when you want to write a literal \, you'll have to escape literal backslashes as \\:
$str = '\\'; // string literal representing one backslash
This works the same in both PHP and regexen. If you want to write a literal backslash in a regex, you have to write /\\/. Now, since you're writing your regexen as PHP strings, you need to double escape them:
$regex = '/\\\\/';
One pair of \\ is first reduced to one \ by the PHP string escaping mechanism, so the actual regex is /\\/, which is a regex which means "one backslash".
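The two layers can be made visible with a few lines of PHP. This is only a sketch; the sample path strings are invented for the demonstration.

```php
<?php
$phpString = '\\';            // PHP literal with two backslashes
var_dump(strlen($phpString)); // int(1): PHP stores a single backslash

$regex = '/\\\\/';            // PHP literal with four backslashes
echo $regex, "\n";            // prints /\\/ which is what the regex engine receives

// 'C:\\temp' is the 7-character string C:\temp
var_dump(preg_match($regex, 'C:\\temp')); // int(1)
var_dump(preg_match($regex, 'C:/temp'));  // int(0)
```

Echoing the pattern before passing it to a preg_ function is a quick way to see what the regex engine will actually be given.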
I think you can use "preg_quote()":
http://php.net/preg_quote
This function escapes special chars, so you can give an input as it is, without escaping by yourself:
<?php
$string = "online 24/7. Only for \o/";
$escaped_string = preg_quote($string, "/"); // 2nd param is optional and used if you want to escape also the delimiter of your regex
echo $escaped_string; // $escaped_string: "online 24\/7\. Only for \\o\/"
?>
As preface, I am new to (and really bad at) writing regular expressions.
I am trying to use a regular expression in the PHP function preg_split, and am looking to split on
*
**
`
I'm having trouble because these characters are commands. How can I write a regular expression to do this?
For PCRE and other so-called compatible flavors, you must escape these outside character classes.
. ^ $ * + ? () [ { \ |
The backtick has no special meaning, so you don't need to escape it.
preg_split('/\*{1,2}|`/', $text);
See Demo
Note: For future reference, you may want to look into using preg_quote()
preg_quote() takes str and puts a backslash in front of every character that is part of the regular expression syntax. This is useful if you have a run-time string that you need to match in some text and the string may contain special regex characters.
preg_split('/(?:\*{1,2}|`)/', $string);
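A quick way to check the split pattern is to run it on a sample string. The input below is invented for the demonstration.

```php
<?php
// Sample input (hypothetical) using all three separators.
$text = 'one*two**three`four';

// \*{1,2} matches one or two stars; the backtick needs no escaping.
$parts = preg_split('/\*{1,2}|`/', $text);
print_r($parts); // one, two, three, four
```

If the text can start or end with a separator, passing the PREG_SPLIT_NO_EMPTY flag as the fourth argument drops the resulting empty strings.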
How do I put a period into a PHP regular expression?
The way it is used in the code is:
echo(preg_match("/\$\d{1,}\./", '$645.', $matches));
But apparently the period in that $645. doesn't get recognized. Requesting tips on how to make this work.
Since . is a special character, you need to escape it to match it literally, so write \. instead.
Remember to also escape the escape character if you want to use it in a string. So if you want to write the regular expression foo\.bar in a string declaration, it needs to be "foo\\.bar".
Escape it. The period has a special meaning within a regular expression in that it represents any character — it's a wildcard. To represent and match a literal . it needs to be escaped which is done via the backslash \, i.e., \.
/[0-9]\.[ab]/
Matches a digit, a period, and "a" or "b", whereas
/[0-9].[ab]/
Matches a digit, any single character (see note 1), and "a" or "b".
Be aware that PHP uses the backslash as an escape character in double-quoted string, too. In these cases you'll need to doubly escape:
$single = '\.';
$double = "\\.";
UPDATE
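The original call echo(preg_match("/\$\d{1,}\./", '$645.', $matches)); fails because inside double quotes PHP turns \$ into a plain $, which the regex engine then treats as the end-of-string anchor. It could be rewritten as echo(preg_match('/\$\d{1,}\./', '$645.', $matches)); or echo(preg_match("/\\$\\d{1,}\\./", '$645.', $matches));. Both work.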
1) Not linefeeds, unless configured via the s modifier.
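The difference between the escaped and the unescaped period can be checked with a short sketch like this one, which uses single-quoted patterns so that no PHP-level escaping interferes:

```php
<?php
// Single quotes: the backslashes reach the regex engine unchanged.
var_dump(preg_match('/\$\d{1,}\./', '$645.', $matches)); // int(1)
echo $matches[0], "\n"; // $645.

var_dump(preg_match('/[0-9]\.[ab]/', '1.a')); // int(1)
var_dump(preg_match('/[0-9]\.[ab]/', '1xa')); // int(0): x is not a period
var_dump(preg_match('/[0-9].[ab]/', '1xa'));  // int(1): . matches any character
```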