preg_match not working - php

I am trying to match a link inside some text:
$reg = '#ok is it http://google.com/?s=us#';
$page = 'Well i think ! ok is it http://google.com/?s=us&ui=pl0 anyways it ok';
if(preg_match($reg,$page)){
echo 'it work';
}else{
echo 'not work';
}
Now the problem is: if I use $reg = '#ok is it http://google.com/#'; it works, but when I use the one with "?s=us" it doesn't.
OK, I understand there is some problem with special characters. Is there a ready-made function that escapes these special characters automatically?

Your pattern contains regex metacharacters that are being interpreted instead of matched literally. You must escape special characters such as '.' and '?', like this:
'.' -> '\.'
'?' -> '\?'
...
Anyway, the regex should look like this (with # as the delimiter, the slashes need no escaping):
$reg = '#ok is it http://google\.com/\?s=us#';

Some characters are read as metacharacters by the regex engine, meaning that they have a special function within the pattern. A few examples are ? (question mark), \ (backslash), . (period) and * (asterisk).
Just as with strings containing special characters that you send to SQL, you need to escape these characters manually by putting a backslash in front of them: \. When escaping the \ character itself inside a PHP string, you may need to write it three or four times, like this: \\\ or \\\\.

Use:
$reg = '#ok is it http://google.com/\?s=us#';
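As for the ready-made function the question asks about, PHP's preg_quote() is one option. The following is a minimal sketch, assuming the whole search text is meant literally; the variable name $needle is only illustrative and does not come from the question.

```php
<?php
// Literal text to look for; it contains the metacharacters . and ?
$needle = 'ok is it http://google.com/?s=us';
$page   = 'Well i think ! ok is it http://google.com/?s=us&ui=pl0 anyways it ok';

// preg_quote() escapes regex metacharacters. The second argument
// additionally escapes the chosen delimiter (# here).
$reg = '#' . preg_quote($needle, '#') . '#';

echo $reg, "\n"; // shows the escaped pattern that is actually used
echo preg_match($reg, $page) ? 'it work' : 'not work', "\n";
```

This only fits when no part of the search text is supposed to act as a regex. If part of the pattern should stay "live", quote only the literal pieces and concatenate them.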

Related

Php regex how to add a backslash "\"?

I want to add a backslash "\" before all non alphanumeric characters like "how are you \:\)", so I used this:
$code = preg_replace('/([^A-Za-z0-9])/i', '\$1', $code);
But it doesn't work. Instead it just echos '\$1'. What am I doing wrong?
I also tried
$code = preg_replace('/([^A-Za-z0-9])/i', '\\$1', $code);
But that doesn't work either.
You need four backslashes:
$code = preg_replace('/([^A-Za-z0-9])/i', '\\\\$1', $code);
The reason is that the backslash escapes itself in PHP string context (even in single quotes). For PCRE to see even one, you need at least two. But so that it is not misinterpreted as escaping the replacement placeholder, you need to double that again. (Btw, three backslashes would also accidentally work.)
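A short sketch of the four-backslash version applied to the sample text from the question. The redundant /i flag is dropped here, and note that spaces are non-alphanumeric too, so they also get a backslash.

```php
<?php
$code = 'how are you :)';

// '\\\\$1' is the four-character string \\$1 once PHP has parsed it.
// The replacement engine reads \\ as one literal backslash, then $1 as group 1.
$escaped = preg_replace('/([^A-Za-z0-9])/', '\\\\$1', $code);

echo $escaped, "\n"; // how\ are\ you\ \:\)
```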
EXAMPLE:
<?php
$str = "Is your name O'reilly?";
// Outputs: Is your name O\'reilly?
echo addslashes($str);
?>

Regexes work in PHP and don't in Erlang. Why?

I tried to rewrite a URL parsing function written in PHP in Erlang, and I found that these regexes don't work in Erlang but work fine in the PHP code. Can you tell me why, and how to make them work in Erlang?
Loose = "^(?:(?![^:#]+:[^:#\/]*#)([^:\/?#.]+):)?(?:\/\/\/?)?((?:(([^:#]*):?([^:#]*))?#)?([^:\/?#]*)(?::(\d*))?)(((?:\/(\w:))?(\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)".
re:compile( Loose ).
{error,{"nothing to repeat",166}}
Strict = "^(?:([^:\/?#]+):)?(?:\/\/\/?((?:(([^:#]*):?([^:#]*))?#)?([^:\/?#]*)(?::(\d*))?))?(((?:\/(\w:))?((?:[^?#\/]*\/)*)([^?#]*))(?:\?([^#]*))?(?:#(.*))?)".
re:compile( Strict ).
{error,{"nothing to repeat",114}}
But this code works fine:
$url = "http://gazeta.ru/";
$loose = '/^(?:(?![^:#]+:[^:#\/]*#)([^:\/?#.]+):)?(?:\/\/\/?)?((?:(([^:#]*):?([^:#]*))?#)?([^:\/?#]*)(?::(\d*))?)(((?:\/(\w:))?(\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)/';
preg_match($loose, $url, $match);
var_dump( $match );
The character "\" is special in Erlang strings. It is the escape character: special characters such as the double quote and the backslash itself must be preceded by a backslash, a technique called escaping. So "\" must always be followed by another character. For example, if you want to include the character \ (one backslash) in a string, you should write "\\":
CorrectString = "C:\\windows" %% Correct
WrongString = "C:\windows" %% Wrong
Hence you have to change all single backslashes in your regexp to double backslashes. Here is an example in the Erlang shell:
3> Loose = "^(?:(?![^:#]+:[^:#\\/]*#)([^:\\/?#.]+):)?(?:\\/\\/\\/?)?((?:(([^:#]*):?([^:#]*))?#)?([^:\\/?#]*)(?::(\\d*))?)(((?:\\/(\\w:))?(\\/(?:[^?#](?![^?#\\/]*\\.[^?#\\/.]+(?:[?#]|$)))*\\/?)?([^?#\\/]*))(?:\\?([^#]*))?(?:#(.*))?)".
4> re:compile(Loose).
{ok,{re_pattern,14,0,
<<69,82,67,80,147,2,0,0,16,0,0,0,1,0,0,0,14,0,0,0,0,0,0,
...>>}}

php preg_replace wont accept outside reference

I have the following function which, as you can see, removes certain characters from a string using a pattern. It only works when I enter the pattern as a string literal, as in the first commented-out line. I put an echo in there to test what was coming back and it's as it should be, so I don't know what's going on! Does anyone have any clues?
private function check_string( $s )
{
//return preg_replace( '/[^a-z 0-9~%\.:_\\-()"]/i', '', $s );
// a-z 0-9~%\.:_\\-()"
echo $this->permitted_uri_chars;
// /[^a-z 0-9~%\.:_\\-()"]/i
$pattern = '/[^'. $this->permitted_uri_chars .']/i';
return preg_replace( $pattern, '', $s );
}
The error I get is
Message: preg_replace(): Compilation failed: range out of order in character class at offset 18
ANSWER
Thanks to Jason McCreary
$pattern = '/[^'. preg_quote($this->config->item('permitted_uri_chars'), '/') .']+/i';
It is working in the first example because you properly escaped characters for both PHP and the Regular Expression. (i.e. \\).
When using a string, you have only escaped for PHP. So when you use this string in your Regular Expression it is no longer escaped.
This is demonstrated by the following example:
echo '/[^a-z 0-9~%\.:_\\-()"]/i';
// becomes: /[^a-z 0-9~%\.:_\-()"]/i
A few options would be:
Double escape.
Avoid the Regular Expression escaping by placing the dash at the end: /[^a-z 0-9~%.:_()"-]/
Use preg_quote() if you're going to accept arbitrary strings that may contain regular expression syntax.
Note: I'd encourage you to read about escaping inside character classes.
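The second option can be sketched as follows. The whitelist string and the sample input are made up for illustration; they are not the asker's actual config value. The last lines show something to keep in mind with the third option: since PHP 5.3, preg_quote() also escapes the dash, so quoting a whole character-class body turns ranges such as a-z into three literal characters.

```php
<?php
// Hypothetical whitelist with the dash placed last, so it is a literal
// dash and needs no backslash at all.
$permitted = 'a-z 0-9~%.:_()"-';
$pattern   = '/[^' . $permitted . ']/i';

// Strips <, > and / from the sample input.
$clean = preg_replace($pattern, '', 'abc<def>/x-1');
echo $clean, "\n"; // abcdefx-1

// preg_quote() escapes the dash as well, so a-z is no longer a range.
echo preg_quote('a-z'), "\n"; // a\-z
```

If the configured string is meant to contain ranges, placing the dash last (or escaping only the characters that need it) is probably the safer of the options.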

preg_match for . / or \ in PHP

I am trying to match . \ or / using preg_match in PHP.
I thought this would do it but it's matching all strings.
$string = '';
$chars = '/(\.|\\|\/)/';
if (preg_match($chars, $string) != 0) {
echo 'Chars found.';
}
The argument given to preg_match() is a string, and PHP processes backslash escapes in string literals before the regexp engine sees them. For example, if you write {\\\\} in the source, PHP first parses it into {\\} (each \\ is replaced by \).
Next, the regexp engine parses the pattern. It sees the {\\} that PHP handed over. It treats the first \ as an escape character, so it actually matches a single \ character.
In your case, the source looks like /(\.|\\|\/)/. PHP hands the regexp engine /(\.|\|\/)/, which matches either . or |/ (notice that the | character is escaped).
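The two parsing layers described above can be checked directly by printing the string and testing it against a few subjects. This is only a small sketch; the variable name $fixed is illustrative.

```php
<?php
$chars = '/(\.|\\|\/)/';

// What the regex engine receives after PHP has parsed the literal.
echo $chars, "\n"; // /(\.|\|\/)/

var_dump(preg_match($chars, '\\')); // int(0): a lone backslash does not match
var_dump(preg_match($chars, '|/')); // int(1): the escaped pipe plus slash does

// Doubling the backslashes again gives the engine \\ (a literal backslash).
$fixed = '/(\.|\\\\|\/)/';
var_dump(preg_match($fixed, '\\')); // int(1)
```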
Personally, I try to avoid escaping meta-characters with backslashes, given how many layers parse them. I usually use [.] instead; it's more readable. Your regexp written this way would look like /([.]|\\\\|[\/])/ (the slash still needs a backslash because it is the delimiter).
A few optimizations are possible. It's a personal thing, but I prefer to use {} as delimiters (yes, you can use pairs of characters). Also, your regexp matches single characters, so you could easily write it as {[.\\\\/]}, which is very readable in my opinion (notice the four backslashes; they are needed because both PHP and the regexp engine parse backslashes).
Also, preg_match() returns 1 when the pattern matches and 0 when it does not (or false on error), so you can treat the result as a boolean and avoid comparing it with 0. To test for no match, put ! in front of the call. But I think you accidentally reversed the condition (it reports a match when there is none). Valid code below:
$string = '';
$chars = '{[.\\\\/]}';
if (preg_match($chars, $string)) {
echo 'Chars found.';
}
Your if logic is flawed. preg_match returns 1 when the pattern matches and 0 when it does not. Therefore, == 0 means "no match".
That said, single quoted strings don't expand escape sequences except \' and \\. You need to double your backslash escape for it to appear in the regex as expected. Change your code to:
$string = '';
$chars = '/(\.|\\\\|\/)/';
if (preg_match($chars, $string) != 0) {
echo 'Chars found.';
}
Here's a test case:
$strings = array('', '.', '/', '\\', 'abc');
$pattern = '/(\.|\\\\|\/)/';
foreach ($strings as $string) {
// preg_match returns 1 when the pattern matches
if (preg_match($pattern, $string) > 0) {
printf("String \"%s\" matched!\n", $string);
}
}
The issue is probably with PHP string parsing. When escaping something in a regex string, you also need to escape the backslashes you use for escaping, or PHP may interpret them as the start of an escape sequence of its own.
As that probably didn't make sense, have an example.
$string = "\." only works because PHP leaves the unrecognized sequence \. untouched. To be explicit, write $string = "\\.", and to match a literal backslash write $string = "\\\\".
When trying to match slashes with a regex, I would strongly suggest using a delimiter other than '/'. It reduces the amount of escaping you need to do and makes the pattern much more readable:
$chars = '%(\.|\\|/)%';
That version still has the single-quote problem: PHP turns \\ into \, so the engine sees \| (a literal pipe). Try this instead:
$chars = '%(\.|\\\\|/)%';

regex with special characters?

I am looking for a regex that can contain special characters like / \ . ' "
In short, I would like a regex that can match the following:
may contain lowercase
may contain uppercase
may contain a number
may contain space
may contain / \ . ' "
I am making a PHP script to check whether a certain string contains only the above, as a validation check.
The regular expression you are looking for is
^[a-z A-Z0-9\/\\.'"]+$
Remember if you are using PHP you need to use \ to escape the backslashes and the quotation mark you use to encapsulate the string.
In PHP using preg_match it should look like this:
preg_match("/^[a-z A-Z0-9\\/\\\\.'\"]+$/",$value);
This is a good place to find the regular expressions you might want to use.
http://regexpal.com/
You can always escape them by putting a \ in front of the special characters.
Try this:
preg_match("/[A-Za-z0-9 \/\\\\.'\"]/", ...)
NikoRoberts is 100% correct.
I would only add the following suggestion: when creating a PHP regex pattern string, always use single quotes. There are far fewer chars which need to be escaped: only the single quote and the backslash itself, and the backslash only needs to be escaped when it comes right before a quote, another backslash, or the end of the string.
When dealing with backslash soup, it helps to print out the (interpreted) regex string. This shows you exactly what is being presented to the regex engine.
Also, a "number" might have an optional sign? Yes? Here is my solution (in the form of a tested script):
<?php // test.php 20110311_1400
$data_good = 'abcdefghijklmnopqrstuvwxyzABCDE'.
'FGHIJKLMNOPQRSTUVWXYZ0123456789+- /\\.\'"';
$data_bad = 'abcABC012~!###$%^&*()';
$re = '%^[a-zA-Z0-9+\- /\\\\.\'"]*$%';
echo($re ."\n");
if (preg_match($re, $data_good)) {
echo("CORRECT: Good data matches.\n");
} else {
echo("ERROR! Good data does NOT match.\n");
}
if (preg_match($re, $data_bad)) {
echo("ERROR! Bad data matches.\n");
} else {
echo("CORRECT: Bad data does NOT match.\n");
}
?>
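To make the single-quote advice concrete, here is a small sketch that writes the same validation pattern both ways and prints what the regex engine actually receives. The variable names and the sample subject are illustrative only.

```php
<?php
// The same pattern written as a double-quoted and a single-quoted literal.
$double = "/^[a-z A-Z0-9\\/\\\\.'\"]+$/";
$single = '/^[a-z A-Z0-9\/\\\\.\'"]+$/';

var_dump($double === $single); // bool(true): both produce the same regex

// Printing the interpreted string shows the pattern the engine sees.
echo $single, "\n"; // /^[a-z A-Z0-9\/\\.'"]+$/

// Sample subject containing a slash, backslash, dot, space and both quotes.
var_dump(preg_match($single, 'a/b\\c.d \'e\' "f"')); // int(1)
```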
The following regex will match a single character that fits the description you gave:
[a-zA-Z0-9\ \\\/\.\'\"]
If your point is to ensure that ONLY characters in this range of characters are used in your string, then you can use the negation of this, which would be:
[^a-zA-Z0-9\ \\\/\.\'\"]
In the second case, you could use your regex to find the bad stuff (that you don't want to be included), and if it didn't find anything then your string pattern must be kosher, because I'm assuming that if you find one character that is not in the proper range, then your string is not valid.
So, to put it in PHP syntax (preg_match needs pattern delimiters; % is used here):
$regex = "%[^a-zA-Z0-9\ \\\/\.\'\"]%";
if (preg_match( $regex, ... )) {
// handle the bad stuff
}
Edit 1:
I completely ignored the fact that backslashes are special in PHP double-quoted strings, so here is a correction to the above code (with % as the pattern delimiter):
$regex = "%[^a-zA-Z0-9\\ \\\\\\/\\.\\'\\\"]%";
If that doesn't work, it shouldn't take too much for someone to debug how many of the backslashes need to be escaped with a backslash, and which other characters also need to be escaped.
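The negated-class approach could be wrapped up roughly like this. It is a sketch only: the function name has_only_allowed_chars is hypothetical, and a single-quoted pattern is used to keep the backslash count down.

```php
<?php
// Hypothetical helper: true when $value contains none of the "bad" characters.
function has_only_allowed_chars(string $value): bool
{
    // Negated class: anything that is not a letter, digit, space, / \ . ' or "
    $bad = '%[^a-zA-Z0-9 /\\\\.\'"]%';

    // preg_match returns 0 when no disallowed character is found.
    return preg_match($bad, $value) === 0;
}

var_dump(has_only_allowed_chars('dir\\sub/file.txt "ok"')); // bool(true)
var_dump(has_only_allowed_chars('a+b'));                    // bool(false)
```

Note that an empty string passes this check, because it contains no disallowed character. Add a length test if empty input should be rejected.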
