PHP regex: how to add a backslash "\"? - php

I want to add a backslash "\" before all non alphanumeric characters like "how are you \:\)", so I used this:
$code = preg_replace('/([^A-Za-z0-9])/i', '\$1', $code);
But it doesn't work. Instead it just echoes '\$1'. What am I doing wrong?
I also tried
$code = preg_replace('/([^A-Za-z0-9])/i', '\\$1', $code);
But that doesn't work either.

You need four backslashes:
$code = preg_replace('/([^A-Za-z0-9])/i', '\\\\$1', $code);
The reason is that the backslash escapes itself in PHP string context (even in single quotes). For PCRE to see even one backslash, you need to write at least two. But so that it is not misinterpreted as escaping the replacement placeholder $1, you need to double that again. (By the way, three backslashes would also work by accident, since '\\\$1' and '\\\\$1' are the same string once PHP has parsed them.)
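To see the two escaping layers in action, here is a minimal sketch. The sample input is made up for illustration. Note that the character class also matches spaces, so they receive a backslash too.

```php
<?php
// Hypothetical sample input, not taken from the question.
$code = 'how are you :)';

// PHP parses '\\\\$1' into the four characters \\$1.
// preg_replace() then reads \\ as one literal backslash, followed by group 1.
$replacement = '\\\\$1';
$escaped = preg_replace('/([^A-Za-z0-9])/', $replacement, $code);

echo strlen($replacement), "\n"; // 4
echo $escaped, "\n";             // how\ are\ you\ \:\)

// Three backslashes produce the same PHP string, because \$ is not
// an escape sequence in single quotes.
var_dump('\\\$1' === '\\\\$1');  // bool(true)
```

If spaces should stay untouched, adding a space to the character class (`[^A-Za-z0-9 ]`) would be one way to do it.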

EXAMPLE:
<?php
$str = "Is your name O'reilly?";
// Outputs: Is your name O\'reilly?
echo addslashes($str);
?>

Related

Why is PHP changing the first character after ltrim or str_replace?

I'm trying to remove a part of a directory path with PHP's ltrim(), however the result is unexpected. The first character of my result has the wrong ASCII value, and shows up as a missing-character box in the browser.
Here is my code:
$stripPath = "public\docroot\4300-4399\computer-system-upgrade";
$directory = "public\docroot\4300-4399\computer-system-upgrade\3.0 Outgoing Documents";
$shortpath = ltrim($directory, $stripPath);
echo $shortpath;
Expected output:
3.0 Outgoing Documents
Actual output:
.0 Outgoing Documents
Note the invisible, non-printing character before the dot. The ASCII value changed from hex 33 (the digit 3) to hex 03 (an invisible control character).
I also tried str_replace() instead of ltrim(), but the result stays the same.
What am I doing wrong here? How would I get the expected result "3.0 Outgoing Documents"?
When you write a string value in double quotation marks, you have to be aware that the backslash is used as an escape character. So \3 is read as an octal escape for the character with ASCII code 3. In your example you need to write double backslashes in order to define your desired string (which then contains single backslashes):
$stripPath = "public\\docroot\\4300-4399\\computer-system-upgrade\\";
$directory = "public\\docroot\\4300-4399\\computer-system-upgrade\\3.0 Outgoing Documents";
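The difference between the quoting styles can be checked directly. The following is a small sketch using a shortened, made-up path fragment rather than the full path from the question.

```php
<?php
// Hypothetical shortened path, used only to show the difference.
$double  = "upgrade\3.0";   // \3 is an octal escape: one control character
$single  = 'upgrade\3.0';   // backslash and 3 stay as two characters
$escaped = "upgrade\\3.0";  // \\ is one literal backslash

echo strlen($double), "\n";      // 10
echo strlen($single), "\n";      // 11
echo ord($double[7]), "\n";      // 3
var_dump($single === $escaped);  // bool(true)
```

Single-quoted strings are often the simpler choice for Windows-style paths, as long as a trailing backslash is still written as `\\`.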
Don't use ltrim.
ltrim() does not remove a prefix string. It treats its second argument as a set of characters, much like a regex character class, and strips every leading character that belongs to that set.
See example: https://3v4l.org/AfsHJ
It stops at the . because that character is not part of $stripPath.
What you should do instead is use a real regex or a simple str_replace.
// Single quotes keep the backslashes literal; the trailing \\ is one backslash.
$stripPath = 'public\docroot\4300-4399\computer-system-upgrade\\';
$directory = 'public\docroot\4300-4399\computer-system-upgrade\3.0 Outgoing Documents';
$shortpath = str_replace($stripPath, "", $directory);
echo $shortpath;
https://3v4l.org/KF2Iv
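Note that str_replace() removes every occurrence of the search string, not only a leading one. If only a prefix should be removed, one possible alternative (a sketch, not something the answers above propose) is to compare the start of the string with strncmp() and cut it off with substr(). The first line also shows the character-set behaviour of ltrim() on a made-up input.

```php
<?php
// ltrim() strips leading characters from a set, not a prefix string.
echo ltrim('aabbcXab', 'ab'), "\n"; // cXab

// Single quotes keep the backslashes literal; the trailing \\ is one backslash.
$stripPath = 'public\docroot\4300-4399\computer-system-upgrade\\';
$directory = 'public\docroot\4300-4399\computer-system-upgrade\3.0 Outgoing Documents';

// Remove $stripPath only if $directory actually starts with it.
$length = strlen($stripPath);
$shortPath = strncmp($directory, $stripPath, $length) === 0
    ? substr($directory, $length)
    : $directory;

echo $shortPath, "\n"; // 3.0 Outgoing Documents
```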
That happens because of the \ character, which has a special meaning in double-quoted strings.
If you try this with a space instead, you get the expected output (with a leading space).
$stripPath = "public\docroot\4300-4399\computer-system-upgrade";
$directory = "public\docroot\4300-4399\computer-system-upgrade 3.0 Outgoing Documents";
$shortpath = ltrim($directory, $stripPath);
echo $shortpath;
The backslash is PHP's escape character. Add a trailing backslash to $stripPath so that it is trimmed from $directory as well.

Removing characters using preg_replace function

I use the preg_replace function. I want the function not to remove the apostrophe (') character, so that it returns the word as o'clock.
How can I do that?
$last_word = "o'clock.";
$new_word= preg_replace('/[^a-zA-Z0-9 ]/','',$last_word);
echo $new_word;
Try:
$last_word = "o'clock.";
$new_word= preg_replace('/[^a-zA-Z0-9\' ]/','',$last_word);
echo $new_word;
Demo here: http://ideone.com/JMH8F
That regex explicitly removes all characters except for letters, numbers and spaces. Note the leading "^". So it does what you ask it to.
So most likely you want to add the "'" (apostrophe) to the set of characters to keep inside the regex:
'/[^a-zA-Z0-9\' ]/'
Change your original '/[^a-zA-Z0-9 ]/' to "/[^a-zA-Z0-9 ']/". This simply includes the apostrophe in the negated character class.
See an online example.
Aside: my suggestion would be to use double-quotes for the string (as you have with "o'clock.") since mixing backslash escapes with PHP strings and regex patterns can get confusing quickly.
Try this. It may help..
$new_word= preg_replace('/\'/', '', $last_word);
Demo: http://so.viperpad.com/F82z9o
The regex you use does remove the "'" (apostrophe), and the "." (dot) as well, because neither is in the set of characters to keep. Note that preg_replace() returns the subject unchanged when nothing matches; it returns NULL only when an error occurs.
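A quick way to check the behaviour is to run the adjusted pattern against the word from the question. This is a minimal sketch; the second call uses a made-up input to show what happens when nothing matches.

```php
<?php
$last_word = "o'clock.";

// The apostrophe is added to the negated class, so it is kept.
$new_word = preg_replace("/[^a-zA-Z0-9 ']/", '', $last_word);
echo $new_word, "\n"; // o'clock

// When nothing matches, the subject comes back unchanged.
var_dump(preg_replace("/[^a-zA-Z0-9 ']/", '', 'clock')); // string(5) "clock"
```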

Regex extra spaces in string not in double or single quotes - PHP

I would like to replace extra spaces (instances of consecutive whitespace characters) with one space, as long as those extra spaces are not in double or single quotes (or any other enclosures I may want to include).
I saw some similar questions, but I could not find a direct response to my needs above. Thank you!
Hope you're still looking, or come back to check! This seems to work for me:
'/\s+((["\']).*?(?=\2)\2)|\s\s+/'
...and replace with ' $1' (a single space followed by $1)
EDIT
Also, if you need to allow for escaped quotes like \" or \', you could use this expression:
'/\s+((["\'])(\\\\\2|(?!\2).)*?(?=\2)\2)|\s\s+/'
It gets a bit stickier if you want to add support for "balanced" quotes like brackets (e.g. () or {})
END EDIT
Let me know if you find problems or would like some explanation!
HOPEFULLY FINAL EDIT AND WARNINGS
Potential problem: if a quoted string starts at the very beginning of the string variable (or file), it will either not count as a quoted string (and have any whitespace inside it reduced) or it will throw off the whole match, so that anything NOT in quotes is treated as though it were in quotes and vice versa.
A potential change that might remedy this is to use the following match expression:
/(?:^|\s+)((["\'])(\\\\\2|(?!\2).)*?(?=\2)\2)|\s\s+/
This replaces \s+ with (?:^|\s+) at the beginning of the expression.
It will add a space at the beginning of the variable if the string starts with a quote; just trim() or remove that whitespace to continue.
I seem to have used the "line by line" approach (like sed, if I'm not mistaken) to reach my original results. If you use the "whole file" or "whole string" setting or approach, a carriage return plus line feed counts as two whitespace characters (both \r and \n match \s), thus turning any newlines into single spaces (unless they are inside quotes and "dot-matches-newline" is used, of course).
This could be resolved by replacing the . and \s shorthand character classes with the specific characters you want to match, like the following:
/(?:^|[ \t]+)((["\'])(\\\\\2|(?!\2)[\s\S])*?(?=\2)\2)|[ \t]{2,}/
this does not require the dot-matches-newline switch and only replaces multiple spaces or tabs - not newlines - with a single space (and of course, only if they are not quoted)
EXAMPLE
This link shows an example of the first expression and last expression in use on sample text on http://codepad.viper-7.com
You could do it in several steps. Consider the following example:
$str = 'This is a string with "Bunch of extra spaces". Leave them "untouched !".';
$id = 0;
$buffer = array();
$str = preg_replace_callback('|".*?"|', function($m) use (&$id, &$buffer) {
    $buffer[] = $m[0];
    return '__' . $id++;
}, $str);
$str = preg_replace('|\s+|', ' ', $str);
$str = preg_replace_callback('|__(\d+)|', function($m) use ($buffer) {
    return $buffer[$m[1]];
}, $str);
echo $str;
This will output the string:
This is a string with "Bunch of extra spaces". Leave them "untouched !".
Although this is not the prettiest solution.
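A single-pass variant is also possible. The sketch below is an alternative to the approaches above, not a rewrite of them: one pattern matches either a complete quoted string or a run of whitespace, and a callback decides what to return. The function name collapseSpacesOutsideQuotes and the sample input are hypothetical. It assumes quotes are balanced and not escaped, and it will treat an apostrophe inside a word (as in "don't") as the start of a quoted string.

```php
<?php
// Hypothetical helper: collapse whitespace runs that are not inside quotes.
function collapseSpacesOutsideQuotes(string $text): string
{
    // Alternatives are tried left to right: double-quoted string,
    // single-quoted string, then a run of whitespace.
    $pattern = '/"[^"]*"|\'[^\']*\'|\s+/';

    return preg_replace_callback($pattern, function (array $m): string {
        $first = $m[0][0];
        // Quoted strings are returned untouched, whitespace runs become one space.
        return ($first === '"' || $first === "'") ? $m[0] : ' ';
    }, $text);
}

$input = 'a   b "c   d"  e \'f   g\'   h';
echo collapseSpacesOutsideQuotes($input), "\n"; // a b "c   d" e 'f   g' h
```

Because \s also matches tabs and newlines, those are turned into single spaces as well; replacing \s+ with [ \t]+ would leave newlines alone.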

preg_match not working

I am trying to match some link from some texts:
$reg = '#ok is it http://google.com/?s=us#';
$page = 'Well i think ! ok is it http://google.com/?s=us&ui=pl0 anyways it ok';
if(preg_match($reg,$page)){
echo 'it work';
}else{
echo 'not work';
}
Now the problem is, if I use $reg = '#ok is it http://google.com/#'; then it's OK, but when I use the one with "?=" it doesn't work.
OK, I understand there is some syntax problem. Is there any ready-made function which automatically escapes these special characters?
You have several syntax errors. You must escape all the special characters such as '.', '?' and so on. Thus you have to replace the characters like this:
'.' -> '\.'
'?' -> '\?'
...
Anyway, the regex should be like this:
$reg = '#ok is it http://google\.com/\?s=us#';
Some characters are read as metacharacters by the regex engine, meaning that they have a special function within the engine's procedures, a few examples being ? (question mark), \ (backslash), . (period), * (asterisk), et cetera.
Just as with strings sent to SQL that contain metacharacters, you will need to escape these characters manually by adding a preceding backslash: \. When escaping the \ character itself, you might need to write it three or four times, like this: \\\ or \\\\.
Use:
$reg = '#ok is it http://google.com/\?s=us#';
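As for a ready-made function: PHP's preg_quote() escapes regex metacharacters in a literal string, and its optional second argument names the delimiter to escape as well. The following is a minimal sketch using the strings from the question; the variable $literal is introduced here for illustration.

```php
<?php
$page = 'Well i think ! ok is it http://google.com/?s=us&ui=pl0 anyways it ok';

// The text to search for, written as a plain string (no manual escaping).
$literal = 'ok is it http://google.com/?s=us';

// preg_quote() escapes the metacharacters; '#' is the delimiter used below.
$reg = '#' . preg_quote($literal, '#') . '#';

echo preg_match($reg, $page) ? 'it work' : 'not work', "\n"; // it work

// Without escaping, /? means "an optional slash", so the literal ? is never matched.
echo preg_match('#' . $literal . '#', $page) ? 'it work' : 'not work', "\n"; // not work
```

When the search text contains no pattern at all, a plain strpos() check may be simpler than a regex.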

regex with special characters?

I am looking for a regex that can contain special characters like / \ . ' "
in short i would like a regex that can match the following:
may contain lowercase
may contain uppercase
may contain a number
may contain space
may contain / \ . ' "
I am making a PHP script to check whether a certain string contains only the above, like a validation check.
The regular expression you are looking for is
^[a-z A-Z0-9\/\\.'"]+$
Remember that if you are using PHP you need to use \ to escape the backslashes and the quotation mark you use to enclose the string.
In PHP using preg_match it should look like this:
preg_match("/^[a-z A-Z0-9\\/\\\\.'\"]+$/",$value);
This is a good place to find the regular expressions you might want to use.
http://regexpal.com/
You can always escape them by putting a \ in front of the special characters.
try this:
preg_match("/[A-Za-z0-9 \\/\\\\.'\"]/", ...)
NikoRoberts is 100% correct.
I would only add the following suggestion: when creating a PHP regex pattern string, always use single quotes. There are far fewer characters which need to be escaped: only the single quote and the backslash itself, and the backslash only needs to be escaped when it precedes a single quote or another backslash, or appears at the end of the string.
When dealing with backslash soup, it helps to print out the (interpreted) regex string. This shows you exactly what is being presented to the regex engine.
Also, a "number" might have an optional sign? Yes? Here is my solution (in the form of a tested script):
<?php // test.php 20110311_1400
$data_good = 'abcdefghijklmnopqrstuvwxyzABCDE'.
'FGHIJKLMNOPQRSTUVWXYZ0123456789+- /\\.\'"';
$data_bad = 'abcABC012~!###$%^&*()';
$re = '%^[a-zA-Z0-9+\- /\\\\.\'"]*$%';
echo($re ."\n");
if (preg_match($re, $data_good)) {
    echo("CORRECT: Good data matches.\n");
} else {
    echo("ERROR! Good data does NOT match.\n");
}
if (preg_match($re, $data_bad)) {
    echo("ERROR! Bad data matches.\n");
} else {
    echo("CORRECT: Bad data does NOT match.\n");
}
?>
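The advice to print the interpreted pattern can be illustrated with a short sketch. It writes the same regex once in single quotes and once in double quotes, prints what the regex engine actually receives, and confirms that both spellings yield the same string. The test inputs are made up for illustration.

```php
<?php
// Same pattern, two quoting styles.
$single = '/^[a-z A-Z0-9\/\\\\.\'"]+$/';
$double = "/^[a-z A-Z0-9\\/\\\\.'\"]+$/";

// Print the pattern as the regex engine sees it.
echo $single, "\n";              // /^[a-z A-Z0-9\/\\.'"]+$/
var_dump($single === $double);   // bool(true)

// Hypothetical inputs: the first uses only allowed characters, the second has a colon.
var_dump(preg_match($single, 'a\b/c.d \'e\' "f" 9')); // int(1)
var_dump(preg_match($single, 'a:b'));                 // int(0)
```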
The following regex will match a single character that fits the description you gave:
[a-zA-Z0-9\ \\\/\.\'\"]
If your point is to ensure that ONLY characters in this range of characters are used in your string, then you can use the negation of this which would be:
[^a-zA-Z0-9\ \\\/\.\'\"]
In the second case, you could use your regex to find the bad stuff (that you don't want to be included), and if it didn't find anything then your string pattern must be kosher, because I'm assuming that if you find one character that is not in the proper range, then your string is not valid.
so to put it in PHP syntax:
$regex = "/[^a-zA-Z0-9\ \\\/\.\'\"]/";
if (preg_match($regex, $value)) {
    // handle the bad stuff
}
Edit 1:
I completely ignored the fact that backslashes are special in PHP double-quoted strings, so here is a correction to the above code:
$regex = "/[^a-zA-Z0-9\\ \\\\\\/\\.\\'\\\"]/";
If that doesn't work it shouldn't take too much for someone to debug how many of the backslashes need to be escaped with a backslash, and what other characters need also to be escaped....
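The "look for a bad character" idea can be wrapped in a small function. This is a sketch under stated assumptions: the function name hasOnlyAllowedChars is hypothetical, the pattern is written in single quotes to keep the backslash count low, and ~ is used as the delimiter so that the slash needs no escaping. An empty string passes, since it contains no disallowed character.

```php
<?php
// Hypothetical helper: true when $value contains only letters, digits,
// spaces, and the characters / \ . ' "
function hasOnlyAllowedChars(string $value): bool
{
    // In single quotes, \\\\ becomes \\ which the regex reads as one backslash.
    $invalid = '~[^a-zA-Z0-9 \\\\/.\'"]~';

    // preg_match() returns 0 when no disallowed character was found.
    return preg_match($invalid, $value) === 0;
}

var_dump(hasOnlyAllowedChars('abc ABC 012 /\\.\'"')); // bool(true)
var_dump(hasOnlyAllowedChars('abc~!'));               // bool(false)
```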
