PHP string: backslash followed by digit

I have a variable that holds a file path. The path is set dynamically based on the date, like this:
$str = "IMAGES\2016\08\01\NM.jpg"
Notice the backslashes followed by digits. This is set by the server and I cannot alter it before it reaches my PHP file; however, it seems to cause those characters to be interpreted as escape sequences, which breaks my script.
I've tried to use str_replace to change the backslashes to forward slashes, but according to my understanding of the PHP manual on backslashes, the string is being interpreted before the function has a chance to run.
My question is this:
Is there a way to change how PHP reads that string, or a way I can alter it so that it becomes usable?

The backslash within the string $str escapes the character immediately following it; in a double-quoted string, a backslash followed by up to three octal digits is read as a single character. You can prevent this behaviour by using single quotes, or you can escape the backslash (wait for it...) with another backslash.
echo $str = "IMAGES\2016\08\01\NM.jpg";
Result: IMAGES?68\NM.jpg
echo $str = "IMAGES\\2016\\08\\01\\NM.jpg";
Result: IMAGES\2016\08\01\NM.jpg
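Aside: You could use str_replace or preg_replace to replace each single backslash with two backslashes, but only on a value that arrives at runtime; in a string literal the escape sequences have already been interpreted by the time any function runs.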
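To make the difference concrete, here is a minimal sketch. It assumes the path really does arrive as data at runtime; $fromServer is a hypothetical stand-in for that value, not something from the question.

```php
<?php
// Double-quoted literal: PHP reads \201, \0 and \01 as octal escapes at parse time.
$double = "IMAGES\2016\08\01\NM.jpg";

// Single-quoted literal: only \\ and \' are escapes, so every backslash survives.
$single = 'IMAGES\2016\08\01\NM.jpg';

var_dump($double === $single); // bool(false)

// Hypothetical stand-in for a value that arrives at runtime (request, file, database).
// Such data is never re-parsed as a literal, so str_replace sees the real backslashes.
$fromServer = $single;
$normalized = str_replace('\\', '/', $fromServer);

echo $normalized, "\n"; // prints IMAGES/2016/08/01/NM.jpg
```

If the value only ever misbehaves when you paste it into your own source as a double-quoted literal, switching that literal to single quotes is probably all that is needed.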

Related

Why is PHP changing the first character after ltrim or str_replace?

I'm trying to remove part of a directory path with PHP's ltrim(), but the result is unexpected. The first character of my result has the wrong ASCII value and shows up as a missing character/box in the browser.
Here is my code:
$stripPath = "public\docroot\4300-4399\computer-system-upgrade";
$directory = "public\docroot\4300-4399\computer-system-upgrade\3.0 Outgoing Documents";
$shortpath = ltrim($directory, $stripPath);
echo $shortpath;
Expected output:
3.0 Outgoing Documents
Actual output:
.0 Outgoing Documents
Note the invisible/non-printing character before the dot. The ASCII value changed from hex 33 (the digit 3) to hex 03 (an invisible character).
I also tried str_replace() instead of ltrim(), but the result stays the same.
What am I doing wrong here? How would I get the expected result "3.0 Outgoing Documents"?
When you write a string value in double quotation marks, you have to be aware that the backslash is used as an escape character. So \3 is read as an octal escape for the character with ASCII code 3. In your example you need to write double backslashes to get the string you want (with single backslashes in it):
$stripPath = "public\\docroot\\4300-4399\\computer-system-upgrade\\";
$directory = "public\\docroot\\4300-4399\\computer-system-upgrade\\3.0 Outgoing Documents";
Don't use ltrim.
ltrim does not remove a prefix. Its second argument is a list of characters, somewhat like a regex character class.
That means every leading character of the string that appears anywhere in the second argument is removed.
See example: https://3v4l.org/AfsHJ
The reason it stops at . is that . is not part of $stripPath.
What you should do instead is use a real regex or a simple str_replace.
// Single quotes keep the backslashes literal; the trailing \\ is one backslash.
$stripPath = 'public\docroot\4300-4399\computer-system-upgrade\\';
$directory = 'public\docroot\4300-4399\computer-system-upgrade\3.0 Outgoing Documents';
$shortpath = str_replace($stripPath, "", $directory);
echo $shortpath;
https://3v4l.org/KF2Iv
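As a rough sketch of the two points above (ltrim works on a character set, and the prefix has to be removed as a whole), something like the following may help. strip_prefix() is a hypothetical helper name, not a built-in, and the paths are written as single-quoted literals so that the backslashes stay literal.

```php
<?php
// Single-quoted literals keep every backslash; the trailing one must be written as \\.
$stripPath = 'public\docroot\4300-4399\computer-system-upgrade\\';
$directory = 'public\docroot\4300-4399\computer-system-upgrade\3.0 Outgoing Documents';

// ltrim() treats its second argument as a set of characters, not as a prefix.
echo ltrim('cabbage', 'abc'), "\n"; // prints "ge"

// strip_prefix() is a hypothetical helper: remove $prefix only when $subject starts with it.
function strip_prefix($subject, $prefix)
{
    if (strncmp($subject, $prefix, strlen($prefix)) === 0) {
        return substr($subject, strlen($prefix));
    }
    return $subject;
}

$shortpath = strip_prefix($directory, $stripPath);
echo $shortpath, "\n"; // prints "3.0 Outgoing Documents"
```

Unlike str_replace, this only touches the start of the string, so a path that happens to repeat the prefix further in is left alone.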
That happens because of the \ character, which has a special meaning in double-quoted strings.
If you try this with a space you can get the expected output.
$stripPath = "public\docroot\4300-4399\computer-system-upgrade";
$directory = "public\docroot\4300-4399\computer-system-upgrade 3.0 Outgoing Documents";
$shortpath = ltrim($directory, $stripPath);
echo $shortpath;
The backslash is PHP's escape character. Add a trailing backslash to $stripPath to trim it from $directory.

Regexes work in PHP and don't in Erlang. Why?

I tried to rewrite a URL-parsing function from PHP to Erlang, and I found that these regexes don't work in Erlang but work fine in the PHP code. Can you tell me why, and how to make them work in Erlang?
Loose = "^(?:(?![^:#]+:[^:#\/]*#)([^:\/?#.]+):)?(?:\/\/\/?)?((?:(([^:#]*):?([^:#]*))?#)?([^:\/?#]*)(?::(\d*))?)(((?:\/(\w:))?(\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)".
re:compile( Loose ).
{error,{"nothing to repeat",166}}
Strict = "^(?:([^:\/?#]+):)?(?:\/\/\/?((?:(([^:#]*):?([^:#]*))?#)?([^:\/?#]*)(?::(\d*))?))?(((?:\/(\w:))?((?:[^?#\/]*\/)*)([^?#]*))(?:\?([^#]*))?(?:#(.*))?)".
re:compile( Strict ).
{error,{"nothing to repeat",114}}
But this code works fine:
$url = "http://gazeta.ru/";
$loose = '/^(?:(?![^:#]+:[^:#\/]*#)([^:\/?#.]+):)?(?:\/\/\/?)?((?:(([^:#]*):?([^:#]*))?#)?([^:\/?#]*)(?::(\d*))?)(((?:\/(\w:))?(\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)/';
preg_match($loose, $url, $match);
var_dump( $match );
The character "\" is special in strings in Erlang. There are other special characters that must be preceded by a backslash; these include the double quote and the backslash itself. The technique of marking special characters is called escaping, and the backslash is called the escape character. So "\" must be followed by another character, and the pair is read as a single escape sequence: for example, "\d" is the delete character rather than the regex digit class, and "\?" is a plain "?", which is why re:compile reports "nothing to repeat". If you want to include the character '\' (one backslash) in a string, you should write "\\":
CorrectString = "C:\\windows" %% Correct
WrongString = "C:\windows" %% Wrong
Hence you have to change all single backslashes in your regexp to double backslashes. Here is an example in the Erlang shell:
3> Loose = "^(?:(?![^:#]+:[^:#\\/]*#)([^:\\/?#.]+):)?(?:\\/\\/\\/?)?((?:(([^:#]*):?([^:#]*))?#)?([^:\\/?#]*)(?::(\\d*))?)(((?:\\/(\\w:))?(\\/(?:[^?#](?![^?#\\/]*\\.[^?#\\/.]+(?:[?#]|$)))*\\/?)?([^?#\\/]*))(?:\\?([^#]*))?(?:#(.*))?)".
4> re:compile(Loose).
{ok,{re_pattern,14,0,
<<69,82,67,80,147,2,0,0,16,0,0,0,1,0,0,0,14,0,0,0,0,0,0,
...>>}}

addslashes() only if slashes are not added from before

I'm getting a lot of text values from my database that I need to output with slashes added before characters that need to be quoted.
Problem is that some of the data already has the slashes added there from before, whilst some of it doesn't.
How can I add slashes using for example addslashes() - but at the same time make sure that it doesn't add an extra slash in the cases where the slash is already added?
Example:
Input: test
Output should be: test
Input: test
Output should be: test
This is PHP 5.3.10.
If you know that you don't have any double slashes, simply run addslashes() and then replace all \\ with \.
If you have something like this:
test
Using addslashes(), the output will be:
test
So, you may need to replace every occurrence of more than one \ to be sure
function addslashes_unescaped($string) { // hypothetical name; the built-in addslashes() cannot be redeclared
return preg_replace('/([^\\\\])\"/','$1\"',$string);
}
The answer of Qaflanti is correct but I would like to make it more complete, if you want to escape both single and double quotes.
First option :
function escape_quotes($string) {
return preg_replace("/(^|[^\\\\])(\"|')/","$1\\\\$2", $string);
}
Input
I love \"carots\" but "I" don't like \'cherries\'
Output
I love \"carots\" but \"I\" don\'t like \'cherries\'
Explanation :
The \ has a special meaning inside a quoted string and escapes the following character, so while you would write \\ in a regex to search for a backslash, in PHP you also need to escape each of those two backslashes, for a total of four.
With that in mind, the first capturing group matches either the start of the string or a single character that is not a backslash (not two or four backslashes, misleading as it looks).
The second capturing group matches a double or a single quote exactly once.
So this finds unescaped quotes (double and single) and adds a backslash before each one, thus escaping it.
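The four-backslash rule can be checked with a short sketch like the one below. It reuses the pattern and sample input from this answer; the variable names and the nowdoc wrapper are only there for illustration.

```php
<?php
// The PHP literal '\\\\' holds two characters; the regex engine then reads \\ as one literal backslash.
$pattern = '/\\\\/';
echo $pattern, "\n";       // prints /\\/
echo strlen('\\\\'), "\n"; // prints 2

// One backslash in the subject (written as \\ in the literal).
$subject = 'a\\b';
echo preg_match($pattern, $subject), "\n"; // prints 1

// The answer's pattern applied to its sample input (a nowdoc does no escape processing).
$input = <<<'TXT'
I love \"carots\" but "I" don't like \'cherries\'
TXT;
$output = preg_replace("/(^|[^\\\\])(\"|')/", "$1\\\\$2", $input);
echo $output, "\n"; // prints I love \"carots\" but \"I\" don\'t like \'cherries\'
```

Note that two quotes in a row (for example "") are a known weak spot for this kind of pattern, since the character before the second quote has already been consumed by the first match.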
Second option :
Or it might just be best for you to convert them to html entities from the start :
function htmlentities_quotes($string) {
return str_replace(array('"', "'"), array("&quot;", "&apos;"), $string);
}
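And then you just have to use the PHP function htmlspecialchars_decode($string); to revert it back to how it was. Depending on your PHP version you may need to pass flags as the second argument (ENT_QUOTES, plus ENT_HTML5 on PHP 5.4 or later) for single quotes written as &apos; to be decoded.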
Input
I love "carots" but "I" don't like 'cherries'
Output
I love &quot;carots&quot; but &quot;I&quot; don&apos;t like &apos;cherries&apos;

Can't get Regex working in PHP, works in RegEXP program

Here is the input I am searching:
\u003cspan class=\"prs\">email_address#me.com\u003c\/span>
Trying to just return email_address#me.com.
My regex class=\\"prs\\">(.*?)\\ returns "class=\"prs\">email_address#me.com\" in RegExp which is OK, I can work with that result.
But I can't get it to work in PHP.
$regex = "/class=\\\"prs\\\">(.*?)\\/";
Gives me an error "No ending delimiter"
Can someone please help?
Your original code:
$regex = "/class=\\\"prs\\\">(.*?)\\/";
The reason you get No ending delimiter is that although you are escaping the backslash prior to the closing forward slash, what you have done is escaped it in the context of the PHP string, not in the context of the regex engine.
So the PHP string escaping mechanism does its thing, and by the time the regex engine gets it, it will look like this:
/class=\"prs\">(.*?)\/
This means that the regular expression engine will see the backslash at the end of the expression as escaping the forward slash that you are intending to use to close the expression.
The usual PHP solution to this kind of thing is to switch to using single-quoted string instead of a double-quoted one, but this still won't work, as \\ is an escaped backslash in both single and double quoted strings.
What you need to do is double up the number of backslash characters at the end of your string, so your code needs to look like this:
$regex = "/class=\\\"prs\\\">(.*?)\\\\/";
The way to prove what it's doing is to print the contents of the $regex variable, so you can see what the string will look like to the regex engine. These kinds of errors are actually very hard to spot, but looking at the actual content of the string will help you spot them.
Hope that helps.
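For completeness, here is a small sketch that also captures the address. It assumes the input really contains literal backslashes before the quotes, as shown in the question; the ~ delimiter and the single-quoted literal are my choices, not part of the original code.

```php
<?php
// Sample input from the question, in a nowdoc so that no escape processing happens.
$input = <<<'TXT'
\u003cspan class=\"prs\">email_address#me.com\u003c\/span>
TXT;

// Single-quoted pattern: each \\\\ becomes \\ in the string, which the regex engine
// reads as one literal backslash. Using ~ as the delimiter avoids escaping the slash.
$regex = '~class=\\\\"prs\\\\">(.*?)\\\\~';

echo $regex, "\n"; // prints ~class=\\"prs\\">(.*?)\\~

if (preg_match($regex, $input, $match)) {
    echo $match[1], "\n"; // prints email_address#me.com
}
```

Printing $regex first, as suggested above, shows exactly what the regex engine receives.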
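If you change to single quotes, the double quotes no longer need escaping, but each literal backslash still has to be written as four backslashes:
$regex = '/class=\\\\"prs\\\\">(.*?)\\\\/';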

regex with special characters?

I am looking for a regex that can contain special characters like / \ . ' "
In short, I would like a regex that can match the following:
may contain lowercase
may contain uppercase
may contain a number
may contain space
may contain / \ . ' "
I am making a PHP script to check whether a certain string consists of the above or not, like a validation check.
The regular expression you are looking for is
^[a-z A-Z0-9\/\\.'"]+$
Remember if you are using PHP you need to use \ to escape the backslashes and the quotation mark you use to encapsulate the string.
In PHP using preg_match it should look like this:
preg_match("/^[a-z A-Z0-9\\/\\\\.'\"]+$/",$value);
This is a good place to find the regular expressions you might want to use.
http://regexpal.com/
You can always escape them by putting a \ in front of the special characters.
Try this:
preg_match("/[A-Za-z0-9 \\/\\\\.'\"]/", ...)
NikoRoberts is 100% correct.
I would only add the following suggestion: when creating a PHP regex pattern string, always use single quotes. Far fewer characters need to be escaped: only the single quote and the backslash itself (and the backslash only when it precedes a single quote or another backslash, or appears at the end of the string).
When dealing with backslash soup, it helps to print out the (interpreted) regex string. This shows you exactly what is being presented to the regex engine.
Also, a "number" might have an optional sign? Yes? Here is my solution (in the form of a tested script):
<?php // test.php 20110311_1400
$data_good = 'abcdefghijklmnopqrstuvwxyzABCDE'.
'FGHIJKLMNOPQRSTUVWXYZ0123456789+- /\\.\'"';
$data_bad = 'abcABC012~!###$%^&*()';
$re = '%^[a-zA-Z0-9+\- /\\\\.\'"]*$%';
echo($re ."\n");
if (preg_match($re, $data_good)) {
echo("CORRECT: Good data matches.\n");
} else {
echo("ERROR! Good data does NOT match.\n");
}
if (preg_match($re, $data_bad)) {
echo("ERROR! Bad data matches.\n");
} else {
echo("CORRECT: Bad data does NOT match.\n");
}
?>
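The advice about printing the interpreted pattern can be tried with a sketch like this one. It writes the same character class once as a single-quoted and once as a double-quoted literal; the test values are hypothetical.

```php
<?php
// The same pattern written as a single-quoted and as a double-quoted literal.
$single = '/^[a-z A-Z0-9\/\\\\.\'"]+$/';
$double = "/^[a-z A-Z0-9\\/\\\\.'\"]+$/";

// Printing shows what the regex engine actually receives.
echo $single, "\n";            // prints /^[a-z A-Z0-9\/\\.'"]+$/
var_dump($single === $double); // bool(true)

// Hypothetical test values: the first uses only allowed characters, the second does not.
$good = preg_match($single, 'Program Files\\App/v1.0');
$bad  = preg_match($single, 'bad~value');
var_dump($good); // int(1)
var_dump($bad);  // int(0)
```

Both literals produce the same pattern; the single-quoted form simply needs fewer backslashes to get there.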
The following regex will match a single character that fits the description you gave:
[a-zA-Z0-9\ \\\/\.\'\"]
If your point is to insure that ONLY characters in this range of characters are used in your string, then you can use the negation of this which would be:
[^a-zA-Z0-9\ \\\/\.\'\"]
In the second case, you could use your regex to find the bad stuff (that you don't want to be included), and if it didn't find anything then your string pattern must be kosher, because I'm assuming that if you find one character that is not in the proper range, then your string is not valid.
So, to put it in PHP syntax:
$regex = "/[^a-zA-Z0-9\ \\\/\.\'\"]/"; // first attempt: PHP collapses \\ to \, so the / ends the pattern early
if (preg_match($regex, $value)) { // $value: the string to validate
// handle the bad stuff
}
Edit 1:
I completely ignored the fact that backslashes are special in PHP double-quoted strings, so here is a correction to the above code:
$regex = "/[^a-zA-Z0-9\\ \\\\\\/\\.\\'\\\"]/";
If that doesn't work it shouldn't take too much for someone to debug how many of the backslashes need to be escaped with a backslash, and what other characters need also to be escaped....
