Exclude Literal backslash in Javascript Regular Expression - php

I'm writing a php forms class with client and server side validation. I'm having problems checking if a literal backslash ("\") exists in a string using regular expressions in javascript.
I want to shy away from solutions other than using regex as this will reduce the amount of special cases between php and js AND reduce the amount of conditional code I need to write.
I've just been using this as an example of what a user may need in this forms class: a password field that is a string between 6 and 12 chars long and that excludes "\", "#", "$" and "`".
I have tried:
^[^(\u0008#\$`)]{6,12}$
^[^(\b#\$`)]{6,12}$
^[^(\\#\$`)]{6,12}$
And none of them work for a backslash and I can't work out why. FYI: The latter works fine in PHP.

The regular expression \\ matches a single backslash. In JavaScript, this becomes re = /\\/ or re = new RegExp("\\\\").
ripped straight from http://www.regular-expressions.info/javascript.html

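It looks like you've wrapped slash-hash-dollar-tick in parentheses as if to group them. Inside a character class, parentheses are just literal characters, so your class also excludes ( and ). List only the characters you want to exclude.
Try this: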
var rgx = new RegExp(/^[^\\#\$`]{6,12}$/);
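A minimal sketch of how the number of backslashes changes between a regex literal and a string pattern. The question does not show how the patterns were passed to JavaScript, so the under-escaped string case below is an assumption about what may have gone wrong, and the sample password is hypothetical:

```javascript
// Regex literal: \\ in the pattern is one literal backslash.
const fromLiteral = /^[^\\#$`]{6,12}$/;

// String pattern: the string parser strips one level of escaping first,
// so four backslashes are needed to leave \\ for the regex engine.
const fromString = new RegExp("^[^\\\\#$`]{6,12}$");

// Under-escaped string: the engine sees \# (just an escaped #),
// so the backslash is no longer in the excluded set.
const underEscaped = new RegExp("^[^\\#$`]{6,12}$");

const password = "abc\\def"; // the 7 characters abc\def

console.log(fromLiteral.test(password));  // false
console.log(fromString.test(password));   // false
console.log(underEscaped.test(password)); // true
console.log(fromLiteral.test("abcdefg")); // true
```

If the pattern is emitted from PHP into a JavaScript string, each layer of quoting eats one level of backslashes, which is why a pattern that works in PHP can silently stop excluding the backslash in the browser.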

Related

Convert JQuery RegEx into PHP RegEX

Sorry for this question as it is very specific.
I have a JQuery validation RegEx that I would like to use on the back end too:
var forNames = new RegExp("^[^0-9<>'\"/;`%]*$");
I tried in PHP
preg_match('/^[^0-9<>\'\"/;`%]{2,42}$/', $first_name) // I also want to keep the length between 2 and 42 here)
but it does not work, I get Unknown modifier ';' in
The other question, similar to this one is what this person is asking here
Converting Javascript Regex to PHP
I tried his solution, copying the php email validation regex into JQuery with no luck
Thank you
P.S. I just reverted what I had added to the regex, because I didn't see that the question already had answers and the change was confusing.
You need to escape the / character in your PHP regex string, because that's also the character which is used to signify the end of a regexp (it's called a delimiter):
preg_match('/^[^0-9<>\'\"/;`%]{2,42}$/', $first_name)
                         ^
becomes:
preg_match('/^[^0-9<>\'\"\/;`%]{2,42}$/', $first_name)
The reason you didn't need to do this in your JavaScript code is that you used the RegExp constructor, which takes the pattern as a plain string with no delimiters, so / has no special meaning there. If you had used a RegExp literal you would have had to escape it too:
var forNames = /^[^0-9<>'\"\/;`%]*$/;
As @DelightedD0D commented, make sure to test your RegExp with an interactive tool like regex101. It supports both PHP and JS style regexps and is actually how I was able to catch your error so fast.
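To see that the constructor form and the literal form describe the same pattern, here is a small sketch that runs both against a few made-up names (the sample values are hypothetical, and the {2,42} length limit from the question is added to both):

```javascript
// Constructor: the pattern is a string, so / is not a delimiter
// and only the double quote needs escaping (for the string itself).
const fromConstructor = new RegExp("^[^0-9<>'\"/;`%]{2,42}$");

// Literal: / delimits the pattern, so it must be escaped as \/.
const fromLiteral = /^[^0-9<>'"\/;`%]{2,42}$/;

const samples = ["Anna", "O Brien", "a/b", "Anna2", "x"];
for (const s of samples) {
  console.log(s, fromConstructor.test(s), fromLiteral.test(s));
}
```

Both forms accept "Anna" and "O Brien" and reject the other three samples (a slash, a digit, and a string that is too short).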

Regular Expression as Php input filter For Pdo DSN host or host+port

Right up front, I dislike Regular Expressions. I want to allow the host, or host + port, part of the DSN to be entered in a single input. localhost must be accepted, as well as subdomains.
The best I could come up with was taken from an article called Domain name regular expression example, which provides this expression for Java:
^((?!-)[A-Za-z0-9-]{1,63}(?<!-)\\.)+[A-Za-z]{2,6}$
It almost works, but the part matching a period is written \\. (escaped for a Java string literal) and should be \. in PHP.
From the php.net manual some PDO_MYSQL DSN examples are:
mysql:host=localhost;dbname=testdb
mysql:host=localhost;port=3307;dbname=testdb
The only part I want to perform the regular expression on is
localhost
localhost;port=3307
This is to be used as a filter for an HTML form as part of a PHP-based installer for a PHP app (I hope this makes sense).
So this is what I came up with:
'/^((?!-)[a-z0-9-]{1,63}(?<!-)(\.){0,1})+([a-z]{0,9})(?<!\.)((;port=){1}[0-9]{2,6}){0,1}$/i'
It is important that the string does not start or end with hyphens or contain whitespace.
Here is something more in depth https://gist.github.com/CrandellWS/bc0cbcbb1df5c4b4361a
and a link to the overall project https://github.com/CrandellWS/ams
Can this expression be shorter or optimized in order to help prevent end-user errors?
More importantly, as regular expressions are not my strongest point, please point out any gotchas that can be prevented, and explain how and why.
For My reference these 2 sites have been immensely helpful in figuring out Regular Expressions http://www.regexr.com/ and http://txt2re.com/
If you only want to check whether it is valid (without caring about match groups):
^[^-][a-z0-9-]{0,63}[^-](\.[a-z]{0,9})*(;port=[0-9]{2,6})?$
If you are not so exact you could test:
^[^-][a-z0-9-]*[^-](\.[a-z]+)*(;port=[0-9]+)?$
or
^[^-][\w-]*[^-](\.\w+)*(;port=\d+)?$
But essentially, every time you shrink it you lose accuracy.
Update 1:
[\w\d-]* vs [A-Za-z0-9-]{1,63} here the length of the string will not be checked
? vs {0,1} is equivalent (just shorter)
\d vs [0-9] is equivalent (just shorter)
\w vs [A-Za-z0-9_] is equivalent (just shorter)
And no negative lookbehinds (?<! ...); they make everything a bit complicated.
Lost accuracy: some entries that shouldn't be valid are now accepted, since the length checks are missing and the underscore is now allowed (it was not before).
Update 2:
To also reject whitespace in the first and last characters, just use this:
^[^\s-][\w-]*[^\s-](\.\w+)*(;port=\d+)?$
[^\s-] ... excludes only spaces or hyphens, any other character is allowed (even a dot)
But to get closer to your expression (without lookbehind)
^\w[\w-]*\w(\.\w+)*(;port=\d+)?$
and to disallow underscores (it is a bit longer):
^[a-z0-9][a-z0-9-]*[a-z0-9](\.[a-z0-9-]+)*(;port=\d+)?$
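A quick way to see what that last pattern accepts is to run it over a handful of inputs. This is only a sketch: the sample hosts are made up, and the i flag is added to mirror the case-insensitive pattern in the question.

```javascript
// The final pattern from above, with the i flag for case-insensitivity.
const hostPattern = /^[a-z0-9][a-z0-9-]*[a-z0-9](\.[a-z0-9-]+)*(;port=\d+)?$/i;

// Hypothetical sample inputs.
const samples = [
  "localhost",
  "localhost;port=3307",
  "db.example.com;port=3307",
  "-bad.example.com",
  "local host",
  "db.-example.com",
];
for (const s of samples) {
  console.log(JSON.stringify(s), hostPattern.test(s));
}
```

Two gotchas show up here: only the first label is checked at its edges, so a hyphen right after a dot (db.-example.com) still passes, and a one-character host fails because the first label needs at least two characters.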
I suggest making it stricter, like this:
example
It doesn't consider unix_socket, and it's not short, but it is simple to understand. You can try to make it more precise.
UPDATED
Try also this example
It may surprise you, but the parameters in a DSN can appear in any order.
This is to be used for a filter of a HTML form as part of a Php based installation of a Php app
For the life of me, I can't understand why you would torture users by asking them to write a DSN-like string (which they likely have no idea about) and then torture yourself verifying it, instead of just asking for separate host and (optional) port fields, like any installation script in the world does.
I suggest you familiarize yourself with some existing installation scripts before starting on your own. The one from WordPress will do.
It just occurred to me that maybe you need help with PHP conditionals. Here you go:
if (isset($_POST['dbhost'])) {
    // Fall back to the default MySQL port when none was submitted.
    if (!empty($_POST['dbport'])) {
        $DB_PORT = $_POST['dbport'];
    } else {
        $DB_PORT = 3306;
    }
    $DB_HOST = $_POST['dbhost'];
    $DB_DATABASE = $_POST['dbname'];
    $DB_USERNAME = $_POST['dbuser'];
    $DB_PASSWORD = $_POST['dbpass'];
    // Double quotes are required so that the variables are interpolated.
    $DB_DSN = "mysql:host=$DB_HOST;port=$DB_PORT;dbname=$DB_DATABASE";
}
This simple code will solve all your problems without Regular Expressions you don't like. I hope that your dislikes do not extend to simple conditionals though.

Differences in backslashing between Notepad++ and PHP

EDIT: I found a solution I didn't expect. See below.
Using regex via PHP's preg_match_all , I want to match a certain url (EDIT: that is already escaped) in a string formatted as json. The search works wonderfully in Notepad++ (using regex-matching, of course) but preg_match_all() just returns an empty array.
Testing on tryphpregex.com I found out that somehow my usual approach to escaping a backslash gives a pattern error, i.e. even the simple pattern https:\\ returns an empty result.
I'm utterly confused and have been trying to debug for too long so I may miss the obvious. Maybe one of you can see the simple error?
The string.
The pattern (that works fine in Notepad++, but not in PHP):
%(https:\\/\\/play.spotify.com\\/track\\/)(.*?)(\")%
You don't need to escape the slash in PHP, because your delimiter is % rather than /: %(https://play.spotify.com/track/)(.*?)(\")%
The backslash before the double quote is only needed if your enclosing quotes are double quotes too.
Found a solution to my problem.
According to this site, I need to write \\\\ in the pattern to match each literal backslash (one level of escaping for the PHP string, one for the regex engine). Horrible, but true.
So my pattern becomes:
$pattern = "%(https:\\\\/\\\\/play\.spotify\.com\\\\/track\\\\/)(.*?)(\")%";
Please note that I was trying to find a pattern inside a string that didn't contain plain URLs, but URLs containing escape characters (it was JSON output from Spotify).
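The same double layer of escaping can be reproduced in JavaScript when the pattern is built from a string. This is a sketch only: the JSON snippet and the track id abc123 are made up, since the original string is not shown.

```javascript
// Raw JSON text in which every "/" is escaped as "\/" (hypothetical sample).
const json = String.raw`{"url":"https:\/\/play.spotify.com\/track\/abc123"}`;

// In a string pattern, "\\\\" becomes \\ for the regex engine,
// which then matches exactly one literal backslash.
const escapedUrl = new RegExp(
  "(https:\\\\/\\\\/play\\.spotify\\.com\\\\/track\\\\/)(.*?)(\")"
);

// With only two backslashes the engine sees \/ and matches a plain "/",
// so this pattern finds nothing in the escaped JSON text.
const underEscaped = new RegExp("(https:\\/\\/play\\.spotify\\.com\\/track\\/)(.*?)(\")");

const match = json.match(escapedUrl);
console.log(match[2]);                 // abc123
console.log(underEscaped.test(json));  // false
```

Decoding the JSON first (JSON.parse in JavaScript, json_decode in PHP) removes the \/ escapes, after which a pattern with plain slashes is enough.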

RegEx to find a PHP RegEx string

I want to match a PHP regex string.
From what I know, they are always in the format (correct me if I am wrong):
/ One opening forward slash
the expression Any regular expression
/ One closing forward slash
[imsxe] Any number of the modifiers NOT REPEATING
My expression for this was:
^/.+/[imsxe]{0,5}$
Written as a PHP string, (with the open/close forward slash and escaped inner forward slashes) it is this:
$regex = '/^\/.+\/[imsxe]{0,5}$/';
which is:
^ From the beginning
/ Literal forward slash
.+ Any character, one or more
/ Literal forward slash
[imsxe]{0,5} Any of the chars i,m,s,x,e, 0-5 times (only 5 to choose from)
$ Until the end
This works, however it allows repeating modifiers, i.e:
This: ^/.+/[imsxe]{0,5}$
Allows this: '/blah/ii'
Allows this: '/blah/eee'
Allows this: '/blah/eise'
etc...
When it should not.
I personally use RegexPal to test, because it's free and simple.
If (in order to help me) you would like to do the same, click the link above (or visit http://regexpal.com), paste my expression in the top text box
^/.+/[imsxe]{0,5}$
Then paste my tests in the bottom textbox
/^[0-9]+$/i
/^[0-9]+$/m
/^[0-9]+$/s
/^[0-9]+$/x
/^[0-9]+$/e
/^[0-9]+$/ii
/^[0-9]+$/mm
/^[0-9]+$/ss
/^[0-9]+$/xx
/^[0-9]+$/ee
/^[0-9]+$/iei
/^[0-9]+$/mim
/^[0-9]+$/sis
/^[0-9]+$/xix
/^[0-9]+$/eie
ensure you click the second checkbox at the top where it says '^$ match at line breaks (m)' to enable the multi-line testing.
Thanks for the help
Edit
After reading comments about Regex often having different delimiters i.e
/[0-9]+/ == #[0-9]+#
This is not a problem and can be factored in to my regex solution.
All I really need to know is how to prevent duplicate characters!
Edit
This bit isn't so important but it provides context
The need for such a feature is simple...
I'm using jQuery UI MultiSelect Widget written by Eric Hynds.
Simple demo found here
Now In my application, I'm extending the plugin so that certain options popup a little menu on the right when hovered. The menu that pops up can be ANY html element.
I wanted multiple options to be able to show the same element. So my API works like this:
$('#select_element_id')
// Erics MultiSelect API
.multiselect({
// MultiSelect options
})
// My API
.multiselect_side_pane({
menus: [
{
// This means, when an option with value 'MENU_1' is hovered,
// the element '#my_menu_1' will be shown. This makes attaching
// menus to options REALLY SIMPLE
menu_element: $('#my_menu_1'),
target: ['MENU_1']
},
// However, lets say we have option value 'USER_ID_132', I need the
// target name to be dynamic. What better way to be dynamic than regex?
{
menu_element: $('#user_details_box'),
targets: ['USER_FORM', '/^USER_ID_[0-9]+$/'],
onOpen: function(target)
{
// here the TARGET can be interrogated, and the correct
// user info can be displayed
// Target will be 'USER_FORM' or 'USER_ID_3' or 'USER_ID_234'
// so if it is USER_FORM I can clear the form ready for input,
// and if the target starts with 'USER_ID_', I can parse out
// the user id, and display the correct user info!
}
}
]
});
So as you can see, the whole reason I need to know whether a string is a regex is so that, in the widget code, I can decide whether to treat the TARGET as a string (i.e. 'USER_FORM') or as an expression (i.e. '/^USER_ID_[0-9]+$/' for 'USER_ID_234').
Unfortunately, the regexp string can be "anything". The forward slashes you talk about can be a lot of other characters, i.e. a hash (#) will also work.
Secondly, matching up to 5 characters without repeating any of them could probably be done with lookahead / lookbehind etc., but that creates such a complex regexp that it's faster to post-process the match.
It is possibly faster to search for the regular expression functions (preg_match, preg_replace etc.) in code to be able to deduce where regular expressions are used.
$var = '#placeholder#';
Is a valid regular expression in PHP, but doesn't have to be one, where:
const ESCAPECHAR = '#';
$var = 'text';
$regexp = ESCAPECHAR . $var . ESCAPECHAR;
Is also valid, but might not be seen as such.
In order to prevent duplicates in the modifier section, I'd use a negative lookahead with a backreference, so that no modifier can be followed later by the same character:
^/.+/(?:([imsxe])(?!.*\1))*$
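For the widget code, one way to reject repeated modifiers is to capture each modifier and forbid the same character from appearing again later. A minimal sketch, assuming / is the only delimiter; the name looksLikeRegex is hypothetical, and it only checks the shape of the string, not whether JavaScript supports each modifier:

```javascript
// Each modifier is captured in group 1; the negative lookahead
// rejects the match if that same character occurs again later.
const looksLikeRegex = /^\/.+\/(?:([imsxe])(?!.*\1))*$/;

// Targets taken from the question's own examples.
const targets = [
  "USER_FORM",
  "/^USER_ID_[0-9]+$/",
  "/^[0-9]+$/i",
  "/blah/im",
  "/^[0-9]+$/ii",
  "/blah/eise",
];
for (const t of targets) {
  console.log(t, looksLikeRegex.test(t));
}
```

The first entry is a plain string and the last two repeat a modifier, so those three are rejected; the other three are accepted.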

Regex for a Function Call with Multiple Optional Parameters

I'm looking for a regex that will scan a document to match a function call, and return the value of the first parameter (a string literal) only.
The function call could look like any of the following:
MyFunction("MyStringArg");
MyFunction("MyStringArg", true);
MyFunction("MyStringArg", true, true);
I'm currently using:
$pattern = '/Use\s*\(\s*"(.*?)\"\s*\)\s*;/';
This pattern will only match the first form, however.
Thanks in advance for your help!
Update
I was able to solve my problem with:
$pattern = '/Use\s*\(\s*"(.*?)\"/';
Thanks Justin!
~Scott
If you only care about the value of the first parameter, you can just chop off the end of the regex:
$pattern = '/Use\s*\(\s*"(.*?)\"/';
However, you should understand that this (or any pure-regex solution for this problem) will not be perfect, and there will be some possible cases it handles incorrectly. In this case, you'll get false positives, and escaped quotes (\") will break it.
You can handle escaped quotes by complicating it a bit:
$pattern = '/Use\s*\(\s*"((?:[^"\\\\]|\\\\.)*)"/';
This treats a backslash together with the character after it as a single unit, so " characters inside the quoted string are skipped if they have an odd number of backslashes in front of them.
However, the false-positives issue can't be helped without introducing false negatives, and vice versa. This is because PHP is not a regular language, so it can't be parsed with "pure" regex, and even modern regex engines that allow recursion are going to need some pretty complex code to do a really thorough job at this.
All I'm saying is, if you're planning a one-off job to quickly scrape through some PHP you wrote yourself, regex is probably fine. If you're looking for something robust and open-ended that will do this on arbitrary PHP code, you need some kind of reflection or PHP parser.
This might be slightly simpler, though will only work if you have double quotes and not single quotes:
$pattern = '/Use\s*[^"]*"([^"]*)"/';
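Here is the same idea as a runnable sketch in JavaScript, using the function name MyFunction from the question's examples (the patterns above use Use). The sample source text is made up, and the pattern treats a backslash plus the following character as one unit so that escaped quotes stay inside the captured string:

```javascript
// Capture the first double-quoted argument of each MyFunction(...) call.
// [^"\\] matches any ordinary character, \\. matches an escaped one.
const firstArg = /MyFunction\s*\(\s*"((?:[^"\\]|\\.)*)"/g;

// Hypothetical document to scan.
const source = [
  'MyFunction("MyStringArg");',
  'MyFunction("Second", true);',
  'MyFunction("with \\"quotes\\"", true, true);',
].join("\n");

// matchAll needs the g flag and yields one match per call site.
const args = Array.from(source.matchAll(firstArg), (m) => m[1]);
console.log(args);
```

As noted above, this is only good enough for quick scraping: calls inside comments or inside other string literals will still be picked up.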
