I am trying to build a site with a custom script. I have a problem allowing spaces in the Name input field. If I include a space, it gives an error, but if I enter a dash or an underscore, it is accepted.
my name - I get a "not allowed" error.
my_name - I get a success message.
Code:
if(!preg_match('/^[0-9a-zA-Z\xe0-\xef\x80-\xbf._-]+$/i',$nickname)) {
You need to add \s (whitespace) to the character class in your regex.
I tried this:
[0-9a-zA-Z\xe0-\xef\x80-\xbf\s._-]+
To debug a regex, an online regex tester can help.
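As a minimal sketch of the fix, the space has to sit inside the brackets so that it is allowed anywhere in the name. The function name isValidNickname is hypothetical, and a literal space is used here instead of \s so that tabs and newlines are still rejected:

```php
<?php
// Hypothetical helper: the character class from the question with a
// literal space added inside the brackets.
function isValidNickname(string $nickname): bool
{
    return preg_match('/^[0-9a-zA-Z\xe0-\xef\x80-\xbf ._-]+$/', $nickname) === 1;
}

var_dump(isValidNickname('my name'));  // bool(true)
var_dump(isValidNickname('my_name'));  // bool(true)
var_dump(isValidNickname('my/name'));  // bool(false)
```

The hyphen stays last in the class so it is read as a literal character rather than as a range.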
I need to pass filenames via the url, e.g.:
http://example.com/images/niceplace.jpg
The problem I'm having is when the file name contains a blank character, e.g.:
http://example.com/images/nice place.jpg
or
http://example.com/images/nice%20place.jpg
For these two URLs, CodeIgniter complains about the blank character: "The URI you submitted has disallowed characters."
How should I go about fixing this?
I know I can add the blank character to the permitted_uri_chars in config.php but I'm looking for a better solution as there might be other disallowed characters in a filename.
I figured out a solution.
The URL is generated using rawurlencode().
Then, within the images controller, the filename is decoded using rawurldecode(html_entity_decode($filename)).
I successfully tested this solution with a few special characters I can think of and with UTF-8 characters.
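The round trip described above can be sketched as follows. The helper names buildImageUrl and decodeImageSegment are hypothetical; in CodeIgniter the decoding step would live in the images controller:

```php
<?php
// Hypothetical helper: build the link with rawurlencode() so that spaces,
// ampersands and similar characters become %XX sequences.
function buildImageUrl(string $filename): string
{
    return 'http://example.com/images/' . rawurlencode($filename);
}

// Hypothetical helper: reverse the encoding on the receiving side,
// using the same two calls as the solution above.
function decodeImageSegment(string $segment): string
{
    return rawurldecode(html_entity_decode($segment));
}

echo buildImageUrl('nice place.jpg'), PHP_EOL;
// http://example.com/images/nice%20place.jpg
echo decodeImageSegment('nice%20place%20%26%20co.jpg'), PHP_EOL;
// nice place & co.jpg
```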
You can use this method:
http://php.net/urlencode
Actually, you will run into other issues when a filename contains the & character, and a few others. urlencode() takes care of all of these.
This configuration option exists to keep certain characters out of the URI, and you want to work around it in some cases. I think the most appropriate solutions are:
Pass the file name as a parameter - http://domain.com/images/?image=test.jpg
Remove all non-alphanumeric characters (except perhaps a few others such as dash and underscore) from the file name when you save it. In my opinion this is better, because such characters can cause other problems elsewhere.
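A minimal sketch of the second option, assuming that letters, digits, dot, dash and underscore are the characters worth keeping (sanitizeFilename is a hypothetical name):

```php
<?php
// Hypothetical helper: collapse every run of characters outside the
// allowed set into a single underscore before the file is saved.
function sanitizeFilename(string $filename): string
{
    $clean = preg_replace('/[^A-Za-z0-9._-]+/', '_', $filename);
    return trim($clean, '_');
}

echo sanitizeFilename('nice place & more.jpg'), PHP_EOL; // nice_place_more.jpg
```

Two different uploads can end up with the same sanitized name, so a real application would probably also need a uniqueness check.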
One of the better ways to handle URLs in this situation is to encode your URL parameters with the encryption class, which also helps keep the URL secure:
$encrypted = $this->encrypt->encode($param1);
$decrypted = $this->encrypt->decode($encrypted);
Alternatively if you want special chars to be allowed in the URL then change your config settings in config.php file.
File Location: application/config/config.php
$config['permitted_uri_chars'] = 'a-z 0-9~%.:_\-';
Add every character you want your application to allow to the string on the right-hand side.
I want to match a PHP regex string.
From what I know, they are always in the format (correct me if I am wrong):
/               One opening forward slash
the expression  Any regular expression
/               One closing forward slash
[imsxe]         Any number of the modifiers, NOT REPEATING
My expression for this was:
^/.+/[imsxe]{0,5}$
Written as a PHP string, (with the open/close forward slash and escaped inner forward slashes) it is this:
$regex = '/^\/.+\/[imsxe]{0,5}$/';
which is:
^             From the beginning
/             Literal forward slash
.+            Any character, one or more
/             Literal forward slash
[imsxe]{0,5}  Any of the chars i,m,s,x,e, 0-5 times (only 5 to choose from)
$             Until the end
This works, however it allows repeating modifiers, i.e:
This: ^/.+/[imsxe]{0,5}$
Allows this: '/blah/ii'
Allows this: '/blah/eee'
Allows this: '/blah/eise'
etc...
When it should not.
I personally use RegexPal to test, because it's free and simple.
If (in order to help me) you would like to do the same, visit http://regexpal.com and paste my expression in the top text box
^/.+/[imsxe]{0,5}$
Then paste my tests in the bottom textbox
/^[0-9]+$/i
/^[0-9]+$/m
/^[0-9]+$/s
/^[0-9]+$/x
/^[0-9]+$/e
/^[0-9]+$/ii
/^[0-9]+$/mm
/^[0-9]+$/ss
/^[0-9]+$/xx
/^[0-9]+$/ee
/^[0-9]+$/iei
/^[0-9]+$/mim
/^[0-9]+$/sis
/^[0-9]+$/xix
/^[0-9]+$/eie
Make sure you tick the second checkbox at the top, where it says '^$ match at line breaks (m)', to enable multi-line testing.
Thanks for the help
Edit
After reading comments about regexes often having different delimiters, e.g.
/[0-9]+/ == #[0-9]+#
This is not a problem and can be factored in to my regex solution.
All I really need to know is how to prevent duplicate characters!
Edit
This bit isn't so important but it provides context
The need for such a feature is simple...
I'm using jQuery UI MultiSelect Widget written by Eric Hynds.
Simple demo found here
Now, in my application, I'm extending the plugin so that certain options pop up a little menu on the right when hovered. The menu that pops up can be ANY HTML element.
I wanted multiple options to be able to show the same element. So my API works like this:
$('#select_element_id')
    // Eric's MultiSelect API
    .multiselect({
        // MultiSelect options
    })
    // My API
    .multiselect_side_pane({
        menus: [
            {
                // This means, when an option with value 'MENU_1' is hovered,
                // the element '#my_menu_1' will be shown. This makes attaching
                // menus to options REALLY SIMPLE
                menu_element: $('#my_menu_1'),
                targets: ['MENU_1']
            },
            // However, let's say we have option value 'USER_ID_132', I need the
            // target name to be dynamic. What better way to be dynamic than regex?
            {
                menu_element: $('#user_details_box'),
                targets: ['USER_FORM', '/^USER_ID_[0-9]+$/'],
                onOpen: function(target)
                {
                    // here the TARGET can be interrogated, and the correct
                    // user info can be displayed
                    // Target will be 'USER_FORM' or 'USER_ID_3' or 'USER_ID_234'
                    // so if it is USER_FORM I can clear the form ready for input,
                    // and if the target starts with 'USER_ID_', I can parse out
                    // the user id, and display the correct user info!
                }
            }
        ]
    });
So as you can see, the whole reason I need to know whether a string is a regex is so that, in the widget code, I can decide whether to treat the TARGET as a plain string (i.e. 'USER_FORM') or as an expression (i.e. '/^USER_ID_[0-9]+$/' for 'USER_ID_234').
Unfortunately, the regexp string can be "anything". The forward slashes you mention can be many other characters, e.g. a hash (#) will also work.
Secondly, matching up to 5 characters without repeats could probably be done with lookahead/lookbehind, but it creates such a complex regexp that it's faster to post-process the modifiers.
It may be quicker to search the code for the regular expression functions (preg_match, preg_replace, etc.) to deduce where regular expressions are used.
$var = '#placeholder#';
is a valid regular expression in PHP, but doesn't have to be one, whereas:
const ESCAPECHAR = '#';
$var = 'text';
$regexp = ESCAPECHAR . $var . ESCAPECHAR;
is also valid, but might not be recognised as such.
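The post-processing idea can be sketched roughly like this. The function name looksLikeRegex and the default modifier list are assumptions, and bracket-style delimiters such as (...) or {...} are not handled:

```php
<?php
// Hypothetical helper: take the first character as the delimiter, find its
// last occurrence, and then check the trailing modifiers in plain PHP
// instead of inside one large regular expression.
function looksLikeRegex(string $candidate, string $allowed = 'imsxe'): bool
{
    if (strlen($candidate) < 3) {
        return false;
    }
    $delimiter = $candidate[0];
    // PCRE delimiters may not be alphanumeric, a backslash, or whitespace.
    if (ctype_alnum($delimiter) || $delimiter === '\\' || ctype_space($delimiter)) {
        return false;
    }
    $end = strrpos($candidate, $delimiter);
    if ($end < 2) {
        return false; // no closing delimiter, or an empty expression
    }
    $modifiers = (string) substr($candidate, $end + 1);
    if ($modifiers === '') {
        return true;
    }
    // Only allowed modifiers may appear...
    if (strspn($modifiers, $allowed) !== strlen($modifiers)) {
        return false;
    }
    // ...and none of them may appear twice.
    $chars = str_split($modifiers);
    return count($chars) === count(array_unique($chars));
}

var_dump(looksLikeRegex('/^[0-9]+$/i'));   // bool(true)
var_dump(looksLikeRegex('#[0-9]+#'));      // bool(true)
var_dump(looksLikeRegex('/^[0-9]+$/ii'));  // bool(false)
var_dump(looksLikeRegex('USER_FORM'));     // bool(false)
```

This only checks the shape of the string. It does not prove that the expression between the delimiters compiles.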
To prevent duplicates in the modifier section, I'd use one negative lookahead per modifier:
^/.+/(?![^i]*i[^i]*i)(?![^m]*m[^m]*m)(?![^s]*s[^s]*s)(?![^x]*x[^x]*x)(?![^e]*e[^e]*e)[imsxe]*$
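As a hedged sketch, a pattern that rejects repeated modifiers with one negative lookahead per modifier can be checked in PHP as below. The constant and function names are illustrative, only the / delimiter is handled, and ~ is used as the outer PHP delimiter so that the inner slashes need no escaping:

```php
<?php
// Assumed pattern: after the closing slash, each (?!...) fails the match
// if its modifier occurs a second time in the remaining text.
const REGEX_STRING_PATTERN =
    '~^/.+/(?![^i]*i[^i]*i)(?![^m]*m[^m]*m)(?![^s]*s[^s]*s)'
    . '(?![^x]*x[^x]*x)(?![^e]*e[^e]*e)[imsxe]*$~';

// Hypothetical helper wrapping the pattern.
function isRegexString(string $candidate): bool
{
    return preg_match(REGEX_STRING_PATTERN, $candidate) === 1;
}

var_dump(isRegexString('/^[0-9]+$/i'));   // bool(true)
var_dump(isRegexString('/^[0-9]+$/im'));  // bool(true)
var_dump(isRegexString('/^[0-9]+$/ii'));  // bool(false)
var_dump(isRegexString('/blah/eise'));    // bool(false)
```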
I use the following regex to validate a username (input type text in a registration form) in order to make sure that the username contains ONLY alphanumeric characters, dot, dash or underscore.
if (!preg_match('/^[a-zA-Z0-9\.\_-]+$/',$my_name)) { echo 'no_valid'; }
When I type, for instance, % or # in the text field, I correctly get back the error message that it's not a valid username, and the valid characters (.-_) are accepted. So it seems to work fine until I type & or +; after that I can type any invalid character that the preg_match should have excluded.
Could anyone tell me why is this happening and how can I overcome this issue?
The problem is somewhere else. Your expression is correct; I tested it with PHP. Since it happens with the '&' character, my guess is that your data is not converted to URL-safe characters before it is sent. Try using encodeURIComponent() in JS (encodeURI() leaves & and + unescaped).
if (!preg_match('/^[a-zA-Z0-9\.\_-]+$/',urldecode($my_name))) { echo 'no_valid'; }
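To illustrate why an unencoded & can slip past the check, here is a small sketch that uses parse_str() to stand in for PHP's own query-string parsing. rawurlencode() plays the role that the client-side encoding would play, and the sample value is made up:

```php
<?php
$input   = 'ab&cd#';                 // made-up username typed by the user
$pattern = '/^[a-zA-Z0-9._-]+$/';

// Sent without encoding: the & ends the my_name value early.
parse_str('my_name=' . $input, $unencoded);
// Sent with encoding: the whole value arrives intact.
parse_str('my_name=' . rawurlencode($input), $encoded);

var_dump($unencoded['my_name']);                        // string(2) "ab"
var_dump(preg_match($pattern, $unencoded['my_name']));  // int(1), truncated value looks valid
var_dump(preg_match($pattern, $encoded['my_name']));    // int(0), full value is rejected
```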
I'm trying to make an expression that will search through a page like how2bypass.co.cc and return the contents of the "action" attribute in the "form" tag, and the contents of the "name" and "type" attributes in any input tags. I can't use an html parser because my ultimate goal is to automatically detect if a given page is a web proxy, and once sites catch on that I'm doing that they're probably going to start doing silly things like writing the entire document with javascript to stop me from parsing it.
I'm using the code
preg_match_all('/<form.*action\="(.*?)".*>[^<]*<input.*type\=/i', $pageContents, $inputMatches);
which works fine for the action attribute, but once I put a " after type\= the code stops working. Why is this? It works fine once, but not twice?
Regular expression quantifiers are greedy by default...
If you inspect the page source, the following is probably matching the first <input with the last type=, and capturing everything in between.
`<input.*type\=`
You're not going to be able to capture the form and all inputs with your current expression because not every input is prefixed with the form markup. You need to approach it one of the following ways:
Capture the entire form markup, <form>...</form>, and then a regex to match all the inputs in the capture
Adjust your current expression to be non-greedy, .*?, and allow for multiple captures of input markup.
Without seeing the target page that you want to extract from, there are only a few things to guess:
The type= attribute might not have double quotes, as type=text is valid too. Or it might have single quotes instead, or some whitespace around the =.
The .* placeholders might fail if there are newlines between or within the tags. Using the /s regex flag is advisable.
And it's usually more reliable to use negated character classes like [^<>]* or [^"] instead of .* anyway.
You don't need to escape the \= equal sign.
And maybe you should split it up. Use one regex to extract the <form>..</form> block. And then search for the <input> tags within.
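The two-step approach suggested above could look roughly like the sketch below. The function names and the sample markup are assumptions, quoted attribute values containing spaces are cut at the first space, and attributes such as data-name would be mistaken for name:

```php
<?php
// Hypothetical helper: step 1 captures the form's action and body,
// step 2 scans that body for <input> tags.
function extractFormFields(string $html): ?array
{
    // /s lets . cross newlines; [^>]* keeps each match inside one tag.
    $formPattern = '/<form\b[^>]*\baction\s*=\s*["\']?([^"\'\s>]+)[^>]*>(.*?)<\/form>/is';
    if (preg_match($formPattern, $html, $form) !== 1) {
        return null;
    }
    preg_match_all('/<input\b[^>]*>/i', $form[2], $tags);
    $inputs = [];
    foreach ($tags[0] as $tag) {
        $inputs[] = [
            'name' => attributeValue($tag, 'name'),
            'type' => attributeValue($tag, 'type'),
        ];
    }
    return ['action' => $form[1], 'inputs' => $inputs];
}

// Hypothetical helper: reads a double-quoted, single-quoted or unquoted value.
function attributeValue(string $tag, string $attribute): ?string
{
    $pattern = '/\b' . preg_quote($attribute, '/') . '\s*=\s*["\']?([^"\'\s>]+)/i';
    return preg_match($pattern, $tag, $m) === 1 ? $m[1] : null;
}

$sample = "<form method=\"post\"\n action=\"/browse.php\">\n"
        . "<input type=\"text\" name=\"u\">\n"
        . "<input name='go' type=submit value=\"Go\">\n</form>";
print_r(extractFormFields($sample));
```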
I'm working on an application (using PHP and JavaScript). Below is a short description of my problem.
There are two forms available in my application, SourceFrm and targetFrm.
I take input on the first form, SourceFrm, and do the processing on targetFrm.
Below is the input I take from SourceFrm:
1) Enter your data (the id of this input box is 'inputdata'):
2) Enter id (the id of this input box is 'id'):
Based on the input entered by the user, I post this data to targetFrm for further processing.
On targetFrm:
I simply assign the inputdata value to a PHP variable.
The spaces between words are getting lost (more than one space is converted to a single space).
e.g.
The user entered the data below in the input box and submitted it:
inputdata:
This is my     test.
Note that the user entered 5 spaces between the words 'my' and 'test'.
After assigning this input data to a PHP variable, I printed the value.
This is the content I get:
Output:
This is my test.
More than one space is converted to a single space. I checked this behaviour in all browsers: FF, MSIE 7/8, Opera, Safari, Chrome.
If I use '<pre>' before printing the PHP variable, i.e.:
print "<pre>";
print $inputdata;
then the spaces are not lost (I get the exact content).
My question is how to preserve the exact content without using '<pre>'.
I use encoding/decoding (htmlentities() and html_entity_decode()) in my further data processing, so it may create a conflict if I replace spaces with &nbsp;.
If anyone has any ideas or suggestions, please share them.
-Thanks
When you output your variables to HTML, they are parsed as HTML. Any run of consecutive white space is collapsed to one space.
A simple fix would be to replace all spaces with the HTML entity &nbsp; to force browsers to display each space.
I wouldn't store the string with all the &nbsp; entities in the database, but when you display it, the &nbsp; would ensure that each space is seen.
EDIT
I mean only replace spaces on render, like:
print str_replace(' ', '&nbsp;', $inputdata);
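If the value may also contain characters such as < or &, one possible variant is to escape it first and substitute the spaces afterwards, so that the & in &nbsp; is not escaped itself. The function name renderWithSpaces is hypothetical:

```php
<?php
// Hypothetical helper: escape the text for HTML, then turn each space
// into a non-breaking space at render time only.
function renderWithSpaces(string $text): string
{
    $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    return str_replace(' ', '&nbsp;', $escaped);
}

echo renderWithSpaces('This is my     test.'), PHP_EOL;
```

The stored value stays untouched, so any later htmlentities()/html_entity_decode() processing still works on the original string.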
HTML renders a run of spaces as a single space. I'm not really sure why, but if you check the source code of the rendered web page containing your string, you'll see that it contains all the spaces; the browser just doesn't show them.
The same goes for other whitespace characters, such as tabs.
How to deal with it depends on the type of your content. You can replace spaces with &nbsp;, leave it as it is, or do something completely different, e.g. strip runs of spaces down to one space.
It really depends on the nature of your data. The only real situation that comes to my mind where you would need more than one space is indenting things with spaces, which actually isn't that great an idea.
Edit: older resource:
http://www.sightspecific.com/~mosh/WWW_FAQ/nbsp.html