codeigniter disallowed characters error - php

If I try to access this URL, http://localhost/common/news/33/+%E0%B0%95%E0%B1%87%E0%B0%B8.html, it shows "An Error Was Encountered: The URI you submitted has disallowed characters." I have set $config['permitted_uri_chars'] = 'a-z 0-9~%.:??_=+-?'; What should I do?

Yeah, if you want to allow non-ASCII bytes you would have to add them to permitted_uri_chars. This feature operates on URL-decoded strings (normally, unless there is something unusual about the environment), so you have to put the verbatim bytes you want in the string and not merely % and the hex digits. (Yes, I said bytes: _filter_uri doesn't use Unicode regex, so you can't use a Unicode range.)
Trying to filter incoming values (instead of encoding outgoing ones) is a ludicrously basic error that it is depressing to find in a popular framework. You can turn this misguided feature off by setting permitted_uri_chars to an empty string, or maybe you would like a range of all bytes except for control codes ("\x20-\xFF"). Unfortunately the _filter_uri function still does crazy, crazy, broken things with some input, HTML-encoding some punctuation on the way in for some unknown bizarre reason. And you don't get to turn this off.
This, along with the broken “anti-XSS” mangler, makes me believe the CodeIgniter team have quite a poor understanding of how string escaping and security issues actually work. I would not trust anything they say on security ever.
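To illustrate the point about bytes versus % and hex digits, here is a minimal sketch. The function uri_segment_allowed() is a hypothetical stand-in written for this example; it is not CodeIgniter's _filter_uri, and it only assumes that the whitelist is dropped into a byte-wise character class and applied to the URL-decoded segment.

```php
<?php
// Hypothetical stand-in for a byte-wise whitelist check; NOT CodeIgniter's _filter_uri.
function uri_segment_allowed(string $decodedSegment, string $permittedChars): bool
{
    if ($permittedChars === '') {
        return true; // empty whitelist: filtering is switched off
    }
    // No /u modifier, so the character class is matched byte by byte.
    return preg_match('/^[' . $permittedChars . ']+$/i', $decodedSegment) === 1;
}

// The last segment from the question, URL-decoded into raw UTF-8 bytes.
$segment = rawurldecode('+%E0%B0%95%E0%B1%87%E0%B0%B8.html');

var_dump(uri_segment_allowed($segment, 'a-z 0-9~%.:??_=+-?')); // bool(false)
var_dump(uri_segment_allowed($segment, "\x20-\xFF"));          // bool(true)
var_dump(uri_segment_allowed($segment, ''));                   // bool(true)
```

The asker's whitelist fails because the decoded segment contains bytes above 0x7F, and listing % does not help once the string has been decoded.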

What to do?
Stop using Unicode characters in a URL, for the same reasons that you shouldn't name files on a filesystem with Unicode characters.
But, if you really need it, I'll copy/paste some lines from the config:
Leave blank to allow all characters -- but only if you are insane.

I would NOT suggest trying to decode them or use any other tricks; instead, I would suggest using the urlencode() and urldecode() functions.
Since I don't have a copy of your code, I can't add examples; if you could provide some, I can show you how to do it.
However, they are pretty straightforward to use, and they are built into PHP 4 and PHP 5.
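As a generic illustration that does not depend on the asker's code, the sketch below round-trips the encoded segment from the question. The link layout is only assumed from the URL in the question.

```php
<?php
// The encoded segment from the question; rawurldecode() yields the raw UTF-8 bytes.
$encoded = '%E0%B0%95%E0%B1%87%E0%B0%B8';
$decoded = rawurldecode($encoded);

echo strlen($decoded), "\n";                   // 9 (three 3-byte characters)
var_dump(rawurlencode($decoded) === $encoded); // bool(true)

// urlencode() and rawurlencode() differ in how they treat a space.
echo urlencode('a b'), "\n";    // a+b   (form/query-string style)
echo rawurlencode('a b'), "\n"; // a%20b (safe in path segments)

// Assumed link layout: encode each dynamic path segment on the way out.
$url = 'http://localhost/common/news/33/' . rawurlencode($decoded) . '.html';
echo $url, "\n";
```

For path segments rawurlencode() is usually the safer choice, since a literal + in a path is not treated as a space.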

I had a similar problem and wanted to share the solution. It was for a password reset, and I had to send the username and the time in the link, as the URL is only active for an hour. CodeIgniter will not accept certain characters in the URL for security reasons, and I did not want to change that. So here is what I did:
1. Concatenate the user name, '__' and time() into a variable $str.
2. Encrypt $str using MCRYPT_BLOWFISH; the result may contain '/' and '+'.
3. Hex-encode the encrypted string using str2hex (got it from here).
4. Put the encoded string as the 3rd argument in the link sent by email, like http://xyz.com/users/resetpassword/3123213213ABCDEF238746238469898. You can see that the URL contains only 0-9 and A-Z.
5. When the link from the email is clicked, get the 3rd URI segment, use hex2str() to decode it back to the Blowfish-encrypted string, and then apply Blowfish decryption to get the original string.
6. Split on '__' to get the user name and time.
I know it has been almost a year since this question was asked, but I am hoping that someone who comes here from Google will find this solution helpful.
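The hex-encoding and splitting steps above can be sketched as follows. This is an assumption-laden sketch: PHP's built-in bin2hex() and hex2bin() stand in for the str2hex()/hex2str() helpers, the function names are hypothetical, and the encryption step is left out entirely (the mcrypt extension was removed in PHP 7.2), so the ciphertext here is just placeholder bytes.

```php
<?php
// Stand-in for str2hex(): any binary string becomes 0-9/A-F only.
function to_url_segment(string $binary): string
{
    return strtoupper(bin2hex($binary));
}

// Stand-in for hex2str(): returns null when the segment is not valid hex.
function from_url_segment(string $hex): ?string
{
    if (preg_match('/^(?:[0-9A-Fa-f]{2})+\z/', $hex) !== 1) {
        return null;
    }
    return hex2bin($hex);
}

// Split "username__timestamp" on the LAST '__', so user names containing '__' survive.
function split_token(string $plain): ?array
{
    $pos = strrpos($plain, '__');
    if ($pos === false) {
        return null;
    }
    return [substr($plain, 0, $pos), (int) substr($plain, $pos + 2)];
}

// Placeholder bytes standing in for the encrypted string (contains '/' and '+').
$cipher  = "/+\x00\xFF";
$segment = to_url_segment($cipher);

echo $segment, "\n";                                // 2F2B00FF
var_dump(from_url_segment($segment) === $cipher);   // bool(true)
var_dump(split_token('sarah__1700000000'));         // ['sarah', 1700000000]
```

Validating the segment before calling hex2bin() avoids a warning when someone tampers with the link.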

Related

What unicode character groups should we limit the user to, to create Beautiful URLs?

I recently started looking at adding untrusted usernames in prettied urls, eg:
mysite.com/
mysite.com/user/sarah
mysite.com/user/sarah/article/my-home-in-brugge
mysite.com/user/sarah/settings
etc..
Note the username 'sarah' and the article name 'my-home-in-brugge'.
What I would like to achieve, is that someone could just copy-paste the following url somewhere:
(1)
mysite.com/user/Björk Guðmundsdóttir/articles
mysite.com/user/毛泽东/posts
...and it would just be very clear, before clicking on the link, what to expect to see. The following two exact same urls, where the usernames have been encoded using PHP rawurlencode() (considered the proper way of doing this):
(2)
mysite.com/user/Bj%C3%B6rk%20Gu%C3%B0mundsd%C3%B3ttir/articles
mysite.com/user/%E6%AF%9B%E6%B3%BD%E4%B8%9C/posts
...are a lot less clear.
There are three ways to securely (to some level of guarantee) pass an untrusted name containing readable utf8 characters into a url path as a directory:
A. You reparse the string into allowable characters whilst still keeping it uniquely associated in your database to that user, eg:
(3)
mysite.com/user/bjork-guomundsdottir/articles
mysite.com/user/mao-ze-dong12/posts
B. You limit the user's input at string creation time to acceptable characters for url passing (you ask eg. for alphanumeric characters only):
(4)
mysite.com/user/bjorkguomundsdottir/articles
mysite.com/user/maozedong12/posts
using e.g. a regex check (for simplicity's sake)
if(!preg_match('/^[\p{L}\p{N}\p{P}\p{Zs}\p{Sm}\p{Sc}]+$/u', trim($sUserInput))) {
//...
}
C. You escape them in full using PHP rawurlencode(), and get the ugly output as in (2).
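For option B, a more conservative variant of the check might look like the sketch below. The function name and the exact rule (Unicode letters, marks and numbers, with single inner spaces, hyphens or underscores) are assumptions made for illustration; this is not an authoritative list of what browsers handle.

```php
<?php
// Hypothetical validator: Unicode letters/numbers (plus combining marks),
// optionally joined by single spaces, hyphens or underscores.
// Rejects ?, #, / and \ as well as leading or trailing separators.
function is_url_friendly_name(string $name): bool
{
    $word    = '[\p{L}\p{N}][\p{L}\p{M}\p{N}]*';
    $pattern = '/^' . $word . '(?:[ _-]' . $word . ')*\z/u';
    // With /u, invalid UTF-8 input makes preg_match() return false, so it is rejected too.
    return preg_match($pattern, $name) === 1;
}

var_dump(is_url_friendly_name("Bj\u{F6}rk Gu\u{F0}mundsd\u{F3}ttir")); // bool(true)
var_dump(is_url_friendly_name("\u{6BDB}\u{6CFD}\u{4E1C}"));            // bool(true)
var_dump(is_url_friendly_name('sarah/settings'));                      // bool(false)
var_dump(is_url_friendly_name('sarah '));                              // bool(false)
```

Even with such a check at creation time, passing the name through rawurlencode() when building the link costs nothing and keeps the URL valid.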
Question:
I want to focus on B, and push this as far as is possible within KNOWN errors/concerns, until we get the beautiful URLs as in (1). I found out that passing many Unicode characters in URLs is possible in modern browsers. Modern browsers automatically convert Unicode characters or non-URL-parseable characters into encoded characters, allowing the user to, e.g., copy-paste the nice-looking Unicode URLs as in (1), and the browser will get the actual final URL right.
For some characters, the browser will not get it right without encoding: e.g. ?, #, / or \ will definitely and clearly break the URL.
So: Which characters in the (non-alphanumeric) ASCII range, and across the entire Unicode spectrum, can we allow at creation time to be injected into a URL without escaping? Or better: Which groups of Unicode characters can we allow? Which characters are definitely always blacklisted? There will be special cases: spaces look fine, except at the end of the string, where they could be mis-selected. Is there a reference out there that shows which browsers interpret which Unicode character ranges correctly?
PS: I am very well aware that using improperly encoded strings in urls will almost never provide a security guarantee. This question is certainly not recommended practice, but I do not see the difference of asking this question, and the done-so-often matter of copy-pasting a url from a website and pasting it into the browser, without thinking it through whether that url was correctly encoded or not (the novice user wouldn't). Has someone looked at this before, and what was their code (regex, conditions, if-statement..) solution?

php encoding url parameters with lowercase letters

I am struggling with encrypting url parameters. I have for example the following urls:
http://www.domain.com/show_user.php?uid=45&s=photos
http://www.domain.com/show_user.php?uid=454&s=information
Now I do not want users to see the plain values of parameters 'uid' and 's' so I encrypted them with base64_encode.
http://www.domain.com/show_user.php?uid=NDU=&s=cGhvdG9z
http://www.domain.com/show_user.php?uid=NDU0&s=aW5mb3JtYXRpb24=
But now I have the problem that I have some capital letters in the URL. In my error log I find errors which are caused by requesting the URL with only lowercase letters:
http://www.domain.com/show_user.php?uid=ndu=&s=cghvdg9z
This leads to an error since the string cannot be decrypted anymore.
This obviously isn't a very smart solution to encrypt parameters in url. What would you suggest? What encrypting methods do you use? Which one only creates lowercase letters?
I already want to thank you very much in advance for any help :)
Best regards,
Freddy
There must be a strtolower() in your code somewhere, addresses don't lowercase themselves. Check the code around where you're generating these encoded strings.
Also - As mentioned in that comment, it's not encryption. Functionally, is this something that actually needs to be encrypted, or just obscured?
If you have only these two parameters, I would recommend writing your own encoding and decoding functions. You can also pack all parameters into a single one and then decode it again using a string split function.
If this information is somehow secret and you do not want to share it, create a table that maps an id to this information, and generate the page from those entries. But your need for encoding is not clear, so please elaborate more.
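A short sketch of why lowercasing breaks these links, and of one lowercase-only alternative. Using bin2hex() here is a swapped-in suggestion, not something proposed in the answers above, and like base64 it only obscures the value; it is not encryption.

```php
<?php
// base64 is case-sensitive: lowercasing the encoded value changes what it decodes to.
$encoded = base64_encode('45');
echo $encoded, "\n";                                    // NDU=
var_dump(base64_decode(strtolower($encoded)) === '45'); // bool(false)

// Hex output is lowercase-only, and hex2bin() accepts either case when decoding.
$hex = bin2hex('photos');
echo $hex, "\n";                                        // 70686f746f73
var_dump(hex2bin($hex) === 'photos');                   // bool(true)
var_dump(hex2bin(strtoupper($hex)) === 'photos');       // bool(true)
```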

Are there any characters that don't work for _GET variables in a browser?

This is a little out of the blue and it's mostly curiosity. I hope it's not a waste of time and space.
I was writing a little script to validate accounts with a link, so I decided to send an email with a link to the PHP script, and in the link I would put two variables to read from the _GET array: a key and the email. Then I would just search the database for that email and key and change its activated status to true... No problem. Easy enough, even though it may not be very elegant.
I used a script for the generation of the key that I used elsewhere in the site for generating a new password (to reset it for instance) but sometimes it didn't work and after a lot of tries I noticed (and I felt stupid then) that the array my password generation function drew from was this:
'0123456789_!##$%&*()-=+abcdfghjkmnpqrstvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
So naturally I deleted the & character that is used for separating variables in the URL... Then in another try I noticed that the link in the email was not recognized whole and stopped after the '#' character as well, which I then remembered is used for references within an HTML page, so I deleted that as well. In the end I decided to leave only alphanumeric characters to be sure, but I am curious: are there any more characters that are not 'valid' for URLs using _GET, and is there any way to use those characters anyway (maybe URL encoding or something)?
There are plenty of characters that are invalid. Use urlencode to convert them to URL safe encodings. (Always run that function over any data you are inserting into a URL).
You have to use urlencode() before sending the values to $_GET.
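A minimal sketch of that advice, assuming a hypothetical activation link at example.com. http_build_query() applies the same encoding as urlencode() to every value, so characters such as & and # survive the round trip.

```php
<?php
// Hypothetical activation data; the key deliberately contains &, #, + and =.
$params = ['key' => 'a&b#c+d=e', 'email' => 'user@example.com'];

// http_build_query() URL-encodes each name and value before joining them with '&'.
$link = 'http://example.com/activate.php?' . http_build_query($params);
echo $link, "\n";

// On the receiving side PHP decodes automatically into $_GET;
// parse_str() does the same decoding for this demonstration.
parse_str(parse_url($link, PHP_URL_QUERY), $received);
var_dump($received === $params); // bool(true)
```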
You could use urlencode and urldecode, but I would stay away from & # ? since these have a special meaning in URLs.
Also, when it comes to passwords: don't stress about an algorithm, use sha1, crypt or something along those lines with a salt. These algorithms will be much stronger than your homemade ones.

Is it safe to use (strip_tags, stripslashes, trim) to clear variable that holds URLs

It's quite a pleasure to be posting my first question here :-)
I'm running a URL shortening / redirecting service, written in PHP.
I aim to store and handle only valid URL data, as far as possible, within my service.
I noticed that sometimes invalid URL data is being handed over to the database, containing invalid characters (like spaces at the end or beginning of the URL).
I decided to make my URL-check mechanism apply trim, stripslashes and strip_tags to the values before storing them.
As far as I can tell, these functions will not remove valid characters that a URL may have.
Kindly, just correct me or advise me if I'm going into the wrong direction.
Regards..
If you're already trimming the incoming variable, as well as filtering it with the other built in PHP methods, and STILL running into issues, try changing the collation of your table to UTF-8 and see if that helps you get rid of the special characters you mention. (Could you paste a few examples to let us know?)
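As an alternative angle on the question itself: stripslashes() is meant to undo addslashes()-style quoting and strip_tags() is meant to remove HTML tags, so neither is really a URL check. A minimal sketch that trims and then validates instead of mutating might look like this. The helper name normalize_url() and the http/https restriction are assumptions for a redirect service, and note that FILTER_VALIDATE_URL only accepts ASCII URLs.

```php
<?php
// Hypothetical helper: trim surrounding whitespace, then validate rather than mutate.
// Returns the cleaned URL, or null if it should not be stored.
function normalize_url(string $input): ?string
{
    $candidate = trim($input);
    if (filter_var($candidate, FILTER_VALIDATE_URL) === false) {
        return null;
    }
    // Assumption: a redirect service only wants http(s) targets.
    $scheme = strtolower((string) parse_url($candidate, PHP_URL_SCHEME));
    return in_array($scheme, ['http', 'https'], true) ? $candidate : null;
}

var_dump(normalize_url("  http://example.com/path?x=1 \n")); // string "http://example.com/path?x=1"
var_dump(normalize_url('not a url'));                         // NULL
var_dump(normalize_url('ftp://example.com/'));                // NULL
```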

Removing characters from a PHP String

I'm accepting a string from a feed for display on the screen that may or may not include some rubbish I want to filter out. I don't want to filter normal symbols at all.
The values I want to remove look like this: �
It is only this that I want removed. Relevant technology is PHP.
Suggestions appreciated.
This is an encoding problem; you shouldn't try to clean those bogus characters but understand why you're receiving them scrambled.
Try to get your data as Unicode, or agree with your feed provider that you both use the same encoding.
Thanks for the responses, guys. Unfortunately, those submitted had the following problems:
wrong for obvious reasons:
ereg_replace("[^A-Za-z0-9]", "", $string);
This:
s/[\u00FF-\uFFFF]//
which also uses the deprecated ereg form of regex, didn't work either when I converted it to preg, because the range was simply too large for the regex to handle. Also, there are holes in that range that would allow rubbish to seep through.
This suggestion:
This is an encoding problem; you shouldn't try to clean those bogus characters but understand why you're receiving them scrambled.
while valid, is no good because I don't have any control over how the data I receive is encoded. It comes from an external source. Sometimes there's garbage in there and sometimes there is not.
So, the solution I came up with was relatively dirty, but in the absence of something more robust I'm just accepting all standard letters, numbers and symbols and discarding the rest.
This does seem to work for now. The solution is as follows:
$fixT = str_replace("£", "&pound;", $string);
$fixT = str_replace("€", "&euro;", $fixT);
$fixT = preg_replace("/[^a-zA-Z0-9\s\.\/:!\[\]\*\+\-\|\<\>##\$%\^&\(\)_=\';,'\?\\\{\}`~\"]/", "", $fixT);
If anyone has any better ideas I'm still keen to hear them. Cheers.
You are looking for characters that are outside of the range of glyphs that your font can display. You can find the maximum unicode value that your font can display, and then create a regex that will replace anything above that value with an empty string. An example would be
s/[\u00FF-\uFFFF]//
This would strip anything above character 255.
That's going to be difficult for you to do, since you don't have a solid definition of what to filter and what to keep. Typically, characters that show up as empty squares are anything that the typeface you're using doesn't have a glyph for, so the definition of "stuff that shows up like this: �" is horribly inexact.
It would be much better for you to decide exactly what characters are valid (this is always a good approach anyway, with any kind of data cleanup) and discard everything that is not one of those. The PHP filter function is one possibility to do this, depending on the level of complexity and robustness you require.
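A minimal sketch of that whitelist approach, assuming the feed is UTF-8 and that printable ASCII plus £ and € is the set worth keeping. Both the function name and the chosen set are illustrative assumptions.

```php
<?php
// Hypothetical whitelist filter: keep printable ASCII plus the pound and euro signs,
// drop everything else (including U+FFFD, control characters and newlines).
function keep_whitelisted(string $utf8): string
{
    $clean = preg_replace('/[^\x20-\x7E\x{00A3}\x{20AC}]/u', '', $utf8);
    // With /u, preg_replace() returns null when the input is not valid UTF-8.
    return $clean ?? '';
}

// "\u{FFFD}" is the replacement character that shows up as the rubbish glyph.
$input = "Price: \u{00A3}5 \u{FFFD}ok";
var_dump(keep_whitelisted($input) === "Price: \u{00A3}5 ok"); // bool(true)
```

Returning an empty string for invalid UTF-8 is a blunt choice made for brevity; in a real feed handler that case probably deserves its own handling.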
If you can't resolve the issue with the data from the feed and need to filter the information, then this may help:
PHP 5's filter_input is very good for filtering input strings and allows a fair amount of flexibility:
filter_input(input_type, variable, filter, options)
You can also filter all of your form data in one line if it requires the same filtering :)
There are some good examples and more information about it here:
http://www.w3schools.com/PHP/func_filter_input.asp
The PHP site has more information on the options here: Validation Filters
Take a look at this question to get the value of each byte in your string. (This assumes that multibyte overloading is turned off.)
Once you have the bytes, you can use them to determine what these "rubbish" characters actually are. It's possible that they're a result of misinterpreting the encoding of the string, or displaying it in the wrong font, or something else. Post them here and people can help you further.
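A minimal sketch for inspecting the bytes, using two hypothetical helper names. It relies only on bin2hex() and on the fact that preg_match() with the /u modifier fails on invalid UTF-8.

```php
<?php
// Hypothetical helper: show a string as space-separated hex bytes.
function hex_bytes(string $s): string
{
    return implode(' ', str_split(bin2hex($s), 2));
}

// Hypothetical helper: an empty pattern with /u only succeeds on valid UTF-8.
function is_valid_utf8(string $s): bool
{
    return preg_match('//u', $s) === 1;
}

echo hex_bytes("\u{FFFD}"), "\n";      // ef bf bd  (U+FFFD encoded as UTF-8)
echo hex_bytes("\u{00A3}"), "\n";      // c2 a3     (pound sign in UTF-8)
var_dump(is_valid_utf8("\xA3"));       // bool(false): a Latin-1 pound sign is not valid UTF-8
var_dump(is_valid_utf8("\u{00A3}"));   // bool(true)
```

If the dump shows single bytes such as a3 where c2 a3 was expected, the feed is probably in a single-byte encoding rather than UTF-8.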
Try this:
Download a sample from the feed manually.
Open it in Notepad++ or another advanced text editor (KATE on Linux is good for this).
Try changing the encoding and converting from one encoding to another.
If you find a setting that makes the characters display properly, then you'll need to either encode your site in that encoding, or convert it from that encoding to whatever you use on your site.
Hello Friends,
Try this regular expression to remove literal \uXXXX escape sequences from the string:
/\\u[0-9a-fA-F]{4}/
Thanks,
Chintu(prajapati.chintu.001#gmail.com)
