.htaccess rewrite rule won't allow ' and @ - php

I have a rewrite rule that rewrites domain.co.uk/member.php?x=$member to domain.co.uk/$member
It looks like this:
RewriteEngine On
RewriteRule ^([a-zA-Z0-9_-]+)$ member.php?x=$1
RewriteRule ^([a-zA-Z0-9_-]+)/$ member.php?x=$1
I've tried simply adding ' and @ inside the square brackets, but then I get a 500 Internal Server Error. I need these characters for people's usernames.
How do I do this?

@ is used to separate the user and password from the host in a URI string like this:
http://user:passw...@host/path.
You need to URL-encode it as %40.
Your path will then be something like /user%40foo.com.
This should work.
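As a rough illustration of the encoding step (sketched in Python only to show the resulting strings; the usernames are made up and the helper name is hypothetical), the standard library's urllib.parse.quote turns @ into %40 and ' into %27:

```python
from urllib.parse import quote, unquote


def encode_username(username):
    """Percent-encode a username so it can sit in a single URL path segment."""
    # safe="" leaves only letters, digits and "_.-~" unencoded.
    return quote(username, safe="")


# Hypothetical usernames, used only for illustration.
encoded_email = encode_username("user@foo.com")
encoded_quote = encode_username("o'brien")
print(encoded_email, encoded_quote)

# Decoding gives the original username back.
print(unquote(encoded_email))
```

In PHP the same job is normally done with rawurlencode() when the link is generated.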

From RFC 1738:
The characters ";", "/", "?", ":",
"@", "=" and "&" are the characters
which may be reserved for special
meaning within a scheme. No other
characters may be reserved within a
scheme.
and:
Thus, only alphanumerics, the special
characters "$-_.+!*'(),", and
reserved characters used for their
reserved purposes may be used
unencoded within a URL.
What you should do:
Encode the '@' as %40.
Escape the single quote in the .htaccess like so: \'

Related

htaccess Rewrite rule to accept Hindi characters

I have a link like this
www.example.com/profile.php?name=sagar123
I used this rule:
RewriteRule ^profile/([a-zA-Z0-9_-]+)$ profile.php?name=$1 [L]
and now I can change my URL to this:
www.example.com/profile/sagar123
Everything is fine, but now I also want to use Hindi characters, like this:
www.example.com/profile.php?name=सागर (It's working fine)
www.example.com/profile/सागर (It is not working and showing Server error)
Please help me to write a rule or regex to accept all ([a-zA-Z0-9_-]+) and also Hindi Character.
Thanks and regards,
Hindi (Devanagari) characters fall in the \u0900-\u097F range, so you can use that range inside a character class.
Note that most regex engines (PCRE included) do not support the \u notation and use the \x{900} form instead:
([\x{900}-\x{97F}a-zA-Z0-9_-]+)$
In Python, \u is supported, so:
([\u0900-\u097Fa-zA-Z0-9_-]+)$
See this for a regex demo in which both English and Hindi characters are matched.
Also, see this for a mapping of literal Hindi characters to their hex values.
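The Python form of the character class can be checked directly. This is a minimal sketch (the function and constant names are hypothetical) that only demonstrates the character range; whether the \x{...} form behaves the same way inside .htaccess depends on how Apache's regex engine treats the UTF-8 bytes of the request path.

```python
import re

# Letters, digits, "_" and "-" plus the Devanagari block U+0900-U+097F.
NAME_PATTERN = re.compile(r"[\u0900-\u097Fa-zA-Z0-9_-]+")


def is_valid_name(name):
    """Return True when every character of name is in the allowed set."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ["sagar123", "सागर", "सागर123", "sagar 123"]:
    print(candidate, is_valid_name(candidate))
```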
Use the (.*) pattern to match any sequence of characters.
Note that the + quantifier in your original pattern is what lets the character class match more than one character; the ^ and $ anchors only pin the match to the beginning and end of the URL path. With (.*) the * quantifier plays that role, and it also accepts an empty name and names containing slashes.
It should look like...
RewriteRule ^profile/(.*)$ profile.php?name=$1 [L]
If you need further info, I recommend taking a look at Apache.org: Apache mod_rewrite Introduction. It covers most of the characters I've discussed in this post up to this point: ., (, ), +, etc.

How to allow 1-9 a-z A-Z - _ % in url via htaccess?

I want to allow these characters in the URL: 1-9, a-z, A-Z, -, _, %
I have the code below in .htaccess:
RewriteRule ^shop/search/([a-zA-Z0-9_-]+)/?$ shop.php?search=$1 [QSA,NC]
Issue: it does not work when a space is passed in the URL, for example:
domain.com/shop/search/my%20keyword
Basically, I want to allow % in the URL via .htaccess.
How do I do it?
... it is matched against the (%-decoded) URL-path of the request ...
source, emphasis mine.
mod_rewrite never sees the %; by the time the pattern is applied, the %20 has already been decoded to a space. If you want to accept %20 in the URL, add a space to the character class.
Basically i want to allow % in url via htaccess How to do it?
You can use this rewrite rule with negative character class:
RewriteRule ^shop/search/([^/]+)/?$ shop.php?search=$1 [QSA,NC,L]
[^/]+ matches one or more characters that are not /, so it also matches whitespace and any other decoded character you want to accept.
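The effect of decoding can be reproduced outside Apache. The sketch below uses Python's re module as a stand-in for mod_rewrite's regex engine and urllib.parse.unquote as a stand-in for Apache's %-decoding; the helper name is hypothetical. It shows why the original character class rejects my%20keyword while the negated class accepts it.

```python
import re
from urllib.parse import unquote

# The question's pattern and the negated-class pattern from the answer.
STRICT = re.compile(r"^shop/search/([a-zA-Z0-9_-]+)/?$", re.IGNORECASE)
NEGATED = re.compile(r"^shop/search/([^/]+)/?$", re.IGNORECASE)


def search_term(pattern, raw_path):
    """Decode the path first (as mod_rewrite does), then apply the pattern."""
    decoded = unquote(raw_path)
    match = pattern.match(decoded)
    return match.group(1) if match else None


path = "shop/search/my%20keyword"
print(search_term(STRICT, path))   # no match: the decoded space is not in the class
print(search_term(NEGATED, path))  # captures the decoded keyword
```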

PHP URL/slug accept chars

I need to write a simple routing system, and I have only one question.
When I have a URL/slug like this:
/article/1/simple-article-1
What characters should be allowed there?
Letters, digits, '-' and '/' of course, and what else?
.htaccess:
Options -Indexes
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ index.php?/$1 [L,QSA]
PHP:
if (isset($_SERVER['QUERY_STRING'])) {
    if (!preg_match('/^[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*$/', $_SERVER['QUERY_STRING'])) {
        return false;
    }
    $info = explode('/', $_SERVER['QUERY_STRING']);
    ....
}
What characters should be allowed there.
Usually slugs are all lowercase, with accented characters replaced by letters of the English alphabet and blank characters replaced by a - or an _. Punctuation marks like the period, comma, question mark, exclamation point, apostrophe and quotation mark are generally removed. A slug may also be truncated to keep a reasonable length.
The reserved chars that may have a particular meaning in the URI are: !, *, ', (, ), ;, :, @, &, =, +, $, /, ?, #, [ and ]. If the character would conflict with a reserved character's purpose, then the conflicting data must be percent-encoded before the URI is formed.
Once you produce the URI from its component parts, any character you add that is not a letter, a digit, -, ., _ or ~ should always be percent-encoded.
Example:
/article/1/i!want!use!the!exclamation!mark <-- bad
/article/1/i%21want%21use%21the%21exclamation%21mark <-- good
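A minimal sketch of both ideas, written in Python for illustration (the slugify helper, its max_length default and the sample titles are all assumptions, not part of the question's code): one function builds a slug following the conventions described above, and urllib.parse.quote produces the percent-encoded form of the exclamation-mark example.

```python
import re
import unicodedata
from urllib.parse import quote


def slugify(title, max_length=60):
    """Build a lowercase ASCII slug: accents stripped, punctuation removed."""
    # Split accented letters into base letter + combining mark, then drop non-ASCII.
    ascii_text = unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    lowered = ascii_text.lower()
    # Remove everything except letters, digits, whitespace, "_" and "-".
    without_punct = re.sub(r"[^a-z0-9\s_-]", "", lowered)
    # Collapse runs of whitespace, "_" and "-" into a single hyphen.
    hyphenated = re.sub(r"[\s_-]+", "-", without_punct).strip("-")
    # Truncate to a reasonable length without leaving a trailing hyphen.
    return hyphenated[:max_length].rstrip("-")


print(slugify("Simple Article 1"))
print(slugify("Café au lait: Why?"))

# Percent-encoding a segment that keeps its punctuation.
print(quote("i!want!use!the!exclamation!mark", safe=""))
```

Generating the slug once when the article is saved, rather than on every request, keeps the URL stable even if the slug rules change later.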

Capturing encoded slashes and Ignoring unencoded slashes

I have a web application that recently had its spec changed to allow slashes in the names of some of its documents. As a result, I have had to change my .htaccess file to also match slashes. However, I only want to match slashes that are encoded, i.e. catch %2F but not /.
Consider the following URL:
http://www.example.com/document/edit/STAT%2F12/
My .htaccess looks like:
RewriteRule ^document\/([a-z0-9-]+)?\/?([a-z0-9-\W\s]+)?\/?$ documents.php?request=$1&id=$2& [NC,QSA,L]
The above request catches the $id as 'STAT/12/' instead of 'STAT/12'. In other words, it matches the trailing slash even though it isn't encoded.
Please note, I have switched on AllowEncodedSlashes On.
That's because the section of your regexp [a-z0-9-\W\s] is catching the slash. If Apache supports it, use a non-greedy capture, or use a different character class.
RewriteRule ^document\/([a-z0-9-]+)?\/?([a-z0-9-\W\s]+?)?\/?$ documents.php?request=$1&id=$2& [NC,QSA,L]
Non-greedy or lazy capture is the ? after the + and will capture as few characters as possible, so it stops before the trailing /.
https://regex101.com/r/uK8zM3/1
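Keep in mind that a RewriteRule pattern is matched against the %-decoded path, and with AllowEncodedSlashes On the %2F is decoded as well, so the pattern sees / rather than %2F and adding % to the character class will not help. To tell encoded slashes from literal ones you would have to match against the raw request line (%{THE_REQUEST}) in a RewriteCond instead. Also, your character class above allows whitespace, which you probably don't want in a URL!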
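The greedy versus lazy difference can be reproduced outside Apache. The sketch below uses Python's re module as a stand-in for mod_rewrite's regex engine and assumes the path reaches the pattern already decoded as document/edit/STAT/12/. The hyphen is moved to the end of the character class, which does not change what the class matches; the helper name is hypothetical.

```python
import re

# Second group greedy (as in the question) versus lazy (as in the answer).
GREEDY = re.compile(r"^document/([a-z0-9-]+)?/?([a-z0-9\W\s-]+)?/?$", re.IGNORECASE)
LAZY = re.compile(r"^document/([a-z0-9-]+)?/?([a-z0-9\W\s-]+?)?/?$", re.IGNORECASE)


def capture_id(pattern, path):
    """Return what the second group (the document id) captures, or None."""
    match = pattern.match(path)
    return match.group(2) if match else None


path = "document/edit/STAT/12/"
print(capture_id(GREEDY, path))  # the trailing slash ends up inside the id
print(capture_id(LAZY, path))    # the trailing slash is left to the final /?
```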

Match only letters and special characters with RegExp

How can I allow only letters and special characters with a regular expression?
I suggest you use GSkinner's regex builder and experiment with the examples on the right-hand side. There are many variations that get this job done. If you want to be explicit, you can use the following (the hyphen is escaped so that )-= is not read as a range, and the anchors make the whole string match):
/^[a-zA-Z!@#$%¨&*()\-=+\/.{}]+$/
Tony's answer will also work, but includes more extra characters than the ones you've defined in your comment.
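One thing to watch in a hand-written class like this: an unescaped hyphen between two characters, such as )-=, is read as a range, and that particular range happens to include the digits 0-9. A small Python sketch (constant and helper names are hypothetical, and the ¨ character is left out for brevity) shows the difference:

```python
import re

# ")-=" is a range from ")" to "=", which silently includes the digits.
UNESCAPED = re.compile(r"[a-zA-Z!@#$%&*()-=+/.{}]+")
# "\-" is a literal hyphen, so digits are no longer allowed.
ESCAPED = re.compile(r"[a-zA-Z!@#$%&*()\-=+/.{}]+")


def only_allowed(pattern, text):
    """Return True when the whole text consists of characters from the class."""
    return pattern.fullmatch(text) is not None


print(only_allowed(UNESCAPED, "abc123"))  # digits slip through the range
print(only_allowed(ESCAPED, "abc123"))    # digits are rejected
print(only_allowed(ESCAPED, "a-b=c!@#"))  # letters and listed specials pass
```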
This:
$str = $_REQUEST["htmlstringinput"];
preg_match("([\w\-]+[@#%.])", $str);
allows letters, numbers and the special characters in the range [@#%.], and this:
$str = $_REQUEST["htmlstringinput"];
preg_match("([-a-zA-Z]+[@#%.])", $str);
allows only letters and the same special characters as above.
Both worked for me. For further reading and research you can go to: http://gskinner.com/RegExr/
/[\p{L}\p{P}]+/u
matches letters and punctuation characters. Or what did you mean by "special characters"?
All characters that are not a number? How about this:
/[^\d]*/
Use the following code in .htaccess to block all URLs with a number (as per the OP's comments):
Options +FollowSymlinks -MultiViews
RewriteEngine on
RewriteCond %{REQUEST_URI} ![0-9]
RewriteRule ^user/ /index.php?goto=missed [NC,L]
