Capturing encoded slashes and Ignoring unencoded slashes - php

I have a web application whose spec recently changed to allow slashes in the names of some of its documents. As a result, I have had to change my .htaccess file to also match slashes. However, I only want to match slashes that are encoded, i.e. catch %2F but not /.
Consider the following URL:
http://www.example.com/document/edit/STAT%2F12/
My .htaccess looks like:
RewriteRule ^document\/([a-z0-9-]+)?\/?([a-z0-9-\W\s]+)?\/?$ documents.php?request=$1&id=$2& [NC,QSA,L]
The above request catches the $id as 'STAT/12/' instead of 'STAT/12'. In other words, it matches the trailing slash even though it isn't encoded.
Please note that I have set AllowEncodedSlashes On.

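That's because the [a-z0-9-\W\s] section of your regexp is catching the slash: \W matches any non-word character, and / is one of them. Apache's regex engine (PCRE) supports non-greedy quantifiers, so either use a non-greedy capture or use a different character class.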
RewriteRule ^document\/([a-z0-9-]+)?\/?([a-z0-9-\W\s]+?)?\/?$ documents.php?request=$1&id=$2& [NC,QSA,L]
The ? after the + makes the capture non-greedy (lazy): it captures as few characters as possible, so it stops before the trailing /.
https://regex101.com/r/uK8zM3/1
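As a rough check of the greedy versus lazy behaviour, here is a minimal sketch using Python's re module as a stand-in for Apache's regex engine. It assumes the rule sees the decoded path document/edit/STAT/12/ without the leading slash; the variable names are only illustrative.

```python
import re

# Decoded path as the rule would see it in .htaccess (assumption: %2F is already "/").
path = "document/edit/STAT/12/"

# Original pattern: the second group is greedy and swallows the trailing slash.
greedy = re.compile(r"^document\/([a-z0-9-]+)?\/?([a-z0-9-\W\s]+)?\/?$", re.IGNORECASE)

# Suggested pattern: "+?" makes the second group lazy, so the final "\/?" can take the slash.
lazy = re.compile(r"^document\/([a-z0-9-]+)?\/?([a-z0-9-\W\s]+?)?\/?$", re.IGNORECASE)

greedy_id = greedy.match(path).group(2)
lazy_id = lazy.match(path).group(2)

print(greedy_id)  # STAT/12/
print(lazy_id)    # STAT/12
```

Both patterns capture edit as the first group; only the second group differs.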
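Keep in mind that mod_rewrite matches the pattern against the %-decoded URL-path, so with AllowEncodedSlashes On the %2F has most likely already been decoded to / by the time your rule sees it, which is why the pattern cannot tell the two apart. If you need to keep them distinct, AllowEncodedSlashes NoDecode leaves %2F encoded, and in that case you just allow % in addition to whatever worked previously. Your character class above also allows whitespace, for example; I don't think you want to be doing that in a URL!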

Related

htaccess Rewrite rule to accept Hindi characters

I have a link like this
www.example.com/profile.php?name=sagar123
I used this rule:
RewriteRule ^profile/([a-zA-Z0-9_-]+)$ profile.php?name=$1 [L]
and now I can change my URL to this:
www.example.com/profile/sagar123
Everything is fine, but now I also want to use Hindi characters, like this:
www.example.com/profile.php?name=सागर (It's working fine)
www.example.com/profile/सागर (It is not working and showing Server error)
Please help me write a rule or regex that accepts everything in ([a-zA-Z0-9_-]+) and also Hindi characters.
Thanks and regards,
Hindi (Devanagari) characters fall in the \u0900-\u097F range, so you can use that range inside a character class.
To answer your question: most regex engines, PCRE included, do not support the \u notation and use the \x{900} form instead:
([\x{900}-\x{97F}a-zA-Z0-9_-]+)$
Python does support \u, so:
([\u0900-\u097Fa-zA-Z0-9_-]+)$
see this for regex matching demonstrating both English and Hindi chars getting matched.
Also, see this for reading literal hindi char mapped to their hex values.
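A minimal sketch of the Python form of that character class, assuming the path reaches the pattern as profile/ followed by the already decoded name. The sample names are taken from the question; the variable names are illustrative.

```python
import re

# Devanagari block U+0900-U+097F plus the ASCII characters from the original rule.
profile = re.compile(r"^profile/([\u0900-\u097Fa-zA-Z0-9_-]+)$")

hindi = profile.match("profile/सागर")
ascii_name = profile.match("profile/sagar123")
rejected = profile.match("profile/sagar 123")  # a space is not in the class

print(hindi.group(1))       # सागर
print(ascii_name.group(1))  # sagar123
print(rejected)             # None
```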
Use the (.*) group to match any sequence of characters.
Note that you still need a quantifier inside your capturing ( and ) parens: ^ and $ only anchor the pattern to the beginning and end of the URL path, and without + or * a character class matches exactly one character. With (.*) the * does that job and also allows an empty match.
It should look like this:
RewriteRule ^profile/(.*)$ profile.php?name=$1 [L]
If you need further info, I recommend taking a look at Apache.org: Apache mod_rewrite Introduction. It covers most of the characters I've discussed in this post: ., (, ), +, etc.
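To see what (.*) accepts, and what happens when a character class has no quantifier, here is a small sketch in Python's re module, used only as an approximation of Apache's engine. The pattern and variable names are illustrative.

```python
import re

# (.*) accepts any sequence of characters, including slashes and non-ASCII text.
any_chars = re.compile(r"^profile/(.*)$")

# The same class as in the question, but with the + removed:
# between the anchors it can now match exactly one character.
one_char = re.compile(r"^profile/([a-zA-Z0-9_-])$")

hindi_name = any_chars.match("profile/सागर").group(1)
nested_name = any_chars.match("profile/a/b").group(1)
long_name = one_char.match("profile/sagar123")
single_name = one_char.match("profile/s")

print(hindi_name)            # सागर
print(nested_name)           # a/b
print(long_name)             # None
print(single_name.group(1))  # s
```

Because (.*) also matches slashes and the empty string, the target script should validate the captured value itself.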

How to allow 1-9 a-z A-Z - _ % in url via htaccess?

I want to allow these characters in the URL: 1-9, a-z, A-Z, -, _ and %.
I have the following rule in .htaccess:
RewriteRule ^shop/search/([a-zA-Z0-9_-]+)/?$ shop.php?search=$1 [QSA,NC]
The issue: it does not work when a space is passed in the URL, for example:
domain.com/shop/search/my%20keyword
Basically I want to allow % in the URL via .htaccess.
How do I do it?
... it is matched against the (%-decoded) URL-path of the request ...
source, emphasis mine.
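mod_rewrite never sees the %; the %20 is decoded to a space before the pattern is matched. If you want to accept %20 in the URL, add a space to the character class (written as \s or an escaped space, since an unescaped space would end the pattern argument).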
Basically i want to allow % in url via htaccess How to do it?
You can use this rewrite rule with a negated character class:
RewriteRule ^shop/search/([^/]+)/?$ shop.php?search=$1 [QSA,NC,L]
[^/]+ matches one or more of any character that is not /, so it also matches whitespace and any other decoded character you want to allow.
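The decoding step and both fixes can be sketched in Python. This assumes urllib.parse.unquote is a fair model of the %-decoding Apache performs before matching; the variable names are illustrative.

```python
import re
from urllib.parse import unquote

# The request path as sent by the browser, and as the rule sees it after decoding.
raw_path = "shop/search/my%20keyword"
decoded_path = unquote(raw_path)

original = re.compile(r"^shop/search/([a-zA-Z0-9_-]+)/?$", re.IGNORECASE)
with_space = re.compile(r"^shop/search/([a-zA-Z0-9_ -]+)/?$", re.IGNORECASE)
negated = re.compile(r"^shop/search/([^/]+)/?$", re.IGNORECASE)

original_match = original.match(decoded_path)
space_match = with_space.match(decoded_path)
negated_match = negated.match(decoded_path)

print(decoded_path)            # shop/search/my keyword
print(original_match)          # None
print(space_match.group(1))    # my keyword
print(negated_match.group(1))  # my keyword
```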

How can I create better URLs using .htaccess mod_rewrite

These are the URLs for my products right now:
http://www.example.com/product.php?product=32723
I want to achieve this
http://www.example.com/product/32723-brand-model-productname
I have been modifying my .htaccess but really with no clue on how to achieve this.
You must match your URL with a RewriteRule pattern and rewrite it to the target URL
RewriteRule ^product/(\d+)- /product.php?product=$1
This pattern matches any URL path starting with product/ and captures the following digits (\d+), which must be followed by a dash -. In .htaccess the leading slash is stripped before matching, so the pattern must not begin with /. The substitution URL will be /product.php?product= with the captured digits $1 appended.
To capture some part of the match, you enclose it in parentheses (...). Read more on regular expressions used in mod_rewrite at Apache mod_rewrite Introduction - Regular Expressions.
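A minimal sketch of the capture, again in Python's re module as a stand-in. It assumes the rule lives in .htaccess, where the path is matched without its leading slash; rewrite_target is a hypothetical helper that only mimics the substitution.

```python
import re

# Unanchored at the end on purpose: anything after "<digits>-" is ignored.
product = re.compile(r"^product/(\d+)-")

def rewrite_target(path):
    """Return the substitution URL for a matching path, or None (hypothetical helper)."""
    match = product.match(path)
    if match is None:
        return None
    return "/product.php?product=" + match.group(1)

target = rewrite_target("product/32723-brand-model-productname")
no_dash = rewrite_target("product/32723")
leading_slash = rewrite_target("/product/32723-brand-model-productname")

print(target)         # /product.php?product=32723
print(no_dash)        # None
print(leading_slash)  # None
```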
This question is so common I just put up the whole answer with code here:
http://www.prescia.net/bb/coding/5-141018-simple_friendly-url

Delimiters and multiple parameters

I'm trying to use the mod_rewrite module to create smooth URLs. So for example my example.com/pages/group/index.php?id=1&slug=example-keyword would become example.com/group/1-example-keyword.
The problem I'm having is with the second parameter and how it is split: the second parameter itself contains dashes, and at the moment the rule throws 404 errors. How could I fix this?
.htaccess rule
RewriteRule ^group/([^-]*)-([^-]*)$ /pages/group/index.php?id=$1&slug=$2 [L]
Your regular expression explicitly prohibits dashes in the first and second groups.
Try this using . (any character) instead of [^-] (any character except -) in your second group:
RewriteRule ^group/([^-]*)-(.*)$ /pages/group/index.php?id=$1&slug=$2 [L]
In this expression, everything after group/ but before the first - will be captured in group 1, and everything after the first - will be captured in group 2.
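The difference between the two rules can be checked with a short sketch in Python's re module, assuming the path is matched as group/1-example-keyword. The variable names are illustrative.

```python
import re

path = "group/1-example-keyword"

# Original rule: neither group may contain a dash, so a slug with dashes cannot match.
strict = re.compile(r"^group/([^-]*)-([^-]*)$")

# Fixed rule: the second group takes everything after the first dash.
relaxed = re.compile(r"^group/([^-]*)-(.*)$")

strict_match = strict.match(path)
relaxed_match = relaxed.match(path)

print(strict_match)            # None
print(relaxed_match.groups())  # ('1', 'example-keyword')
```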

HTACCESS | Adding a second rewrite rule?

I'm trying to write my .htaccess to support two vanity URLs. The code will speak for itself, as I'm not very good with .htaccess.
RewriteRule ^([a-zA-Z0-9-]+)/?$ index.php?p=$1&s=$2 [L,QSA]
When I go to, for example, http://website.com/home/test
I get a 404, but $_GET["p"] still returns home if I go to just website.com/home.
Why am I getting a 404 when adding in my second variable in the url?
You get a 404 because /home/test does not match the expression ^([a-zA-Z0-9-]+)/?$. The second path segment, after the first /, lies beyond what the pattern allows before its $ end-of-string anchor. You need to add a second () group, which is optional. I have replaced the a-zA-Z0-9 character classes with [^/]+, which matches everything up to the next slash.
The (?:) indicates a non-capturing group encompassing the first /, with a capturing group () inside it to retrieve the $2 component. The entire construct is made optional with the ? before the final $ anchor.
RewriteEngine On
RewriteRule ^([^/]+)(?:/([^/]+)/?)?$ index.php?p=$1&s=$2 [L,QSA]
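A quick sketch of how the optional second segment behaves, using Python's re module as a stand-in for Apache's engine. One difference to keep in mind: Python reports an unmatched group as None, whereas in the rewrite substitution $2 would simply be empty. The variable names are illustrative.

```python
import re

# Original rule: one segment only, so a second segment cannot match.
single = re.compile(r"^([a-zA-Z0-9-]+)/?$")

# Fixed rule: an optional non-capturing group wraps "/second-segment" and a trailing slash.
vanity = re.compile(r"^([^/]+)(?:/([^/]+)/?)?$")

single_two_segments = single.match("home/test")
one_segment = vanity.match("home").groups()
two_segments = vanity.match("home/test").groups()
trailing_slash = vanity.match("home/test/").groups()

print(single_two_segments)  # None
print(one_segment)          # ('home', None)
print(two_segments)         # ('home', 'test')
print(trailing_slash)       # ('home', 'test')
```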
