Special Characters break Generated URLs - php

I have an automated process which generates URLs from the title of a venue.
I then use the following line in my .htaccess to route each URL to the correct script:
RewriteRule ^recipes/([\w-]+)/(\d+)$ ./recipes_news.php?i=$2 [L,QSA]
A typical URL looks like this:
www.site.com/recipes/red-curry-chicken/123
The last part of the URL is the id used to find the actual recipe information.
For some reason unknown to me, any time a special character such as "ā" occurs, it breaks the URL.
Is there something I am missing in the .htaccess code to capture special characters?
Thanks

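In this context \w only matches ASCII letters, digits and the underscore, so a character such as "ā" stops the match. [^/]+ accepts any character except a slash. Try changing your regex pattern to: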
RewriteRule ^recipes/([^/]+)/(\d+)$ ./recipes_news.php?i=$2 [L,QSA]
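The difference between the two patterns can be checked outside Apache. The sketch below uses Python's re module on UTF-8 bytes as a rough stand-in for how mod_rewrite sees the decoded path. It assumes the regex engine is not in Unicode mode, so \w covers only ASCII word characters. The helper name and the accented sample path are made up for illustration.

```python
import re

# Byte patterns make \w ASCII-only, roughly like a non-Unicode PCRE match.
OLD_RULE = re.compile(rb"^recipes/([\w-]+)/(\d+)$")
NEW_RULE = re.compile(rb"^recipes/([^/]+)/(\d+)$")


def recipe_id(pattern, path):
    """Return the captured recipe id as text, or None if the rule does not match."""
    match = pattern.match(path.encode("utf-8"))
    return match.group(2).decode("ascii") if match else None


plain = "recipes/red-curry-chicken/123"
accented = "recipes/pāo-de-queijo/124"  # hypothetical title containing "ā"

print(recipe_id(OLD_RULE, plain))     # matches
print(recipe_id(OLD_RULE, accented))  # no match: "ā" is two non-\w bytes
print(recipe_id(NEW_RULE, accented))  # matches
```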

Related

Why do I have a zero in my $_GET variable when I rewrite URLs?

I am posting here because, despite the many topics online, I have not managed to solve my problem.
I am building a website, and to improve its SEO I need to set up URL rewriting.
I pass GET variables in the URL, and some contain spaces, which are encoded as "%20", for example:
mapage.php?produit=aménagements%20bois
So I apply my rewrite rule in the .htaccess file:
RewriteRule ^ma-page-amenagements-bois$ mapage.php?produit=aménagements%20bois [L]
The rewrite works, but when I try the new URL a zero appears in my $_GET variable in place of the space ("aménagements0bois" instead of "aménagements bois"), which breaks the dynamic display of my page.
I would like to know how to solve this problem.
Thank you
You don't need to add encoded characters in your rewrite rule; you can escape the space with a backslash:
RewriteRule ^ma-page-amenagements-bois$ mapage.php?produit=aménagements\ bois [L]
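The reason you get a 0 in your URL is that Apache treats %1, %2, ... in the substitution as backreferences to groups captured by a RewriteCond. In %20, the %2 is read as such a backreference, and because there is no second RewriteCond group it expands to nothing, so only the 0 remains.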
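The effect can be imitated with a small Python sketch. This is a simplified model for illustration only, not Apache's real parser; the function name and its cond_groups parameter are hypothetical.

```python
import re


def expand_cond_backrefs(substitution, cond_groups=()):
    """Replace each %N (N = 0-9) with the N-th RewriteCond group, or "" if absent.

    Simplified model for illustration only, not Apache's real parser.
    """
    def replace(match):
        index = int(match.group(1))
        return cond_groups[index] if index < len(cond_groups) else ""

    return re.sub(r"%(\d)", replace, substitution)


target = "mapage.php?produit=aménagements%20bois"
# With no RewriteCond groups, %2 vanishes and the literal 0 is left behind.
print(expand_cond_backrefs(target))
```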

URL is looked up in a subdirectory instead of public_html when using htaccess

I am working on a website and wrote some htaccess rules for it, but they do not do what I want. I have a URL like this:
http://www.example.com/demo.php?id=234&title=ask%20me%20a%20question
I converted it to the URL below using htaccess:
http://www.example.com/234/ask%20me%20a%20question
htaccess code:
Options +FollowSymLinks
RewriteEngine on
RewriteRule ^([0-9]{4})/([a-z]+)/$ demo.php?url=$1&url2=$2
The problem is that the converted URL is looked up as a file in a subdirectory instead of in the server root, i.e. public_html. How can I solve this?
Please help me. Thanks.
The second parameter in your request contains characters other than a-z (the spaces), but your pattern only allows a-z.
In addition, you are requesting 234 in the URI, but the first parameter requires exactly 4 digits.
As such, change your rule to the following:
RewriteRule ^([0-9]{3,4})/([^/]+)/?$ demo.php?url=$1&url2=$2 [L]
Changes
Allow 3 or 4 numbers in the first parameter. If you want to be more flexible, you can change it to ([0-9]+).
Check for all characters other than / in the second parameter.
Make the trailing slash optional using /?.
Add the L flag to stop rewriting if the rule is matched (always good to have for when you add other rules).
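As a quick check of the old and new patterns, here is a minimal Python sketch. It assumes the path is matched after URL decoding, so %20 arrives as a space, and without the leading slash, as in a per-directory .htaccess.

```python
import re

OLD_RULE = re.compile(r"^([0-9]{4})/([a-z]+)/$")
NEW_RULE = re.compile(r"^([0-9]{3,4})/([^/]+)/?$")

# Assumed decoded request path for /234/ask%20me%20a%20question
path = "234/ask me a question"

# The old rule fails: only 3 digits, spaces in the title, no trailing slash.
print(OLD_RULE.match(path))

# The new rule captures both parts, with or without a trailing slash.
print(NEW_RULE.match(path).groups())
print(NEW_RULE.match(path + "/").groups())
```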

Grabbing a domain name from URL as a variable by htaccess

On my website I want to show some analytics about domains. A working example of the kind of URL I need:
http://whois.domaintools.com/google.com
As you can see in the URL above, it treats google.com as a variable and passes it to another page for processing. That's exactly what I want.
So for detecting that kind of variable, here is my regex:
/^[a-zA-Z\d]+(?:-?[a-zA-Z\d])+\.[a-zA-Z]+$/
The regex above is simple and accepts anything like google.com, so in my .htaccess file I have:
RewriteRule (^[a-zA-Z\d]+(?:-?[a-zA-Z\d])+\.[a-zA-Z]+$) modules/pages/page.php?domain=$1
The rule above does what I want, but it also sends my homepage to page.php when there is nothing in the URL. For example, http://mysitename.com is now being forwarded to page.php.
How can I fix this?
Thanks in advance
The empty path of the homepage cannot match your regex: + means "matches the preceding pattern element one or more times" (http://en.wikipedia.org/wiki/Regular_expression), so every part has to occur at least once. The more likely cause is that a request for the homepage is mapped internally to the directory index file, for example index.php, and a name like index.php has the same shape as a domain name, so the rule rewrites it to page.php.
You can also tighten the pattern by defining a minimum and a maximum number of characters instead of using + everywhere. BTW, a quick Google search for "regex domain" turns up many tested patterns. For example:
RewriteEngine on
RewriteRule (^(([a-zA-Z]{1})|([a-zA-Z]{1}[a-zA-Z]{1})|([a-zA-Z]{1}[0-9]{1})|([0-9]{1}[a-zA-Z]{1})|([a-zA-Z0-9][a-zA-Z0-9-_]{1,61}[a-zA-Z0-9]))\.([a-zA-Z]{2,6}|[a-zA-Z0-9-]{2,30}\.[a-zA-Z]{2,3})$) modules/pages/page.php?domain=$1
Reference:
Domain name validation with RegEx
Update 1:
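If you want to keep your own regex, replace the last "+" with {2,}, since top-level domains usually have at least 2 characters. The RewriteCond skips requests that end in a common file extension, so files such as index.php are not rewritten:
RewriteEngine on
RewriteCond %{REQUEST_URI} !(\.html|\.php|\.pdf|\.gif|\.png|\.jpg|\.jpeg)$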
RewriteRule (^[a-zA-Z\d]+(?:-?[a-zA-Z\d])+\.[a-zA-Z]{2,}$) modules/pages/page.php?domain=$1
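To see which requests this pattern actually catches, here is a minimal Python sketch. The is_rewritten helper is hypothetical and only approximates the RewriteCond plus RewriteRule pair; it assumes the extension list is meant to be the plain alternation shown in the code.

```python
import re

DOMAIN_RULE = re.compile(r"^[a-zA-Z\d]+(?:-?[a-zA-Z\d])+\.[a-zA-Z]{2,}$")
# Extensions excluded by the RewriteCond (simplified, assumed list).
SKIP_EXTENSIONS = re.compile(r"(\.html|\.php|\.pdf|\.gif|\.png|\.jpg|\.jpeg)$")


def is_rewritten(path):
    """True if the path would be sent to page.php under this simplified model."""
    return not SKIP_EXTENSIONS.search(path) and bool(DOMAIN_RULE.match(path))


# The empty homepage path never matches, but a file name such as index.php
# has the same shape as a domain name and matches the bare pattern.
print(bool(DOMAIN_RULE.match("")))
print(bool(DOMAIN_RULE.match("index.php")))

# With the extension filter, only real domain-like paths are rewritten.
print(is_rewritten("index.php"))
print(is_rewritten("google.com"))
```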

Rewrite syntax in .htaccess

I know roughly how htaccess works, but I am always confused by the syntax, and I would appreciate help with the htaccess issue below.
I have a couple of pages that link to something like
http://mydomain.com.au/product-details.php/142/categoryAbstract
but due to mistakes by the previous developer the images do not load unless the URL is
http://mydomain.com.au/product-details.html/142/categoryAbstract
He converted all PHP pages to HTML (I really don't know what his intention was), but
now the URL should also work as http://mydomain.com.au/product-details.php/142/categoryAbstract
He used the htaccess rule below for this, but it's not working. If I manually change the URL from .php to .html, everything works fine.
RewriteRule ^product-details.html/(.*)/(.*)$ product-details.php?productid=$1&category=$2
I need a working line of code so that the URL http://mydomain.com.au/product-details.php/142/categoryAbstract also works.
You just need an alternation group (a|b) to cover both extensions. Make it non-capturing with (?:...), otherwise it becomes $1 and shifts the other backreferences:
RewriteRule ^product-details\.(?:html|php)/(.*)/(.*)$ product-details.php?productid=$1&category=$2
#---------------------------^^^^^^^^^^^^^^
That can be improved a little, though. The (.*) groups are greedy. You are better served by ([^/]+) as the first group, which matches everything up to the next /. I have also escaped the dot as \. so that it is matched literally instead of as any character.
RewriteRule ^product-details\.(?:html|php)/([^/]+)/(.*)$ product-details.php?productid=$1&category=$2
The .php extension is commonly changed, either through rewriting or through actual file renaming plus server configuration that parses .html as PHP, in order to hide server-side details from end users and keep them from knowing which technologies the site runs on the back end. It is less common to actually rename files to .html than to use URL rewriting to hide the .php, however.
RewriteRule ^product-details.html/(.*)/(.*)$ product-details.php?productid=$1&category=$2
This rule captures everything after product-details.html/ up to the last / as the first group, and everything after the last / up to the end of the line as the second group. It then puts those values where $1 and $2 appear.
To make it accept both .html and .php, change it to the following. The extension group becomes $1, so the other two move to $2 and $3:
RewriteRule ^product-details(\.html|\.php)/(.*)/(.*)$ product-details.php?productid=$2&category=$3
Because the first value you are grabbing looks like a number and (.*) is greedy, it may be better to replace it with ([0-9]+), which only matches digits. That way you will be fine if you ever have a / in your category, giving you:
RewriteRule ^product-details(\.html|\.php)/([0-9]+)/(.*)$ product-details.php?productid=$2&category=$3
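How an extra group shifts the backreference numbers can be seen in a short Python sketch. The pattern variants here are written for illustration; group 1 in Python corresponds to $1 in the rule.

```python
import re

path = "product-details.php/142/categoryAbstract"

# A capturing extension group takes the first slot, so the id becomes group 2.
capturing = re.compile(r"^product-details\.(html|php)/([^/]+)/(.*)$")

# A non-capturing group (?:...) leaves the id and category as groups 1 and 2.
non_capturing = re.compile(r"^product-details\.(?:html|php)/([^/]+)/(.*)$")

print(capturing.match(path).groups())
print(non_capturing.match(path).groups())
```

Either form works as long as the numbers in the substitution agree with the groups in the pattern.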

Trouble with encoding special chars for URL through text input

I'm building a PHP application using CodeIgniter. It is similar to Let Me Google That For You where you write a sentence into a text input box, click submit, and you are taken to a URL that displays the result. I wanted the URL to be human-editable, and relatively simple. I've gotten around the CodeIgniter URL routing, so right now my URLs can look something like this:
http://website.com/?q=this+is+a+normal+url
The problem right now is when the sentence contains a special character like a question mark, or a backslash. Both of these mess with my current .htaccess rewrite rules, and it happens even when the character is encoded.
http://website.com/?q=this+is+a+normal+url? OR
http://website.com/?q=this+is+a+normal+url%3F
What does work is double-encoding. For example, if I encode the question mark as %253F (the ? is encoded to %3F, and then the % sign is encoded to %25), this URL works properly:
http://website.com/?q=this+is+a+normal+url%253F
Does anyone have an idea of what I can do here? Is there a clever way I could double encode the input? Can I write a .htaccess rewrite rule to get around this? I'm at a loss here. Here are the rewrite rules I'm currently using for everyone.
RewriteEngine on
RewriteCond %{QUERY_STRING} ^q=(.*)$
RewriteRule ^(.*)$ /index.php/app/create/%{QUERY_STRING}? [L]
Note: The way CodeIgniter works is they have a index/application/function/parameter URL setup. I'm feeding the function the full query string right now.
If you're using Apache 2.2 or later, you can use the B flag to force the backreference to be escaped:
RewriteCond %{QUERY_STRING} ^q=.*
RewriteRule ^ /index.php/app/create/%0? [L,B]
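The double encoding described in the question can be reproduced with a short Python sketch, shown only to illustrate what %253F stands for. It is not a suggested replacement for the rule above.

```python
from urllib.parse import quote, unquote

# First pass: "?" becomes %3F. Second pass: the "%" itself becomes %25.
once = quote("?", safe="")
twice = quote(once, safe="")

print(once, twice)

# One round of decoding brings back the singly encoded form.
print(unquote(twice))
```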
I usually do human readable urls like this
$humanReadableUrl= implode("_",preg_split('/\W+/', trim($input), -1, PREG_SPLIT_NO_EMPTY));
It removes any non-word characters and adds underscores between words.
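For comparison, the same idea as a minimal Python sketch. The function name is hypothetical, and re.ASCII is an assumption meant to keep \W close to PHP's default behaviour without the /u modifier.

```python
import re


def human_readable(text):
    """Join runs of word characters with underscores, dropping everything else."""
    # re.ASCII limits word characters to [a-zA-Z0-9_] (assumed PHP-like default).
    words = [w for w in re.split(r"\W+", text.strip(), flags=re.ASCII) if w]
    return "_".join(words)


print(human_readable("  this is a normal url? "))
```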
