I'm building a PHP application using CodeIgniter. It is similar to Let Me Google That For You where you write a sentence into a text input box, click submit, and you are taken to a URL that displays the result. I wanted the URL to be human-editable, and relatively simple. I've gotten around the CodeIgniter URL routing, so right now my URLs can look something like this:
http://website.com/?q=this+is+a+normal+url
The problem right now is when the sentence contains a special character like a question mark, or a backslash. Both of these mess with my current .htaccess rewrite rules, and it happens even when the character is encoded.
http://website.com/?q=this+is+a+normal+url? OR
http://website.com/?q=this+is+a+normal+url%3F
What does work is double-encoding. For example, I can take the question mark and encode it to %253F (where the ? is encoded to %3F and the % sign is then encoded to %25). This URL works properly:
http://website.com/?q=this+is+a+normal+url%253F
Does anyone have an idea of what I can do here? Is there a clever way I could double-encode the input? Can I write a .htaccess rewrite rule to get around this? I'm at a loss here. Here are the rewrite rules I'm currently using.
RewriteEngine on
RewriteCond %{QUERY_STRING} ^q=(.*)$
RewriteRule ^(.*)$ /index.php/app/create/%{QUERY_STRING}? [L]
Note: CodeIgniter uses an index/application/function/parameter URL setup. I'm feeding the function the full query string right now.
If you're using Apache 2.2 or later, you can use the B flag to force the backreference to be escaped:
RewriteCond %{QUERY_STRING} ^q=.*
RewriteRule ^ /index.php/app/create/%0? [L,B]
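If changing the rewrite rule is not an option, the double-encoding workaround described in the question could be done in PHP before redirecting. This is only a sketch: the helper name `doubleEncodeQuery` is hypothetical, and it assumes the sentence is available as a plain string at the point where the URL is built.

```php
<?php
// Hypothetical helper: encode once with urlencode(), then encode the
// resulting percent signs again so that "%3F" becomes "%253F".
function doubleEncodeQuery(string $sentence): string
{
    $once = urlencode($sentence);          // "?" -> "%3F", " " -> "+"
    return str_replace('%', '%25', $once); // "%3F" -> "%253F"
}

echo 'http://website.com/?q=' . doubleEncodeQuery('this is a normal url?'), "\n";
// http://website.com/?q=this+is+a+normal+url%253F
```

Only the percent signs are encoded a second time, so the `+` separators stay readable, matching the working URL in the question.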
I usually do human readable urls like this
$humanReadableUrl= implode("_",preg_split('/\W+/', trim($input), -1, PREG_SPLIT_NO_EMPTY));
It will remove any non-word characters and add underscores between words.
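To make the effect of that one-liner concrete, here is a small sketch that wraps it in a function and runs it on two inputs. The function name `humanReadableUrl` and the sample strings are made up for illustration.

```php
<?php
// Hypothetical wrapper around the one-liner above.
function humanReadableUrl(string $input): string
{
    // Split on runs of non-word characters and drop empty pieces,
    // then join the remaining words with underscores.
    $words = preg_split('/\W+/', trim($input), -1, PREG_SPLIT_NO_EMPTY);
    return implode('_', $words);
}

echo humanReadableUrl('Are you ok?'), "\n";            // Are_you_ok
echo humanReadableUrl('  back\\slash & more '), "\n";  // back_slash_more
```

Note that this is lossy: the question mark and backslash are dropped rather than preserved, so it suits cases where the exact original sentence does not need to be recovered from the URL.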
Related
I have a regexp that seems to work fine when tested at regexp101.com
but that does not give me the same result within mod_rewrite.
So, the URL I am trying to rewrite is:
/modeles-voiture/Nissan/Qashqai+2
The expected result is:
/modeles.php?brand=Nissan&model=Qashqai+2
The rewrite rule:
RewriteCond %{REQUEST_URI} ^/?modeles-voiture [NC]
RewriteRule \/([A-Z][\-A-Za-z]+)\/([\+\-A-Za-z0-9]+$) /modeles.php?brand=$1&model=$2 [L]
What I am getting out of the rewrite rule is:
/modeles-voiture/Nissan/Qashqai 2
Note the missing + sign, which throws off my script at modeles.php
Thanks for your help.
I think you want the [B] flag.
Look at the answer to this question: How to encode special characters using mod_rewrite & Apache?
So then mod_rewrite changes your request '/tag/c++' to 'script.php?tag=c++'. But in a query string component in the application/x-www-form-urlencoded format, the escaping rules are very slightly different to those that apply in path parts. In particular, '+' is a shorthand for space (which could just as well be encoded as '%20', but this is an old behaviour we'll never be able to change now).
So PHP's form-reading code receives the 'c++' and dumps it in your _GET as c-space-space.
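The decoding behaviour described above can be reproduced in plain PHP without Apache. The sketch below uses `parse_str()`, which applies the same query-string decoding as `$_GET`; the variable names are arbitrary.

```php
<?php
// In a query string, "+" is decoded as a space.
parse_str('tag=c++', $query);
var_dump($query['tag']);        // string(3) "c  "

// urldecode() follows the form-encoding rule; rawurldecode() does not.
var_dump(urldecode('c++'));     // string(3) "c  "
var_dump(rawurldecode('c++'));  // string(3) "c++"

// If the plus signs arrive escaped as %2B, they survive decoding.
parse_str('tag=' . urlencode('c++'), $escaped);
var_dump($escaped['tag']);      // string(3) "c++"
```

The last case is what the [B] flag aims for: the backreference is escaped before it is placed in the query string, so the literal `+` reaches the script.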
I have a .htaccess file with this in it:
RewriteRule ^search/([a-zA-Z]+)$ index.php?page=search&search=$1
So basically it sends URLs like this:
url.net/search/this
To this:
url.net/?page=search&search=this
But when I send it a URL like:
url.net/search/this+search
I get an error returned, as it doesn't know how to deal with the +search bit.
Is there a way I can get it to include the + between words when the user clicks search?
I want it so that if the user types i+want+this+or+that or this+is+what+i+want+to+find, then no matter how long it is, the whole string is passed to the $_GET['search'] parameter.
You should be able to just include it in the regex. Just remember to escape it:
RewriteRule ^search/([a-zA-Z\+]+)$ index.php?page=search&search=$1
Try this regex for the rewrite rule:
RewriteRule ^search/([a-zA-Z].+)$ index.php?page=search&search=$1
Note the . before the + sign. Works as a regex here on this live PHP regex site. Yes, I know this is an Apache rewrite rule & PHP has no role at this stage, but basic regex logic should remain the same.
I have an automated process which generates URLs from the title of a venue.
I then use the following line within my .htaccess to get the URL to redirect to the correct path:
RewriteRule ^recipes/([\w-]+)/(\d+)$ ./recipes_news.php?i=$2 [L,QSA]
A typical URL looks like the link below:
www.site.com/recipes/red-curry-chicken/123
Where the last part of the url is the id used to find the actual recipe information.
For some reason unknown to me, any time a special character such as "ā" occurs, it breaks the URL.
Is there something I am missing in the .htaccess code to capture special characters?
Thanks
Try changing your regex pattern to:
RewriteRule ^recipes/([^/]+)/(\d+)$ ./recipes_news.php?i=$2 [L,QSA]
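The difference between the two patterns can be illustrated with PHP's `preg_match()`, which is also PCRE-based. This is an analogy rather than an exact reproduction of mod_rewrite's behaviour, and the sample path is invented: without Unicode mode, `\w` only covers ASCII letters, digits and the underscore, so the UTF-8 bytes of "ā" fall outside `[\w-]`, while `[^/]` accepts any byte except a slash.

```php
<?php
// "\u{101}" is "ā"; in UTF-8 it is the two bytes C4 81.
$path = "recipes/m\u{101}lai-kofta/123";

$strict = '#^recipes/([\w-]+)/(\d+)$#';  // character class from the question
$loose  = '#^recipes/([^/]+)/(\d+)$#';   // character class from the answer

var_dump(preg_match($strict, $path));      // int(0)
var_dump(preg_match($loose, $path, $m));   // int(1)
var_dump($m[2]);                           // string(3) "123"
```

Since only the numeric id is passed on as `i=$2`, the looser first group does not change what the script receives.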
Consider the following scenario:
I want to be able to access http://www.example.com/word/hello/, where the word hello is variable. So I set up .htaccess to configure that.
RewriteEngine On
RewriteRule ^word/(.+)/?$ displayword.php?word=$1 [L]
I used .+ because I also want to filter any symbols such as ?+-.!;: etc.
And I set up my PHP file accordingly:
<?php
echo $_GET['word'];
?>
Remember that this is just a scenario. Now, I went to this URL: http://www.example.com/word/Are you ok?/, and the page outputted this:
Are you ok
And I couldn't figure out why. But then I realised that the question mark symbol is the starting point of the URL variables.
So is there a way to 'url encode' the question mark in the above example, in order for it to be displayed correctly?
There is no need to encode it, try this:
RewriteEngine On
RewriteRule ^word/([a-zA-Z0-9-=_.?]+)/?$ displayword.php?word=$1 [L]
It will display ? in the parameter, along with any other character you add to the [group]. I have not tested whether the rule works, but I suppose it does.
I don't know heaps about .htaccess files, but you could change your PHP script to use $_SERVER['PATH_INFO'] instead of $_GET or $_REQUEST.
Particularly, this comment might help you out.
In the HTTP protocol the "?" separates the querystring from the rest of the URL, so I don't think it will be possible to use it directly inside the URL. One solution would be to encode the question mark into %3F.
Then you can use urldecode() to decode the string.
See this URL Encoding Reference for the encoding of other characters.
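As a minimal sketch of that approach, the link could be built with `rawurlencode()` so the question mark never appears literally in the path. The variable names are arbitrary, and the base URL is the example one from the question.

```php
<?php
$word = 'Are you ok?';

// rawurlencode() turns " " into "%20" and "?" into "%3F".
$encoded = rawurlencode($word);
echo "http://www.example.com/word/$encoded/\n";
// http://www.example.com/word/Are%20you%20ok%3F/

// Decoding restores the original text.
echo urldecode($encoded), "\n"; // Are you ok?
```

Values read from `$_GET` are already decoded once by PHP, so an explicit `urldecode()` is only needed when working with a raw, still-encoded string.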
Change your code to this:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s/+word/([^/]+) [NC]
RewriteRule ^ index.php?word=%1 [L,QSA]
This works because RewriteRule matches only against the URL path, i.e. the part before the question mark, whereas %{THE_REQUEST} contains the full request line, which includes the question mark and everything after it.
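The following PHP sketch mimics what the two approaches get to see, using a hand-written request line for the example URL. It is an illustration only: Apache does the real matching, and the RewriteRule input is approximated here by cutting the request target at the first question mark.

```php
<?php
// Roughly what %{THE_REQUEST} holds for the example URL.
$requestLine = 'GET /word/Are%20you%20ok?/ HTTP/1.1';

// Approximation of what RewriteRule matches against: everything in the
// request target before the first "?".
$target   = explode(' ', $requestLine)[1];
$pathOnly = strstr($target, '?', true);
var_dump($pathOnly);             // string(20) "/word/Are%20you%20ok"

// The RewriteCond pattern from the answer, applied to the full request line.
preg_match('~^[A-Z]{3,}\s/+word/([^/]+)~i', $requestLine, $m);
var_dump($m[1]);                 // string(15) "Are%20you%20ok?"
var_dump(rawurldecode($m[1]));   // string(11) "Are you ok?"
```

The capture from the request line is still percent-encoded, which is fine for the query string that the rule builds from `%1`.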
I have this url:
http://www.site.com/en/about.php?id=112&name=andrew marshall dickens
and i would like to rewrite it like this:
http://www.site.com/112/andrew-marshall-dickens.html
so far:
RewriteRule ^([^/]*)/([^/]*)\.html$ /en/about.php?id=$1&name=$2 [L]
I'm having trouble with the '-' character. Any suggestions? Thanks!
You're attempting to use a regex inside a RewriteRule to remove a character that can appear any number of times in the middle of a string, and that's not really possible. On the other hand, you're passing the ID in, so I assume you can look up the name from the ID in your PHP script, and there's no real need to parse the name from the URL variables. As a third option, if you do want to use the name variable, why not just str_replace the - characters in PHP and ucwords() the string before outputting it?
I believe you don't need to pass the name param, because the name can be looked up from the id.
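The third option could look roughly like this in PHP. Both function names are hypothetical, and the reverse helper is included only to show how the hyphenated form in the pretty URL might be generated.

```php
<?php
// Hypothetical helper: turn the slug from the URL back into a display name.
function slugToName(string $slug): string
{
    return ucwords(str_replace('-', ' ', $slug));
}

// Hypothetical helper: build the slug used in the pretty URL.
function nameToSlug(string $name): string
{
    return str_replace(' ', '-', strtolower(trim($name)));
}

echo slugToName('andrew-marshall-dickens'), "\n"; // Andrew Marshall Dickens
echo nameToSlug('andrew marshall dickens'), "\n"; // andrew-marshall-dickens
```

With this in place, the rewrite rule can pass the hyphenated name through unchanged and leave the conversion to the script.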
Anyway:
RewriteRule ^([0-9]+)/([a-z-]+)\.html$ /en/about.php?id=$1&name=$2 [L]
But hey, reading the comment, I just realized: what's your problem? Your regex should already work.