I want to rewrite a URL through .htaccess.
My actual URL is
front/property/uploadphotos_pid.php?pid=11#NO
I want to convert this to
uploadphotos_pid/11/NO.php
and have written the following code:
RewriteRule ^uploadphotos_pid/([a-zA-Z0-9\-]+)/([a-zA-Z0-9\-]+).php$ front/property/uploadphotos_pid.php?pid=$1&#=$2
The fragment identifier (the section of a URI starting with #) is handled entirely client side. It is not sent to the server. The server (which is where mod_rewrite runs) therefore cannot do anything with it.
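Because the fragment never reaches the server, one workaround is to carry the second value in the query string instead. The following is only a sketch: the parameter name flag is hypothetical (it is not in the original question), the script would have to be changed to read it, and the rule is assumed to live in an .htaccess file in the document root.

```apache
RewriteEngine On

# "flag" is a hypothetical parameter name standing in for the value that
# followed "#" in the original URL; uploadphotos_pid.php would need to read it.
# The dot before "php" is escaped so that it only matches a literal dot.
RewriteRule ^uploadphotos_pid/([a-zA-Z0-9-]+)/([a-zA-Z0-9-]+)\.php$ front/property/uploadphotos_pid.php?pid=$1&flag=$2 [L]
```

Under those assumptions, a request for uploadphotos_pid/11/NO.php would be handled by front/property/uploadphotos_pid.php with pid=11 and flag=NO.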
A literal # must be escaped as %23 in a URL; an unescaped # starts the fragment.
Related
I have the following rewrite rule in .htaccess :-
RewriteRule ^.*/-y.* /handleurl.php [L]
Its purpose is to display appropriate pages depending on the values in the url, for example:
example.com/books/BookA/-y?act=x will display bookA page
the variable holding the book name is encoded such that ...
example.com/books/Book B/-y?act=x becomes example.com/books/book+B/-y?act=x
... which is fine (it's decoded in handleurl.php)
however if the book is called Book A/B I have ...
example.com/books/Book A/B/-y?act=x which becomes example.com/books/Book+A%2FB/-y?act=x
It appears that htaccess decodes this before the rewrite rule, so the rewrite rule sees too many elements in the URL delineated by the /.
Is there any way I can get the rewrite rule to ignore the encoded / as intended?
I have seen a previous response to a similar question, but I only need the / to be ignored, not other encoded characters.
It appears that htaccess decodes this before the rewrite rule, so the rewrite rule sees too many elements in the URL delineated by the /
This is not the problem. Whether the URL-path /books/Book+A%2FB/-y is decoded or not makes no difference here (*1). Both forms would match the (rather generous) regex ^.*/-y.* in the RewriteRule pattern.
(*1 But yes, the URL-path matched by the RewriteRule pattern is URL decoded, ie. %-decoded.)
The problem is likely to be that Apache (by default) rejects, with a 404, any URL that contains a %-encoded slash, ie. %2F (or backslash, %5C), in the URL-path portion of the URL. This is a security feature; allowing them "could potentially allow unsafe paths" (source).
However, this can be overridden with the AllowEncodedSlashes directive. This directive can only be used in a server or virtualhost context; it cannot be used in .htaccess.
You need to set either AllowEncodedSlashes On, which allows encoded slashes and decodes them along with other encoded characters, or AllowEncodedSlashes NoDecode, which permits encoded slashes but does not decode them. The latter is preferred and is probably what you are expecting.
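As a rough illustration of where the directive goes, here is a minimal virtual host sketch. The ServerName and DocumentRoot values are placeholders, not taken from the question.

```apache
# Hypothetical virtual host; ServerName and DocumentRoot are placeholders.
<VirtualHost *:80>
    ServerName example.com
    DocumentRoot /var/www/example

    # Accept %2F (and %5C) in the URL-path, but leave them encoded
    # instead of turning them into path separators.
    AllowEncodedSlashes NoDecode
</VirtualHost>
```

The server has to be restarted or reloaded for a change in the server config to take effect, unlike a change to .htaccess.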
Aside#1:
RewriteRule ^.*/-y.* /handleurl.php [L]
The regex ^.*/-y.* is very generic, possibly too generic. This is the same as simply /-y. What is the .* after -y intended to match? From your example URLs it looks like -y is always at the end of the URL-path, so this could be anchored, eg. /-y$. And if the URL that you need to match always starts /books/ then maybe this should also be included in the regex?
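A tighter version of the rule might look like the sketch below. It assumes, as suggested above, that every URL to be handled starts with /books/ and ends in /-y, and that the rule stays in the .htaccess file in the document root (so the pattern is matched without a leading slash).

```apache
# Sketch only: assumes all matching URLs look like /books/<name>/-y
# The query string (e.g. ?act=x) is passed through unchanged because the
# substitution does not specify one of its own.
RewriteRule ^books/.+/-y$ /handleurl.php [L]
```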
Aside#2:
...the book name is encoded such that ...
example.com/books/Book B/-y?act=x becomes example.com/books/book+B/-y?act=x ... which is fine (it's decoded in handleurl.php)
This isn't strictly "URL encoded"; you have converted the space into a + in the URL-path. The + is a valid "URL encoding" for a space in the query string only. A + in the URL-path is a literal + (and will be seen by search engines as such). In the URL-path, a space would be URL encoded as %20. (You may have used the wrong PHP encoding function, eg. urlencode() instead of rawurlencode()?)
Of course, you are free to convert/encode the URL however you wish to create a more readable URL - providing it's valid.
The rewrite rule was never the problem. I think it was Apache not liking the encoded '/' and the fact that the downstream URL handling program was using '/' as a delimiter when identifying the individual URL elements. I have to work out: 1) whether I want to allow '/' in the variables that make up the elements of the friendly URL, and 2) if so, how to pass it without upsetting Apache and how to subsequently dissect the URL. Maybe I will convert '/' to '~' for the benefit of the URL, then convert back to '/' prior to subsequent display. Thank you Mr White.
Given this url:
http://test.com/myfile/product/1
and the following RewriteRule:
RewriteRule ([^/.]+)/?(.*) app/$1.php?$2
I would expect the url to become:
http://test.com/app/myfile.php?product/1
and it does when I use an online htaccess tester. But on my local dev environment I get this:
The requested URL /app/app.php was not found on this server.
Why? This can't be right, right? I suspect it is a bug caused by my setup (docker containers and dinghy-http-proxy) but since I am new to this rewriting I am not sure.
Try this:
RewriteRule ^([^/.]+)/?(.*) app/$1.php?$2
The problem is that the regex can match anywhere in the path string, and since the / is optional the result is unlikely to be what you want.
Also, make sure that you don't have multiple rewrite rules that apply; by default they are all processed in order unless a matching rule carries the [L] flag.
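One possible explanation for the /app/app.php error, assuming the rule lives in an .htaccess file in the document root: in that context the rewritten path is passed through the rules again, and app/myfile.php itself matches the pattern with $1 = app, which yields app/app.php. An online tester that runs only a single pass would not show this. A common guard is to stop processing for anything already under app/, as in this sketch:

```apache
RewriteEngine On

# Anything already under app/ is left alone ("-" means no substitution),
# so the result of the rule below is not rewritten a second time.
RewriteRule ^app/ - [L]

# /myfile/product/1  ->  app/myfile.php?product/1
RewriteRule ^([^/.]+)/?(.*) app/$1.php?$2 [L]
```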
http://localhost/foo/profile/%26lt%3Bi%26gt%3Bmarco%26lt%3B%2Fi%26gt%3B
The URL above gives me a 404 error. The URL is built with urlencode(htmlspecialchars($foo)); and $foo is <i>badhtml</i>.
The url works fine when there's nothing to encode e.g. marco.
Thanks. =D
Update: I'm supposed to capture the segment in the encoded part of the uri, so a 404 isn't supposed to appear.
There isn't any document there, marco is simply the string that I needed to fetch that person's info from db. If the user doesn't exist, it won't throw that ugly error anyways.
Slight idea of what's wrong: I found out that if I use <i>badhtml<i>, it works just fine, but <i>badhtml</i> won't. What do I do so that I can keep the / in the </i>?
It probably thinks of the request as http://localhost/foo/profile/<i>badhtml</i>, with the / in </i> seen as part of the path.
Since there is a slash / in the parameter, this is getting interpreted as a path name separator.
The solution, therefore, is to replace all occurrences of a slash with something that doesn't get interpreted as a separator. \u2044 or something. And when reading the parameter back in, change all \u2044s back to normal slashes.
(I chose \u2044 because this character looks remarkably like a normal slash, but you can use anything that would never occur in the parameter, of course.)
It is most likely that the regex responsible for handling the URL rewrite does not like some of the characters in the URL-encoded string. This is most likely an httpd/Apache question rather than a PHP one. Your best bet is to start by looking at the .htaccess (the file containing the URL rewrite rules).
This answer assumes that you are trying to pass an argument through the URL, rather than access a file named <i>badhtml</i>.
Mr. Lister, you rocked.
I would like to rewrite the following URL
www.mysite.com/mypage.php?userid=ca49b6ff-9e90-446e-8a92-38804f3405e7&roleid=037a0e55-d10e-4302-951e-a7864f5e563e
to
www.mysite.com/mypage/userid/ca49b6ff-9e90-446e-8a92-38804f3405e7/roleid/037a0e55-d10e-4302-951e-a7864f5e563e
The problem here is that the PHP file can be anything. Do I have to specify rules for each page in the .htaccess file?
How can I do this using the rewrite engine in PHP?
To get the rewrite rule to work, you have to add this to your apache configs (in the virtualhost block):
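RewriteEngine On
# The optional leading slash lets the pattern match in a virtualhost block
# (where the path starts with /) as well as in .htaccess (where it does not).
RewriteRule ^/?([^/]*)/userid/([^/]*)/roleid/(.*)$ /$1.php?userid=$2&roleid=$3 [L,NS]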
RewriteRule basically accepts two arguments. The first one is a regex describing what it should match. Here it is looking for the user requesting a URL like /<mypage>/userid/<uid>/roleid/<rid>. The second argument is where it should actually go on your server to handle the request (in this case, your PHP file). It refers back to the groups in the regex using $1, $2, and $3.
RewriteEngine on
RewriteBase /
RewriteRule ^mypage/userid/([a-z0-9-]+)/roleid/([a-z0-9-]+)$ mypage.php?userid=$1&roleid=$2 [L]
No, you don't need a separate rule for every PHP file. You can make the filename variable in your regex, something like this:
RewriteRule ^([a-z0-9]+)/userid/([a-z0-9-]+)/roleid/([a-z0-9-]+)$ $1.php?userid=$2&roleid=$3 [L]
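Since the filename is now taken from the URL, it may be worth rewriting only when the target script really exists. The sketch below assumes the rule sits in an .htaccess file in the document root and that all scripts live directly in that directory.

```apache
RewriteEngine On
RewriteBase /

# $1 here refers to the first group of the RewriteRule pattern below.
# The rewrite only happens if <name>.php exists under the document root.
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^([a-z0-9]+)/userid/([a-z0-9-]+)/roleid/([a-z0-9-]+)$ $1.php?userid=$2&roleid=$3 [L]
```

Requests whose first segment does not correspond to a script fall through to whatever rules or default handling follow.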
If you want to rewrite the latter URL that is entered in the browser TO the first format, you would want to use a .htaccess file.
However, if you want to produce the pretty URLs in PHP (e.g. for use in link tags), then you have two options.
First, you could simply build the URL directly (instead of converting) which in my opinion is preferred.
Second, you could convert the first (ugly) URL to the latter, pretty URL. You would then need to use preg_replace() in PHP. See http://php.net/manual/en/function.preg-replace.php for more info. Basically, you would want to use something like
$rewrittenurl = preg_replace('#mysite\.com/mypage\.php\?userid=([a-z0-9-]+)&roleid=([a-z0-9-]+)$#', 'mysite.com/mypage/userid/$1/roleid/$2', $firsturl);
Good luck!
What I'm trying to do:
have pretty URLs in the format 'http://domain.tld/one/two/three', that get handled by a PHP script (index.php) by looking at the REQUEST_URI server variable.
In my example, the REQUEST_URI would be '/one/two/three'. (Btw., is this a good idea in general?)
I'm using Apache's mod_rewrite to achieve that.
Here's the RewriteRule I use in my .htaccess:
RewriteRule ^/?([a-zA-Z/]+)/?$ /index.php [NC,L]
This works really well thus far; it forwards every REQUEST_URI that consists of a-z, A-Z or a '/' to /index.php, where it is processed.
Only drawback: '?' (question marks) and '#' (hash keys) seem to still be allowed in the REQUEST_URI, maybe even more characters that I've yet to find.
Is it possible to restrict those via my .htaccess and an adequate addition to the RewriteRule?
Thanks!
The fragment identifier, e.g. #some-anchor, is controlled by the browser, not the server. JavaScript would be needed to redirect and remove this, although why you would want to do so I am not sure.
[SNIPPED after clarification]
To rewrite only when the query string is empty:
RewriteCond %{QUERY_STRING} ^$
RewriteRule ^/?([a-zA-Z/]+)/?$ /index.php [NC,L]
In mod_rewrite and PHP the variable REQUEST_URI refers to two different parts of the URI. In mod_rewrite, %{REQUEST_URI} contains the current URI path; in PHP, $_SERVER['REQUEST_URI'] contains the URI path and query. In both cases the URI fragment is missing, as this part of the URI is not transmitted to the server but only used by the client.
So, when /one/two/three?foo#bar is requested, mod_rewrite’s %{REQUEST_URI} contains /one/two/three and PHP’s $_SERVER['REQUEST_URI'] contains /one/two/three?foo.
The $_SERVER['REQUEST_URI'] variable will contain the original REQUEST_URI as received by the server, before you perform the rewrite. Therefore it's impossible (as far as I know this early in the morning) to remove the query string portion from the REQUEST_URI's attribute, but you naturally have the option of removing it when you process the $_SERVER['REQUEST_URI'] variable in your script.
If you want to only perform your RewriteRule when the query string is not specified, the following should work:
RewriteCond %{QUERY_STRING} !^.+$
RewriteRule ^/?([a-zA-Z/]+)/?$ /index.php [NC,L]
Note that this might be problematic though, since if there's accidentally a query string in a URL that someone uses to link to your site, your script wouldn't be handling it (since the rewrite never happens), so they'll get a 404 response (or whatever the case may be) that might not be as user-friendly as if you had just chosen to silently ignore the trailing information.
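If a 404 for stray query strings is not acceptable, one alternative (a sketch, not something proposed in the answers here) is to redirect to the same path with the query string removed and only then hand the request to index.php. Note that this discards tracking parameters as well, so it is only suitable if no query parameters are ever needed.

```apache
RewriteEngine On

# If any query string is present, redirect to the same path without it.
# The trailing "?" in the substitution discards the original query string.
RewriteCond %{QUERY_STRING} .
RewriteRule ^/?([a-zA-Z/]+)/?$ %{REQUEST_URI}? [NC,R=302,L]

# Requests without a query string go to the front controller as before.
RewriteRule ^/?([a-zA-Z/]+)/?$ /index.php [NC,L]
```

After the redirect the browser makes a fresh request, so the path seen by the script no longer carries the query string.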
If I understand correctly, you want to forbid the use of ? and # on your site?
You shouldn't do that, because:
the hash (#) is used in Google's specification for AJAX URLs,
the question mark (?) is used, for example, by Google AdWords and Analytics or any affiliate program.
So if you force Apache to reject URL requests containing a question mark, people who click on your ad in AdWords will only see a 404 error page.
There is nothing wrong with letting people use both of them. The point is to protect your site against XSS attacks.
By the way, there is another very important character: the percent sign (%), which is used to encode special characters (like Polish or German national letters).