Get page id value and display in URL via .htaccess - php

I'm trying to display SEO-friendly URLs by using a rewrite in our .htaccess file, but I can't get it to work (I've researched many of the related topics on Stack Exchange and elsewhere, but to no avail). I'd like to get the value of the id on this page...
http://199.119.123.135/info/tool_surety_company.php?id=1
...and display the id value in the URL instead of the ugly "tool_surety_company.php?id=1".
I'm going for a result like this: http://199.119.123.135/info/travelers-group
I'm using the following code in my .htaccess file:
RewriteCond %{THE_REQUEST} \ /+info/tool_surety_company\.php\?id=([^&]+)
RewriteRule ^ /info/%1/? [L,R]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^^info/([^/]+)/?$ /info/tool_surety_company.php?id=$1 [QSA]
But I'm receiving a 404 error.
Any ideas? Thanks in advance!

There might be something I'm misunderstanding here, but I believe there would need to be a mechanism in the server-side code to load the correct content for the new "SEO-friendly URL". In other words, sure, you can redirect the user to show a different URL, but how is the server going to know what content to load for that new URL?
Here's a good resource for putting together a simple example.
https://moz.com/ugc/using-mod-rewrite-to-convert-dynamic-urls-to-seo-friendly-urls
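To make the point above concrete, here is a minimal PHP sketch of such a server-side lookup. It assumes the rewrite hands the slug (for example travelers-group) to the script instead of the numeric id. The function name resolve_company_id and the $slugToId table are hypothetical; a real site would more likely read the mapping from a database.

```php
<?php
// Hypothetical slug-to-id table; a real site would probably query a database.
$slugToId = [
    'travelers-group' => 1,
];

/**
 * Return the numeric id for a URL slug, or null when the slug is unknown
 * (the caller should then answer with a 404).
 */
function resolve_company_id(string $slug, array $slugToId): ?int
{
    // Normalise: trim slashes left over from the rewrite and lowercase.
    $slug = strtolower(trim($slug, '/'));
    return $slugToId[$slug] ?? null;
}
```

The page script could call this with the slug taken from the query string and send a 404 whenever it returns null.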
Update:
From here - https://mediatemple.net/community/products/dv/204643270/using-htaccess-rewrite-rules
TROUBLESHOOTING
404 Not Found
Examine the new URL in your browser closely. Does it match a file that
exists on the server in the new location specified by the rewrite
rule? You may have to make your rewrite rule more broad (you may be
able to remove the $1 from the second string). This will direct
rewrites to the main index page given in the second string. Or, you
may need to copy files from your old location to the new location.
In other words, the only reason you would be getting a 404 is because the server does not find the file that is requested as defined in the URL visible in your browser address bar.
Htaccess Rewrites are enabled by using the Apache module mod_rewrite,
which is one of the most powerful Apache modules and features
available. Htaccess Rewrites through mod_rewrite provide the special
ability to Rewrite requests internally as well as Redirect requests
externally.
When the url in your browser's location bar stays the same for a
request it is an internal rewrite, when the url changes an external
redirection is taking place. This is one of the first, and one of the
biggest mental-blocks people have when learning about mod_rewrite.
More info from here:
http://www.askapache.com/htaccess/modrewrite-tips-tricks.html

Related

Reroute any subdirectory to script

I am trying to set up simple URL routing in a Perl web project without having to include a framework just for that purpose. I believe this can be accomplished with an .htaccess file.
The plan is for any request to the server using example.com/anysubdirectory/... to be routed to a Perl/PHP script that will parse whatever is contained in /anysubdirectory/... and the parameters following it, and then determine where to send the user based on that info.
If example.com is requested without any subdirectory, I still need to maintain the default behavior of searching for an index page there.
Since /anysubdirectory/ will be dynamic, I'm not able to predefine that /123/ -> option 1 or /abc/ -> option 2.
I am not overly familiar with .htaccess other than the typical www and base rewrites.
Any help is much appreciated.
I believe I answered my own question using the following in the root .htaccess:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ router.pl?action=$1 [L,NC,QSA]
This seems to be working with my initial testing.
I believe this is how it is working:
If the requested subdirectory is not found as a file
If the requested subdirectory is not found as a directory
Rewrite the request to the router.pl script, passing the requested path as the action parameter along with any leftover parameters from the original URL.
EDIT: The above is not working completely. It still sends any file that is not found on the server to the router.pl script. That is not really the functionality I am looking for; I would like this to happen only if the request is for a subdirectory, not for an invalid file.
I'm not sure I want any bot that's guessing filenames to be hitting my script on a regular basis.
Please correct this response if any of the above is not accurate.
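One possible way to address the concern in the EDIT is to let the router script itself refuse anything that looks like a file name. The sketch below is in PHP (the question allows a perl/php script). The function name is_routable_path and the rule "a final segment containing a dot is a file request" are assumptions made for illustration.

```php
<?php
/**
 * Decide whether the rewritten path (the "action" query parameter) looks
 * like a subdirectory request that the router should handle.
 * Assumption: a final segment containing a dot (e.g. "foo.jpg") is a file
 * request and should get a plain 404 instead of being routed.
 */
function is_routable_path(string $action): bool
{
    $path = trim($action, '/');
    if ($path === '') {
        return false; // nothing to route; the index page handles the root
    }
    $segments = explode('/', $path);
    $last = end($segments);
    return strpos($last, '.') === false;
}
```

With a check like this, bots guessing names such as images/missing.jpg would receive a cheap 404 before any routing logic runs.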

POST request to SEO URL Forbidden

I have a basic MVC system that is sending POST data to URLs such as
admin/product/add/
But this is giving me an error
Forbidden
You don't have permission to access
/admin/product/add/ on this server.
Additionally, a 404 Not Found error was encountered while trying to
use an ErrorDocument to handle the request.
The RewriteRule is simply
RewriteRule ^(.*)/$ index.php?uri=$1
The last time I saw this on a server, changing file/directory permissions to 755 seemed to fix it, but not this time. I have never really understood the reason for the error, so I was hoping someone could provide some more information.
You have 2 errors:
You don't have permission to access /admin/product/add/ on this server.
Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.
The second one is almost certainly a consequence of the same bug. You may have something in your Apache configuration that takes 404 errors away from the default HTTP server handling and pushes them to your PHP application. If the PHP application were working we would get a nice 404, but...
The first one tells you that your PHP application is not running at all.
So, this first error tells us that Apache tried to directly access the directory /path/to/documentroot/admin/product/add/ on your server and to produce a listing of it (a listing of the directory content would only be produced if Apache were authorized to do so). But of course this is not a real directory on your server; it is a virtual path in your application. So Apache ends up with a 404 (which leads to error 2).
The application handles the virtual path; Apache does not manage it. The RewriteRule's job is to catch the requested path before Apache tries to serve it and hand it to one single PHP file (index.php) as a query string argument.
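As a rough illustration of what index.php might do with that query string argument, the following sketch splits the uri value into segments. The function name parse_uri and the returned keys (area, controller, action) are hypothetical and not taken from the asker's MVC system.

```php
<?php
/**
 * Split a rewritten path such as "admin/product/add" into parts.
 * The keys (area, controller, action) are illustrative names only.
 */
function parse_uri(string $uri): array
{
    // Drop leading/trailing slashes and empty segments.
    $segments = array_values(array_filter(explode('/', $uri), 'strlen'));
    return [
        'area'       => $segments[0] ?? null,
        'controller' => $segments[1] ?? null,
        'action'     => $segments[2] ?? null,
    ];
}
```

In index.php this would typically be called on the uri query parameter; if the script is never reached, as in this question, none of this code runs and Apache reports its own error.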
So... this rewrite rule was not applied. Many things could prevent the rule from being applied:
mod_rewrite not activated: is the module present and enabled (RewriteEngine on)?
Syntax error: mod_rewrite syntax is quite hard to read and sometimes really complex, but here it seems quite simple.
The file that the RewriteRule resolves to may not be a valid target for Apache. If index.php is not present in the DocumentRoot, or is not readable by the Apache user, Apache will fail. Warning: a file being readable by the Apache user means read rights on the file but also execute rights on all parent directories for the Apache user. This is where the classic chmod/chown solutions fix the problem.
The rule must be in a valid configuration file. Is the rule in an Apache configuration file, inside a Location or Directory section? Or maybe in the global scope (this may alter the RewriteRule syntax)? Or is it in a .htaccess file? If it is in a .htaccess file, does Apache read .htaccess files, and are mod_rewrite instructions allowed there (check for AllowOverride None)? Are there other .htaccess files taking precedence?
So to fix the problem:
If you have Apache 2.2.16 or later, you can replace the RewriteRule with FallbackResource /index.php to check that this does not come from a mod_rewrite problem.
Try to request index.php directly, so that at least a direct request to this file works.
Try to directly access a valid resource in the DocumentRoot (a text file, an image, something that should not be handled by the rewrite but served directly).
Check that, if any of your virtual paths could map to real physical paths, Apache is not trying to serve the physical one (as happens when you write a RewriteCond %{REQUEST_FILENAME} !-d) but really pushes the path to index.php.
Check the Apache error logs.
Debug mod_rewrite with RewriteLog and RewriteLogLevel (Apache 2.2; Apache 2.4 replaced these with LogLevel rewrite:trace levels).
Collect facts, settings and tests, and then post them to SO or Server Fault.
So the problem is quite simple: the PHP application is not receiving the request. But there is a very large number of ways to end up in this state. The message in itself is not very important. The only way to find the error is to check all the parameters (or to have years of bug-fixing experience and to have developed a precognitive intuition organ for LAMP bugs, usually a beard, like admins have). And the only way for us to help you is to find strange facts in a big list of configuration details. This is why good questions contain a lot of information, even if all of that information looks simply "classical" to you.
EDIT
To clarify the problem you should edit your question. Track the POST requests with tools such as the Chrome developer tools or Firebug (keep the network tracking in record mode to catch several POSTs), or try to replay the POST with the replay feature of Live HTTP Headers. You should try to isolate the problematic POST and give us details. Debugging is not magic.
Now, I do know of one magical random POST failure: the empty GET URL bug. It could be that (or not). You may have an empty URL hidden somewhere (<IMG SRC="">, url() in CSS, or an empty LINK in the headers, for example). Such empty URLs are defined in HTTP as "replay the request which launched the source page", and some browsers even replay the POST that gave you the page if they find one. This could lead to broken hidden POSTs.
It could also be that the POST is not sent to the right server. Hard to say. So please collect information from your comments, add some more network analysis, and edit the question, which currently does not contain enough facts.
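To hunt for the empty-URL case mentioned above, a quick scan of the generated HTML can help. The following is a rough sketch, not a full HTML parser: the function name find_empty_url_tags and the list of tags it checks are assumptions, and it does not look inside CSS url() values.

```php
<?php
/**
 * Return the tag names of elements whose src/href attribute is empty.
 * A rough regex scan, not a full HTML parser.
 */
function find_empty_url_tags(string $html): array
{
    $pattern = '/<(img|script|link|iframe)\b[^>]*\b(?:src|href)\s*=\s*(?:""|\'\')/i';
    preg_match_all($pattern, $html, $matches);
    return array_map('strtolower', $matches[1]);
}
```

Running it over the page that is served just before the failing POST gives a first hint of whether an empty attribute could be triggering a hidden replay.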
Try this:
RewriteCond %{REQUEST_METHOD} =POST
RewriteRule ^(.*)/$ index.php?uri=$1
Use this:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?uri=$1 [L]
Also use only the www or the non-www domain, but not both at the same time. Redirect users with .htaccess to whichever one you prefer...
NonWWW to WWW:
RewriteCond %{HTTP_HOST} !^www\.(.*)$ [NC]
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
WWW to NonWWW:
RewriteCond %{HTTP_HOST} ^www\.(.*)$ [NC]
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]

301 redirect in .htaccess for 30,000 errors

I've been tasked to clean up 30,000 or so url errors left behind from an old website as the result of a redesign and development.
I normally use .htaccess to do this, but I doubt it would be wise to have 30,000 301 redirects inside the .htaccess file!
What methods have some of you used to solve this problem?
Thanks in advance.
Here is how you can do it with Apache httpd:
RewriteMap escape int:escape
RewriteMap lowercase int:tolower
RewriteMap my_redir_map txt:map_rewrite.txt
RewriteCond ${my_redir_map:${lowercase:${escape:%{HTTP_HOST}%{REQUEST_URI}}}} ^(.+)$
RewriteRule .* http://%1 [R=301,L]
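I usually put these rewrite rules directly inside the Apache httpd configuration (the RewriteMap directive itself is only allowed in the server or virtual host configuration, not in .htaccess).
Inside the map_rewrite.txt file you have a tab-delimited list of redirects in the following format: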
www.example.it/tag/nozze www.example.it/categoria/matrimonio
www.example.it/tag/pippo www.example.it/pluto
www.example.it/tag/ancora www.google.com
It would be much easier if you could generalize the approach, which is possible when the redirects share a common pattern. If they don't, you only need to add each redirected URL to the list.
Take care to study the RewriteMap configuration, because you can also store the list in a different format, for example a database table.
Please pay attention to this: I added escape and lowercase only because there are accents in the URLs I need to rewrite. If your URLs don't have accents, you can remove both.
If you want to implement these redirects in PHP, here is the code you need:
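<?php
// Placeholder destination from the original answer; substitute the real target URL.
$dest_url = "http://example.com/path...";
header("HTTP/1.1 301 Moved Permanently");
header("Location: " . $dest_url);
exit; // stop here so nothing is sent after the redirect headers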
Create a PHP page to operate as a 404 handler. It should inspect the incoming URL, check whether it should map from an old page to a new page, then issue a 301. If there is no mapping, present a 404.
Simply set this page as the 404 handler in your .htaccess and there you go. IIRC this is how WordPress used to handle 'clean' URLs on IIS before IIS7 brought in URL rewriting without needing a third-party DLL.
I have made a redirect class that sits on the 404 page. It checks the database for a valid page to 301 redirect to and redirects there instead of showing the 404 page. If it can't find one, it records the URL in the database as a 404 so it can be fixed later.
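The 404-handler approach described in the two answers above could look roughly like this. It is a sketch under assumptions: the function name resolve_legacy_url, the array-based map (standing in for the database lookup) and the example paths are all hypothetical.

```php
<?php
/**
 * Look up an old path in a redirect map.
 * Returns [301, newUrl] when a mapping exists, otherwise [404, null].
 * $map stands in for a database table of old => new URLs.
 */
function resolve_legacy_url(string $requestUri, array $map): array
{
    // Compare on the path only, ignoring any query string.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return [404, null];
    }
    // Treat "/old" and "/old/" as the same entry, case-insensitively.
    $key = rtrim(strtolower($path), '/');
    if (isset($map[$key])) {
        return [301, $map[$key]];
    }
    return [404, null];
}

// Illustrative use inside the 404 handler page (commented out so the
// function can be loaded without sending headers):
// [$status, $target] = resolve_legacy_url($_SERVER['REQUEST_URI'], $map);
// if ($status === 301) { header('Location: ' . $target, true, 301); exit; }
// http_response_code(404);
```

Keeping the lookup separate from the header calls makes it easy to log unmatched URLs for later cleanup, as the last answer suggests.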
Thanks for the help, guys. I've carried out the course of action suggested by freedev, but have created a separate config file within Apache.
Within the httpd.conf file I have added:
# Map settings
Include "conf/extra/map.conf"
The map.conf file:
RewriteEngine On
RewriteMap url_rewrite_map txt:conf/map.map
RewriteCond ${url_rewrite_map:$1|NOT_FOUND} !NOT_FOUND
RewriteRule ^(.*) http://website.com/${url_rewrite_map:$1} [R=301]
The map.map file is formatted as:
/oldname/ /newname
I've added quite a few of the URLs for redirection and so far so good; it isn't having a massive impact on the server like it did when they were added to .htaccess.

How to setup .htaccess to show 404 for unallowed urls?

I noticed in Drupal that, with clean URLs enabled, if you add .php to the URL of any page it gives you a 404 message. The page is obviously a PHP page, but the .htaccess prevents the user from tampering with file extensions in the URL bar. How could you do this using .htaccess? I have file extensions omitted at the moment, but would also like to add that feature. Thank you.
Also, this question does not pertain to Drupal. I only mentioned Drupal as an example.
Just because a file contains PHP code it doesn't mean it has to have the .php extension; even more so when you're accessing a file over the internet.
When you request http://mysite.com/page and you're using an .htaccess like Drupal's, the request is forwarded on to index.php?q=page, whereupon Drupal will check its database for a path matching page. If it finds one it will display the content for that page; if not it will (rightly) give a 404.
If you want all of your pages to be accessible with a PHP extension you could add an extra rule in your .htaccess file to remove .php from any request where the PHP file doesn't physically exist:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)\.php$ $1 [NC]
Bear in mind though that this adds zero extra value for your site's visitors (in fact they have to remember a file extension as well as the path to the page), and it exposes exactly what server-side technology you're using so a potential attacker would have some of his work done for him.
Hope that helps.
Could you please explain that in more depth? How can it redirect content into an existing page? Is that common practice / the typical way of doing things?
Yes it is a very common practice, used by most frameworks and CMS.
The principle is simple: you set up your .htaccess so that every request which doesn't match a real file or directory is redirected to a front controller, usually the index.php in the root directory of the application. That front controller handles the request by analyzing the URL and calling the necessary actions.
In this way you can minimize the rewrite rules to just one, and you can offer customized 404 pages.
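A front controller of the kind described above can be sketched in a few lines. The route table, the handler closures and the function name dispatch are hypothetical; real frameworks and CMSs do considerably more.

```php
<?php
/**
 * Map a request path to a handler; unknown paths yield a 404.
 * Returns [statusCode, body].
 */
function dispatch(string $path, array $routes): array
{
    // Normalise so that "about", "/about" and "/about/" are equivalent.
    $path = '/' . trim($path, '/');
    if (!isset($routes[$path])) {
        return [404, 'Page not found'];
    }
    return [200, $routes[$path]()];
}

// Hypothetical route table for illustration.
$routes = [
    '/'      => function () { return 'Home'; },
    '/about' => function () { return 'About us'; },
];
```

Because only known paths are in the table, a request such as /about.php falls through to the 404 branch, which matches the behavior the question observed.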
I don't know Drupal, but in the usual PHP app every request is routed to the front controller, which performs some validation and returns a 404 on errors.
easy-peasy

How to understand PHP's URL parsing/routing?

I just inherited a website built in PHP. The main page of www.mysite.com has a href to www.mysite.com/index/35.html somewhere in the page. In the site's root directory and its children there is no document 35.html.
The number 35 is actually an id found in a DB which also holds the html contents of the page.
If I load URL: www.mysite.com/index.php?id=35 the same page loads.
How does PHP know how to automatically convert
/index/35.html
to
/index.php?id=35
EDIT
Based on the answers, I have found a .htaccess file containing rewrite instructions that would explain the functionality.
However, IIS doesn't seem to know (or is not configured to know) how to use this, probably because this is an Apache feature.
So this begs the following question: Is there a way to configure IIS to work with this?
It will be done using URL rewriting in .htaccess, which should be in the web root.
It may look something like:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php [QSA,L]
There may be other bits, but what this basically tells Apache is to send anything that DOES NOT physically exist to index.php.
It doesn't. There is a mod_rewrite rule that rewrites from /index/foo to /index.php?id=foo, either in a .htaccess file somewhere or in the httpd configuration itself.
RewriteEngine On
RewriteRule ^index/([\d]+)\.html /index.php?id=$1 [NC,L]
This is off the top of my head. Any browser trying to load an address that starts with index/ followed by any number and ends in .html will be internally rewritten to index.php?id= whatever the number is.
Edit: Just saw that you're working on IIS. This probably won't work for you. Sorry.
I think you will be using .htaccess to redirect all requests to index.php. From there you can pass the query string to a routing class, which will parse the URL and identify the unique ids.
In this case, your routing class would parse the request /index/35.html into indexController, indexAction, id=35. Now you can pass this id to the model to get the corresponding page contents.
NB: Here I am assuming you are using the MVC pattern. Anyway, it can be handled in your own way, with the concept remaining the same. Hope this makes sense.
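The parsing step described in this answer might look like the following. This is only a sketch: the function name parse_route, the returned array keys and the choice to derive both the controller and action names from the first path segment are assumptions, and the pattern simply mirrors the /index/35.html example from the question.

```php
<?php
/**
 * Parse a path like "/index/35.html" into controller, action and id.
 * Returns null when the path does not match the expected shape.
 */
function parse_route(string $path): ?array
{
    if (!preg_match('#^/?([a-z]+)/(\d+)\.html$#i', $path, $m)) {
        return null;
    }
    return [
        'controller' => strtolower($m[1]) . 'Controller',
        'action'     => strtolower($m[1]) . 'Action',
        'id'         => (int) $m[2],
    ];
}
```

The id returned here is what would be handed to the model to fetch the page contents from the database.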
