validating .htaccess before deployment - php

In order to get better SEO and cleaner URLs, I tend to export certain RewriteRules directly into the .htaccess (e.g. RewriteRule ^The_North_Face(.*)$ index.php?a=brands&id=27&extras=$1 [NC,L] and so forth for each brand or category). It's a lot more complex than that, but today I discovered that the file is only as good as the data it's built from. The site owner managed to enter empty category names / URLs and some unescaped characters, which caused a nasty internal server error and blocked all site access (including the tool that rebuilds the file).
I realise that the best defence here will probably be good training plus a failsafe at the CMS level. Regretfully, this is a third-party solution called CubeCart which I can't dip into for the time being; the SEO solution was supposed to be standalone and only use the CubeCart data.
Obviously, I'd have to add some checks to do with brand / category / landing page names. Even so, I'd very much like to parse / validate the newly built .htaccess before replacing the 'live' one in order to avoid possible issues to do with syntax. Are there any syntax validators / ways to test Apache against a new .htaccess?
I can also think of deploying it in a sub-directory, then using curl to make a few GET requests as a test. Is there anything else I can do?
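The name checks mentioned above might look roughly like the following sketch. It assumes each brand arrives as a name plus a numeric id; buildBrandRule is a hypothetical helper, not part of CubeCart, and it uses $1, the backreference to the RewriteRule pattern's own capture group. A rule is only emitted when the name survives slugging, so empty or punctuation-only names never reach the file.

```php
<?php
// Hypothetical helper (not part of CubeCart): builds one RewriteRule line
// for a brand, or returns null when the data would produce a broken rule.
function buildBrandRule(string $name, int $id): ?string
{
    // Reduce the name to a URL slug: runs of other characters become "_".
    $slug = trim((string) preg_replace('/[^A-Za-z0-9-]+/', '_', $name), '_');

    // An empty slug or a non-positive id is skipped, not written out.
    if ($slug === '' || $id <= 0) {
        return null;
    }

    // preg_quote() escapes any remaining regex metacharacters (here "-").
    $pattern = '^' . preg_quote($slug) . '(.*)$';

    return sprintf(
        'RewriteRule %s index.php?a=brands&id=%d&extras=$1 [NC,L]',
        $pattern,
        $id
    );
}
```

The generator would then write only the non-null lines to a temporary file and swap it in for the live .htaccess after the sub-directory test has passed.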

You could use something like what WordPress does:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
... and then in the index.php file parse $_SERVER["REDIRECT_URL"] against your website's URI logic. This way it is easier to handle database values such as brand or category automatically in PHP, without editing the .htaccess file on every content change.

I would suggest rewriting all requests to a single file, rewrite.php. There, you parse the requested path and match it against an array of rules. You use the result to fill the $_GET array and then include the correct file.
PHP syntax errors are much easier to find, and bad data in a rule is far less likely to take the whole site down with a 500 error page.
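A minimal sketch of that rule matching, assuming a hand-written rule table; the regexes, file names and parameter keys below are hypothetical examples, and matchRoute is an illustrative name rather than anything from the original answer.

```php
<?php
// Hypothetical rule table: regex => target file plus the $_GET keys
// that the capture groups should fill, in order.
$rules = [
    '#^/brands/(\d+)/?$#'           => ['file' => 'brands.php',   'params' => ['id']],
    '#^/category/([a-z0-9_-]+)/?$#' => ['file' => 'category.php', 'params' => ['slug']],
];

// Returns the target file and the values for $_GET, or null if no rule matches.
function matchRoute(string $path, array $rules): ?array
{
    foreach ($rules as $regex => $target) {
        if (preg_match($regex, $path, $m) === 1) {
            $get = [];
            foreach ($target['params'] as $i => $key) {
                // Capture groups start at index 1; missing groups become ''.
                $get[$key] = $m[$i + 1] ?? '';
            }
            return ['file' => $target['file'], 'get' => $get];
        }
    }
    return null;
}
```

In rewrite.php the result would be merged into $_GET before including the target file, and a null result would be answered with a 404.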

Related

How to create dynamic webpage with custom name?

I have looked around and attempted my own research on this topic but to no avail just yet.
I have a dynamic webpage set up to look up an ID in a database and retrieve the elements it needs. This of course results in the page URL looking like www.site.com/page?id=1
What I'd like instead is for the page to be reachable under a name of its own.
For example, if I had a fruit product and a user went to the address /fruit on my site, they would get the content of ?id=1.
I have seen this used on many sites but I'm not sure how it is programmed or how it works. Is this something to do with an .htaccess file?
Thanks in advance. Appreciate all the help.
While this has been asked and answered many times, I know many people find it difficult to search for this since there are so many common "noise" words related to it. For that reason, I believe it's worth answering again.
If you're using Apache as your webserver (which I'm assuming you are since you mention .htaccess), what you're looking for to create those "clean URLs" is mod_rewrite, which takes a set of rules and rewrites the URL requested by the browser to another path or script.
You would typically enable this in your Apache config or in .htaccess, and in a simple form (a one-to-one mapping) it would look something like this (provided mod_rewrite is installed):
RewriteEngine On
RewriteRule ^fruit$ index.php?type=1 [L]
Now obviously that doesn't scale well if you have a bunch of dynamic pages you want to create, so what you can do is tell Apache to pass every request that isn't a real file or directory to a single file for processing, like so:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .* index.php [L]
In this case we're rewriting any request that doesn't resolve to a real file or directory to index.php, and then using the "last" flag [L] to stop processing other rules. Then in our PHP script, we can access the virtual path (in this case /fruit) by using $_SERVER['PATH_INFO'] and doing whatever conditional logic we want with that. If you don't get anything in that variable, ensure that the AcceptPathInfo On directive is set in your Apache config or .htaccess. On some setups PATH_INFO stays empty after an internal rewrite; parsing the path out of $_SERVER['REQUEST_URI'] is a common fallback.
A way to test the basic concept/logic without having any rewrite rules would be to use a URL like https://example.com/index.php/fruit. You'll then see that in index.php $_SERVER['PATH_INFO'] will contain the string /fruit. You can rewrite URLs to files in other directories, chain rewrite rules, redirect the browser to other URLs, or even edit environment variables.
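As a rough sketch of the PHP side, assuming a small slug-to-id table: requestPath and pageIdForPath are hypothetical helper names, and the fallback to REQUEST_URI is an addition for setups where PATH_INFO is not populated.

```php
<?php
// Returns the virtual path of the request, e.g. "/fruit".
// Prefers PATH_INFO and falls back to the path part of REQUEST_URI.
function requestPath(array $server): string
{
    if (!empty($server['PATH_INFO'])) {
        return $server['PATH_INFO'];
    }
    $path = parse_url($server['REQUEST_URI'] ?? '/', PHP_URL_PATH);

    return is_string($path) && $path !== '' ? $path : '/';
}

// Looks the slug up in a (hypothetical) slug => id table; null means "no such page".
function pageIdForPath(string $path, array $slugs): ?int
{
    $slug = trim($path, '/');

    return $slugs[$slug] ?? null;
}
```

In index.php this would be called as pageIdForPath(requestPath($_SERVER), $slugs), with $slugs loaded from the database instead of being hard-coded.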
There are many good tutorials around using mod_rewrite for clean URLs, so I won't attempt to cover all the nuances here. Just know that it's a very powerful tool, but it's also pretty easy to break your rules if you aren't very comfortable with regular expressions or get lost in the many rules that are commonly in a configuration.
Note that if this is an existing site, you'll also want to use mod_rewrite or mod_alias (its Redirect directive) to redirect the old URLs to the new ones so they don't break (and for the benefit of having a single URL for search rankings).

Redirect any GET request to a single php script

After many hours messing with .htaccess I've arrived at the conclusion that I should send every request to a single PHP script that would handle:
Generation of html (whatever the way, includes or dynamic)
301 Redirections with a lot more flexibility in the logic (for a dumb .htaccess-eer)
404 errors finally if the request makes no sense.
leaving in .htaccess the minimal functionality.
After some tests it seems quite feasible and from my point of view more preferable. So much that I wonder what's wrong or can go wrong with this approach?
Server performance?
In terms of SEO I don't see any issue as the procedure would be "transparent" to the bots.
The redirector.php would expect a query string consisting of the actual request.
What would be the .htaccess code to send everything there?
I prefer to move all the PHP files into another directory and put only one PHP file in the htdocs path, which handles all requests. Other files that you want to serve without going through PHP can be placed in that folder too, with this .htaccess:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ /index.php/$0 [L]
Existing files (JPGs, JS or whatever) are still reachable without PHP. That's the most flexible way to do it.
Example:
- /scripts/ # Your PHP Files
- /htdocs/index.php # HTTP reachable Path
- /htdocs/images/test.jpg # reachable without PHP
- /private_files/images/test.jpg # only reachable over a PHP script
You can use this code to rewrite all requests to one file:
RewriteEngine on
RewriteRule ^.*?(\?.*)?$ myfile.php$1
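Note that all requests (including stylesheets, images, ...) will be rewritten as well. There are of course other possible rules, but this is the one I am using. Be aware that a RewriteRule pattern is only matched against the URL path, so the (\?.*)? group never captures anything; Apache carries the original query string over by default when the substitution has none of its own. The shorter form therefore behaves the same: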
RewriteEngine on
RewriteRule ^.*?$ myfile.php
This is a common technique, as bots and even users only see the URL they requested and not how it is handled internally. Server performance is rarely a problem.
Because you rewrite all URLs to one PHP file, Apache no longer produces a 404 page for you: every request is caught by your .php file. So make sure you handle invalid URLs correctly.
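A minimal sketch of handling the three cases from the question (page, 301, 404) in the single script, assuming two hypothetical lookup tables: $pages maps known paths to include files and $moved maps old paths to their new location. resolveRequest is an illustrative name.

```php
<?php
// Decides what to do with a request path.
// $pages: path => file to include, $moved: old path => new path (both hypothetical).
function resolveRequest(string $path, array $pages, array $moved): array
{
    if (isset($moved[$path])) {
        // Permanently moved: the caller should send a Location header.
        return ['status' => 301, 'location' => $moved[$path]];
    }
    if (isset($pages[$path])) {
        return ['status' => 200, 'file' => $pages[$path]];
    }
    // Nothing matched: the script itself must produce the 404.
    return ['status' => 404];
}
```

The calling script would pass the status to http_response_code(), then either send header('Location: ...') for a 301, include the file for a 200, or render an error page for a 404.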

Moving a HTML site to PHP

I currently run a site with 750 pages of .html webpages (yeah I know it was a stupid idea, but I'm a novice). I'm looking to move these to php. I don't really want to set up 750 individual 301 redirects and rewrite each page to .php
I've heard that I can use .htaccess to do this. Anyone know how?
A few additional questions -
Can I permanently redirect these links from html to php without losing my search engine rankings and
if I want to add PHP to each of the files (e.g. a PHP menu file pulled in with the include command, to make the links quicker to update), will this work? Won't they still be HTML files?
Sorry for the stupid questions, but I'm still learning.
Congratulations on a 750 page site - you must have put some work into that.
To collect your current list of pages, use a tool called Xenu to create an export into Excel. You can then easily change the file names to .php in column B and create a .htaccess file from that.
However, why would you want 750 PHP files? If you have lots of data pages, make it one page that pulls in the main HTML content, and reference that one page. If you have a page called warehouse-depot-22-row-44.html then change that to show-warehouse-row.php?depot=22&row=44 and return that content only. This will significantly reduce your number of pages and lets you start using a database to render the content.
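To illustrate the mapping from the old file name to the new parameters, here is a small sketch. It assumes the legacy names follow exactly the warehouse-depot-N-row-N.html pattern from the example; legacyPageToParams is a hypothetical helper name.

```php
<?php
// Extracts depot and row from a legacy file name such as
// "warehouse-depot-22-row-44.html"; returns null for anything else.
function legacyPageToParams(string $file): ?array
{
    if (preg_match('/^warehouse-depot-(\d+)-row-(\d+)\.html$/', $file, $m) !== 1) {
        return null;
    }

    return ['depot' => (int) $m[1], 'row' => (int) $m[2]];
}
```

The same parameters can be used both to build the 301 target for an old URL and to look up the content that show-warehouse-row.php should return.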
For redirecting you could use the Apache Module mod_rewrite: https://httpd.apache.org/docs/current/mod/mod_rewrite.html
You can use URL rewriting to match a specific file name request against a regular expression and then decide where to send the request if it matches:
RewriteRule ^myname/?$ myname.php [NC,L]
http://www.addedbytes.com/articles/for-beginners/url-rewriting-for-beginners/
Depends on the structure you have.
You want the user to access them in their natural location?
/public_html/folder1/file.php
user would access like
mydomain.com/folder1/file
or you want to map them differently?
Personally I think I would use a rewrite rule to map all requests to my /public_html/index.php and would map the requests from there using php (using include for instance). This gives great flexibility, plus you have a single point of entry for your application which is very beneficial since you can easily maintain control of the application flow.
The .htaccess would look like this
#
# Redirect all to index.php
#
RewriteEngine On
# if a directory or a file exists, use it directly
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# RewriteCond %{REQUEST_URI} !^/index\.php
# RewriteCond %{REQUEST_URI} (/[^.]*|\.(php|html?))$ [NC]
RewriteCond %{REQUEST_URI} (/[^.]*|\.)$ [NC]
RewriteRule .* index.php [L]
Of course, I place all files that should not be directly accessible (everything except index, CSS, JS, images, etc.) in a folder outside public_html to ensure no user can ever access them directly ;)
I've had a similar (yet much much smaller) site that went through the same thing.
I have this in my .htaccess:
RewriteEngine On
RewriteRule ^(.*)\.html$ $1.php [L]
This sends any visitor who requests one of your .html addresses to the matching .php file. As written it is an internal rewrite, so the browser still shows the .html address; use [R=301,L] instead of [L] if you want a visible permanent redirect.
You hopefully have an IDE (I recommend Aptana), and you can use some of the find/change functions project-wide, and hopefully with some time and patience get your internal links from .html to .php.
But, I caution you a little bit - perhaps it is time to look into a database-based CMS, such as WordPress or Drupal?

Search-Engine Friendly URLs

I am working on building my first search-engine friendly CMS. I know that perhaps one of the biggest keys to having an SEO-friendly site is to have search-engine friendly URLs. So having a link like this:
http://www.mysite.com/product/details/page1
will result in much better rankings than one like this:
http://www.mysite.com/index.php?pageID=37
I know that to create URLs like the first one, I have one of two options:
use a web technology, in this case PHP, to create a directory structure
leverage Apache's mod_rewrite add-on to have these URLs passed to a PHP processor
As far as the PHP goes, I'm pretty comfortable with anything. However, I think the first option would be more difficult to maintain.
Could someone show me how to write an .htaccess file, which will:
silently direct SEO URLs to a processor script
not redirect if the requested URL is an actual directory on the server
Is there a better way than the way I am trying it?
You can use .htaccess for Apache: create a file named ".htaccess" in the root folder of your website (usually "htdocs") and add the following content to it:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ index.php?url=$1 [QSA,L]
Options -Indexes
</IfModule>
In your PHP file you can then access the requested path from $_GET:
$_GET['url'];
Then you can use data to parse what you need.
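A minimal sketch of that parsing step, assuming the url parameter holds a path such as product/details/page1 as in the question; splitUrl is a hypothetical helper name.

```php
<?php
// Splits the rewritten "url" parameter into its path segments,
// ignoring leading, trailing and doubled slashes.
function splitUrl(string $url): array
{
    $segments = explode('/', trim($url, '/'));

    // Drop empty segments and reindex from 0.
    return array_values(array_filter($segments, fn(string $s): bool => $s !== ''));
}
```

It would be called as splitUrl($_GET['url'] ?? ''), after which the first segment can pick the handler and the remaining ones act as its arguments.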
Yes, the first option would be pretty hard to maintain. If you want to change the header of the pages, you'd need to regenerate all of the pages.
The simplest way to do that would be to have a PHP file named product.php or product/details.php and use the $_SERVER['PATH_INFO'] variable to figure out what the client requested.

What's the best way to get rid of .php suffix in url strings so they look pretty? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Remove .php extension with PHP
What's the best way to get rid of .php suffix in url strings so they look pretty?
Thank you in advance;-)
Use apache mod_rewrite (rewriting rules)
http://roshanbh.com.np/2008/02/hide-php-url-rewriting-htaccess.html
Make sure your apache installation has mod_rewrite enabled (will be in httpd.conf, or one of the files linked there, mods-enabled or such) and look into how routing works in cakePHP.
A couple of tips - the rewrite rules are found in the .htaccess files (make sure you don't have a Unicode BOM if the server gives a 500 error), and if you find you need those $_GET parameters, [qsappend] on your rewrite rule should pass them along. If you still get 500s, the compilation errors for the regexes can be found in Apache's error log, which is invaluable for debugging.
Might be easier to do a simple project with mod_rewrite first, to learn how it works, as the combination of rewrite and routing in cake can get pretty complex pretty fast.
Options +MultiViews
in the Apache configuration.
Here is a gentle introduction into mod_rewrite.
The best way to do so (at least for me) is:
Use just one file to receive all request. In most of the cases it will be the index.php file.
Then, use mod_rewrite rules like this:
RewriteEngine On
RewriteBase /the_base_dir_of_your_app/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /the_base_dir_of_your_app/index.php [L]
Then, you can analyze the URL using functions like basename($_SERVER['REQUEST_URI']); in order to decide what to do.
Use mod_rewrite - or start using ASP.NET MVC 2 :)
If you use a framework, like CakePHP (or any other) it will do it for you. For free. Right now.
.htaccess:
# Permalinks
RewriteEngine on
# Remove www
RewriteCond %{HTTP_HOST} ^www\.yourdomain\.com [NC]
RewriteRule (.*) http://yourdomain.com/$1 [R=301,L]
# Links
RewriteRule ^faq$ /faq.php [L]
RewriteRule ^donations$ /donations.php [L]
RewriteRule ^contact$ /contact.php [L]
so they look pretty?
Beauty is in the eye of the beholder. It depends on what you consider 'pretty'. A lot also depends on how far you want to get away from the conventions that make a working system possible, and on the constraints you face in reconfiguring your site.
While others have mentioned using mod_rewrite, URL parsing or other such approaches, I'm not a fan of these - in addition to being very specific to the type of webserver the code is running on, they also break the simple 1:1 mapping between paths in URIs and paths on the webserver's filesystem.
You could just substitute '.php' with an extension of your choice...but that hardly meets my interpretation of 'pretty'.
The approach I take is that every script (or at least every script which is intended to be an entry point for generating a web page) is named index.php and lives in its own uniquely named directory. The main reason for doing this has nothing to do with making the URL look nice, but rather with making the codebase more manageable - I also have strict standards about the naming and placement of include files.
HTH
C.
