php & .htaccess clean url problem - php

I want to use clean URLs for my site, but I have a big problem.
I have URLs like:
index.php?lang=en&mod=product&section=category
index.php?lang=en&mod=product&caption=fetch&id=45
index.php?lang=pe&mod=blog&section=category&id=560
index.php?lang=pe&mod=blog&section=category&id=564
index.php?lang=pe&mod=blog&section=category&id=567
index.php?lang=pe&mod=blog&section=category&id=571
index.php?lang=pe&mod=blog&id=556
index.php?lang=pe&mod=page&id=537
index.php?lang=pe&mod=blog&id=558&o_t=cDate_ASC
index.php?lang=pe&mod=product&caption=fetch&id=7804
As you can see, my problem is that the order of my variables differs between URLs, and the 3rd or 4th variable is not stable: sometimes it is id and sometimes it is caption.
I want my URL template to look like en/product/category, but when I try to set it up in .htaccess it is not clear whether the third level is "id" or "caption".
Should I put all the variables in every URL, like this?
index.php?lang=en&mod=product&section=category
|
|
|
V
index.php?lang=en&mod=product&section=category&caption=&id=&o_t=&v_t=&offset=
EDIT:
I use Smarty as my template engine, so I would have to change the link addresses in my templates to match the clean URLs (e.g. en/product/category/324). My problem is that when I set a link to en/product/34 or en/product/category/23, my .htaccess rewrite rules cannot tell whether the 3rd part is an id or a category.
In this case:
RewriteRule ^/(en|pe)/(product|blog|page)/(category)/([0-9]{1,})/$ index.php?lang=$1&mod=$2&section=$3&id=$4
the 3rd variable is category, and .htaccess treats the 3rd part as the category, but as you can see, sometimes the URL has no category and has an id in its place!
That is my big problem.

You'd need a few rewrite rules, I think.
E.g.
RewriteRule ^/?(en|pe)/(product|blog|page)/([0-9]{1,})/?$ index.php?lang=$1&mod=$2&id=$3
This would rewrite /en/page/22 to index.php?lang=en&mod=page&id=22 (as long as the ID is at least 1 digit long).
RewriteRule ^/?(en|pe)/(product|blog|page)/(category)/([0-9]{1,})/?$ index.php?lang=$1&mod=$2&section=$3&id=$4
This would rewrite /en/blog/category/22 to index.php?lang=en&mod=blog&section=category&id=22.
You may need to fiddle with the ^/ at the start depending on whether you have a RewriteBase set or not. In a .htaccess file the leading slash is not part of the matched path, so ^/? is used here to accept both forms.
EDIT:
Explanation:
^ indicates the starting position of the URL from the base, i.e. site.com/(whatever is here in the URL).
(en|pe) means that the first value in that particular rule can be EITHER en OR pe. Adding more is easy: (en|pe|ru|jp) etc. The same goes for the product/blog/page part. I included (category) just in case you had other 'section' types that were not 'category'.
[0-9] is any numeric character 0 to 9. {1,} means 1 or more characters in length. If you want between 2 and 4, use {2,4}, for example. Exactly 3 characters? {3}. It's useful when targeting specific things.
$ means the end of the matched string. If you intend to have nothing after the id except a / (you could even remove that /), then use the example as is. If you intend to have the title of a blog post afterward, you can use (.*)$, which means anything can come after the page id, e.g. /en/blog/category/22/oheyoheyoheyohey would be the same as /en/blog/category/22/abcjhrefgwgrjurgh. If you pass the title as a parameter (&title=this is the title), just do the same thing as I did in the example for ID, except use [a-zA-Z0-9-+_.] to include alphanumeric characters, +, -, _ and .
$1, $2, ... refer to the parenthesised groups in the first argument of the rule, in order. E.g. $1 refers to (en|pe), so lang can be either en or pe.
If you want the rule to apply to multiple pages, and not just index.php, make it:
RewriteRule ^/?([a-zA-Z]+)/(en|pe)/(product|blog|page)/([0-9]{1,})/?$ $1.php?lang=$2&mod=$3&id=$4
In that case, site.com/blah/en/product/22 would map to site.com/blah.php?lang=en&mod=product&id=22.
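To see how two such rules tell the cases apart without touching Apache, here is a minimal PHP sketch that mirrors them with preg_match. The function name parseCleanUrl is made up for illustration, and the allowed values (en|pe, product|blog|page) are simply copied from the question.

```php
<?php
// Hypothetical helper mirroring the two rewrite rules in PHP,
// so the matching logic can be checked outside of Apache.
function parseCleanUrl(string $path): ?array
{
    $path = trim($path, '/');

    // lang/mod/id: the third segment is numeric, so it must be the id.
    if (preg_match('#^(en|pe)/(product|blog|page)/([0-9]+)$#', $path, $m)) {
        return ['lang' => $m[1], 'mod' => $m[2], 'id' => $m[3]];
    }

    // lang/mod/category/id: the third segment is the literal word "category".
    if (preg_match('#^(en|pe)/(product|blog|page)/(category)/([0-9]+)$#', $path, $m)) {
        return ['lang' => $m[1], 'mod' => $m[2], 'section' => $m[3], 'id' => $m[4]];
    }

    return null; // no rule matched
}

print_r(parseCleanUrl('/en/page/22/'));
print_r(parseCleanUrl('/en/blog/category/22/'));
```

Because a numeric segment can never be the word "category", the two patterns cannot both match the same URL, which is what removes the ambiguity.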

Why should you do so? I see no problems with your URLs; a normal user does not mind, and those who inspect your site can read it all without problems ...

You have to agree on some ordering of parameters, and use multiple rewrite rules.
For the order lang > mod > section > caption > id > o_t > v_t > offset you want to have something like this:
RewriteRule ^/?(\w+)$ index.php?lang=$1
RewriteRule ^/?(\w+)/(\w+)$ index.php?lang=$1&mod=$2
RewriteRule ^/?(\w+)/(\w+)/(\w+)$ index.php?lang=$1&mod=$2&section=$3
RewriteRule ^/?(\w+)/(\w+)/(\w+)/(\w+)$ index.php?lang=$1&mod=$2&section=$3&caption=$4
RewriteRule ^/?(\w+)/(\w+)/(\w+)/(\w+)/(\d+)$ index.php?lang=$1&mod=$2&section=$3&caption=$4&id=$5
...
and so on
Above, I assume lang, mod, section and caption are made up of word characters (letters, digits and underscore, which is what \w matches, with no other special chars), and the id is made of digits.

The real file index.php does not care about the order of the variables in the query string, or whether a value is missing, because it receives name-value pairs: the right association is ensured by the presence or absence of the whole pair.
To be clear, for index.php it is the same whether the query string is "?lang=en&mod=blog" or "?mod=blog&lang=en", or only "?lang=en", because the script reads the variables from the $_GET associative array, regardless of their order or presence in the array.
What is important is that you plan a fixed order of variables inside the new virtual URLs to rewrite, because they contain only the variable values, while the variable name is taken from the position inside the virtual URL. That is (note this is pseudo-code):
yourdomain/val1/val2/val3/...etc
to be rewritten to:
index.php?var1=val1&var2=val2&var3=val3&...etc
so in the new URLs you are planning, there cannot be missing values.
You can solve this problem by assigning fake values to the missing variables, values that your script will not accept as valid.
For example, if the mod variable is missing, you can put in that position a string that the script does not consider valid, so that it is handled as if it were empty.
If you have an array of allowed values, you can just add a check with *if (in_array())* instead of (or in addition to) if(empty()).
When you build the links to other pages, you can just add this check for a missing value:
*if (empty(val3)) val3 = 'fake_value';*
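The placeholder idea can be sketched in PHP roughly as follows. Everything here is an assumption for illustration: the helper names buildCleanUrl and parsePositionalUrl, the segment order, and the choice of "_" as the fake value (picked only because \w would still match it in a rewrite rule).

```php
<?php
// Fixed order of the positional segments; the names come from the question.
const SEGMENT_ORDER = ['lang', 'mod', 'section', 'caption', 'id'];
// Hypothetical fake value that the script treats as "not set".
const PLACEHOLDER = '_';

// Build /val1/val2/... and fill gaps with the placeholder.
function buildCleanUrl(array $params): string
{
    // Find the last position that has a value, so no trailing placeholders are emitted.
    $last = -1;
    foreach (SEGMENT_ORDER as $i => $name) {
        if (isset($params[$name]) && $params[$name] !== '') {
            $last = $i;
        }
    }

    $segments = [];
    for ($i = 0; $i <= $last; $i++) {
        $value = $params[SEGMENT_ORDER[$i]] ?? '';
        $segments[] = $value === '' ? PLACEHOLDER : rawurlencode((string) $value);
    }
    return '/' . implode('/', $segments);
}

// Reverse step: map positions back to names and drop placeholders.
function parsePositionalUrl(string $path): array
{
    $params = [];
    foreach (explode('/', trim($path, '/')) as $i => $value) {
        if ($i >= count(SEGMENT_ORDER) || $value === '' || $value === PLACEHOLDER) {
            continue;
        }
        $params[SEGMENT_ORDER[$i]] = rawurldecode($value);
    }
    return $params;
}

echo buildCleanUrl(['lang' => 'pe', 'mod' => 'blog', 'id' => 556]), "\n";
print_r(parsePositionalUrl('/pe/blog/_/_/556'));
```

The trade-off is that the URLs carry visible filler segments, which is why several separate rewrite rules are often preferred when the set of URL shapes is small.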

Related

How to find the last instance of a character

I'm building this real estate script using PHP and I want the listing page url's to be like /listing/this-is-the-title-436. This url is generated in PHP and the last part of the url, after the last instance of ' - ' is the listing id. But I cannot find a way to find the last instance of a dash and use the rest as a variable in .htaccess.
Note that the title can have any amount of spaces therefore any amount of dashes but the listing id will always be at the end, after the last dash.
To summarize, I want urls like /listing/this-is-the-title-436 to redirect to /assets/inc/listing.php?listing=436 with .htaccess.
Any help would be appreciated, thanks!
The easiest way is to test a numerical value at the end:
RewriteEngine on
RewriteRule ^listing/.+-(\d+)$ /assets/inc/listing.php?listing=$1 [L,QSA]
But if you're not just using numeric values, you can also test for the absence of - in the last part:
RewriteRule ^listing/.+?-([^-]+)$ /assets/inc/listing.php?listing=$1 [L,QSA]
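The same "last dash" logic can be tried out in PHP before putting it into .htaccess. This is only a sketch; listingIdFromSlug is a made-up name and the slugs are sample data.

```php
<?php
// Hypothetical helper: pull the listing id out of a slug such as
// "this-is-the-title-436", using the same idea as the numeric rule.
function listingIdFromSlug(string $slug): ?int
{
    // .+ is greedy, so -(\d+)$ ends up anchored on the last dash of the slug.
    if (preg_match('/^.+-(\d+)$/', $slug, $m)) {
        return (int) $m[1];
    }
    return null; // no trailing numeric id
}

var_dump(listingIdFromSlug('this-is-the-title-436'));
var_dump(listingIdFromSlug('flat-with-2-rooms-17'));
var_dump(listingIdFromSlug('no-id-here'));
```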

Clean Url without Parameter and ? sign

I have seen on many websites that the URLs are clean; when I check the href in the code, it looks like this:
<a href="HTTP://domain.com/123">
When we click on it, it goes to the right page (e.g. the ABC page, whose name is taken from the database). In my code I have:
<a href="HTTP://domain.com/movie.php?mno=123">
How can I get that type of URL on my website?
Using mod_rewrite for Apache. For example, place into a .htaccess file:
RewriteEngine on
RewriteBase /
RewriteRule ^([a-zA-Z0-9_-]+)$ movie.php?moo=$1
Presuming your valid IDs are made of a-z, A-Z, 0-9, _ and - (the + requires at least one character, so a request for the bare domain is left alone). Now all requests to domain.com/ABC09-_ etc. will be internally rewritten to domain.com/movie.php?moo=ABC09-_.
Note that you will probably want to actually rewrite the requested entry ID into a proper key-value pair, rather than using it as a query key without a value. In the above, the requested ID would be available under $_GET['moo']. In the pattern you posted, movie.php?ABC, you would have a different variable for each ID, e.g. 'ABC' would be under $_GET['ABC'], etc.

Grabbing a domain name from URL as a variable by htaccess

Imagine in my website I want to show some analytic about domains, working URL example of what I need:
http://whois.domaintools.com/google.com
As you see in the above URL, it's handling google.com as a variable and pass it to another page to process the given variable, that's exactly what I want.
So for detecting that kind of variable, here is my regex:
/^[a-zA-Z\d]+(?:-?[a-zA-Z\d])+\.[a-zA-Z]+$/
The above RegEx is simple and accepts everything like: google.com, so in my .htaccess file I have:
RewriteRule (^[a-zA-Z\d]+(?:-?[a-zA-Z\d])+\.[a-zA-Z]+$) modules/pages/page.php?domain=$1
The above rule does what I want, but it also sends my homepage to page.php when there is nothing in the URL; for example, http://mysitename.com is now being forwarded to page.php.
How can I fix this?
Thanks in advance
It is the regex that sends the homepage to page.php, although not because the URL is empty: + means "Matches the preceding pattern element one or more times" (http://en.wikipedia.org/wiki/Regular_expression), so an empty path cannot match. The most likely cause is that a request for the homepage is served through an index file such as index.php, and a name like index.php (letters, a dot, letters) matches your pattern just like google.com does.
You can tighten the pattern by defining a minimum and a maximum number of characters for each part, but keep in mind that file names such as index.php still look like domains to any such pattern, which is what the RewriteCond in Update 1 guards against. BTW, a quick Google search for "regex domain" turns up a lot of tested patterns. Use the following, for example:
RewriteEngine on
RewriteRule (^(([a-zA-Z]{1})|([a-zA-Z]{1}[a-zA-Z]{1})|([a-zA-Z]{1}[0-9]{1})|([0-9]{1}[a-zA-Z]{1})|([a-zA-Z0-9][a-zA-Z0-9-_]{1,61}[a-zA-Z0-9]))\.([a-zA-Z]{2,6}|[a-zA-Z0-9-]{2,30}\.[a-zA-Z]{2,3})$) modules/pages/page.php?domain=$1
Reference:
Domain name validation with RegEx
Update 1:
If you want to use your own regex, exchange the last "+" with {2,}. The top-level domains have usually at least 2 characters.
RewriteEngine on
RewriteCond %{REQUEST_URI} !(\.html|\.php|\.pdf|\.gif|\.png|\.jpg|\.jpeg)$
RewriteRule (^[a-zA-Z\d]+(?:-?[a-zA-Z\d])+\.[a-zA-Z]{2,}$) modules/pages/page.php?domain=$1
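A quick way to see which request paths the asker's original pattern accepts is to run it through preg_match in PHP. This is a sketch only; matchesDomainRule is a made-up name and the sample paths are chosen for illustration.

```php
<?php
// The asker's pattern, wrapped in PCRE delimiters.
$pattern = '/^[a-zA-Z\d]+(?:-?[a-zA-Z\d])+\.[a-zA-Z]+$/';

// Hypothetical helper: true when the path would be caught by the rule.
function matchesDomainRule(string $pattern, string $path): bool
{
    return preg_match($pattern, $path) === 1;
}

// An empty path, a real domain, and two ordinary file names.
foreach (['', 'google.com', 'index.php', 'logo.png'] as $path) {
    printf("%-12s %s\n", "'$path'", matchesDomainRule($pattern, $path) ? 'matches' : 'no match');
}
```

The empty path does not match, but ordinary file names with an extension do, which is why excluding known extensions with a RewriteCond is a reasonable safeguard.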

Detecting language and keeping current url schema

Currently I just have one language in my site,
And I implemented the friendly URLs via .htaccess, like:
RewriteRule ^post/(.+)/(.+) post.php?id=$2&friendly=1
So:
domain.com is the homepage and domain.com/the-title/5 is the page for the post with ID 5.
Now I would like to make that as the default language urls, and for example, next language would be:
domain.com/es is the homepage and domain.com/es/the-title/6 is the page for the post with ID 6 in spanish. (but previous rule should work, too)
Question is,
How should I adapt my (or additional) rewrite rules to check for the 2 first chars of the url (first split) and add it as a param, like: &lan=es and if it's not found then don't add this parameter?
Lets say:
^post/(.+)/(.+) post.php?id=$2&friendly=1 (english)
^es/post/(.+)/(.+) post.php?id=$2&friendly=1&lan=es (spanish)
But if possible,
To just work with more languages (and add the extra parameter if needed),
To just work with other rules, like:
^es/photo/(.+)/(.+) photo.php?id=$2&friendly=1&lan=es (spanish)
Any suggestions?
Something like this might work. I haven't tested it but you can use RewriteCond to check for a specific structure of the uri and if it matches, use the following rule. If it doesn't then continue on to the original rule.
#Does the uri match 2 characters followed by /post/?
RewriteCond %{REQUEST_URI} ^/../post/
#then use this rule and stop processing rules
RewriteRule ^(..)/post/(.+)/(.+) post.php?id=$3&friendly=1&lan=$1 [L]
#Else use this rule
RewriteRule ^post/(.+)/(.+) post.php?id=$2&friendly=1&lan=en
Edit: I added a default language to the end of the second rule. This way there is always a $_GET['lan'] parameter. You could leave it off and set a default in php. Your choice, no difference.
I can only answer with general advice because we would need more context...
Use the default pages to do a temporary redirect (302) to the default language or the user's language.
Always use the same scheme, so the language can be read from the same pattern (http://mydomain.com/en/mypage.php)
Use complete language codes if you will have a large audience or a lot of content, like en_US, fr_FR, fr_CA ...
Prefer negated character classes in your regex to avoid capturing the characters that follow, like "before/([^/]+)/after"; in some cases this is mandatory.
If you don't have the language information, the user is not coming from a valid URL; redirect them to a page with language information (default or user language).
If the user is using a direct PHP link, redirect them to the official link to avoid duplicate content. You can use $_SERVER['REQUEST_URI'] to check it.
Use a framework to manage it, or at least a base to control the routes.
With this advice, you could use just the following rewrite rule for your whole website:
RewriteRule ^([^/]+)/([^.]+)\.([^.]+)$ index.php?lang=$1&route=$2&format=$3 [L,QSA]
Here I capture the language (es, en, en_US, fr...), the route (post/5, gotabeer, cats/postit/thumb/2) and the format (html, json, jpeg...).
(I didn't try the rewrite rule but it should work)
Here is what I would suggest:
RewriteRule ^/?((en|es)/)?post/(.+)/(.+)$ post.php?id=$4&friendly=1&lan=$2
Where /? allows an optional forward slash at the beginning of the string. This lets the rule be moved between .htaccess (directory context) and httpd.conf (server context) unchanged.
((en|es)/)? allows for optional specification of one of two accepted language codes.
Note that I did not suggest a wildcard for the language part, as I assume you are only working with a known subset of languages, so using something other than a known language code (or missing the entire thing) should fall through to handling by other rules (or perhaps result in a 404).
If this is not the case, you can change the first portion of the regex from (en|es) to (.{2}) if you expect exactly two characters, or perhaps (.{2}(?:-.{2})?) if you also expect to handle language codes like es-ES (the inner group is optional and non-capturing, so $2 and $4 keep their meaning).
This should work for you:
RewriteEngine On
RewriteRule ^([a-z]{2})/post/([^/]+)/([0-9]+)/?$ post.php?id=$3&friendly=1&lan=$1 [L,QSA]
RewriteRule ^post/([^/]+)/([0-9]+)/?$ post.php?id=$2&friendly=1&lan=en [L,QSA]
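The "optional language prefix with an English default" behaviour of these two rules can be sketched in PHP as a single pattern. The function name parsePostUrl and the default of 'en' are assumptions for illustration only.

```php
<?php
// Hypothetical helper: one pattern with an optional two-letter language
// prefix, falling back to a default language when the prefix is absent.
function parsePostUrl(string $path, string $defaultLang = 'en'): ?array
{
    $pattern = '#^/?(?:([a-z]{2})/)?post/([^/]+)/([0-9]+)/?$#';
    if (!preg_match($pattern, $path, $m)) {
        return null; // not a post URL
    }
    return [
        // When the prefix is missing, group 1 is an empty string.
        'lan' => $m[1] !== '' ? $m[1] : $defaultLang,
        'id' => (int) $m[3],
        'friendly' => 1,
    ];
}

print_r(parsePostUrl('es/post/the-title/6'));
print_r(parsePostUrl('post/the-title/5'));
```

Setting the default in PHP rather than in the rule keeps the .htaccess file free of language-specific values, which is the other option mentioned in the first answer.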

How to load a specific page for any given pathname URL

Let's say I have a web-page called www.mysite.com
How can I make it so whenever a page is loaded like www.mysite.com/58640 (or any random number) it redirects to www.mysite.com/myPHPpage.php?id=58640.
I'm very new to website development so I don't even really know if I asked this question right or what languages to tag in it...
If it helps I use a UNIX server for my web hosting with NetWorkSolutions
Add this to your .htaccess file in the main directory of your website.
RewriteEngine on
RewriteBase /
RewriteRule ^([0-9]+)$ myPHPpage.php?id=$1 [L]
Brief explanation: it says to match:
^ from start of query/page
[0-9] match numbers
+ any matches of 1 or more
$ end of page requested
The parentheses part say to look for that bit and store it. I can then refer to these replacement variables in the new url. If I had more than one parentheses group then I would use $2, $3 and so on.
If you experience issues with the .htaccess file please refer to this as permissions can cause problems.
If you needed to capture something else, such as alphanumeric characters, you'd probably want to explore regex a bit. You can do things such as:
RewriteRule ^(.+)$ myPHPpage.php?id=$1 [NC,L]
which matches anything (including myPHPpage.php itself, so in practice you would add a RewriteCond to skip existing files), or get more specific with things like [a-zA-Z0-9], etc.
Edit: and @Jonathon has a point. In your PHP file, wherever you handle $_GET['id'], be sure to sanitize it if it is used in anything resembling an SQL query or mail. Since you are using only numbers, that makes it easy:
$id = (int)$_GET['id']; // cast to integer - non-numeric strings give 0
Keep in mind that if you are not going to use just numbers, you will have to use prepared statements or a suitable sanitizing function (they abound on Google - search for 'php sanitize') to ensure you don't fall victim to an SQL injection attack.
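A slightly stricter alternative to the plain cast is PHP's filter_var with FILTER_VALIDATE_INT, which rejects input like "12abc" instead of silently turning it into 12. The sketch below is only an illustration; parseId is a made-up name, and the minimum of 1 is an assumption about what counts as a valid id.

```php
<?php
// Hypothetical helper: return a positive integer id, or null when the
// raw value is missing or not a clean integer.
function parseId($raw): ?int
{
    $id = filter_var($raw, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
    return $id === false ? null : $id;
}

// In myPHPpage.php this would typically be: parseId($_GET['id'] ?? null)
var_dump(parseId('58640'));
var_dump(parseId('12abc'));
var_dump(parseId(null));
```

A null result can then be answered with a 404 page before any database query is built.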
