Understanding a simple regex - php

I am developing a Symfony2 PHP application. On my WAMP server, the application is stored in www/mySite/ and my index file is www/mySite/web/app_dev.php. Because of that, my URLs look like 127.0.0.1/mySite/web/app_dev.php
I wanted to change the path so that I can access my index file just by typing 127.0.0.1. After some research, I figured out that putting this .htaccess in the www folder works:
RewriteEngine on
Rewritecond %{REQUEST_URI} !^/mySite
Rewriterule ^(.*)$ /mySite/web/app_dev.php
The only problem is that I don't understand why. Can somebody explain it to me?
I don't really understand the last two lines, or regexes like ^(.*)$
Thanks

This is a simple regex indeed:
^(.*)$
Let's break it up:
^ - beginning of a string
( and ) - capture group, used to capture part of a string
. - any character
.* - any character, any number of times
$ - end of a string
So, putting it all together, it means: "match any number of any characters". Since the pattern matches the whole requested path, the request is then rewritten to /mySite/web/app_dev.php.
To explain regexes a little bit more we could imagine different regexes:
^lorem.*$ - string starting with word "lorem" followed by any number of any characters
^$ - an empty string
^...$ - a string containing three characters.
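To see these three example patterns in action, here is a minimal sketch in Python. Python's re module stands in for Apache's regex engine here (for patterns this simple they behave the same), and the sample strings are made up for illustration.

```python
import re

# Hypothetical sample strings, chosen only to exercise the three patterns.
samples = ["lorem ipsum", "ipsum lorem", "", "abc", "abcd"]

def matching(pattern):
    """Return the samples that the anchored pattern matches."""
    return [s for s in samples if re.search(pattern, s)]

starts_with_lorem = matching(r"^lorem.*$")  # must begin with "lorem"
empty_only = matching(r"^$")                # nothing between start and end
three_chars = matching(r"^...$")            # exactly three characters

print(starts_with_lorem)  # ['lorem ipsum']
print(empty_only)         # ['']
print(three_chars)        # ['abc']
```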
Now, putting it all together - Apache's rewrite rules are usually built of two directives: RewriteCond and RewriteRule. The latter directive will affect only those requests which match the condition given in the RewriteCond. You can think of them as an "if-then" pair:
# the "if" part - if request URI does not match ^/mySite
Rewritecond %{REQUEST_URI} !^/mySite
# the "then" part - then rewrite it to "/mySite/web/app_dev.php"
Rewriterule ^(.*)$ /mySite/web/app_dev.php

Rewritecond %{REQUEST_URI} !^/mySite
Check that the requested URI does not ("!") start with ("^") "/mySite".
Rewriterule ^(.*)$ /mySite/web/app_dev.php
If that is true, take anything that starts ("^") with any character (".") repeated any number of times ("*"), and send it to "/mySite/web/app_dev.php".
So a URI of /controller/site-action will be sent to that file while /mySite/css/style.css would not be.
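The if-then behaviour can be mimicked with a rough Python sketch. This is only a simulation of the two directives, not how Apache implements them; the function name rewrite is made up, and the stripping of the leading slash assumes the rule lives in an .htaccess file in the document root.

```python
import re

def rewrite(uri):
    """Mimic the condition + rule pair for a request URI such as '/foo/bar'."""
    # RewriteCond %{REQUEST_URI} !^/mySite
    # -> continue only if the URI does NOT start with /mySite
    if re.search(r"^/mySite", uri):
        return uri  # condition failed, the request is left untouched

    # In .htaccess the rule pattern sees the path without its leading slash.
    path = uri.lstrip("/")

    # RewriteRule ^(.*)$ /mySite/web/app_dev.php
    # -> ^(.*)$ matches any path, including an empty one
    if re.search(r"^(.*)$", path):
        return "/mySite/web/app_dev.php"
    return uri

print(rewrite("/controller/site-action"))  # /mySite/web/app_dev.php
print(rewrite("/mySite/css/style.css"))    # /mySite/css/style.css
```

Because the rewritten target itself starts with /mySite, the condition also stops the rule from being applied to its own result.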

Many sites will give you a breakdown and explanation of a regex, for example http://regex101.com/

Regular expressions work character by character. In your .htaccess, the rule checks whether the current URI matches the regex; if the pattern can be followed character by character to the end, the match succeeds:
^ and $ stand for the beginning and end of a string.
. allows any character and * means "repeat the preceding element as often as possible" (zero or more times).

Related

What does the =$1 mean in url rewriting?

I can't find any information on stackoverflow or google about the meaning of =$1. I get superficial information but nothing for beginners like me. What does it do?
If I have something like this:
www.website.com/profile.php?simon
Does the name simon correspond to the $1 variable and why 1?
This is how I understand it:
(.*) profile/profile.php?id=$1
The bold corresponds to:
www.website.com/profile.php?id=simon
Converted with rewrite it becomes:
www.website.com/profile/simon
Am I missing something here?
Edit:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteCond %{REQUEST_FILENAME}.php -d
RewriteRule ^(.*)$ /profile/index.php?id=$1
Does this change
localhost/test/index.php?philip
to:
localhost/test/profile/philip
I tried to enter the URL but it failed. I understand what the regex does, but I'm utterly confused about how the replacement works.
Backreference:
RewriteRule ^.*$ /?id=$1
$1 would be blank
RewriteRule ^(.*)$ /?id=$1
$1 would be whatever .* matched
RewriteRule ^(a|b|c)/(d|e|f)$ /?id=$1-$2
$1 would be either "a", "b", or "c", depending on which one matched, and $2 would be either "d", "e", or "f", depending on which one matched.
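The three cases above can be reproduced with a small Python sketch. The helper substitute is hypothetical and only imitates how $N placeholders are filled from capture groups; it is not mod_rewrite's actual code.

```python
import re

def substitute(pattern, template, path):
    """Fill $N in the template from the pattern's groups; None if no match."""
    m = re.search(pattern, path)
    if m is None:
        return None

    def fill(ref):
        n = int(ref.group(1))
        # A reference to a group that does not exist expands to nothing.
        if n > m.re.groups:
            return ""
        return m.group(n) or ""

    return re.sub(r"\$(\d)", fill, template)

print(substitute(r"^.*$", "/?id=$1", "abc"))                   # /?id=
print(substitute(r"^(.*)$", "/?id=$1", "abc"))                 # /?id=abc
print(substitute(r"^(a|b|c)/(d|e|f)$", "/?id=$1-$2", "b/e"))   # /?id=b-e
```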
See: http://httpd.apache.org/docs/trunk/rewrite/intro.html#regex
One important thing here has to be remembered: Whenever you use parentheses in Pattern or in one of the CondPattern, back-references are internally created which can be used with the strings $N and %N (see below). These are available for creating the Substitution parameter of a RewriteRule or the TestString parameter of a RewriteCond.
Captures in the RewriteRule patterns are (counterintuitively) available to all preceding RewriteCond directives, because the RewriteRule expression is evaluated before the individual conditions.
Figure 1 shows to which locations the back-references are transferred for expansion as well as illustrating the flow of the RewriteRule, RewriteCond matching. In the next chapters, we will be exploring how to use these back-references, so do not fret if it seems a bit alien to you at first.
Does this change
localhost/test/index.php?philip to: localhost/test/profile/philip
No, it changes localhost/test/profile/philip to localhost/profile/index.php?id=philip. Assuming that the rule is in an htaccess file that is in your "profile" directory, then:
Browser types in or clicks on the link: localhost/test/profile/philip
The request is sent to localhost: /test/profile/philip
The request makes its way through Apache's processing pipeline, mod_rewrite is applied to it, and the path is truncated to philip
Assuming that philip is neither a directory nor a file, the rule matches (.*) against it, and the string philip is captured
The rule then rewrites the request to /profile/index.php?id=philip
First, use the Apache documentation rather than Google searches or forums; it's more helpful.
http://httpd.apache.org/docs/2.2/rewrite/intro.html#regex
And this
http://httpd.apache.org/docs/current/mod/mod_rewrite.html#rewritecond
Now (.*) is a parenthesized capture group in regex. The dot matches any single character and the asterisk repeats it 0 or more times.
When there is only one capture group, its numbered back-reference is $1. Additional capture groups will then be $2, $3 and so on.
For this example
www.website.com/profile/simon
You would get this rewrite rule.
RewriteRule (.*) profile/profile.php?id=$1
But your back reference $1 won't be simon, it will be profile/simon because you matched all characters requested using (.*).
If you only want to match simon you need to use a partial match like this.
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteCond %{REQUEST_FILENAME}.php -d
RewriteRule ^profile/(.+)/?$ profile/profile.php?id=$1
Then your $1 will only be simon and also the rule won't match any empty strings, meaning if there is no text after /profile/ it won't process the rewrite.
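The difference between the two patterns can be checked with a short Python sketch, using Python's re module as a stand-in for Apache's regex engine and the path profile/simon from the example above.

```python
import re

path = "profile/simon"

# Unanchored (.*) swallows the whole path, so $1 would be "profile/simon".
whole = re.search(r"(.*)", path).group(1)

# Anchoring on the "profile/" prefix leaves only the name in $1.
name = re.search(r"^profile/(.+)/?$", path).group(1)

# (.+) needs at least one character, so "profile/" on its own does not match.
empty = re.search(r"^profile/(.+)/?$", "profile/")

print(whole)  # profile/simon
print(name)   # simon
print(empty)  # None
```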
Let me try to explain in layman's terms.
Let's say you would normally link to a page like this...
/listing.php?id=2146_east_fifth_street
Then you create a rewrite rule like this...
RewriteRule ^([A-Za-z0-9_-]+)$ listing.php?id=$1 [NC,L]
This part ^([A-Za-z0-9_-]+)$ says to accept any requested path made up only of uppercase letters / lowercase letters / 0-9 / underscores / hyphens, and to capture it.
This part listing.php?id=$1 says what page will actually be served. The $1 is replaced by whatever the first capture group matched, so a request for your-domain.com/2146_east_fifth_street is served by listing.php?id=2146_east_fifth_street
The visitor keeps seeing your-domain.com/2146_east_fifth_street in the URL bar instead of... your-domain.com/listing.php?id=2146_east_fifth_street
EDIT
The second part of the rewrite rule is where the "real" page is located.
If you want your url to read /profile/philip
Your rewrite rule would start with /profile/ like this...
RewriteRule ^profile/(.*)$ path/to/the/real/file/index.php?id=$1
In .htaccess, $1 is a back-reference to a capture group in the rule's regex.
Each group has its own reference, so in a rewrite like
RewriteRule /profile/(.*)/([0-9]) /profile/index.php/$1/$2
$1 would equal whatever the (.*) group matched
$2 would equal whatever ([0-9]) matched, which is a single digit (use ([0-9]+) to allow a multi-digit number)
and so on...
It helps when id numbers and url's are dynamic. So you do not need to manually add them one by one.
Example url:
website.com/profile/idealcastle/25555
Then, in PHP or other languages, you can pull out these "URL segments", just like using a query parameter such as ?id=simon. It's much better to use proper URLs for SEO purposes.

mod_rewrite RewriteRule for abc.php?a=1&b=2 to abc/2.html

I am a real newbie to both mod_rewrite and regex.
Therefore I need your help with the following problem.
I have a PHP page that looks like this:
stuff.php?id=1&text=2
I now want it to look like
stuff/2.html
Does anyone have the RewriteRule line for the htaccess in mind to make it look like this?
Thanks a lot in advance!
A rewrite rule for this particular page:
RewriteRule ^stuff/2\.html$ stuff.php?id=1&text=2
And if 2 should be dynamic:
RewriteRule ^stuff/([0-9]+)\.html$ stuff.php?id=1&text=$1
A little explanation:
^ and $ stand for start and end of the string, so we don't match longstuff/2.html.php.
The dot has to be escaped \. because otherwise it has a special meaning in RegEx ("any character")
The parentheses in the second pattern are a "capture group"; their content will be available in the rewrite as $n (with n = number of the capture group, in this case 1)
[0-9] is a character class, matches one character of the class, in this case a digit
+ means "one or more"
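The points in this breakdown can be tried out with a small Python sketch, using the anchored form of the dynamic pattern. The helper name text_id is made up for illustration.

```python
import re

pattern = r"^stuff/([0-9]+)\.html$"

def text_id(path):
    """Return the captured digits ($1), or None when the path does not match."""
    m = re.search(pattern, path)
    return m.group(1) if m else None

print(text_id("stuff/2.html"))      # 2
print(text_id("stuff/42.html"))     # 42
print(text_id("longstuff/2.html"))  # None, because of ^
print(text_id("stuff/2.html.php"))  # None, because of $
print(text_id("stuff/2xhtml"))      # None, because the dot is escaped
```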
Here's a rule to rewrite stuff/2.html to stuff.php?id=1&text=2
RewriteRule ^stuff/([\d]+)\.html$ stuff.php?id=1&text=$1 [L]
Notice that [\d]+ will only accept digits. If you want to allow letters, digits, underscores and hyphens, use the following rule:
RewriteRule ^stuff/([\w-]+)\.html$ stuff.php?id=1&text=$1 [L]

URL Rewrite with htaccess and PHP

I have a URL: search/?word=asdf and want to redirect it to: search/word/asdf/ while internally running: ?cmd=search&word=asdf
This is so that I can read $_GET['cmd'] and $_GET['word'] in PHP.
How to do it in htaccess?
EDIT:
My .htaccess now is:
RewriteRule search(.*) %{HTTP_REFERER}cmd/search$1
RewriteRule cmd/search/?key-word=(.*) %{HTTP_REFERER}cmd/search/key-word/$1
But this is not working. The new URL is always:
localhost/bruc/sandbox/electrolux/trunk/cmd/search/?key-word=asdf
But it should be: localhost/bruc/sandbox/electrolux/trunk/cmd/search/key-word/asdf
Then I would rewrite this correct URL to: localhost/bruc/sandbox/electrolux/trunk/?cmd=search&key-word=asdf
But it is not working! You can try my approach here: http://htaccess.madewithlove.be/
Try RewriteRule ^([^/]*)/word/([^/]*)$ /?cmd=$1&word=$2 [L]. I believe that will accomplish your goal.
Try this :
RewriteEngine on
RewriteRule ^search/word/(.*)$ /?cmd=search&word=$1 [L]
Check this.
RewriteEngine on
RewriteRule ^([^/]+)/([^/]+)/([^/]+) /?cmd=$1&word=$3 [L]
There are three parts to this:
RewriteRule specifies that this is a rule for rewriting (as opposed to a condition or some other directive). The command is to rewrite part 2 into part 3.
This part is a regex, and the rule will be run only if the URL matches this regex. In this case, it says: look for the beginning of the string, then a run of non-slash characters, then a slash, then another run of non-slash characters, then a slash, then a third run of non-slash characters. The parentheses mean that the parts within them will be stored for future reference.
Finally, this part says to rewrite the given URL in this format. $1 and $3 refer to the first and third parts that were captured and stored (for search/word/asdf, $1 is search, $2 is the literal word, and $3 is asdf).
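To check which capture group holds which segment of search/word/asdf, here is a minimal Python sketch (Python's re module standing in for Apache's regex engine). Note that the value after word/ ends up in the third group.

```python
import re

m = re.search(r"^([^/]+)/([^/]+)/([^/]+)", "search/word/asdf")

# One entry per pair of parentheses, in order: $1, $2, $3.
groups = m.groups()

# Building the internal target from the first and third groups.
target = "/?cmd={}&word={}".format(m.group(1), m.group(3))

print(groups)  # ('search', 'word', 'asdf')
print(target)  # /?cmd=search&word=asdf
```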

Trouble with URL rewriting in .htaccess

My .htaccess file looks like this:
RewriteEngine On
RewriteRule ^articles/(\d+)*$ ./articles.php?id=$1
So, if the URL foo.com/articles/123 is requested, control is transferred to articles.php?id=123.
However, if the requested URL is:
foo.com/articles/123/
or
foo.com/articles/123/whatever
I get a "404 Not Found" response.
I would like to call articles.php?id=123 in all these cases. So, if the URL starts with foo.com/articles/[digits]... no matter what other characters follow the digits, I would like to execute articles.php?id=[digits]. (The rest of the URL is discarded.)
How do I have to change the regular expression in order to achieve this?
Just don't look for the end:
RewriteRule ^articles/(\d+) ./articles.php?id=$1
You do need to allow the trailing / with:
RewriteRule ^articles/(\d+)/?$ ./articles.php?id=$1
The \d+ will only match decimal digits, and the $ disallows anything beyond the optional trailing slash.
If you also need trailing identifiers, then you need to allow them too. Then it might be best to make the match unspecific:
RewriteRule ^articles/(.+)$ ./articles.php?id=$1
Here .+ matches virtually anything.
But if you want to keep the numeric id separate then combine those two options:
RewriteRule ^articles/(\d+)(/.*)?$ ./articles.php?id=$1
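As a quick check, the combined pattern can be run against the three URLs from the question in Python. The helper name article_id is made up, and Python's re module is only a stand-in for Apache's regex engine.

```python
import re

pattern = r"^articles/(\d+)(/.*)?$"

def article_id(path):
    """Return the numeric id ($1), or None when the path does not match."""
    m = re.search(pattern, path)
    return m.group(1) if m else None

print(article_id("articles/123"))           # 123
print(article_id("articles/123/"))          # 123
print(article_id("articles/123/whatever"))  # 123
print(article_id("articles/123abc"))        # None
```

The last case fails because whatever follows the digits has to start with a slash.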

(curiosity) .htaccess RewriteCond directive

I want to know what following code is doing in .htaccess file
RewriteEngine On
RewriteCond %{REQUEST_URI} !(swf|thumbs|index.php|template.php)
RewriteRule ^([^/\.]+).php?$ template.php?cat=$1 [L,QSA]
Thanks in advance....
To answer the question about what the regexp actually means (as per my comment on the original answer above):
Each part in order:
^ - start of line
() - the grouping that $1 represents
[^/\.] - any character that is not / or a literal .
+ - one or more of the above character class
.php - any character followed by php (the . should be escaped as \.php to match a literal dot)
? - unescaped in a regexp means 0 or 1 of the previous character, so the final p is optional
$ - end of line.
You'd probably be best off reading some regexp tutorials such as:
http://www.regular-expressions.info/tutorial.html
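The side effects of the unescaped dot and the trailing ? can be seen in a short Python sketch. The helper name category and the sample paths are made up for illustration.

```python
import re

# The pattern exactly as written in the question.
pattern = r"^([^/\.]+).php?$"

def category(path):
    """Return what $1 would capture, or None when the path does not match."""
    m = re.search(pattern, path)
    return m.group(1) if m else None

print(category("123.php"))      # 123
print(category("123.ph"))       # 123, the final p is optional
print(category("123xphp"))      # 123, the unescaped dot matched "x"
print(category("dir/123.php"))  # None, slashes are excluded from the group
```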
If the request uri does not contain "swf", "thumbs" and so on
RewriteCond %{REQUEST_URI} !(swf|thumbs|index.php|template.php)
make /template.php?cat=etc out of /etc.php
RewriteRule ^([^/\.]+).php?$ template.php?cat=$1 [L,QSA]
L = "last rule" and QSA = append any existing query string to the newly created target uri.
The first line turns on the rewrite engine
The second line says "if the request URI doesn't contain swf or thumbs or index.php or template.php".
The third line will only be executed if line 2 was true (i.e. the URL didn't contain any of those strings). It will rewrite a URL like /123.php to template.php?cat=123. The rewrite is internal, so the user will still see /123.php in their browser window.
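Since the condition pattern has no ^ or $, it is tested anywhere in the URI. A minimal Python sketch of that check follows; the function name condition_passes and the sample URIs are hypothetical.

```python
import re

# The alternation from the RewriteCond, unanchored, so it may match anywhere.
excluded = r"(swf|thumbs|index.php|template.php)"

def condition_passes(uri):
    """True when the negated condition (!pattern) would let the rule run."""
    return re.search(excluded, uri) is None

print(condition_passes("/123.php"))             # True
print(condition_passes("/template.php"))        # False
print(condition_passes("/media/thumbs/a.php"))  # False
print(condition_passes("/myswfpage.php"))       # False, "swf" appears inside
```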
