I am attempting to implement an oembed provider using the Silverstripe framework but have come across an issue.
I have a controller routed from the URL /oembed.json, and it works fine if I call something like /oembed.json?mediaurl=mymovie.mp4.
However, the oEmbed standard states it should be /oembed.json?url=mymovie.mp4
But Silverstripe internally checks the $_GET['url'] variable and will attempt to route to that page/controller.
So Silverstripe tries to route to /mymovie.mp4, skipping my controller and hitting the ErrorPage_Controller, which creates a 404.
I'm thinking I'm going to have to extend the ErrorPage_Controller and rejig it if the URL is oembed.json, but this seems a little hackish.
Any suggestions?
Cheers
Extending on @Stephen's answer, here is a way to get around that issue without duplicating main.php and without modifying it directly.
What I did was create a _ss_environment.php file which is added early on in the loading process of Silverstripe.
_ss_environment.php
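global $url;
// Fall back to an empty string when the rewrite rule did not supply raw_url
$url = isset($_GET['raw_url']) ? $_GET['raw_url'] : '';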
if (isset($_GET['url']))
{
unset($_GET['url']);
}
// IIS includes get variables in url
$i = strpos($url, '?');
if($i !== false)
{
$url = substr($url, 0, $i);
}
.htaccess
RewriteCond %{REQUEST_URI} ^(.*)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !\.php$
RewriteRule .* framework/main.php?raw_url=%1 [QSA]
So here is what is happening:
The .htaccess is now using raw_url instead of url
_ss_environment.php is being called early in the loading process, setting the global $url variable that main.php normally sets. This is set with raw_url rather than url.
To prevent main.php from simply overriding it again when it sees your url query string parameter, it is unset (Silverstripe seems to reset this later, as far as my test is concerned).
Lastly is a little block of code that main.php would normally run if $_GET['url'] is set, copied as-is for apparent support in IIS. (If you don't use IIS, you likely won't need it.)
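As a rough illustration of the split described above (raw_url carries the routing path, url stays free for oEmbed), here is a minimal sketch. The function name resolve_routing_url is hypothetical and not part of Silverstripe. It works on a plain array rather than the real $_GET, so it does not reproduce Silverstripe's boot sequence, only the string handling.

```php
<?php
// Hypothetical helper, not a Silverstripe API: derives the routing URL from
// raw_url and returns it together with the remaining GET variables.
function resolve_routing_url(array $get): array
{
    $url = isset($get['raw_url']) ? $get['raw_url'] : '';
    unset($get['raw_url']);

    // IIS may include the query string in the URL, so cut it off
    // (same logic as the block copied from main.php).
    $i = strpos($url, '?');
    if ($i !== false) {
        $url = substr($url, 0, $i);
    }

    return array($url, $get);
}

list($url, $vars) = resolve_routing_url(array(
    'raw_url' => '/oembed.json?url=mymovie.mp4',
    'url'     => 'mymovie.mp4',
));
```

Here $url holds the path used for routing, while $vars still contains the url parameter the oEmbed consumer sent.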
This has a few benefits:
Not touching main.php makes upgrading Silverstripe slightly easier in the future
Runs the minimal amount of code needed to "trick" Silverstripe into thinking it is running normally.
The one obvious drawback to any solution that moves away from the url query string parameter is that something may look at the parameter directly. With how Silverstripe works, it is more likely that code uses the $url global variable or the Director class rather than looking at the query string for the current URL.
I tested this on a 3.1 site by doing the changes I mentioned and:
Creating a controller called TestController
In the init function of the controller, I am running the following:
var_dump($_GET['url']);
var_dump($this->getRequest()->getVars());
Visited /TestController?url=abc123, saw the value of both dumps have "abc123" as the value for the URL parameter.
Navigated to a few other custom pages on the site to make sure they were still working (no issues that I saw)
Unfortunately, I haven't been able to find documentation for the order of inclusion in regards to _config.php and _ss_environment.php. However, after browsing through the code, I have worked out it is this:
main.php runs, first main task is to require core/Constants.php
Constants.php's first task is to search for _ss_environment.php in the base folder and potential parent folders. If it finds it, it will be included.
Going back to main.php (and after the $_GET['url'] check is done in main.php), it starts an ErrorControlChain, which internally does another require for core/Core.php
Inside Core.php, it performs calls for the config manifest
ConfigManifest.php exposes the functions to actually add _config.php files and for them to be required.
I could probably go on however I think this gives a pretty good picture of what is going on. I don't really see a way around not using the _ss_environment.php file. Nothing else gets included early enough that you can hook into without modifying core code.
I had a quick play with this the other day. Looking at what main.php does, it might be best to hack away at it rather than at ErrorPage_Controller.
For starters, SS's default .htaccess file does this:
<IfModule mod_rewrite.c>
SetEnv HTTP_MOD_REWRITE On
RewriteEngine On
# RewriteBase /silverstripe
RewriteCond %{REQUEST_URI} ^(.*)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .* framework/main.php?url=%1&%{QUERY_STRING} [L]
</IfModule>
Note the ?url parameter. Changing it to something else, and changing main.php's use of it to match, may help, or it may cause a heap of extra errors and sadness.
To avoid hacking the core/framework, you could change the .htaccess to target a copy of main.php in mysite (with appropriate include changes).
Related
I have a user profile system in which a dynamic page (profile.php) changes as the id of the user changes.
E.g. profile.php?id=2 displays the profile of the user with id=2. But I want the address to be user/user_name.php, giving each user a unique profile page address.
Is it possible without creating a separate page for each user?
Thanks
OK, let's talk about Apache's mod_rewrite. Basically, what people usually do is set up one PHP page, e.g. index.php, and redirect all requests there (except those that request existing files and directories). index.php then routes these requests to the proper files/presenters/controllers, etc.
I'm going to show you a very simple example of how this can be done. It's just to give you the basic idea of how it works, and of course there are better ways to do this (for example, take a look at some framework).
So here is the very simple .htaccess file, placed in the same directory as index.php:
<IfModule mod_rewrite.c>
RewriteEngine On
# prevents files starting with dot to be viewed by browser
RewriteRule /\.|^\. - [F]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*) index.php?query=$1 [L]
</IfModule>
And here is the index.php:
<?php
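// "query" is missing when the site root is requested, so default to an empty string
$request = explode("/", isset($_GET["query"]) ? $_GET["query"] : "");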
// now you have your request in an array and you can do something with it
// like include proper files, passing it to your application class, whatever.
// for the sake of simplicity let me just show you the example of including a file
// based on the first query item
// first check it's some file we want to be included
$pages = array("page1", "page2", "page3");
if(!in_array($request[0], $pages)) $request[0] = $pages[0];
include "pages/".$request[0];
But I highly recommend you not reinvent the wheel and instead take a look at an existing PHP framework. You'll find that it saves you a lot of work once you learn how to use it, of course. To mention some: Zend Framework, Symfony, and the one I'm using, Nette Framework. There are many more, so choose whatever suits your needs.
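For the user/user_name.php address from the question, the routed "query" value can be checked with one pattern. This is a minimal sketch under my own assumptions: the helper name profile_user_from_query and the set of characters allowed in a user name are hypothetical.

```php
<?php
// Hypothetical helper: extracts the user name from a rewritten path such as
// "user/jane.php". Returns null when the path is not a profile URL.
function profile_user_from_query(string $query): ?string
{
    // Only letters, digits, underscore and hyphen are accepted here (an assumption).
    if (preg_match('#^user/([A-Za-z0-9_-]+)\.php$#', $query, $m) === 1) {
        return $m[1];
    }
    return null;
}
```

index.php could then look the name up in the database (with a prepared statement) and render the same profile.php template for every user.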
I am trying to create my own PHP MVC framework for learning purpose. I have the following directory structure:
localhost/mvc:
.htaccess
index.php
application
controller
model
view
config/
routes.php
error/
error.php
Inside application/config/routes.php I have the following code:
$route['default_controller'] = "MyController";
Now what I am trying to achieve is this: when a user visits my root directory in a browser, I want to get the value of $route['default_controller'] from the routes.php file and load the PHP class inside the controller folder that matches that value.
Also, if a user visits my application using a URL like localhost/mvc/cars, I want to search for the class named cars inside my controller folder and load it. If there is no class called cars, I want to take the user to error/error.php
I guess that to achieve this I have to work with the .htaccess file in the root directory. Could you please tell me what to put there? If there is any other way to achieve this, please suggest it.
I have tried to use the .htaccess code from here, but it's not working for me
It all sounds well and good from a buzzword standpoint, but to me this is all a little confusing because I see PHP's model as an MVC model already. It's providing the API for you to program with and deliver your content to your web server Apache and your database (something like MySQL). It translates the code(model) for you into HTML(view) ... provided that's what you intend, and you're supplying code as the user input (control). Getting too wrapped up in the terminologies gets a little distracting and can lead to chaos when you bring someone in to collaborate who isn't familiar with your conventions. (This should probably never be used in a production environment for a paying gig.)
I can tell you that on the page you referenced, the guy's .htaccess file needs a little work. The [L] flag tells mod_rewrite that this is the last rule to process when the rule matches. So you would either need to do this:
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteRule ^(.*)$ public/$1 [L]
</IfModule>
Or the following... but he was using a passthru flag which means that he is implying there are other things that could be processed prior to the last rule (eg. might be rewrite_base or alias), but that's not actually the case with his .htaccess file since it's a little bare. So this code would work similar to the code above but not exactly the same. They can't be used together though, and really there would be no need to:
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*) index.php?url=$1
</IfModule>
The difference is in the way it's processed. In the first .htaccess example you're passing any file to index.php regardless of whether it exists or not. You can [accidentally] rewrite a path that has a real file so that the real file is never accessed using this method. An example might be a file called site.css that can't be accessed because it's being redirected back to index.php.
In the second ruleset he's at least checking that the server doesn't have a file or a directory by the name being requested, and only then forwarding it to index.php as a $_GET variable (which seems a little pointless).
The way I typically write these (since I know mod_rewrite is already loaded in the config) is to do this:
RewriteEngine On
RewriteCond %{HTTP_HOST} ^mydomain.com
RewriteRule (.*) http://www.mydomain.com/$1 [R=301,L]
RewriteCond %{SCRIPT_FILENAME} !-f
RewriteCond %{SCRIPT_FILENAME} !-d
RewriteRule .* index.php
In my PHP code I pull the $_SERVER['REQUEST_URI'] and match it against a list of URIs from the database. If there's a match then I know it's a real page (or at least a record existed at some point in time). If there's not a match, then I explode the request_uri and force it through the database using a FULLTEXT search to see what potentially might match on the site.
Note: if you blindly trust the request_uri and query the database directly without cleaning it you run the risk of SQL injection. You do not want to be pwnd.
<?php
$intended_path = $_SERVER['REQUEST_URI'];
if(in_array($intended_path,$uris_from_database)){
//show the page.
} else {
$search_phrase = preg_replace('!/!',' ',$intended_path);
// $link is assumed to be an open mysqli connection
$search_phrase = mysqli_real_escape_string($link, $search_phrase);
$sql = "SELECT * FROM pages WHERE MATCH (title,content) AGAINST ('$search_phrase')";
}
Sorry if this sounds a bit pedantic, but I've had experience managing a couple of million dollar (scratch) website builds that have had their hurdles with people not sticking to a standard convention (or at least the agreed upon team consensus).
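For the original question (default controller, lookup by name, error page as fallback), the decision itself is small once every request is rewritten to index.php. A minimal sketch, assuming the URL path is already relative to the application root. The function resolve_controller and the $known list are hypothetical and stand in for a real class_exists() or file check.

```php
<?php
// Hypothetical resolver: returns the controller class to load, or null when
// the caller should show error/error.php instead.
function resolve_controller(string $path, array $route, array $known): ?string
{
    // Split the path and drop empty segments caused by leading/trailing slashes.
    $segments = array_values(array_filter(
        explode('/', $path),
        function ($s) { return $s !== ''; }
    ));

    // No segment means the root was requested: use the configured default.
    $name = isset($segments[0]) ? $segments[0] : $route['default_controller'];

    foreach ($known as $class) {
        if (strcasecmp($class, $name) === 0) {
            return $class;
        }
    }
    return null;
}

$route = array('default_controller' => 'MyController');
$known = array('MyController', 'Cars');
```

With this table, an empty path resolves to MyController, "cars" resolves to Cars, and anything else yields null.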
I've always been bad at Apache and used very simple solutions. Right now I have built a CMS, but the .htaccess is starting to be a huge downside.
I will first explain, how my friendly-urls work and look like. My language-switch is url based and always contains two characters. And it looks like this: stackoverflow.com/en/ this makes the switching really easy and since its url based.. it works well in the SEO terms. Also, if no language-id is set, then the default language will be used (stackoverflow.com/).
There are no page-ids in numbers. I have unique page-ids in text: stackoverflow.com/services.html and for SEO and folder-directories-anti-conflict purposes .html at the end..
For subpages I have "$current_page" and "$parent_page" style variables: stackoverflow.com/services/translating.html Services being the parent and translating being the current page.
Some sample code too (I trimmed it a lot, so don't think it's incomplete):
RewriteRule ^(et|en|fi)\/(.+)\/(.+)\.html index.php?language=$1&pagelink=$3&parentlink=$2 [L,NC,QSA]
RewriteRule ^(.+)\/(.+)\.html index.php?language=0&pagelink=$2&parentlink=$1 [L,NC,QSA]
RewriteRule ^(.+)\.html index.php?language=0&pagelink=$1&parentlink=0 [L,NC,QSA]
How can I make the language-switch part more dynamic?
This method, ^(et|en|fi)\/, means that when I set up the CMS I must manually set the language list. The best option would be to set it somehow from the CMS settings, because that way there are no conflicts with folders. Is it possible to set a global Apache variable via PHP and then use it in the .htaccess file? Something like ^(LANGUAGELISTS)\/? If this isn't possible, then the next best thing would be to match 2 characters in that location and pass them as $_GET['language'].
How can I have unlimited parents dynamically?
Meaning, that the "$parent_page" is not set statically and I have unlimited children, similar to this: stackoverflow.com/services/translating/english/somesubpage.html. If that is possible, then also, how will it be used in the php, with an array?
Bounty edit
The first part of the question is basically solved, unless somebody comes up with some php -> apache-array -> .htaccess way.
However, the second part of the question is still not solved. Since this has been a problem in all my projects and could help somebody else in the future, I decided to add a bounty to this question.
To answer your first question:
You could use RewriteRule ^([a-zA-Z]{2})([/]?)(.*)$ path/file.php?language=$1
This limits the first string to two characters and passes it on to $_GET['language']
Edit: adding RewriteCond %{REQUEST_FILENAME} !-f
and RewriteCond %{REQUEST_FILENAME} !-d will prevent conflicts with existing directories / files
Second question is much more difficult..
Update:
What Shad and toopay say is a good start in my opinion.
Using explode() to separate levels and comparing them to the slug is quite simple.
But it's getting complicated once you want to add flexibility to the script.
function get_URL_items() {
$get_URL_items_url = $_SERVER['REQUEST_URI'];
$get_URL_items_vars = explode("/",$get_URL_items_url);
for ($get_URL_items_i = 0; $get_URL_items_i < count($get_URL_items_vars) ; $get_URL_items_i++) {
if(strlen(trim($get_URL_items_vars[$get_URL_items_i])) == 0) {
unset($get_URL_items_vars[$get_URL_items_i]);
}
}
return $get_URL_items_vars;
}
Let's say you've got a website with a sub-section called "Festival" and a database filled with info for 100+ artists, and you want your URLs to look like website.com/festival/<artistgenre>/<artistname>/.
You don't want to create 100+ pages in your CMS so <artistgenre> and <artistname> are some kind of wildcards.
I found it hard to achieve this without a lot of if/else statements like:
$item = get_URL_items();
if(is_user($item[2]) && is_genre($item[1]) && is_festival($item[0])) {
// do mysql stuff here
}
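One way to cut down on the if/else chains is to describe each URL shape once and let a small matcher pull out the wildcard parts. This is only a sketch: match_route and the {name} placeholder syntax are my own invention for illustration, not something the CMS above provides.

```php
<?php
// Hypothetical matcher: compares a pattern such as "festival/{genre}/{artist}"
// with a path, segment by segment. Returns the placeholder values, or null
// when the path does not fit the pattern.
function match_route(string $pattern, string $path): ?array
{
    $want = explode('/', trim($pattern, '/'));
    $have = explode('/', trim($path, '/'));
    if (count($want) !== count($have)) {
        return null;
    }

    $params = array();
    foreach ($want as $i => $segment) {
        if (preg_match('/^\{(\w+)\}$/', $segment, $m) === 1) {
            // Placeholder segment: accept any non-empty value.
            if ($have[$i] === '') {
                return null;
            }
            $params[$m[1]] = $have[$i];
        } elseif ($segment !== $have[$i]) {
            // Literal segment: must match exactly.
            return null;
        }
    }
    return $params;
}
```

The database lookups (is this a known genre, is this a known artist) would then run once on the returned values instead of being spread over nested conditions.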
If I were you, I would use something like this:
.htaccess:
Options +FollowSymLinks
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !/main.php$
RewriteRule ^([a-zA-Z]{2})?(.*)$ main.php?lang=$1&path=$2 [L,QSA]
main.php:
$langs = array('en','de','ru'); // list of supported languages
$defaultLang = 'en'; // assumed default language
$currentLang = isset($_GET['lang']) && in_array($_GET['lang'], $langs) ? $_GET['lang'] : $defaultLang; // current selected language
$path = $_GET['path']; // current path
Then, in main.php, you can parse the path according to your needs.
In answer to your bounty question I would use this:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([A-Z]{2}\/)*(([A-Z]+\/)*)([A-Z]+)\.html$ index.php?lang=$1&parents=$2&pagelink=$4 [NC,QSA,L]
Since you want to be able to handle any number of generations/levels in your URL, have you thought about how you want to catch them in your PHP script?
You definitely don't want to be going and checking isset($_GET['parent1']);isset($_GET['parent2']) etc etc etc.
As some of the other responses have indicated, you really need to be parsing your URL inside your PHP script; to that end, my RewriteRule hands off the entire 'parents' section of the URL, which your script can then explode and parse; but doesn't interfere with normal no-parent urls. =)
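On the PHP side, the handed-off values could be unpacked roughly like this. A minimal sketch: parse_page_request is a hypothetical name, and it assumes lang arrives with its trailing slash (for example "en/") and parents as "services/translating/", which is what the capture groups in the rule above would produce.

```php
<?php
// Hypothetical helper: normalises the lang, parents and pagelink values that
// the RewriteRule passes to index.php.
function parse_page_request(array $get): array
{
    $lang    = isset($get['lang']) ? rtrim($get['lang'], '/') : '';
    $parents = isset($get['parents']) ? trim($get['parents'], '/') : '';

    return array(
        'lang'    => $lang === '' ? null : strtolower($lang),
        // An empty parents string means a top-level page.
        'parents' => $parents === '' ? array() : explode('/', $parents),
        'page'    => isset($get['pagelink']) ? $get['pagelink'] : null,
    );
}
```

The parents array keeps the order from the URL, so the last element is the direct parent of the requested page.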
I somehow think this answer won't be very popular but here goes anyway. :)
mod_rewrite reaches a point where using it the old fashioned way with regular expressions becomes annoying. I suggest you skip all the pain and swap to using an external program/script to do your rewrites. I wouldn't suggest you do rewrites on all files using this method, but instead just for the urls that most users will see and type. As long as you know how to write efficient code you can even redirect to a php script to do the rewrites (as I have done in the past on a very high traffic site) and it will not have a noticeable effect on load times. If you ever reach a point where the rewrites are the main thing slowing down your site you can then switch it out for a program written in a quicker language, however I'd be surprised if you reach that.
Some things to be aware of:
You need to set a rewrite lock directive or you will get lots of crazy output.
Remember that the rewrite script is a command line PHP script. It has no knowledge of things such as the $_SERVER global. This is surprisingly easy to forget.
This script is loaded at server start so any changes to it require a server restart before they take effect.
Always test this on the command line by passing a url and checking the output before restarting the server. If your script is broken restarting the server will result in anything from non functioning rewrites to the server not starting at all.
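I am assuming the points above refer to mod_rewrite's RewriteMap with an external program (the prg: map type), where Apache writes one lookup key per line to the program's standard input and reads one answer per line from its standard output. Under that assumption, a minimal sketch of such a script follows. The function name, the --serve switch and the lookup table are hypothetical.

```php
<?php
// Hypothetical lookup: returns the rewritten target for a key, or the string
// "NULL", which mod_rewrite treats as "no match" for program maps.
function rewrite_lookup(string $key, array $table): string
{
    $key = trim($key);
    return isset($table[$key]) ? $table[$key] : 'NULL';
}

// Serve loop: one key per line in, one answer per line out.
// It only runs when the script is started with --serve, so the function
// above can be tried on its own from the command line.
if (PHP_SAPI === 'cli' && isset($argv[1]) && $argv[1] === '--serve') {
    // Assumed table; a real script might load this from a database.
    $table = array('about-us' => '/index.php?page=12');
    while (($line = fgets(STDIN)) !== false) {
        echo rewrite_lookup($line, $table), "\n";
        flush(); // answers must not sit in a buffer
    }
}
```

Keeping the lookup in a plain function makes the "test on the command line first" advice easy to follow before the server is restarted.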
It's a bit more hassle in the beginning, but once you have set this up you will find adding new rewrite rules to be an absolute breeze and a hell of a lot more flexible.
Here is the only tutorial I was able to find on how to do this using PHP...
Using MySQL to control mod_rewrite via PHP
This is far from the standard way of doing rewrites so I imagine I'm going to cop a lot of flack for this answer. Oh well. :)
Well, for the SEO part, I think it's better to have a slug for each article (assuming you use this for a CMS). That means your database has some "translation" table which translates the requested URI/slug and associates it with $parent_page.
As almost every programmer, I'm writing my own PHP framework for educational purposes. And now I'm looking at the problem with parsing URLs for MVC routing.
The framework will use friendly URLs everywhere. But the question is how to parse them in the front controller. For example, the link /foo/bar/aaa/bbb may mean "Call the controller foo's action bar and pass parameter aaa with value bbb". But if someone installs the framework into a subdirectory of the domain root, the directory part should be stripped before determining the controller name and action name. And I'm looking for a way to do it safely.
Also I would like to support a fallback case if URL rewriting is not supported on the server.
On different systems different sets of $_SERVER variables are defined. For example, on my local machine from the set of PATH_INFO, REQUEST_URI, REQUEST_URL, ORIG_REQUEST_URI, SCRIPT_NAME, PHP_SELF only REQUEST_URI, SCRIPT_NAME and PHP_SELF are defined. I wonder, if I can rely on them.
Mature frameworks like Symfony or ZF have some complicated algorithms for parsing URLs (at least it seemed so). So I can't just take a part from there for mine.
Two workarounds:
Add a config variable with the URL / installation directory to your application, and strip it from $_SERVER['REQUEST_URI']
Make apache rewrite it to get variable
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*) index.php?myrequest=$1 [QSA,L]
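The first workaround can be sketched as follows: strip the configured installation directory from the request URI, then read controller, action and key/value parameters from what is left. The function name and the returned array layout are hypothetical, and the base path is assumed to come from your own config.

```php
<?php
// Hypothetical helper: turns "/myapp/foo/bar/aaa/bbb?x=1" with base "/myapp"
// into controller "foo", action "bar" and params array('aaa' => 'bbb').
function route_from_request_uri(string $requestUri, string $basePath): array
{
    // Keep only the path; the query string is handled by $_GET anyway.
    $path = (string) parse_url($requestUri, PHP_URL_PATH);

    // Remove the installation directory only when it is a real prefix.
    $base = rtrim($basePath, '/');
    if ($base !== '') {
        if ($path === $base) {
            $path = '/';
        } elseif (strpos($path, $base . '/') === 0) {
            $path = substr($path, strlen($base));
        }
    }

    $parts = array_values(array_filter(
        explode('/', $path),
        function ($s) { return $s !== ''; }
    ));

    // Everything after controller/action is read as name/value pairs.
    $params = array();
    for ($i = 2; $i + 1 < count($parts); $i += 2) {
        $params[$parts[$i]] = $parts[$i + 1];
    }

    return array(
        'controller' => isset($parts[0]) ? $parts[0] : null,
        'action'     => isset($parts[1]) ? $parts[1] : null,
        'params'     => $params,
    );
}
```

Because only REQUEST_URI and a config value are used, this does not depend on which of the other $_SERVER variables a given host defines.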
I'm currently doing the same research. But everything I see is so complicated that I'll most probably continue using mod_rewrite anyway. After all, you end up with the same thing whether you use SEF with PHP or mod_rewrite with Apache. Anyway, I'll be monitoring this topic; it's interesting :)
Hope the php gurus around here have some more info about this :)
Edit:
It really depends on what you want to do. For my needs I hardcoded most of the pages so they looked SEF. But something like the example below should work as well.
RewriteEngine on
RewriteRule ^/posts/([A-Za-z0-9_\-]+)/([A-Za-z0-9_\-]+)\.html$ posts.php?$1=$2 [NC]
With this example above:
http://localhost/posts/view/23.html
http://localhost/posts/delete/23.html
is equal to:
http://localhost/posts.php?view=23
http://localhost/posts.php?delete=23
It really depends on what exactly you're doing :)
The example above should work, but I haven't tested it.
I usually use the following for determining an application base URL path, assuming all your requests always goes through the same gateway script:
$base = dirname($_SERVER['PHP_SELF']);
For your second question, if you want to check if mod_rewrite is enabled, you can use:
if (in_array('mod_rewrite', apache_get_modules())) {
// rewrite is enabled
}
However, that doesn't necessarily mean that RewriteEngine is enabled, so you should probably use an extra condition:
if (in_array('mod_rewrite', apache_get_modules()) &&
preg_match('/RewriteEngine +On/i', file_get_contents('/path/to/.htaccess'))) {
// rewrite is enabled and active
}
Maybe you could take PHP_SELF and remove the first n chars where n is the length of SCRIPT_NAME.
Edit: Oops... seems like you can just take PHP_SELF: http://php.about.com/od/learnphp/qt/_SERVER_PHP.htm
What's the best way to implement a URL interpreter / dispatcher, such as found in Django and RoR, in PHP?
It should be able to interpret a query string as follows:
/users/show/4 maps to
area = Users
action = show
Id = 4
/contents/list/20/10 maps to
area = Contents
action = list
Start = 20
Count = 10
/toggle/projects/10/active maps to
action = toggle
area = Projects
id = 10
field = active
Where the query string can be a specified GET / POST variable, or a string passed to the interpreter.
Edit: I'd prefer an implementation that does not use mod_rewrite.
Edit: This question is not about clean urls, but about interpreting a URL. Drupal uses mod_rewrite to redirect requests such as http://host/node/5 to http://host/?q=node/5. It then interprets the value of $_REQUEST['q']. I'm interested in the interpreting part.
If appropriate, you can use one that already exists in an MVC framework.
Check out projects such as (in no particular order) Zend Framework, CakePHP, Symfony, CodeIgniter, Kohana, Solar and Akelos.
Have a look at the CakePHP implementation as an example:
https://trac.cakephp.org/browser/trunk/cake/1.2.x.x/cake/dispatcher.php
https://trac.cakephp.org/browser/trunk/cake/1.2.x.x/cake/libs/router.php
You could also do something with mod_rewrite:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^(.*)$ $1 [L]
RewriteRule ^([a-z]{2})/(.*)$ $2?lang=$1 [QSA,L]
RewriteRule ^(.*)$ index.php?url=$1 [QSA,L]
</IfModule>
This would catch URLs like /en/foo and /de/foo and pass them to index.php with the GET parameters 'lang' and 'url'. Something similar can be done for 'projects', 'actions', etc.
Why specifically would you prefer not to use mod_rewrite? RoR uses mod_rewrite. I'm not sure how Django does this, but mod_php defaults to mapping URLs to files, so unless you create a system that writes a separate PHP file for every possible URL (a maintenance nightmare), you'll need to use mod_rewrite for clean URLs.
What you are describing in your question should actually be the URL mapper part. For that, you could use a PEAR package called Net_URL_Mapper. For some information on how to use that class, have a look at this unit test.
The way that I do this is very simple.
I use WordPress' .htaccess file:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
What this .htaccess does is rewrite any request that does not match an existing file or directory (one that would otherwise return a 404) to index.php.
In the above, /index.php is the "interpreter" for the URL.
In index.php, I have something along the lines of:
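// Drop the query string and the outer slashes so that index 0 is the first segment
$req = trim(parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH), "/");
$req = explode("/",$req);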
The second line splits up the URL into sections based on "/". You can have
$area = $req['0'];
$action= $req['1'];
$id = $req['2'];
What I end up doing is:
function get_page($offset) {//offset is the chunk of URL we want to look at
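// Drop the query string and the outer slashes so that offset 0 is the first segment
$req = trim(parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH), "/");
$req = explode("/",$req);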
$page = $req[$offset];
return $page;
}
$area = get_page(0);
$action = get_page(1);
$id = get_page(2);
Hope this helps!
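The three example mappings from the question do not share one field order (the toggle URL puts the action first), so a fixed offset per field is not enough. One possible sketch is a small table keyed by the first segment. The function name interpret_url and the table are hypothetical, and the lowercase field names are my own choice.

```php
<?php
// Hypothetical interpreter for the three URL shapes listed in the question.
function interpret_url(string $path): ?array
{
    $parts = array_values(array_filter(
        explode('/', $path),
        function ($s) { return $s !== ''; }
    ));

    // First segment => names of all segments, in URL order.
    $routes = array(
        'users'    => array('area', 'action', 'id'),
        'contents' => array('area', 'action', 'start', 'count'),
        'toggle'   => array('action', 'area', 'id', 'field'),
    );

    if (!isset($parts[0], $routes[$parts[0]])) {
        return null; // unknown route
    }
    $fields = $routes[$parts[0]];
    if (count($fields) !== count($parts)) {
        return null; // wrong number of segments
    }

    $result = array_combine($fields, $parts);
    $result['area'] = ucfirst($result['area']); // "users" becomes "Users"
    return $result;
}
```

The same function works whether the path comes from REQUEST_URI, from a GET/POST variable, or from a string passed in directly.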
Just to second @Cheekysoft's suggestion, check out the Zend_Controller component of the Zend Framework. It is an implementation of the Front Controller pattern that can be used independently of the rest of the framework (assuming you would rather not use a complete MVC framework).
And obviously, CakePHP is the most similar to the RoR style.
I'm doing a PHP framework that does just what you are describing - taking what Django is doing, and bringing it to PHP, and here's how I'm solving this at the moment:
To get the nice and clean RESTful URLs that Django has (http://example.com/do/this/and/that/), you are unfortunately required to have mod_rewrite. But things aren't as glum as they seem, because you can achieve almost the same thing with a URI that contains the script's filename (http://example.com/index.php/do/this/and/that/). My mod_rewrite just forwards all calls to that format, so it's almost as usable without the mod_rewrite trick.
To be truthful, I'm currently doing the latter method by GET (http://example.com/index.php?do/this/and/that/), and fixing things up in case some genuine GET variables are passed around. But my initial research says that using the direct slash after the filename should be even easier. You can dig the path out with a certain $_SERVER superglobal index, and it doesn't require any Apache configuration. I can't remember the exact index offhand, but you can trivially set up a phpinfo() test page to see how things look under the hood.
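The index the author could not recall is most likely PATH_INFO, which PHP usually fills with the part of the URL after the script name. That is my assumption rather than something stated above. A minimal sketch, with a hypothetical function name, that takes the server array as a parameter so it can be tried without a web server:

```php
<?php
// Hypothetical helper: splits the extra path after index.php into segments.
// For http://example.com/index.php/do/this/and/that/ the web server is
// expected to provide PATH_INFO = "/do/this/and/that/".
function path_segments(array $server): array
{
    $info = isset($server['PATH_INFO']) ? $server['PATH_INFO'] : '';
    $info = trim($info, '/');
    return $info === '' ? array() : explode('/', $info);
}
```

In a real request this would be called as path_segments($_SERVER). Whether PATH_INFO is set depends on the server setup, so the phpinfo() check mentioned above is still worth doing.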
Hope this helps.