IP Geo location with Mod_Rewrite & PHP

I am writing a small script that redirects visitors to country-specific landing pages (for example, if you come from Germany you are redirected to xyz.com/de/). The redirection happens in index.php, which calls a web service that returns the country the user is accessing the website from; I then redirect the user with a 301 to the new page, xyz.com/de/.
I have two questions:
1- Can the same functionality be integrated with mod_rewrite, and if so, what is the advantage in terms of performance and SEO quality?
2- Can mod_rewrite preserve the query string, including GCLID, on the redirects? (At the moment I concatenate it from $_SERVER onto the PHP redirect.)

You can install mod_geoip on your server, which enables database-based geolocation lookup directly inside Apache. Look at the examples for exactly the scenario you talk about.
The advantage would be much better performance, since the lookup will be done locally using a database, instead of needing to call an external web service. It also requires virtually no code once this is set up, easing maintenance. You will only have to make sure your local copy of the lookup database is regularly updated, typically with a weekly/daily cron job.
You can rewrite the URL in any way you want, appending any parameters you want.
SEO-wise it should have no effect at all compared to PHP based redirects, since to the client the behaviour appears exactly the same.

mod_rewrite can't do geolocation, nor can it connect to an external service.
If your PHP code is doing the 301 redirect, then you'll need to preserve the query string in your PHP code. If you have an htaccess rule doing the 301 redirect, then the query string should be passed through with the redirect.
The documentation states:
Modifying the Query String
By default, the query string is passed through unchanged. [...] When you want to erase an existing query string, end the substitution string with just a question mark. To combine new and old query strings, use the [QSA] flag.
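For the PHP side, preserving the query string could look roughly like the sketch below. The helper name buildCountryRedirect is hypothetical, and the country lookup itself is left out, since the question does not show how the web service is called.

```php
<?php
// Hypothetical helper, not from the original post: builds the redirect
// target for a two-letter country code and keeps the incoming query string.
function buildCountryRedirect(string $countryCode, string $queryString): string
{
    $target = '/' . strtolower($countryCode) . '/';
    if ($queryString !== '') {
        $target .= '?' . $queryString; // gclid and any other parameters survive
    }
    return $target;
}

// In index.php, once the country code is known (lookup not shown):
// header('Location: ' . buildCountryRedirect($country, $_SERVER['QUERY_STRING'] ?? ''), true, 301);
// exit;
```

With an .htaccess redirect none of this is needed, because the query string is passed through by default as quoted above.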

In answer to question 1.
You can do the GeoIP redirection in the Apache vhost configuration if you have mod_geoip/mod_geoip2 installed.
You can also do it using mod_rewrite, provided mod_geoip/mod_geoip2 is installed.
In answer to question 2.
You can use mod_rewrite to keep the existing query string on the rewritten URL; there are some examples of this here

Related

have different static url in dynamic page

I have a website where each person has a personal profile. I would like to have static URLs like mywebsite/user1 and mywebsite/user2, while actually staying on the same page and changing the content dynamically. One reason is that when I open the site I query a database for some data, and I don't want to query it again each time I change page.
I don't like URLs like mywebsite?user=1
Is there a solution?
Thank you
[EDIT: better explanation]
I have a dynamic page that shows the user profile of my website. So the URL is something like http://mywebsite.me?user=2
but I would like to have a static link, like
http://mywebsite.me/user2name
Why do I want this? Because it's easy to remember and type, and because I can change the content of the page dynamically without querying my database each time (I need some shared info on all the pages; the info is the same for every page).

Yes there are solutions to your problem!
The first solution is server-dependent. I am a little unsure how this works on an IIS server, but it's quite simple in Apache. Apache can take directives from a file called .htaccess. The .htaccess file needs to be in the same folder as your active script to work. The main server configuration also needs the directive AllowOverride All for that folder and the mod_rewrite module loaded. If you have all this set up, edit your .htaccess file to contain the following:
RewriteEngine on
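RewriteRule ^([^/.]+)/?$ index.php?user=$1 [L]
This will allow you to access mywebsite/index.php?user=12 with mywebsite/12. (In a .htaccess file the pattern is matched against the path relative to the folder that holds the file, so it does not repeat the mywebsite/ prefix.)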
A beginner guide to mod_rewrite.
You could also fake this with only PHP. It will not be as pretty as the previous example, but it is doable. Also, take into consideration that you are working with user input, so the data is to be considered tainted. The user needs to access the script via mywebsite/index.php/user/12.
<?php
// Drop the query string and the leading slash, then split the path on "/".
// Assumes index.php sits at the web root.
$path = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
$request = explode('/', ltrim($path, '/')); // $request[0] will contain the name of the .php file
$user = array();
if (isset($request[1], $request[2])) {
    $user[$request[1]] = $request[2]; // tainted input: validate before use
}
/* Do stuff with $user['user'] */
?>
These are the quickest ways I know to achieve what you want.

First off, please familiarise yourself with the solution I have presented here: http://codeumbra.eu/how-to-make-a-blazing-fast-ajax-call-to-a-zend-framework-application
This does exactly what you propose: eliminates all the unnecessary database queries and executes only the one that's currently needed (in your case: fetch user data). If your application doesn't use Zend Framework, the principle remains the same regardless - you'll just have to open the database connection the way that is required by your application. Or just use PDO or whatever you're comfortable with.
Essentially, the method assumes you make an AJAX call to the site to fetch the data you want. It's easy in jQuery (example provided in the article mentioned above). You can replace the previous user's data with the requested one's using JavaScript as well on success (I hope you're familiar with AJAX; if not, please leave a comment and I will explain in more detail).
[EDIT]
Since you've explained in your edit that what you mean is URI rewriting, I can suggest implementing a simple URI router. The basics behind how it works are described here: http://mingos.eu/2012/09/the-basics-of-uri-routing. You can make your router as complex or as simple as needed by your application.
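As a rough idea of what such a router can look like, here is a minimal sketch. It is not the code from the linked article; the route table and the handlers are hypothetical, and it assumes the web server already sends every request to this one script.

```php
<?php
// Minimal URI router sketch. The route table and handlers are hypothetical.
function route(string $uri, array $routes)
{
    $path = trim((string) parse_url($uri, PHP_URL_PATH), '/');
    foreach ($routes as $pattern => $handler) {
        if (preg_match($pattern, $path, $matches)) {
            array_shift($matches);        // drop the full match, keep the captures
            return $handler(...$matches);
        }
    }
    return null; // no route matched; the caller can send a 404
}

$routes = [
    '#^$#'              => function () { return 'home'; },
    '#^([a-z0-9_]+)$#i' => function ($name) { return "profile of $name"; },
];

echo route('/user2name', $routes); // prints "profile of user2name"
```

Routes are tried in order, so more specific patterns should come before catch-all ones.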

The URL does not dictate whether or not you make a database call. Those are two separate issues. You typically set up your server so example.com/username is rewritten internally to example.com/user.php?id=username. You're still running PHP, the URL is just masking it. That's called pretty URLs, realized by URL rewriting.
If you want to avoid calling the database, cache your data. E.g. in the above user.php script, you generate a complete HTML page, then write it into a cache folder somewhere, then next time instead of generating the page again the script just outputs the contents of the already created page. Or you just cache the database data somewhere, but still generate the HTML anew every time.
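The first caching variant (store the finished HTML, reuse it next time) could be sketched as below. The function name, the cache directory, the five-minute lifetime and the render callback are all assumptions for illustration; a real setup would also need to expire the cached page when the profile is edited.

```php
<?php
// File-based page cache sketch. Directory, TTL and the render callback are
// hypothetical; real code also needs invalidation when the data changes.
function cachedPage(string $key, int $ttl, callable $render, string $dir): string
{
    $file = $dir . '/' . md5($key) . '.html';
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return file_get_contents($file);   // cache hit: no database work
    }
    $html = $render();                      // cache miss: build the page once
    file_put_contents($file, $html, LOCK_EX);
    return $html;
}

// Usage: the closure stands in for the database query plus templating.
echo cachedPage('user:username', 300, function () {
    return '<h1>Profile of username</h1>';
}, sys_get_temp_dir());
```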
You could write an actual HTML file to /username, so the web server will serve it directly without even bothering PHP. That's not typically what you want though, since it's hard to update/expire those files and you also typically want some dynamic content on there.

Select all from your database.
Then create a file containing the script's contents (index.php?user=...) for each one. Set the file name to the user_id/user_name you got from the SELECT statement.
This will create a page for each user in the present folder.
To avoid having to recreate 'static' pages, you could add a new column named, say, 'indexedyet' and change it to 1 on creating a file. You then select only rows which have this as 0. You could perform this via a cron job once a day or so.
This leaves you vulnerable to user data changes though, as the pages won't automatically update. A tactic to use here is to update the static page on any edit.
Another, probably better (sorry, not had enough coffee yet) idea would be to create a folder on a user's registration. Make the index.php page tailored to them on registration, and then anything like www.mysite.com/myuser will show their 'tailored version'. Again, update the page on user updates.
I would be happy to provide examples depending on your approach.

No require, no include, no url rewriting, yet the script is executed without being in the url

I am trying to trace the flow of execution in some legacy code. We have a report being accessed with
http://site.com/?nq=showreport&action=view
This is the puzzle:
in index.php there is no $_GET['nq'] or $_GET['action'] (and no $_REQUEST either),
index.php, or any sources it includes, do not include showreport.php,
in .htaccess there is no url-rewriting,
yet, showreport.php gets executed.
I have access to cPanel (but no apache config file) on the server and this is live code I cannot take any liberty with.
What could be making this happen? Where should I look?
Update
Funny thing - I sent the client a link to this question in a status update to keep him in the loop; minutes later all access was revoked and the client informed me that the project is cancelled. I believe I have taken enough care not to leave any traces to where the code actually is ...
I am relieved this has been taken off me now, but I am also itching to know what it was!
Thank you everybody for your time and help.

There are "a hundred" ways to parse a URL, in various layers (system, httpd server, CGI script). So it's not possible to answer your question specifically with the information you have provided.
You leave a quite distinct hint: "legacy code". I assume you mean that you don't want to read the whole codebase and understand it just to locate the piece of the application that parses that parameter.
It would be good however if you leave some hints "how legacy" that code is: Age, PHP version targeted etc. This can help.
It was not always that $_GET was used to access these values (same is true for $_REQUEST, they are cousins).
Let's take a look in the PHP 3 manual Mirror:
HTTP_GET_VARS
An associative array of variables passed to the current script via the HTTP GET method.
Is the script perhaps making use of this array? That's just a guess, but this was a valid way to access these parameters for quite some time.
Anyway, this need not be what you are looking for. There was this often misunderstood and misused (literally abused) feature in PHP called register_globals (see the PHP manual). So you might just be searching for $nq.
Next to that, there's always the request uri and apache / environment / cgi variables. See the link to the PHP 3 manual above it lists many of those. Compare this with the current manual to get a broad understanding.
In any case, you might have grep or a multi-file search available (Eclipse has a nice built-in one if you need to inspect legacy code inside an IDE).
So in the end of the day you might just look for a string like nq, 'nq', "nq" or $nq. Then check what this search brings up. String based search is a good entry into a codebase you don't know at all.

I’d install xdebug and use its function trace to look, piece by piece, at what it is doing.
EDIT:
Okay, just an idea, but... Maybe your application is some kind of include-hell application like the one I’m sometimes forced to mess with at work? One file includes another, which includes another, and that includes the original file again... So maybe your index file includes some file that eventually causes this file to get included?
Another EDIT:
Or, sometimes application devs didn’t know what a $_GET variable is and parsed the URLs themselves, doing manual includes based on the parsed URLs.

I don't know how it works, but I know that WordPress/SilverStripe use their own URL rewriting to parse the URL and find posts/tags/etc. So the URL parsing may be done in a PHP script.

Check your config files (php.ini and .htaccess), you may have auto_prepend_file set.
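One way to check this from inside PHP is sketched below, using ini_get() and get_included_files(). The helper name is made up for illustration, and on a live site the result should go to the error log rather than the page.

```php
<?php
// Diagnostic sketch: report ini settings that pull in files implicitly,
// plus every file PHP has loaded so far for this request.
function implicitIncludes(): array
{
    return [
        'auto_prepend_file' => (string) ini_get('auto_prepend_file'),
        'auto_append_file'  => (string) ini_get('auto_append_file'),
        'included_files'    => get_included_files(),
    ];
}

// Log rather than echo, so visitors to the live site see nothing.
error_log(print_r(implicitIncludes(), true));
```

An empty string for both settings means nothing is prepended or appended through php.ini or .htaccess.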

Check your crontab [sorry, I don't know where you would find it in cPanel]
- does the script fire at a specific time, or can you see that it definitely fires only when you request a specific page?
-sean
EDIT:
If crontab is out, take a look at index.php [and its includes] and look for code that either loops over the URL parameters without specifically naming "nq", or anything that might be parsing the query string [probably something like: $_SERVER['QUERY_STRING'] ]
-sean
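To show what such code might look like, here is a purely hypothetical dispatcher that never touches $_GET. It is not the legacy code in question; the function name and the idea that the parameter name is passed in (for example from a config file) are assumptions.

```php
<?php
// Hypothetical illustration of a dispatcher that never touches $_GET:
// it parses the raw query string and turns one value into a file name.
function resolveScript(string $queryString, string $key): ?string
{
    parse_str($queryString, $params);   // same parsing PHP does for $_GET
    if (!isset($params[$key]) || !is_string($params[$key])) {
        return null;
    }
    // basename() strips directory parts, so "../x" cannot escape the folder.
    return basename($params[$key]) . '.php';
}

echo resolveScript('nq=showreport&action=view', 'nq'); // prints "showreport.php"
```

A search for QUERY_STRING or parse_str would find code of this shape even when a search for $_GET['nq'] finds nothing.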

You should give debug_backtrace() (or debug_print_backtrace()) a try. The output is similar to an exception stack trace, so it should help you find out what is called, when, and from where. If you cannot run the application on a local development system, make sure that nobody else can see the output.
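A small sketch of how the backtrace could be kept out of the page by sending it to the error log. The helper name traceToLog is hypothetical; calling it at the top of showreport.php should list the include/require chain that led there.

```php
<?php
// Sketch: summarise the call stack as "file:line function" lines and send it
// to the error log instead of the page. The helper name is hypothetical.
function traceToLog(): array
{
    $lines = [];
    foreach (debug_backtrace(DEBUG_BACKTRACE_IGNORE_ARGS) as $frame) {
        $lines[] = ($frame['file'] ?? '?') . ':' . ($frame['line'] ?? '?')
                 . ' ' . $frame['function'];
    }
    error_log(implode("\n", $lines));
    return $lines;
}
```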

Are you sure that you are looking at the right config or server? If you go to the URL above you get an error page that seems to indicate that the server is actually a Microsoft IIS server and not an Apache one.

Is there a way to pass variables except sessions and get variables?

My problem is not so easy to describe ... for me :-) so please be lenient towards me.
I have several ways to view a list, which means there are several possible routes to reach and create the view that displays my list. This works well with browser tabs opened in parallel, and that is intended.
If I click on an item in my list I come to a detail view of that item.
In this view I want to know from which type of list the link was "called". The first problem is that the referrer will always be the same, and the second is that I should not append a GET variable to the URL (and it should not be a submitted form either).
If I store it in the session, I will overwrite my session parameter when working in a parallel tab as well.
What is the best way to still achieve my goal of knowing which mode the previous list was in?

You need to use something to differentiate one page from another, otherwise your server won't know what you're asking for.
You can POST your request: this will hide the URL parameters, but will hinder your back button functionality.
You can GET your request: this will make your URLs more "ugly" but you should be able to work around that by passing short, concise identifiers like www.example.com/listDetail?id=12
If you can set up mod_rewrite, then you can GET requests to a URL like www.example.com/listDetails/12, and Apache will rewrite the request behind the scenes to look more like www.example.com/listDetails?id=12 but the user will never see it -- they will just see the original, clean/friendly version.
You said you don't have access to the server configuration -- I assume this is because you are on a shared server? Most shared servers already have mod_rewrite installed. And while the Apache vhost is typically the most appropriate place to put rewrite rules, they can also be put in a .htaccess file within any directory you want to control. (Sometimes the server configuration disables this, but on a shared host it is usually enabled.) Look into creating .htaccess files and how to use mod_rewrite.

How to pass parameters with hash in PHP

Recently, when I looked at a Google results page, the query and other parameters were passed with # (hash) instead of the usual "?".
I saw the same thing on Facebook. This was quite interesting, and after a simple Google search I found results related to Perl and Ruby but none for PHP.
Is it possible to pass parameters with # in PHP instead of "?", or is this possible only with Perl/Ruby? This would be useful, as search engines would not parse the parameters in the URLs.
Any ideas will be helpful to me.

Traditionally, the # told the browser to automatically scroll to a particular point in the page, which was (and still is) often used to implement links from one part of a page (e.g. a table of contents) to another (e.g. a section heading).
However, it also has the effect of causing the URL containing the # to be recorded in the history, even if it's identical to the previous URL except for the # and what follows it. (In other words, the user is still on the same page.) This means that the back button can be used to get back to the state that you were previously in, even if that state-change doesn't correspond to a page-load.
Modern AJAX applications therefore often use it to signify that something has happened that the user might want to "go back" from.

Nope, it is not possible.
What you have seen is just decoration, to reflect an AJAX call in the address bar.
No matter which language you choose, all of them sit on the server side and communicate with the browser over HTTP, and the anchor (the part after the #) is not sent in HTTP requests. That's a completely client-side thing.

You are running into confusion in your search results because the term hash is overloaded, as is the concept of parameter passing.
You are seeing references to the concept of passing values in a hash because an associative array is called a hash in some languages (which is short for a hash table).
The # character is also confusingly named. It is called "hash", "pound", "number" and "octothorpe". Since I grew up in the US, I call it a "pound sign" in my head, which is likely annoying to users of more British English, and is no less fraught with potential for confusion (consider "£").
Passing function arguments in a hash in Perl is a nice way to get named arguments for a routine. PHP historically had positional arguments only (named arguments arrived in PHP 8.0), but using an array works nicely there.
Many web libraries use a hash/associative array type structure for form values. Keys are typically the field id, and values are the field values.
In a URI the # denotes the start of the fragment specifier. It identifies a part of the page that the URI points to. It is generally not used to pass request information from the client back to the server.
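A small illustration of the difference, using PHP's parse_url() on a URL string; the example URL is made up. PHP can split a fragment out of a string it is given, but the browser keeps that part to itself, so a script never receives it through the request.

```php
<?php
// parse_url() splits a URL string into components; the fragment is one of
// them, but a browser never sends it in the HTTP request.
$url = 'http://example.com/search?q=php#results';

$query    = parse_url($url, PHP_URL_QUERY);     // "q=php"
$fragment = parse_url($url, PHP_URL_FRAGMENT);  // "results"

echo "query: $query, fragment: $fragment";
```

To get a fragment value to the server, client-side JavaScript has to read it and send it along, typically in an AJAX request.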

There is probably some server-side rewriting going on.
For example, with an Apache server you can handle a URI like
http://www.mysite.com#something
and rewrite it as
http://www.mysite.com/perl/script.pl?data=something
and so process it as a simple GET query to your script.pl.
This is all server-side processing, invisible to the client.

Best way to get data from "clean" and "dirty" URLs

I'm writing an application that gets data from URLs, but I want to make it an option whether the user uses "clean" URLs (ex: http://example.com/hello/world) or "dirty" URLs (ex: http://example.com/?action=hello&sub=world).
What would be the best way to get variables from both URL schemes?

If your mod_rewrite has a rule like the following:
RewriteRule ^hello/world /?action=hello&sub=world [NC,L]
or, the more generalised:
# Assuming only letters in action & sub ([NC] makes the match case-insensitive)
RewriteRule ^([a-z]+)/([a-z]+) /?action=$1&sub=$2 [NC,L]
then the same PHP script is being called, with the $_REQUEST variables available whichever way the user accesses the page (dirty or clean url).
We recently moved a large part of our site to clean urls (still supporting the older, "dirty" urls) and rules like the above meant we didn't have to rewrite any code that relied on $_REQUEST params, only the mod_rewrite rules.
Update
Mod_rewrite is an Apache module, but there are a number of options available for IIS also.
Whichever web server you decide to support, the mod_rewrite approach will likely result in the least amount of work for you. Without it, you'd likely have to create a load of files to mimic the structure of your clean urls, e.g. in your webserver root you'd create a directory hello, placing a file world into it, containing something like the following:
<?php
// Set the $_REQUEST params to mimic dirty url
$_REQUEST['action'] = 'hello';
$_REQUEST['sub'] = 'world';
// Include existing file so we don't need to re-do our logic
// (assuming index.php at web root)
include('../index.php');
As the number of parameters you wish to handle 'cleanly' increases, so will the number of directories and stub files you require, which will greatly increase your maintenance burden.
mod_rewrite is designed for exactly this sort of problem, and is now supported on IIS as well as Apache, so I'd strongly recommend going in that direction!

If your application is running on an Apache server, I would recommend the use of mod_rewrite.
Basically, you code your application to use "dirty" URLs inside. What I mean by this is that you can still use the "clean" URLs in the templates and such, but you use the "dirty" version when parsing the URL. For example, if your real, "dirty" URL is www.domain.com/index.php?a=1&b=2, inside your code you are still going to use $_GET['a'] and $_GET['b']. Then, with the power of mod_rewrite, just make URLs like www.domain.com/1/2/ point to the "dirty" URL. (This is just an example of how things can be done.)

Sounds like a pain, but I guess you could create a function that analyzes the URL for every request. First determine whether it's a "dirty" or a "clean" URL. For that, I would first look for the presence of the question mark character and proceed from there (additional checking will obviously be required). For the "dirty" URL, use PHP's normal GET retrieval ($_GET['variable_name']). For the "clean" one, I would use regular expressions. That would be the most flexible (and efficient) way to parse the URL and extract potential variables.

A quick and dirty way might be to simply check for GET variables. If there are any, it's dirty, if not, it's clean. Of course this depends on what exactly you mean by dirty URLs.
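A sketch of that idea: look for the query parameter first, and fall back to the path segments. The parameter names action and sub come from the question; the function name is hypothetical, and for clean URLs the server still has to route every request to this one script (via rewriting or similar).

```php
<?php
// Sketch: return ['action' => ..., 'sub' => ...] from either URL style.
// The two parameter names come from the question; the function is hypothetical.
function extractParams(string $uri): array
{
    $query = (string) parse_url($uri, PHP_URL_QUERY);
    parse_str($query, $params);
    if (isset($params['action'])) {   // "dirty" style: ?action=..&sub=..
        return ['action' => $params['action'], 'sub' => $params['sub'] ?? null];
    }
    // "clean" style: /hello/world
    $path  = trim((string) parse_url($uri, PHP_URL_PATH), '/');
    $parts = $path === '' ? [] : explode('/', $path);
    return ['action' => $parts[0] ?? null, 'sub' => $parts[1] ?? null];
}

// Both styles yield the same array, so the rest of the code need not care.
var_dump(extractParams('/hello/world') === extractParams('/?action=hello&sub=world'));
```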
