Simplest way to detect client locale in PHP - php

I would like to be able to detect what country a visitor is from on my website, using PHP.
Please note that I'm not trying to use this as a security measure or for anything important, just changing the spelling of some words (Americans seem to believe that the word "enrolment" has 2 Ls.... crazy yanks), and perhaps to give a default option in a "Select your country" list.
As such, using a Geolocation database is a tad over-the-top and I really don't want to muck about with installing new PHP libraries just for this, so what's the easiest/simplest way to find what country a visitor is from?

Since PHP 5.3.0, the intl extension provides a function to parse the $_SERVER['HTTP_ACCEPT_LANGUAGE'] variable into a locale.
$locale = Locale::acceptFromHttp($_SERVER['HTTP_ACCEPT_LANGUAGE']);
echo $locale; // returns "en_US"
Documentation: https://www.php.net/manual/en/locale.acceptfromhttp.php

It's not guaranteed, but most browsers send an Accept-Language HTTP header that specifies en-us if the user is in the US. Some older browsers only sent en, though, and not all machines are set up correctly to indicate which locale they prefer. But it's a good first guess.
UK-based English users usually set their system or user locale to English (UK), which in default browser configurations should result in en-gb in the Accept-Language header. (An earlier version of this said en-uk; that was a typo, sorry.) Other countries also have en locales, such as en-za (South Africa), and, mostly in theory, combinations like en-jp are also possible.
Geo-IP-based guesses are less likely to be correct about the preferred language/locale, however. Google thinks that content negotiation based on IP address geolocation makes sense, which really annoys me when I'm in Japan or Korea...

You can check out the HTTP_ACCEPT_LANGUAGE header (from $_SERVER) that most browsers will send.
Take a look at Zend_Locale for an example, or maybe you might even want to use the lib.

You can do some IP comparison without needing a whole library.
Solution #1
Use a web API, so nothing needs to be installed on your side. This one tells you the country:
Example: http://api.hostip.info/get_html.php?ip=12.215.42.19
Returns: Country: UNITED STATES (US)
Solution #2
But have you thought about using the browser's language setting? You might be able to tell the variety of English from it.
Solution #3
The BlockCountry website provides lists of IP addresses by country. Of course, you do not want to block anyone, but you can take the list and compare visitor IPs against it (get all US IPs...). This might not be accurate...

Given your stated purpose, the Accept-Language header is a more suitable solution than IP-based geolocation. Indeed, it's precisely the intended purpose of Accept-Language.

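I use the HTTP_ACCEPT_LANGUAGE header:
// The header is optional, so fall back to an empty string when it is absent.
$localePreferences = explode(',', $_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? '');
// Browsers normally list the most preferred entry first; drop any ";q=..." weight.
$browserLocale = trim(explode(';', $localePreferences[0])[0]);
if ($browserLocale !== '') {
    $_SESSION['browser_locale'] = $browserLocale;
}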

Parse $_SERVER["HTTP_ACCEPT_LANGUAGE"] to get country and browser's locale.
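A minimal sketch of that parsing, assuming only core PHP 7+ (no intl extension). The helper names parseAcceptLanguage and guessCountry are made up for this example, and the region pattern is a simplification that only recognises tags of the form language-REGION (so something like zh-Hant-TW is ignored):

```php
<?php
/**
 * Parse an Accept-Language header into language tags ordered by preference.
 * Hypothetical helper, not part of PHP.
 */
function parseAcceptLanguage(string $header): array
{
    $weighted = [];
    foreach (explode(',', $header) as $index => $part) {
        $pieces = explode(';', trim($part));
        $tag = trim($pieces[0]);
        if ($tag === '' || $tag === '*') {
            continue;
        }
        $q = 1.0; // a tag without an explicit weight counts as q=1
        foreach (array_slice($pieces, 1) as $param) {
            if (preg_match('/^\s*q\s*=\s*([0-9.]+)\s*$/i', $param, $m)) {
                $q = (float) $m[1];
            }
        }
        if ($q <= 0) {
            continue; // q=0 means "not acceptable"
        }
        $weighted[] = ['tag' => $tag, 'q' => $q, 'pos' => $index];
    }
    // Highest weight first; ties keep the order the browser sent.
    usort($weighted, function ($a, $b) {
        return ($b['q'] <=> $a['q']) ?: ($a['pos'] <=> $b['pos']);
    });
    return array_column($weighted, 'tag');
}

/**
 * Return the two-letter region of the most preferred tag that has one,
 * or $default if no tag carries a region (for example a bare "en").
 */
function guessCountry(string $header, string $default = ''): string
{
    foreach (parseAcceptLanguage($header) as $tag) {
        if (preg_match('/^[a-z]{2,3}[-_]([a-z]{2})$/i', $tag, $m)) {
            return strtoupper($m[1]);
        }
    }
    return $default;
}

// The header is optional; the 'GB' default here is an arbitrary choice.
$country = guessCountry($_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? '', 'GB');
```

The region in the header is a language preference, not a location, so it is only good for things like spelling or a preselected list entry.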

To identify a visitor's country I've used the GeoIP extension; it's very simple to use.

The http://countries.nerd.dk service is what I use for IP-to-country mapping. It works really well and being based on DNS, is cached well too.
You can also download the database for local use if you don't want to rely on an external service.

Or you can do the following:
download 'geoip.dat' and geoip.inc from http://www.maxmind.com/app/geoip_country
in geoip.inc header you will find how to use it (eg. initialize and the rest...)

The GeoIP extension is a good choice.

What the viewer wants is one thing; what you can serve is another:
$SystemLocales = array_filter(explode("\n", (string) shell_exec('locale -a')));
$BrowserLocales = explode(",", str_replace("-", "_", $_SERVER["HTTP_ACCEPT_LANGUAGE"] ?? "")); // browsers use en-US, Linux uses en_US
foreach ($BrowserLocales as $BrowserLocale) {
    $BrowserLocale = trim(explode(";", $BrowserLocale)[0]); // strip the weight, e.g. "en;q=0.8" becomes "en"
    if ($BrowserLocale === "" || $BrowserLocale === "*") continue;
    foreach ($SystemLocales as $SystemLocale) {
        if (strpos($SystemLocale, $BrowserLocale) === 0) { // system locale starts with the browser's tag
            setlocale(LC_ALL, $SystemLocale);
            break 2; // found and set, so no more checks are needed
        }
    }
}
For example, my system provides only:
C
POSIX
pl_PL.UTF-8
and my browser languages are: pl, en-US, en => so the only correct locale is pl_PL.UTF-8.
When no successful comparison is found - there's no setlocale at all.

Related

Coordinates: convert from WGS84 to a *specific* UTM-region

There are a lot of free tools out there to convert from UTM to Lat/Long. Fine enough, but I need to go the other way: from WGS-84 lat/long to UTM.
But it's more complicated than that, because I need the result to be in the UTM-33 (Nordic) zone.
This might sound like a bad idea; why would I want to "force" the zone to 33N when the geographical point might lie in another zone ...
Well, the thing is that I already have a database with UTM33 coordinates of every address in Norway.
Those of you who are familiar with UTM in Northern Europe know that Norway spans several zones, from 31 to 36.
(Okay, maybe we only span from 32 to 36, because of the strange width of zone 32V, but that's another discussion.)
So, back to my problem: all my addresses are already given in UTM-33 format (with negative values when out of range). How can I proceed to get my Lat/Long into UTM-33?
I need a solution in PHP, and after a lot of debugging with "gPoint", I found out it just won't work ...
(gPoint is great for converting from/to UTM, but it always returns the UTM x/y pair in the "correct" zone block. I don't want that! I need to always get results in zone 33, regardless of which zone is actually correct.)
I finally found a solution through Proj4 (thanks, SlightlyCuban)!
// Conversion from WGS84 to UTM33
$proj4 = new Proj4php();
$projWGS84 = new Proj4phpProj('EPSG:4326');
$projUTM33N = new Proj4phpProj('EPSG:2078');
$result = $proj4->transform($projWGS84, $projUTM33N, new proj4phpPoint($long, $lat));
It is impossible to guess the correct definition for the transformation, and I struggled for a while ... then I discovered a really handy webpage: http://spatialreference.org. There I found the definition I needed (EPSG:2078 = "+proj=utm +zone=33 +ellps=intl +units=m +no_defs").
The PHP implementation of Proj4 needs hard-coded definitions; you cannot pass in a definition ad hoc. EPSG:2078 was originally missing from Proj4php. Yay; lucky me!
I would therefore recommend testing Proj4 via Node.js and proj4js first (see my little demo at http://pastebin.com/1BP8cWpj).
To succeed, I had to fork the Proj4PHP library and add a definition file for EPSG:2078 ...
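For anyone who wants to see what "forcing" the zone amounts to without a library: the only zone-specific input to the projection is the central meridian (15°E for zone 33). Below is a rough sketch in plain PHP using the Krüger series truncated at third order. It is an illustration, not what Proj4php does internally. The function name is made up, it assumes the northern hemisphere, and it assumes the WGS84 ellipsoid, whereas the EPSG:2078 definition quoted above uses +ellps=intl, so its numbers will not match that definition exactly (WGS 84 / UTM zone 33N is normally registered as EPSG:32633).

```php
<?php
/**
 * Project latitude/longitude (degrees) to UTM easting/northing in a
 * caller-chosen zone, northern hemisphere, WGS84 ellipsoid (assumption).
 * Hypothetical helper for illustration; not valid at the poles.
 */
function latLonToUtmForcedZone(float $lat, float $lon, int $zone = 33): array
{
    $a  = 6378137.0;          // WGS84 semi-major axis in metres
    $f  = 1 / 298.257223563;  // WGS84 flattening
    $k0 = 0.9996;             // UTM scale factor on the central meridian
    $e0 = 500000.0;           // false easting

    // Central meridian of the forced zone: zone 33 gives 15 degrees east.
    $lon0 = deg2rad($zone * 6 - 183);

    $n = $f / (2 - $f);
    $A = $a / (1 + $n) * (1 + $n ** 2 / 4 + $n ** 4 / 64);
    $alpha = [
        1 => $n / 2 - 2 * $n ** 2 / 3 + 5 * $n ** 3 / 16,
        2 => 13 * $n ** 2 / 48 - 3 * $n ** 3 / 5,
        3 => 61 * $n ** 3 / 240,
    ];

    $phi = deg2rad($lat);
    $dLambda = deg2rad($lon) - $lon0; // may be far outside the usual +/- 3 degrees

    $c   = 2 * sqrt($n) / (1 + $n);
    $t   = sinh(atanh(sin($phi)) - $c * atanh($c * sin($phi)));
    $xi  = atan2($t, cos($dLambda));
    $eta = atanh(sin($dLambda) / sqrt(1 + $t * $t));

    $east  = $eta;
    $north = $xi;
    foreach ($alpha as $j => $aj) {
        $east  += $aj * cos(2 * $j * $xi) * sinh(2 * $j * $eta);
        $north += $aj * sin(2 * $j * $xi) * cosh(2 * $j * $eta);
    }

    return [
        'easting'  => $e0 + $k0 * $A * $east,
        'northing' => $k0 * $A * $north,
    ];
}

// A point near Bergen would normally fall in zone 32; here zone 33 is forced.
$point = latLonToUtmForcedZone(60.39, 5.32, 33);
```

Points far west of 15°E come out with eastings below 500000, and negative once they are far enough away, which matches the "negative values when out of range" described in the question. Accuracy degrades the further a point lies from the central meridian.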

PHP - detect text direction (ltr/rtl) from UA string or external service, is it possible?

I would like to detect not only the client language (which I already do using a combination of different methods, from UA string extraction to geolocation services) but I would also like to detect the text direction in an automatic fashion, if possible.
I know there aren't too many languages using right-to-left direction (not as many as left-to-right ones, at least), so a possible solution would be to do something like $rtl = ['ar', 'he', ..., '<whatever>']; if (in_array(substr($_SERVER['HTTP_ACCEPT_LANGUAGE'], 0, 2), $rtl)) { $direction = "rtl"; } else { $direction = "ltr"; } but looks to me like there's (probably) a better solution.
I'm still studying some of the language recognition APIs out there like LangID, AlchemyAPI and DetectLanguage, but they all seem to do the same thing: recognize the text language, but not the text direction.
Any recommended approach?
The information about the direction of a language is accessible through the ICU library.
Using the cosmopolitan package it could be as simple as below.
<?php
require __DIR__. "/vendor/autoload.php";
use Salarmehr\Cosmopolitan\Cosmo;
echo Cosmo::create('fa')->direction(); // rtl
echo Cosmo::create('en')->direction(); // ltr
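If pulling in a package is not an option, the list-based idea from the question can be tightened a little. This is only a sketch: the function name is made up, and the language and script lists are illustrative rather than exhaustive. Unlike a plain substr() check, it also honours an explicit script subtag such as az-Arab.

```php
<?php
/**
 * Guess the text direction for a BCP 47 style language tag.
 * Hypothetical helper; the lists below are not exhaustive.
 */
function textDirection(string $tag): string
{
    $rtlLanguages = ['ar', 'dv', 'fa', 'he', 'ps', 'sd', 'ug', 'ur', 'yi'];
    $rtlScripts   = ['arab', 'hebr', 'thaa', 'syrc', 'nkoo'];

    $subtags = preg_split('/[-_]/', strtolower(trim($tag)));

    // An explicit four-letter script subtag overrides the language default,
    // e.g. "az-Arab" is right-to-left even though "az" is not in the list.
    foreach (array_slice($subtags, 1) as $subtag) {
        if (strlen($subtag) === 4 && ctype_alpha($subtag)) {
            return in_array($subtag, $rtlScripts, true) ? 'rtl' : 'ltr';
        }
    }
    return in_array($subtags[0], $rtlLanguages, true) ? 'rtl' : 'ltr';
}

echo textDirection('fa-IR'), "\n";
echo textDirection('en-US'), "\n";
```

The ICU data behind the package above is the more complete source; a hand-kept list like this needs maintaining as languages are added.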

node.js hmac.digest() output seems wrong

I'm trying to implement a Facebook library with node.js, and the request signing isn't working. I have the PHP example seen here translated into node. I'm trying it out with the example given there, where the secret is the string "secret". My code looks like this:
var signedRequest = request.signed_request.split('.');
var sig = b64url.decode(signedRequest[0]);
var expected = crypto.createHmac('sha256', 'secret').update(signedRequest[1]).digest();
console.log(sig == expected); // false
I can't console.log the decoded strings themselves, because they have special characters that cause the console to clear (if you have a suggestion to get around that please let me know) but I can output the b64url encodings of them.
The expected encoded sig, as you can see on the FB documentation, is
vlXgu64BQGFSQrY0ZcJBZASMvYvTHu9GQ0YM9rjPSso
My expected value, when encoded, is
wr5Vw6DCu8KuAUBhUkLCtjRlw4JBZATCjMK9wovDkx7Dr0ZDRgzDtsK4w49Kw4o
So why do I think it's digest that's wrong? Maybe the error is on my side? Well, if I execute the exact example in PHP given in the documentation, the correct result comes out. But if I change the hash_hmac call so the last parameter is false, outputting hex, I get
YmU1NWUwYmJhZTAxNDA2MTUyNDJiNjM0NjVjMjQxNjQwNDhjYmQ4YmQzMWVlZjQ2NDM0NjBjZjZiOGNmNGFjYQ==
Now, if I go back to my javascript code, and change my hmac code to .digest("hex") instead of the default "binary" and log the base64 encoding of the result, I get... surprise!
YmU1NWUwYmJhZTAxNDA2MTUyNDJiNjM0NjVjMjQxNjQwNDhjYmQ4YmQzMWVlZjQ2NDM0NjBjZjZiOGNmNGFjYQ
Same, except the == signs are missing off the end, but I think that's a console thing. I can't imagine that being the issue, without them it's not even a valid base64 string length.
So, how come the digest method outputs the correct result when using hex, but the wrong answer when using binary? Is the binary not quite the same as the "raw" output of the PHP equivalent? And if that's the case what is the correct way to call it?
We have discovered that this was indeed a bug in the crypto lib, and was a known issue logged on github. We will have to upgrade and get the fix.
I am Tesserex's partner. I believe the answer may have been a combination of both Tesserex's self-posted answer and Juicy Scripter's answer. We were still using Node ver. 0.4.7. The bug Tesserex mentioned can be found here: https://github.com/joyent/node/issues/324. I'm not entirely certain that this bug affected us, but it seems a good possibility. We updated Node to ver. 0.6.5 and applied Juicy Scripter's solution, and everything is now working. Thank you.
As a note about the suggestion to use existing libraries: most of them require express, which is something we are trying to avoid due to some of the specifics of our application. The existing libraries also tend to assume that you're using node.js as a web server and answering a single user's request at a time. We are using persistent connections with websockets, and our Facebook client will be handling session data for multiple users simultaneously. Eventually I hope to make our Facebook client open source for use with applications like ours.
Actually there is no problem with digest. The result of b64url.decode is utf8-encoded by default (the encoding can be specified with the second parameter). If you use:
var sig = b64url.decode(signedRequest[0], 'binary');
var expected = crypto.createHmac('sha256', 'secret').update(signedRequest[1]).digest();
// sig === expected
signature and result of digest will be the same.
You may also check this by turning digest results into utf8 encoded string:
var sig = b64url.decode(signedRequest[0]);
var expected = crypto.createHmac('sha256', 'secret').update(signedRequest[1]).digest();
var expected_buffer = new Buffer(expected, 'binary'); // wrap the binary digest string computed above
// sig === expected_buffer.toString()
Also you may consider using existing libraries to do that kind of work (and probably more), to name a few:
facebook-wrapper
everyauth
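For comparison, here is roughly what the PHP side that the question tests against boils down to. This is a sketch, not Facebook's reference code, and the function names are made up. The point it illustrates is the same as in the answer above: compare raw bytes with raw bytes (or hex with hex), never a raw digest with a re-encoded string.

```php
<?php
/** Decode URL-safe base64 ("-" and "_" instead of "+" and "/", no padding). */
function base64UrlDecode(string $input): string
{
    $padded = str_pad($input, (int) (ceil(strlen($input) / 4) * 4), '=');
    return (string) base64_decode(strtr($padded, '-_', '+/'));
}

/** Encode raw bytes as URL-safe base64 without padding. */
function base64UrlEncode(string $bytes): string
{
    return rtrim(strtr(base64_encode($bytes), '+/', '-_'), '=');
}

/**
 * Verify a "signature.payload" string and return the decoded payload,
 * or null when the signature does not match. Hypothetical helper.
 */
function parseSignedRequest(string $signedRequest, string $secret): ?array
{
    $parts = explode('.', $signedRequest, 2);
    if (count($parts) !== 2) {
        return null;
    }
    [$encodedSig, $payload] = $parts;

    // The last argument "true" makes hash_hmac() return the binary digest
    // rather than a hex string, so both sides of the comparison are raw bytes.
    $expected = hash_hmac('sha256', $payload, $secret, true);
    if (!hash_equals($expected, base64UrlDecode($encodedSig))) {
        return null;
    }
    $data = json_decode(base64UrlDecode($payload), true);
    return is_array($data) ? $data : null;
}
```

The missing == padding mentioned in the question is expected with the URL-safe variant; base64UrlDecode() above puts it back before decoding.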

Calculating difference between username and email in javascript

For security reasons I want the users on my website not to be able to register a username that resembles their email address. Someone with the email address user@domain.com can't register as user or us.er, etc.
For example, I want this not to be possible:
tester -> tester@mydomain.com (wrong)
tes.ter -> tester@mydomain.com (wrong)
etc.
But I do want to be able to use the following:
tester6 -> tester@mydomain.com (good)
etc.
//edit
tester6 is wrong too. I meant user6 -> tester@mydomain.com (good).
Does anyone have an idea how to achieve this, or something as close as possible? I am checking this in JavaScript, and after that on the server in PHP.
Ciao!
PS: Maybe there is some jQuery plugin to do this; I can't find one so far. The downside of using a plugin for this, though, is that I have to implement the same thing in PHP. If it is a long plugin, it will take some time to translate.
//Edit again
If I only check the part before the @ they can still use userhotmailcom, or usergmail, etc. If they supply that, their email is obvious.
Typically, I use the Levenshtein distance algorithm to check whether a password looks like a login.
PHP has a native levenshtein function and here is one written in JavaScript.
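A sketch of how that could look on the PHP side, applied to the username/email case from the question. The function names and the threshold of 2 are arbitrary choices for illustration; for short names you would probably want the threshold to scale with the length.

```php
<?php
/** Lower-case a string and keep only letters and digits. */
function normalizeName(string $value): string
{
    return (string) preg_replace('/[^a-z0-9]/', '', strtolower($value));
}

/**
 * Return true when $username looks too much like $email.
 * Hypothetical helper; $maxDistance is an arbitrary threshold.
 */
function resemblesEmail(string $username, string $email, int $maxDistance = 2): bool
{
    $name  = normalizeName($username);
    $atPos = strpos($email, '@');
    $local = normalizeName($atPos === false ? $email : substr($email, 0, $atPos));

    if ($name === '' || $local === '') {
        return false;
    }
    // Catches "tester", "tes.ter" and near misses such as "tester6".
    if (levenshtein($name, $local) <= $maxDistance) {
        return true;
    }
    // Catches the local part with the domain glued on, e.g. "userhotmailcom".
    return strlen($local) >= 3 && strpos($name, $local) !== false;
}

var_dump(resemblesEmail('tes.ter', 'tester@mydomain.com'));
var_dump(resemblesEmail('user6', 'tester@mydomain.com'));
```

Because the same two building blocks (strip punctuation, then edit distance) exist in JavaScript as well, the client-side check can mirror this one fairly directly.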
Something like this?
var charsRe = /[.+]/g; // Add your characters here
if (username.replace(charsRe, '') == email.split('@')[0].replace(charsRe, ''))
    doError();
If all you want is to disallow user names that vary from the email address only with periods (.), you can remove periods from the user name and compare it with email address.
//I don't know PHP - translating this pseudocode won't be hard
$email = "someone@something.com";
$emailname = $email.substring(0, $email.indexOf('@'));
$uname = "som.e.on.e";
$uname = $uname.replace(/\./g, ""); // regex matching a '.' globally
if ($uname === $emailname)
    showInvalidNameErrorMessage();
Modified regex to prevent hyphens and underscores /[\-._]/g
Well, I am a newbie PHP developer. But the answer I have in mind is: wouldn't it be better to let them register only with their email address (which won't be shared with others), then ask for their first name and last name separately and show only their first name in public content (i.e. blogs, etc.)? I am not an expert in programming, so if I am wrong please correct me; I also couldn't quite understand what you mean by security here. Sorry for the bad English, I am not a native English speaker.

Checking browser's language by PHP?

How can I check the language of user's browser by PHP?
I need to show a different page for people in US and in UK.
I tried the following code unsuccessfully
<?php
if(ereg("us", $_SERVER["HTTP_ACCEPT_LANGUAGE"]))
include('http://page.com/us.txt');
else
include('http://page.com/uk.txt');
?>
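I want to run specific code for people in the US and different code for people in the UK.
Likely just a case-sensitivity issue; eregi('en-us', ...) or preg_match('/en-us/i', ...) should have picked it up. (Prefer preg_match: the ereg functions were deprecated in PHP 5.3 and removed in PHP 7.)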
However, just looking for ‘en-us’ in the header may get it wrong sometimes, in particular when both the US and UK languages are listed. “Accept-Language” is actually quite a complicated header, which really you'd want a proper parser for.
If you have the PECL HTTP extension, the whole job is already done for you: http://www.php.net/manual/en/function.http-negotiate-language.php
I don't know why the other answers are going for the User-Agent header; this is utterly bogus. User-Agent is not mandated to hold a language value in any particular place, and for some browsers (eg. Opera, and some minor browser I've never heard of called ‘Internet Explorer’) it will not contain one at all. Where it does contain a language, that'll be the language of the browser build that was installed, not the user's preferred language, which is what you should be looking at. (This setting will default to the build language, but can be customised by the user from the preferences UI.)
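If the intl extension is available instead, a sketch along these lines lets it do the header parsing and leaves you with only the US/UK decision. The function name and the 'uk' default are choices made for this example, and us.txt / uk.txt are assumed to be local copies of the two files from the question.

```php
<?php
/**
 * Pick 'us' or 'uk' from an Accept-Language header value.
 * Hypothetical helper; falls back to $default when the intl extension
 * is missing or the header gives no usable region.
 */
function pickEnglishVariant(?string $header, string $default = 'uk'): string
{
    if ($header === null || $header === '' || !class_exists('Locale')) {
        return $default;
    }
    $locale = Locale::acceptFromHttp($header); // e.g. "en_US", or false
    if ($locale === false) {
        return $default;
    }
    switch (Locale::getRegion($locale)) {
        case 'US':
            return 'us';
        case 'GB':
            return 'uk';
        default:
            return $default; // bare "en", other regions, other languages
    }
}

// Choose between two fixed local files rather than including a remote URL.
$variant = pickEnglishVariant($_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? null);
$page = __DIR__ . '/' . $variant . '.txt';
// include $page;
```

Since the return value is always one of two fixed strings, the header can never influence the include path beyond choosing between those two files.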
Try this:
<?
if(preg_match('/en-us/i', $_SERVER['HTTP_USER_AGENT']))
include('http://page.com/us.txt');
else
include('http://page.com/uk.txt');
?>
A probably more reliable way of doing this is to perform a regex on the $_SERVER['HTTP_USER_AGENT'] string.
<?php
if(preg_match('/en-US/', $_SERVER['HTTP_USER_AGENT']))
include('http://page.com/us.txt');
else
include('http://page.com/uk.txt');
?>
You are not guaranteed to get a valid and useful user-agent string, so make sure that the else statement contains a reasonable alternative.
This is a zend based solution. It will also work when you add other languages.
<?php
include_once "Zend/Locale.php";
$zend_locale = new Zend_Locale(Zend_Locale::BROWSER);
// returns en for English, de for German etc.
echo $browser_language = $zend_locale->getLanguage();
echo "<br />\n";
// returns en_US for American English, en_GB for British English etc.
echo $browser_locale = $zend_locale->toString();
echo "<br />\n";
Solution seen on:
http://www.mpopp.net/2010/07/how-to-detect-the-users-preferred-language-smarter-than-google/
