I am looking for a way to calculate the approximate distance between two UK postcodes (straight-line distance is good enough) for analysing data. Preferably easily accessible from Java, but C#, native C++ etc. are fine as well.
First, you need to translate the postcode into useful coordinates. For example, the Easting and Northing values from a Postcode lookup table, like the one from here: http://www.doogal.co.uk/UKPostcodes.php
These Easting and Northing are UK Ordnance Survey grid coordinates in metres from the OS map origin.
Convert them into kilometres by dividing by 1000.
Then use the simple Pythagoras triangle formula. Say the two points have Easting and Northing values (in kilometres) of E1, N1 and E2, N2.
Distance between them in kilometres = Square root of ( (E1-E2)^2 + (N1-N2)^2 )
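As a minimal sketch of the steps above in Python (the function name and the sample grid values are made up for illustration; real eastings and northings would come from the postcode lookup table):

```python
import math

def grid_distance_km(e1, n1, e2, n2):
    """Straight-line distance in km between two OS grid points given in metres."""
    # Pythagoras on the easting/northing differences, then metres -> kilometres.
    return math.hypot(e1 - e2, n1 - n2) / 1000.0

# Hypothetical eastings/northings in metres, chosen to form a 3-4-5 triangle.
print(grid_distance_km(530000, 180000, 533000, 184000))
```

Because the grid is already flat and metric, no trigonometry is needed; this only holds for points that both lie on the same OS grid.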
The following question answers the exact same question only in PHP specifically:
Using PHP and google Maps Api to work out distance between 2 post codes (UK)
It relies on a web service API though, so you should be able to call it from any basic REST client (I'm sure there are well-documented options in Java, C# and C++).
I have a group of users. The user count could be 50 or could be 2000. Each should have a long/lat that I have retrieved from Google Geo api.
I need to query them all, and group them by proximity and a certain count. Say the count is 12 and I have 120 users in the group. I want to group people by how close they are (long/lat) to other people. So that I wind up with 10 groups of people who are close in proximity.
I currently have the google geo coding api setup and would prefer to use that.
TIA.
-- Update
I have been googling about this for a while and it appears that I am looking for a spatial query that returns groups by proximity.
Keep in mind that this problem grows quadratically with every user you add, as the number of distance calculations is linked to the square of the number of users. It's actually N*(N-1) distances, so a 2000-user base would mean almost 4 million distance calculations on every pass (half that if you exploit symmetry). Just keep that in mind when sizing the resources you need.
Are you looking to group them based on straight-line (actually great circle) distance or based on walking/driving distance?
If the former, the great circle distance can be approximated with simple math if you're able to tolerate a small margin of error and wish to assume the earth is a sphere. From GCMAP.com:
Earth's hypothetical shape is called the geoid and is approximated by
an ellipsoid or an oblate sphereoid. A simpler model is to use a
sphere, which is pretty close and makes the math MUCH easier. Assuming
a sphere of radius 6371.2 km, convert longitude and latitude to
radians (multiply by pi/180) and then use the following formula:
theta = lon2 - lon1
dist = acos(sin(lat1) × sin(lat2) + cos(lat1) × cos(lat2) × cos(theta))
if (dist < 0) dist = dist + pi
dist = dist × 6371.2
The resulting distance is in kilometers.
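A rough Python sketch of the quoted formula, for anyone who wants to try it. The function name is my own, and the clamp before `acos` is an addition that is not in the quote: it guards against rounding pushing the cosine slightly outside [-1, 1] for identical or antipodal points.

```python
import math

EARTH_RADIUS_KM = 6371.2  # spherical radius used in the quoted formula

def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (spherical law of cosines), inputs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    cos_angle = (math.sin(lat1) * math.sin(lat2)
                 + math.cos(lat1) * math.cos(lat2) * math.cos(lon2 - lon1))
    # Clamp to the valid acos domain to absorb floating-point rounding.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.acos(cos_angle) * EARTH_RADIUS_KM

# A quarter of the way round the equator.
print(great_circle_km(0.0, 0.0, 0.0, 90.0))
```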
Now, if you need precise calculations and are willing to spend the CPU cycles needed for much more complex math, you can use Vincenty's formulae, which use the WGS-84 reference ellipsoid model of the earth that is used for navigation, mapping and whatnot. More info HERE
As to the algorithm itself, you need to build a to-from matrix with the result of each calculation. Each row and column would represent each node. Two simplifications you may consider:
Distance does not depend on direction of travel, so $dist[n][m] == $dist[m][n] (no need to calculate the whole matrix, just half of it)
Distance from a node to itself is always 0, so no need to calculate it, but since you're intending to group by proximity, to avoid a user being grouped with itself, you may want to always force $dist[m][m] to an arbitrarily defined and abnormally large constant ($dist[m][m] = 22000 (miles) for instance. Will work as long as all your users are on the planet)
After making all the calculations, use an array sorting method to find the X closest nodes to each node and there you have it
(you may or may not want to prevent a user being grouped on more than one group, but that's just business logic)
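To make the matrix idea concrete, here is a small Python sketch. All names and the sample coordinates are illustrative assumptions. It uses the haversine form of the great-circle distance rather than the acos formula quoted above, fills only the upper half of the matrix and mirrors it, and puts infinity on the diagonal instead of an arbitrary large constant.

```python
import math

def haversine_km(p, q, radius_km=6371.0):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

def distance_matrix(points):
    """Symmetric to-from matrix; the diagonal stays at infinity."""
    n = len(points)
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):  # only half the matrix is computed
            dist[i][j] = dist[j][i] = haversine_km(points[i], points[j])
    return dist

def nearest(dist, i, k):
    """Indices of the k nodes closest to node i, excluding i itself."""
    others = [j for j in range(len(dist)) if j != i]
    return sorted(others, key=lambda j: dist[i][j])[:k]

# Hypothetical users: two in London, one in Edinburgh, one in Cambridge.
users = [(51.5074, -0.1278), (51.5155, -0.0922), (55.9533, -3.1883), (52.2053, 0.1218)]
dist = distance_matrix(users)
print(nearest(dist, 0, 2))  # indices of the two users closest to user 0
```

This only finds each user's nearest neighbours; deciding how to turn those lists into non-overlapping groups of a fixed size is the business logic mentioned above.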
Actual code would be a little too much to provide at this time without seeing some of your progress first, but this is basically what you need to do algorithmically.
... it appears that I am looking for a spatial query that returns groups by proximity. ...
You could use hdbscan. Your groups are actually clusters in hdbscan wording. You would need to work with min_cluster_size and min_samples to get your groups right.
https://hdbscan.readthedocs.io/en/latest/parameter_selection.html
https://hdbscan.readthedocs.io/en/latest/
It appears that hdbscan runs under Python.
Here are two links on how to call Python from PHP:
Calling Python in PHP,
Running a Python script from PHP
Here is some more information on which clustering algorithm to choose:
http://nbviewer.jupyter.org/github/scikit-learn-contrib/hdbscan/blob/master/notebooks/Comparing%20Clustering%20Algorithms.ipynb
http://scikit-learn.org/stable/modules/clustering.html#clustering
Use GeoHash algorithm[1]. There is a PHP implementation[2]. You may pre-calculate geohashes with different precision, store them in SQL database alongside lat-lon values and query using native GROUP BY.
https://en.wikipedia.org/wiki/Geohash
https://github.com/lvht/geohash
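For illustration, a self-contained Python sketch of the standard geohash encoding and of grouping by hash prefix (this is not the linked PHP library; the function names and sample points are my own). In SQL the equivalent would be storing the hash in a column and grouping on it.

```python
from collections import defaultdict

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)

def geohash_encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair (degrees) as a geohash string."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    chars, bits, bit_count = [], 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid   # keep the upper half of the interval
        else:
            bits <<= 1
            rng[1] = mid   # keep the lower half of the interval
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)

def group_by_geohash(points, precision):
    """Group {name: (lat, lon)} entries by their geohash cell."""
    groups = defaultdict(list)
    for name, (lat, lon) in points.items():
        groups[geohash_encode(lat, lon, precision)].append(name)
    return dict(groups)

print(geohash_encode(57.64911, 10.40744, 11))
```

One caveat: two points can be very close and still fall on opposite sides of a cell boundary, so prefix grouping gives approximate neighbourhoods, not guaranteed nearest neighbours.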
I am designing a recruitment database, and I need it to perform several tasks that all involve calculating distance:
1) calculate the distance between where the candidate lives and the client;
2) find the clients within a radius of the candidates available for work on any given day;
3) count the candidates with the correct qualifications for the vacancy within a radius of the client.
As you can see, I need to calculate distance in 2 main ways: 1) radius, 2) as the crow flies. I would prefer exact distance, but the first will do.
I know that I can integrate Google Maps or some other web-based mapping, but I want the system to be stand-alone so it can function without an internet connection.
The system will have an HTML5 front end, and the back end is in MySQL and PHP.
thank you biagio
The distance between 2 points could be calculated with the following formula :
6371*acos(cos(LatitudeA)*cos(LatitudeB)*cos(longitudeB-longitudeA)+sin(LatitudeA)*sin(latitudeB))
Of course it's a "crow flies" approximation in Km.
Which can be translated to PHP as:
$longA = 2.3458*(M_PI/180); // M_PI is a php constant
$latA = 48.8608*(M_PI/180);
$longB = 5.0356*(M_PI/180);
$latB = 47.3225*(M_PI/180);
$subBA = $longB - $longA; // plain float subtraction; bcsub() expects numeric strings, not floats
$cosLatA = cos($latA);
$cosLatB = cos($latB);
$sinLatA = sin($latA);
$sinLatB = sin($latB);
$distance = 6371*acos($cosLatA*$cosLatB*cos($subBA)+$sinLatA*$sinLatB);
echo $distance ;
With that you can compute the distance between two points (people) and of course determine whether one point is within a given radius of another.
Just a thought: If your clients and candidates live in a very limited area, I would use a rectangular map and enter x and y values for all cities in that area in a database.
Due to the pythagorean theorem, you can say:
distance² = (x-value client - x-value candidate)² + (y-value client - y-value candidate)²
First, isn't the radius actually the 'as crow flies' distance from the client to the candidates? Anyway...you will need to figure out an algorithm for computing what you need based on the distance between the points.
If you want the system to be stand-alone you will need to have the values for the coordinates (latitude, longitude) of the points. For this, you will probably need an internet connection and the use of a service like Google Maps to find the coordinates based on the addresses of the points. But once you store them in your db, you will not need an internet connection.
There is a formula for computing the distance between two points based on the coordinates, you will easily find it :)
Our database currently stores 2 values - the longitude and the latitude. E.g.:
Longitude: -0.310150 (W)
Latitude: 52.688930 (N)
What we would like to do now is convert these values into the Ordnance Survey Grid Reference (OSGR) for British locations. Is there an easy way to do this in PHP?
The formula for this conversion is pretty complex because OSGR and GPS use different datums and projections.
This page provides an example in C that should be easy enough to rewrite in PHP.
This question already has answers here:
MySQL Great Circle Distance (Haversine formula)
Each user in my db is associated with a city (with its longitude and latitude).
How would I go about finding out which cities are close to one another?
i.e. in England, Cambridge is fairly close to London.
So if I have a user who lives in Cambridge, users close to them would be users living in nearby cities, such as London, Hertford etc.
Any ideas how I could go about this? And also, how would I define what is close? i.e. in the UK close would be much closer than if it were in the US as the US is far more spread out.
Ideas and suggestions. Also, do you know any services that provide this sort of functionality?
Thanks
If you can call an external web service, you can use the GeoNames API for locating nearby cities within some radius that you define:
http://www.geonames.org/export/web-services.html
Getting coordinates from city names is called geocoding (reverse geocoding is the opposite direction, from coordinates to a place name). Google Maps has a nice API for that.
There is also the GeoNames project, where you get huge databases of cities, zip codes etc. and their coordinates.
However, if you already have the coordinates, it's a simple calculation to get the distance.
The tricky thing is to get a nice performant version of it. You probably have it stored in a mysql database, so you need to do it there and fast.
It is absolutely possible. I once did a project including that code, I will fetch it and post it here.
However, to speed things up I would recommend first doing a rectangular selection around the center coordinates. This is very, very fast using B-tree indexes, or even better with something like a multidimensional range search. Inside that selection you can then calculate the exact distances on a limited set of data.
Outside that rectangular selection the distances are so large that they do not need to be displayed or calculated so accurately. Or just display the country, continent or something like that.
I am still at the office, but when I get home I can fetch the code for you. In the meantime it would be good if you could tell me how you store your data.
Edit: in the meantime, here is a function which looks right to me (I did it without a function, in one query...)
CREATE FUNCTION `get_distance_between_geo_locations`(`lat1` FLOAT, `long1` FLOAT, `lat2` FLOAT, `long2` FLOAT)
RETURNS FLOAT
LANGUAGE SQL
DETERMINISTIC
CONTAINS SQL
SQL SECURITY DEFINER
COMMENT ''
BEGIN
DECLARE distance FLOAT DEFAULT -1;
DECLARE earthRadius FLOAT DEFAULT 6371.009;
-- 3958.761 --miles
-- 6371.009 --km
DECLARE axis FLOAT;
IF ((lat1 IS NOT NULL) AND (long1 IS NOT NULL) AND (lat2 IS NOT NULL) AND (long2 IS NOT NULL)) THEN -- bit of protection against bad data
SET axis = (SIN(RADIANS(lat2-lat1)/2) * SIN(RADIANS(lat2-lat1)/2) + COS(RADIANS(lat1)) * COS(RADIANS(lat2)) * SIN(RADIANS(long2-long1)/2) * SIN(RADIANS(long2-long1)/2));
SET distance = earthRadius * (2 * ATAN2(SQRT(axis), SQRT(1-axis)));
END IF;
RETURN distance;
END;
I quoted this from here: http://sebastian-bauer.ws/en/2010/12/12/geo-koordinaten-mysql-funktion-zur-berechnung-des-abstands.html
and here is another link: http://www.andrewseward.co.uk/2010/04/sql-function-to-calculate-distance.html
The simplest way to do this would be to calculate a bounding box from the latitude and longitude of the city and a distance (by converting the distance to degrees of latitude and degrees of longitude).
Once you have that box (min latitude, max latitude, min longitude, max longitude), query for other cities whose latitude and longitude are inside the bounding box. This will get you an approximate list, and should be quite fast as it will be able to use any indexes you might have on the latitude and longitude columns.
From there you can narrow the list down if desired using a real "distance between points on a sphere" function.
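A hedged Python sketch of this two-step approach (bounding box first, exact distance second). The function names, the 6371 km radius and the sample cities are assumptions for illustration; in a real system the box test would be the indexed WHERE clause and only the survivors would reach the distance function. The longitude half-width uses the spherical formula asin(sin(r) / cos(lat)), which breaks down close to the poles and ignores the antimeridian.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius for a spherical earth

def bounding_box(lat, lon, radius_km):
    """Return (min_lat, max_lat, min_lon, max_lon) in degrees around a point."""
    ang = radius_km / EARTH_RADIUS_KM          # angular radius in radians
    dlat = math.degrees(ang)
    # A degree of longitude shrinks with latitude, so the box widens in degrees.
    dlon = math.degrees(math.asin(math.sin(ang) / math.cos(math.radians(lat))))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def cities_within(cities, lat, lon, radius_km):
    """Names of cities within radius_km: cheap box test, then exact distance."""
    min_lat, max_lat, min_lon, max_lon = bounding_box(lat, lon, radius_km)
    result = []
    for name, (clat, clon) in cities.items():
        if min_lat <= clat <= max_lat and min_lon <= clon <= max_lon:
            if haversine_km(lat, lon, clat, clon) <= radius_km:
                result.append(name)
    return result

# Approximate coordinates, searching around Cambridge (52.2053, 0.1218).
cities = {"London": (51.5074, -0.1278), "Hertford": (51.7959, -0.0784),
          "Edinburgh": (55.9533, -3.1883)}
print(cities_within(cities, 52.2053, 0.1218, 100))
```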
You need a spatial index or GIS functionality. What database are you using? MySQL and PostgreSQL (through the PostGIS extension) both have GIS support, which would allow you to find the N nearest cities using an SQL query.
Another option you might want to consider would be to put all of the cities into a spatial search tree like a kd-tree. Kd-trees efficiently support nearest-neighbor searches, as well as fast searches for all points in a given bounding box. You could then find nearby cities by searching for a few of the city's nearest neighbors, then using the distance to those neighbors to get an estimate size for a bounding box to search in.
In my database I have a list of places and for each I have a street name and number, postcode, city and county. Some of them have a latitude and longitude location.
And I have the geo location of the city centre for example. I would like to display only the places that are within X miles of the city centre on a google map.
In case this needs a geo location for each of my places to work, I could perhaps set up a script that uses the Google Maps geocoding API to get a geo location for all my places and update the database with the lat/lng. Then I would have a database full of lat and long locations to work from.
Once all the places have a lat/lng then maybe mysql can return the within range addresses?
This is not hard once you have lat / long data, and if somebody gives you the great circle distance formula in MySQL format.
@maggie gave a good reference: How to efficiently find the closest locations nearby a given location
Indexing strategy: Keep in mind that one minute of latitude (1/60 degree) is one nautical mile, or 1.1508 statute miles (approximately) all over the world, so one degree of latitude is about 69 statute miles. So index your latitude column and do your search like this, with myradius in statute miles. (If you're in the part of the world that uses km, you can convert; sorry for the Old-British-Empire-Centric answer, but they did define the nautical mile.)
WHERE column.lat BETWEEN mylat-(myradius/69.0) AND mylat+(myradius/69.0)
AND (the big distance formula) <= myradius
This will give you both decent data base indexing AND reasonably accurate distance circles.
One extra refinement: You can index longitude too. The trouble is that ground distance isn't directly related to longitude. At the equator it is one nautical mile per minute, but it gets smaller in proportion to the cosine of the latitude, and at the poles there are singularities. So a degree of longitude spans about 69*cos(latitude) statute miles, and you can add another term to your WHERE that allows for this. It gives correct results but isn't as selective as latitude indexing. But it still helps the indexing lookup, especially if you have lots of rows to sift through. So you get (with myradius in statute miles):
WHERE column.lat BETWEEN mylat-(myradius/69.0) AND mylat+(myradius/69.0)
AND column.lon BETWEEN mylon-(myradius/(69.0*COS(RADIANS(mylat)))) AND mylon+(myradius/(69.0*COS(RADIANS(mylat))))
AND (the big distance formula) <= myradius
Most likely you want to use a space-filling curve or a spatial index to reduce your 2D problem to a 1D problem. For example, you can combine the lat/long pair into a single key with a Z-curve or a Hilbert curve. I use a Hilbert curve myself to search for postcodes. You can find my solution at phpclasses.org ( hilbert-curve ).
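To give a feel for the idea, here is a small Python sketch of the simpler of the two curves, the Z-curve (Morton order). This is not the Hilbert-curve class mentioned above; the function names and the 16-bit quantisation are assumptions made for the example.

```python
def interleave_bits(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into one Morton (Z-order) key."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)       # x takes the even bit positions
        z |= ((y >> i) & 1) << (2 * i + 1)   # y takes the odd bit positions
    return z

def z_order_key(lat, lon, bits=16):
    """Map a latitude/longitude pair (degrees) to a single sortable integer."""
    scale = (1 << bits) - 1
    x = int((lon + 180.0) / 360.0 * scale)   # quantise longitude to 0..scale
    y = int((lat + 90.0) / 180.0 * scale)    # quantise latitude to 0..scale
    return interleave_bits(x, y, bits)

print(z_order_key(51.5074, -0.1278))
```

The key can be stored in an ordinary indexed integer column. Points with close keys tend to be close on the map, but the reverse does not always hold, so a range scan on the key is a pre-filter that still needs an exact distance check afterwards.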