I have been trying to work out how to compare a given postcode against a database of, say, store addresses and order them by which one is closest to the given postcode (or ZIP code, I guess).
This is mainly out of interest, rather than me asking you for advice and then selling it to a client :-O
First of all, after some research I discovered that distance has to be calculated from latitude and longitude, so I found an API that converts postcodes/ZIP codes to lat/long. My DB now has a structure such as id, store_name, lat, long, postcode, and I can convert a given postcode to a lat/long.
But how, in SQL, do I query for the stores closest to a given lat/long?
Try something like this:
// Get all the zipcodes within the specified radius in miles (default 20).
function zipcodeRadius($lat, $lon, $radius)
{
    // Cast to float so the values are safe to concatenate into the SQL string.
    $lat = (float) $lat;
    $lon = (float) $lon;
    $radius = $radius ? (float) $radius : 20;
    // The extra parentheses keep negative coordinates from producing "--" in the SQL.
    $sql = 'SELECT distinct(ZipCode) FROM zipcode WHERE (3958*3.1415926*sqrt((Latitude-('.$lat.'))*(Latitude-('.$lat.')) + cos(Latitude/57.29578)*cos('.$lat.'/57.29578)*(Longitude-('.$lon.'))*(Longitude-('.$lon.')))/180) <= '.$radius.';';
    $result = $this->db->query($sql);
    // Collect the zipcode from each result row.
    $zipcodeList = array();
    while ($row = $this->db->fetch_array($result))
    {
        $zipcodeList[] = $row['ZipCode'];
    }
    return $zipcodeList;
}
UPDATE:
There has been some discussion about efficiency, so here is a small benchmark for this query. I have a database that contains EVERY zipcode in the US. Some of them are duplicates because of the way zipcodes work (outside the scope of this topic), so I have just under 80k records. I ran a 20 mile radius search on 90210:
SELECT distinct(ZipCode) FROM zipcodes WHERE (3958*3.1415926*sqrt((Latitude-34.09663010)*(Latitude-34.09663010) + cos(Latitude/57.29578)*cos(34.09663010/57.29578)*(Longitude- -118.41242981)*(Longitude- -118.41242981))/180) <= 20
I got back 366 records in total and the query took 0.1770 sec. How much more efficient do you need?
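For anyone who wants to check the arithmetic outside the database, the expression in the WHERE clause can be reproduced in a few lines. This is only a sketch in Python; the function name approx_miles is made up for illustration, and it assumes (as the constants in the query suggest) that 3958 is the Earth's radius in miles and 57.29578 the number of degrees per radian.

```python
import math

EARTH_RADIUS_MILES = 3958.0   # constant used in the query above
DEG_PER_RAD = 57.29578        # degrees per radian, as in the query


def approx_miles(lat1, lon1, lat2, lon2):
    """Flat-plane approximation of the distance in miles, mirroring the SQL."""
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    # Longitude differences shrink with latitude, hence the cosine factors.
    scale = math.cos(lat1 / DEG_PER_RAD) * math.cos(lat2 / DEG_PER_RAD)
    degrees = math.sqrt(dlat * dlat + scale * dlon * dlon)
    return EARTH_RADIUS_MILES * math.pi * degrees / 180.0
```

Two points one degree of latitude apart come out at roughly 69 miles, which is a quick way to confirm the constants are doing what you expect.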
check out this great open source project
Disclaimer: Not my project, nor am I a contributor. Purely a recommendation.
See this answer to a previous question for an example of calculating a bounding box before querying MySQL. This allows the complex formula in the MySQL query to run against a subset of the database entries, rather than against every entry in the table.
Related
I have a table "vehicle_location" whose "coordinates" column has the geometry datatype.
In my controller I receive lat, long and radius in the request, and I want to find vehicle location data within that radius.
Have a look at the formula explained on Wikipedia: https://en.wikipedia.org/wiki/Great-circle_distance
You'll find someone asking about the same question here: Measuring the distance between two coordinates in PHP
Ideally, you would limit the distance calculation to cars that are not too far from your location. So typically, I would start with an SQL query that only returns the vehicles whose latitude and longitude values are nearby, according to the given radius.
Then, in a second step, you can calculate the distances between these cars and your position with the algorithm (which takes some calculation time) and sort them afterwards.
The ideal thing is to do the calculation directly in SQL if possible, so that you can sort and filter by the radius there. If it gets too complicated, then do the calculation and sorting in PHP.
I have an SQL database containing hotel information, some of which is the geocoded lat/lng generated by Google's geocoder.
I want to be able to select (directly using an SQL query) all the hotels within a certain range. This range will never be more than 50km, so I don't need to go into as much detail as a lot of answers on here suggest (taking into account the earth's curvature and the fact that it's not a perfect sphere isn't an issue over the distances I'm searching).
I'm thinking a simple Pythagorean formula would suffice, but I don't know what the latitude and longitude figures represent (and therefore how to convert them to metres). I've also read, regarding a couple of 'simple' solutions to my problem, that their formulas have issues calculating distances between two locations on either side of the meridian line (as I am based in London this will be a big issue for me!!)
Any help would be great, thank you!
----Helpful Information-----
My database stores the geocoded data in the following format:
geo_lat: 51.5033630,
geo_lon: -0.1276250
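This is a select clause that will give you the distance in kilometers. From there you can filter it down to less than 25 kilometers or whatever you want (note that MySQL does not let you reference a column alias such as distance in a WHERE clause, so filter with HAVING or repeat the expression). If you want it in miles, just take off the * 1.609344 conversion.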
$latitude = [current_latitude];
$longitude = [current_longitude];
SELECT
((((acos(sin((".$latitude."*pi()/180)) * sin((`geo_lat`*pi()/180))+cos((".$latitude."*pi()/180)) * cos((`geo_lat`*pi()/180)) * cos(((".$longitude."- `geo_lon`)* pi()/180))))*180/pi())*60*1.1515) * 1.609344) as distance
FROM
[table_name]
WHERE distance
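To sanity-check the numbers this expression produces, the same spherical law of cosines calculation can be sketched outside SQL. The Python below is an illustration only: the function name law_of_cosines_km is an assumption, and the clamp before acos is an addition that is not in the query, included because rounding can otherwise push the value just past 1.

```python
import math


def law_of_cosines_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km, following the structure of the SELECT above."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon1 - lon2)
    cos_angle = (math.sin(p1) * math.sin(p2)
                 + math.cos(p1) * math.cos(p2) * math.cos(dlon))
    # Rounding can push the value slightly outside [-1, 1], which acos rejects.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    angle_deg = math.degrees(math.acos(cos_angle))
    # Constants from the query: 60 nautical miles per degree, about 1.1515
    # statute miles per nautical mile, 1.609344 km per statute mile.
    return angle_deg * 60 * 1.1515 * 1.609344
```

With these constants one degree of arc works out to a little over 111 km.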
You can use a simple map projection and straight-line distances, for example the equirectangular projection. In the formula on this website you can also use a simpler formula without the square root: http://www.movable-type.co.uk/scripts/latlong.html. Of course, you can also use a bounding box to filter the query: How to calculate the bounding box for a given lat/lng location?, https://gis.stackexchange.com/questions/19760/how-do-i-calculate-the-bounding-box-for-given-a-distance-and-latitude-longitude.
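A rough Python sketch of the "no square root" idea: compare the squared projected distance against the squared limit. The names lon_diff_deg and within_km and the 6371 km radius are assumptions for illustration. With signed longitudes nothing special happens at the Greenwich meridian (-0.1 and 0.1 are simply 0.2 degrees apart); the wrap-around that does need handling is at +/-180 degrees, which the helper folds away.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed value


def lon_diff_deg(lon1, lon2):
    """Signed longitude difference folded into [-180, 180)."""
    return (lon2 - lon1 + 180.0) % 360.0 - 180.0


def within_km(lat1, lon1, lat2, lon2, limit_km):
    """Equirectangular test: compare squared distances, so no sqrt is needed."""
    mean_lat = math.radians((lat1 + lat2) / 2.0)
    # Scale the east-west component by cos(latitude) before combining.
    x = math.radians(lon_diff_deg(lon1, lon2)) * math.cos(mean_lat)
    y = math.radians(lat2 - lat1)
    limit_angle = limit_km / EARTH_RADIUS_KM
    return x * x + y * y <= limit_angle * limit_angle
```

Over ranges of 50km or so the error of this projection should be small, but it is an approximation, not a great-circle distance.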
In an app I'm building, one of the features I'd like is for users to be able to discover people around them easily. I'm using the GPS to get the latitude and longitude, then storing that information in the MySQL DB along with the other user information, with one column for latitude and another for longitude. What would the best practice be for a query that looks at the information of the person making the query, finds the closest latitude and longitude, starts there and limits it to 24 results? Furthermore, how would I remember where it stopped, so that another query can start where it left off and return more results, getting further and further away?
Basically, to generalize: how can I do a MySQL query that starts as close as it can to the latitude and longitude pair that I supply and then grabs the next 24 sorted from closest to furthest?
I feel like it's going to be hard to do because it's based on 2 columns. Is there a way I should/could combine the GPS values into 1 column so it will be easy to find relative distance?
Maybe I could somehow get the zip code (but then that might cause non US problems). I'm not sure. I'm stuck.
Just search for "Haversine Formula" here on Stack Overflow and you will find several related questions.
As @cdonner mentioned, there are a number of resources for the Haversine formula, which you use to turn pairs of lat and long into a distance. You would pass in a distance variable based on how your formula is set up, usually in miles, and run your query starting at the smallest radius. Using a PHP loop, you can simply increase the distance and re-run the query until you get the desired number of results. And do check out the Google link from @Maleck13 as well, very helpful.
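The "increase the distance and re-run" loop might look something like the sketch below. It is written in Python rather than PHP purely for illustration; search is a hypothetical callable standing in for whatever runs your distance query for one radius, and the starting radius, growth factor and cap are arbitrary assumptions.

```python
def nearest_by_expanding_radius(search, wanted, start=5.0, factor=2.0, max_radius=500.0):
    """Call search(radius) with growing radii until `wanted` rows come back.

    `search` is a hypothetical callable that runs the distance query for one
    radius and returns rows already sorted from nearest to furthest.
    """
    radius = start
    while True:
        rows = search(radius)
        # Stop once there are enough rows, or the radius cap has been reached.
        if len(rows) >= wanted or radius >= max_radius:
            return rows[:wanted], radius
        radius = min(radius * factor, max_radius)
```

For paging, one option is to remember the distance of the last row returned and have the next query ask only for rows further away than that.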
I have a system which will return all users from the database and order the results by lowest distance from a reference zip code.
For example: a user will come to the site, enter a zip code, and it will return all other users who are nearest to that zip (in ascending order).
How am I doing this now, and why is it a problem?
The system contains more than 30 million users and their zipcodes. I am retrieving all the users in a particular state and city (which narrows the dataset down to about 10,000).
This is where the problem actually happens. All the rows sent by MySQL to PHP (10,000) are passed to a zipcode calculator library, which calculates the distance between the base zip code and the user's zip code, 10,000 times. The results are then ordered by nearest zip code.
As you can see, this is very badly optimized code, and the 10,000 records are looped through twice. Not to mention the amount of RAM each httpd process takes just transferring data to and from MySQL.
What I would like to ask the gurus here is: is there any way to optimize this?
I have a few ideas of my own, but I'm not sure how efficient they are.
Try to do all the zipcode calculation and ordering in MySQL itself and return the paginated number of rows.
For this, I will need to move the zipcode distance calculation logic to a stored procedure. This way I avoid processing 10,000 records in PHP. However, there is still a problem: I should not need to calculate the distance again for zip codes which have already been calculated (for 2 users having the same zip code).
Secondly, how do I order rows in MySQL using a stored procedure?
What do you guys think? Is this a good way? Can I expect a performance boost using this?
Do you have any other suggestions?
I know this question is huge, and I really appreciate the time you have taken to read till the end. I would really like to hear your thoughts about this.
As I'm not overly familiar with PHP or MySQL, I can only give some basic tips but they should help. This also assumes you have no direct way of interfacing with the zip library from MySQL.
First, as it's doubtful that you have 10k zip codes in a city, take your existing query and do something like
SELECT DISTINCT ZipCode FROM Users WHERE ...
This will probably return a few dozen zip codes max, and no duplicates. Run this through your zip code library. That library itself is probably a source of slowness, as it has to look up the zip codes, and do a bunch of fancy trig to get actual distance. Take the results of this, and insert it into a temp table with just the zip code and the distance.
Once done with that list, have another query that gets the rest of the user data you want, and JOIN to the temp table on zip code to get your distance.
This should give you quite a large speedup. You can do whatever paging you need in the second query after the results have been calculated. And no more looping through 10k rows.
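The shape of that approach can be sketched in a few lines of Python, with a dictionary playing the role of the temp table. Everything named here is hypothetical: zip_distance stands in for the zip code library, and users is assumed to be a list of (name, zip code) pairs.

```python
def rank_users_by_zip_distance(users, base_zip, zip_distance):
    """Sort (name, zip) pairs by distance, computing each distinct zip once.

    `zip_distance(base_zip, other_zip)` stands in for the zip code library;
    its name and signature are assumptions.
    """
    distinct_zips = {zip_code for _, zip_code in users}
    # This dictionary plays the role of the temp table: zip code -> distance.
    distance_by_zip = {z: zip_distance(base_zip, z) for z in distinct_zips}
    # The "join": attach the looked-up distance to every user, then sort.
    ranked = [(name, z, distance_by_zip[z]) for name, z in users]
    ranked.sort(key=lambda row: row[2])
    return ranked
```

The expensive library call runs once per distinct zip code rather than once per user, which is where the speedup would come from.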
I suggest that you narrow the latitude and longitude ranges before you compute the accurate distance for filtering and sorting purposes.
What I mean is if you do a full table scan and compute distances for all zip codes in the database relative to your reference point, it will be very slow.
Instead, filter zipcodes by proximity. If you have latitude 10 and longitude 20, first compute the maximum angular range for the proximity you want. Let's say you want a proximity range of 10 miles. That may translate into 0.15 degrees. So you first need to filter your zip codes to latitude between 10-0.15 and 10+0.15 and longitude between 20-0.15 and 20+0.15.
Only after that do you include the accurate distance clause in your SQL query condition. That will be much faster because you no longer do a full scan, and you can use range indexes on the longitude and latitude fields.
To translate miles into degrees for this narrow range, keep in mind that the Earth has a perimeter of approximately 25,000 miles; dividing 25,000 by 360 degrees gives about 70 miles per degree. If you want a range of 10 miles, your range in degrees will be at most 0.15 degrees. That figure holds for latitude; a degree of longitude covers fewer miles the further you are from the equator (roughly 70 times the cosine of the latitude), so at high latitudes the longitude range has to be widened accordingly.
Keep in mind that these calculations are not accurate (the Earth is not exactly well rounded) but that is not important. What is important is that you find a degree range value that is higher than the really accurate value.
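As a rough illustration of the miles-to-degrees step, here is a small Python sketch. The function name degree_window is made up, the 25,000 mile perimeter is the approximation used above, and the division by cos(latitude) for the longitude range is a small-radius approximation that breaks down near the poles.

```python
import math

MILES_PER_DEGREE = 25000.0 / 360.0  # about 69.4, from the rough perimeter above


def degree_window(lat, lon, miles):
    """Return (min_lat, max_lat, min_lon, max_lon) enclosing a circle of `miles`."""
    dlat = miles / MILES_PER_DEGREE
    # A degree of longitude covers fewer miles away from the equator, so the
    # longitude window has to be widened by 1 / cos(latitude).
    dlon = dlat / math.cos(math.radians(lat))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon
```

For latitude 10, longitude 20 and 10 miles, both half-widths come out just under the 0.15 degrees used in the example, with the longitude one slightly larger.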
If you can get the latitude and longitude for all zipcodes into MySQL, or have an easy way of fetching the lat/long for your base zipcode and feeding it into your MySQL query, then you can order your 10k users by distance inside MySQL. There is a very similar question and answer here which gives you the correct math for the distance function. You may also want to investigate the MySQL spatial extensions, which would let you insert and index your lat/longs as 2D POINT data.
I have to query a database of thousands of entries and order this by the distance from a specified point.
The issue is that each entry has a latitude and longitude, and I would need to retrieve each entry to calculate its distance. With a large database, I don't want to retrieve every row, as this may take some time.
Is there any way to build this into the MySQL query so that I only need to retrieve the nearest 15 entries?
E.g.
`SELECT events.id, caclDistance($latlng, events.location) AS distance FROM events ORDER BY distance LIMIT 0,15`
function caclDistance($old, $new){
//Calculates the distance between $old and $new
}
Option 1:
Do the calculation on the database by switching to a database that supports GeoIP.
Option 2:
Do the calculation in the database using a stored procedure like this:
DELIMITER //
CREATE FUNCTION calcDistance (latA double, lonA double, latB double, LonB double)
RETURNS double DETERMINISTIC
BEGIN
    -- Convert the coordinates from degrees to radians.
    SET @RlatA = radians(latA);
    SET @RlonA = radians(lonA);
    SET @RlatB = radians(latB);
    SET @RlonB = radians(LonB);
    SET @deltaLat = @RlatA - @RlatB;
    SET @deltaLon = @RlonA - @RlonB;
    -- Haversine formula.
    SET @d = SIN(@deltaLat/2) * SIN(@deltaLat/2) +
        COS(@RlatA) * COS(@RlatB) * SIN(@deltaLon/2)*SIN(@deltaLon/2);
    RETURN 2 * ASIN(SQRT(@d)) * 6371.01;
END//
DELIMITER ;
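If you want to try the idea without a MySQL server, the same pattern can be sketched with Python's built-in sqlite3 module, which lets you register a Python function and call it from SQL much like a stored function. This is a stand-in, not MySQL: the events table with id, lat and lon columns is an assumed schema, and calc_distance simply mirrors the haversine arithmetic of calcDistance.

```python
import math
import sqlite3


def calc_distance(lat_a, lon_a, lat_b, lon_b):
    """Haversine distance in km, mirroring the calcDistance function above."""
    r_lat_a, r_lat_b = math.radians(lat_a), math.radians(lat_b)
    delta_lat = r_lat_a - r_lat_b
    delta_lon = math.radians(lon_a) - math.radians(lon_b)
    d = (math.sin(delta_lat / 2) ** 2
         + math.cos(r_lat_a) * math.cos(r_lat_b) * math.sin(delta_lon / 2) ** 2)
    return 2 * math.asin(math.sqrt(d)) * 6371.01


def nearest_events(conn, lat, lon, limit=15):
    """Return (id, distance_km) for the `limit` events nearest to (lat, lon)."""
    # Register the Python function so SQL can call it like a stored function.
    conn.create_function("calcDistance", 4, calc_distance)
    sql = ("SELECT id, calcDistance(?, ?, lat, lon) AS distance "
           "FROM events ORDER BY distance LIMIT ?")
    return conn.execute(sql, (lat, lon, limit)).fetchall()
```

Only the nearest rows cross the database boundary, although the database still evaluates the function for every row it scans.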
If you have an index on latitude and longitude in your database, you can reduce the number of calculations that need to be performed by working out an initial bounding box in PHP ($minLat, $maxLat, $minLong and $maxLong), and limiting the rows to a subset of your entries based on that (WHERE latitude BETWEEN $minLat AND $maxLat AND longitude BETWEEN $minLong AND $maxLong). Then MySQL only needs to execute the distance calculation for that subset of rows.
If you're simply using a stored procedure to calculate the distance, then SQL still has to look through every record in your database and calculate the distance for every record before it can decide whether to return that row or discard it.
Because the calculation is relatively slow to execute, it would be better if you could reduce the set of rows that need to be calculated, eliminating rows that will clearly fall outside of the required distance, so that we're only executing the expensive calculation for a smaller number of rows.
If you consider that what you're doing is basically drawing a circle on a map, centred on your initial point, and with a radius of distance, then the formula simply identifies which rows fall within that circle... but it still has to check every single row.
Using a bounding box is like drawing a square on the map first with the left, right, top and bottom edges at the appropriate distance from our centre point. Our circle will then be drawn within that box, with the Northmost, Eastmost, Southmost and Westmost points on the circle touching the borders of the box. Some rows will fall outside that box, so SQL doesn't even bother trying to calculate the distance for those rows. It only calculates the distance for those rows that fall within the bounding box to see if they fall within the circle as well.
Within your PHP (guess you're running PHP from the $ variable name), we can use a very simple calculation that works out the minimum and maximum latitude and longitude based on our distance, then set those values in the WHERE clause of your SQL statement. This is effectively our box, and anything that falls outside of that is automatically discarded without any need to actually calculate its distance.
There's a good explanation of this (with PHP code) on the Movable Type website that should be essential reading for anybody planning to do any GeoPositioning work in PHP.
EDIT
The value 6371.01 in the calcDistance stored procedure is the multiplier that gives you a returned result in kilometers. Use appropriate alternative multipliers if you want the result in miles, nautical miles, meters, or whatever.
SELECT events.id FROM events
ORDER BY pow((lat - pointlat),2) + pow((lon - pointlon),2) ASC
LIMIT 0,15
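You don't have to calculate the absolute distance in meters using the radius of the earth and so forth.
To get the closest points you only need the points ordered by relative distance. Note that this treats a degree of longitude as being as long as a degree of latitude, so the ordering is only approximate away from the equator.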
Is this what you're looking for? http://zcentric.com/2010/03/11/calculate-distance-in-mysql-with-latitude-and-longitude/
I think stored procedures are what you're looking for.
If your question is a "find my nearest" or "store finder" type question then you can google for those terms. Generally though, that type of data is accompanied by a postal code of some description, and it is possible to narrow down the list (as Mark Maker points out) by association with postal code.
Every case is different, and this may not apply to you, just throwing it out there.