I have a static table with location name, latitude, longitude, and tolerance. For example:
NY, 40.7128, 74.0060, 100 x 50
There are 600 records at this time, the table will grow slowly.
Also there is a dynamic table with MAC address, and coordinates that change every few minutes:
AC233F271FE4, 40.7228, 74.0110
With 4000 records
I want to count how many MAC addresses are within tolerance for each location. Accuracy for earth being a sphere/ellipsoid is not important.
At first I was going to calculate the distance between two points in SQL query when I want to display count, but now I am thinking if it is better to calculate this in php when I update coordinates for the MAC address. I could add a location column, calculate closest point, update MACs lat/long/location every few minutes. Then SQL query for displaying count would be a simple SELECT COUNT GROUP.
The main question is - at what stage is it better to determine location within tolerance?
Second question would be - do I use geography function of SQL (I have read that it is slow) or Haversine formula and how to implement tolerance into distance between two points?
P.S. I have been googling. My current plan is to make a function that will find the nearest location name for given coordinates, and to make a computed column in the dynamic table which will call the function. In my mind it will work like this:
receive MAC address, lat, long => update lat, long where MAC the same as received => computed column runs the function to get the name for updated coordinates => in front end I can hopefully display MAC + location name.
This looks overcomplicated and confusing to me, would be educational to find a better way.
This feels like a great use of the geospatial capabilities in SQL Server. And you can (mostly) do it with what you have, but with some augmentations for performance. Here's what I'd suggest.
Add a column to your static locations table that represents a polygon for your tolerance. I'm not sure what "100 x 50" means (a bounding box that's 100m x 50m? if so, what's the orientation?). Either way, presumably you can derive a polygon given the lat/long and the tolerance. Persist that in a column of type geography. Put an index on this column.
Similarly, add a column to your dynamic table. In some ways, this is easier than the above, as you can just make it a computed column (persisted if you can) defined as geography::Point(Latitude, Longitude, 4326). SRID 4326 is WGS 84, the usual system for GPS latitude/longitude. Index this column as well.
Finally, you have what you need. You can now run a geospatial query like so:
select *
from dbo.dynamicTable as d
join dbo.staticTable as s
on s.BoundingBox.STContains(d.Point) = 1;
The indexing on the two tables is what makes this reasonable. Otherwise, it has to do a Cartesian join between the two, testing every point against every polygon. That's unlikely to perform well, regardless of the specifics.
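For anyone who wants to sanity-check the counts outside the database, here is a minimal Python sketch of the same "point inside tolerance box" test. It is not the geography approach itself. It assumes "100 x 50" means an axis-aligned box 100 m wide (east-west) and 50 m tall (north-south) centred on the location, which the question does not confirm, and the field names (width_m, height_m) and the metres-per-degree constant are made up for illustration.

```python
import math
from collections import Counter

# Rough length of one degree of latitude in metres (spherical approximation).
METERS_PER_DEGREE_LAT = 111_320.0


def in_box(loc, lat, lon):
    """True if (lat, lon) lies inside the axis-aligned box centred on loc."""
    # Convert the half-extents from metres to degrees.
    half_lat = (loc["height_m"] / 2) / METERS_PER_DEGREE_LAT
    half_lon = (loc["width_m"] / 2) / (
        METERS_PER_DEGREE_LAT * math.cos(math.radians(loc["lat"]))
    )
    return abs(lat - loc["lat"]) <= half_lat and abs(lon - loc["lon"]) <= half_lon


def count_devices(locations, devices):
    """Count devices per location; a device inside overlapping boxes counts for each."""
    counts = Counter({loc["name"]: 0 for loc in locations})
    for _mac, lat, lon in devices:
        for loc in locations:
            if in_box(loc, lat, lon):
                counts[loc["name"]] += 1
    return counts
```

With 600 locations and 4000 devices this brute-force loop is about 2.4 million cheap comparisons, which is fine for a check but is exactly the Cartesian join that the spatial indexes are meant to avoid.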
I have a table "vehicle_location" whose "coordinates" column has the geometry datatype.
In my controller I receive a latitude, a longitude and a radius in the request, and I want to find the vehicle locations that fall within that radius.
Have a look at the formula explained on Wikipedia: https://en.wikipedia.org/wiki/Great-circle_distance
You'll find someone asking about the same question here: Measuring the distance between two coordinates in PHP
Ideally, you would restrict the distance calculation to cars that are not too far from your location. So typically, I would start with an SQL query that only returns the vehicles whose latitude and longitude values are nearby, according to the given radius.
Then, in a second step, you can calculate all the distances between these cars and your position, with the algorithm (which takes some calculation time) and sort them after.
The ideal thing to do is to try to do the calculation directly in SQL if possible, so that you can sort and filter by the radius there. If it gets too complicated, then do the calculation and sorting in PHP.
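The two steps above can be sketched in Python as follows. This is only an illustration under assumptions: vehicles are plain (id, lat, lon) tuples rather than rows with a geometry column, the Earth is treated as a sphere of radius 6371 km, and the search circle is assumed not to reach a pole or cross the 180° meridian. The function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def vehicles_within(vehicles, lat, lon, radius_km):
    """Return (distance_km, vehicle_id) pairs within radius_km, nearest first."""
    r = radius_km / EARTH_RADIUS_KM  # angular radius in radians
    # Step 1: cheap box filter in degrees (what the SQL WHERE clause would do).
    dlat = math.degrees(r)
    dlon = math.degrees(math.asin(math.sin(r) / math.cos(math.radians(lat))))
    nearby = [v for v in vehicles if abs(v[1] - lat) <= dlat and abs(v[2] - lon) <= dlon]
    # Step 2: exact distance only for the survivors, then filter and sort.
    hits = [(haversine_km(lat, lon, v[1], v[2]), v[0]) for v in nearby]
    return sorted(h for h in hits if h[0] <= radius_km)
```

The box filter is the part that can use ordinary indexes on latitude and longitude; the trigonometry then runs only on the rows that survive it.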
How could I retrieve a list of cities which are enroute(waypoints) between 2 gps coordinates?
I have a table of all cities, lat-lon.
So if I have a starting location (lat-lon) and ending location (lat-lon)...
Surely it must be easy to determine the path of cities (from the table) to pass by as waypoints to get from start (lat-lon) to end (lat-lon)?
I have looked different algorithms and bearing. Still not clear for me.
If you're using the between point A and B method, then you'd just query the cities with Latitude and Longitude between the first and the second, respectively.
If you want to get the cities that are within X miles of a straight line from A to B, then you'd calculate the starting point and slope, and then query for cities which are within X miles of the line that this creates.
If you're not using a simple point A to point B method which ignores roads, then you'll need some kind of data on the actual roads between A and B for us to give you an answer. This can be done using a Node system in your db, and it can also be done by using various geolocation APIs that are out there.
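As a rough illustration of the "within X of a straight line from A to B" idea, here is a Python sketch. It flattens the map with an equirectangular projection, which is an assumption that is only reasonable for short routes away from the poles, and it works in kilometres rather than miles. The function names and the 111.2 km-per-degree constant are illustrative choices, not something from the answers above.

```python
import math

KM_PER_DEGREE = 111.2  # approximate length of one degree of latitude


def to_xy(lat, lon, ref_lat):
    """Project degrees to flat kilometres around ref_lat (equirectangular)."""
    return (lon * KM_PER_DEGREE * math.cos(math.radians(ref_lat)), lat * KM_PER_DEGREE)


def distance_to_segment_km(point, start, end):
    """Return (distance_km, t), where t in [0, 1] is the position along start -> end."""
    ref_lat = (start[0] + end[0]) / 2
    px, py = to_xy(*point, ref_lat)
    ax, ay = to_xy(*start, ref_lat)
    bx, by = to_xy(*end, ref_lat)
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        t = 0.0
    else:
        # Project the point onto the line and clamp to the segment's ends.
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy)), t


def cities_along(cities, start, end, corridor_km):
    """Names of cities within corridor_km of the segment, ordered from start to end."""
    hits = []
    for name, lat, lon in cities:
        dist, t = distance_to_segment_km((lat, lon), start, end)
        if dist <= corridor_km:
            hits.append((t, name))
    return [name for _t, name in sorted(hits)]
```

Sorting by t gives the waypoints in travel order, which a plain "latitude and longitude between A and B" query does not.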
the solution to this can be found by standard discrete routing algorithms
those algorithms need a set of nodes (start, destination, your cities) and edges between those nodes (representing the possible roads or more generally the distances between the locations.)
nodes and edges form a graph ... start point and destination are known ... now you can use algorithms like A* or Dijkstra's to solve a route along this graph
a typical problem for this approach could be that you don't have definitions for the edges (the usable direct paths between locations). you could create such a "road network" in various ways, for example:
initialize "Network_ID" with 0
take your starting location, and find the closest other location. measure the distance and multiply it by a factor. now connect the current location to every location that is closer than this value and is not connected to it yet. add all locations that were connected by this step to a list. mark the current location with the current "Network_ID". repeat this step for the next location on that list. if your list runs out of locations, increment "Network_ID", choose a random location that has not yet been processed, and repeat the step
after all locations have been processed you have one or more road networks (if more than one, they are not connected yet. add a suitable connection edge between them, or restart the process with a greater factor)
you have to make sure, that either start and destination have the same network_ID or that both networks have been connected
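Here is a small Python sketch of the graph idea. Note that it simplifies the edge-building step described above: instead of the "closest neighbour times a factor" rule with network IDs, it just connects every pair of locations closer than a fixed threshold, and it treats coordinates as flat (x, y) kilometres. Both are assumptions made to keep the example short. The routing part is a standard Dijkstra search.

```python
import heapq
import math


def build_edges(nodes, max_edge_km):
    """Connect every pair of nodes closer than max_edge_km (a stand-in for real roads)."""
    graph = {name: [] for name in nodes}
    names = list(nodes)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = math.dist(nodes[a], nodes[b])
            if d <= max_edge_km:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm; returns the list of node names, or None if unreachable."""
    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            path = [node]
            while node in previous:
                node = previous[node]
                path.append(node)
            return path[::-1]
        if cost > best.get(node, math.inf):
            continue  # stale queue entry, a cheaper route was already found
        for neighbour, weight in graph[node]:
            new_cost = cost + weight
            if new_cost < best.get(neighbour, math.inf):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    return None
```

A None result corresponds to start and destination sitting in different, unconnected networks, which is the case the last point above warns about.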
Hmm... I have used BETWEEN min AND max for something like this, but not quite the same.
try maybe:
SELECT * FROM `cities` WHERE `lat` BETWEEN :minlat AND :maxlat AND `lon` BETWEEN :minlon AND :maxlon;
something like that may work
look at mysql comparisons here:
http://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html
I know this is a late answer, but if you are still working on this problem you should read this:-
http://dev.mysql.com/doc/refman/5.6/en/spatial-extensions.html
I have a system which will return all users from the database and order the results by lowest distance from a reference zip code.
For example: User will come on the site, enter zip code and it will return him all other users who are nearest to his zip (ascending order)
How am i doing this now and why is it a problem ?
The system contains more than 30 million users and their zip codes. I am retrieving all the users in a particular state and city (which narrows the dataset down to about 10,000).
This is where the problem actually happens. Now, all the result sent by mysql (10,000) rows to PHP are sent to a zipcode calculator library which calculates this distance between the base zip code and user's zip code - 10,000 times. Then orders the result by the zip code nearest.
As you can see, this is very badly optimized code. And the 10,000 records are looped through twice. Not to mention the amount of RAM each httpd process takes just transferring data to and fro mysql.
What I would like to ask the gurus here is: is there any way to optimize this?
I have a few ideas of my own, but i'm not sure how efficient they are.
Try to do all the zipcode calculation and ordering in mysql itself and return the paginated number of rows.
For this, i will need to move the distance between zipcode calculation logic to a stored procedure. This way I am preventing the processing of 10,000 records in PHP. However, there is still a problem. I would not need to calculate distance for zip codes which have already been calculated (for 2 users having the same zip code).
Secondly, how do i order rows in mysql using a stored procedure ?
What do you guys think ? Is this a good way ? Can i expect a performance boost using this ?
Do you have any other suggestions ?
I know this question is huge, and i really appreciate the time you have taken to read till the end. I would really like to hear your thoughts about this.
As I'm not overly familiar with PHP or MySQL, I can only give some basic tips but they should help. This also assumes you have no direct way of interfacing with the zip library from MySQL.
First, as it's doubtful that you have 10k zip codes in a city, take your existing query and do something like
SELECT DISTINCT ZipCode FROM Users WHERE ...
This will probably return a few dozen zip codes max, and no duplicates. Run this through your zip code library. That library itself is probably a source of slowness, as it has to look up the zip codes, and do a bunch of fancy trig to get actual distance. Take the results of this, and insert it into a temp table with just the zip code and the distance.
Once done with that list, have another query that gets the rest of the user data you want, and JOIN to the temp table on zip code to get your distance.
This should give you quite a large speedup. You can do whatever paging you need in the second query after the results have been calculated. And no more looping through 10k rows.
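The gain comes from calling the slow distance routine once per distinct zip code instead of once per user. A minimal Python sketch of that idea, where zip_distance stands in for whatever the zip code library provides (its name and signature are assumptions) and a dictionary plays the role of the temp table:

```python
def distances_by_zip(user_zips, base_zip, zip_distance):
    """Call zip_distance once per distinct zip code instead of once per user."""
    return {z: zip_distance(base_zip, z) for z in set(user_zips.values())}


def users_by_distance(user_zips, base_zip, zip_distance):
    """Return user ids ordered by the distance of their zip code from base_zip."""
    lookup = distances_by_zip(user_zips, base_zip, zip_distance)
    # Equivalent of the JOIN against the temp table followed by ORDER BY distance.
    return sorted(user_zips, key=lambda user: (lookup[user_zips[user]], user))
```

Here user_zips is a hypothetical mapping of user id to zip code; in the real system the sort and paging would happen in the second SQL query.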
I suggest that you narrow the latitude and longitude ranges before you compute the accurate distance for filtering and sorting purposes.
What I mean is if you do a full table scan and compute distances for all zip codes in the database relative to your reference point, it will be very slow.
Instead, filter zip codes by proximity. If you have latitude 10 and longitude 20, first compute the maximum angular range for the proximity you want. Let's say you want a proximity range of 10 miles. That may translate into 0.15 degrees. So you first filter your zip codes to latitude between 10-0.15 and 10+0.15 and longitude between 20-0.15 and 20+0.15.
Only after that you include the accurate distance clause in your SQL query condition. That will be much faster because you no longer do full scan and you can eventually use range indexes on longitude and latitude fields.
To translate miles into degrees, keep in mind that the Earth has a circumference of approximately 25,000 miles. Dividing 25,000 by 360 degrees gives roughly 70 miles per degree. If you want a range of 10 miles, your range in degrees will be at most 0.15 degrees.
Keep in mind that these calculations are not accurate (the Earth is not exactly well rounded) but that is not important. What is important is that you find a degree range value that is higher than the really accurate value.
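One caveat on the arithmetic above: the 70-miles-per-degree figure holds for latitude everywhere, but a degree of longitude shrinks by the cosine of the latitude, so away from the equator the longitude range has to be widened to stay "higher than the really accurate value". A hedged Python sketch of the conversion, with an assumed constant of 69 miles per degree (slightly under 25,000 / 360 so that the ranges err on the wide side) and a made-up function name; it is meant for small radii away from the poles:

```python
import math

# A little under 25,000 / 360 ≈ 69.4, so the computed ranges come out slightly wide.
MILES_PER_DEGREE = 69.0


def degree_ranges(lat, lon, radius_miles):
    """Return (min_lat, max_lat, min_lon, max_lon) roughly enclosing the search circle."""
    dlat = radius_miles / MILES_PER_DEGREE
    # A degree of longitude covers fewer miles away from the equator, so widen the
    # longitude range using the latitude in the box that is furthest from the equator.
    dlon = dlat / math.cos(math.radians(abs(lat) + dlat))
    return (lat - dlat, lat + dlat, lon - dlon, lon + dlon)
```

At latitude 10 the two ranges are nearly equal, which is why 0.15 degrees works for both in the example above; at latitude 60 the longitude range is about twice the latitude range.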
If you can get the latitude and longitude for all zipcodes into MySQL, or have an easy way of fetching the lat/long for your base zipcode and feeding it into your MySQL query, then you can order your 10k users by distance inside MySQL. There is a very similar question and answer here which gives you the correct math for the distance function. You may also want to investigate Mysql spatial extensions which would let you insert and index your lat/longs as 2D POINT data.
Each user in my db is associated with a city (with its longitude and latitude).
How would I go about finding out which cities are close to one another?
i.e. in England, Cambridge is fairly close to London.
So If I have a user who lives in Cambridge. Users close to them would be users living in close surrounding cities, such as London, Hertford etc.
Any ideas how I could go about this? And also, how would I define what is close? i.e. in the UK close would be much closer than if it were in the US as the US is far more spread out.
Ideas and suggestions. Also, do you know any services that provide this sort of functionality?
Thanks
If you can call an external web service, you can use the GeoNames API for locating nearby cities within some radius that you define:
http://www.geonames.org/export/web-services.html
Getting coordinates from city names is called geocoding (the opposite direction, from coordinates to a place name, is reverse geocoding). Google Maps has a nice API for that.
There is also the GeoNames project, where you can get huge databases of cities, zip codes etc. and their coordinates.
However if you already have the coordinates, its a simple calculation to get the distance.
The tricky thing is to get a nice performant version of it. You probably have it stored in a mysql database, so you need to do it there and fast.
It is absolutely possible. I once did a project including that code, I will fetch it and post it here.
However, to speed things up I would recommend first doing a rectangular selection around the center coordinates. This is very, very fast using B-tree indexes, or even better with something like a multidimensional range search. Inside that rectangle you can then calculate the exact distances on a limited set of data.
Outside that rectangular selection the distances are so large that they do not need to be displayed or calculated so accurately. Or just display the country, continent or something like that.
I am still at the office, but when I get home I can fetch the code for you. In the meantime it would be good if you could tell me how you store your data.
Edit: in the mean time here you have a function which looks right to me (i did it without a function in one query...)
CREATE FUNCTION `get_distance_between_geo_locations`(`lat1` FLOAT, `long1` FLOAT, `lat2` FLOAT, `long2` FLOAT)
RETURNS FLOAT
LANGUAGE SQL
DETERMINISTIC
CONTAINS SQL
SQL SECURITY DEFINER
COMMENT ''
BEGIN
DECLARE distance FLOAT DEFAULT -1;
DECLARE earthRadius FLOAT DEFAULT 6371.009;
-- 3958.761 --miles
-- 6371.009 --km
DECLARE axis FLOAT;
IF ((lat1 IS NOT NULL) AND (long1 IS NOT NULL) AND (lat2 IS NOT NULL) AND (long2 IS NOT NULL)) THEN -- bit of protection against bad data
SET axis = (SIN(RADIANS(lat2-lat1)/2) * SIN(RADIANS(lat2-lat1)/2) + COS(RADIANS(lat1)) * COS(RADIANS(lat2)) * SIN(RADIANS(long2-long1)/2) * SIN(RADIANS(long2-long1)/2));
SET distance = earthRadius * (2 * ATAN2(SQRT(axis), SQRT(1-axis)));
END IF;
RETURN distance;
END;
i quoted this from here: http://sebastian-bauer.ws/en/2010/12/12/geo-koordinaten-mysql-funktion-zur-berechnung-des-abstands.html
and here is another link: http://www.andrewseward.co.uk/2010/04/sql-function-to-calculate-distance.html
The simplest way to do this would be to calculate a bounding box from the latitude and longitude of the city and a distance (by converting the distance to degrees of longitude).
Once you have that box (min latitude, max latitude, min longitude, max longitude), query for other cities whose latitude and longitude are inside the bounding box. This will get you an approximate list, and should be quite fast as it will be able to use any indexes you might have on the latitude and longitude columns.
From there you can narrow the list down if desired using a real "distance between points on a sphere" function.
You need a spatial index or GIS functionality. What database are you using? MySQL and PostgreSQL both have GIS support which would allow you to find the N nearest cities using an SQL query.
Another option you might want to consider would be to put all of the cities into a spatial search tree like a kd-tree. Kd-trees efficiently support nearest-neighbor searches, as well as fast searches for all points in a given bounding box. You could then find nearby cities by searching for a few of the city's nearest neighbors, then using the distance to those neighbors to get an estimate size for a bounding box to search in.
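To make the kd-tree suggestion concrete, here is a compact Python sketch of building a 2-d tree and running a nearest-neighbour search. It treats latitude and longitude as flat coordinates, which is an assumption that distorts distances at higher latitudes, and it is an in-memory illustration only, not how a database's spatial index is implemented. The function names are hypothetical.

```python
def build_kdtree(points, depth=0):
    """points: list of (lat, lon, name). Returns nested (point, left, right) tuples or None."""
    if not points:
        return None
    axis = depth % 2  # alternate between splitting on latitude and longitude
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (
        points[mid],
        build_kdtree(points[:mid], depth + 1),
        build_kdtree(points[mid + 1:], depth + 1),
    )


def nearest(node, target, depth=0, best=None):
    """Return (squared_distance, point) for the stored point closest to target."""
    if node is None:
        return best
    point, left, right = node
    d2 = (point[0] - target[0]) ** 2 + (point[1] - target[1]) ** 2
    if best is None or d2 < best[0]:
        best = (d2, point)
    axis = depth % 2
    diff = target[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, target, depth + 1, best)
    # Only search the far side if the splitting line is closer than the best hit so far.
    if diff * diff < best[0]:
        best = nearest(far, target, depth + 1, best)
    return best
```

The pruning test in the last step is what lets the search skip most of the tree instead of measuring the distance to every city.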
I have to query a database of thousands of entries and order this by the distance from a specified point.
The issue is that each entry has a latitude and longitude and I would need to retrieve each entry to calculate its distance. With a large database, I don't want to retrieve each row, this may take some time.
Is there any way to build this into the mysql query so that I only need to retrieve the nearest 15 entries.
E.g.
`SELECT events.id, caclDistance($latlng, events.location) AS distance FROM events ORDER BY distance LIMIT 0,15`
function caclDistance($old, $new){
//Calculates the distance between $old and $new
}
Option 1:
Do the calculation on the database by switching to a database that supports geospatial (GIS) queries.
Option 2:
Do the calculation on the database using a stored function like this:
DELIMITER //
CREATE FUNCTION calcDistance (latA DOUBLE, lonA DOUBLE, latB DOUBLE, lonB DOUBLE)
RETURNS DOUBLE DETERMINISTIC
BEGIN
-- Haversine distance in kilometres; all four inputs are in degrees
SET @RlatA = RADIANS(latA);
SET @RlonA = RADIANS(lonA);
SET @RlatB = RADIANS(latB);
SET @RlonB = RADIANS(lonB);
SET @deltaLat = @RlatA - @RlatB;
SET @deltaLon = @RlonA - @RlonB;
SET @d = SIN(@deltaLat/2) * SIN(@deltaLat/2) +
COS(@RlatA) * COS(@RlatB) * SIN(@deltaLon/2) * SIN(@deltaLon/2);
RETURN 2 * ASIN(SQRT(@d)) * 6371.01;
END//
DELIMITER ;
If you have an index on latitude and longitude in your database, you can reduce the number of calculations that need to be calculated by working out an initial bounding box in PHP ($minLat, $maxLat, $minLong and $maxLong), and limiting the rows to a subset of your entries based on that (WHERE latitude BETWEEN $minLat AND $maxLat AND longitude BETWEEN $minLong AND $maxLong). Then MySQL only needs to execute the distance calculation for that subset of rows.
If you're simply using a stored procedure to calculate the distance, then SQL still has to look through every record in your database and calculate the distance for every record before it can decide whether to return that row or discard it.
Because the calculation is relatively slow to execute, it would be better if you could reduce the set of rows that need to be calculated, eliminating rows that will clearly fall outside of the required distance, so that we're only executing the expensive calculation for a smaller number of rows.
If you consider that what you're doing is basically drawing a circle on a map, centred on your initial point, and with a radius of distance; then the formula simply identifies which rows fall within that circle... but it still has to check every single row.
Using a bounding box is like drawing a square on the map first with the left, right, top and bottom edges at the appropriate distance from our centre point. Our circle will then be drawn within that box, with the Northmost, Eastmost, Southmost and Westmost points on the circle touching the borders of the box. Some rows will fall outside that box, so SQL doesn't even bother trying to calculate the distance for those rows. It only calculates the distance for those rows that fall within the bounding box to see if they fall within the circle as well.
Within your PHP (guess you're running PHP from the $ variable name), we can use a very simple calculation that works out the minimum and maximum latitude and longitude based on our distance, then set those values in the WHERE clause of your SQL statement. This is effectively our box, and anything that falls outside of that is automatically discarded without any need to actually calculate its distance.
There's a good explanation of this (with PHP code) on the Movable Type website that should be essential reading for anybody planning to do any GeoPositioning work in PHP.
EDIT
The value 6371.01 in the calcDistance stored procedure is the multiplier to give you a returned result in kilometers. Use appropriate alternative multipliers if you want to result in miles, nautical miles, meters, whatever
SELECT events.id FROM events
ORDER BY pow((lat - pointlat),2) + pow((lon - pointlon),2) ASC
LIMIT 0,15
You don't have to calculate the absolute distance in meters using the radius of the earth and so forth.
To get the closest points you only need the points ordered with relative distance.
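A small Python sketch of the same "relative distance is enough for ordering" idea. One hedge: squared differences in raw degrees treat a degree of longitude as being as long as a degree of latitude, which skews the order away from the equator, so this version scales the longitude difference by the cosine of the reference latitude. That scaling is an addition of mine, not part of the query above, and the function name is made up.

```python
import heapq
import math


def nearest_events(events, lat, lon, limit=15):
    """events: iterable of (event_id, lat, lon). Returns ids of the `limit` closest."""
    # Shrink longitude differences to match their real length at this latitude.
    scale = math.cos(math.radians(lat))

    def relative(event):
        # Squared, unitless distance: good enough for ordering, no square root needed.
        return (event[1] - lat) ** 2 + ((event[2] - lon) * scale) ** 2

    return [e[0] for e in heapq.nsmallest(limit, events, key=relative)]
```

For example, at latitude 60 an event one degree of longitude away is closer than one 0.7 degrees of latitude away, which the unscaled ordering gets the wrong way round.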
Is this what you're looking for? http://zcentric.com/2010/03/11/calculate-distance-in-mysql-with-latitude-and-longitude/
i think stored procedures are what you're looking for.
If your question is a "find my nearest" or "store finder" type question then you can google for those terms. Generally though, that type of data is accompanied by a postal code of some description, and it is possible to narrow down the list (as Mark Maker points out) by association with postal code.
Every case is different, and this may not apply to you, just throwing it out there.