This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How to prevent SQL injection in PHP?
Select distinct rows from MySQL Database
Hi, I have a PhoneGap application that stores latitude, longitude, address and severity when a road surface deformation (such as a pothole or speed bump) is detected.
So far, saving these in the database through PHP PDO is not a problem at all. The PHP code is designed so that if the pothole being reported has already been reported 10 times (it checks the database for any entries within a 15 meter range), it is not reported (i.e. inserted in the database) again. Loading the surface deformations is not a problem either: I am using the Haversine formula to do that, where I pass the latitude and longitude of the user and get the values within a certain distance.
$stmt = $dbh->prepare("
SELECT
lat, lng,
( 6378160 * acos( cos( radians(?) ) * cos( radians( lat ) ) * cos( radians( lng ) - radians(?) ) + sin( radians(?) ) * sin( radians( lat ) ) ) ) AS distance
FROM myTable
HAVING distance > 0
ORDER BY distance
LIMIT 0 , 30
");
The issue I have is that since the same pothole can be reported 10 times, I end up having the same pothole reported back to the application for charting on a map 10 times. What I need to do is get the list of potholes that are within a certain distance from the user (done using the Haversine formula), and then filter this list so that I only get distinct potholes rather than the same pothole being returned 10 times. Does anyone have any idea how I can do such filtering? Can anyone tell me how it is possible to do this in PHP/PDO, or point me to a similar tutorial if one is available?
Here is what I need to do in brief: say I am near pothole A and pothole B, and say I have 6 reports for pothole A and 8 reports for pothole B (and so on) in the database. By using the Haversine formula I get all the reports for pothole A and pothole B (i.e. 14 results). What I need instead is to get the midpoint of the reports for pothole A and the midpoint of the reports for pothole B (using: http://www.geomidpoint.com/calculation.html) and return 2 results (one for A and one for B) rather than 14.
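For reference, the midpoint step mentioned in the question could look roughly like this. It is only a sketch in Python rather than PHP, and it assumes the "centre of gravity" approach: convert each report to a 3D unit vector, average the vectors, and convert back. The function name `geographic_midpoint` is made up for illustration.

```python
import math

def geographic_midpoint(points):
    """Return the (lat, lng) midpoint, in degrees, of a list of (lat, lng) pairs.

    Assumption: every report has equal weight and the points do not
    cancel each other out (e.g. exact antipodes).
    """
    x = y = z = 0.0
    for lat, lng in points:
        phi, lmb = math.radians(lat), math.radians(lng)
        # Unit vector for this report on a sphere.
        x += math.cos(phi) * math.cos(lmb)
        y += math.cos(phi) * math.sin(lmb)
        z += math.sin(phi)
    n = len(points)
    x, y, z = x / n, y / n, z / n
    # Convert the averaged vector back to latitude/longitude.
    lng_mid = math.atan2(y, x)
    lat_mid = math.atan2(z, math.hypot(x, y))
    return math.degrees(lat_mid), math.degrees(lng_mid)
```

For reports that are only a few meters apart, the result is practically the same as a plain average of the latitudes and longitudes, so a simple AVG(lat), AVG(lng) per pothole would likely be good enough here.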
You have to separate potholes and reports. First query which potholes are nearby, using GROUP BY lat, lng, and then query one report for each pothole with LIMIT 1.
$rows = dbh_query("SELECT id FROM ... ");
foreach ($rows as $row)
{
dbh_query("SELECT ... WHERE id = :id LIMIT 1", array('id' => $row['id']));
}
I think you have 2 options to explore. Unfortunately, both imply some complications.
1) Instead of writing multiple observations, you should always cluster them at write time. For example, if you support a spatial granularity of 10 meters, then every time a new measurement arrives that is less than 10 meters from an existing record, you do not add a new pothole, but update the average values (latitude, longitude, counter) in the nearest existing record. This way you'll end up with 2 records, for potholes A and B in your example, so you can use a DISTINCT query.
2) For every request, you can fetch all records within a 15 meter range from the existing table and create a temporary table from them for calculating a probability density function along the road axis. Again, this requires choosing a granularity, which can be simulated as decimal accuracy in the ROUND function. For example, if you had a stored function for calculating the distance between the current point and existing records, you could write:
INSERT INTO `temppdf` (dist, pothole_id)
SELECT
ROUND(distance(@current_lat, @current_lon, maintable.lat, maintable.lon), -1), pothole_id
FROM `maintable`
WHERE distance(@current_lat, @current_lon, maintable.lat, maintable.lon) < 15;
Then you can query temppdf for rows with maximum counters for every pothole, something like this:
SELECT pothole_id, MAX(cnt) AS `peak` FROM
(SELECT pothole_id, COUNT(dist) AS cnt FROM `temppdf` GROUP BY pothole_id, dist) AS `subq`
GROUP BY pothole_id;
The potholes with counters larger than a threshold are the result.
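To make option 1 above more concrete, here is a rough Python sketch of clustering at write time. It is not the asker's code: the in-memory list `potholes`, the helper names and the 6371000 m Earth radius are all assumptions, and a real version would do the lookup and update in the database.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters (assumed value)

def approx_distance_m(lat1, lng1, lat2, lng2):
    """Flat-earth approximation, adequate over a few meters."""
    x = math.radians(lng2 - lng1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return EARTH_RADIUS_M * math.hypot(x, y)

def add_report(potholes, lat, lng, granularity_m=10.0):
    """Merge a report into the nearest pothole closer than granularity_m,
    otherwise store it as a new pothole."""
    nearest = None
    nearest_d = granularity_m
    for p in potholes:
        d = approx_distance_m(lat, lng, p["lat"], p["lng"])
        if d < nearest_d:
            nearest, nearest_d = p, d
    if nearest is None:
        potholes.append({"lat": lat, "lng": lng, "count": 1})
        return
    # Update the running average position and the report counter.
    n = nearest["count"]
    nearest["lat"] = (nearest["lat"] * n + lat) / (n + 1)
    nearest["lng"] = (nearest["lng"] * n + lng) / (n + 1)
    nearest["count"] = n + 1
```

With 6 reports near pothole A and 8 near pothole B, this leaves two records with counters 6 and 8, which matches the example in the question.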
Related
I have a mysql table containing locations, for example:
ID latitude longitude value
1 11.11111 22.22222 1
2 33.33333 44.44444 2
3 11.11112 22.22223 5
I want to select records which are located near to each other (in the above example, rows 1 and 3), so that I can insert them in a new table as one record. By saying near, let's say 100 meters. I would like the new table to be:
ID latitude longitude value
1 11.11111 22.22222 3
2 33.33333 44.44444 2
As you can see, I would like to keep the coordinates of the first record and insert an average of the two records in column 'value'.
The formula I use for distance is the following:
R*SQRT(((lon2*c-lon1*c)*cos(0.5*(lat2*c+lat1*c)))^2 + (lat2*c-lat1*c)^2)
where
[lat1, lon1] the coordinates of the first location,
[lat2, lon2] the coordinates of the second location,
[R] the average radius of Earth in meters,
[c] a number to convert degrees to radians.
The formula works well for short distances.
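Written out in Python, the formula above would look something like this. The value 6371000 for R is an assumption (the question only says "average radius of Earth in meters"), and the function name is made up.

```python
import math

R = 6371000.0        # assumed average Earth radius in meters
C = math.pi / 180.0  # converts degrees to radians

def short_distance_m(lat1, lon1, lat2, lon2):
    """Equirectangular approximation of the distance in meters."""
    # East-west component, scaled by the cosine of the mean latitude.
    x = (lon2 * C - lon1 * C) * math.cos(0.5 * (lat2 * C + lat1 * C))
    # North-south component.
    y = lat2 * C - lat1 * C
    return R * math.sqrt(x ** 2 + y ** 2)
```

For rows 1 and 3 of the example table this gives roughly 1.5 meters, so they would count as "near" with a 100 meter limit, while rows 1 and 2 are thousands of kilometers apart.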
So, my problem is not the conversion of lat, lon to distances but my SQL. I know how to select records that are within 100 meters of specific lat, lon coordinates, but I don't know how to select records that are within a maximum distance of each other.
One way I did it, and it works, is by looping over the records one by one in PHP and, for each one, making an SQL query to select the records that are near. But this way I run an SQL query in a loop, and as far as I know this is bad practice, especially if there are going to be thousands of records.
I hope I was clear. If not, I would be glad to give you additional information.
Thanks for helping.
Here is an SQL query to get all places within range of a given record:
SELECT
destination.ID,
destination.latitude,
destination.longitude,
(6371 * acos(cos( radians(origin.latitude)) * cos( radians( destination.latitude ))
* cos( radians(destination.longitude) - radians(origin.longitude)) + sin(radians(origin.latitude))
* sin( radians(destination.latitude)))) AS distance
FROM myTable as destination, myTable as origin
WHERE origin.id = myId
HAVING distance < 100 -- some value in kilometers
6371 is the Earth's mean radius in kilometers, so the distance is in kilometers.
Use 3959 instead to get the distance in miles.
This topic has more answers: MySQL Great Circle Distance (Haversine formula)
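If it helps to check the numbers outside the database, the same expression can be written as a small Python function. This is just a sketch; the function and constant names are mine, and the clamping step is an addition to guard against rounding pushing the cosine slightly outside [-1, 1].

```python
import math

EARTH_RADIUS_KM = 6371.0  # same constant as in the query
EARTH_RADIUS_MI = 3959.0  # use this one for miles

def great_circle_distance(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Spherical law of cosines, the same expression as in the SQL above."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_angle = (math.cos(phi1) * math.cos(phi2) * math.cos(dlon)
                 + math.sin(phi1) * math.sin(phi2))
    # Clamp so that floating-point rounding cannot make acos() fail.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return radius * math.acos(cos_angle)
```

The result is in whatever unit the radius is given in, which is the only difference between the 6371 and 3959 variants.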
I want to fetch the 5 rows from a table whose "lat" value is nearest to 30. I am developing a Google Maps app, where I need the 5 nearest locations from the database.
My table looks like that,
MySQL provides a math function, ABS(), that turns negative numbers into their absolute values. By using it, you can get the five closest locations whether their lat is slightly lower or higher than 30:
ORDER BY ABS(lat - 30) ASC LIMIT 5
The ASC is optional as it is the default sorting order in all DBMS (thanks Gordon).
On a map "nearest location" should be based on the distance of two points, otherwise lat 30.00, long 75.00 will be the "nearest" location.
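A quite exact calculation of the distance between two points (latitude/longitude) is the great-circle distance, often referred to as the Haversine formula (the expression below is the equivalent spherical law of cosines form):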
DEGREES(ACOS(COS(RADIANS(ref_latitude)) *
COS(RADIANS(latitude)) *
COS(RADIANS(ref_longitude) - RADIANS(longitude)) +
SIN(RADIANS(ref_latitude)) *
SIN(RADIANS(latitude)))) AS distance
latitude = `lat`
longitude = `long`
ref_latitude & ref_longitude = the point you want to find the nearest locations from
`DOUBLE` should be used for calculation
This results in degrees, multiply by 111.195 for an approximate distance in kilometers or by 69.093 for miles.
As you want near locations you might go for a more simple calculation using the Pythagorean theorem
sqrt(power(lat-ref_latitude, 2) +
power((lng-ref_longitude)*cos(radians(ref_latitude)), 2))
Again multiply by 111.195 for kilometers or by 69.093 for miles.
Now simply ORDER BY this distance.
And instead of comparing to all rows in your database you should restrict the number of rows to compare, e.g.
WHERE latitude BETWEEN ref_latitude - 0.2 and ref_latitude + 0.2
AND longitude BETWEEN ref_longitude - 0.2 and ref_longitude + 0.2
Btw, some DBMSes support geospatial extensions like distance functions or geospatial indexes.
As a note, if you want to do this efficiently and you have an index on lat, then the following more complex query should perform better:
select t.*
from ((select t.*
from table t
where lat <= 30
order by lat desc
limit 5
) union all
(select t.*
from table t
where lat > 30
order by lat asc
limit 5
)
) t
order by abs(lat - 30)
limit 5;
The two subqueries can use an index on lat to avoid sorting. The outer query is then only sorting 10 rows.
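The reasoning behind that query can be shown with a sorted list standing in for the index on lat. This is only an illustration in Python, not something to run against MySQL, and `nearest_by_lat` is a made-up name: take up to 5 values on each side of the target, then sort just those 10 candidates.

```python
import bisect

def nearest_by_lat(sorted_lats, target, k=5):
    """Return the k values closest to target.

    sorted_lats must be in ascending order, like an index on lat.
    """
    i = bisect.bisect_right(sorted_lats, target)
    below = sorted_lats[max(0, i - k):i]  # up to k values <= target
    above = sorted_lats[i:i + k]          # up to k values > target
    candidates = below + above            # at most 2 * k values
    return sorted(candidates, key=lambda lat: abs(lat - target))[:k]
```

Only the small candidate list is sorted by ABS(lat - 30); the rest of the data is never touched, which is why the UNION ALL version can beat a full scan.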
This question already has answers here:
MySQL Great Circle Distance (Haversine formula)
(9 answers)
Closed 8 years ago.
I'm using the following query to get the next local events using their location. My database is growing and I've got events from all over Europe. So basically, the query calculates distances to events in different countries (like Spain and Germany), which is really not useful since I'm only looking for the next events within 20 km.
# latitude & longitude are the fields in DB
# 43.57 & 3.85 are given values representing the local test point
SELECT ( 6366 * ACOS( COS( RADIANS( 43.57 ) ) * COS( RADIANS( latitude ) ) * COS( RADIANS( longitude ) - RADIANS( 3.85 ) ) + SIN( RADIANS( 43.57 ) ) * SIN( RADIANS( latitude ) ) ) ) dist
FROM events
HAVING dist >0 AND dist <=20
ORDER BY event_time ASC
LIMIT 0 , 5
So basically, this query is going to get all the distances of all the events before being able to use the HAVING.
Is there a better way?
Calculate the longitude and latitude of the four points of a box that encloses the radius surrounding the test point. Over distances of a few kilometres you can ignore errors due to the curvature of the Earth.
We can use a rough approximation to limit an initial search, then refine things later.
At 50 degrees North (roughly Paris) a degree of latitude (North/South) is approximately 111.229 km. A degree of longitude (East/West) is approximately 71.695 km (I got these numbers from this page). From this, 20 km is approximately 10.8 minutes of latitude (about 0.18 degrees) and 16.74 minutes of longitude (about 0.28 degrees). Calculate a bounding box by adding and subtracting these numbers from the latitude and longitude of your test point.
Query for locations in that box using longitude and latitude.
If you're worried that your bounding box may be too large, or that an event in a corner of the box is outside the radius, calculate an accurate distance for those points you've already identified and filter out the few you don't want.
Note
These figures are approximations for 50 degrees North. As you move north of that, your value for km per degree of longitude will become too large, and south of that it will be too small. You could use a small table of lookup values, or calculate the actual distance from the latitude, or simply increase the initial bounding box in the longitude direction. The box is just a first approximation, so as long as it's bigger than the target area its exact size doesn't matter too much.
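As a rough sketch of the "calculate the actual distance from the latitude" variant, the box could be computed like this in Python. The 111.195 km per degree figure assumes a spherical Earth with a 6371 km radius, so it differs slightly from the numbers quoted above, and the function name is made up. It does not handle the poles or the 180 degree meridian.

```python
import math

KM_PER_DEG_LAT = 111.195  # assumed: spherical Earth, radius 6371 km

def bounding_box(lat, lon, radius_km):
    """Return (min_lat, max_lat, min_lon, max_lon) enclosing the radius."""
    dlat = radius_km / KM_PER_DEG_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = radius_km / (KM_PER_DEG_LAT * math.cos(math.radians(lat)))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon
```

The four values can then go into a WHERE latitude BETWEEN ... AND longitude BETWEEN ... clause, with the exact distance computed only for the rows that survive.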
I have been searching quite a bit for an answer, but maybe I'm just not using the correct terminology. I am creating an app that will access a database to return a list of other users that are within a certain distance of the user's location. I've never worked with this type of data, and I don't really know what the values mean. I'd like to do all the calculations on the backend with either MySQL or PHP. Currently, I am storing the latitude and longitude as doubles within the database. I can access them and store them, but I have no idea how I might be able to sort them based on distance. Perhaps I should be using a different type or some technique that is common in this area. TIA.
It sounds like you need to use the haversine formula, which gets the distance between two sets of long/lat coordinates (adjusting for the curvature of the earth).
If you run a query with that as an output, you can easily sort them based on minimum distance from the user.
Here is a link to implementing the haversine in 9 commonly used languages and here is a SO question which implements it inside a SQL query.
Here is the query that you could adapt (gets anything within 25 miles ordered from closest to furthest):
SELECT
id,
( 3959 * acos( cos( radians(37) ) * cos( radians( lat ) ) * cos( radians( lng ) - radians(-122) ) + sin( radians(37) ) * sin( radians( lat ) ) ) ) AS distance
FROM
markers
HAVING
distance < 25
ORDER BY
distance
LIMIT
0 , 20;
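If you would rather do the calculation in application code, the haversine formula itself is short. Here is a sketch in Python (the same few lines translate directly to PHP); the function names and the (id, lat, lng) tuple layout are assumptions for illustration.

```python
import math

def haversine_miles(lat1, lng1, lat2, lng2, radius=3959.0):
    """Haversine distance; 3959 is the Earth's radius in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # min() guards against rounding pushing a slightly above 1.
    return 2 * radius * math.asin(math.sqrt(min(1.0, a)))

def users_within(users, lat, lng, max_miles=25.0, limit=20):
    """users: iterable of (id, lat, lng). Returns [(id, distance)], nearest first."""
    hits = [(uid, haversine_miles(lat, lng, ulat, ulng))
            for uid, ulat, ulng in users]
    hits = [h for h in hits if h[1] < max_miles]
    hits.sort(key=lambda h: h[1])
    return hits[:limit]
```

This mirrors the HAVING distance < 25 ... ORDER BY distance ... LIMIT 20 parts of the query, but it has to load every user first, so the SQL version is usually the better place to do the filtering.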
I would suggest using Vincenty's Inverse Formula (http://en.wikipedia.org/wiki/Vincenty's_formulae) instead of the Haversine Great Circle distance, since Vincenty's has been shown to be more accurate (Vincenty assumes the earth is an oblate spheroid instead of a perfect sphere).
Here's the original Vincenty paper for the formula:
http://www.ngs.noaa.gov/PUBS_LIB/inverse.pdf - Section 4
Here's the actual code from the Android platform that is used to calculate distance for distanceTo(Location), which uses Vincenty's Inverse Formula: https://github.com/android/platform_frameworks_base/blob/master/location/java/android/location/Location.java#L272
As to sorting distances based on a database query, for optimum performance you'll want to use a spatial database that allows spatial queries. MySQL has a spatial database plugin:
http://dev.mysql.com/doc/refman/5.0/en/spatial-extensions.html
Check out this post, which should give you the details to go from there, including notes on precision using Vincenty:
Geo-Search (Distance) in PHP/MySQL (Performance)
OK, I am building a takeaway finder that will find takeaways within a set distance of a UK postal code. The user puts his/her postcode in an input box and clicks submit, and the site then searches for takeaways near the user. The catch is that this search is based on each individual takeaway's delivery distance. So if a takeaway has a delivery distance of, say, 12 miles and the person's postcode is within 12 miles of the takeaway, it will show in the results.
So far I have a UK postcode database with long and lat coordinates. The takeaway database table holds each takeaway's own postcode and its delivery distance, but not the long and lat values of the takeaway's postcode.
What I am asking for is not so much the code but help with the logic of how to do this.
I have the following query that will find all postcodes within a set radius of a given long and lat, but I'm not sure if it's in miles or if it is as fast as it could be:
SELECT * , 6371 * ACos( Cos( RADIANS( latitude ) ) * Cos( RADIANS( 56.0062 ) ) * Cos( RADIANS( - 3.78189 ) - RADIANS( longitude ) ) + Sin( RADIANS( latitude ) ) * Sin( RADIANS( 56.0062 ) ) ) AS Distance
FROM postcodes
HAVING Distance <= '10'
ORDER BY Distance
LIMIT 3720 , 30
For performance, consider eliminating fields you don't need. The problem is you're sorting on a calculated value, so each row needs to be examined.
Ideally, you would perform additional filtering to reduce the number of rows you need. Perhaps matching prefixes of postal codes could help. You may make the observation that if the first X characters of the postal code don't match, then it must be more than 12 miles away.
If you have a lot of fields to retrieve, you can also see a big performance boost from late row lookup. In your case, this is particularly helpful, because you can provide a much smaller dataset for MySQL to sort.
The idea would be to pull only the IDs and distances for each record, sort them, and then pull the top N records (however many you need). You can then use the IDs you fetched to join back to the original table and retrieve the rest of the data. This helps because it allows MySQL to use less memory when performing the sort, and if your dataset isn't in memory, you can potentially avoid some disk seeks as well, depending on how big your rows are.
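The shape of a late row lookup can be illustrated in Python, with a dict standing in for the table. This is only a sketch of the idea, not MySQL behaviour: the names are hypothetical, and the distance expression is the same 6371 km one as in the question's query.

```python
import heapq
import math

def distance_km(lat1, lon1, lat2, lon2):
    """Same spherical expression as the query in the question (6371 = km)."""
    cos_angle = (math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
                 * math.cos(math.radians(lon2 - lon1))
                 + math.sin(math.radians(lat1)) * math.sin(math.radians(lat2)))
    return 6371.0 * math.acos(max(-1.0, min(1.0, cos_angle)))

def late_row_lookup(rows, lat, lon, n):
    """rows: dict mapping id -> full record with 'latitude' and 'longitude'."""
    # Step 1: work only with narrow (distance, id) pairs and keep the top n.
    pairs = ((distance_km(lat, lon, r["latitude"], r["longitude"]), rid)
             for rid, r in rows.items())
    top = heapq.nsmallest(n, pairs)
    # Step 2: fetch the wide records for just those n ids.
    return [dict(rows[rid], id=rid, distance=d) for d, rid in top]
```

In SQL the equivalent would be a subquery that selects only the id and the computed distance with ORDER BY and LIMIT, joined back to the full table on id.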
Another completely separate option. If you're focusing solely on the UK, you could consider using some type of projection into a Cartesian coordinate system. I believe OSGB is probably suitable for the UK, and should give minimal error.
This opens up the possibility of using MySQL's spatial extensions to add an R-tree index on a series of point columns. This wouldn't give you accurate enough distances on its own, but it would enable you to narrow down the data set to a significantly smaller portion where true distances could be efficiently calculated.