Way finding / navigation with neo4j

I know I'll probably get scolded for asking this, but I'm hoping someone can help point me in the right direction.
I'm trying to create a wayfinding application like Google Maps, but have it mark routes to multiple points on a map.
I am attempting to make an animal transport tool that allows multiple people to meet along a path from point A to B. I thought I read you can do this with a graph database.
Can anyone offer resources on map databases, articles, etc. to help me? I've been trying to google for articles but can't find any matches, so I assume I am using the wrong terms, like 'way finding' or 'directions'.
Thanks!

Graph databases work well for path finding, especially with shortestPath as mentioned by FrobberOfBits.
This works well as long as you have nodes that can form a path. From what I understand from your comments, you'll first need to build these nodes dynamically, depending on the users' driving preferences.
Also, apparently you don't currently have a data model for Point A to Point B?
This is how I would proceed, briefly:
The use case: point A is my home and point B is where the Neo4j community caretaker lives. On a map it would be:
All the little white points in the route from A to B can be Neo4j nodes.
Now say, for example, there is an awesome girl who is willing to drive up to 50 km in order to join the route, so the current route has to change.
As your application does not force users to choose an intermediate point, you will have to create it on the fly.
So my quick solution is to take the point on her 50 km radius that is closest to the destination point of the initial route.
To achieve this, you first need the initial bearing from the girl's current position to the destination (here Dresden),
where lat1 and lon1 are the position of the girl, and lat2 and lon2 the position of Dresden.
Then, given this initial bearing, you can calculate the position reached by starting from the girl and travelling along that bearing for a given distance, e.g. 50 km.
In PHP it would look like this (tested):
// Computes new coordinates starting from point A,
// given the bearing towards point B and a distance in km
function getAwayPosition($lat1, $lon1, $lat2, $lon2, $distance)
{
    // Initial bearing (true course) from point A to point B
    $la1 = deg2rad($lat1);
    $la2 = deg2rad($lat2);
    $lo1 = deg2rad($lon1);
    $lo2 = deg2rad($lon2);
    $y = sin($lo2 - $lo1) * cos($la2);
    $x = cos($la1) * sin($la2) - sin($la1) * cos($la2) * cos($lo2 - $lo1);
    $course = fmod(rad2deg(atan2($y, $x)) + 360, 360);
    // New lat/lon reached from lat1/lon1 along the course after $distance km
    $bearing = deg2rad($course); // back to radians for the coordinates calculation
    $d = $distance / 6371; // angular distance, Earth radius 6371 km
    $laa = asin(sin($la1) * cos($d) + cos($la1) * sin($d) * cos($bearing));
    $loo = $lo1 + atan2(sin($bearing) * sin($d) * cos($la1), cos($d) - sin($la1) * sin($laa));
    return array(
        'lat' => rad2deg($laa),
        'lon' => rad2deg($loo),
        'bearing' => $course
    );
}
NB: There is also the haversin function in Cypher for calculating earth distances, I didn't play that much with it though
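A quick way to sanity-check a computed meeting point is to measure its great-circle distance back to the starting position. The following is a minimal Python sketch of that check, assuming a spherical Earth of radius 6371 km. The function names and the sample coordinates are illustrative and not part of the original answer.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, same value as the PHP code


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def destination_point(lat, lon, bearing_deg, distance_km):
    """Point reached from (lat, lon) after distance_km along bearing_deg."""
    p1, l1 = math.radians(lat), math.radians(lon)
    brg = math.radians(bearing_deg)
    d = distance_km / EARTH_RADIUS_KM  # angular distance in radians
    p2 = math.asin(math.sin(p1) * math.cos(d) + math.cos(p1) * math.sin(d) * math.cos(brg))
    l2 = l1 + math.atan2(math.sin(brg) * math.sin(d) * math.cos(p1),
                         math.cos(d) - math.sin(p1) * math.sin(p2))
    return math.degrees(p2), math.degrees(l2)


# Hypothetical start position, moved 50 km due east.
lat, lon = destination_point(50.0, 10.0, 90.0, 50.0)
print(lat, lon, haversine_km(50.0, 10.0, lat, lon))
```

If the printed distance is not very close to the requested 50 km, the bearing or destination formula has most likely been mistyped.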
So you can add a node representing the position where the girl would join the trip.
So far, we now have 3 good nodes for Neo4j:
Now I think the rest of the process is up to your data model. If the route from A to B doesn't already exist as Neo4j nodes in your database, you can build it dynamically by calculating the distance between all the points with the Google Maps API and setting it as a relationship property.
Then you can do a Cypher shortestPath and reduce on the relationship distance property.
If you already have nodes representing multiple intermediate points of the initial route from A to B, you can use Neo4j Spatial to find the intermediate node of the route closest to the girl's desired position and create a relationship carrying the distance.
And again shortestPath for the best mapped route.
The second solution would be better, because you will immediately have a graph to play with:
And a simple Cypher query if you want to get the path with the fewest kilometers:
MATCH (a:Point {id:'A'}), (b:Point {id:'B'})
MATCH p=allShortestPaths((a)-[:NEXT_POINT*]-(b))
RETURN p, reduce(totalKm = 0, x in relationships(p)| totalKm + x.kilometers) as distance
ORDER BY distance ASC
There is also the Dijkstra algorithm with cost properties, available through the Neo4j REST API: http://neo4j.com/docs/stable/rest-api-graph-algos.html
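To illustrate why a cost-aware algorithm matters, here is a minimal Python sketch of Dijkstra's algorithm over a small in-memory graph. The node names and kilometre values are made up for the example. Cypher's shortestPath functions minimise the number of relationships, so the route with the fewest hops is not necessarily the one with the fewest kilometres.

```python
import heapq


def dijkstra_path(graph, start, goal):
    """Return (total_km, path) for the cheapest route, or (inf, []) if unreachable."""
    queue = [(0.0, start, [start])]
    settled = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if node in settled:
            continue
        settled.add(node)
        for neighbour, km in graph.get(node, {}).items():
            if neighbour not in settled:
                heapq.heappush(queue, (dist + km, neighbour, path + [neighbour]))
    return float("inf"), []


# Hypothetical undirected graph: NEXT_POINT relationships with a kilometers property.
edges = [("A", "P1", 40), ("P1", "B", 70), ("A", "P2", 30), ("P2", "P3", 20), ("P3", "B", 45)]
graph = {}
for u, v, km in edges:
    graph.setdefault(u, {})[v] = km
    graph.setdefault(v, {})[u] = km

print(dijkstra_path(graph, "A", "B"))
```

In this sample graph the two-hop route through P1 costs 110 km, while the three-hop route through P2 and P3 costs 95 km, so the weighted search prefers the longer chain of points.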
Some resources :
https://software.intel.com/node/341473
http://www.movable-type.co.uk/scripts/latlong.html
http://williams.best.vwh.net/avform.htm
So, in conclusion, if you want to use Neo4j to determine the best route, you'll need to help it with some data in the database ;-)

What you appear to be describing is a route from A to B via different waypoints (people's locations), but with the goal that the overall distance travelled from A to B is minimised. A path in a graph with these characteristics is known as a Minimum Spanning Tree. Crucially, the fact that people can travel to meet the "main route" simply introduces additional waypoints; it does not change the overall problem, or the way it is classically solved.
Neither Cypher nor the REST API has a function to calculate minimum spanning trees, and unfortunately simply collecting shortest paths between individual nodes (e.g. using Dijkstra's algorithm) and then trying to create a path from these shortest-path segments will not be optimal in the general case.
However it is not difficult to implement Prim's algorithm via a server plugin or unmanaged extension in Neo4j, and I would recommend you look into this.
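For reference, the core of Prim's algorithm is short. The following Python sketch runs on a plain dictionary rather than inside Neo4j, so it only shows the logic that a plugin or extension would have to reproduce. The graph layout and weights are assumptions made for the example.

```python
import heapq


def prim_mst(graph, root):
    """Return (total_weight, edges) of a minimum spanning tree.

    graph maps each node to {neighbour: weight} and is assumed to be
    undirected and connected.
    """
    visited = {root}
    frontier = [(w, root, v) for v, w in graph[root].items()]
    heapq.heapify(frontier)
    total, tree = 0, []
    while frontier and len(visited) < len(graph):
        w, u, v = heapq.heappop(frontier)
        if v in visited:
            continue  # this edge would close a cycle
        visited.add(v)
        total += w
        tree.append((u, v, w))
        for nxt, w2 in graph[v].items():
            if nxt not in visited:
                heapq.heappush(frontier, (w2, v, nxt))
    return total, tree


# Hypothetical waypoints with distances in km.
sample = {
    "A": {"B": 4, "C": 1},
    "B": {"A": 4, "C": 2, "D": 5},
    "C": {"A": 1, "B": 2, "D": 8},
    "D": {"B": 5, "C": 8},
}
print(prim_mst(sample, "A"))
```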

I'd start with this article. To ask a more specific question, you'll need a data model and an example of what you're trying to do.
But in general the language to use is probably "shortest path". Generally, if you're trying to find a path from point A to point B, you're looking for the "shortest path" where the "cost of the path" is minimal. Sometimes cost is distance, and sometimes cost is time - but either way you're usually looking for the shortest path, not just any path.

Related

how to implement the query based on the places range from x minutes to xx minutes time for drive?

There are some places I can choose from. I want to pick one as my source and then list every other place that can be reached from it in less than 30 minutes of driving.
So, what is the best way to store all this place data and query it with conditions like that?
Before asking this question, I tried storing the latitude and longitude of every place. Whenever a new place is saved to the database, I call the HERE Maps routing API to calculate the distance and drive time between the new place and every existing place in the database, then save the results in a distance table.
When a user wants to query places as in the example above, I join the place table and the distance table like this:
SELECT place.id, place.name FROM place JOIN distance ON distance.place_id = place.id WHERE distance.cost_time < 30;
One problem bothers me. If the number of existing places is large (and it will be), saving a new place hangs for longer and longer.
So I know this is a bad way to reach my goal, but I don't know what to do instead. Can someone help me with this problem?
Last but not least, please forgive my poor English. If something is unclear, I'll try my best to describe it. Thank you.

You probably need to build a connected graph and compute the distances to other points on the fly.
When a new point is added, compute its distance to its X nearest neighbours only and store those in the database.
Then you can use an algorithm like Dijkstra's to find all the points less than 30 units from your source.
You will lose some precision, as the cost of driving from A to C and then from C to B will usually be greater than that of the direct path from A to B. And the time you save when adding a new point is spent again on running Dijkstra's algorithm at query time.
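The query side of this idea can be sketched in Python as a Dijkstra search that stops expanding once the time budget is used up. The place names, the drive times and the dictionary layout are assumptions for the example. In practice the edges would come from the stored nearest-neighbour rows.

```python
import heapq


def reachable_within(graph, source, limit):
    """Map every place reachable in less than `limit` minutes to its travel time."""
    best = {source: 0}
    queue = [(0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbour, minutes in graph.get(node, {}).items():
            new_cost = cost + minutes
            if new_cost < limit and new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    del best[source]  # the source itself is not a result
    return best


# Hypothetical drive times in minutes between neighbouring places.
drive_times = {
    "home": {"cafe": 10, "mall": 35},
    "cafe": {"home": 10, "gym": 15},
    "gym": {"cafe": 15, "mall": 8},
    "mall": {"home": 35, "gym": 8},
}
print(reachable_within(drive_times, "home", 30))
```

Because the search never follows a route that already exceeds the budget, its cost depends on how many places lie inside the 30-minute area rather than on the size of the whole table.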

How to group objects based on longitude/latitude proximity using laravel/php

I have a group of users. The user count could be 50 or could be 2000. Each should have a long/lat that I have retrieved from Google Geo api.
I need to query them all, and group them by proximity and a certain count. Say the count is 12 and I have 120 users in the group. I want to group people by how close they are (long/lat) to other people. So that I wind up with 10 groups of people who are close in proximity.
I currently have the google geo coding api setup and would prefer to use that.
TIA.
-- Update
I have been googling about this for awhile and it appears that I am looking for a spatial query that returns groups by proximity.

Keep in mind that the cost of this problem grows quadratically with the number of users, as the number of distance calculations is tied to the square of the user count (it's actually N*(N-1) distances), so a 2000-user base would mean almost 4 million distance calculations on every pass. Keep that in mind when sizing the resources you need.
Are you looking to group them based on straight-line (actually great circle) distance or based on walking/driving distance?
If the former, the great circle distance can be approximated with simple math if you're able to tolerate a small margin of error and wish to assume the earth is a sphere. From GCMAP.com:
Earth's hypothetical shape is called the geoid and is approximated by
an ellipsoid or an oblate sphereoid. A simpler model is to use a
sphere, which is pretty close and makes the math MUCH easier. Assuming
a sphere of radius 6371.2 km, convert longitude and latitude to
radians (multiply by pi/180) and then use the following formula:
theta = lon2 - lon1
dist = acos(sin(lat1) × sin(lat2) + cos(lat1) × cos(lat2) × cos(theta))
if (dist < 0) dist = dist + pi
dist = dist × 6371.2
The resulting distance is in kilometers.
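As a sketch, the quoted formula translates to Python as follows. The function name is illustrative, and the clamp on the cosine value is an addition that protects acos from tiny floating-point overshoots when the two points are nearly identical.

```python
import math


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.2):
    """Spherical law of cosines distance in km; inputs are in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    theta = lon2 - lon1
    cos_angle = (math.sin(lat1) * math.sin(lat2)
                 + math.cos(lat1) * math.cos(lat2) * math.cos(theta))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding errors
    return math.acos(cos_angle) * radius_km


# A quarter of the way around the equator.
print(great_circle_km(0.0, 0.0, 0.0, 90.0))
```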
Now, if you need precise calculations and are willing to spend the CPU cycles needed for much more complex math, you can use Vincenty's Formulae, which use the WGS-84 reference ellipsoid model of the earth that is used for navigation, mapping and whatnot. More info HERE
As to the algorithm itself, you need to build a to-from matrix with the result of each calculation. Each row and column would represent each node. Two simplifications you may consider:
Distance does not depend on direction of travel, so $dist[n][m] == $dist[m][n] (no need to calculate the whole matrix, just half of it)
Distance from a node to itself is always 0, so there is no need to calculate it. But since you intend to group by proximity, to avoid a user being grouped with themselves, you may want to always force $dist[m][m] to an arbitrary, abnormally large constant ($dist[m][m] = 22000 (miles), for instance; this will work as long as all your users are on the planet).
After making all the calculations, use an array sorting method to find the X closest nodes to each node, and there you have it.
(You may or may not want to prevent a user from being placed in more than one group, but that's just business logic.)
Actual code would be a little too much to provide at this time without seeing some of your progress first, but this is basically what you need to do algorithmically.
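The steps above can be sketched in Python as follows. This is only one possible reading: it fills half of the matrix and mirrors it, then builds groups greedily by taking each still-unassigned user together with their nearest unassigned neighbours. The greedy rule and all names are assumptions, and the result is not guaranteed to be the best possible grouping.

```python
import math


def great_circle_km(a, b, radius_km=6371.2):
    """Spherical law of cosines distance between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    c = (math.sin(lat1) * math.sin(lat2)
         + math.cos(lat1) * math.cos(lat2) * math.cos(lon2 - lon1))
    return math.acos(max(-1.0, min(1.0, c))) * radius_km


def distance_matrix(points):
    """Symmetric matrix; only the upper half is computed, the rest is mirrored."""
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = great_circle_km(points[i], points[j])
    return dist


def greedy_groups(points, group_size):
    """Return lists of point indexes, each holding at most group_size members."""
    dist = distance_matrix(points)
    unassigned = set(range(len(points)))
    groups = []
    while unassigned:
        seed = min(unassigned)
        unassigned.remove(seed)
        nearest = sorted(unassigned, key=lambda j: dist[seed][j])[:group_size - 1]
        unassigned.difference_update(nearest)
        groups.append([seed] + nearest)
    return groups


# Hypothetical users: three around Berlin and three around Munich, interleaved.
users = [(52.52, 13.40), (48.14, 11.58), (52.53, 13.41),
         (48.15, 11.59), (52.50, 13.45), (48.10, 11.60)]
print(greedy_groups(users, 3))
```

Removing assigned users from the candidate set is what keeps a user out of a second group, so the "abnormally large constant" on the diagonal is not needed in this variant.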

... it appears that I am looking for a spatial query that returns groups by proximity. ...
You could use hdbscan. Your groups are actually clusters in hdbscan wording. You would need to work with min_cluster_size and min_samples to get your groups right.
https://hdbscan.readthedocs.io/en/latest/parameter_selection.html
https://hdbscan.readthedocs.io/en/latest/
It appears that hdbscan runs under Python.
Here are two links on how to call Python from PHP:
Calling Python in PHP,
Running a Python script from PHP
Here is some more information on which clustering algorithm to choose:
http://nbviewer.jupyter.org/github/scikit-learn-contrib/hdbscan/blob/master/notebooks/Comparing%20Clustering%20Algorithms.ipynb
http://scikit-learn.org/stable/modules/clustering.html#clustering

Use the GeoHash algorithm [1]. There is a PHP implementation [2]. You can pre-calculate geohashes at different precisions, store them in an SQL database alongside the lat/lon values, and query using a native GROUP BY.
[1] https://en.wikipedia.org/wiki/Geohash
[2] https://github.com/lvht/geohash
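To show what the stored value looks like, here is a small self-contained geohash encoder in Python, followed by grouping on a shared prefix. It is a sketch of the standard algorithm, not the API of the linked PHP library, and the user names and coordinates are made up.

```python
from collections import defaultdict

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=9):
    """Encode a latitude/longitude pair (degrees) as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, bits, bit_count = [], 0, 0
    use_lon = True  # bits alternate between longitude and latitude, longitude first
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits, lon_lo = (bits << 1) | 1, mid
            else:
                bits, lon_hi = bits << 1, mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits, lat_lo = (bits << 1) | 1, mid
            else:
                bits, lat_hi = bits << 1, mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def group_by_prefix(users, precision):
    """Group user names by the geohash cell they fall into."""
    groups = defaultdict(list)
    for name, (lat, lon) in users.items():
        groups[geohash_encode(lat, lon, precision)].append(name)
    return dict(groups)


# Hypothetical users: two in Berlin, one in Munich.
people = {"ann": (52.52, 13.40), "bob": (52.53, 13.41), "cy": (48.14, 11.58)}
print(group_by_prefix(people, 3))
```

One caveat of prefix grouping is that two users standing on opposite sides of a cell boundary get different hashes even when they are close together, so a grouping pass may also need to look at neighbouring cells.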

Sorting by distance in MySQL with spatial analysis functions and data types

I'm building a php web app with Laravel 5.5 and I need to display a list of places (eg. stores) sorted by their distance from a user-specified location.
The places will be stored in a MySQL database and should be retrieved as Eloquent ORM model instances.
Doing some research I found many posts and questions on this topic (presenting different solutions), but, having very little experience with databases and geolocation/geospatial analysis, they mostly confused me, and I'd like to know what approach to follow and what are the best practices in this case.
Most answers I read suggest using the haversine formula or the spherical law of cosines in the SQL query, which would look something like (example taken from this answer):
$sf = 3.14159 / 180; // scaling factor
$sql = "SELECT * FROM table
WHERE lon BETWEEN '$minLon' AND '$maxLon'
AND lat BETWEEN '$minLat' AND '$maxLat'
ORDER BY ACOS(SIN(lat*$sf)*SIN($lat*$sf) + COS(lat*$sf)*COS($lat*$sf)*COS((lon-$lon)*$sf))";
This post points out the fact that, over short distances, assuming the Earth flat and computing a simple euclidean distance is a good approximation and is faster than using the haversine formula.
Since I only need to sort places within a single city at a time, this seems to be a good solution.
However, most of these posts and SO answers are a few years old and I was wondering if there is now (MySQL 5.7) a better solution.
For example, none of those posts use any of MySQL's “Spatial Analysis Functions”, like ST_Distance_Sphere and ST_Distance, which seem to be exactly for that purpose.
Is there any reason (eg. performance, precision) not to use these functions instead of writing the formula in the query? (I don't know which algorithm is internally used for these functions)
I also don't know how I should store the coordinates of each place.
Most of the examples I've seen assume the coordinates to be stored in separate lat, lon columns as doubles or as FLOAT(10,6) (as in this example by google), but also MySQL POINT data type seems appropriate for storing geographic coordinates.
What are the pros and cons of these two approaches?
How can indexes be used to speed up these kinds of queries? For example I've read about “spatial indexes”, but I think they can only be used for limiting the results with something like MBRContains(), not to actually order the results by distance.
So, how should I store the coordinates of places and how should I query them to be ordered by distance?

Other than the ST_Distance_Sphere, 5.7 does not bring anything extra to the table. (SPATIAL was already implemented.)
For 'thousands' of points, the code you have is probably the best. Include
INDEX(lat, lng),
INDEX(lng, lat)
And I would not worry about the curvature of the earth unless you are stretching thousands of miles (kms). Even then the code and that function should be good enough.
Do not use FLOAT(m,n), use only FLOAT. The link below gives the precision available to FLOAT and other representations.
If you have so many points that you can't cache the table and its indexes entirely (many millions of points), you could use this , which uses a couple of tricks to avoid lengthy scans like the above solution. Because of PARTITION limitations, lat/lng are represented as scaled integers. (But that is easy enough to convert in the input/output.) The earth's curvature, poles, and dateline are all handled.
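For city-scale sorting, the flat-earth shortcut mentioned in the question can be sketched in Python as below. It scales the longitude difference by the cosine of the mean latitude and then applies Pythagoras. The function names, the dictionary keys and the sample places are assumptions made for the example.

```python
import math


def flat_distance_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Equirectangular approximation, adequate for points a few km apart."""
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return math.hypot(x, y) * radius_km


def nearest_places(places, lat, lon, limit=10):
    """Return up to `limit` places ordered by approximate distance from (lat, lon)."""
    return sorted(places, key=lambda p: flat_distance_km(lat, lon, p["lat"], p["lon"]))[:limit]


# Hypothetical stores within one city.
stores = [
    {"name": "navigli", "lat": 45.4520, "lon": 9.1770},
    {"name": "duomo", "lat": 45.4641, "lon": 9.1919},
    {"name": "castello", "lat": 45.4705, "lon": 9.1793},
]
for store in nearest_places(stores, 45.4642, 9.1900):
    print(store["name"])
```

The same expression can be moved into an ORDER BY clause once a bounding-box WHERE on the indexed lat/lng columns has cut the candidate rows down.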

I use a table I found that has lat & long associated with zip codes. I use the haversine formula to find all zip codes within a certain range. I then take the list of zip codes returned from that query and find all businesses with those zip codes. Maybe that solution will work for you. It was pretty easy to implement. This also eliminates having to know the lat and long for each business, as long as you know the zip code.

Use ST_DISTANCE_SPHERE or MBRContains to get the distance between points, or the points within a bound. This is much faster than writing out the Haversine formula in the query, which can't use indexes and is not built for querying distances, and MySQL is slow with range queries. Refer to the MySQL documentation.
The Haversine formula is probably fine for small applications, and most of the older answers refer to that solution because older versions of MySQL InnoDB did not have spatial indexes.
The broad method is as follows. The code below is from my working Java code; I hope you can tailor it for PHP as your needs require.
First, save the incoming data as a Point in the database (note that the Coordinate constructor uses the longitude, latitude convention):
GeometryFactory factory = new GeometryFactory();
Point point = factory.createPoint(new Coordinate(officeDto.getLongitude(), officeDto.getLatitude()));//IMP:Longitude,Latitude
officeDb.setLocation(point);
Create Spatial Indexes using the following in mysql
CREATE SPATIAL INDEX location ON office (location);
You might get the error "All parts of a SPATIAL index must be NOT NULL". That is because spatial indexes can only be created if the field is NOT NULL - in such a case convert the field to non-null
Finally, call the ST_DISTANCE_SPHERE function from your code as follows.
SELECT st_distance_sphere( office.getLocation , project.getLocation)
as distance FROM ....
Note: office.getLocation and project.getLocation both return POINT types. Native SQL method is as below from documentation
ST_Distance_Sphere(g1, g2 [, radius])
which returns the minimum spherical distance between two points and/or multipoints on a sphere, in meters, or NULL if any geometry argument is NULL or empty.

calculate distance

I am designing a recruitment database, and I need it to perform several tasks that all involve calculating distance:
1) calculate how far the candidate lives from the client
2) find the clients within a radius of the candidates available for work on any given day
3) count the candidates with the correct qualifications for the vacancy within a radius of the client
As you can see, I need to calculate distance in 2 main ways: 1) by radius, 2) as the crow flies. I would prefer the exact distance, but the first will do.
I know that I can integrate Google Maps or some other web-based mapping, but I want the system to be standalone so it can function without an internet connection.
The system will have an HTML5 front end, and the back end is in MySQL and PHP.
Thank you, biagio

The distance between 2 points can be calculated with the following formula (latitudes and longitudes in radians):
6371*acos(cos(LatitudeA)*cos(LatitudeB)*cos(longitudeB-longitudeA)+sin(LatitudeA)*sin(latitudeB))
Of course it's a "crow flies" approximation in km.
Which can be translated to PHP as:
$longA = deg2rad(2.3458); // coordinates must be converted to radians
$latA = deg2rad(48.8608);
$longB = deg2rad(5.0356);
$latB = deg2rad(47.3225);
$subBA = $longB - $longA; // plain float subtraction; bcsub() expects numeric strings
$cosLatA = cos($latA);
$cosLatB = cos($latB);
$sinLatA = sin($latA);
$sinLatB = sin($latB);
$distance = 6371 * acos($cosLatA * $cosLatB * cos($subBA) + $sinLatA * $sinLatB);
echo $distance;
With that you can compute the distance between two points (people) and of course determine whether a point is within a given radius of another.

Just a thought: If your clients and candidates live in a very limited area, I would use a rectangular map and enter x and y values for all cities in that area in a database.
Due to the pythagorean theorem, you can say:
distance² = (x-value client - x-value candidate)² + (y-value client - y-value candidate)²

First, isn't the radius actually the 'as crow flies' distance from the client to the candidates? Anyway...you will need to figure out an algorithm for computing what you need based on the distance between the points.
If you want the system to be stand-alone you will need to have the values for the coordinates (latitude, longitude) of the points. For this, you will probably need an internet connection and the use of a service like Google Maps to find the coordinates based on the addresses of the points. But once you store them in your db, you will not need an internet connection.
There is a formula for computing the distance between two points based on the coordinates, you will easily find it :)

Getting waypoints between 2 gps coordinates from cities table

How could I retrieve a list of cities which are en route (waypoints) between 2 GPS coordinates?
I have a table of all cities with their lat-lon.
So if I have a starting location (lat-lon) and an ending location (lat-lon)...
Surely it must be easy to determine the cities (from the table) to pass by as waypoints on the way from start (lat-lon) to end (lat-lon)?
I have looked at different algorithms and at bearings. It is still not clear to me.

If you're using the between point A and B method, then you'd just query the cities with Latitude and Longitude between the first and the second, respectively.
If you want to get the cities that are within X miles of a straight line from A to B, then you'd calculate the starting point and slope, and then query the cities that are within X miles of the resulting line.
If you're not using a simple point A to point B method which ignores roads, then you'll need some kind of data on the actual roads between A and B for us to give you an answer. This can be done using a Node system in your db, and it can also be done by using various geolocation APIs that are out there.
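The "within X of a straight line" variant can be sketched in Python as follows. It projects coordinates onto a local flat plane around the start point, which is an approximation that only holds over moderate distances, and then measures each city's distance to the segment. All names and sample coordinates are assumptions, and distances are in km rather than miles.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed spherical Earth


def to_local_xy(lat, lon, ref_lat, ref_lon):
    """Project (lat, lon) to km offsets east/north of a reference point."""
    x = math.radians(lon - ref_lon) * math.cos(math.radians(ref_lat)) * EARTH_RADIUS_KM
    y = math.radians(lat - ref_lat) * EARTH_RADIUS_KM
    return x, y


def project_on_segment(city, start, end):
    """Return (fraction along start->end, distance from the segment in km)."""
    bx, by = to_local_xy(*end, *start)
    px, py = to_local_xy(*city, *start)
    seg_len_sq = bx * bx + by * by
    t = 0.0 if seg_len_sq == 0 else max(0.0, min(1.0, (px * bx + py * by) / seg_len_sq))
    return t, math.hypot(px - t * bx, py - t * by)


def waypoints_between(cities, start, end, max_km):
    """Names of cities within max_km of the segment, ordered from start to end."""
    hits = []
    for name, pos in cities.items():
        t, offset = project_on_segment(pos, start, end)
        if offset <= max_km:
            hits.append((t, name))
    return [name for _, name in sorted(hits)]


# Hypothetical cities as (lat, lon); the route runs east along the equator.
towns = {"near_end": (0.05, 1.5), "near_start": (-0.1, 0.5), "far": (1.0, 1.0)}
print(waypoints_between(towns, (0.0, 0.0), (0.0, 2.0), 20))
```

In a database-backed version, a bounding-box query on the lat/lon columns would usually fetch the candidate cities first, with this check applied only to those rows.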

The solution to this can be found with standard discrete routing algorithms.
Those algorithms need a set of nodes (start, destination, your cities) and edges between those nodes (representing the possible roads or, more generally, the distances between the locations).
Nodes and edges form a graph. The start point and destination are known, so you can use algorithms like A* or Dijkstra to solve a route along this graph.
A typical problem with this approach is that you don't have definitions for the edges (the usable direct paths between locations). You could create such a "road network" in various ways, for example:
1) Initialize "Network_ID" with 0.
2) Take your starting location and find the closest other location. Measure the distance and multiply it by a factor.
3) Connect the current location to each location that is closer than this value and not yet connected to it. Add all locations connected in this step to a list.
4) Mark the current location with the current "Network_ID".
5) Repeat from step 2 for the next location on that list.
6) If your list runs out of locations, increment "Network_ID", choose a random location that has not yet been processed, and repeat from step 2.
After all locations have been processed you have one or more road networks (if more than one, they are not connected yet; add a suitable connecting edge between them, or restart the process with a greater factor).
You have to make sure that either start and destination have the same Network_ID or that both networks have been connected.
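A simplified variant of this construction can be sketched in Python. Instead of walking the list location by location, it connects every location to all others closer than factor times its nearest-neighbour distance, then labels the connected components with a network ID using a breadth-first search. The planar (x, y) coordinates in km, the factor value (assumed to be greater than 1) and all names are assumptions for the example.

```python
import math
from collections import deque


def build_network(locations, factor=1.5):
    """Connect each location to all others closer than factor * nearest distance."""
    edges = {name: set() for name in locations}
    for a, pa in locations.items():
        dists = {b: math.dist(pa, pb) for b, pb in locations.items() if b != a}
        if not dists:
            continue
        threshold = factor * min(dists.values())
        for b, d in dists.items():
            if d < threshold:
                edges[a].add(b)
                edges[b].add(a)
    return edges


def network_ids(edges):
    """Label every location with the ID of the connected network it belongs to."""
    ids, current = {}, 0
    for start in edges:
        if start in ids:
            continue
        ids[start] = current
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in edges[node]:
                if nxt not in ids:
                    ids[nxt] = current
                    queue.append(nxt)
        current += 1
    return ids


# Hypothetical locations as planar (x, y) coordinates in km.
places = {"a": (0, 0), "b": (10, 0), "c": (20, 0), "x": (500, 0), "y": (510, 0)}
print(network_ids(build_network(places)))
```

If the start and the destination end up with different IDs, the networks are disconnected, which corresponds to the case where an extra edge or a larger factor is needed.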

Hmm... I have used BETWEEN min AND max for something like this, but not quite the same.
try maybe:
SELECT * from `cities` WHERE `lat` BETWEEN 'minlat' AND 'maxlat' AND `lon` BETWEEN 'minlon' and 'maxlon';
something like that may work
look at mysql comparisons here:
http://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html

I know this is a late answer, but if you are still working on this problem you should read this:-
http://dev.mysql.com/doc/refman/5.6/en/spatial-extensions.html
