Graph theory in public transportation

Graph theory in public transportation - php

Hi,
I am developing a public transportation guide. In Europe and the US, Google Maps provides this, but in Turkey it does not. I have a database that contains the latitude and longitude of every station, along with other bus-line and station information. My plan is first to model the network as a graph (stations are vertices; edge weights are the distances between stations), connect the stations that are on the same bus line, and then find routes. After that I will display the route on Google Maps. I have finished the first step, connecting the stations. However, I then found a flaw in my plan, which is shown in the figure.
If a person wants to go from near A to K, the program should say:
Walk to station A
Get on bus 8 at station A
Get off bus 8 at station E
Walk to station H
Get on bus 970 at station H
Get off bus 970 at station K
But there is no connection between stations E and H, so a graph algorithm cannot find a route from A to K. I would have to define a walking path between E and H. However, this is only a small illustration of the city; there are over 6500 stations in total. How can I solve this problem?
One idea is to add connections between all stations within 1 km of each other, but I think that is inefficient.
Thanks.

Add a directed edge between E and H with a weight, say 'a'. Choose 'a' so that no other edge in your network has that weight. For example, you can choose 'a' to be 0, since presumably no edge in your network has weight zero (that would mean two stations with no distance between them). Then code your program so that whenever a selected route contains an edge with weight 'a', it says "Walk from station E to station H", or whichever stations that edge connects.
Alternatively, you can use two directed edges from one node to the other wherever walking is required: one weighted 0 and the other weighted by the distance between the two stations. Make sure that when your program encounters two such edges between, say, stations A and B, one with weight 0 and the other with weight 'x', it gives the instruction "Walk from station A to B, a distance of 'x' km."
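As a rough illustration of the walking-edge idea, here is a minimal Python sketch. It differs from the answer above in one respect: instead of recognising walking edges by a sentinel weight, each edge carries an explicit label (a bus line name or "walk"), so the walking distance can still count towards the route length. The station names, distances, and the helper names `add_edge`, `shortest_route` and `describe` are all made up for the example.

```python
import heapq

def add_edge(graph, src, dst, km, mode):
    """Store a directed edge; `mode` is a bus line name or the label "walk"."""
    graph.setdefault(src, []).append((dst, km, mode))
    graph.setdefault(dst, [])

def shortest_route(graph, start, goal):
    """Dijkstra's algorithm; returns a list of (src, dst, mode) legs, or None."""
    dist = {start: 0.0}
    prev = {}
    queue = [(0.0, start)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == goal:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, km, mode in graph[node]:
            nd = d + km
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = (node, mode)
                heapq.heappush(queue, (nd, nxt))
    if goal not in dist:
        return None
    legs, node = [], goal
    while node != start:
        src, mode = prev[node]
        legs.append((src, node, mode))
        node = src
    legs.reverse()
    return legs

def describe(legs):
    """Merge consecutive legs on the same bus line into get on / get off steps."""
    steps, i = [], 0
    while i < len(legs):
        src, dst, mode = legs[i]
        if mode == "walk":
            steps.append(f"Walk from station {src} to station {dst}")
            i += 1
            continue
        j = i
        while j + 1 < len(legs) and legs[j + 1][2] == mode:
            j += 1
        steps.append(f"Get on bus {mode} at station {src}")
        steps.append(f"Get off bus {mode} at station {legs[j][1]}")
        i = j + 1
    return steps

# Toy network: bus 8 runs A-B-E, bus 970 runs H-J-K, and E-H is a walking edge.
graph = {}
for a, b in [("A", "B"), ("B", "E")]:
    add_edge(graph, a, b, 1.0, "8")
add_edge(graph, "E", "H", 0.3, "walk")
for a, b in [("H", "J"), ("J", "K")]:
    add_edge(graph, a, b, 1.0, "970")
steps = describe(shortest_route(graph, "A", "K"))
print("\n".join(steps))
```

Keeping the real walking distance on the edge means the search does not treat walking as free, which is one possible way to stop it from preferring long walks over bus rides.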

Related

Calculating possible tournament outcomes

I am trying to write a section of code in PHP which will work out for each team the best and worst possible outcome from a round robin type tournament.
This code will be executed after each round of games and so will lookup the current W-L-T record for each team as well as the future schedule of games for each team (all of this information is already stored in a database).
My initial thought was to run through every permutation of the team rankings, remembering the extreme limits of each team's performance. However, on further thought I realise that for the twelve teams in this case that would mean over 479 million permutations (12!), which may take a while to calculate, let alone be concise code.
I have unfortunately reached, I fear, the limit of my imagination in devising a logic system to deal with this so any help anyone could offer would be great.
Cheers in advance
Edward

I'll assume a loss is worth 0 points, a tie 1 point and a win 2 points.
For each team t
    Sort the teams by their current points so that the last-place
    team(s) come first and the top teams come last. Put all teams tied with t before t.
    Let i be the position of team t in this list.
    From here on I'll name teams by their position in the list. So we have,
    from left to right, teams currently worse than i, teams tied with team i, team i,
    and finally teams better than i.
    Make a working copy of your matrix. For the rest of this
    iteration I'll implicitly refer to the working copy.
    Suppose (in the working copy) that team i loses all its remaining games.
    For j from 0 up to i
        Make a backup copy of the working copy.
        for( k := n-1 ; k > j and j is behind or tied with i ; k := k-1 )
            If k hasn't played j and j is behind i
                suppose that j beats k
            Else if k hasn't played j /* and j is tied with i */
                suppose that j ties k
        if j is still behind i
            revert to the backup made before the preceding loop
        discard the backup copy
        for all games j has yet to play suppose j loses
    At this point, all remaining games in the working copy are between teams ahead
    of team i; assume all remaining games are ties.
    Now (if we have really constructed a worst case scenario) the rank of team i
    in the working copy is the worst it can do. I.e. team i beats "count
I'm not completely sure this gives the exact lower bound. An upper bound would be symmetric.
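Since the construction above is not guaranteed to be exact, one way to check it on small cases is plain brute force over the outcomes of the remaining games. This is a different technique from the one described above (it enumerates 3^g outcomes for g remaining games, so it is only practical late in the tournament or as a test oracle). A minimal Python sketch, assuming the same 0/1/2 scoring and a made-up four-team example; `rank_bounds` is a hypothetical helper name:

```python
from itertools import product

def rank_bounds(points, remaining):
    """points: {team: current points}; remaining: list of (a, b) games still to play.
    Returns {team: (best_rank, worst_rank)}, where rank 1 is first place."""
    best = {t: len(points) for t in points}
    worst = {t: 1 for t in points}
    # Each game has three outcomes: a wins (2, 0), tie (1, 1), b wins (0, 2).
    for outcome in product([(2, 0), (1, 1), (0, 2)], repeat=len(remaining)):
        final = dict(points)
        for (a, b), (pa, pb) in zip(remaining, outcome):
            final[a] += pa
            final[b] += pb
        for t in points:
            ahead = sum(1 for u in points if final[u] > final[t])
            level = sum(1 for u in points if u != t and final[u] == final[t])
            best[t] = min(best[t], ahead + 1)            # ties broken in t's favour
            worst[t] = max(worst[t], ahead + level + 1)  # ties broken against t
    return {t: (best[t], worst[t]) for t in points}

# Hypothetical standings with two games left.
bounds = rank_bounds({"A": 4, "B": 2, "C": 2, "D": 0}, [("A", "D"), ("B", "C")])
print(bounds)
```

Teams level on points are counted in the team's favour for the best case and against it for the worst case; a real tie-break rule would narrow those bounds.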

round robin tournament home away distribution

I am writing a script that creates tournament fixtures using the round robin algorithm with the first team fixed, and it works well.
The problem is that when I create those fixtures I have to distribute home and away games as close as possible to an HAHAHA... pattern, where H is home and A is away. The constraint is that a team cannot play 3 home (or away) matches in a row.
What I tried is keeping track of how many home and away matches each team has played, so that the team with the lowest home or away count plays where it should.
For example
Team 1 (2 H and 1 A) VS Team 2 (with 2 H and 2 A)
Result would be :
Team 2(H) vs Team 1(A) // because Team 1 has played the fewest away games
Question: Is there another way to implement such a home/away distribution, and if so, what would be the idea behind it?

The equal distribution pattern that you seek is not readily available. The suggestion to do a 'random shuffle' does not solve the problem. Distributing teams equally with opponents, equally as home & visitor, and equally to play in the time/location slots can be done. There are different requirements that must be met for an even number of teams and an odd number of teams. Add to this that the math to create each schedule is totally different (for example a 7 team league schedule is different than an 8 team league).
Check out the information provided on this link about "equal distribution".
Equal distribution of teams, time slots, and home & visitor status is possible only if you have the correct number of time slots available for the number of teams you are scheduling. Understanding the structure of schedules is very important. Your question above about equal Home & Away (H & A) is answered in the link above. The best you can do is no more than two H or two A games in a row in each round robin. There is a minor exception where a team could have 3 Home or 3 Away games in a row when one round robin is ending and the next is starting. This only happens to a few teams and is unavoidable, but H & A is balanced at the end of every 2 round robins.
When scheduling teams for round robin play, in the simplest of terms you are looking to create a round robin of teams, a round robin of home & visitor status, and a round robin of time/location slots... all at the same time.
To further complicate the subject it takes a different number of round robins (one) to satisfy equal 'team' distribution, a different number of round robins (two) to satisfy 'home & visitor' balance, and a different number of round robins to satisfy 'time slot' balance. The number of round robins needed to balance all teams playing equally in all the time slots, for an even number of teams, is equal to half the number of teams being scheduled. This changes when scheduling an odd number of teams.
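For a single round robin with an even number of teams, the "no more than two in a row" property can be sketched with the usual circle method plus a simple alternation rule. The Python below is only an illustration, not the scheduling method from the linked site: the fixed team alternates home and away by round, and every other pairing alternates by its position in the circle. The function names `round_robin` and `longest_streak` are invented for the example.

```python
def round_robin(n):
    """Circle method for an even number of teams 0..n-1, with team n-1 fixed.
    Returns a list of rounds; each round is a list of (home, away) pairs."""
    if n % 2:
        raise ValueError("n must be even; add a dummy 'bye' team for odd n")
    m = n - 1
    rounds = []
    for r in range(m):
        # The fixed team alternates home and away from round to round.
        games = [(m, r) if r % 2 == 0 else (r, m)]
        for k in range(1, n // 2):
            a, b = (r + k) % m, (r - k) % m
            # Alternate the home side by position k in the circle.
            games.append((a, b) if k % 2 else (b, a))
        rounds.append(games)
    return rounds

def longest_streak(rounds, team):
    """Longest run of consecutive home (or consecutive away) games for one team."""
    pattern = ["H" if team == home else "A"
               for games in rounds for home, away in games if team in (home, away)]
    longest = run = 1
    for prev, cur in zip(pattern, pattern[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest

schedule = round_robin(8)
for number, games in enumerate(schedule, start=1):
    print(number, games)
print("longest H or A streak:", max(longest_streak(schedule, t) for t in range(8)))
```

With this rule each non-fixed team alternates H and A except around the round in which it meets the fixed team, which is where its one pair of consecutive home or away games can occur.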

@Bob R The 'unavoidable' exception of 3H or 3A at the join is in fact avoidable. See D. de Werra (1981) 'Scheduling in sports', in 'Studies on Graphs and Discrete Programming' (editor P. Hansen), North Holland, pp 381-395.

Data set creation for eventual integration into MySQL & Google Maps API for a website? (a la point-in-polygon, collision theorem, etc)

I've managed over the past few months to teach myself PHP, PDO & SQL, and have built a basic dynamic website with user registration/email activation/ and login logout functionality, following PHP/SQL best practices. Now I'm stuck on the next task...
I've created a huge dataset of squares/polygons (3 million+), each 1 minute of latitude & longitude in size, stored in a PHP array with a single set of coordinates (the top left corner). To extrapolate a square-like shape, I simply add 0.016 degrees (~1 minute) to each direction and generate the other 3 coordinates.
I now need to check that each polygon in said array is over at least some portion of land in the United States, i.e. if one were to produce a graphical output of my completed data set and take a look at the San Francisco coastline, they'd see something like this.
It's similar to the point-in-polygon problem, except it's dealing with another polygon instead of a point, the other polygon is a country border, and I'm not just looking at intersections. I want to check if:
A polygon/square intersects with the polygon. (Think coastline/border).
A polygon/square is inside the polygon. (Think continental U.S.).
A polygon/square contains part of the polygon. (Think small island).
This is illustrated with my crudely drawn image:
If it matches any of these three conditions, I want to keep the square. If it does not interact with the big polygon in any way (i.e. it's over water), discard it.
I was thinking the big polygon would be a shapefile of the U.S., that or a KML file which I could strip the coordinates out of to create a very complex polygon from.
Then, I thought I'd pass these matching squares and square ID's over to a csv file for integration into a MySQL table containing a set of coordinates of each square (in fact, I'm not even sure of the best practices for handling tables of that size in MySQL, but I'll come to that when need be). The eventual goal would then be to develop a map using Google Maps API via Javascript to display these squares over a map on the website I'm coding (obviously only showing squares within the viewpoint to make sure I don't tax my db to death). I'm pretty sure I'd have to pass such information through PHP first, too. But all of that seems relatively easy compared to the task of actually making said data set.
This is obviously something that cannot be done by hand, so it needs automating. I know a bit of Python, so would that be of help? Any other tips on where to start? Someone willing to write some of the code for me?

Here is a solution that will be efficient, and as simple as possible to implement. Note that I do not say simple, but as simple as possible. This is a tricky problem, as it turns out.
1) Get U.S. polygon data using Shapefiles or KML, which will yield a set of polygon shapes (land masses), each defined by a list of vertices.
2) Create a set of axis-aligned bounding box (AABB) rectangles for the United States: one for Alaska and each Alaskan island, one for each Hawaiian island, one for the Continental United States, and one for each little island off the coast of the Continental U.S. (e.g., Bald Head Island in N.C., Catalina off the coast of California). Each bounding box is defined as a rectangle whose corners are the minimum and maximum latitude and longitude for the shape. My guess is that there will be a few hundred of these. For example, for the Hawaiian island chain as a whole, the latitude runs 18°55′N to 28°27′N, and the longitude runs 154°48′W to 178°22′W. Most of your global lat/long pairs get thrown out at this step, as they are not in any of those few hundred bounding boxes. For example, your square at 10°20'W, 30°40'N (a spot in the Atlantic Ocean off the coast of Africa, not far from Las Palmas in the Canary Islands) does not overlap Hawaii, because 10°20'W is less than 154°48′W. This bit would be easy to code in Python.
3) If the lat/long pair DOES overlap one of the several hundred AABB rectangles, you then need to test it against the single polygon within the AABB rectangle. To do this it is strongly recommended to use the Minkowski Difference (MD). Please thoroughly review this website first:
http://www.wildbunny.co.uk/blog/2011/04/20/collision-detection-for-dummies/
In particular, look at the "poly versus poly" demo halfway down the page, and play with it a little. When you do, you will see that when you take the MD of the 2 shapes, if that MD contains the origin, then the two shapes are overlapping. So, all you need to do then is take the Minkowski Difference of the 2 polygons, which itself results in a new polygon (B - A, in the demo), and then see if that polygon contains the origin.
4) There are many papers online regarding algorithms to implement MD, but I don't know if you'll have the ability to read the paper and translate that into code. Since it is tricky vector math to take the MD of the two polygons (the lat/long rectangle you're testing, and the polygon contained in the bounding box which overlapped the lat/long rectangle), and you have told us that your experience level is not high yet, I would suggest using a library that already implements MD, or even better, implements collision detection.
For example:
http://physics2d.com/content/gjk-algorithm
Here, you can see the relevant pseudo-code, which you could port into Python:
if aO cross ac > 0 //if O is to the right of ac
    if aO dot ac > 0 //if O is ahead of the point a on the line ac
        simplex = [a, c]
        d =-((ac.unit() dot aO) * ac + a)
    else // O is behind a on the line ac
        simplex = [a]
        d = aO
else if ab cross aO > 0 //if O is to the left of ab
    if ab dot aO > 0 //if O is ahead of the point a on the line ab
        simplex = [a, b]
        d =-((ab.unit() dot aO) * ab + a)
    else // O is behind a on the line ab
        simplex = [a]
        d = aO
else // O is both to the right of ac and to the left of ab
    return true //we intersect!
If you are unable to port this yourself, perhaps you could contact either of the authors of the 2 links I've included here--they both implemented the MD algorithm in Flash, perhaps you could license the source code.
5) Finally, assuming that you've handled the collision detection, you can simply store in the database a boolean as to whether the lat/long pair is part of the United States. Once that's done, I have no doubt you will be able to do as you'd like with your Google Maps piece.
So, to sum up, the only difficult piece here is to either 1) implement the collision detection GJK algorithm, or alternatively, 2) write an algorithm that will first calculate the Minkowski Difference between your lat/long pair and the land polygon contained within your AABB and then secondly see if that MD polygon contains the origin. If you use that approach, Ray Casting (typical point-in-a-polygon solution) would do the trick with the second part.
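For readers who want something to experiment with before tackling GJK, here is a minimal Python sketch of the bounding-box rejection from step 2 followed by an exact test. Note that the exact test is not the Minkowski Difference approach described above: it swaps in three simpler checks (a square corner inside the land polygon by ray casting, a land vertex inside the square, or a pair of crossing edges), which also cope with non-convex outlines. It assumes each square is given by its top-left corner and is 1/60 of a degree wide, ignores the antimeridian, and uses made-up function names.

```python
def boxes_overlap(a, b):
    """a, b: (min_lon, min_lat, max_lon, max_lat) axis-aligned boxes."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def point_in_polygon(x, y, poly):
    """Ray casting: toggle on every polygon edge crossed by a ray going right."""
    inside = False
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            inside = not inside
    return inside

def segments_cross(p, q, r, s):
    """True if segment pq properly crosses segment rs (touching is ignored)."""
    def side(a, b, c):
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return side(p, q, r) * side(p, q, s) < 0 and side(r, s, p) * side(r, s, q) < 0

def square_touches_land(lon, lat, land, size=1 / 60):
    """lon, lat: top-left corner of the square; land: list of (lon, lat) vertices."""
    box = (lon, lat - size, lon + size, lat)
    xs, ys = [p[0] for p in land], [p[1] for p in land]
    if not boxes_overlap(box, (min(xs), min(ys), max(xs), max(ys))):
        return False  # cheap rejection: the bounding boxes do not even overlap
    corners = [(box[0], box[1]), (box[2], box[1]), (box[2], box[3]), (box[0], box[3])]
    if any(point_in_polygon(cx, cy, land) for cx, cy in corners):
        return True   # square is inside the land or straddles its border
    if any(box[0] <= x <= box[2] and box[1] <= y <= box[3] for x, y in land):
        return True   # a land vertex lies in the square (e.g. a small island)
    land_edges = list(zip(land, land[1:] + land[:1]))
    square_edges = list(zip(corners, corners[1:] + corners[:1]))
    return any(segments_cross(p, q, r, s)
               for p, q in land_edges for r, s in square_edges)

land = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]           # toy triangular "land mass"
print(square_touches_land(1.0, 2.0, land, size=1.0))  # square inside the triangle
print(square_touches_land(3.0, 4.0, land, size=1.0))  # square over "water"
```

In practice the land bounding boxes would be computed once per polygon rather than on every call; they are recomputed here only to keep the sketch short.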
I hope this gives you a start in the right direction!

I think this other question answers a good portion of what you are trying to do
How do I determine if two convex polygons intersect?
The other portion is that if you are using a database, I would load in all polygons near your view point from both sets (the set of the map polygons and the set of other polygons you generated) and then run the above algorithm on this smaller set of polygons and you can generate a list of all polygons in your set that should be overlayed on the map.

Algorithm that creates "teams" based on a numeric skill value

I am building an application that helps manage frisbee "hat tournaments". The idea is that people sign up for a "hat tournament", and when they sign up, they provide us with a numeric value between 1 and 6 which represents their skill level.
Currently, we are taking this huge list of people who signed up, and manually trying to create teams out of this based on the skill levels of each player. I figured, I could automate this by creating an algorithm that splits up the teams as evenly as possible.
The only data feeding into this is the array of "players" and a desired "number of teams". Generally speaking we are looking at 120 players and 8 teams.
My current thought process is to basically have a running "score" for each team. This running score is the total of all assigned players skill levels. I loop through each skill level. I go through rounds of picks once inside skill level loop. The order of the picks is recalculated each round based on the running score of a team.
This actually works fairly well, but it's not perfect. For example, I had a range of 5 pts in my sample data array. I could very easily swap players around manually and make the discrepancy between teams no more than 1 pt. The problem is getting that done programmatically.
Here is my code thus far: http://pastebin.com/LAi42Brq
Snippet of what data looks like:
[2] => Array
    (
        [user__id] => 181
        [user__first_name] => Stephen
        [user__skill_level] => 5
    )
[3] => Array
    (
        [user__id] => 182
        [user__first_name] => Phil
        [user__skill_level] => 6
    )
Can anyone think of a better, easier, more efficient way to do this? Many thanks in advance!!

I think you're making things too complicated. If you have T teams, sort your players according to their skill level. Choose the top T players to be captains of the teams. Then, starting with captain 1, each captain in turn chooses the player (s)he wants on the team. This will probably be the person at the top of the list of unchosen players.
This algorithm has worked in playgrounds (and, I dare say on the frisbee fields of California) for aeons and will produce results as 'fair' as any more complicated pseudo-statistical method.

A simple solution could be to first generate a team selection order; each team then "selects" the highest-skilled player available. For the next round the order is reversed: the last team to select a player gets first pick and the first team gets the last pick. You reverse the picking order every round.
First round picking order could be:
A - B - C - D - E
second round would then be:
E - D - C - B - A
and then
A - B - C - D - E etc.
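A minimal Python sketch of this reversing pick order (often called a snake draft), assuming players are represented only by their skill values; with the real data you would sort the player arrays by `user__skill_level` instead. The function name `snake_draft` is just for illustration.

```python
def snake_draft(skills, team_count):
    """Deal players, strongest first, in a pick order that reverses every round."""
    teams = [[] for _ in range(team_count)]
    ranked = sorted(skills, reverse=True)
    for start in range(0, len(ranked), team_count):
        round_picks = ranked[start:start + team_count]
        order = range(team_count)
        if (start // team_count) % 2:
            order = reversed(order)  # odd rounds pick in the opposite direction
        for team, skill in zip(order, round_picks):
            teams[team].append(skill)
    return teams

teams = snake_draft([6, 5, 5, 3, 3, 1], 2)
print(teams, [sum(t) for t in teams])
```

Because every team picks once per round, team sizes differ by at most one player.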

It looks like this problem really is NP-hard, being a variant of the Multiprocessor scheduling problem.
"h00ligan"'s suggestion is similar to the LPT (longest processing time first) algorithm.
Another heuristic strategy would be a variation of this algorithm:
First round: pick the best, second round: pair the teams with the worst (add from the end), etc.
With the example "6,5,5,3,3,1" and 2 teams this would give the teams "6,1,5" (=12) and "5,3,3" (=11). The strategy of "h00ligan" would give the teams "6,3,3" (=12) and "5,5,1" (=11).
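For comparison, here is a minimal Python sketch of the LPT greedy rule itself: take players from strongest to weakest and always give the next one to the team with the lowest running score. It works on bare skill values and, as written, does not force equal team sizes, which a real hat tournament would probably want; `lpt_teams` is a made-up name.

```python
import heapq

def lpt_teams(skills, team_count):
    """LPT greedy: each player, strongest first, joins the team with the lowest total."""
    heap = [(0, i) for i in range(team_count)]  # (running score, team index)
    teams = [[] for _ in range(team_count)]
    for skill in sorted(skills, reverse=True):
        total, i = heapq.heappop(heap)
        teams[i].append(skill)
        heapq.heappush(heap, (total + skill, i))
    return teams

lpt = lpt_teams([6, 5, 5, 3, 3, 1], 2)
print(lpt, [sum(t) for t in lpt])
```

On this particular six-player example the greedy rule ends with team totals of 12 and 11, the same split as the reversing pick order.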

This problem is unfortunately NP-Hard. Have a look at bin packing which is probably a good place to start and includes an algorithm you can hopefully tweak, this may or may not be useful depending on how "fair" two teams with the same score need to be.

Coordinates of world continents

I am building a script powered by Google Maps where the user chooses a location from a map and the longitude and latitude are saved in a database as part of a directory listing form.
I was thinking of adding search by world continent (America, Asia, Europe, Africa).
But this requires having the coordinates of these regions, e.g. America: long between 'xx' and 'yy', lat between 'aa' and 'bb', so I can look them up in the database.
I can't seem to find this information anywhere.
Any help would be appreciated.

Ok, as a quick solution for this search functionality, I'd set-up two tables in a database. One would map every listing to a country, another would map every country to a continent, so that search could be performed joining these two tables. Use google geocoding to get country from latitude/longitude if needed.
A mysql continent/country database can be found here.
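The two-table idea can be sketched as follows. The example uses Python with an in-memory SQLite database purely so that it is self-contained; the same join works in MySQL. The table names, column names, and sample rows are invented for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE country (code TEXT PRIMARY KEY, continent TEXT NOT NULL);
    CREATE TABLE listing (id INTEGER PRIMARY KEY, name TEXT, lat REAL, lng REAL,
                          country_code TEXT REFERENCES country(code));
    INSERT INTO country VALUES ('FR', 'Europe'), ('EG', 'Africa'), ('JP', 'Asia');
    INSERT INTO listing VALUES
        (1, 'Paris shop', 48.86, 2.35, 'FR'),
        (2, 'Cairo depot', 30.04, 31.24, 'EG'),
        (3, 'Tokyo office', 35.68, 139.69, 'JP');
""")

def listings_in(continent):
    """Return the names of all listings whose country belongs to `continent`."""
    rows = db.execute(
        "SELECT l.name FROM listing AS l "
        "JOIN country AS c ON c.code = l.country_code "
        "WHERE c.continent = ? ORDER BY l.id", (continent,))
    return [name for (name,) in rows]

print(listings_in("Europe"))
```

The country code for each listing would be filled in once, when the listing is saved, for example from a reverse-geocoding lookup of its latitude and longitude.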

This website will help you quickly get your own bounding boxes:
http://bboxfinder.com
For example, for one project I didn't want to include Europe all the way North to Knivskjellodden or east to the Urals, so I drew this:

There are no simple "lat/lng bounding boxes" for the continents because the continents have irregular boundaries. For example, Ankara in Turkey is at approximately lat=40, lng=33. But latitude 40 crosses Europe, Asia and America, and longitude 33 crosses Europe, Asia and Africa.
Similarly, there are no simple "lat/lng bounding boxes" for states.

If you know the tiling algorithm of google maps you can try the spatial index to get all geo codes from a bounding box from a continent. Like the tiling algorithm this method is not very accurate.
