Parsing huge tables in tables with skipping?

I'm developing an Android application that parses JSON and looks for two identical words for the current day.
For now, I'm only asking for the best way to parse all that HTML code into JSON.
The site is in Dutch, but I'll try to guide you a bit.
The top row says Maandag through Vrijdag, which are the days (Monday to Friday) from left to right.
I need to skip these cells, and also the cells that say 1, 2, 3, 4, 5 from top to bottom.
Here is the catch: each cell of the 'main' table has a table in it. So a table in a cell.
I only need the third cell from the table in the cell.
For example:
At "Vrijdag", "1" it says: "L119"
I only need these values, together with the day and the number.
So the result is probably going to be a 3D JSON array.
I can explain further if needed.
To conclude:
I need a way to parse the classrooms (e.g. "L119", "D60", "S-5", "C162") together with the days and the numbers on the left into a 3D JSON array.
It would be awesome if you could send the source code, but if you do, please also provide a simple explanation. I will also put your name in the credits if you want. But it will be a Dutch app.

Never mind, I figured it out myself.
In case you wonder how I did it:
I used a library called "Simple_HTML_Dom.php" to find the right table and its content!
-Tim
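
For readers who want something concrete, here is a minimal sketch of the same idea. It does not use Simple HTML DOM; it swaps in PHP's bundled DOM extension (DOMDocument and DOMXPath) so that it runs without extra downloads. The markup, the `rooster` table id, and the function name `parseTimetable` are all assumptions made up for illustration, since the real page is not shown in the question.

```php
<?php
// Hypothetical markup standing in for the real timetable page.
$html = <<<'HTML'
<table id="rooster">
  <tr><td></td><td>Maandag</td><td>Dinsdag</td></tr>
  <tr>
    <td>1</td>
    <td><table><tr><td>wis</td><td>JAN</td><td>L119</td></tr></table></td>
    <td><table><tr><td>ned</td><td>PIE</td><td>D60</td></tr></table></td>
  </tr>
  <tr>
    <td>2</td>
    <td><table><tr><td>eng</td><td>KLA</td><td>S-5</td></tr></table></td>
    <td></td>
  </tr>
</table>
HTML;

function parseTimetable(string $html): array
{
    libxml_use_internal_errors(true);   // real-world HTML is rarely valid
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    $xpath = new DOMXPath($doc);

    // Only the direct rows of the outer table (with or without <tbody>).
    $rows = $xpath->query(
        '//table[@id="rooster"]/tr | //table[@id="rooster"]/tbody/tr'
    );

    $days = [];
    $timetable = [];
    for ($r = 0; $r < $rows->length; $r++) {
        $cells = $xpath->query('./td | ./th', $rows->item($r));
        // Column 0 is skipped: it holds the corner cell or the period number.
        for ($c = 1; $c < $cells->length; $c++) {
            $cell = $cells->item($c);
            if ($r === 0) {
                $days[$c] = trim($cell->textContent);   // header row: day names
                continue;
            }
            $period = (int) trim($cells->item(0)->textContent);
            $inner  = $xpath->query('.//table//td', $cell);
            if ($inner->length >= 3) {
                // Third cell of the nested table is the classroom.
                $timetable[$days[$c]][$period] = trim($inner->item(2)->textContent);
            }
        }
    }
    return $timetable;
}

echo json_encode(parseTimetable($html), JSON_PRETTY_PRINT), PHP_EOL;
```

The result is keyed by day, then by period number, which is one way to read the "3D JSON array" the question asks for. Cells without a nested table are simply left out.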

Related

Sorting JSON array from last to first by date in PHP, using a variable

Thanks for taking the time to look at my question. I'm working on something dynamic that has data added daily. The JSON file is sorted from first to last, but I want to go from last (most recent) to first.
What I did was get the total number of entries by simply using
$total = $response['result']['total'];
which gives me the total, 84. If I want the previous one in the sequence, I just write:
echo $total - 1;
which gives me 83.
This is where the problem comes in.
I'm accessing my data like this:
echo $reo[0]['previous_day_doses_administered'];
Where the [0] is, I want $total - 1.
However, I'm not sure how to get there, as my attempt would be
$reo[$response['result']['total']]['previous_day_doses_administered'];
which is definitely wrong. I'm wondering how to convert $total to an integer so that it is no longer tied to the JSON array. I just want the last 10 entries to show up instead of the first 10.
So, that's basically it. I've tried a couple of things I found on here, but they broke everything. The API updates daily. If you want the link, let me know.
Thanks for taking the time to read this, and sorry if this is confusing!
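
A minimal sketch of what the question seems to be after, under stated assumptions: the shape of `$response`, the `records` key, and the `date` field are hypothetical, because the real API response is not shown. The point is that any integer expression such as `$total - 1` can be used directly as an array index, and that `array_slice()` with a negative offset returns the last entries.

```php
<?php
// Hypothetical stand-in for the decoded API response ("records" and "date" are assumptions).
$response = [
    'result' => [
        'total'   => 5,
        'records' => [
            ['date' => '2021-03-01', 'previous_day_doses_administered' => 100],
            ['date' => '2021-03-02', 'previous_day_doses_administered' => 200],
            ['date' => '2021-03-03', 'previous_day_doses_administered' => 300],
            ['date' => '2021-03-04', 'previous_day_doses_administered' => 400],
            ['date' => '2021-03-05', 'previous_day_doses_administered' => 500],
        ],
    ],
];

$reo   = $response['result']['records'];
$total = (int) $response['result']['total'];

// The most recent entry: an expression is a valid index.
$latest = $reo[$total - 1];

// The last three entries, most recent first. Use -10 for the last ten.
$recent = array_reverse(array_slice($reo, -3));

foreach ($recent as $entry) {
    echo $entry['date'], ': ', $entry['previous_day_doses_administered'], PHP_EOL;
}
```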

Searching for matrix way finding algorithm

I am developing a board game in PHP and I'm having trouble writing an algorithm.
The game board is a multidimensional array ($board[10][10]) that defines the rows and columns of the board.
I have to loop through the complete board, but with a dynamic start point. For example, the user selects cell [5,6], and this is the start point for the loop. The goal is to find all available board cells around the selected cell, which are the target cells for a move method. I think I need an efficient way to do this. Does anyone know an algorithm that loops through a matrix, visiting every field only once, to find the available and used cells?
Extra rule:
In the attached picture a blue field is selected (it is a little bigger than the others). The available fields are only on the right side. The fields on the left side are free but not reachable from the currently selected position. I think this extra rule makes the algorithm a bit more complicated.
Big thanks so far!
Kind regards

I'm not completely sure that I got the requirements right, so let me restate them:
You want an efficient algorithm to loop through all elements of an nxn matrix with n approximately 10, which starts at a given element (i,j) and is ordered by distance from (i,j)?
I'd loop through a distance variable d from 0 to n/2,
then for each value of d loop l from -(2*d) to +(2*d)-1 and
pick the cells (i+d, j+l); if i>=0 also pick (i+l, j-d) and (i+l, j+d).
For each cell you have to apply a modulo n to map negative indexes back into the matrix.
This basically treats the matrix as a torus, gluing the upper and lower edges as well as the left and right edges together.
If you don't like that, you can let d run up to n and, instead of a modulo operation, just ignore values outside the matrix.
These approaches give you the fields directly in the correct order. For small fields I doubt that any kind of optimization on this level has much of an effect in most situations; Nicholas's approach might be just as good.
Update
I slightly modified the cells to pick in order to honor the rule 'only consider fields that are to the right of the current column or in the same column'.
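
A sketch of the "ignore values outside the matrix" variant, written as one possible reading of the idea rather than the exact index arithmetic above. It walks square rings of growing distance d around the start cell (Chebyshev distance, an assumption) and skips out-of-range cells. The function name `cellsByDistance` is made up for illustration, and the right-hand-side rule from the update is not included.

```php
<?php
/**
 * Returns every cell of an n x n board exactly once,
 * ordered by ring distance from the start cell ($i, $j).
 */
function cellsByDistance(int $n, int $i, int $j): array
{
    $cells = [[$i, $j]];                       // distance 0: the start cell itself
    for ($d = 1; $d < $n; $d++) {
        for ($l = -$d; $l <= $d; $l++) {
            // Top and bottom edge of the ring, corners included.
            $candidates = [[$i - $d, $j + $l], [$i + $d, $j + $l]];
            // Left and right edge, corners excluded to avoid duplicates.
            if (abs($l) < $d) {
                $candidates[] = [$i + $l, $j - $d];
                $candidates[] = [$i + $l, $j + $d];
            }
            foreach ($candidates as [$row, $col]) {
                // Ignore anything that falls outside the board.
                if ($row >= 0 && $row < $n && $col >= 0 && $col < $n) {
                    $cells[] = [$row, $col];
                }
            }
        }
    }
    return $cells;
}

$order = cellsByDistance(10, 5, 6);
echo count($order), " cells, starting at [", implode(',', $order[0]), "]", PHP_EOL;
```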

If your map is only 10x10, I'd loop through from [0][0], collecting all the possible spaces for the player to move, then grade the spaces by distance to current player position. N is small, so the fact that the algorithm has O(N^2) shouldn't affect your performance much.
Maybe someone with more background in algorithms has something up their sleeve.
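
A minimal sketch of this collect-then-grade approach. It assumes that a value of 0 marks a free cell and that distance means the larger of the row and column offsets; both are assumptions, as is the function name `freeCellsByDistance`. Reachability (the extra rule in the question) is not checked here.

```php
<?php
/**
 * Scans the whole board from [0][0], collects the free cells
 * and sorts them by distance to the selected position.
 */
function freeCellsByDistance(array $board, int $row, int $col): array
{
    $free = [];
    foreach ($board as $r => $columns) {
        foreach ($columns as $c => $value) {
            $isStart = ($r === $row && $c === $col);
            if ($value === 0 && !$isStart) {   // assumption: 0 means "free"
                $free[] = [
                    'row'  => $r,
                    'col'  => $c,
                    'dist' => max(abs($r - $row), abs($c - $col)),
                ];
            }
        }
    }
    usort($free, fn($a, $b) => $a['dist'] <=> $b['dist']);
    return $free;
}

// Hypothetical 4x4 board with one occupied cell.
$board = array_fill(0, 4, array_fill(0, 4, 0));
$board[0][1] = 1;
$free = freeCellsByDistance($board, 0, 0);
echo count($free), " free cells, nearest at distance ", $free[0]['dist'], PHP_EOL;
```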

Mysql query to show results from between two columns

Hi, I have a MySQL database with two columns, Year_from & Year_two.
What I am trying to do is find a way to show the years in between as buttons. For example, if year from is 2006 and year to is 2008, I of course want to show 2006, 2007 and 2008. Is this possible, given that the value 2007 isn't in the database?
I haven't written any code yet, as I am not sure whether this is possible or how I would achieve it.
Any ideas would be appreciated.
Thanks

Use range($year_from, $year_to) to generate a list of all years. Compare that array with the one you got from the database using array_diff() and bold the missing ones.
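
A short sketch of that suggestion. The values in `$yearsInDatabase` are hypothetical and stand in for whatever the query returns.

```php
<?php
$yearFrom = 2006;
$yearTo   = 2008;

// Hypothetical values read from the database.
$yearsInDatabase = [2006, 2008];

// Every year in the interval, endpoints included.
$allYears = range($yearFrom, $yearTo);

// Years that are in the interval but not stored; array_values() reindexes from 0.
$missing = array_values(array_diff($allYears, $yearsInDatabase));

echo 'All: ', implode(', ', $allYears), PHP_EOL;
echo 'Missing: ', implode(', ', $missing), PHP_EOL;
```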

This needs to be done in code.
Basically, you'll read a bunch of records from the database, iterate through an array of years you'd like to show and check (every iteration) if there is a matching record present in the result set.
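
The loop described here could look roughly like the sketch below. The stored years, the CSS class names, and the button markup are assumptions for illustration only.

```php
<?php
$yearFrom = 2006;
$yearTo   = 2008;

// Hypothetical years found in the result set.
$yearsInDatabase = [2006, 2008];

// Flip into a lookup table so each check is a cheap isset().
$present = array_flip($yearsInDatabase);

$buttons = [];
foreach (range($yearFrom, $yearTo) as $year) {
    $class     = isset($present[$year]) ? 'in-db' : 'missing';
    $buttons[] = sprintf('<button class="%s">%d</button>', $class, $year);
}

echo implode("\n", $buttons), PHP_EOL;
```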

Twitter style trends with php/mysql

I am coding a social network and I need a way to list the most used trends. All statuses are stored in a content field, so what I need to do is match hashtag mentions such as: #trend1 #trend2 #anothertrend
and sort by them. Is there a way I can do this with MySQL, or would I have to do it in PHP?
Thanks in advance

The maths behind trends are somewhat complex; machine learning may be a bit over the top, but you probably need to work through some examples.
If you go with #deadtrunk's sample code, you would miss trends that have fired up in the last half hour; if you go with #eggyal's example, you miss trends that have been going strong all day, but calmed down in the last half hour.
The classic solution to this problem is to use a derivative function (http://en.wikipedia.org/wiki/Derivative); it's worth building a sample database and experimenting with this, and making your solution flexible enough to change this over time.
Whilst you want to build something simple, your users will be used to trends, and will assume it's broken if it doesn't work the way they expect.
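
To make the derivative idea a little more tangible, here is a deliberately simple discrete version: the score of a tag is the change in its count between two consecutive time windows. The counts, the window choice, and the function name `trendScores` are all hypothetical, and a real scoring function would likely need more tuning than this.

```php
<?php
/**
 * Scores each tag by how much its count changed from the previous
 * window to the current one (a crude discrete derivative).
 * Tags that do not appear in the current window are ignored.
 */
function trendScores(array $previousCounts, array $currentCounts): array
{
    $scores = [];
    foreach ($currentCounts as $tag => $count) {
        $scores[$tag] = $count - ($previousCounts[$tag] ?? 0);
    }
    arsort($scores);    // biggest increase first
    return $scores;
}

// Hypothetical hashtag counts for two consecutive half-hour windows.
$previous = ['php' => 50, 'mysql' => 5];
$current  = ['php' => 52, 'mysql' => 30, 'new' => 10];

$scores = trendScores($previous, $current);
foreach ($scores as $tag => $score) {
    echo "#$tag: $score", PHP_EOL;
}
```

In this made-up data "php" has the highest raw count but barely moves, so it ranks below "mysql", which is the behaviour the answer argues users expect from a trend list.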

You should probably extract the hashtags using PHP code, and then store them in your database separately from the content of the post. This way you'll be able to query them directly, rather than parsing the content every time you sort.
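
A minimal sketch of the extraction step, assuming a hashtag is a "#" followed by word characters and that tags should be compared case-insensitively. The function name `extractHashtags` and the sample statuses are made up; storing the tags in the database is left out.

```php
<?php
/**
 * Returns the hashtags found in a status, lowercased and without the "#".
 */
function extractHashtags(string $content): array
{
    preg_match_all('/#(\w+)/u', $content, $matches);
    return array_map('strtolower', $matches[1]);
}

// Hypothetical statuses.
$statuses = [
    'Loving #php and #mysql',
    'More #PHP today',
    'no tags here',
];

// Count how often each tag occurs across all statuses.
$allTags = array_merge(...array_map('extractHashtags', $statuses));
$counts  = array_count_values($allTags);
arsort($counts);

foreach ($counts as $tag => $count) {
    echo "#$tag: $count", PHP_EOL;
}
```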

I think it is better to store tags in a dedicated table and then run queries on it.
So if you have the following table layout:
trend | date
you'll be able to get the trends with this query:
SELECT COUNT(*), trend FROM `trends` WHERE `date` = '2012-05-10' GROUP BY trend
Example output:
18 test2
7 test3

Create a table that associates hashtags with statuses.
Select all status updates from some recent period - say, the last half hour - joined with the hashtag association table and group by hashtag.
The count in each group is an indication of "trend".

Tricky file parsing. Inconsistent Delimeters

I need to parse a file with the following format.
0000000 ...ISBN.. ..Author.. ..Title.. ..Edit.. ..Year.. ..Pub.. ..Comments.. NrtlExt Nrtl Next Navg NQoH UrtlExt Urtl Uext Uavg UQoH ABS NEB MBS FOL
ABE0001 0-679-73378-7 ABE WOMAN IN THE DUNES (INT'L ED) 1st 64 RANDOM 0.00 13.90 0.00 10.43 0 21.00 10.50 6.44 3.22 2 2.00 0.50 2.00 2.00 ABS
The ID and ISBN are not a problem; the title is. There is no set length for these fields, and there are no solid delimiters: the space is used as the separator for most of the file.
Another issue is that there is not always an entry in the comments field. When there is, there are spaces within the content.
So I can get the first two fields and the last fourteen. I need some help figuring out how to parse the middle six fields.
This file was generated by an older program that I cannot change. I am using php to parse this file.

I would also ask myself 'How good does this have to be' and 'How many records are there'?
If, for example, you are parsing this list to put up a catalog of books to sell on a website, you probably want to be as good as you can, but expect that you will miss some titles, and build in a feedback mechanism so your users can help you fix the issue (and make it easy for you to fix it in your new format).
On the other hand, if you absolutely have to get it right because you will lose lots of money for each wrong parse, and there are only a few thousand books, you should plan on getting close and then doing a human review of the entire file.
(In my first job, we spent six weeks on a data conversion project to convert 150 records. Not a good use of time.)

Find the title and publisher of the book by ISBN (in some on-line database) and parse only the rest :)
BTW, are you sure that what looks like a space actually is a space? There are more "invisible" characters (like the non-breaking space). I know, not a good idea, but apparently the author of that format was pretty creative...

You need to analyze your data by hand and find out what year, edition and publisher look like. For example, if you find that the year is always two digits and the publisher always comes from some limited list, this is something you can start with.

While I don't see any way other than guessing a bit, I'd go about it something like this:
I'd peel off what I know I can parse out reliably, leaving you with ABE WOMAN IN THE DUNES (INT'L ED) 1st 64 RANDOM
From there I'd try to locate the Edition and split the string in two at that position, after storing and removing the Edition. That leaves you with ABE WOMAN IN THE DUNES (INT'L ED) and 64 RANDOM. Another option is to try the same with the year, but of course titles such as 1984 might present a problem. (Guessing the edition of course assumes it looks like 7th, 51st etc. for all editions.)
Finally, I'd assume I could guess the year 64 at the start of the second string somewhat reliably, and so further narrow down the Publisher(/Comment) part.
The rest is pure guesswork unless you have a list of authors/publishers somewhere to match against, as I'd assume there are not only comments with spaces but also publishers with spaces in their names. But at least you should be down to two strings, containing Author/Title in one and Publisher(/Comments) in the other.
All in all it should limit the manual part a bit.
Once done, I'd also save it in a better format somewhere so I don't have to go about parsing it again ;)
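
A rough sketch of these steps on the sample line from the question. It assumes that the record ends with exactly 15 space-separated tokens (as the sample line does), that the edition looks like "1st" or "51st", and that a two-digit year follows it directly. The function name `splitRecord` is made up, and real data will almost certainly contain lines that need manual review.

```php
<?php
/**
 * Splits one record into the reliable parts and two "fuzzy" strings:
 * author/title on one side of the edition, publisher/comments on the other.
 * Returns null when the line does not fit the assumed shape.
 */
function splitRecord(string $line, int $tailFields = 15): ?array
{
    $tokens = preg_split('/\s+/', trim($line));
    if (count($tokens) < $tailFields + 4) {
        return null;                                   // too short to hold every field
    }
    $id   = array_shift($tokens);
    $isbn = array_shift($tokens);
    $tail = array_splice($tokens, -$tailFields);       // numeric columns and trailing code

    // Search from the right, so a title containing "1st" is less likely to win.
    for ($k = count($tokens) - 2; $k >= 1; $k--) {
        $isEdition = preg_match('/^\d+(?:st|nd|rd|th)$/i', $tokens[$k]) === 1;
        $isYear    = preg_match('/^\d{2}$/', $tokens[$k + 1]) === 1;
        if ($isEdition && $isYear) {
            return [
                'id'                 => $id,
                'isbn'               => $isbn,
                'author_title'       => implode(' ', array_slice($tokens, 0, $k)),
                'edition'            => $tokens[$k],
                'year'               => $tokens[$k + 1],
                'publisher_comments' => implode(' ', array_slice($tokens, $k + 2)),
                'tail'               => $tail,
            ];
        }
    }
    return null;
}

$line = "ABE0001 0-679-73378-7 ABE WOMAN IN THE DUNES (INT'L ED) 1st 64 RANDOM "
      . "0.00 13.90 0.00 10.43 0 21.00 10.50 6.44 3.22 2 2.00 0.50 2.00 2.00 ABS";

print_r(splitRecord($line));
```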

I don't know if the PCRE engine allows multiple groups from within a selection, therefore (written for the x modifier, so line breaks are ignored and "\ " is a literal space):
([A-Z0-9]{7})\ (\d-\d{3}-\d{5}-\d)
\ (.+)\ (\d+(?:st|nd|rd|th))\ (\d{2})
\ ([^\d.]+)\ (\d+\.\d{2})\ (\d+\.\d{2})
\ (\d+\.\d{2})\ (\d+\.\d{2})\ (\d)
\ (\d+\.\d{2})\ (\d+\.\d{2})\ (\d+\.\d{2})
\ (\d+\.\d{2})\ (\d)\ (\d+\.\d{2})
\ (\d+\.\d{2})\ (\d+\.\d{2})\ (\d+\.\d{2})
\ (\w{3})
It does look quite ugly and doesn't fix your author-title problem, but it matches quite well for the rest of the line.
Concerning your problem, I don't see any solution other than having a lookup table for authors or using other services to look up title and author via the ISBN.
That's assuming that, unlike in your example above, the authors are not just represented by their first name.
Also double-check all exceptions that might occur with the above regex, as titles may contain 1st or the like.
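
A variant of the regex idea as a runnable PHP sketch, tested only against the single sample line from the question. It uses named groups, `\s+` between fields, and keeps the numeric columns as one group that is split afterwards. The group names are made up, and the pattern assumes a 7-character ID, an ISBN in the d-ddd-ddddd-d shape (no "X" check digit), and a two-digit year.

```php
<?php
$dec = '\d+\.\d{2}';    // a decimal column such as 13.90

$pattern = '/^(?<id>[A-Z0-9]{7})\s+'
    . '(?<isbn>\d-\d{3}-\d{5}-\d)\s+'
    . '(?<author_title>.+)\s+'
    . '(?<edition>\d+(?:st|nd|rd|th))\s+'
    . '(?<year>\d{2})\s+'
    . '(?<publisher_comments>[^\d.]+?)\s+'
    // 4 decimals + 1 digit, 4 decimals + 1 digit, then 4 decimals
    . '(?<numbers>(?:' . $dec . '\s+){4}\d\s+(?:' . $dec . '\s+){4}\d\s+(?:' . $dec . '\s+){3}' . $dec . ')\s+'
    . '(?<flag>\w{3})$/';

$line = "ABE0001 0-679-73378-7 ABE WOMAN IN THE DUNES (INT'L ED) 1st 64 RANDOM "
      . "0.00 13.90 0.00 10.43 0 21.00 10.50 6.44 3.22 2 2.00 0.50 2.00 2.00 ABS";

if (preg_match($pattern, $line, $m) === 1) {
    $numbers = preg_split('/\s+/', $m['numbers']);
    echo $m['id'], ' | ', $m['isbn'], ' | ', $m['author_title'], ' | ',
         $m['edition'], ' | ', $m['year'], ' | ', $m['publisher_comments'], PHP_EOL;
    echo count($numbers), ' numeric columns, flag ', $m['flag'], PHP_EOL;
} else {
    echo 'Line needs manual review', PHP_EOL;
}
```

Lines that fail to match are reported rather than guessed at, which fits the advice elsewhere in this thread to plan for a manual pass.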
