PHP matrix splitting algorithm - php

I have the following issue:
Given a square matrix (n×n), I want to be able to split it into a certain number of distinct areas. I did the splitting by diagonals, but I'm not happy with the result. I want to be able to split the matrix based on at least 2 functions (chosen randomly). The problem is that I don't have any idea how to implement such splitting. Any suggestion is welcome. Please note that the small parts will have to be recombined later on in the application.

Related

Recursive array analysis (math program)

I'm making a trigonometry tool in PHP that can analyze how best to solve a given problem.
For instance, I know angle A and sides b and c, and I need the computer to calculate which formulas to use in which order to make the best mathematical solution to finding an unknown value.
Right now, I have created an array with numerous options on how to find the unknown value:
Image of array
The array is made like this:
[formula: xxx] is a suggestion on what formula to use to find a previous value.
[target: xxx] is the name of the value we're looking for in order to satisfy the need of the previous formula.
There might be more than one target attached to a formula, because a formula might need more than one piece of information in order to be complete.
At the end of each path there is an array showing a mathematical function which can be solved with the information we have, and if you track backwards from there you have enough information to solve the task which was initially given (which was angle B (cannot be seen on the image, but it is))
So, out of all of these solution options I need to find the shortest solution, the solution which requires the fewest steps.
Keep in mind that a formula might have two unknown variables, which means that we need to calculate the total sum of the shortest path and end up with an array containing the optimal path.
Bonus question: I'm aware that this only solves the problem using the information provided by the user. Maybe at some point along the way we find a variable which is needed at some other point. Maybe if a function needs two variables, it is faster to find A before B because A calculates a "target" which might be needed in B.
I would really like to have a solution for the first part, but I will need to solve the "bonus question" later on.

Get longest common substring based on string similarity

I have a table with a column that includes names like:
Home Improvement Guide
Home Improvement Advice
Home Improvement Costs
Home Gardening Tips
I would like the result to be:
Home Improvement
Home Gardening Tips
Based on a search for the word 'Home'.
This can be accomplished in MySQL or PHP or a combination of the two. I have been pulling my hair out trying to figure this out; any help in the right direction would be greatly appreciated. Thanks.
Edit / Problem kinda solved:
I think this problem can be solved much easier by changing the logic a little. For anyone else with this problem, here is my solution.
Get the sql results
Find the first occurrence of the searched word, one string at a time, and get the next word in the string to the right of it.
The results would include the searched word concatenated with the distinct adjoining word.
Not as good of a solution, but it works for my project. Thanks for the help everyone.
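A minimal PHP sketch of the three steps above, assuming the SQL results have already been fetched into a plain array of strings. The function name adjoiningLabels is hypothetical, and words are assumed to be separated by whitespace.

```php
<?php
// Hypothetical helper: for each title, find the first occurrence of the
// searched word and join it with the word immediately to its right.
function adjoiningLabels(array $titles, string $search): array
{
    $labels = [];
    foreach ($titles as $title) {
        $words = preg_split('/\s+/', trim($title), -1, PREG_SPLIT_NO_EMPTY);
        foreach ($words as $i => $word) {
            if (strcasecmp($word, $search) === 0) {
                $label = $word;
                if (isset($words[$i + 1])) {
                    $label .= ' ' . $words[$i + 1];
                }
                $labels[$label] = true; // array keys keep the labels distinct
                break;                  // only the first occurrence counts
            }
        }
    }
    return array_keys($labels);
}

$titles = [
    'Home Improvement Guide',
    'Home Improvement Advice',
    'Home Improvement Costs',
    'Home Gardening Tips',
];
print_r(adjoiningLabels($titles, 'Home'));
```

With this approach the last title is reduced to "Home Gardening" rather than "Home Gardening Tips", which is the trade-off of the simplified logic.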
This is too long for a comment.
I don't think that Levenshtein distance does what you want. Consider:
Home Improvement
Home Improvement Advice on Kitchen Remodeling
Home Gardening
The first and third are closer by the Levenshtein measure than the first and second. And yet, I'm guessing that you want the first and second to be paired.
I have an idea of the algorithm you want. Something like this:
Compare every returned string to every other string
Measure the length of the initial overlap
Find the maximum over all the strings, pair those
Repeat the process with the second largest overlap and so on
Painful, but not impossible to implement in SQL. Maybe very painful.
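Outside SQL, the pairwise comparison could be sketched in PHP roughly as follows. This is only an illustration under assumptions: the overlap is measured in whole leading words, and the names sharedPrefixWords and rankOverlaps are made up for the example.

```php
<?php
// Hypothetical helper: the leading words two strings have in common.
function sharedPrefixWords(string $a, string $b): array
{
    $wa = explode(' ', $a);
    $wb = explode(' ', $b);
    $shared = [];
    $n = min(count($wa), count($wb));
    for ($i = 0; $i < $n && $wa[$i] === $wb[$i]; $i++) {
        $shared[] = $wa[$i];
    }
    return $shared;
}

// Compare every string with every other string and sort the pairs
// by the length of their initial overlap, longest first.
function rankOverlaps(array $strings): array
{
    $pairs = [];
    $count = count($strings);
    for ($i = 0; $i < $count; $i++) {
        for ($j = $i + 1; $j < $count; $j++) {
            $shared = sharedPrefixWords($strings[$i], $strings[$j]);
            $pairs[] = [
                'a'      => $strings[$i],
                'b'      => $strings[$j],
                'prefix' => implode(' ', $shared),
                'words'  => count($shared),
            ];
        }
    }
    usort($pairs, function ($x, $y) {
        return $y['words'] <=> $x['words'];
    });
    return $pairs;
}

$ranked = rankOverlaps([
    'Home Improvement',
    'Home Improvement Advice on Kitchen Remodeling',
    'Home Gardening',
]);
echo $ranked[0]['prefix'], "\n";
```

The first entry of the ranked list is the pair with the largest overlap; repeating the process means walking further down the list and skipping strings that are already paired.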
What this suggests to me is that you are looking for a hierarchy among the products. My suggestion is to just include a category column and return the category. You may need to manually insert the categories into your data.

All possible combinations from sets

I have a set of numbers:
1,22
1,46
32,1
1,9
32,22
1,14
1,45
1,33
33,22
45,22
32,46
32,9
3,1
3,9
3,22
3,32
3,46
9,22
46,22
46,45
46,33
15,1
15,46
15,6
15,22
15,3
15,9
15,45
15,33
15,32
15,14
I need to get combinations from them with a rule that each new pair can only be appended if the latter number is the same as the first in the pair.
For example, if I have a pair {15,1}, the next one can only be {1,46} and the next {46,45}, and the final pair must end with the first number of the whole set. In this case it could be for example {45,1}.
So the end result with a limit of 4 pairs would be
{15,1,1,46,46,45,45,1}
I can do basic power sets and generate all possible combinations from set of numbers but this seems to be too advanced for me.
I can do C, Javascript or PHP so all the help or solutions to this are highly appreciated. And for clarification, this is not a homework, this is just something I would like to learn and understand.
This looks as if some graph data structure, and some graph algorithms, would be appropriate. Your graph would comprise nodes (each of which is a number) and edges (each of which represents one of your pairs). Then write the appropriate routine for walking round the graph. It's not entirely clear from your question what the rules for the walk are, but I guess you know.
EDIT
Of course, I should point out that what you have is already a graph data structure, it's called an adjacency list. Google around for algorithms and representations.
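As a rough PHP sketch of that idea, the pairs can be turned into an adjacency list and walked depth-first. Since the exact rules of the walk are not clear from the question, this makes several assumptions: each pair is a directed edge from its first number to its second, a pair is used at most once per chain, and the rule about how the final pair must end is left out. The function names buildAdjacency and chains are hypothetical.

```php
<?php
// Build an adjacency list: first number => list of second numbers.
function buildAdjacency(array $pairs): array
{
    $adj = [];
    foreach ($pairs as [$from, $to]) {
        $adj[$from][] = $to;
    }
    return $adj;
}

// Depth-first walk that collects every chain of exactly $length pairs
// starting at $node. Each chain is a list of [from, to] pairs.
function chains(array $adj, int $node, int $length, array $path = []): array
{
    if (count($path) === $length) {
        return [$path];
    }
    $found = [];
    foreach ($adj[$node] ?? [] as $next) {
        $edge = [$node, $next];
        if (in_array($edge, $path, true)) {
            continue; // assumption: a pair is used at most once per chain
        }
        $found = array_merge(
            $found,
            chains($adj, $next, $length, array_merge($path, [$edge]))
        );
    }
    return $found;
}

// A small subset of the pairs from the question.
$adj = buildAdjacency([[15, 1], [1, 46], [46, 45], [46, 33], [15, 46]]);
$result = chains($adj, 15, 3);
foreach ($result as $chain) {
    echo json_encode($chain), "\n";
}
```

A closing rule, once it is pinned down, could be added as an extra check in the branch that returns a finished chain.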

PHP library for word clustering/NLP?

What I am trying to implement is a rather trivial "take search results (as in title & short description), cluster them into meaningful named groups" program in PHP.
After hours of googling and countless searches on SO (yielding interesting results as always, albeit nothing really useful) I'm still unable to find any PHP library that would help me handle clustering.
Is there such a PHP library out there that I might have missed?
If not, is there any FOSS that handles clustering and has a decent API?
Like this:
Use a list of stopwords, get all words or phrases not in the stopwords, count occurrences of each, sort in descending order.
The stopword list needs to contain all common English terms. It should also include punctuation, and you will need to preg_replace all the punctuation to be a separate word first, e.g. "Something, like this." -> "Something , like this ." OR, you can just remove all punctuation.
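$content = strtolower($content); // the pattern below only keeps a-z, so lowercase first
$content = preg_replace('/[^a-z\s]/', '', $content); // remove punctuation
$words = preg_split('/\s+/', $content, -1, PREG_SPLIT_NO_EMPTY); // split into single words
$stopwords = 'the|and|is|your|me|for|where|etc...';
$stopwords = explode('|', $stopwords);
$stopwords = array_flip($stopwords); // word => index, so isset() can be used for lookups
$result = array(); $temp = array();
foreach ($words as $s) {
    if (isset($stopwords[$s]) || strlen($s) < 3) {
        // a stopword or very short word ends the current phrase
        if (sizeof($temp) > 0) {
            $result[] = implode(' ', $temp);
            $temp = array();
        }
    } else {
        $temp[] = $s;
    }
}
if (sizeof($temp) > 0) $result[] = implode(' ', $temp); // flush the last phrase
$phrases = array_count_values($result); // phrase => number of occurrences
arsort($phrases); // most frequent first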
Now you have an associative array in order of the frequency of terms that occur in your input data.
How you want to do the matches depends upon you, and it depends largely on the length of the strings in the input data.
I would see if any of the top 3 array keys match any of the top 3 from any other in the data. These are then your groups.
Let me know if you have any trouble with this.
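One possible way to do the "top 3" matching, sketched under assumptions: each search result has already been turned into its own frequency-sorted array like $phrases above, and those arrays are collected in an array keyed by a result id. The function name groupByTopPhrases and the sample data are made up for illustration.

```php
<?php
// $ranked: result id => phrases sorted by descending frequency (as after arsort()).
// Returns phrase => list of result ids that have this phrase among their top entries.
function groupByTopPhrases(array $ranked, int $top = 3): array
{
    $groups = [];
    foreach ($ranked as $id => $phrases) {
        foreach (array_slice(array_keys($phrases), 0, $top) as $phrase) {
            $groups[$phrase][] = $id;
        }
    }
    // Keep only phrases shared by at least two results.
    return array_filter($groups, function ($ids) {
        return count($ids) > 1;
    });
}

$ranked = [
    'a' => ['home improvement' => 3, 'kitchen' => 2, 'cost' => 1, 'paint' => 1],
    'b' => ['garden' => 4, 'kitchen' => 1],
    'c' => ['home improvement' => 2, 'garden' => 1],
];
print_r(groupByTopPhrases($ranked));
```

Each remaining key is a candidate group name, and its value lists the results that belong to that group. A result can end up in more than one group.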
"... cluster them into meaningful groups" is a bit too vague, you'll need to be more specific.
For starters you could look into K-Means clustering.
Have a look at this page and website:
PHP/ir: Information Retrieval and other interesting topics
EDIT: You could try some data mining yourself by cross referencing search results with something like the open directory dmoz RDF data dump and then enumerate the matching categories.
EDIT2: And here is a dmoz/category question that also mentions "Faceted Search"!
Dmoz/Monster algorithme to calculate count of each category and sub category?
If you're doing this for English only, you could use WordNet: http://wordnet.princeton.edu/. It's a lexicon widely used in research which provides, among other things, sets of synonyms for English words. The shortest distance between two words could then serve as a similarity metric to do clustering yourself as zaf proposed.
Apparently there is a PHP interface to WordNet here: http://www.foxsurfer.com/wordnet/. It came up in this question: How to use word Net with php, but I have not tried it. However, interfacing with a command line tool from PHP yourself is feasible as well.
You could also have a look at Programming Collective Intelligence (Chapter 3 : Discovering Groups) by Toby Segaran which goes through just this use case using Python. However, you should be able to implement things in PHP once you understand how it works.
Even though it is not PHP, the Carrot2 project offers several clustering engines and can be integrated with Solr.
This may be way off but check out OpenCalais. They have a web service which allows you to pass a block of text in and it will pass you back a parseable response of things that it found in the text, such as places, people, facts etc. You could use these categories to build your "clouds" and to choose which results to display.
I've used this library a few times in php and it's always been quite easy to work with.
Again, might not be relevant to what you're trying to do. Maybe you could post an example of what you're trying to accomplish?
If you can pre-define the filters for your faceted search (the named groups) then it will be much easier.
Rather than relying on an algorithm that uses the current searcher's input and their particular results to generate the filter list, you would use an aggregate of the most commonly performed searches by all users and then tag results with them if they match.
You would end up with a table (or something) of URLs in a many-to-many join to a table of tags, so each result url could have several appropriate tags.
When the user searches, you simply match their search against the full index. But for the filters, you take the top results from among the current resultset.
I'll work on query examples if you want.

Pattern matching for people who don't know algorithms - finding adjacent X's in a grid

I'm wondering what the best method would be for me to approach a problem where I need to find adjacent (horizontal, vertical, diagonal) X's in a grid which is provided.
I wanted to know what the recursive way and the nonrecursive way would be. I tried a recursive method of checking each column, and then iterating rows - that gives me X's in one direction - should I write separate recursive functions for the other directions?
Example grid:
XXX0X
0000X
00X00
XXXX0
0000X
output should be :
(0,0),(1,0),(2,0)
(4,0),(4,1)
(2,2),(0,3),(1,3),(2,3),(3,3)
You may want to check out the Flood Fill algorithm. You can find it on Wikipedia.
I think what you're describing is more or less it. What you do is basically:
For a given position:
    If it is of the desired color (in your case 'O'):
        mark it (say, re-color it to a color 'M'),
        recurse on all desirable directions (run the same algorithm
        on new positions, which are +/-1 away);
    else
        do nothing.
In your case, the result are the positions marked 'M'. If you want to find additional adjacencies, you can always reset the ones marked 'M' and start the algorithm on a different position.
EDIT: According to your examples, it seems you're looking for adjacent 'X's. :)
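For the nonrecursive way, the same flood fill can be driven by an explicit stack instead of recursive calls. The PHP sketch below makes some assumptions: the grid is an array of strings, coordinates are (x, y) with x as the column, all 8 directions count as adjacent, and the function name findGroups is hypothetical.

```php
<?php
// Returns a list of groups; each group is a list of [x, y] cells holding 'X'.
function findGroups(array $rows): array
{
    $seen = [];   // "x,y" => true for cells that are already in some group
    $groups = [];
    $height = count($rows);
    for ($y = 0; $y < $height; $y++) {
        $width = strlen($rows[$y]);
        for ($x = 0; $x < $width; $x++) {
            if ($rows[$y][$x] !== 'X' || isset($seen["$x,$y"])) {
                continue;
            }
            // Start a new flood fill from this unvisited 'X'.
            $group = [];
            $stack = [[$x, $y]];
            $seen["$x,$y"] = true;
            while ($stack) {
                [$cx, $cy] = array_pop($stack);
                $group[] = [$cx, $cy];
                // Visit the 8 neighbours (the cell itself is already marked).
                for ($dy = -1; $dy <= 1; $dy++) {
                    for ($dx = -1; $dx <= 1; $dx++) {
                        $nx = $cx + $dx;
                        $ny = $cy + $dy;
                        if ($ny < 0 || $ny >= $height || $nx < 0 || $nx >= strlen($rows[$ny])) {
                            continue; // outside the grid
                        }
                        if ($rows[$ny][$nx] === 'X' && !isset($seen["$nx,$ny"])) {
                            $seen["$nx,$ny"] = true;
                            $stack[] = [$nx, $ny];
                        }
                    }
                }
            }
            $groups[] = $group;
        }
    }
    return $groups;
}

$groups = findGroups(['XXX0X', '0000X', '00X00', 'XXXX0', '0000X']);
foreach ($groups as $group) {
    echo json_encode($group), "\n";
}
```

Note that with diagonal adjacency the X at (4,4) touches (3,3), so it lands in the same group as the X's on row 3. Restricting the two inner loops to horizontal and vertical steps would give 4-way adjacency instead.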
