I want PHP to randomly build a multi-dimensional array by picking items from predefined lists n times, but never the same thing twice.
To put that in human terms with a real-life example: I want to write a list of vegetables and a list of meats and have PHP make a menu for me, with something different each day than the day before.
I tried, but all I got was the scrambling; there were always doubles.
Try the shuffle function http://us2.php.net/manual/en/function.shuffle.php
Use either array_rand() or shuffle().
Random != unique
You need to either:
a) create a list containing every possible combination and then randomly select and remove one
or
b) store your results so your random selection can be compared to previous selections.
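Option b) can be sketched as follows. This is only a minimal illustration, assuming two hypothetical lists ($vegetables, $meats) and a made-up helper name (buildMenu); it re-draws whenever a day's combination equals the previous day's.

```php
<?php
// Hypothetical ingredient lists; replace them with your own.
$vegetables = ['carrots', 'peas', 'spinach', 'beans'];
$meats      = ['chicken', 'beef', 'pork'];

/**
 * Build a menu for $days days in which no day repeats the previous day's
 * vegetable/meat combination (option b: compare against the earlier pick).
 */
function buildMenu(array $vegetables, array $meats, int $days): array
{
    if (count($vegetables) * count($meats) < 2) {
        throw new InvalidArgumentException('Need at least two possible combinations.');
    }

    $menu = [];
    $previous = null;
    for ($day = 0; $day < $days; $day++) {
        do {
            $pick = [
                'vegetable' => $vegetables[array_rand($vegetables)],
                'meat'      => $meats[array_rand($meats)],
            ];
        } while ($pick === $previous); // re-draw if identical to yesterday

        $menu[] = $pick;
        $previous = $pick;
    }
    return $menu;
}

foreach (buildMenu($vegetables, $meats, 7) as $i => $dish) {
    echo 'Day ', $i + 1, ': ', $dish['vegetable'], ' with ', $dish['meat'], PHP_EOL;
}
```

If no combination may ever repeat (not just on consecutive days), option a) is probably the simpler route: build every combination up front, shuffle() that list once and take the first n entries.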
Related
In my MySQL table I have the field name, which is unique. However, the contents of the field are gathered from different places. So, because of spelling errors, I can end up with 2 records with very similar names instead of the second one being discarded.
Now I want to find the entries that are very similar to another one. To do that I loop through all my records and compare each name to the other entries by looping through all the records again. The problem is that there are over 15k records, which takes way too much time. Is there a way to do this faster?
this is my code:
for ($x = 0; $x < count($serie1); $x++)
{
    for ($y = 0; $y < count($serie2); $y++)
    {
        $sim = levenshtein($serie1[$x]['naam'], $serie2[$y]['naam']);
        if ($sim == 1)
            print("{$serie1[$x]['naam']} --> {$serie2[$y]['naam']} = {$sim}<br>");
    }
}
A preamble: such a task will always be time-consuming, and there will always be some pairs that slip through.
Nevertheless, a few ideas:
1. actually, the algorithm can be (a bit) improved
Assuming that $serie1 and $serie2 hold the same values in the same order, you don't need to loop over the whole second array in the inner loop every time. In this use case you only need to evaluate each pair of values once: levenshtein('a', 'b') is sufficient, you don't need levenshtein('b', 'a') as well (nor do you need levenshtein('a', 'a')). This roughly halves the number of comparisons.
Under these assumptions, you can write your loop like this:
$count = count($serie1); // count once instead of on every iteration
for ($x = 0; $x < $count; $x++)
{
    for ($y = $x + 1; $y < $count; $y++) // <-- $y doesn't need to start at 0
    {
        $sim = levenshtein($serie1[$x]['naam'], $serie2[$y]['naam']);
        if ($sim == 1)
            print("{$serie1[$x]['naam']} --> {$serie2[$y]['naam']} = {$sim}<br>");
    }
}
2. maybe MySQL is faster
There are examples on the net of levenshtein() implemented as a MySQL function. An example on SO is here: How to add levenshtein function in mysql?
If you are comfortable with complex(ish) SQL, you could delegate the heavy lifting to MySQL and at least gain a bit of performance because you aren't fetching all 15k+ rows into the PHP runtime.
3. don't do everything at once / save your results
Of course you have to run the function once for every record, but after the initial run you only have to check entries added since the last run. Schedule a cron job that checks all new records once every day/week/month. You would need an inserted_at column in your table, and you would still need to compare the new names with every other name.
3.5 do some of the work onInsert
a) If the wait is acceptable, run the check whenever a new record is about to be inserted, so that you can either write the result to a log or give direct feedback to the user. (A tangent: this could be a good use case for an asynchronous task queue like http://gearman.org/ -> start a new process for the check in the background and return the success message for the insert immediately.)
b) PHP has two other functions to help with searching for similar strings: metaphone() and soundex(). These functions generate keys that represent how a string sounds when spoken (based on English pronunciation). You could generate one or both of these keys on each insert, store them in a separate field in your table and use simple SQL comparisons to find records with matching keys.
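The idea in b) can also be tried in plain PHP before touching the table. The following is a rough sketch, not a tuned solution: it groups names by their soundex() key and only runs levenshtein() inside each group. The function name findSimilarNames, the $maxDistance parameter and the sample names are made up for illustration.

```php
<?php
/**
 * Group names by their soundex() key so that the expensive levenshtein()
 * comparison only runs inside each (usually small) group.
 */
function findSimilarNames(array $names, int $maxDistance = 2): array
{
    $buckets = [];
    foreach ($names as $name) {
        $buckets[soundex($name)][] = $name;
    }

    $pairs = [];
    foreach ($buckets as $group) {
        $n = count($group);
        for ($x = 0; $x < $n; $x++) {
            for ($y = $x + 1; $y < $n; $y++) {
                if (levenshtein($group[$x], $group[$y]) <= $maxDistance) {
                    $pairs[] = [$group[$x], $group[$y]];
                }
            }
        }
    }
    return $pairs;
}

// Hypothetical sample data.
$names = ['Breaking Bad', 'Braking Bad', 'Lost', 'Dexter'];
print_r(findSimilarNames($names));
```

The trade-off is that soundex() keeps the first letter as is, so a typo in the first character puts two names into different groups and the pair is missed. This fits the preamble above: some pairs will always slip through.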
The trouble with levenshtein is that it only compares string a to string b. I once built a spelling corrector that put all the strings a into a big trie, which functioned as a dictionary. It would then look up any string b in that dictionary, finding all nearest-matching words. I did it first in Fortran (!), then in Pascal. It would be easier in a more modern language, but I suspect PHP would not make it easy. Look here.
I have data in a MySQL column like this:
T_interest
1,14,49,145,203,302
Each value represents one of the user's personal interest keywords.
I tried to extract the values and determine, for each checkbox, whether the user has that value or not.
if(strstr($u_interest['u_interest'], ','.$row['i_idx'])):
$selected = 'checked';
Here is the PHP code that I use right now,
but it doesn't match the exact value from the database.
Let's say I want to check whether this user's row contains the number 14:
T_interest
1,14,49,145,203,302
If I use the code above, it tells me that the user has two values:
14,145
It looks like PHP's strstr() reports both because both contain the digits 14.
Can you help me understand why this is happening?
If you want more of the PHP code I can post it.
Explode it into an array; this way you get the individual values.
$interests = explode(",", $u_interest["u_interest"]);
Then you can test for equality much more easily.
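A small sketch of that approach, assuming the column holds the comma-separated string from the question. The helper name hasInterest is hypothetical; the array key 'u_interest' is taken from the question's snippet.

```php
<?php
/**
 * Check whether a comma-separated list contains exactly the given ID.
 */
function hasInterest(string $csv, int $id): bool
{
    // Split into individual values and strip stray whitespace.
    $interests = array_map('trim', explode(',', $csv));

    // Strict string comparison: "14" matches only the element "14", not "145".
    return in_array((string) $id, $interests, true);
}

// Hypothetical row standing in for the database result.
$u_interest = ['u_interest' => '1,14,49,145,203,302'];

foreach ([14, 145, 4] as $i_idx) {
    $selected = hasInterest($u_interest['u_interest'], $i_idx) ? 'checked' : '';
    echo $i_idx, ': ', $selected === '' ? 'not checked' : $selected, PHP_EOL;
}
```

The strstr() version matched ",14" inside ",145" because it searches for a substring, whereas in_array() compares whole elements.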
You really need to read up on database normalization. A properly normalized design would make this problem moot.
As for your question, since you're forced to use string operations, you'll have to check for multiple different cases:
1) the number you want is at the START of the string
2) the number you want is at the END of the string
3) the number you want is the ONLY number in the string
4) the number you want is in the MIDDLE of the string:
SELECT ...
WHERE
(T_Interest = '14') OR -- only number
(T_Interest LIKE '14,%') OR -- at the beginning
(T_Interest LIKE '%,14') OR -- at the end
(T_Interest LIKE '%,14,%') -- in the middle
I'm creating a lottery contest for my site, and I need to know the easiest way to compare numbers, so that no two people can choose the same numbers. Each entry is a set of 7 numbers, each between 1 and 30.
For example, if user 1 chooses: 1, 7, 9, 17, 22, 25, 29 how can I make sure that user 2 can't choose those exact same numbers?
I was thinking about throwing all 7 numbers into an array, sort it so the numbers are in order, then join them into one string. Then when another user chooses their 7 numbers, it does the same, then compares the two. Is there a better way of doing it?
What you describe sounds like the best way to me, IF you are dealing with all submissions in the same script - I would trim(implode(',',$array)) the sorted array, store the resulting string in an array and call in_array() to determine whether the value already exists.
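For the single-script case, the sort-and-join idea might look like the sketch below. The function names pickKey and claimPick and the in-memory $taken store are assumptions for illustration. Note one deliberate swap: instead of in_array() on a list of strings, it uses the joined string as an array key and checks it with isset(), which avoids scanning the whole list on every submission.

```php
<?php
/** Turn a pick into a canonical string: sorted ascending, comma-joined. */
function pickKey(array $numbers): string
{
    sort($numbers, SORT_NUMERIC); // works on a local copy, the caller's array is untouched
    return implode(',', $numbers);
}

/** Register a pick; returns false if exactly this set was already taken. */
function claimPick(array &$taken, array $numbers): bool
{
    $key = pickKey($numbers);
    if (isset($taken[$key])) {
        return false;
    }
    $taken[$key] = true;
    return true;
}

$taken = []; // hypothetical in-memory store of picks already made

var_dump(claimPick($taken, [1, 7, 9, 17, 22, 25, 29])); // bool(true)
var_dump(claimPick($taken, [29, 25, 22, 17, 9, 7, 1])); // bool(false): same set, different order
var_dump(claimPick($taken, [1, 7, 9, 17, 22, 25, 30])); // bool(true)
```

Input validation (exactly 7 distinct numbers between 1 and 30) is left out here and would have to happen before the pick is registered.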
HOWEVER I suspect that what you are actually doing is storing the selections in a database table and comparing later submissions against this table. In this case (I am taking a liberty and assuming MySQL here, but I would say it is the most common engine used with PHP) you should create a table with 7 columns choice_1, choice_2 ... choice_7 (along with whatever other columns you want) and create a unique index across all seven choice_* columns. Store the numbers in sorted order, so that the same set always produces the same row. This means that when you try to insert a duplicate row, the query will fail. This lets MySQL do all the work for you.
Try array_diff. There are some really good examples on php.net.
In PHP, how do I display 5 results out of a possible 50 at random, while ensuring that all results are displayed an equal number of times?
For example, the table has 50 entries.
I wish to show 5 of these at random with every page load, but I also need to ensure that all results are displayed in rotation, an equal number of times.
I've spent hours googling for this but can't work it out - would very much like your help please.
Scroll down to "biased randomness" if you don't want to read all of this.
In MySQL you can just use SELECT * FROM table ORDER BY RAND() LIMIT 5.
What you want does not work as stated. It is logically contradictory.
You have to understand that true randomness only gives an even distribution in the long run, over a very large number of selections.
The longer the interval of selection, the more even the distribution.
If you MUST have an even distribution within a fixed interval, for example every 24 hours, you cannot use a purely random algorithm.
It really depends on what your goal is.
You could, for example, pick an element at random and then lower the probability of that same element being chosen again on the next run. This heuristic gives you a more even distribution over a shorter time. But it is no longer fully random, only partly.
You could also randomly select from your database, mark the elements as selected, and now select only from those not yet selected. When no element is left, reset all.
Very trivial but might do your job.
You can also do something like that with timestamps to make the distribution a bit more elegant.
This could look something like ORDER BY RAND()*((timestamps-min(timestamps))/(max(timestamps)-min(timestamps))) DESC. Basically, you normalize each entry's last-selection timestamp over the time window so that it falls between 0 and 1, and then multiply it by RAND(). The aim is that recently shown entries become less likely to be selected, while the rest is left to chance. I am not sure about the formula above, I just typed it down and it is probably wrong, but the principle works.
I think what you want is generally referred to as "biased randomness". There are a lot of papers on it and some questions on SO, for example here:
Biased random in SQL?
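The "lower the probability after each pick" heuristic can be sketched in PHP as a weighted draw. This is one possible reading of the idea, not a reference implementation: the function name pickAndDampen, the halving factor and the rescaling step are all assumptions, and array_key_last() needs PHP 7.3 or newer.

```php
<?php
/**
 * Pick one key from $weights with probability proportional to its weight,
 * then halve that key's weight so it is less likely to come up next time.
 */
function pickAndDampen(array &$weights)
{
    $roll = mt_rand() / mt_getrandmax() * array_sum($weights);
    $chosen = array_key_last($weights); // fallback in case of float rounding

    foreach ($weights as $key => $weight) {
        $roll -= $weight;
        if ($roll <= 0) {
            $chosen = $key;
            break;
        }
    }

    $weights[$chosen] /= 2;

    // Rescale so the largest weight is 1.0 and values never shrink towards zero.
    $max = max($weights);
    foreach ($weights as $key => $weight) {
        $weights[$key] = $weight / $max;
    }
    return $chosen;
}

// Hypothetical entries, all equally likely at the start.
$weights = ['entry 1' => 1.0, 'entry 2' => 1.0, 'entry 3' => 1.0];
for ($i = 0; $i < 6; $i++) {
    echo pickAndDampen($weights), PHP_EOL;
}
```

Because an entry that has been picked more often than the others becomes exponentially less likely, the counts stay close together, while the order of individual picks remains unpredictable.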
Copy the 50 results to some temporary place (file, database, whatever you use). Then, every time you need random values, select 5 random values from the 50 and delete them from your temporary data set.
Once your temporary data set is empty, create a new one copying the original again.
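The draw-and-delete approach can be sketched like this. It is a minimal in-memory version: the function name drawFromPool is made up, and in a real site $pool would have to be persisted between page loads (file, session or table), which is not shown here.

```php
<?php
/**
 * Draw $count items at random without repeats until the pool is used up,
 * then start a fresh shuffled round.
 */
function drawFromPool(array &$pool, array $allItems, int $count): array
{
    if ($allItems === []) {
        return []; // nothing to draw from
    }

    $drawn = [];
    while (count($drawn) < $count) {
        if ($pool === []) {
            $pool = $allItems; // temporary data set is empty: copy the original again
            shuffle($pool);
        }
        $drawn[] = array_pop($pool);
    }
    return $drawn;
}

$allItems = range(1, 50); // hypothetical IDs of the 50 entries
$pool = [];               // temporary data set, empty before the first draw

for ($page = 1; $page <= 3; $page++) {
    echo 'Page load ', $page, ': ', implode(', ', drawFromPool($pool, $allItems, 5)), PHP_EOL;
}
```

With 50 entries and 5 per page, every entry is shown exactly once per 10 page loads. If the page size did not divide the total evenly, a single draw could span two rounds and might then contain the same entry twice.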
I am wondering if the shuffle() array function is the correct way to randomize array results.
Basically, I have some ad codes in an array, and I use this to display 1 random ad each time, but there is one ad that seems to appear much, much more often than any other! I mean, out of 20 times it appeared about 18 times. I'd thought that by randomizing the results I'd get roughly equal views for each ad, but that's not the case.
It makes me wonder: is shuffle the correct way to do this, or do I need something totally different?
Here is my code to grab the random ad code at a time.
if (count($eligible_ads) > 1) {
shuffle($eligible_ads);
echo stripslashes($eligible_ads[0]['code']);
}
You could also simply pick a random key using array_rand, instead of shuffling the whole thing. It won't get any more random than that*. You can't say something's not random with test data based on only 20 runs. It should even out if you run it a few thousand more times.
In other words:
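A minimal sketch of the array_rand() variant, assuming an ad list shaped like the one in the question (the sample entries are made up). The loop at the end only tallies many draws to illustrate how the counts even out.

```php
<?php
// Hypothetical ad list; the 'code' key mirrors the question's structure.
$eligible_ads = [
    ['code' => 'ad A'],
    ['code' => 'ad B'],
    ['code' => 'ad C'],
];

// Pick one ad directly, without reordering the whole array.
$ad = $eligible_ads[array_rand($eligible_ads)];
echo $ad['code'], PHP_EOL;

// Tally many draws: with 30000 runs each ad ends up close to 10000 views.
$views = array_fill(0, count($eligible_ads), 0);
for ($i = 0; $i < 30000; $i++) {
    $views[array_rand($eligible_ads)]++;
}
print_r($views);
```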
* As far as PRNGs go at least.
Random means random. It wouldn't be random if it guaranteed equal views for each ad. shuffle() randomizes the order of the elements in the array, associative or otherwise (existing keys are replaced by new numeric ones). If your goal is to give equal views to each "ad" in the array, then no, shuffle() is not what you're after. You could probably write a function that tracks which ads have already been shown and uses that to give equal views, though.
If you don't mind the hit of running a database query or two, throw the ads into a table, and create a column that stores how many times each ad was displayed. SELECT a row with the lowest views value, breaking ties at random (for example ORDER BY views ASC, RAND() LIMIT 1), and increase the view count of the row that was selected. This will ensure every ad is viewed the same number of times, but shown in a random order.
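The same "fewest views first, random among ties" logic can be sketched without a database. The function name nextAd and the in-memory $views array are stand-ins for the table and its counter column; the array must not be empty.

```php
<?php
/**
 * Return the key of a randomly chosen ad among those with the fewest views,
 * and increment its view counter.
 */
function nextAd(array &$views)
{
    $min = min($views);
    $candidates = array_keys($views, $min, true); // all ads tied for fewest views
    $chosen = $candidates[array_rand($candidates)];
    $views[$chosen]++;
    return $chosen;
}

// Hypothetical view counters, one per ad.
$views = ['ad A' => 0, 'ad B' => 0, 'ad C' => 0];

for ($i = 0; $i < 9; $i++) {
    echo nextAd($views), PHP_EOL;
}
print_r($views); // every counter is 3 after nine draws
```

Within each round of three the order is random, but no ad can get more than one view ahead of the others.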