Scratchcard PHP script - php

I'm working on a scratchcard script and I was wondering if someone could help me out, if you don't understand odds this may melt your brain a little!
So, the odds can vary: 1/$x; Let's say for now: $x = 36;
So here's what I am trying to understand...
I want to generate 9 random numbers between 1 and 5.
I want the odds of 3 numbers matching equivalent to 1/36.
It must be impossible to generate over 3 duplicate numbers at a time.
I can imagine an array loop of some kind would probably be the correct way of passage?

Sometimes - and this is one of those times - cheating is the best way to do what you want to do.
a) Set up an array of your 9 numbers, and a 2nd frequency array (5 elements) that counts which number occurs how often.
b) Generate a random number 1-5. Set the 1st and 2nd card to this number, and mark this number with 2 in your freqency array.
c) If random(36) < 1 (1/36 probability), set your 3rd card to the same number and mark this number with 3 in your frequency array.
d) Generate the rest of the cards - find a random number, repeat while frequency of the found number >=2, set the next card to number, increase frequency of the found number.
e) When finished, shuffle the cards (generate 2 random numbers between 1 and 9, swap the 2 cards, repeat 20-30 times).
Part d) is what i call cheating - you've put your 1/36 probabilty in step c), and in d) you just make sure you don't generate another match. e) is used to hide that from the user.

Related

Anti-forgery unique serial number generation

I am trying to generate a random serial number to put on holographic stickers in order to let customers check if the purchased product is authentic or not.
Preface:
Once you input that and query that code it will be nulled, so next time you do it again you receive a message that the product might be fake because the code is already used.
Considering that I should make this system for a factory that produces no more than 2/3 millions pieces a year, for me is a bit hard understand how to set up everything, at least the 1st time…
I thought about 20 digits code in 4 groups (no letters because must be very easy for the user read and input the code)
12345-67890-98765-43210
This is what I think is the easiest way to do everything:
function mycheckdigit()
{
...
return $myserial;
}
$mycustomcode="123";
$qty=20000;
$myfile = fopen("./thefile.txt","w") or die("Houston we got a problem here");
//using a txt file for a test, should be a DB instead...
for($i=0;$i<=$qty;$i++) {
$txt = date("y").$mycustomcode.str_pad(gettimeofday()['usec'],6,STR_PAD_LEFT).random_int(1000000,9999999). "\n";
//here the code to make check digits
mycheckdigit($txt);
fwrite($myfile,$myserial);
}
fclose($myfile);
The 1st group identifying something like year: 18 and 3 custom code
The 2nd group include microtime (gettimeofday()['usec'])
The 3rd completely random
last group including 3 random number and a check digit for group 1 and a check digit for group 2
in short:
Y= year
E= part of the EAN or custom code
M= Microtime generated number (gettimeofday()['usec'])
D= random_int() digits
C= Check Digit
YYEEE-MMMMM-MDDDD-DDDCC
In this way, I have a prefix that changes every year, I can recognize what brand is the product (so I could use one DB source only) and I still have enough random digits to be - maybe - quite unique if I consider that I will “pick-up” only a portion of the numbers from 1,000,000 and 9,999,999 and split it following using above sorting
Some questions for you:
Do you think I have enough combinations to not generate same code in one year considering 2 million codes? I would not use a lookup in the DB for the same code if it is not really necessary because could slow down batch generation (executed in batch during production process)
Could be better put some also unique identifier, like a day of the year (001-365) and make random_int() 3 digits shorter? Please Consider that I will generate codes monthly and not daily (but I think there is no big change in uniqueness)
Considering that backend in PHP I am thinking to use mt_rand() function, could be a good approach?
UPDATE: After the #apokryfos suggestion, I read more about UUID generation and similar I found a good compromise using random_int() instead.
Because I just need digits, so HEX hashes are not useful for my needs and making things more complicated
I would avoid using complex cryptographic things like RSA keys and so on…
I don’t need that level of security and complexity, I just need a way to generate a unique serial number, most unique as possible that is not easy to be guessed and nulled if you don’t scratch the sticker (so number creation should not be made A to Z, but randomly)
You can play with 11 random digits per year so that's 11 digit numbers 1 to 99999999999 (99.9 billion is a lot more than 2 million) so w.r.t. enough combinations I think you're covered.
However using mt_rand you're likely to get collisions. Here's a way to plan your way to 2 million random numbers before using the database:
<?php
$arr = [];
while (count($arr) < 1000000) {
$num = mt_rand(1, 99999999999);
$numStr = str_pad($num,11,0,STR_PAD_LEFT); //Force 11 digits
if (!isset($arr[$numStr])) {
$arr[$numStr] = true;
}
}
$keys= array_keys($arr);
The number of collisions is generally low (the first collision occurs at at about 300 000 - 500 000 numbers generated so it's pretty rare.
Each value in the array $keys is an 11 digit number which is random and unique.
This approach is relatively fast but be aware it will need quite a bit of memory (more than 128MB).
This being said, a more generally used method is to generate a universally unique identifier (UUID) which is a lot more likely to be unique and will therefore does not really need checking for uniqueness.

Subset Sum floats Elimations

I will be happy to get some help. I have the following problem:
I'm given a list of numbers and a target number.
subset_sum([11.96,1,15.04,7.8,20,10,11.13,9,11,1.07,8.04,9], 20)
I need to find an algorithm that will find all numbers that combined will sum target number ex: 20.
First find all int equal 20
And next for example the best combinations here are:
11.96 + 8.04
1 + 10 + 9
11.13 + 7.8 + 1.07
9 + 11
Remaining value 15.04.
I need an algorithm that uses 1 value only once and it could use from 1 to n values to sum target number.
I tried some recursion in PHP but runs out of memory really fast (50k values) so a solution in Python will help (time/memory wise).
I'd be glad for some guidance here.
One possible solution is this: Finding all possible combinations of numbers to reach a given sum
The only difference is that I need to put a flag on elements already used so it won't be used twice and I can reduce the number of possible combinations
Thanks for anyone willing to help.
there are many ways to think about this problem.
If you do recursion make sure to identify your end cases first, then proceed with the rest of the program.
This is the first thing that comes to mind.
<?php
subset_sum([11.96,1,15.04,7.8,20,10,11.13,9,11,1.07,8.04,9], 20);
function subset_sum($a,$s,$c = array())
{
if($s<0)
return;
if($s!=0&&count($a)==0)
return;
if($s!=0)
{
foreach($a as $xd=>$xdd)
{
unset($a[$xd]);
subset_sum($a,$s-$xdd,array_merge($c,array($xdd)));
}
}
else
print_r($c);
}
?>
This is possible solution, but it's not pretty:
import itertools
import operator
from functools import reduce
def subset_num(array, num):
subsets = reduce(operator.add, [list(itertools.combinations(array, r)) for r in range(1, 1 + len(array))])
return [subset for subset in subsets if sum(subset) == num]
print(subset_num([11.96,1,15.04,7.8,20,10,11.13,9,11,1.07,8.04,9], 20))
Output:
[(20,), (11.96, 8.04), (9, 11), (11, 9), (1, 10, 9), (1, 10, 9), (7.8, 11.13, 1.07)]
DISCLAIMER: this is not a full solution, it is a way to just help you build the possible subsets. It does not help you to pick which ones go together (without using the same item more than once and getting the lowest remainder).
Using dynamic programming you can build all the subsets that add up to the given sum, then you will need to go through them and find which combination of subsets is best for you.
To build this archive you can (I'm assuming we're dealing with non-negative numbers only) put the items in a column, go from top to bottom and for each element compute all the subsets that add up to the sum or a lower number than it and that include only items from the column that are in the place you are looking at or higher. When you build a subset you put in its node both the sum of the subset (which may be the given sum or smaller) and the items that are included in the subset. So in order to compute the subsets for an item [i] you need only look at the subsets you've created for item [i-1]. For each of them there are 3 options:
1) the subset's sum is the given sum ---> Keep the subset as it is and move to the next one.
2) the subset's sum is smaller than the given sum but larger than it if item [i] is added to it ---> Keep the subset as it is and move on to the next one.
3) the subset's sum is smaller than the given sum and it will still be smaller or equal to it if item [i] is added to it ---> Keep one copy of the subset as it is and create another one with item [i] added to it (both as a member and added to the sum of the subset).
When you're done with the last item (item [n]), look at the subsets you've created - each one has its sum in its node and you can see which ones are equal to the given sum (and which ones are smaller - you don't need those anymore).
As I wrote at the beginning - now you need to figure out how to take the best combination of subsets that do not have a shared member between any of them.
Basically you're left with a problem that resembles the classic knapsack problem but with another limitation (not every stone can be taken with every other stone). Maybe the limitation actually helps, I'm not sure.
A bit more about the advantage of dynamic programming in this case
The basic idea of dynamic programming instead of recursion is to trade redundancy of operations with occupation of memory space. By that I mean to say that recursion with a complex problem (normally a backtrack knapsack-like problem, as we have here) normally ends up calculating the same thing a fair amount of times because the different branches of calculation have no concept of each other's operations and results. Dynamic programming saves the results and uses them along the way to build "bigger" results, relying on the previous/"smaller" ones. Because the use of the stack is much more straightforward than in recursion, you don't get the memory problem you get with recursion regarding the maintenance of the function's state, but you do need to handle a great deal of memory that you store (sometimes you can optimise that).
So for example in our problem, trying to combine a subset that would add up to the required sum, the branch that starts with item A and the branch that starts with item B do not know of each other's operations. let's assume item C and item D together add up to the sum, but either of them added alone to A or B would not exceed the sum, and that A don't go with B in the solution (we can have sum=10, A=B=4, C=D=5 and there is no subset that sums up to 2 (so A and B can't be in the same group)). The branch trying to figure out A's group would (after trying and rejecting having B in its group) add C (A+C=9) and then add D, in which point would reject this group and trackback (A+C+D=14 > sum=10). The same would happen to B of course (A=B) because the branch figuring out B's group has no information regarding what just happened to the branch dealing with A. So in fact we've calculated C+D twice, and haven't even used it yet (and we're about to calculate it yet a third time to realise they belong in a group of their own).
NOTE:
Looking around while writing this answer I came across a technique I was not familiar with and might be a better solution for you: memoization. Taken from wikipedia:
memoization is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.
So I have a possbile solution:
#compute difference between 2 list but keep duplicates
def list_difference(a, b):
count = Counter(a) # count items in a
count.subtract(b) # subtract items that are in b
diff = []
for x in a:
if count[x] > 0:
count[x] -= 1
diff.append(x)
return diff
#return combination of numbers that match target
def subset_sum(numbers, target, partial=[]):
s = sum(partial)
# check if the partial sum is equals to target
if s == target:
print "--------------------------------------------sum_is(%s)=%s" % (partial, target)
return partial
else:
if s >= target:
return # if we reach the number why bother to continue
for i in range(len(numbers)):
n = numbers[i]
remaining = numbers[i+1:]
rest = subset_sum(remaining, target, partial + [n])
if type(rest) is list:
#repeat until rest is > target and rest is not the same as previous
def repeatUntil(subset, target):
currSubset = []
while sum(subset) > target and currSubset != subset:
diff = subset_sum(subset, target)
currSubset = subset
subset = list_difference(subset, diff)
return subset
Output:
--------------------------------------------sum_is([11.96, 8.04])=20
--------------------------------------------sum_is([1, 10, 9])=20
--------------------------------------------sum_is([7.8, 11.13, 1.07])=20
--------------------------------------------sum_is([20])=20
--------------------------------------------sum_is([9, 11])=20
[15.04]
Unfortunately this solution does work for a small list. For a big list still trying to break the list in small chunks and calculate but the answer is not quite correct. You can see it o a new thread here:
Finding unique combinations of numbers to reach a given sum

Lottery number analysis

I'm trying to perform some basic analysis on Lotto results :)
I have a database that looks something like:
id|no|day|dd|mmm|yyyy|n1|n2|n3|n4|n5|n6|bb|jackpot|wins|machine|set
--------------------------------------------------------------------
1 |22|mon|22|aug|1999|01|05|11|29|38|39|04|2003202| 1 | Topaz | 3
2 |23|tue|24|aug|1999|01|06|16|21|25|39|03|2003202| 2 | Pearl | 1
That's just an example. So, n1 to n6 are standard balls in the lottery and bb stands for the bonus ball.
I want to write a PHP/SQL code that will display just one random sequence of numbers that have yet to come out. However, If the numbers 01, 04, 05, 11, 29, 38 and 39 have come out, I don't want the code to print out them numbers but just in a different order, as in theory them set of numbers are already winning numbers.
I just can't get my head around the logic of this. I'd appreciate any help.
Thanks in advance
Assuming that the balls are stored in ascending order in your database like the examples you've given, you could just generate a random sequence of 6 numbers, sort them and then generate 1 random bonus number. Once you've done that it would just be a matter of doing a simple SQL query into your database and seeing if it comes back with a result:
$nums=...//generate your 6 numbers plus bonus number here
sort($nums);
$mysqli=new mysqli('...','...','...','...');
$stmt=$mysqli->prepare("SELECT * FROM table
WHERE n1=? AND n2=? AND n3=? AND n4=? AND n5=? AND n6=? AND bb=?");
$stmt->bind_param('iiiiiii', $nums[0], $nums[1], $nums[2], $nums[3], $nums[4], $nums[5], $nums[6]);
$stmt->execute();
$stmt->store_result();
if($stmt->num_rows==0)
//your numbers have not been drawn before - return them
else
//otherwise loop round and try again
As long as both list of numbers (but not the bonus ball) are sorted you won't have any problems with a different ordering of an already drawn set of numbers.
This will become less efficient as your database of previous draws gets fuller, but I don't think you'll have to worry about that for a few decades. :-)
What about sorting each already drawn result (each row) in some order, ascending maybe, then sort the set of already drawn results (all rows)? Then you will have a easy to look up in list in which you can see what is left to be drawn.
Say for example you want a never drawn set before? You would just have to loop through the list until you spot a "hole", which would be a never before drawn set. If you would like to optimise further you could store at what index you last found a "hole" as well. Then you would never need to loop through the same part of the list twice, and you could even abandon "completed" parts of the list to save disk space, or if you would like the new number you come up with to seam random you could start at a random offset in the list.
To do this effectively you should make an extra column to store the pre-sorted set. For example if you have (5, 3, 6, 4, 1, 2) that column could contain 010203040506. Add in enough zeros so that the numbers occur on a fixed offset basis.

generate a random number between 1 and x where a lower number is more likely than a higher one

This is more of a maths/general programming question, but I am programming with PHP is that makes a difference.
I think the easiest way to explain is with an example.
If the range is between 1 and 10.
I want to generate a number that is between 1 an 10 but is more likely lower than high.
The only way I can think is generate an array with 10 elements equal to 1, 9 elements equal to 2, 8 elements equal to 3.....1 element equal to 10. Then generate a random number based on the number of elements.
The trouble is I am potentially dealing with 1 - 100000 and that array would be ridiculously big.
So how best to do it?
Generate a random number between 0 and a random number!
Generate a number between 1 and foo(n), where foo runs an algorithm over n (e.g. a logarithmic function). Then reverse foo() on the result.
Generate number n which is 0 <= n < 1, multiply it by itself, than multiply by x, run floor on it and add 1. Sorry I used php toooo long ago to write code in it
You could do
$rand = floor(100000 * (rand(0, 1)*rand(0, 1)));
Or something along these lines
There are basically two (or more?) ways to map uniform density to any distribution function: Inverse transformation sampling and Rejection sampling. I think in your case you should use the former.
Quick and simple:
rand(1, rand(1, n))
What you need to do is generate a random number over a greater interval (preferably floating point), and map that into [1,10] in a nonuniform way. Exactly what way depends on how much more likely you want a 1 to be than a 9 or 10.
For C language solutions, see these libraries. You may find use for this in PHP.
Generally speaking, it looks like you want to draw a random number from a Poisson distribution rather than the [uniform distribution](http://en.wikipedia.org/wiki/Uniform_distribution_(continuous)). On the wiki page cited above there is a section which specifically states how you can use the continuous distribution to generate a pseudo-Poisson distribution... check it out. Note that you may want to test different values of λ to ensure the distribution works as you want it to.
It depends on what distribution you want to have exactly, i.e., what number should appear with what probability.
For instance, for even n you could do the following: generate one integer random number x between 1 and n/2 and generate a second number between 1 and n+1. If y > x you generate x otherwise you generate n-x+1. This should give you the distribution in your example.
I think this should give the requested distribution:
Generate a random number in the range 1 .. x. Generate another one in the range 1 .. x+1.
Return the minimum of the two.
Let's think about how your array idea changes the probabilities. Normally every element from 1 to n has a probability of 1/n and is thus equally likely.
Since you have n entries for 1, n-1 entries for 2...1 entry for n, then the total number of entries you have is an arithmetic series. The sum of an arithmetic series counting from 1 to n is n(1+n)/2. So now we know every element's probability should use that as the denominator.
Element 1 has n entries, so it's probability is n/n(1+n)/2. Element 2 is n-1/n(1+n)/2 ... n is 1/n(1+n)/2. That gives a general formula of the numerator as n+1 -i, where i is the number you are checking. That means we now have a function for the probability of any element as n-i+1/n(1+n)/2. all probabilities are between 0 and 1 and sum to 1 by definition, and that is key to the next step.
How can we use this function to skew the number of times an element appears? It's easier with continuous distributions (ie doubles instead of ints) but we can do it. First let's make an array of our probabilities, call it c, and make a running sum of them (cumsum) and store it back in c. If that doesn't make sense, its just a loop like
for(j=0; j < n-1; j++)
if(j) c[j]+=c[j-1]
Now that we have this cumulative distribution, generate a number i from 0 to 1 (a double, not an int. We can check if i is between 0 and c[0], return 1. if i is between c[1] and c[2] return 2...all the way up to n. e.g.
for(j=0; j < n=1;j++)
if(i %lt;= c[j]) return i+1
This will distribute the integers according to the probabilities you have calculated.
<?php
//get random number between 1 and 10,000
$random = mt_rand(1, 10000);
?>

Permutations of Varying Size

I'm trying to write a function in PHP that gets all permutations of all possible sizes. I think an example would be the best way to start off:
$my_array = array(1,1,2,3);
Possible permutations of varying size:
1
1 // * See Note
2
3
1,1
1,2
1,3
// And so forth, for all the sets of size 2
1,1,2
1,1,3
1,2,1
// And so forth, for all the sets of size 3
1,1,2,3
1,1,3,2
// And so forth, for all the sets of size 4
Note: I don't care if there's a duplicate or not. For the purposes of this example, all future duplicates have been omitted.
What I have so far in PHP:
function getPermutations($my_array){
$permutation_length = 1;
$keep_going = true;
while($keep_going){
while($there_are_still_permutations_with_this_length){
// Generate the next permutation and return it into an array
// Of course, the actual important part of the code is what I'm having trouble with.
}
$permutation_length++;
if($permutation_length>count($my_array)){
$keep_going = false;
}
else{
$keep_going = true;
}
}
return $return_array;
}
The closest thing I can think of is shuffling the array, picking the first n elements, seeing if it's already in the results array, and if it's not, add it in, and then stop when there are mathematically no more possible permutations for that length. But it's ugly and resource-inefficient.
Any pseudocode algorithms would be greatly appreciated.
Also, for super-duper (worthless) bonus points, is there a way to get just 1 permutation with the function but make it so that it doesn't have to recalculate all previous permutations to get the next?
For example, I pass it a parameter 3, which means it's already done 3 permutations, and it just generates number 4 without redoing the previous 3? (Passing it the parameter is not necessary, it could keep track in a global or static).
The reason I ask this is because as the array grows, so does the number of possible combinations. Suffice it to say that one small data set with only a dozen elements grows quickly into the trillions of possible combinations and I don't want to task PHP with holding trillions of permutations in its memory at once.
Sorry no php code, but I can give you an algorithm.
It can be done with small amounts of memory and since you don't care about dupes, the code will be simple too.
First: Generate all possible subsets.
If you view the subset as a bit vector, you can see that there is a 1-1 correspondence to a set and a binary number.
So if your array had 12 elements, you will have 2^12 subsets (including empty set).
So to generate a subset, you start with 0 and keep incrementing till you reach 2^12. At each stage you read the set bits in the number to get the appropriate subset from the array.
Once you get one subset, you can now run through its permutations.
The next permutation (of the array indices, not the elements themselves) can be generated in lexicographic order like here: http://www.de-brauwer.be/wiki/wikka.php?wakka=Permutations and can be done with minimal memory.
You should be able to combine these two to give your-self a next_permutation function. Instead of passing in numbers, you could pass in an array of 12 elements which contains the previous permutation, plus possibly some more info (little memory again) of whether you need to go to the next subset etc.
You should actually be able to find very fast algorithms which use minimal memory, provide a next_permutation type feature and do not generate dupes: Search the web for multiset permutation/combination generation.
Hope that helps. Good luck!
The best set of functions I've come up with was the one provided by some user at the comments of the shuffle function on php.net Here is the link It works pretty good.
Hope it's useful.
The problem seems to be trying to give an index to every permutation and having a constant access time. I cannot think of a constant time algorithm, but maybe you can improve this one to be so. This algorithm has a time complexity of O(n) where n is the length of your set. The space complexity should be reducible to O(1).
Assume our set is 1,1,2,3 and we want the 10th permutation. Also, note that we will index each element of the set from 0 to 3. Going by your order, this means the single element permutations come first, then the two element, and so on. We are going to subtract from the number 10 until we can completely determine the 10th permutation.
First up are the single element permutations. There are 4 of those, so we can view this as subtracting one four times from 10. We are left with 6, so clearly we need to start considering the two element permutations. There are 12 of these, and we can view this as subtracting three up to four times from 6. We discover that the second time we subtract 3, we are left with 0. This means the indexes of our permutation must be 2 (because we subtracted 3 twice) and 0, because 0 is the remainder. Therefore, our permutation must be 2,1.
Division and modulus may help you.
If we were looking for the 12th permutation, we would run into the case where we have a remainder of 2. Depending on your desired behavior, the permutation 2,2 might not be valid. Getting around this is very simple, however, as we can trivially detect that the indexes 2 and 2 (not to be confused with the element) are the same, so the second one should be bumped to 3. Thus the 12th permutation can trivially be calculated as 2,3.
The biggest confusion right now is that the indexes and the element values happen to match up. I hope my algorithm explanation is not too confusing because of that. If it is, I will use a set other than your example and reword things.
Inputs: Permutation index k, indexed set S.
Pseudocode:
L = {S_1}
for i = 2 to |S| do
Insert S_i before L_{k % i}
k <- k / i
loop
return L
This algorithm can also be easily modified to work with duplicates.

Categories