I need to generate three different random numbers without repeats. Each of the three numbers must be within 10 of the answer.
For the sample IQ question: 4, 6, 9, 6, 14, 6, ... Answer: 19
A: random numbers
B: random numbers
C: random numbers
D: random numbers
One of them is the answer.
I am currently using the following code, but sometimes the numbers are repeated. I have tried shuffle, but I could not get it to satisfy the requirement that the random numbers be within 10 of the answer.
$ans = $row['answer'];
$a = rand (1,10);
$a1 = rand($ans-$a ,$ans+$a);
$a2 = rand($ans-$a ,$ans+$a);
$a3 = rand($ans-$a ,$ans+$a);
As shown in previous answers (e.g. Generating random numbers without repeats, Simple random variable php without repeat, Generating random numbers without repeats) you can use shuffle to randomise a range, and then pick three items using array_slice.
The difference in your case is how you define the range:
Rather than 1 to 10, you want $ans - 10 to $ans + 10
You want to exclude the right answer
One way to build that is as two ranges: lower limit up to but not including right answer, and right answer + 1 up to upper limit.
function generate_wrong_answers($rightAnswer) {
    // Generate all wrong guesses from 10 below to 10 above,
    // but miss out the correct answer
    $wrongAnswers = array_merge(
        range($rightAnswer - 10, $rightAnswer - 1),
        range($rightAnswer + 1, $rightAnswer + 10)
    );
    // Randomise
    shuffle($wrongAnswers);
    // Pick 3
    return array_slice($wrongAnswers, 0, 3);
}
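To turn the three wrong answers into the four options A to D, one possibility is to add the right answer and shuffle once more. This is only a sketch: build_options is a hypothetical helper name, and it assumes the answer is an integer.

```php
<?php
// Same helper as above: three distinct wrong answers within 10 of the right one.
function generate_wrong_answers(int $rightAnswer): array {
    $wrongAnswers = array_merge(
        range($rightAnswer - 10, $rightAnswer - 1),
        range($rightAnswer + 1, $rightAnswer + 10)
    );
    shuffle($wrongAnswers);
    return array_slice($wrongAnswers, 0, 3);
}

// Hypothetical helper: mix the right answer in and label the options A-D.
function build_options(int $rightAnswer): array {
    $options = generate_wrong_answers($rightAnswer);
    $options[] = $rightAnswer;
    shuffle($options); // so the right answer is not always option D
    return array_combine(['A', 'B', 'C', 'D'], $options);
}

print_r(build_options(19));
```

Because the wrong answers are drawn from a shuffled list of distinct values, none of the four options can repeat.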
I need to store many numbers (I can decide which numbers) as a single unique number from which I should be able to retrieve the original numbers.
I already know 2 ways to do this:
1) Fundamental theorem of arithmetic (Prime Numbers)
Say I have 5 values; I assign a distinct prime number to each value:
a = 2
b = 3
c = 5
d = 7
e = 13
If I want to store a, b and c, I can multiply them: 2*3*5=30, and I know no other product of primes can be 30. Then, to check whether the stored number contains, for example, b, all I need to do is check 30 % b == 0.
2) Bitmask
Just like Linux permissions, use powers of 2 and sum each value
But the numbers produced by these 2 methods grow fast (the first faster than the second), and using prime numbers requires me to have a lot of primes.
Is there any other method to do this efficiently when you have, for example, a thousand values?
If you are storing, say, base 10 numbers, then do a conversion through base 11 numbers. With the increased base, you have an extra 'digit'. Use that digit as a separator. So, three base 10 numbers "10, 42, 457" become "10A42A457": a single base 11 number (with 'A' as the additional digit).
Whatever base your original numbers are in, increase the base by 1 and concatenate, using the extra digit as a separator. That will give you a single number in the increased base.
That single number can be stored in whatever number base you find convenient: binary, denary or hex for example.
To retrieve your original numbers just convert to base 11 (or whatever) and replace the extra digit with separators.
ETA: You don't have to use base 11. The single number "10A42A457" is also a valid hexadecimal number, so any base of 11 or above could be used. Hex may be easier to work with than base 11.
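The separator-digit idea can be sketched as follows, using base 11 with 'a' as the extra digit. The function names pack_numbers and unpack_numbers are hypothetical, and the sketch assumes non-negative integers whose combined encoding still fits in a PHP integer (there is no overflow check).

```php
<?php
// Join the decimal numbers with 'a' and read the result as one base-11 integer.
function pack_numbers(array $numbers): int {
    $digits = implode('a', $numbers);          // e.g. "10a42a457"
    $packed = 0;
    foreach (str_split($digits) as $d) {
        $packed = $packed * 11 + ($d === 'a' ? 10 : (int)$d);
    }
    return $packed;
}

// Write the integer back out in base 11 and split on the separator digit.
function unpack_numbers(int $packed): array {
    $digits = '';
    do {
        $r = $packed % 11;
        $digits = ($r === 10 ? 'a' : (string)$r) . $digits;
        $packed = intdiv($packed, 11);
    } while ($packed > 0);
    return array_map('intval', explode('a', $digits));
}

$packed = pack_numbers([10, 42, 457]);
print_r(unpack_numbers($packed));
```

For long lists the packed value quickly exceeds 64 bits, so in practice an arbitrary-precision type or a string would be needed.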
Is there any other method to do this efficiently when you have, for example, a thousand values?
I am not a mathematician, but it's basic math; it all depends on the range.
Range 0-1: you want to store 4 numbers, each 0-1. That is basically the binary system:
Number1 + Number2 * 2^1 + Number3 * 2^2 + Number4 * 2^3
Range 0-49: you want to store 4 numbers, each 0-49:
Number1 + Number2 * 50^1 + Number3 * 50^2 + Number4 * 50^3
Range 0-X: you want to store N numbers, each 0-X:
Number1 + Number2 * (X+1)^1 + Number3 * (X+1)^2 + ... + NumberN * (X+1)^(N-1)
If your numbers have no pattern (so they cannot be compressed in some way), there is really no other way.
It is also very easy for a computer to decode the number, unlike the prime-number scheme.
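The positional formula above could be implemented roughly like this. pack_fixed and unpack_fixed are hypothetical names; the sketch assumes every value lies in 0..$base-1 and that the result fits in a PHP integer.

```php
<?php
// Number1 + Number2 * base^1 + Number3 * base^2 + ... (Horner's scheme, last value first).
function pack_fixed(array $values, int $base): int {
    $packed = 0;
    foreach (array_reverse($values) as $v) {
        $packed = $packed * $base + $v;
    }
    return $packed;
}

// Peel the values off again, lowest position first.
function unpack_fixed(int $packed, int $base, int $count): array {
    $values = [];
    for ($i = 0; $i < $count; $i++) {
        $values[] = $packed % $base;
        $packed = intdiv($packed, $base);
    }
    return $values;
}

$packed = pack_fixed([3, 49, 0, 12], 50);   // 3 + 49*50 + 0*50^2 + 12*50^3
print_r(unpack_fixed($packed, 50, 4));
```

The count has to be known (or stored separately) when decoding, because trailing zeros are otherwise indistinguishable from missing values.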
Predetermined values
@FlorianK's comment pointed me to a fact I missed:
(i can decide which numbers)
The only logical solution is to give your numbers references:
0 is 15342
1 is 6547
2 is 76234
3 is "i like stack overflow"
4 is 42141
So you'll work with the range 0-4 (5 options) and whatever combination length you need. Use the reference table when encoding and decoding the number.
a thousand values?
So you'll work with the range 0-999:
0 is 62342
1 is 7456345653
2 is 45656234532
...
998 is 7623452
999 is 4324234326453
Let's say you use a 64-bit system and a programming/DB language that works with 64-bit integers.
2^64 = 18446744073709551616
Your limit is 1000^X < 18446744073709551616, where X is how many numbers you can store in a single 64-bit integer.
That gives X = 6 (1000^6 = 1e+18 fits, 1000^7 = 1e+21 does not).
So only 6 separate numbers in the range 0-999 fit in one 64-bit integer.
0,0,0,0,0,0 is 0
1,0,0,0,0,0 is 1
0,1,0,0,0,0 is 1000
999,999,999,999,999,999 is ~1e+18
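The capacity limit can be checked with a few lines of code. max_values_per_int is a hypothetical helper; note that PHP integers are signed, so the ceiling here is PHP_INT_MAX (2^63 - 1 on 64-bit builds) rather than 2^64, and the check is slightly conservative because it requires base^count itself to fit.

```php
<?php
// How many values in 0..$base-1 can be packed into one PHP integer?
// Assumes $base >= 2.
function max_values_per_int(int $base): int {
    $count = 0;
    $capacity = 1; // number of distinct combinations representable so far
    while ($capacity <= intdiv(PHP_INT_MAX, $base)) {
        $capacity *= $base;
        $count++;
    }
    return $count;
}

echo max_values_per_int(1000), "\n"; // 6 on a 64-bit PHP build
```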
OK, so you want to store "a,b,c" or "a,b" or "a,b,c,d" or "a" etc. (thanks @FlorianK).
In that case you could just use bitwise operators and powers of two:
$a = 1 << 0; // 1
$b = 1 << 1; // 2
$c = 1 << 2; // 4
$d = 1 << 3; // 8
.. etc
let's say $flag has $a and $c
$flag = $a | $c; // $flag is integer here
now check it
$ok = ($flag & $a) && ($flag & $c); // true
$ok = ($flag & $a) && ($flag & $b); // false
So in a 64-bit system/language/OS you can use up to 64 flags, which gives you 2^64 combinations.
There is really no other option. Prime numbers are much worse for this, because you skip many numbers in between, while the binary system uses every single number.
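For more than a handful of flags it can help to wrap the bit operations in small helpers. The names below are hypothetical, and the sketch restricts itself to bit positions 0..62, since PHP integers are signed and the top bit would make the value negative.

```php
<?php
// Build one integer from a list of flag positions (each 0..62).
function flags_from_indices(array $indices): int {
    $flags = 0;
    foreach ($indices as $i) {
        $flags |= 1 << $i;
    }
    return $flags;
}

// Recover the list of flag positions from the integer.
function indices_from_flags(int $flags): array {
    $indices = [];
    for ($i = 0; $flags !== 0; $i++, $flags >>= 1) {
        if ($flags & 1) {
            $indices[] = $i;
        }
    }
    return $indices;
}

// Test a single position.
function has_flag(int $flags, int $index): bool {
    return (($flags >> $index) & 1) === 1;
}

$flag = flags_from_indices([0, 2]);      // same as $a | $c above
var_dump(has_flag($flag, 2), has_flag($flag, 1));
```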
I see you are using a database and want to store this in the DB.
I really think we are dealing with an XY problem here, and you should reconsider your application design instead of making such workarounds.
I need to create a function which takes a single integer as argument in the range 0-N and returns a seemingly random number in the same range.
Each input number should always have exactly one output and it should always be the same.
Such a function would produce something like this:
f(1) = 4
f(2) = 1
f(3) = 5
f(4) = 2
f(5) = 3
I believe this could be accomplished by some kind of a hashing algorithm? I don't need anything complex, just not something too simple like f(1) = 2, f(2) = 3 etc.
The biggest issue is that I need this to be reversible. E.g. the above table should be true left-to-right as well as right-to-left, using a different function for the right-to-left conversion is fine.
I know the easiest way is to create an array, shuffle it and just store the relations in a db or something, but as I need N to be quite large I'd like to avoid this if possible.
Edit: For my particular case N is a specific number, it's exactly 16777216 (64^4).
If the range is always a power of two, like [0,16777216), then you can use exclusive-or just as @MarkBaker suggested. It just doesn't work so easily if your range is not a power of two.
You can use addition and subtraction modulo N, although these alone are too obvious, so you have to combine it with something else.
You can also do multiplication modulo-N, but reversing that is complicated. To make it simpler, we can isolate the bottom eight bits and multiply those and add them in a way that doesn't interfere with those bits so we can use them again to reverse the operation.
I don't know PHP so I'm going to give an example in C, instead. Maybe it's the same.
int enc(int x) {
    x = x + 4799 * 256 * (x % 256);
    x = x + 8896843;
    x = x ^ 4777277;
    return (x + 1073741824) % 16777216;
}
And to decode, play the operations back in reverse order:
int dec(int x) {
    x = x + 1073741824;
    x = x ^ 4777277;
    x = x - 8896843;
    x = x - 4799 * 256 * (x % 256);
    return x % 16777216;
}
That 1073741824 must be a multiple of N, and 256 must be a factor of N, and if N is not a power of two then you can't (necessarily) use exclusive-or (^ is exclusive-or in C and I assume in PHP too). The other numbers you can fiddle with, and add and remove stages, at your leisure.
The addition of 1073741824 in both functions is to ensure that x stays positive; this is so that the modulo operation doesn't ever give a negative result, even after we've subtracted values from x which might have made it go negative in the interim.
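Since the question is about PHP, here is a straightforward port of the two C functions. The constant names N and OFFSET are introduced only for this sketch; on a 64-bit PHP build the intermediate values stay well within the integer range.

```php
<?php
const N = 16777216;        // 64^4 = 2^24, the size of the range
const OFFSET = 1073741824; // 2^30, a multiple of N that keeps $x positive

function enc(int $x): int {
    $x = $x + 4799 * 256 * ($x % 256); // multiply the low 8 bits, leave them intact
    $x = $x + 8896843;
    $x = $x ^ 4777277;
    return ($x + OFFSET) % N;
}

function dec(int $x): int {
    $x = $x + OFFSET;
    $x = $x ^ 4777277;
    $x = $x - 8896843;
    $x = $x - 4799 * 256 * ($x % 256); // low 8 bits are already the original ones
    return $x % N;
}

$y = enc(12345);
var_dump(dec($y) === 12345);
```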
I offered to describe how I "randomly" scramble up 9-digit SSNs when producing research data sets. This does not replace or hash an SSN. It re-orders the digits. It is difficult to put the digits back in the correct order if you don't know the order in which they were scrambled. I have a gut feeling that this is not what the questioner really wants. So, I am happy to delete this answer if it is deemed off-topic.
I know that I have 9 digits. So, I start with an array that has 9 index values in order:
$a = array(0,1,2,3,4,5,6,7,8);
Now, I need to turn a key that I can remember into a way to shuffle the array. The shuffling has to produce the same order for the same key every time. I use a couple of tricks. I use crc32 to turn a word into a number. I use srand/rand to get a predictable sequence of random values. Note: mt_rand no longer produces the same sequence of random digits with the same seed, so I have to use rand. (Since PHP 7.1, srand and rand are aliases of mt_srand and mt_rand.)
srand(crc32("My secret key"));
usort($a, function($a, $b) { return rand(-1,1); });
The array $a still has the digits 0 through 8, but they are shuffled. If I use the same keyword I will get the same shuffled order every time. That lets me repeat this every month and get the same result. Then, with a shuffled array, I can pick the digits off the SSN. First, I ensure it has 9 characters (some SSNs are sent as integers and a leading 0 is omitted). Then, I build a masked SSN by picking the digits using $a.
$ssn = str_pad($ssn, 9, '0', STR_PAD_LEFT);
$masked_ssn = '';
foreach($a as $i) $masked_ssn .= $ssn[$i]; // square brackets: the {} offset syntax was removed in PHP 8
$masked_ssn will now have all the digits in $ssn, but in a different order. Technically, there are keywords that make $a become the original ordered array after shuffling, but that is very very rare.
Hopefully this makes sense. If so, you can do it all much faster. If you turn the original string into an array of characters, you can shuffle the array of characters. You just need to reseed rand every time.
$ssn = "111223333"; // Assume I'm using a proper 9-digit SSN
$a = str_split($ssn);
srand(crc32("My secret key"));
usort($a, function($a, $b) { return rand(-1,1); });
$masked_ssn = implode('', $a);
This is not really faster in a runtime way because rand is a rather expensive function and you run rand a hell of lot more here. If you are masking thousands of values as I do, you will want to use an index array that is shuffled just once, not a shuffling for every value.
Now, how do I undo it? Assume I'm using the first method with the index array. It will be something like $a = {5, 3, 6, 1, 0, 2, 7, 8, 4}. Those are the indexes for the original SSN in the masked order. So, I can easily build the original SSN.
$ssn = '000000000'; // I like to define all 9 characters before I start
foreach($a as $i=>$j) $ssn[$j] = $masked_ssn[$i];
As you can see, $i counts from 0 to 8 across the masked SSN. $j counts 5, 3, 6... and puts each value from the masked SSN in the correct place in the original SSN.
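A shuffle that relies on usort with a random comparator depends on both the random number generator and the internals of the sort, so the resulting order may not be stable across PHP versions. As an alternative sketch (a swapped-in technique, not the method described above), the index order can be derived from SHA-256 hashes of the key and each index. The function names are hypothetical.

```php
<?php
// Derive a repeatable ordering of 0..$length-1 from a secret key.
function keyed_order(string $key, int $length): array {
    $order = range(0, $length - 1);
    usort($order, fn($x, $y) => strcmp(
        hash('sha256', "$key:$x"),
        hash('sha256', "$key:$y")
    ));
    return $order;
}

// Pick the characters of $value in the keyed order.
function mask(string $value, array $order): string {
    $masked = '';
    foreach ($order as $i) {
        $masked .= $value[$i];
    }
    return $masked;
}

// Put each masked character back at its original position.
function unmask(string $masked, array $order): string {
    $value = str_repeat('0', count($order));
    foreach ($order as $i => $j) {
        $value[$j] = $masked[$i];
    }
    return $value;
}

$order = keyed_order('My secret key', 9);
$masked = mask('111223333', $order);
var_dump(unmask($masked, $order) === '111223333');
```

As with the original approach, this only reorders digits; it is not encryption.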
It looks like you've got a good answer, but there is still an alternative. A Linear Congruential Generator (LCG) can provide a 1-to-1 mapping, and it is known to be reversible using Euclid's algorithm. For 24 bits:
X_i = (A * X_(i-1) + C) mod M
where M = 2^24 = 16,777,216
A = 16,598,013
C = 12,820,163
For LCG reversibility, take a look at Reversible pseudo-random sequence generator.
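Used as a one-step mapping, the LCG and its inverse might look like the sketch below. The inverse multiplies by the modular inverse of A, which the extended Euclidean algorithm provides; this works because A is odd and therefore coprime to M = 2^24. The function names are hypothetical, and 64-bit PHP integers are assumed so that the products do not overflow.

```php
<?php
const M = 16777216; // 2^24
const A = 16598013;
const C = 12820163;

// Extended Euclid: returns s with ($a * s) % $m === 1, assuming gcd($a, $m) is 1.
function mod_inverse(int $a, int $m): int {
    [$oldR, $r] = [$a, $m];
    [$oldS, $s] = [1, 0];
    while ($r !== 0) {
        $q = intdiv($oldR, $r);
        [$oldR, $r] = [$r, $oldR - $q * $r];
        [$oldS, $s] = [$s, $oldS - $q * $s];
    }
    return (($oldS % $m) + $m) % $m;
}

function lcg_forward(int $x): int {
    return (A * $x + C) % M;
}

function lcg_backward(int $y): int {
    $aInv = mod_inverse(A, M);
    return ($aInv * (($y - C + M) % M)) % M;
}

var_dump(lcg_backward(lcg_forward(12345)) === 12345);
```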
I have a 2-dimensional array in PHP containing ranges, for example:
From.........To
---------------
125..........3957
4000.........5500
5217628......52198281
52272128.....52273151
523030528....523229183
and so on.
It is a very long list. Now I want to see if a number given by the user is in one of the ranges.
For example, the numbers 130, 4200 and 52272933 are in my ranges, but the numbers 1 and 5600 are not.
Of course I can loop over all the indexes and check whether my number is no smaller than the first item and no larger than the second. But is there a faster algorithm, or a more efficient way of doing it using a PHP function?
added later
It is sorted. The numbers are actually created with ip2long() and cover all the IPs of a country.
I just wrote some code for it:
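$ips[1] = array(2, 20, 100);   // lower bounds
$ips[2] = array(10, 30, 200);  // upper bounds
$n = 11; // input ip
$count = count($ips[1]); // number of ranges, not number of sub-arrays
for ($i = 0; $i < $count; $i++) {
    if ($n >= $ips[1][$i]) {
        if ($n <= $ips[2][$i]) {
            echo "$i found";
            break;
        }
    } else {
        // $n is below this range's lower bound, so it cannot be in a later one
        echo "not found";
        break;
    }
}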
In this situation the numbers 2, 8, 22 and 200 are in range, but the numbers 1, 11 and 300 are not.
Put the ranges in a flat array, sorted from lower to higher, like this:
a[0] = 125
a[1] = 3957
a[2] = 4000
a[3] = 5500
a[4] = 5217628
a[5] = 52198281
a[6] = 52272128
a[7] = 52273151
a[8] = 523030528
a[9] = 523229183
Then do a binary search to determine at what index of this array the number in question should be inserted. If the insertion index is even then the number is not in any sub-range. If the insertion index is odd, then the number falls inside one of the ranges.
Examples:
n = 20 inserts at index 0 ==> not in a range
n = 126 inserts at index 1 ==> within a range
n = 523030529 inserts at index 9 ==> within a range
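A minimal sketch of this lookup in PHP follows. The function names are hypothetical; the sketch assumes the flat array is sorted, the ranges do not overlap, and both ends of each range are inclusive, which is why a value that lands exactly on a boundary is treated as inside.

```php
<?php
// Lower bound: the first index whose element is >= $n (count($a) if there is none).
function insertion_index(array $a, int $n): int {
    $lo = 0;
    $hi = count($a);
    while ($lo < $hi) {
        $mid = intdiv($lo + $hi, 2);
        if ($a[$mid] < $n) {
            $lo = $mid + 1;
        } else {
            $hi = $mid;
        }
    }
    return $lo;
}

function in_any_range(array $a, int $n): bool {
    $i = insertion_index($a, $n);
    if ($i < count($a) && $a[$i] === $n) {
        return true;          // exactly on a range boundary
    }
    return $i % 2 === 1;      // odd insertion index: strictly inside a range
}

$a = [125, 3957, 4000, 5500, 5217628, 52198281,
      52272128, 52273151, 523030528, 523229183];
var_dump(in_any_range($a, 4200), in_any_range($a, 5600));
```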
You can speed things up by implementing a binary search algorithm. Thus, you don't have to look at every range.
Then you can use in_array to check if the number is in the array.
I'm not sure if I got you right, do your arrays really look like this:
array(125, 126, 127, ..., 3957);
If so, what's the point? Why not just have?
array(125, 3957);
That contains all the information necessary.
The example you give suggests that the numbers may be large and the space sparse by comparison.
At that point, you don't have very many options. If the array is sorted, binary search is about all there is. If the array is not sorted, you're down to plain, old CS101 linear search.
The standard data structure for this kind of problem is an interval tree. It pays off mainly when ranges can overlap; for sorted, non-overlapping ranges a binary search already finds the answer in logarithmic time.
I am assuming that the ranges do not overlap.
If that is the case, you can maintain a map data structure that is keyed on the lower value of the range.
Now all you have to do (given the number N) is find the largest key in the map that is not greater than N (using binary search, logarithmic complexity) and then check that N is not greater than the upper value of that range.
Basically, it is a binary search (logarithmic) on the constructed map.
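PHP arrays are not ordered maps with a built-in "floor key" lookup, so one way to sketch this idea is with two parallel sorted arrays, as in the question's $ips[1] and $ips[2]. find_range_index is a hypothetical name; the sketch assumes sorted, non-overlapping, inclusive ranges.

```php
<?php
// Returns the index of the range containing $n, or -1 if there is none.
function find_range_index(array $starts, array $ends, int $n): int {
    $lo = 0;
    $hi = count($starts) - 1;
    $best = -1; // index of the greatest lower bound that is <= $n
    while ($lo <= $hi) {
        $mid = intdiv($lo + $hi, 2);
        if ($starts[$mid] <= $n) {
            $best = $mid;
            $lo = $mid + 1;
        } else {
            $hi = $mid - 1;
        }
    }
    if ($best === -1) {
        return -1; // $n is below every range
    }
    return $n <= $ends[$best] ? $best : -1;
}

$starts = [2, 20, 100];
$ends   = [10, 30, 200];
var_dump(find_range_index($starts, $ends, 22), find_range_index($starts, $ends, 11));
```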
From a pragmatic point of view, a linear search may very well turn out to be the fastest lookup method. Think of page faults and hard disk seek time here.
If your array is large enough (whatever "enough" actually means), it may be wise to stuff your IPs in a SQL database and let the database figure out how to efficiently compute SELECT ID FROM ip_numbers WHERE x BETWEEN start AND end;.
When you use the random(min,max) function in most languages, what is the distribution like?
What if I want to produce one range of numbers 20% of the time and another range of numbers 80% of the time? How can I generate a series of random numbers that follows that?
E.g. I should get random values, but the frequency of "1" must be higher by around 20% than the frequency of "0".
In most languages, the random number comes from an algorithm built into the language, seeded from factors such as the time, the processor, or an explicit seed number.
The distribution is not normal. If the function can return, say, 5 different integers, all 5 have an equal chance of appearing on the next function call. This is known as a uniform distribution.
So say if you wish to produce a number (say 7) for 20% of the time, and another number (say 13) for 80% of the time, you can do an array like this:
var arr = [7,13,13,13,13];
var picked = arr[Math.floor(Math.random()*arr.length)] ;
// since Math.random() returns a float from 0.0 (inclusive) to 1.0 (exclusive)
So thus 7 has a 20% chance of appearing, and 13 has 80% chance.
This is one possible method:
def random_within_range(range)
  # number of integers in the range, counting the end unless it is excluded
  size = range.last - range.begin + (range.exclude_end? ? 0 : 1)
  rand(size) + range.begin
end
ranges = [(10..15), (20..30)]
selector = [0, 0, 1, 1, 1, 1, 1, 1, 1, 1] # 20:80 distribution array
# now select a range randomly
random_within_range(ranges[selector[rand(10)]])
Most pseudo-random generators built into programming languages produce a uniform distribution, i.e. each value within the range has the same probability of being produced as any other value in the range. Indeed, in some cases this requirement is part of the language standard. Some languages, such as Python or R, support several of the common distributions.
If the language doesn't support it, you either have to use mathematical tricks to produce other distributions, such as a normal distribution, from a uniform one, or you can look for third-party libraries which perform this function.
Your problem seems much simpler, however, since the random variable is discrete (and of the simplest type thereof, i.e. binary). The trick for these is to produce a random number from the uniform distribution in a given range, say 0 to 999, and to split this range in the proportions associated with each value. In the case at hand this would be something like:
If (RandomNumber) < 200 // 20%
RandomVariable = 0
Else // 80%
RandomVariable = 1
This logic can of course be applied to n discrete variables.
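Generalised to n discrete values, the same range-splitting idea could be sketched in PHP as follows. pick_by_roll and pick_weighted are hypothetical names; the weights are assumed to be positive integers, and the array keys are the values to return.

```php
<?php
// Map a roll in 0..(sum of weights - 1) to the value whose slice contains it.
function pick_by_roll(array $weights, int $roll) {
    foreach ($weights as $value => $weight) {
        if ($roll < $weight) {
            return $value;
        }
        $roll -= $weight; // move on to the next slice
    }
    throw new InvalidArgumentException('roll is outside the total weight');
}

// Draw the roll uniformly, then look up which slice it falls into.
function pick_weighted(array $weights) {
    return pick_by_roll($weights, random_int(0, array_sum($weights) - 1));
}

$weights = [0 => 200, 1 => 800]; // out of 1000: 20% zeros, 80% ones
echo pick_weighted($weights), "\n";
```

Separating the roll from the lookup keeps the lookup deterministic, which makes it easy to check the boundaries (199 versus 200 here).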
Your question differs from your example quite a bit. So I'll answer both and you can figure out whichever answers what you're really looking for.
1) Your example (I don't know ruby or java, so bear with me)
First generate a random number from a uniform distribution from 0 to 1, we'll call it X.
You can then set up an if/else (i.e. if (x < .2) {1} else {0})
2) Generating random numbers from a normal distribution with skew
You can look into skewed distributions such as a skewed Student's t-distribution with a high number of degrees of freedom.
You can also use the normal CDF and just pick off numbers that way.
Here's a paper which discusses how to do it with multiple random numbers from a uniform distribution
Finally, you can use a non-parametric approach which would involve kernel density estimation (I suspect you aren't looking for anything this sophisticated, however).
As everybody says, the pseudo-random number generator in most languages implements the uniform distribution over (0,1).
If you have two response categories (0,1), with probability p for 1, you have a Bernoulli distribution, which can be emulated with
# returns 1 with p probability and 0 with (1-p) probability
def bernoulli(p)
rand()<p ? 1:0;
end
Simple as that.
The skewed normal distribution is an entirely different beast, made from the product of the pdf and cdf of a normal distribution to create the skew. You can read Azzalini's work here. Using the distribution gem, you can compute the probability density function with
# require 'distribution'
def sn_pdf(x,alpha)
sp = 2*Distribution::Normal.pdf(x)*Distribution::Normal.cdf(x*alpha)
end
Obtaining the cdf is difficult, because there isn't an analytical solution, so you have to integrate numerically.
To obtain random numbers from a skewed normal, you could use the acceptance-rejection algorithm.
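One way to sketch acceptance-rejection for the skew normal density 2*pdf(x)*cdf(alpha*x) is to propose x from a standard normal and accept it with probability cdf(alpha*x). That acceptance test can be carried out without computing the cdf at all: draw a second standard normal y and accept when y < alpha*x, which happens with exactly that probability. The function names are hypothetical, and the normal draws use the Box-Muller transform on top of mt_rand.

```php
<?php
// One standard normal draw via the Box-Muller transform.
function standard_normal(): float {
    $u1 = (mt_rand() + 1) / (mt_getrandmax() + 1); // in (0, 1], avoids log(0)
    $u2 = mt_rand() / (mt_getrandmax() + 1);       // in [0, 1)
    return sqrt(-2 * log($u1)) * cos(2 * M_PI * $u2);
}

// Acceptance-rejection: propose x ~ N(0,1), accept with probability cdf(alpha * x).
function skew_normal(float $alpha): float {
    while (true) {
        $x = standard_normal();
        if (standard_normal() < $alpha * $x) {
            return $x;
        }
    }
}

echo skew_normal(4.0), "\n";
```

On average half of the proposals are accepted, regardless of alpha.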
Most computer languages have a uniform distribution to their (pseudo) random integer generators. So each integer is equally likely.
For your example, suppose you want "1" 55% of the time and "0" 45% of the time.
To get these unequal frequencies, try generating a random integer between 1 and 100. If the number generated is from 1 to 55, output "1"; otherwise output "0".
How about
var oneFreq = 80.0/100.0;
var output = 0;
if (Math.random() < oneFreq) // true 80% of the time
    output = 1;
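or, if you want 20% of the values to be between 0 and 100, and 80% to be between 100 and 200:
var oneFreq = 80.0/100.0;
var oneRange = 100;  // width of the upper range (100 to 199)
var zeroRange = 100; // width of the lower range (0 to 99)
var output = Math.random();
if (output < oneFreq)
    // 80% of the time: rescale [0, 0.8) onto the upper range
    output = zeroRange + Math.floor(oneRange * output / oneFreq);
else
    // 20% of the time: rescale [0.8, 1) onto the lower range
    output = Math.floor(zeroRange * (output - oneFreq) / (1 - oneFreq));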
In ruby I would do it like this:
class DistributedRandom
  def initialize(left, right = nil)
    if right
      @distribution = [0] * left + [1] * right
    else
      @distribution = left
    end
  end
  def get
    @distribution[rand @distribution.length]
  end
end
Running a test with 80:20 distribution:
test = [0,0]
rnd = DistributedRandom.new 80, 20 # 80:20 distribution
10000.times { test[rnd.get] += 1 }; puts "Test 1", test
Running a test with 20% more distribution on the right side:
test = [0,0]
rnd = DistributedRandom.new 100, 120 # +20% distribution
10000.times { test[rnd.get] += 1 }; puts "Test 2", test
Running a test with custom distribution with a trigonometric function over 91 discrete values, output however does not fit very well into the previous tests:
test = [0,0]
rnd = DistributedRandom.new((0..90).map {|x| Math.sin(Math::PI * x / 180.0)})
10000.times { test[rnd.get] += 1 }; puts "Test 3", test
Have a look at this lecture if you want a good mathematical understanding.