I'm generating an array of random numbers, between 0 and 2 with this code:
for ($j = 0; $j < 60; $j++) {
    for ($i = 0; $i < 100; $i++) {
        $value = rand(0,2);
        $DBH->query("INSERT INTO map (x, y, value) VALUES($i, $j, $value);");
    }
}
And I found an oddity: as you can see here, the rows are random, but they repeat:
22121000210211220022122200120200122000122121
22121000210211220022122200120200122000122121
22121000210211220022122200120200122000122121
22121000210211220022122200120200122000122121
22121000210211220022122200120200122000122121
How can I avoid that?
You might want to explicitly seed your generator using srand, e.g. srand(time()). The srand documentation has a better seeding example than just using time(); which one you pick depends on how random you need the numbers to be, I suppose.
Failing that:
You could try using mt_rand with mt_srand
You could always use MySQL's rand function to generate the numbers as a workaround.
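A minimal sketch of the mt_srand/mt_rand suggestion, assuming the goal is only to fill the grid. It does not diagnose why the rows repeated. The function generateMap and its optional $seed parameter are hypothetical names used for illustration, and the grid is built in memory rather than written through $DBH. The point it shows is that the seed, if you set one at all, is set once before the loops and never inside them:

```php
<?php
// Hypothetical helper: builds the grid in memory instead of writing to the database.
function generateMap(int $width, int $height, ?int $seed = null): array
{
    // Seed once, before any numbers are drawn (never inside the loops).
    // Without an explicit seed, PHP seeds the generator automatically.
    if ($seed !== null) {
        mt_srand($seed);
    }

    $map = [];
    for ($y = 0; $y < $height; $y++) {
        for ($x = 0; $x < $width; $x++) {
            $map[$y][$x] = mt_rand(0, 2); // 0, 1 or 2, both bounds inclusive
        }
    }
    return $map;
}

$map = generateMap(100, 60);
foreach ($map as $row) {
    echo implode('', $row), "\n";
}
```

Passing the same seed twice reproduces the same grid, which is handy for debugging but is exactly what you do not want for each row.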
I'm reading "PHP 7 Data Structures and Algorithms", chapter "Shortest path using the Floyd-Warshall algorithm".
The author generates a graph with this code:
$totalVertices = 5;
$graph = [];
for ($i = 0; $i < $totalVertices; $i++) {
    for ($j = 0; $j < $totalVertices; $j++) {
        $graph[$i][$j] = $i == $j ? 0 : PHP_INT_MAX;
    }
}
I don't understand this line:
$graph[$i][$j] = $i == $j ? 0 : PHP_INT_MAX;
It looks like a one-line if statement.
Is it the same as this?
if ($i == $j) {
    $graph[$i][$j] = 0;
} else {
    $graph[$i][$j] = PHP_INT_MAX;
}
What is the point of using PHP_INT_MAX?
And what does the graph look like in the end?
You've correctly understood the ternary (? :) operator.
To answer the other part of your question, see whether the following makes sense to you.
First:
The author initializes the $graph array using the following code:
<?php
$totalVertices = 5; // total nodes (use 0, 1, 2, 3, and 4 instead of A, B, C, D, and E, respectively)
$graph = [];
for ($i = 0; $i < $totalVertices; $i++) {
    for ($j = 0; $j < $totalVertices; $j++) {
        $graph[$i][$j] = $i == $j ? 0 : PHP_INT_MAX;
    }
}
which results in a 5 x 5 matrix.
All the entries on the main diagonal (grey in the original figure) are set to 0, as a node's distance to itself equals 0.
All the remaining entries in the 'matrix' are set to PHP_INT_MAX (the largest integer supported). We'll see why in a minute.
Second:
The author then sets the distances between the nodes that have a direct connection (an edge), writing them manually to the $graph array, as follows:
$graph[0][1] = $graph[1][0] = 10;
$graph[2][1] = $graph[1][2] = 5;
$graph[0][3] = $graph[3][0] = 5;
$graph[3][1] = $graph[1][3] = 5;
$graph[4][1] = $graph[1][4] = 10;
$graph[3][4] = $graph[4][3] = 20;
This results in the following 'matrix' stored in array $graph (green: edge distances):
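As a rough way to see that matrix as text, the sketch below rebuilds $graph from the two steps above and prints it. The helpers buildExampleGraph and renderMatrix, and the "INF" label for PHP_INT_MAX, are illustrative names and not from the book:

```php
<?php
// Rebuild the example graph; $noEdge marks pairs without a direct edge.
function buildExampleGraph(int $noEdge = PHP_INT_MAX): array
{
    $totalVertices = 5;
    $graph = [];
    for ($i = 0; $i < $totalVertices; $i++) {
        for ($j = 0; $j < $totalVertices; $j++) {
            $graph[$i][$j] = $i == $j ? 0 : $noEdge;
        }
    }
    // Edge distances, written in both directions (undirected graph).
    $graph[0][1] = $graph[1][0] = 10;
    $graph[2][1] = $graph[1][2] = 5;
    $graph[0][3] = $graph[3][0] = 5;
    $graph[3][1] = $graph[1][3] = 5;
    $graph[4][1] = $graph[1][4] = 10;
    $graph[3][4] = $graph[4][3] = 20;
    return $graph;
}

// Hypothetical helper: one text line per row, "INF" standing in for PHP_INT_MAX.
function renderMatrix(array $graph): array
{
    $lines = [];
    foreach ($graph as $row) {
        $cells = [];
        foreach ($row as $distance) {
            $label = $distance === PHP_INT_MAX ? 'INF' : (string) $distance;
            $cells[] = str_pad($label, 4, ' ', STR_PAD_LEFT);
        }
        $lines[] = implode('', $cells);
    }
    return $lines;
}

echo implode("\n", renderMatrix(buildExampleGraph())), "\n";
```

Row 0 (node A) comes out as 0, 10, INF, 5, INF: A is connected directly to B and D only.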
So why does the author use PHP_INT_MAX for the nodes that are not directly connected (the non-edges)?
The reason is that it allows the algorithm to work with edge distances up to and including PHP_INT_MAX.
In this particular example, any number smaller than 20 instead of PHP_INT_MAX in the ternary would distort the outcome of the algorithm: it would produce wrong results.
Put another way, in this particular example the author could have used any number bigger than 20 instead of PHP_INT_MAX and still got satisfactory results,
because the biggest distance between two directly connected nodes in this case equals 20, and so does the longest shortest path (A to E). Use any number smaller than 20 and the results will come out wrong.
You can give it a try, and test:
$graph[$i][$j] = $i == $j ? 0 : 19;
The algorithm will now tell us that the shortest distance from A to E, i.e. $graph[0][4], equals 19, which is wrong (the correct value is 20).
So using PHP_INT_MAX here gives 'leeway': it allows the algorithm to work successfully with edge distances smaller than or equal to 9223372036854775807 (the largest int that can be stored on a 64-bit system), or 2147483647 (on a 32-bit system).
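To try that experiment yourself, here is a minimal sketch of the usual Floyd-Warshall triple loop. It is not necessarily the book's implementation; makeGraph and floydWarshall are names chosen for illustration, and the explicit check for PHP_INT_MAX is an assumption added so that the placeholder is never used in an addition:

```php
<?php
// Example graph from the text; $noEdge is the placeholder for "no direct edge".
function makeGraph(int $noEdge): array
{
    $graph = [];
    for ($i = 0; $i < 5; $i++) {
        for ($j = 0; $j < 5; $j++) {
            $graph[$i][$j] = $i == $j ? 0 : $noEdge;
        }
    }
    $graph[0][1] = $graph[1][0] = 10;
    $graph[2][1] = $graph[1][2] = 5;
    $graph[0][3] = $graph[3][0] = 5;
    $graph[3][1] = $graph[1][3] = 5;
    $graph[4][1] = $graph[1][4] = 10;
    $graph[3][4] = $graph[4][3] = 20;
    return $graph;
}

// Standard Floyd-Warshall: try every node $k as an intermediate stop.
function floydWarshall(array $dist): array
{
    $n = count($dist);
    for ($k = 0; $k < $n; $k++) {
        for ($i = 0; $i < $n; $i++) {
            for ($j = 0; $j < $n; $j++) {
                // Skip legs still marked unreachable, so PHP_INT_MAX is never added to.
                if ($dist[$i][$k] === PHP_INT_MAX || $dist[$k][$j] === PHP_INT_MAX) {
                    continue;
                }
                $via = $dist[$i][$k] + $dist[$k][$j];
                if ($via < $dist[$i][$j]) {
                    $dist[$i][$j] = $via;
                }
            }
        }
    }
    return $dist;
}

$withMax = floydWarshall(makeGraph(PHP_INT_MAX));
$with19  = floydWarshall(makeGraph(19));

echo $withMax[0][4], "\n"; // 20 (A -> B -> E, or A -> D -> B -> E)
echo $with19[0][4], "\n";  // 19 (the placeholder is mistaken for a real edge)
```

With 19 as the placeholder, the algorithm has no way to tell "no edge" apart from "an edge of length 19", which is why the placeholder has to be larger than any real path.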
You have two questions here.
The first is regarding the syntax condition ? val_if_true : val_if_false. This is called the "ternary operator". Your assessment regarding the behavior is correct.
The second is regarding the use of PHP_INT_MAX. All distances between two nodes are being initialized to one of two values: 0 if i and j are the same node (the diagonal of the matrix), and PHP_INT_MAX if they are not (every other pair). That is, a node's distance to itself is 0 and a node's distance to any other node is the largest integer value PHP recognizes. The reason for this is that the Floyd-Warshall algorithm uses the concept of "infinity" to represent minimum distances that have not yet been calculated. PHP has no integer "infinity" (its INF constant is a float), so the value PHP_INT_MAX is being used as a stand-in for it.
I am not sure why, but when I print_r the array, both randomly generated strings are the same instead of different.
$amount_of_files = 2;
$generated_file_names = array();
for($i = 0; $i < $amount_of_files; $i++){
    $generated_file_names[] = substr(md5(time()), 0, 10);
}
time() returns its value in whole seconds. Your code executes in much less time than that, so the value is the same on every iteration. If you want a random value for each item in the array, use rand() or mt_rand() instead.
You can use it like this:
<?php
$amount_of_files = 2;
$generated_file_names = array();
for($i = 0; $i < $amount_of_files; $i++){
    $generated_file_names[] = substr(md5(rand()),0,10);
}
print_r($generated_file_names);
?>
You need microtime(). PHP loops from 0 to 2 so fast that time() does not change, so the md5 hash, and therefore the substr result, is the same for every item.
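A different technique from the ones suggested above, shown here only as a sketch: skip the clock and the md5 step entirely and take the characters from random_bytes() (available since PHP 7). The function name generateFileNames and its $length parameter are hypothetical:

```php
<?php
// Hypothetical helper: $amount random hexadecimal names of $length characters each.
function generateFileNames(int $amount, int $length = 10): array
{
    $names = [];
    for ($i = 0; $i < $amount; $i++) {
        // Each random byte becomes two hex characters, so round the byte count up.
        $bytes = random_bytes(intdiv($length + 1, 2));
        $names[] = substr(bin2hex($bytes), 0, $length);
    }
    return $names;
}

$generated_file_names = generateFileNames(2);
print_r($generated_file_names);
```

Because nothing here depends on the current time, two names generated in the same second are no more likely to collide than any other pair.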
I need to generate x amount of random odd numbers, within a given range.
I know this can be achieved with simple looping, but I'm unsure which approach would be best, and whether there is a better mathematical way of solving this.
EDIT: Also I cannot have the same number more than once.
Generate x integer values over half the range, and for each value double it and add 1.
ANSWERING REVISED QUESTION: 1) Generate a list of candidates in range, shuffle them, and then take the first x. Or 2) generate values as per my original recommendation, and reject and retry if the generated value is in the list of already generated values.
The first will work better if x is a substantial fraction of the range, the latter if x is small relative to the range.
ADDENDUM: Should have thought of this approach earlier, it's based on conditional probability. I don't know php (I came at this from the "random" tag), so I'll express it as pseudo-code:
generate(x, upper_limit)
    loop with index i from upper_limit downto 1 by 2
        p_value = x / floor((i + 1) / 2)
        if rand <= p_value
            include i in selected set
            decrement x
            return/exit if x <= 0
        end if
    end loop
end generate
x is the desired number of values to generate, upper_limit is the largest odd number in the range, and rand generates a uniformly distributed random number between zero and one. Basically, it steps through the candidate set of odd numbers and accepts or rejects each one based on how many values you still need and how many candidates still remain.
I've tested this and it really works. It requires less intermediate storage than shuffling and fewer iterations than the original acceptance/rejection.
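One possible PHP translation of that pseudocode, as a sketch. It assumes the range starts at 1 and that $upperLimit is odd; the function name pickOddNumbers is made up for illustration. Instead of comparing a random fraction with x / remaining, it draws an integer from 1 to remaining and accepts when the draw is at most x, which gives the same acceptance probability without floating-point arithmetic:

```php
<?php
// Hypothetical name; picks $x distinct odd numbers from 1..$upperLimit ($upperLimit odd).
function pickOddNumbers(int $x, int $upperLimit): array
{
    $selected = [];
    for ($i = $upperLimit; $i >= 1 && $x > 0; $i -= 2) {
        // Number of odd candidates not yet visited, including $i itself.
        $remaining = intdiv($i + 1, 2);
        // Accept $i with probability $x / $remaining.
        if (mt_rand(1, $remaining) <= $x) {
            $selected[] = $i;
            $x--;
        }
    }
    return $selected;
}

print_r(pickOddNumbers(5, 19));
```

The result comes back in descending order; call shuffle() on it if the order should be random as well.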
Generate a list of elements in the range, remove the element you want in your random series. Repeat x times.
Or you can generate an array with the odd numbers in the range, then shuffle it.
Generation is easy:
$range_array = array();
// yields the odd numbers 1, 3, ..., 2 * $max_value - 1
for( $i = 0; $i < $max_value; $i++){
    $range_array[] = $i*2 + 1;
}
Shuffle:
shuffle( $range_array );
Take the first x elements:
$result = array_slice( $range_array, 0, $x );
This is a complete solution.
function mt_rands($min_rand, $max_rand, $num_rand){
    if(!is_integer($min_rand) or !is_integer($max_rand)){
        return false;
    }
    if($min_rand >= $max_rand){
        return false;
    }
    if(!is_integer($num_rand) or ($num_rand < 1)){
        return false;
    }
    // cannot draw more unique values than the range contains
    if($num_rand > ($max_rand - $min_rand + 1)){
        return false;
    }
    $rands = array();
    while(count($rands) < $num_rand){
        $loops = 0;
        do{
            ++$loops; // loop limiter, use it if you want to
            $rand = mt_rand($min_rand, $max_rand);
        }while(in_array($rand, $rands, true));
        $rands[] = $rand;
    }
    return $rands;
}
// let's see how it went
var_export($rands = mt_rands(0, 50, 5));
Code is not tested. Just wrote it. Can be improved a bit but it's up to you.
This code generates 5 odd unique numbers in the interval [1, 20]. Change $min, $max and $n = 5 according to your needs.
<?php
function odd_filter($x)
{
    if (($x % 2) == 1)
    {
        return true;
    }
    return false;
}
// seed with microseconds
function make_seed()
{
    list($usec, $sec) = explode(' ', microtime());
    // srand() expects an integer seed
    return (int) ((float) $sec + ((float) $usec * 100000));
}
srand(make_seed());
$min = 1;
$max = 20;
//number of random numbers
$n = 5;
if (($max - $min + 1)/2 < $n)
{
    print "interval [$min, $max] is too short to generate $n odd numbers!\n";
    exit(1);
}
$result = array();
for ($i = 0; $i < $n; ++$i)
{
    $x = rand($min, $max);
    //not exists in the hash and is odd
    if(!isset($result[$x]) && odd_filter($x))
    {
        $result[$x] = 1;
    }
    else//new iteration needed
    {
        --$i;
    }
}
$result = array_keys($result);
var_dump($result);
I know the more efficient way to loop over an array is a foreach, or to store the count in a variable to avoid calling count() multiple times.
But I am curious whether PHP has some kind of "caching" for code like:
for ($i=0; $i<count($myarray); $i++) { /* ... */ }
Does it have something like that which I am missing, or does it have nothing of the kind, so that you should write:
$count=count($myarray);
for ($i=0; $i<$count; $i++) { /* ... */ }
PHP does exactly what you tell it to. The length of the array may change inside the loop, so it may be on purpose that you're calling count on each iteration. PHP doesn't try to infer what you mean here, and neither should it. Therefore the standard way to do this is:
for ($i = 0, $length = count($myarray); $i < $length; $i++)
PHP will execute the count each time the loop iterates. However, PHP does keep internal track of the array's size, so count is a relatively cheap operation. It's not as if PHP is literally counting each element in the array. But it's still not free.
Using a very simple 10 million item array doing a simple variable increment, I get 2.5 seconds for the in-loop count version, and 0.9 seconds for the count-before-loop. A fairly large difference, but not 'massive'.
edit: the code:
$x = range(1, 10000000);
$z = 0;
$start = microtime(true);
for ($i = 0; $i < count($x); $i++) {
    $z++;
}
$end = microtime(true); // $end - $start = 2.5047581195831
Switching to do
$count = count($x);
for ($i = 0; $i < $count; $i++) {
and otherwise everything else the same, the time is 0.96466398239136
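The two loop forms differ in behavior as well as in timing. The small sketch below (variable names are illustrative) appends one element during the first iteration and counts how many iterations each form performs:

```php
<?php
// count() in the condition is re-evaluated on every iteration,
// so the element appended inside the loop is visited too.
$items = [1, 2, 3];
$visited = 0;
for ($i = 0; $i < count($items); $i++) {
    if ($i === 0) {
        $items[] = 4;
    }
    $visited++;
}
echo $visited, "\n"; // 4

// The length is read once, before the loop starts,
// so the appended element is not visited.
$items = [1, 2, 3];
$visitedCached = 0;
for ($i = 0, $length = count($items); $i < $length; $i++) {
    if ($i === 0) {
        $items[] = 4;
    }
    $visitedCached++;
}
echo $visitedCached, "\n"; // 3
```

This is why PHP cannot silently cache the count for you: the first loop may be exactly what the programmer intended.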
PHP is an imperative language, and that means it is not supposed to optimize away anything that can possibly have any effect. Given that it's also an interpreted language, it couldn't be done safely even if someone really wanted.
Plus, if you simply want to iterate over the array, you really want to use foreach. In that case, not only the count, but the whole array will be copied (and you can modify the original one as you wish). Or you can modify it in place using foreach ($arr as &$el) { $el = ... }; unset($el);. What I mean to say is that PHP (as any other language) often provides better solutions to your original problem (if you have any).
I want a random number generator with a non-uniform distribution, i.e.:
// prints 0 with 0.1 probability, and 1 with 0.9 probability
echo probRandom(array(10, 90));
This is what I have right now:
/**
 * Returns a random index, chosen non-uniformly.
 *
 * @param array $probs int array with weights
 * @return int a random index in $probs
 */
function probRandom($probs) {
    $size = count($probs);
    // construct the cumulative weight vector
    $prob_vector = array();
    $ptr = 0;
    for ($i = 0; $i < $size; $i++) {
        $ptr += $probs[$i];
        $prob_vector[$i] = $ptr;
    }
    // get a random number in 1..$ptr (starting at 0 would skew the weights slightly)
    $rand = rand(1, $ptr);
    for ($i = 0; $i < $size; $i++) {
        if ($rand <= $prob_vector[$i])
            return $i;
    }
}
Can anyone think of a better way? Possibly one that doesn't require me to do pre-processing?
If you know the sum of all elements in $probs, you can do this without preprocessing.
Like so:
$max = array_sum($probs);
$r = rand(0, $max - 1);
$tot = 0;
for ($i = 0; $i < count($probs); $i++) {
    $tot += $probs[$i];
    if ($r < $tot) {
        return $i;
    }
}
This will do what you want in O(N) time, where N is the length of the array. Without preprocessing this is a firm lower bound on the running time, as each element in the input must be considered.
The probability a given index $i is selected is $probs[$i]/sum($probs), given that the rand function returns independent uniformly distributed integers in the given range.
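Wrapped into a function, the linear scan could look like the sketch below. It assumes non-negative integer weights with a positive sum; the name weightedIndex is hypothetical, and mt_rand is used in place of rand purely as a matter of preference:

```php
<?php
// Hypothetical name; returns index $i with probability $weights[$i] / array_sum($weights).
function weightedIndex(array $weights): int
{
    $total = array_sum($weights);
    if ($total <= 0) {
        throw new InvalidArgumentException('Weights must sum to a positive integer.');
    }

    $r = mt_rand(0, $total - 1); // uniform over 0 .. $total - 1
    $running = 0;
    foreach ($weights as $index => $weight) {
        $running += $weight;
        if ($r < $running) {
            return $index;
        }
    }
    // Not reachable for non-negative integer weights.
    throw new LogicException('No index selected.');
}

// Rough check: index 0 should come up about 10% of the time.
$tally = [0, 0];
for ($n = 0; $n < 10000; $n++) {
    $tally[weightedIndex([10, 90])]++;
}
print_r($tally);
```

An entry with weight 0 can never be returned, because the running total does not grow past it.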
In your solution you generate an accumulated probability vector, which is very useful.
I have two suggestions for improvement:
If $probs is static, i.e. it's the same vector every time you want to generate a random number, you can compute $prob_vector just once and keep it.
You can use binary search (bisection) on $prob_vector to find $i.
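Both suggestions can be combined roughly as follows. This is a sketch under the assumption of non-negative integer weights with a positive sum; cumulativeWeights, findIndex and pickIndex are illustrative names. The cumulative vector is built once, and each draw then costs O(log N):

```php
<?php
// Build the cumulative vector once, e.g. [10, 90] becomes [10, 100].
function cumulativeWeights(array $weights): array
{
    $cumulative = [];
    $running = 0;
    foreach ($weights as $weight) {
        $running += $weight;
        $cumulative[] = $running;
    }
    return $cumulative;
}

// Binary search for the first index whose cumulative weight is greater than $r.
// Assumes 0 <= $r < last element of $cumulative.
function findIndex(array $cumulative, int $r): int
{
    $low = 0;
    $high = count($cumulative) - 1;
    while ($low < $high) {
        $mid = intdiv($low + $high, 2);
        if ($cumulative[$mid] > $r) {
            $high = $mid;      // the answer is $mid or somewhere to its left
        } else {
            $low = $mid + 1;   // the answer is strictly to the right
        }
    }
    return $low;
}

// One draw: uniform $r over 0 .. total - 1, then look it up.
function pickIndex(array $cumulative): int
{
    $total = $cumulative[count($cumulative) - 1];
    return findIndex($cumulative, mt_rand(0, $total - 1));
}

$cumulative = cumulativeWeights([10, 90]); // preprocess once
echo pickIndex($cumulative), "\n";         // then draw as often as needed
```

Separating findIndex from the random draw keeps the search itself deterministic, which makes it easy to check by hand.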
EDIT: I now see that you ask for a solution without preprocessing.
Without preprocessing, you will end up with worst case linear runtime (i.e., double the length of the vector, and your running time will double as well).
Here is a method that doesn't require preprocessing. It does, however, require you to know a maximum limit of the elements in $probs:
Rejection method
Pick a random index, $i and a random number, X (uniformly) between 0 and max($probs)-1, inclusive.
If X is less than $probs[$i], you're done - $i is your random number
Otherwise reject $i (hence the name of the method) and restart.
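The three steps above can be sketched as follows. The function name rejectionIndex and the $maxWeight parameter are hypothetical; the sketch assumes non-negative integer weights, at least one of them positive, and $maxWeight no smaller than the largest weight (otherwise the loop may never terminate or the result is skewed):

```php
<?php
// Hypothetical name; $maxWeight must be at least max($weights).
function rejectionIndex(array $weights, int $maxWeight): int
{
    $count = count($weights);
    while (true) {
        $i = mt_rand(0, $count - 1);     // candidate index, uniform
        $x = mt_rand(0, $maxWeight - 1); // uniform over 0 .. $maxWeight - 1
        if ($x < $weights[$i]) {
            return $i;                   // accepted
        }
        // Otherwise reject $i and try again.
    }
}

echo rejectionIndex([10, 90], 90), "\n";
```

The expected number of rounds grows when most weights are far below $maxWeight, so this method suits vectors whose weights are of similar size.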