i have this code:
while ($sum<16 || $sum>18){
$totala = 0;
$totalb = 0;
$totalc = 0;
$ranka = mysql_query("SELECT duration FROM table WHERE rank=1 ORDER BY rand() LIMIT 1");
$rankb = mysql_query("SELECT duration FROM table WHERE rank=2 ORDER BY rand() LIMIT 1");
$rankc = mysql_query("SELECT duration FROM table WHERE rank=3 ORDER BY rand() LIMIT 1");
while ($rowa = mysql_fetch_array($ranka)) {
echo $rowa['duration'] . "<br/>";
$totala = $totala + $rowa['duration'];
}
while ($rowb = mysql_fetch_array($rankb)) {
$totalb = $totalb + $rowb['duration'];
}
while ($rowc = mysql_fetch_array($rankc)) {
$totalc = $totalc + $rowc['duration'];
}
$sum=$totala+$totalb+$totalc;
}
echo $sum;
It works fine, But the problem is until "$sum=16" the "echo $rowa['duration']" executes, the question is, is there a away to "echo" only the latest executed code in the "while ($rowa = mysql_fetch_array($ranka))" i this while loop?
Because most of the times returns all the numbers until the "$sum=16"
You are explicitly echoing the $rowa['duration'] in the first inner while loop. If you only want to print the last duration from the $ranka set, simple change the echo to $rowa_duration = $rowa['duration'] then echo it outside the loop.
while ($rowa = mysql_fetch_array($ranka)) {
$rowa_duration = $rowa['duration'];
$totala = $totala + $rowa['duration'];
}
echo $rowa_duration . '<br/>';
What you are doing there is bad on multiple levels. And your english horrid. Well .. practice makes perfect. You could try joining ##php chat room on FreeNode server. That would improve both your english and php skills .. it sure helped me a lot. Anyway ..
The SQL
First of all, to use ORDER BY RAND() is extremely ignorant (at best). As your tables begin the get larger, this operation will make your queries slower. It has n * log2(n) complexity, which means that selecting querying table with 1000 entries will take ~3000 times longer then querying table with 10 entries.
To learn more about it , you should read this blog post, but as for your current queries , the solution would look like:
SELECT duration
FROM table
JOIN (SELECT CEIL(RAND()*(SELECT MAX(id) FROM table)) AS id) as choice
WHERE
table.id >= choice.id
rank = 1
LIMIT 1
This would select random duration from the table.
But since you you are actually selecting data with 3 different ranks ( 1, 2 and 3 ), it would make sense to create a UNION of three queries :
SELECT duration
FROM table
JOIN (SELECT CEIL(RAND()*(SELECT MAX(id) FROM table)) AS id) as choice
WHERE
table.id >= choice.id
rank = 1
LIMIT 1
UNION ALL
SELECT duration
FROM table
JOIN (SELECT CEIL(RAND()*(SELECT MAX(id) FROM table)) AS id) as choice
WHERE
table.id >= choice.id
rank = 2
LIMIT 1
UNION ALL
SELECT duration
FROM table
JOIN (SELECT CEIL(RAND()*(SELECT MAX(id) FROM table)) AS id) as choice
WHERE
table.id >= choice.id
rank = 3
LIMIT 1
Look scary, but it actually will be faster then what you are currently using, and the result will be three entries from duration column.
PHP with SQL
You are still using the old mysql_* functions to access database. This form of API is more then 10 years old and should not be used, when writing new code. The old functions are not maintained (fixed and/or improved ) anymore and even community has begun the process of deprecating said functions.
Instead you should be using either PDO or MySQLi. Which one to use depends on your personal preferences and what is actually available to you. I prefer PDO (because of named parameters and support for other RDBMS), but that's somewhat subjective choice.
Other issue with you php/mysql code is that you seem to pointlessly loop thought items. Your queries have LIMIT 1, which means that there will be only one row. No point in making a loop.
There is potential for endless loop if maximum value for duration is 1. At the start of loop you will have $sum === 15 which fits the first while condition. And at the end that loop you can have $sum === 18 , which satisfies the second loop condition ... and then it is off to the infinity and your SQL server chokes.
And if you are using fractions for duration, then the total value of 3 new results needs to be even smaller. Just over 2. Start with 15.99 , ends with 18.01 (that's additional 2.02 in duration or less the 0.7 per each). Again .. endless loop.
Suggestion
Here is how i would do it:
$pdo = new PDO('mysql:dbname=my_db;host=localhost', 'username', 'password');
$pdo->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
$sum = 0;
while ( $sum < 16 )
{
$query = 'that LARGE query above';
$statement = $pdo->prepare( $query );
if ( $statement->execute() )
{
$data = $statement->fetchAll( PDO::FETCH_ASSOC );
$sum += $data[0]['duration']+$data[1]['duration']+$data[2]['duration'];
}
}
echo $data[0]['duration'];
This should do what your code did .. or at least, what i assume, was your intentions.
Related
I would like to SELECT certain data out of my mysql DB. I am working with a php loop and a sql statement with a LIMIT and UNION.
Problem: The speed of my query is terrible. One UNION statement tooks 2-4 seconds. Due to the loop the Overall-Query takes 3 Minutes.
Is there a chance to optimize my query?
I tried to separate the "three" statements and merge the results. But this is not really faster. So I think that the UNION is not my problem.
PHP/SQL:
My code is running through two-foreach-loops. The code is working properly. But the performance is the problem.
$sql_country = "SELECT country FROM country_list";
foreach ($db->query($sql_country) as $row_country) { //first loop (150 entries)
$sql_color = "SELECT color FROM color_list";
foreach ($db->query($sql_color) as $row_color) { //second loop (10 entries)
$sql_all = "(SELECT ID, price FROM company
WHERE country = '".$row_country['country']."'
AND color = '".$row_color['color']."'
AND price BETWEEN 2.5 AND 4.5
order by price DESC LIMIT 2)
UNION
(SELECT ID, price FROM company
WHERE country = '".$row_country['country']."'
AND color = '".$row_color['color']."'
AND price BETWEEN 5.5 AND 8.2
order by price DESC LIMIT 2)
UNION
(SELECT ID, price FROM company
WHERE country = '".$row_country['country']."'
AND color = '".$row_color['color']."'
AND price BETWEEN 8.5 AND 10.8
order by price DESC LIMIT 2)";
foreach ($db->query($sql_all) as $row_all) {
$shopID[] = $row_all['ID']; //I just need these IDs
}
}
}
Do you have any idea or hints to get this faster?
An index on (country, color, price, ID) should improve the performance of single queries from seconds to a couple of milliseconds or even less. But you still have the problem of executing 1500 queries. Depending on your system, a single query execution can add an overhead of about 10 ms, which would add up to 15 seconds in your case. You need to find a way to minimize the number of queries - In best case to a single query.
For low limits (like 2 in your case), you can combine multiple LIMIT 1 subqueries with different offsets. I would generate such a query dynamically.
$priceRanges = [
['2.5', '4.5'],
['5.5', '8.2'],
['8.5', '10.8'],
];
$limit = 2;
$offsets = range(0, $limit - 1);
$queryParts = [];
foreach ($priceRanges as $range) {
$rangeFrom = $range[0];
$rangeTo = $range[1];
foreach ($offsets as $offset) {
$queryParts[] = "
select (
select ID
from company cmp
where cmp.country = cnt.country
and cmp.color = clr.color
and cmp.price between {$rangeFrom} AND {$rangeTo}
order by cmp.price desc
limit 1
offset {$offset}
) as ID
from country_list cnt
cross join color_list clr
having ID is not null
";
}
}
$query = implode(' UNION ALL ', $queryParts);
This will generate a quite long UNION query. You can see a PHP demo on rexester.com and SQL demo on db-fiddle.com.
I can't guarantee it will be any faster. But it's worth a try.
I got a little problem, I've got a database, in that database are different names, id, and coins. I want to show people their rank, so your rank has to be 1 if you have the most coins, and 78172 as example when your number 78172 with coins.
I know I can do something like this:
SELECT `naam` , `coins`
FROM `gebruikers`
ORDER BY `coins` DESC
But how can I get the rank you are, in PHP :S ?
You can use a loop and a counter. The first row from MySql is going the first rank,I.e first in the list.
I presume you want something like:
1st - John Doe
2nd - Jane Doe
..
..
right?
See: http://www.if-not-true-then-false.com/2010/php-1st-2nd-3rd-4th-5th-6th-php-add-ordinal-number-suffix
Helped me a while ago.
You could use a new varariable
$i = "1";
pe care o poti folosi in structura ta foreach,while,for,repeat si o incrementezi mereu.
and you use it in structures like foreach,while,for,repeat and increment it
$i++;
this is the simplest way
No code samples above... so here it is in PHP
// Your SQL query above, with limits, in this case it starts from the 11th ranking (0 is the starting index) up to the 20th
$start = 10; // 0-based index
$page_size = 10;
$stmt = $pdo->query("SELECT `naam` , `coins` FROM `gebruikers` ORDER BY `coins` DESC LIMIT {$start}, {$page_size}");
$data = $stmt->fetchAll();
// In your template or whatever you use to output
foreach ($data as $rank => $row) {
// array index is 0-based, so add 1 and where you wanted to started to get rank
echo ($rank + 1 + $start) . ": {$row['naam']}<br />";
}
Note: I'm too lazy to put in a prepared statement, but please look it up and use prepared statements.
If you have a session table, you would pull the records from that, then use those values to get the coin values, and sort descending.
If we assume your Session table is sessions(session_id int not null auto_increment, user_id int not null, session_time,...) and we assume that only users who are logged in would have a session value, then your SQL would look something like this: (Note:I am assuming that you also have a user_id column on your gebruikers table)
SELECT g.*
FROM gebruikers as g, sessions as s WHERE s.user_id = g.user_id
ORDER BY g.coins DESC
You would then use a row iterator to loop through the results and display "1", "2", "3", etc. The short version of which would look like
//Connect to database using whatever method you like, I will assume mysql_connect()
$sql = "SELECT g.* FROM gebruikers as g, sessions as s WHERE s.user_id = g.user_id ORDER BY g.coins DESC";
$result = mysql_query($sql,$con); //Where $con is your mysql_connect() variable;
$i = 0;
while($row = mysql_fetch_assoc($result,$con)){
$row['rank'] = $i;
$i++;
//Whatever else you need to do;
}
EDIT
In messing around with a SQLFiddle found at http://sqlfiddle.com/#!2/8faa9/6
I came accross something that works there; I don't know if it will work when given in php, but I figured I would show it to you either way
SET #rank = 0; SELECT *,(#rank := #rank+1) as rank FROM something order by coins DESC
EDIT 2
This works in a php query from a file.
SELECT #rank:=#rank as rank,
g.*
FROM
(SELECT #rank:=0) as z,
gebruikers as g
ORDER BY coins DESC
If you want to get the rank of one specific user, you can do that in mysql directly by counting the number of users that have more coins that the user you want to rank:
SELECT COUNT(*)
FROM `gebruikers`
WHERE `coins` > (SELECT `coins` FROM `gebruikers` WHERE `naam` = :some_name)
(assuming a search by name)
Now the rank will be the count returned + 1.
Or you do SELECT COUNT(*) + 1 in mysql...
Hi i have such table information:
what i want to do with php with while or just in mysql, is to SUM (time_used) of the rows with status 44 until its reached row with status 55. after that it should begin from start with new summing.
first query should return 37, second 76 (keep in mind it should be universal, for unlimited occurrences of 55 status row)
i thought of a way with time/date filtering and have this:
select sum(time_used) as sumed
from timelog
where start_time > (select end_time from timelog where (status='55')
ORDER BY id DESC LIMIT 1) ORDER BY id DESC
but this works only for last combination of 44 and 55
i know i will need two way filtering( < end_time and > end_time) so it will work for all cases, but cant think of a way to do it in php
can anyone help me?
EDIT:
sqlfiddle whoever want it:
http://sqlfiddle.com/#!2/33820/2/0
There are two ways to do it: Plain SQL or PHP. If you treat thousands of rows, it may be interresting to choose between the two by testing performance.
Plain SQL
select project_id, task_id, user_id, sum(time_used) as time_used,
min(start_time) as start_time, max(end_time) as end_time, max(comment) as comment from
(select t.id, t.project_id, t.task_id, t.user_id, t.time_used,
count(t2.id) as count55, t.start_time, t.end_time, t.comment
from timelog t
left join timelog t2 on t.id>t2.id and t2.status=55 and t.task_id=t2.task_id
group by t.id) as t
group by count55;
I assume here that a task can belong to one user only
SQL and PHP
$link = mysqli_connect( ... );
$query = "select id, project_id, task_id, user_id, time_used, start_time, end_time, status
from timelog order by id";
$result = mysqli_query($link, $query);
$table = array();
$time_used = 0;
$start_sum = true;
$i = 0;
while($row = mysqli_fetch_assoc ($result)){
if($start_sum){
$table[$i] = $row;
$start_sum = false;
} else {
$table[$i]['time_used'] += $row['time_used'];
$table[$i]['end_time'] += $row['end_time'];
}
if($row['state'] == 55){
$i++;
$start_sum = true;
}
}
If two tasks can run in simultaneously, solution 1 will work, but solution 2 will need to be adapted in order to take this in account.
here is my intepretation:
http://sqlfiddle.com/#!2/33820/45
set #n=0;
select project_id, task_id, user_id,sum(time_used) from (
SELECT time_used,project_id, task_id, user_id,
#n:=if(status=55,#n+1,#n),
if(status=55,-1,#n) as grouper FROM timelog
) as t
where grouper>-1
group by grouper;
I'm neither a php nor MySQL programmer, but I can explain the logic you want to follow. You can then code it.
First, query your db and return the results to php.
Next, set two sum variables to 0.
Start looping through your query results. Increment the first sum variable until you reach the first row that has status 55. Once you do, start incrementing the second variable.
The tricky part will be to sort your query by the row number of the table. Here is a link that will help you with that part.
I've got a MySQL table with a bunch of entries in it, and a column called "Multiplier." The default (and most common) value for this column is 0, but it could be any number.
What I need to do is select a single entry from that table at random. However, the rows are weighted according to the number in the "Multiplier" column. A value of 0 means that it's not weighted at all. A value of 1 means that it's weighted twice as much, as if the entry were in the table twice. A value of 2 means that it's weighted three times as much, as if the entry were in the table three times.
I'm trying to modify what my developers have already given me, so sorry if the setup doesn't make a whole lot of sense. I could probably change it but want to keep as much of the existing table setup as possible.
I've been trying to figure out how to do this with SELECT and RAND(), but don't know how to do the weighting. Is it possible?
This guy asks the same question. He says the same as Frank, but the weightings don't come out right and in the comments someone suggests using ORDER BY -LOG(1.0 - RAND()) / Multiplier, which in my testing gave pretty much perfect results.
(If any mathematicians out there want to explain why this is correct, please enlighten me! But it works.)
The disadvantage would be that you couldn't set the weighting to 0 to temporarily disable an option, as you would end up dividing by zero. But you could always filter it out with a WHERE Multiplier > 0.
For a much better performance (specially on big tables), first index the weight column and use this query:
SELECT * FROM tbl AS t1 JOIN (SELECT id FROM tbl ORDER BY -LOG(1-RAND())/weight LIMIT 10) AS t2 ON t1.id = t2.id
On 40MB table the usual query takes 1s on my i7 machine and this one takes 0.04s.
For explanation of why this is faster see MySQL select 10 random rows from 600K rows fast
Don't use 0, 1 and 2 but 1, 2 and 3. Then you can use this value as a multiplier:
SELECT * FROM tablename ORDER BY (RAND() * Multiplier);
Well, I would put the logic of weights in PHP:
<?php
$weight_array = array(0, 1, 1, 2, 2, 2);
$multiplier = $weight_array[array_rand($weight_array)];
?>
and the query:
SELECT *
FROM `table`
WHERE Multiplier = $multiplier
ORDER BY RAND()
LIMIT 1
I think it will work :)
While I realise this is an question on MySQL, the following may be useful for someone using SQLite3 which has subtly different implementations of RANDOM and LOG.
SELECT * FROM table ORDER BY (-LOG(abs(RANDOM() % 10000))/weight) LIMIT 1;
weight is a column in table containing integers (I've used 1-100 as the range in my table).
RANDOM() in SQLite produces numbers between -9.2E18 and +9.2E18 (see SQLite docs for more info). I used the modulo operator to get the range of numbers down a bit.
abs() will remove the negatives to avoid problems with LOG which only handles non-zero positive numbers.
LOG() is not actually present in a default install of SQLite3. I used the php SQLite3 CreateFunction call to use the php function in SQL. See the PHP docs for info on this.
For others Googling this subject, I believe you can also do something like this:
SELECT strategy_id
FROM weighted_strategies AS t1
WHERE (
SELECT SUM(weight)
FROM weighted_strategies AS t2
WHERE t2.strategy_id<=t1.strategy_id
)>#RAND AND
weight>0
LIMIT 1
The total sum of weights for all records must be n-1, and #RAND should be a random value between 0 and n-1 inclusive.
#RAND could be set in SQL or inserted as a integer value from the calling code.
The subselect will sum up all the preceeding records' weights, checking it it exceeds the random value supplied.
<?php
/**
* Demonstration of weighted random selection of MySQL database.
*/
$conn = mysql_connect('localhost', 'root', '');
// prepare table and data.
mysql_select_db('test', $conn);
mysql_query("drop table if exists temp_wrs", $conn);
mysql_query("create table temp_wrs (
id int not null auto_increment,
val varchar(16),
weight tinyint,
upto smallint,
primary key (id)
)", $conn);
$base_data = array( // value-weight pair array.
'A' => 5,
'B' => 3,
'C' => 2,
'D' => 7,
'E' => 6,
'F' => 3,
'G' => 5,
'H' => 4
);
foreach($base_data as $val => $weight) {
mysql_query("insert into temp_wrs (val, weight) values ('".$val."', ".$weight.")", $conn);
}
// calculate the sum of weight.
$rs = mysql_query('select sum(weight) as s from temp_wrs', $conn);
$row = mysql_fetch_assoc($rs);
$sum = $row['s'];
mysql_free_result($rs);
// update range based on their weight.
// each "upto" columns will set by sub-sum of weight.
mysql_query("update temp_wrs a, (
select id, (select sum(weight) from temp_wrs where id <= i.id) as subsum from temp_wrs i
) b
set a.upto = b.subsum
where a.id = b.id", $conn);
$result = array();
foreach($base_data as $val => $weight) {
$result[$val] = 0;
}
// do weighted random select ($sum * $times) times.
$times = 100;
$loop_count = $sum * $times;
for($i = 0; $i < $loop_count; $i++) {
$rand = rand(0, $sum-1);
// select the row which $rand pointing.
$rs = mysql_query('select * from temp_wrs where upto > '.$rand.' order by id limit 1', $conn);
$row = mysql_fetch_assoc($rs);
$result[$row['val']] += 1;
mysql_free_result($rs);
}
// clean up.
mysql_query("drop table if exists temp_wrs");
mysql_close($conn);
?>
<table>
<thead>
<th>DATA</th>
<th>WEIGHT</th>
<th>ACTUALLY SELECTED<br />BY <?php echo $loop_count; ?> TIMES</th>
</thead>
<tbody>
<?php foreach($base_data as $val => $weight) : ?>
<tr>
<th><?php echo $val; ?></th>
<td><?php echo $weight; ?></td>
<td><?php echo $result[$val]; ?></td>
</tr>
<?php endforeach; ?>
<tbody>
</table>
if you want to select N rows...
re-calculate the sum.
reset range ("upto" column).
select the row which $rand pointing.
previously selected rows should be excluded on each selection loop. where ... id not in (3, 5);
SELECT * FROM tablename ORDER BY -LOG(RAND()) / Multiplier;
Is the one which gives you the correct distribution.
SELECT * FROM tablename ORDER BY (RAND() * Multiplier);
Gives you the wrong distribution.
For example, there are two entries A and B in the table. A is with weight 100 while B is with weight 200.
For the first one (exponential random variable), it gives you Pr(A winning) = 1/3 while the second one gives you 1/4, which is not correct.
I wish I can show you the math. However I do not have enough rep to post relevant link.
Whatever you do, it is giong to be terrible because it will involve:
* Getting the total "weights" for all columns as ONE number (including applying the multiplier).
* Getting a random number between 0 and that total.
* Getting all entries and runing them along, deducting the weight from the random number and choosing the one entry when you run out of items.
In average you will run along half the table. Performance - unless the table is small, then do it outside mySQL in memory - will be SLOW.
The result of the pseudo-code (rand(1, num) % rand(1, num)) will get more toward 0 and less toward num. Subtract the result from num to get the opposite.
So if my application language is PHP, it should look something like this:
$arr = mysql_fetch_array(mysql_query(
'SELECT MAX(`Multiplier`) AS `max_mul` FROM tbl'
));
$MaxMul = $arr['max_mul']; // Holds the maximum value of the Multiplier column
$mul = $MaxMul - ( rand(1, $MaxMul) % rand(1, $MaxMul) );
mysql_query("SELECT * FROM tbl WHERE Multiplier=$mul ORDER BY RAND() LIMIT 1");
Explanation of the code above:
Fetch the highest value in the Multiplier column
calculate a random Multiplier value (weighted toward the maximum value in the Multiplier column)
Fetch a random row which has that Multiplier value
It's also achievable merely by using MySQL.
Proving that the pseudo-code (rand(1, num) % rand(1, num)) will weight toward 0:
Execute the following PHP code to see why (in this example, 16 is the highest number):
$v = array();
for($i=1; $i<=16; ++$i)
for($k=1; $k<=16; ++$k)
isset($v[$i % $k]) ? ++$v[$i % $k] : ($v[$i % $k] = 1);
foreach($v as $num => $times)
echo '<div style="margin-left:', $times ,'px">
times: ',$times,' # num = ', $num ,'</div>';
#ali 's answer works great but you can not control how much your result skews toward higher or lower weights, you can change multiplier but it's not a very dynamic approach.
i optimized the code by adding POWER(weight,skewIndex) instead of weight which makes higher weights to appear more with values more than 1 for skewIndex and appear less with values between 0 and 1.
SELECT * FROM tbl AS t1 JOIN (SELECT id FROM tbl ORDER BY -LOG(1-RAND())/POWER(weight,skewIndex) LIMIT 10) AS t2 ON t1.id = t2.id
you can analyze query results with
SELECT AVG(weight) FROM tbl AS t1 JOIN (SELECT id FROM tbl ORDER BY -LOG(1-RAND())/POWER(weight,skewIndex) LIMIT 10) AS t2 ON t1.id = t2.id
for example setting skewIndex to 3 gives me average of 78% while skewIndex of 1 gives average of 65%
I want to show a random record from the database. I would like to be able to show X number of random records if I choose. Therefore I need to select the top X records from a randomly selected list of IDs
(There will never be more than 500 records involved to choose from, unless the earth dramatically increases in size. Currently there are 66 possibles.)
This function works, but how can I make it better?
/***************************************************/
/* RandomSite */
//****************/
// Returns an array of random site IDs or NULL
/***************************************************/
function RandomSite($intNumberofSites = 1) {
$arrOutput = NULL;
//open the database
GetDatabaseConnection('dev');
//inefficient
//$strSQL = "SELECT id FROM site_info WHERE major <> 0 ORDER BY RAND() LIMIT ".$intNumberofSites.";";
//Not wonderfully random
//$strSQL = "SELECT id FROM site_info WHERE major <> 0 AND id >= (SELECT FLOOR( COUNT(*) * RAND()) FROM site_info ) ORDER BY id LIMIT ".$intNumberofSites.";";
//Manual selection from available pool of candidates ?? Can I do this better ??
$strSQL = "SELECT id FROM site_info WHERE major <> 0;";
if (is_numeric($intNumberofSites))
{
//excute my query
$result = #mysql_query($strSQL);
$i=-1;
//create an array I can work with ?? Can I do this better ??
while ($row = mysql_fetch_array($result, MYSQL_NUM))
{
$arrResult[$i++] = $row[0];
}
//mix them up
shuffle($arrResult);
//take the first X number of results ?? Can I do this better ??
for ($i=0;$i<$intNumberofSites;$i++)
{
$arrOutput[$i] = $arrResult[$i];
}
}
return $arrOutput;
}
UPDATE QUESTION:
I know about the ORDER BY RAND(), I just don't want to use it because there are rumors it isn't the best at scaling and performance. I am being overly critical of my code. What I have works, ORDER BY RAND() works, but can I make it better?
MORE UPDATE
There are holes in the IDs. There is not a ton of churn, but any churn that happens needs to be approved by our team, and therefore could handled to dump any caching.
Thanks for the replies!
Why not use the Rand Function in an orderby in your database query? Then you don't have to get into randomizing etc in code...
Something like (I don't know if this is legal)
Select *
from site_info
Order by Rand()
LIMIT N
where N is the number of records you want...
EDIT
Have you profiled your code vs. the query solution? I think you're just pre-optimizing here.
If you dont want to select with order by rand().
Instead of shuffeling, use array_rand on the result:
$randKeys = array_rand($arrResult, $intNumberofSites);
$arrOutput = array_intersect_key(array_flip($randKeys), $arrResult);
edit: return array of keys not new array with key => value
Well, I don't think that ORDER BY RAND() would be that slow in a table with only 66 rows, but you can look into a few different solutions anyway.
Is the data really sparse and/or updated often (so there are big gaps in the ids)?
Assuming it's not very sparse, you could select the max id from the table, use PHP's built-in random function to pick N distinct numbers between 1 and the max id, and then attempt to fetch the rows with those ids from the table. If you get back less rows than you picked numbers, get more random numbers and try again, until you have the number of rows needed. This may not be particularly fast either.
If the data is sparse, I would set up a secondary "id-type" column that you make sure is sequential. So if there are 66 rows in the table, ensure that the new column contains the values 1-66. Whenever rows are added to or removed from the table, you will have to do some work to adjust the values in this column. Then use the same technique as above, picking random IDs in PHP, but you don't have to worry about the "missing ID? retry" case.
Here are the three functions I wrote and tested
My answer
/***************************************************/
/* RandomSite1 */
//****************/
// Returns an array of random rec site IDs or NULL
/***************************************************/
function RandomSite1($intNumberofSites = 1) {
$arrOutput = NULL;
GetDatabaseConnection('dev');
$strSQL = "SELECT id FROM site_info WHERE major <> 0;";
if (is_numeric($intNumberofSites))
{
$result = #mysql_query($strSQL);
$i=-1;
while ($row = mysql_fetch_array($result, MYSQL_NUM)) {
$arrResult[$i++] = $row[0]; }
//mix them up
shuffle($arrResult);
for ($i=0;$i<$intNumberofSites;$i++) {
$arrOutput[$i] = $arrResult[$i]; }
}
return $arrOutput;
}
JPunyon and many others
/***************************************************/
/* RandomSite2 */
//****************/
// Returns an array of random rec site IDs or NULL
/***************************************************/
function RandomSite2($intNumberofSites = 1) {
$arrOutput = NULL;
GetDatabaseConnection('dev');
$strSQL = "SELECT id FROM site_info WHERE major<>0 ORDER BY RAND() LIMIT ".$intNumberofSites.";";
if (is_numeric($intNumberofSites))
{
$result = #mysql_query($strSQL);
$i=0;
while ($row = mysql_fetch_array($result, MYSQL_NUM)) {
$arrOutput[$i++] = $row[0]; }
}
return $arrOutput;
}
OIS with a creative solution meeting the intend of my question.
/***************************************************/
/* RandomSite3 */
//****************/
// Returns an array of random rec site IDs or NULL
/***************************************************/
function RandomSite3($intNumberofSites = 1) {
$arrOutput = NULL;
GetDatabaseConnection('dev');
$strSQL = "SELECT id FROM site_info WHERE major<>0;";
if (is_numeric($intNumberofSites))
{
$result = #mysql_query($strSQL);
$i=-1;
while ($row = mysql_fetch_array($result, MYSQL_NUM)) {
$arrResult[$i++] = $row[0]; }
$randKeys = array_rand($arrResult, $intNumberofSites);
$arrOutput = array_intersect_key($randKeys, $arrResult);
}
return $arrOutput;
}
I did a simple loop of 10,000 iterations where I pulled 2 random sites. I closed and opened a new browser for each function, and cleared the cached between run. I ran the test 3 times to get a simple average.
NOTE - The third solution failed at pulling less than 2 sites as the array_rand function has different output if it returns a set or single result. I got lazy and didn't fully implement the conditional to handle that case.
1 averaged: 12.38003755 seconds
2 averaged: 12.47702177 seconds
3 averaged: 12.7124153 seconds
mysql_query("SELECT id FROM site_info WHERE major <> 0 ORDER BY RAND() LIMIT $intNumberofSites")
EDIT
Damn, JPunyon was a bit quicker :)
Try this:
SELECT
#nv := #min + (RAND() * (#max - #min)) / #lc,
(
SELECT
id
FROM site_info
FORCE INDEX (primary)
WHERE id > #nv
ORDER BY
id
LIMIT 1
),
#max,
#min := #nv,
#lc := #lc - 1
FROM
(
SELECT #min := MIN(id)
FROM site_info
) rmin,
(
SELECT #max := MAX(id)
FROM site_info
) rmax,
(
SELECT #lc := 5
) l,
site_info
LIMIT 5
This will select a random ID on each iteration using index, in descending order.
There is slight chance, though, that you get less results that you wanted, as it gives no second chance to the missed id's.
The more percent of rows you select, the bigger is the chance.
I would simply use the rand() function (I assume you are using MySQL)...
SELECT id, rand() as rand_idx FROM site_info WHERE major <> 0 ORDER BY rand_idx LIMIT x;
I'm with JPunyon. Use ORDER BY RAND() LIMIT $N. I think you'll get a bigger performance hit from $arrResult having and shuffling so many (unused) entries than from using the MySQL RAND() function.
function getSites ( $numSites = 5 ) {
// Sanitize $numSites if necessary
$result = mysql_query("SELECT id FROM site_info WHERE major <> 0 "
."ORDER BY RAND() LIMIT $numSites");
$arrResult = array();
while ( $row = mysql_fetch_array($result,MYSQL_NUM) ) {
$arrResult[] = $row;
}
return $arrResult;
}