Without using PHP loops, I'm trying to find an efficient way to generate a unique number that's different from all existing values in a MySQL database. I tried to do that in PHP, but it's not efficient because it will have to run many loop iterations as the table grows. Recently I tried to do this in MySQL and found this sqlfiddle solution, but it doesn't work: sometimes it generates a number that is already in the table, and it doesn't check every value in the table. I also tried this, but it didn't help. They generally give this query:
SELECT *, FLOOR(RAND() * 9) AS random_number FROM Table1
WHERE "random_number" NOT IN (SELECT tracker FROM Table1)
I will work with 6-digit numbers in the future, so I need this to be efficient and fast. I could use a different method, such as pre-generating the numbers, but I don't know how to handle that. I would be glad if you could help.
EDIT: Based on @Wiimm's solution, I can fill the table with 999,999 different 'random' unique numbers without using PHP loops. Then I developed a method where the IDs of deleted rows can be reused. This is how I managed it:
Duplicate your original table and name it "table_deleted". (All columns must be the same.)
Create a trigger in MySQL. To do that, open "SQL" in MySQL and run this code (it simply copies your row into "table_deleted" before it is deleted):
MYSQL Code
DELIMITER $$
CREATE TRIGGER `table_before_delete` BEFORE DELETE
ON `your_table` FOR EACH ROW
BEGIN
    INSERT INTO table_deleted
    SELECT * FROM your_table WHERE id = OLD.id;
END $$
DELIMITER ;
Create another trigger, this time on "table_deleted". This code moves the row back to the original table when it is updated.
MYSQL Code
DELIMITER $$
CREATE TRIGGER `table_after_update` AFTER UPDATE
ON `table_deleted` FOR EACH ROW
BEGIN
    INSERT INTO your_table
    SELECT * FROM table_deleted WHERE id = OLD.id;
END $$
DELIMITER ;
The PHP code that I use (the number column must allow NULL for this code to work):
PHP Code
//CHECK IF TABLE_DELETED HAS ROWS
$deleted = $db->query('SELECT COUNT(*) AS num_rows FROM table_deleted');
$deletedcount = $deleted->fetchColumn();
//IF TABLE_DELETED HAS ROWS, RUN THIS
if ($deletedcount > 0) {
//UPDATE THE VALUE THAT HAS MINIMUM ID
$update = $db->prepare("UPDATE table_deleted SET value1= ?, value2= ?, value3= ?, value4= ? ORDER BY id LIMIT 1");
$update->execute(array("$value1","$value2","$value3","$value4"));
//AFTER UPDATE, DELETE THAT ROW
$delete=$db->prepare("DELETE from table_deleted ORDER BY id LIMIT 1");
$delete->execute();
}
else {
//IF TABLE_DELETED IS EMPTY, ADD YOUR VALUES (EXCEPT RANDOM NUMBER)
$query=$db->prepare("insert into your_table set value1= ?, value2= ?, value3= ?, value4= ?");
$query->execute(array("$value1","$value2","$value3","$value4"));
//USING @Wiimm's SOLUTION FOR GENERATING A RANDOM-LOOKING UNIQUE NUMBER
$last_id = $db->lastInsertId();
$number= str_pad($last_id * 683567 % 1000000, 6, '0', STR_PAD_LEFT);
//INSERT THAT RANDOM NUMBER TO THE CURRENT ROW
$insertnumber= $db->prepare("UPDATE your_table SET number= :number where id = :id");
$insertnumber->execute(array("number" => "$number", "id" => "$last_id"));
}
MYSQL triggers do the rest for you.
Random numbers and non-repeatable numbers are basically 2 different things that are mutually exclusive. Can it be that a sequence of numbers that only looks like random numbers is enough for you?
If yes, then I have a solution for it:
Use auto increment of your database.
Multiply the Id by a prime number. Other manipulations like bit rotations are possible too.
About prime number: It is important, that the value range (in your case 1000000) and the multiplicand have no common prime divisors. Otherwise the sequence of numbers is much shorter.
Here is an example for 6 digits:
MYSQL_INSERT_INSTRUCTION;
$id = $mysql_conn->insert_id;
$random_id = $id * 683567 % 1000000;
With this you get:
1: 683567
2: 367134
3: 50701
4: 734268
5: 417835
6: 101402
7: 784969
8: 468536
9: 152103
10: 835670
11: 519237
12: 202804
13: 886371
14: 569938
15: 253505
16: 937072
17: 620639
18: 304206
19: 987773
20: 671340
After 1000000 records the whole sequence repeats. I recommend using the full 32-bit range, so that the sequence has 4 294 967 296 different numbers. In this case use a much larger prime number, e.g. about 2.8e9. I always use a prime ~ 0.86*RANGE for this.
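To make the "no common prime divisors" condition concrete, here is a minimal sketch. The helper names `gcd` and `disguiseId` are made up for illustration, and it assumes 64-bit PHP integers so that the product does not overflow. It rejects a multiplier that shares a factor with the range and otherwise returns the disguised ID.

```php
<?php
// Greatest common divisor (Euclid). Hypothetical helper, not a built-in.
function gcd(int $a, int $b): int
{
    while ($b !== 0) {
        [$a, $b] = [$b, $a % $b];
    }
    return $a;
}

// Map an auto-increment ID to a random-looking number in [0, $range).
// Any $range consecutive IDs map to distinct values only when
// gcd($multiplier, $range) is 1.
function disguiseId(int $id, int $multiplier = 683567, int $range = 1000000): int
{
    if (gcd($multiplier, $range) !== 1) {
        throw new InvalidArgumentException('multiplier and range share a factor');
    }
    return ($id * $multiplier) % $range;
}

// Pad to 6 digits, as in the question's PHP code.
echo str_pad((string) disguiseId(3), 6, '0', STR_PAD_LEFT), "\n"; // 050701
```

With a multiplier such as 5 and a range of 10, the check throws, because only a fraction of the range would ever be produced.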
Alternatives
Instead of $random_id = $id * 683567 % 1000000; you can use other calculations to disguise your algorithm. Some examples:
# add a value
$random_id = ( $id * 683567 + 12345 ) % 1000000;
# add a value and swap the higher and lower 3 digits (the result stays within 6 digits)
$temp = ( $id * 683567 + 12345 ) % 1000000;
$random_id = intdiv($temp, 1000) + ($temp % 1000) * 1000;
If you can move away from the 6 digit (numeric) requirement, I would as it would allow you to create true random strings with some sort of uuid() function.
However, if this needs to be done outside of PHP and has to be 6 digit numbers, I would use an auto-increment column in MySQL.
If there needs to be some randomness, you can adjust the auto-increment column by a random increase:
alter table tableName auto_increment = [insert new starting number here];
Of course, this may land you in 7-digit numbers rather quickly.
Alternatively, I'd see the solution being PHP picking a random number and checking that against the DB (or pull in the rows of the DB first to check against without a DB query every time).
Related
I need to generate close to a million (100 batches of 10000 numbers) unique and random 12-digit codes for a scratch card application. This process will be repeated and will need an equal number of codes to be generated every time.
Also the generated codes need to be entered in a db so that they can be verified later when a consumer enters this on my website. I am using PHP and Mysql to do this. These are the steps I am following
a. Get admin input on the number of batches and the codes per batch
b. Using a for loop, generate the code using mt_rand(100000000000,999999999999)
c. Check every time a number is generated to see if a duplicate exists in the db and if not add to results variable else regenerate.
d. Save generated number in db if unique
e. Repeat b, c, and d over required number of codes
f. Output codes to admin in a csv
Code used(removed most of the comments to make it less verbose and because I have already explained the steps earlier):
$totalLabels = $numBatch*$numLabelsPerBatch;
// file name for download
$fileName = $customerName."_scratchcodes_" . date('Ymdhs') . ".csv";
$flag = false;
$generatedCodeInfo = array();
// headers for download
header("Content-Disposition: attachment; filename=\"$fileName\"");
header("Content-Type: application/vnd.ms-excel");
$codeObject = new Codes();
//get new batch number
$batchNumber = $codeObject->getLastBatchNumber() + 1;
$random = array();
for ($i = 0; $i < $totalLabels; $i++) {
do{
$random[$i] = mt_rand(100000000000,999999999999); //need to optimize this to reduce collisions given the database will grow
}while(isCodeNotUnique($random[$i],$db));
$codeObject = new Codes();
$codeObject->UID = $random[$i];
$codeObject->customerName = $customerName;
$codeObject->batchNumber = $batchNumber;
$generatedCodeInfo[$i] = $codeObject->addCode();
//change batch number for next batch
if($i == ($numLabelsPerBatch-1)){$batchNumber++;}
//$generatedCodeInfo[i] = array("UID" => 10001,"OID"=>$random[$i]);
if(!$flag) {
// display column names as first row
echo implode("\t", array_keys($generatedCodeInfo[$i])) . "\n";
$flag = true;
}
// filter data
array_walk($generatedCodeInfo[$i], 'filterData');
echo implode("\t", array_values($generatedCodeInfo[$i])) . "\n";
}
function filterData(&$str)
{
$str = preg_replace("/\t/", "\\t", $str);
$str = preg_replace("/\r?\n/", "\\n", $str);
if(strstr($str, '"')) $str = '"' . str_replace('"', '""', $str) . '"';
}
function isCodeNotUnique($random){
$codeObject = new Codes();
$codeObject->UID = $random;
if(!empty($codeObject->getCodeByUID())){
return true;
}
return false;
}
Now this is taking really long to execute and I believe is not optimal.
How can I optimize so that the unique random numbers are generated quickly?
Will it be faster if the numbers were instead generated in mysql or other way rather than php and if so how do I do that?
When the db starts growing the duplicate check in step b will be really time consuming so how do I avoid that?
Is there a limit on the number of rows in mysql?
Note: The numbers need to be unique across all batches across lifetime of the application.
1) Divide your range of numbers up to smaller ranges based on the number of batches. E.g. if your range 0 - 1000 and you have 10 batches, then have a batch from 0 - 99, the next 100 - 199, etc. When you generate the numbers for a batch, only generate the random number from the batch range. This way you know that you can only have duplicate numbers within a batch.
Do not insert each number into the database individually, but store them in an array. When you generate a new random number, then check against the array, not the database using in_array() function. When the batch is complete, then use a single insert statement to insert the contents of the batch:
insert into yourtable (bignumber) values (1), (2), ..., (n)
Check MySQL's max_allowed_packet setting to see if it is able to receive the complete sql statement in one go.
Implement a fallback plan, just in case a duplicate value is still found during the insert (error handling and number regeneration).
2) MySQL is not that great on procedural stuff, so I would stick with an external language, such as php.
3) Add a unique index on the field containing the random numbers. If you try to insert a duplicate record, MySQL will prevent it and throw an error. It is really quick.
4) Depending on the actual table engine used (innodb, myisam, etc), its configuration, and the OS, certain limits may apply on the size of the table. See Maximum number of records in a MySQL database table question here on SO for a more detailed answer (check the most upvoted answer, not the accepted one).
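Points 1) and 3) above can be sketched roughly as follows. This is only an illustration: `generateBatch` and `buildInsert` are hypothetical helper names, the table and column names are taken from the example statement above, and the sketch tracks duplicates with array keys rather than in_array(), which tends to be faster for large batches. The fallback for a duplicate-key error raised by the unique index is not shown.

```php
<?php
// Generate $count distinct random integers inside [$min, $max].
// Uniqueness within the batch is tracked with array keys.
function generateBatch(int $count, int $min, int $max): array
{
    if ($count > $max - $min + 1) {
        throw new InvalidArgumentException('range too small for batch');
    }
    $seen = [];
    while (count($seen) < $count) {
        $seen[random_int($min, $max)] = true; // a duplicate just overwrites the key
    }
    return array_keys($seen);
}

// Build one multi-row INSERT for the whole batch.
// The values are integers produced by this script, so no quoting is needed.
function buildInsert(array $numbers): string
{
    return 'INSERT INTO yourtable (bignumber) VALUES ('
        . implode('), (', $numbers) . ')';
}

// Split 0..999 into 10 sub-ranges and draw 5 numbers from the third one.
$batchIndex = 2;
$rangeSize = 100;
$batch = generateBatch(5, $batchIndex * $rangeSize, ($batchIndex + 1) * $rangeSize - 1);
echo buildInsert($batch), ";\n";
```

Because each batch draws from its own sub-range, numbers from different batches cannot collide, so only the in-memory check within the batch is needed before the insert.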
You can do the following:
$random = getExistingCodes(); // Get what you already have (from the DB).
$random = array_flip($random); //Make them into keys
$existingCount = count($random); //The codes you already have
do {
$random[mt_rand(100000000000,999999999999)] = 1;
} while ((count($random)-$existingCount) < $totalLabels);
$random = array_keys($random);
When you generate a duplicate number it will just overwrite that key and not increase the count.
To insert you can start a transaction and do as many inserts as needed. MySQL will try to optimize all operations within a single transaction.
Here is a query that generates 1 million pseudo-random numbers without repetitions:
select cast( (@n := (13*@n + 97) % 899999999981)+1e11 as char(12)) as num
from (select @n := floor(rand() * 9e11) ) init,
(select 1 union select 2) m01,
(select 1 union select 2) m02,
(select 1 union select 2) m03,
(select 1 union select 2) m04,
(select 1 union select 2) m05,
(select 1 union select 2) m06,
(select 1 union select 2) m07,
(select 1 union select 2) m08,
(select 1 union select 2) m09,
(select 1 union select 2) m10,
(select 1 union select 2) m11,
(select 1 union select 2) m12,
(select 1 union select 2) m13,
(select 1 union select 2) m14,
(select 1 union select 2) m15,
(select 1 union select 2) m16,
(select 1 union select 2) m17,
(select 1 union select 2) m18,
(select 1 union select 2) m19,
(select 1 union select 2) m20
limit 1000000;
How it works
It starts by generating a random integer value n with 0 <= n < 900000000000. This number will have the function of the seed for the generated sequence:
@n := floor(rand() * 9e11)
Through multiple (20) joins with inline pairs of records, this single record is multiplied to 2^20 copies, which is just a bit over 1 million.
Then the selection starts, and as record after record is fetched, the value of the @n variable is modified according to this incremental formula:
@n := (13*@n + 97) % 899999999981
This formula is a linear congruential generator. The three constant numbers need to obey some rules to maximise the period (of non-repetition), but it is the easiest when 899999999981 is prime, which it is. In that case we have a period of 899999999981, meaning that the first 899999999981 generated numbers will be unique (and we need much less). This number is in fact the largest prime below 900000000000.
As a final step, 100000000000 is added to the number to ensure the number always has 12 digits, so excluding numbers that are smaller than 100000000000. Because of the choice of 899999999981 there will be 19 numbers that will never be generated, namely those between 999999999981 and 999999999999 inclusive.
As this generates 2^20 records, the limit clause will make sure this is chopped off to exactly one million records.
The cast to char(12) is optional, but may be necessary to visualise the 12-digit numbers without them being rendered on the screen in scientific notation. If you will use this to insert records, and the target data type is numeric, then you would leave out this conversion of course.
CREATE TABLE x (v BIGINT(12) ZEROFILL NOT NULL PRIMARY KEY);
INSERT IGNORE INTO x (v) VALUES
(FLOOR(1e12*RAND()), (FLOOR(1e12*RAND()), (FLOOR(1e12*RAND()),
(FLOOR(1e12*RAND()), (FLOOR(1e12*RAND()), (FLOOR(1e12*RAND()),
(FLOOR(1e12*RAND()), (FLOOR(1e12*RAND()), (FLOOR(1e12*RAND()),
(FLOOR(1e12*RAND()), (FLOOR(1e12*RAND()), (FLOOR(1e12*RAND()),
(FLOOR(1e12*RAND()), (FLOOR(1e12*RAND()), (FLOOR(1e12*RAND());
Do that INSERT 1e6/15 times.
Check COUNT(*) to see if you have a million. Do this until the table has a million rows:
INSERT IGNORE INTO x (v) VALUES
(FLOOR(1e12*RAND()));
Notes:
ZEROFILL is assuming that you want the display to have leading zeros.
IGNORE is because there will be some number of duplicates. This avoids the costly check after each insert.
"Batch insert" is faster than one row at a time. (Doing 100 at a time is about optimal, but I am lazy.)
Potential problem: While I think the pattern of values for RAND() does not repeat at, say, 2^16 or 2^32 values, I do not know that for a fact. If you can't get to a million, then the random number generator is bad; you should switch to PHP's rand, or something else.
Beware of linear congruential random number generators. They are probably easily hacked. (I assume there is some "money" behind the scratch cards.)
Do not plan on mt_rand() being unique for small ranges
<?php
// Does mt_rand() repeat?
TryMT(100);
TryMT(100);
TryMT(1000);
TryMT(10000);
TryMT(1e6);
TryMT(1e8);
TryMT(1e10);
TryMT(1e12);
TryMT(1e14);
function TryMT($max) {
$h = [];
for ($j = 0; $j<$max; $j++) {
$v = mt_rand(1, $max);
if (isset($h[$v])) {
echo "Dup after $j iterations (limit=$max)<br>\n";
return;
}
$h[$v] = 1;
}
}
Sample output:
Dup after 7 iterations (limit=100)<br>
Dup after 13 iterations (limit=100)<br>
Dup after 29 iterations (limit=1000)<br>
Dup after 253 iterations (limit=10000)<br>
Dup after 245 iterations (limit=1000000)<br>
Dup after 3407 iterations (limit=100000000)<br>
Dup after 29667 iterations (limit=10000000000)<br>
Dup after 82046 iterations (limit=1000000000000)<br>
Dup after 42603 iterations (limit=1.0E+14)<br>
mt_rand() is a "good" random number generator because it does have dups.
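Early duplicates are what the birthday problem predicts for any uniform generator: with N possible values, the first repeat is expected after roughly sqrt(pi * N / 2) draws. The sketch below only evaluates that approximation (the function name is made up for illustration) so the figures can be compared with the sample output for the smaller limits.

```php
<?php
// Approximate expected number of uniform draws from $n possible values
// before the first repeat (birthday problem): sqrt(pi * n / 2).
function expectedDrawsUntilDuplicate(float $n): float
{
    return sqrt(M_PI * $n / 2);
}

foreach ([100, 10000, 1e6] as $n) {
    printf("limit=%.0f: about %.0f draws\n", $n, expectedDrawsUntilDuplicate($n));
}
```

A single run can land well below or above the expected value, so only the order of magnitude is meaningful here.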
As the headline states, I'd like to know how to insert both a random number generated with php, and selected lines from another table. Example:
<?php
$randomid = (rand(1,1000000));
$sql = "INSERT INTO example2 (randomid, userid, name)
VALUES ('$randomid')
SELECT userid, name
FROM example1
WHERE name='Donald' "
$mysqli->query($sql);
?>
I'm not sure how to go about this. Must I divide this into an insert and an update query?
SELECT in MySQL can be used to output message/static values like in this example
SELECT CASE
WHEN userid<100 THEN 'less than 100'
WHEN userid<200 THEN 'less than 200'
ELSE 'greater than 200'
END AS message, userid
FROM mytable
so in your example you can just do the same
$randomid = (rand(1,1000000)); // <--- imagine 25 was returned
$sql = "INSERT INTO example2 (randomid, userid, name)
SELECT ".$randomid.", userid, name
FROM example1
WHERE name='Donald' "; // <--- your select now looks like 'SELECT 25, userid, name'
However, there is a downside: this gives every entry with the name Donald the same value, so if you have multiple Donalds it defeats the purpose of a random value, unless you limit the insert to one row at a time so that PHP's rand function is called again for each row.
A better way to do this is with MySQL's own RAND function:
Returns a random floating-point value v in the range 0 <= v < 1.0.
Of course, since this function returns a decimal/float value, which isn't ideal for an integer key, we turn it into an integer by multiplying it and applying FLOOR:
FLOOR(RAND()*1000000) AS randomid
Fiddle
RAND() gives us a value between 0 and 1; we multiply it by 1000000 and then round it down to the nearest whole number using FLOOR. Unlike the PHP code, a new number is generated for every row, so 15 Donalds will get 15 separately generated random IDs. There is still the possibility of getting identical numbers, but that is the nature of random numbers.
A user can input their preferences to find other users.
Now based on that input, I'd like to get the top 10 best matches to the preferences.
What I thought is:
1) Create a select statement that resolves users preferences
if ($stmt = $mysqli->prepare("SELECT sex FROM ledenvoorkeuren WHERE userid = you"))
$stmt->bind_result($ownsex);
2) Create a select statement that checks all users except for yourself
if ($stmt = $mysqli->prepare("SELECT sex FROM ledenvoorkeuren WHERE userid <> you"))
$stmt->bind_result($othersex);
3) Match select statement 1 with select statement 2
while ($stmt->fetch()) {
$match = 0;
if ($ownsex == $othersex) {
$match = $match + 10;
}
// check next preference
4) Start with a variable with value 0, if preference matches -> variable + 10%
Problem is, I can do this for all members, but how can I then select the top 10???
I think I need to do this in the SQL statement, but I have no idea how...
Of course this is just one preference and a super simple version of my code, but you'll get the idea. There are about 15 preference settings.
// EDIT //
I would also like to see how much the match rating is on screen!
Well, it was a good question from the start so I upvoted it and then wasted about 1 hour to produce the following :)
Data
I have used a DB named test and table named t for our experiment here.
Below you can find a screenshot showing this table's structure (3 int columns, 1 char(1) column) and complete data
As you can see, everything is rather simple - we have 4 columns, with id serving as primary key, and a few records (rows).
What we want to achieve
We want to be able to select a limited set of rows from this table based upon some complex criteria, involving comparison of several column's values against needed parameters.
Solution
I've decided to create a function for this. SQL statement follows:
use test;
drop function if exists calcMatch;
delimiter //
create function calcMatch (recordId int, neededQty int, neededSex char(1)) returns int
begin
declare selectedQty int;
declare selectedSex char(1);
declare matchValue int;
set matchValue = 0;
select qty, sex into selectedQty, selectedSex from t where id = recordId;
if selectedQty = neededQty then
set matchValue = matchValue + 10;
end if;
if selectedSex = neededSex then
set matchValue = matchValue + 10;
end if;
return matchValue;
end//
delimiter ;
Minor explanation
Function calculates how well one particular record matches the specified set of parameters, returning an int value as a result. The bigger the value - the better the match.
Function accepts 3 parameters:
recordId - id of the record for which we need to calculate the result(match value)
neededQty - needed quantity. if the record's qty matches it, the result will be increased
neededSex - needed sex value, if the record's sex matches it, the result will be increased
Function selects via id specified record from the table, initializes the resulting match value with 0, then makes a comparison of each required columns against needed value. In case of successful comparison the return value is increased by 10.
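For readers who prefer to see the idea outside SQL, here is a rough PHP counterpart of the same scoring, followed by the "top N" step the question asks for. It is a sketch under assumptions: the field names qty and sex mirror the example table, the sample rows are made up, and `topMatches` is a hypothetical helper. In SQL, the equivalent ranking would come from ordering by the function's result in descending order and limiting the result to 10 rows.

```php
<?php
// 10 points per matching field, as in the stored function.
function calcMatch(array $record, int $neededQty, string $neededSex): int
{
    $matchValue = 0;
    if ($record['qty'] === $neededQty) {
        $matchValue += 10;
    }
    if ($record['sex'] === $neededSex) {
        $matchValue += 10;
    }
    return $matchValue;
}

// Score every record, sort by score descending, keep the best $limit.
function topMatches(array $records, int $neededQty, string $neededSex, int $limit = 10): array
{
    $scored = [];
    foreach ($records as $record) {
        $record['matchValue'] = calcMatch($record, $neededQty, $neededSex);
        $scored[] = $record;
    }
    usort($scored, fn(array $a, array $b): int => $b['matchValue'] <=> $a['matchValue']);
    return array_slice($scored, 0, $limit);
}

// Made-up sample rows for illustration only.
$records = [
    ['id' => 1, 'qty' => 5, 'sex' => 'm'],
    ['id' => 2, 'qty' => 3, 'sex' => 'f'],
    ['id' => 3, 'qty' => 5, 'sex' => 'f'],
];
foreach (topMatches($records, 5, 'f', 2) as $row) {
    echo "{$row['id']}: {$row['matchValue']}\n";
}
```

Keeping the score in each returned row also covers the request to show the match rating on screen.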
Live test
So, hopefully this solves your problem. Feel free to use this for your own project, add needed parameters to function and compare them against needed columns in your table.
Cheers!
Use the limit and offset in query:
SELECT sex FROM ledenvoorkeuren WHERE userid = you limit 10 offset 0
This returns the data of the first 10 users.
You can set a limit in your query like this:
SELECT sex FROM ledenvoorkeuren WHERE userid <> yourid AND sex <> yourpreferredsex limit 0, 10
Where the '0' is the offset, and the '10' your limit
More info here
you may try this
SELECT sex FROM ledenvoorkeuren WHERE userid = you ORDER BY YOUR_PREFERENCE LIMIT 0, 10
I am trying to build a variable that combines several other variables.
One of them will be the number of an auto-increment field, for which an insert query will run later on.
I tried to use:
$get_num = $db/*=>mysqli*/->query("SELECT COUNT (*) auto_increment_column FROM table1");
$num = $query->fetch_assoc($get_num);
$end = $num + 1;
I don't have any update/insert query before that so I can't use
$end = $db->insert_id;
That's why I thought I could just count the rows in the auto-increment column and use that as the last piece needed to build my new variable.
For some reason this won't count the entries and outputs 0. I don't understand why this happens.
I would really appreciate it if someone could tell me what I am doing wrong. Thanks a lot.
UPDATE
For everyone who would like to know what the goal is:
I want to create a specific name or ID for a file that will later be created from the input fields of the insert query. I want it to be a unique key. This key consists of a user_id and a timestamp, and at the end of the generated variable there should be the auto-increment number of the row that will be inserted into the table. So the problem is that I create the variable before the insert query happens, because the variable is itself part of the insert query, like:
$get_num = $db->query("SELECT COUNT (*) FROM tableA");
$num = $query->fetch_assoc();
$end = $num + 1;
$file_id = $id .".". time() .".". $end;
$insert = $db->query("INSERT INTO tableA ( file_id, a, b, c) VALUES('".$file_id."','".$a."','".$b."','".c."')");{
I hope it is now clear what I am trying to achieve.
If you need an auto-incrementing column in MySQL then you should use AUTO_INCREMENT. It implements it all for you and avoids race conditions. The manual way you are trying to implement it has a couple of flaws, namely
If two scripts are trying to insert concurrently they might both get the same COUNT (say 10) and hence both try to insert with ID 11. One will then fail (or else you will have duplicates!)
If you add 10 items but then delete item 1, the COUNT will return 9 but id 10 will already exist.
try
SELECT COUNT(*) FROM table1
In php what is a function to only display strings that have a length greater than 50 characters, truncate it to not display more than 130 characters and limit it to one result?
so for example say i have 30 rows in a result set but I only want to show the newest row that have these parameters. If the newest row has 25 characters it should not display. It should only display the newest one that has a string length of 50 or more characters.
Use an SQL query. To find the newest row you want the max of either an auto_increment primary key (I'll call it id) or a date/time when the row was created (say, time_created).
So I am assuming table with: id (int), stringVal (string, char(), varchar(), whatever)
SELECT id, SUBSTRING(stringVal, 1, 130) AS stringVal
FROM yourTable
WHERE LENGTH(stringVal) >= 50
ORDER BY id DESC
LIMIT 1
Replace id with a time field if you have to. You're going to have a hard time finding the newest without one of them, but you can always arbitrarily pick one row.
--Edit-- a sample of using mysqli functions in PHP to run the above query and fetch the desired output
$sql = "SELECT id, SUBSTRING(stringVal, 1, 130) AS stringVal FROM yourTable WHERE LENGTH(stringVal) >= 50 ORDER BY id DESC LIMIT 1";
$r = mysqli_query($conn, $sql); // assumes $conn is an already-open mysqli connection
$row = mysqli_fetch_assoc($r);
$desiredString = $row['stringVal'] ?? null; // null when no row is long enough
Something like this should do. Just make sure that you fetch your data sorted by the newest items first. The break statement ensures that the loop terminates after the first result matching your criteria is found:
foreach($array_returned_from_query as $row)
{
if(strlen($row) > 50)
{
echo substr($row, 0, 130);
break;
}
}