My code randomly generates a 4- or 5-digit number, appends it to a 3-character pre-defined keyword, and checks the database. If the code already exists, it regenerates it, and once a free code is found it saves it to the database.
But sometimes the queries pile up and become slow once a pre-defined keyword already has around 1,000 records.
Let's take the keyword "XYZ" with deal ID = 100 as an example, and say it already has 8,000 records in the database. The do-while loop takes a lot of time.
$keyword = "XYZ"; // It is unique for each deal id.
dealID = 100; // It is Foreign key of another table.
$initialLimit = 1;
$maxLimit = 9999;
do {
$randomNo = rand($initialLimit, $maxLimit);
$coupon = $keyword . str_pad($randomNo, 4, '0', STR_PAD_LEFT);
$findRecord = DB::table('codes')
->where('code', $coupon)
->where('deal_id', $dealID)
->exists();
} while ($findRecord == 1);
As soon as the do-while loop ends, the record is inserted into the database. But the loop itself takes too much time.
The queries it runs look like this in MySQL. For the example deal ID, which already has over 8,000 records, the code keeps querying until it finds a free code, and when traffic is high the app becomes noticeably slower.
select exists(select * from `codes` where `code` = 'XYZ1952' and `deal_id` = '100');
select exists(select * from `codes` where `code` = 'XYZ2562' and `deal_id` = '100');
select exists(select * from `codes` where `code` = 'XYZ7159' and `deal_id` = '100');
Multiple queries like this get stuck in the database. The codes table has around 500,000 records across multiple deal IDs, but each deal ID has fewer than 10,000 records; only a few have more than 10,000.
Any suggestions on how I can improve the above code?
Or should I use the MAX() function, take the highest existing code, add 1, and insert that into the DB?
When 80% of the numbers are taken, it takes time to find one that is not taken. The first test fails 80% of the time; the chance that the first two both fail is 64%, the first three 51%, and so on. On average you need about 5 tries (1/0.2), but once in a while it will take a hundred tries to find a free number. And if the random number generator is "poor", it could literally never finish.
Let's make a very efficient way to generate the numbers.
Instead of looking for a free one, pre-determine all 9999 values. Do this by taking all the numbers 1..9999 and shuffle them. Store that list somewhere and "pick the next one" (instead of scrounging for an unused one).
This could be done with an extra column (indexed). Or an extra table. Or any number of other ways.
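For example, here is a rough sketch of the pre-shuffled pool idea using an extra table (the coupon_pool table and its columns are only illustrative names, not part of your current schema):
// One-time seeding per deal: build every possible code, shuffle, and store.
$codes = [];
foreach (range(1, 9999) as $n) {
    $codes[] = [
        'deal_id' => $dealID,
        'code'    => $keyword . str_pad($n, 4, '0', STR_PAD_LEFT),
        'used'    => 0,
    ];
}
shuffle($codes); // randomize once, up front
foreach (array_chunk($codes, 500) as $chunk) {
    DB::table('coupon_pool')->insert($chunk);
}

// Later, instead of the do-while loop, just take the next free code:
$coupon = DB::table('coupon_pool')
    ->where('deal_id', $dealID)
    ->where('used', 0)
    ->orderBy('id') // assumes an auto-increment id, which preserves the shuffled insert order
    ->value('code');
DB::table('coupon_pool')
    ->where('deal_id', $dealID)
    ->where('code', $coupon)
    ->update(['used' => 1]);
Under concurrent traffic you would wrap the pick-and-mark step in a transaction (or use lockForUpdate()), but the key point is that every pick becomes a single indexed lookup instead of an unbounded retry loop.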
I'm trying to write a Laravel eloquent statement to do the following.
Query a table and get all the IDs of all the duplicate rows (or ideally all the IDs except the ID of the first instance of each duplicate).
Right now I have the following mysql statement:
select `codes`, count(`codes`) as `occurrences`, `customer_id` from `pizzas`
group by `codes`, `customer_id`
having `occurrences` > 1;
The duplicates are any row that shares a combination of codes and customer_id, example:
codes,customer_id
183665A4,3
183665A4,3
183665A4,3
183665A4,3
183665A4,3
I'm trying to delete all but 1 of those.
This returns a set of the codes with their occurrences and their customer_id, since a row only counts as a duplicate when both fields match.
Currently I loop through this, save the ID of the first instance, then query again and delete any row without that ID. This doesn't seem very fast: there are about 50 million rows, so each query takes forever, and we run multiple queries for every duplicate we delete.
// get every order that shares the same code and customer ID
$orders = Order::select('id', 'codes', DB::raw('count(`codes`) as `occurrences`'), 'customer_id')
    ->groupBy('codes')
    ->groupBy('customer_id')
    ->having('occurrences', '>', 1)
    ->limit(100)
    ->get();
// loop through those orders
foreach ($orders as $order)
{
    // find the first order that matches this duplicate set
    $first_order = Order::where('codes', $order->codes)
        ->where('customer_id', $order->customer_id)
        ->first();
    // delete all but the first
    Order::where('codes', $order->codes)
        ->where('customer_id', $order->customer_id)
        ->where('id', '!=', $first_order->id)
        ->delete();
}
There has got to be a more efficient way to track down all rows that share the same code and customer_id, and delete all the duplicates but keep the first instance, right? lol
I'm thinking maybe if I can add a fake column to the results that is an array of every ID, I could at least then remove the first ID and delete the others.
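Something like this is what I have in mind, just as a sketch (untested, and I know GROUP_CONCAT has a default length limit):
select `codes`, `customer_id`, group_concat(`id`) as `ids` from `pizzas`
group by `codes`, `customer_id`
having count(*) > 1;
Then I could pop the first ID off each ids list and delete the rest.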
Don't involve PHP
This seems not very fast
The logic in the question is inherently slow because it's lots of queries and for each query there's:
DB<->PHP network roundtrip
PHP ORM logic/overhead
Given the numbers in the question, the outer code needs calling up to 10k times (if there are exactly 2 occurrences for each of those 2 million duplicate records, that is 1 million duplicate sets handled 100 at a time). For argument's sake, say it is called 1k times, each call returning 100 duplicate sets; overall that's:
1,000 queries finding duplicates
100,000 queries finding the first record
100,000 delete queries
201,000 queries is a lot and the php overhead makes it an order of magnitude slower (a guess, based on experience).
Do it directly on the DB
Just eliminating the PHP/ORM/network time (even if it's on the same machine) would make the process markedly faster; that would involve writing a stored procedure to mimic the PHP logic in the question.
But there's a simpler way, the specifics depend on the circumstances. In comments you've said:
The table is 140GB in size
It contains 50 million rows
Approx 2 million are duplicate records
There isn't enough free space to make a copy of the table
Taking these comments at face value the process I suggest is:
Ensure you have a functional DB backup
Before doing anything make sure you have a functional DB backup. If you manage to make a mistake and e.g. drop the table - be sure you can recover without loss of data.
You'll be testing this process on a copy of the database first anyway, right :) ?
Create a table of "ids to keep" and populate it
This is a variation on removing duplicates with a unique index:
CREATE TABLE ids_to_keep (
id INT PRIMARY KEY,
codes VARCHAR(50) NOT NULL, # use same schema as source table
customer_id INT NOT NULL, # use same schema as source table
UNIQUE KEY derp (codes,customer_id)
);
INSERT IGNORE INTO ids_to_keep
SELECT id, codes, customer_id from pizzas;
Mysql will silently drop the rows conflicting with the unique index, resulting in a table with one id per codes+customer_id tuple.
If you don't have space for this table - make room :). It shouldn't be too large; 140GB and 50M rows means each row is approx 3kb - this temporary table will likely require single-digit % of the original size.
Delete the duplicate records
Before executing any expected-to-be-slow query use EXPLAIN to check if the query will complete in a reasonable amount of time.
To run as a single query:
DELETE FROM
pizzas
WHERE
id NOT IN (SELECT id from ids_to_keep);
If you wish to do things in chunks:
DELETE FROM
pizzas
WHERE
id BETWEEN 0 AND 10000 AND
id NOT IN (SELECT id from ids_to_keep);
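Then repeat, advancing the id range by your chosen chunk size until you pass MAX(id) of pizzas, e.g. the next pass would be:
DELETE FROM
pizzas
WHERE
id BETWEEN 10001 AND 20000 AND
id NOT IN (SELECT id from ids_to_keep);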
Cleanup
Once the table isn't needed any more, get rid of it:
DROP TABLE ids_to_keep;
Make sure this doesn't happen again
To prevent this happening again, add a unique index to the table:
CREATE UNIQUE INDEX codes_customer_unique ON pizzas(codes, customer_id);
Try this one; it deletes the older duplicates and keeps only the latest id for each codes/customer_id pair:
$deleteDuplicates = DB::table('orders as ord1')
    ->join('orders as ord2', 'ord1.codes', '=', 'ord2.codes')
    ->whereColumn('ord1.customer_id', 'ord2.customer_id')
    ->whereColumn('ord1.id', '<', 'ord2.id')
    ->delete();
How do I limit a MySQL query to select the newest 50 rows, and have a "next" button so the next 50 rows are selected, without knowing the exact number of rows?
The number of rows in the table keeps growing. Let me explain clearly: I'm developing a web app as my project, a document management system using PHP, MySQL and HTML. Everything is set up, but when retrieving the documents there may be thousands of them.
All the documents in my info table are retrieved at once on the home page, which doesn't look good. So I would like to add pages, such that only the newest 50 documents are placed on the first page, the next 50 on the second, and so on.
But how can I know the exact number of rows every time? I can't change the code every time a new document is added, so num_rows may not be useful, I think.
Help me out please...
What you are looking for is called pagination, and the easiest way to implement simple pagination is using LIMIT x, y in your SQL queries.
You don't really need the total number of rows you have; you just need two numbers:
The number of elements you have already queried, so you know where the next query has to continue.
The number of elements you want to list per query (for example 50, as you suggested).
Let's say you want to query the first 50 elements: append LIMIT 0,50 to your query. After that you'll need to store somewhere the fact that you have already queried 50 elements, so that next time you change the limit to LIMIT 50,50 (start from element number 50 and fetch the 50 following elements).
The order depends on the fields you set when the entries are inserted. Normally you can alter your table to add a column created TIMESTAMP DEFAULT CURRENT_TIMESTAMP and then just use ORDER BY created, because from then on your entries will store the exact time they were created, letting you find the most recent ones (if you have an AUTO_INCREMENT id you can order by the highest values as well).
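For example, assuming a table named documents (the name is just a placeholder) with such a created column, the first two pages could be fetched like this:
ALTER TABLE documents ADD COLUMN created TIMESTAMP DEFAULT CURRENT_TIMESTAMP;
SELECT * FROM documents ORDER BY created DESC LIMIT 0, 50;  -- page 1: newest 50
SELECT * FROM documents ORDER BY created DESC LIMIT 50, 50; -- page 2: next 50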
This could be an example of this system using php and MySQL:
$page = 1;
if (!empty($_GET['page'])) {
    $page = filter_input(INPUT_GET, 'page', FILTER_VALIDATE_INT);
    if (false === $page) {
        $page = 1;
    }
}
// set the number of items to display per page
$items_per_page = 50;
// build query
$offset = ($page - 1) * $items_per_page;
$sql = "SELECT * FROM your_table LIMIT " . $offset . "," . $items_per_page;
I found this post really useful when I first tried to make this pagination system, so I recommend you check it out (it's the source of the example as well).
Hope this helps, and sorry I couldn't provide a better example since I don't have your code.
Search for pagination using PHP & MySQL; that may come in handy for your problem.
To limit a MySQL query to 50 rows, use the LIMIT keyword. You may need to find and store the id of the last row (the 50th) so that you can continue with the 51st to 100th rows on the next page.
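A minimal sketch of that idea, assuming an auto-increment id column and a table called documents (both assumptions):
-- page 1: the newest 50 rows
SELECT * FROM documents ORDER BY id DESC LIMIT 50;
-- next page: remember the smallest id you displayed and continue below it
SELECT * FROM documents WHERE id < :last_seen_id ORDER BY id DESC LIMIT 50;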
Post what you have done with your code. Please refer to whathaveyoutried[dot]com
Check this example from another post, https://stackoverflow.com/a/2616715/6257039; you could add an ORDER BY id or creation_date DESC to your query.
I have a database table with about 160 million rows in it.
The table has two columns: id and listing.
I simply need to use PHP to display 1000 random rows from the listing column and put them into <span> tags. Like this:
<span>Row 1</span>
<span>Row 2</span>
<span>Row 3</span>
I've been trying to do it with ORDER BY RAND() but that takes so long to load on such a large database and I haven't been able to find any other solutions.
I'm hoping that there is a fast/easy way to do this. I can't imagine that it'd be impossible to simply echo 1000 random rows... Thanks!
Two solutions presented here. Both of these proposed solutions are mysql-only and can be used by any programming language as the consumer. PHP would be wildly too slow for this, but it could be the consumer of it.
Faster Solution: I can bring 1000 random rows from a table of 19 million rows in about 2 tenths of a second with more advanced programming techniques.
Slower Solution: It takes about 15 seconds with non-power programming techniques.
By the way, both use the data generation seen HERE that I wrote, so that is my little schema. To get to 19M rows, start from that and run two more of the self-inserts shown over there; I am not going to repeat it here.
Slower version first
First, the slower method.
select id,thing from ratings order by rand() limit 1000;
That returns 1000 rows in 15 seconds.
For anyone new to mysql, don't even read the following.
Faster solution
This is a little more complicated to describe. The gist of it is that you pre-compute your random numbers and generate an in clause ending of random numbers, separated by commas, and wrapped with a pair of parentheses.
It will look like (1,2,3,4) but it will have 1000 numbers in it.
And you store them, and use them once. Like a one time pad for cryptography. Ok, not a great analogy, but you get the point I hope.
Think of it as an ending for an in clause, and stored in a TEXT column (like a blob).
Why in the world would one want to do this? Because RNGs (random number generators) are prohibitively slow for this. But a few machines generating them in the background can crank out thousands relatively quickly. By the way (and you will see this in the structure of my so-called appendices), I capture how long it takes to generate one row: about 1 second with MySQL. But C#, PHP, Java, anything can put that together. The point is not how you put it together, but that you have it when you want it.
The long and short of this strategy: fetch a row whose random list has not been used yet, mark it as used, and issue a call such as
select id,thing from ratings where id in (a,b,c,d,e, ... )
where the in clause has 1000 numbers in it; the results are available in less than half a second. This effectively employs the MySQL CBO (cost-based optimizer), which treats it like a join on a PK index.
I leave this in summary form because it is a bit complicated in practice, but it potentially includes the following pieces:
a table holding the precomputed random numbers (Appendix A)
a mysql create event strategy (Appendix B)
a stored procedure that employs a Prepared Statement (Appendix C)
a mysql-only stored proc to demonstrate RNG in clause for kicks (Appendix D)
Appendix A
A table holding the precomputed random numbers
create table randomsToUse
( -- create a table of 1000 random numbers to use
-- format will be like a long "(a,b,c,d,e, ...)" string
-- pre-computed random numbers, fetched upon needed for use
id int auto_increment primary key,
used int not null, -- 0 = not used yet, 1= used
dtStartCreate datetime not null, -- next two lines to eyeball time spent generating this row
dtEndCreate datetime not null,
dtUsed datetime null, -- when was it used
txtInString text not null -- here is your in clause ending like (a,b,c,d,e, ... )
-- this may only have about 5000 rows and garbage cleaned
-- so maybe choose one or two more indexes, such as composites
);
Appendix B
In the interest of not turning this into a book, see my answer HERE for a mechanism for running a recurring mysql Event. It will drive the maintenance of the table seen in Appendix A using techniques seen in Appendix D and other thoughts you want to dream up. Such as re-use of rows, archiving, deleting, whatever.
Appendix C
A stored procedure to simply get me 1000 random rows.
DROP PROCEDURE if exists showARandomChunk;
DELIMITER $$
CREATE PROCEDURE showARandomChunk
(
)
BEGIN
DECLARE i int;
DECLARE txtInClause text;
-- select now() into dtBegin;
select id,txtInString into i,txtInClause from randomsToUse where used=0 order by id limit 1;
-- select txtInClause as sOut; -- used for debugging
-- if I run the following statement, it is 19.9 seconds on my Dell laptop
-- with 19M rows
-- select * from ratings order by rand() limit 1000; -- 19 seconds
-- however, if I run the following "Prepared Statement", it takes 2 tenths of a second
-- for 1000 rows
set @s1=concat("select * from ratings where id in ",txtInClause);
PREPARE stmt1 FROM @s1;
EXECUTE stmt1; -- execute the puppy and give me 1000 rows
DEALLOCATE PREPARE stmt1;
END
$$
DELIMITER ;
Appendix D
Can be intertwined with Appendix B concept. However you want to do it. But it leaves you with something to see how mysql could do it all by itself on the RNG side of things. By the way, for parameters 1 and 2 being 1000 and 19M respectively, it takes 800 ms on my machine.
This routine could be written in any language as mentioned in the beginning.
drop procedure if exists createARandomInString;
DELIMITER $$
create procedure createARandomInString
( nHowMany int, -- how many numbers to you want
nMaxNum int -- max of any one number
)
BEGIN
DECLARE dtBegin datetime;
DECLARE dtEnd datetime;
DECLARE i int;
DECLARE txtInClause text;
select now() into dtBegin;
set i=1;
set txtInClause="(";
WHILE i<nHowMany DO
set txtInClause=concat(txtInClause,floor(rand()*nMaxNum)+1,", "); -- extra space good due to viewing in text editor
set i=i+1;
END WHILE;
set txtInClause=concat(txtInClause,floor(rand()*nMaxNum)+1,")");
-- select txtInClause as myOutput; -- used for debugging
select now() into dtEnd;
-- insert a row, that has not been used yet
insert randomsToUse(used,dtStartCreate,dtEndCreate,dtUsed,txtInString) values
(0,dtBegin,dtEnd,null,txtInClause);
END
$$
DELIMITER ;
How to call the above stored proc:
call createARandomInString(1000,18000000);
That generates and saves 1 row, of 1000 numbers wrapped as described above. Big numbers, 1 to 18M
As a quick illustration, if one were to modify the stored proc, un-rem the line near the bottom that says "used for debugging" so that it is the last statement that runs, and call this:
call createARandomInString(4,18000000);
... to generate 4 random numbers up to 18M, the results might look like
+-------------------------------------+
| myOutput |
+-------------------------------------+
| (2857561,5076608,16810360,14821977) |
+-------------------------------------+
Appendix E
Reality check. These are somewhat advanced techniques and I can't tutor anyone on them. But I wanted to share them anyway. But I can't teach it. Over and out.
ORDER BY RAND() is a MySQL feature that works fine with small databases, but if you run it on anything larger than 10k rows you should build the logic in your program instead of relying on MySQL's premade functions, or organise your data in a special manner.
My suggestion: keep your MySQL data indexed by an auto-increment id, or add another incremental, unique column.
Then build a select function:
<?php
// get total number of rows
$result = mysql_query('SELECT `id` FROM `table_name`', $link);
$num_rows = mysql_num_rows($result);

// pick 1000 random ids within that range
$randomlySelected = [];
for ($a = 0; $a < 1000; $a++) {
    $randomlySelected[$a] = rand(1, $num_rows);
}

// then select data by random ids
$where = "";
$control = 0;
foreach ($randomlySelected as $key => $selectedID) {
    if ($control == 0) {
        $where .= "`id` = '" . $selectedID . "'";
    } else {
        $where .= " OR `id` = '" . $selectedID . "'"; // note the leading space before OR
    }
    $control++;
}
$final_query = "SELECT * FROM `table_name` WHERE " . $where . ";";
$final_results = mysql_query($final_query);
?>
If some of the incremental IDs in that 160-million-row table are missing, you can easily add a function (a while loop, probably) that draws extra random IDs whenever the array of randomly selected ids yields fewer rows than required.
Let me know if you need some further help.
If your RAND() function is too slow, and you only need quasi-random records (for a test sample) rather than truly random ones, you can always pull a fast, effectively random group of rows by sorting on middle characters (using SUBSTRING) of indexed fields. For example, sorting by the 7th digit of a phone number in descending order, and then by the 6th digit in ascending order, is already quasi-random. You could do the same with character columns: the 6th character in a person's name is going to be meaningless/random, etc.
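A rough sketch of that idea (the listings table and phone_number column are only examples, not from the question):
SELECT id, listing
FROM listings
ORDER BY SUBSTRING(phone_number, 7, 1) DESC,
         SUBSTRING(phone_number, 6, 1) ASC
LIMIT 1000;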
You want to use the rand function in php. The signature is
rand(min, max);
so get the number of rows in your table into a variable and set that as your max.
A way to do this with SQL is
SELECT COUNT(*) FROM table_name;
then simply run a loop to generate 1000 rands with the above function and use them to get specific rows.
If the IDs are not sequential but are close together, you can simply test each random ID to see if there is a hit. If they are sparse, you could pull the entire ID space into PHP and then randomly sample from that distribution via something like
$random = rand(0, count($rows)-1);
for an array of IDs in $rows.
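A rough sketch of that approach, using PDO purely as an example API (the listings table name is an assumption, and pulling every id into PHP is only feasible if the id list fits in memory):
<?php
$pdo = new PDO('mysql:host=localhost;dbname=example_db', 'user', 'pass');

// pull the whole id space into PHP
$rows = $pdo->query('SELECT `id` FROM `listings`')->fetchAll(PDO::FETCH_COLUMN);

// sample 1000 distinct positions from that distribution
$picked = [];
foreach ((array) array_rand($rows, 1000) as $key) {
    $picked[] = (int) $rows[$key];
}

// fetch only the sampled rows and print them as spans
$in = implode(',', $picked);
$listings = $pdo->query("SELECT `listing` FROM `listings` WHERE `id` IN ($in)")
                ->fetchAll(PDO::FETCH_COLUMN);
foreach ($listings as $listing) {
    echo '<span>' . htmlspecialchars($listing) . '</span>';
}
?>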
Use MySQL's RAND() in your SELECT statement. Your query will look like
SELECT * FROM `table` ORDER BY RAND() LIMIT 0,1;
I am creating a job number system that a few users will be using at the same time. I generate a job number on the PHP page, save it to the job sheet, and use it to link other tables to the job.
I take the job number from a table called numbers, which should then be incremented by 1 each time a job is submitted, ready to create the next job.
But the numbers are not working correctly.
As an example I get 1, 2, 3, 4, 8, then 43, 44, 45, then 105.
I can't see why they would jump so much.
$job_number_query = "SELECT * FROM numbers";
$job_result = $mysqli->query($job_number_query);
$job_num = mysqli_fetch_assoc($job_result);
$increment_job_number = $job_num['job_number'];
$update_job_number_query = "UPDATE numbers SET job_number = $increment_job_number + 1";
$mysqli->query($update_job_number_query);
//echo ($customer_id);
Then I simply insert the $increment_job_number into the jobsheet table.
I am using INT for the job_number field in the numbers table.
I can't think of a way to test the numbers. I guess one way is to look through the job sheets and add another number there, but because more than one user might have a job that hasn't been submitted yet, would this also cause problems?
Just increase the value without the first SELECT:
UPDATE numbers SET job_number = job_number +1
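If you also need the newly assigned number back in PHP, one common MySQL idiom (a sketch, assuming the numbers table holds a single counter row as in the question) is to route the increment through LAST_INSERT_ID(), so the increment and the read stay consistent even when several users submit at once:
UPDATE numbers SET job_number = LAST_INSERT_ID(job_number + 1);
SELECT LAST_INSERT_ID();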
You have no where clause on your update query, so you're incrementing the job_number field in ALL records in the table.
It was me that was the technical failure in the end. I had the increment on the create page, but unfortunately I had also put the increment on the edit pages, so every time I edited a page I added 1 to the number field in the numbers table.
I have a web page where people are able to post a single number between 0 and 10.
There is a lotto-style draw of a single number once daily. I want my PHP script to check the posted numbers of all the users and assign a score of +1 to the winners and -1 to the losers.
The problem is that once I query the DB for the list of the winning users, I want to update their "score" field (in the "users" table). I was thinking of a loop like this (pseudocode):
foreach winner{
update score +1
}
but this would mean that if there are 100 winners, then there will be 100 queries. Is there a way to do some sort of batch update with one single query?
Thanks in advance.
I'll assume you are using an SQL database, and suggest that you would probably want to do something like
UPDATE `table` SET `score`=`score`+1 WHERE `number`=3;
and the corresponding -1 for losers (strange, can't see a reason to -1 them).
Without more details though, I can't be of further help.
You didn't specify how the numbers were stored. If there is a huge number of people posting, a good option is to use a database to store their numbers.
You can have, for example, a table called lotto with three fields: posted_number, score and email. Create a (non-unique!) index on the posted_number field.
create table lotto (posted_number integer(1) unsigned, score integer, email varchar(255), index(posted_number));
To update their score you can execute two queries:
update lotto set score = score+1 where posted_number = <randomly drawn number here>
update lotto set score = score-1 where posted_number = <randomly drawn number here>
Let's just assume we have two database tables named posts and users.
Obviously, users contains the gamblers' data (with a convenient id field and points for the number of points they have), and posts contains post_id as the ID field for the row, user_id, which is the ID of the user, and value, the posted number itself.
Now you only need to implement the following SQL queries into your script:
UPDATE users INNER JOIN posts ON users.id = posts.user_id SET users.points = (users.points + 1)
WHERE posts.value = 0;
Where 0 at the end is to be replaced with the randomly drawn number.
What will this query do? With the INNER JOIN construct, it creates a link between the two tables. Automatically, if posts.value matches our number, it links posts.user_id to users.id, so it knows which user has to get his/her points modified. If someone gambled 0, and his ID (posts.user_id) is 8170, the points field will update for the user having users.id = 8170.
If you alter the query to make it (users.points - 1) and WHERE posts.value != 0, you will get the non-winners having one point deducted. It can be tweaked as much as you want.
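For completeness, the altered query for the non-winners would be (0 again stands in for the drawn number):
UPDATE users INNER JOIN posts ON users.id = posts.user_id SET users.points = (users.points - 1)
WHERE posts.value != 0;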
Just be careful! After each daily draw, the posts table needs to be truncated or archived.
Another option would be storing the timestamp (time() in PHP) at which the user bet the number, and, when executing, checking whether the stored timestamp falls between the beginning and the end of the current day.
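A sketch of that variant (it assumes posts has an extra created_at column holding the PHP time() value, which is not part of the schema described above):
UPDATE users INNER JOIN posts ON users.id = posts.user_id SET users.points = (users.points + 1)
WHERE posts.value = 0
AND posts.created_at >= UNIX_TIMESTAMP(CURDATE())
AND posts.created_at < UNIX_TIMESTAMP(CURDATE() + INTERVAL 1 DAY);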
Just a tip: you can use graphical database software (like Microsoft Access or LibreOffice Base) to have your JOINs and such simulated on a graphical display. It makes modelling such questions a lot easier for beginners. If you don't want desktop-installed software, trying out an installation of phpMyAdmin is another solution too.
Edit:
Non-relational databases
If you are to use non-relational databases, you will first need to fetch all the winner IDs with:
SELECT user_id FROM posts WHERE value=0;
This will give you a result of multiple rows. Now you need to go through this result one by one, executing the following query:
UPDATE users SET points=(users.points + 1) WHERE id=1;
(0 is the drawn winning number; 1 is the id of the user currently being updated.)
Without using the relation capabilities of MySQL, but using a MySQL database, the script would look like this:
<?php
$number = 0; // This is the winning number we have drawn
$result = mysql_query("SELECT user_id FROM posts WHERE value=" . $number);
while ($row = mysql_fetch_assoc($result))
{
    $curpoints_result = mysql_query("SELECT points FROM users WHERE user_id=" . $row['user_id']);
    $current_points = mysql_fetch_assoc($curpoints_result);
    mysql_query("UPDATE users SET points=" . ($current_points['points'] + 1) . " WHERE user_id=" . $row['user_id']);
}
?>
The while construct makes this loop run until every row of the result (the list of winners) has been processed.
Oh and: I know MySQL is a relational database, but it is just what it is: an example.