Time Taken for each row to be inserted to MySQL database - php

Hi, I have data to upload to a MySQL database and I need to get the time taken for each row to be inserted, so that I can pass that number through to a progress bar. I have already tried accomplishing this by determining the number of rows affected by the insertion and then finding the percentage of that number, which is not the correct way to do this.
Here is the code:
$result3 = mysql_query("INSERT INTO dest_table.create_info SELECT * FROM Profusion.source_cdr") or die(mysql_error());
$progress = mysql_affected_rows();
// Total processes
$total = $progress;
// Loop through processes
for ($i = 1; $i <= $total; $i++) {
    // Calculate the percentage
    $percent = intval($i / $total * 100) . "%";
    echo $percent;
}
All this does is divide the loop counter by the total and multiply by 100 to print a percentage, which is wrong.
I need the time taken for each row to be inserted, and then to work out the percentage from that.
Your help will be highly appreciated.

It is not easy to determine the time of a single insertion in this case, because a plain INSERT INTO my_table VALUES(2,4,5) is different from INSERT INTO my_Table SELECT foo,bar,zoo FROM my_other_table.
In your case, the MySQL server will try to keep both tables in memory, which needs a lot of memory if the tables are big. Check this blog post: Even faster: loading half a billion rows in MySQL revisited for some details.
In any case, in the above script, by the time mysql_query returns, the query has ALREADY EXECUTED, which means your percentage is counted after the query has finished.
EDIT: A solution is to process the entries in chunks. Pseudo code of the solution would look like this:
mysql_query("SELECT COUNT(*) FROM my_other_table");
$total_count = get_count_from_query_result();
$one_percent = max(1, intval($total_count / 100)); // avoid a zero-sized chunk on small tables
for ($i = 0; $i < $total_count; $i += $one_percent)
{
    mysql_query("INSERT INTO my_Table SELECT foo,bar,zoo FROM my_other_table LIMIT $i, $one_percent");
    increment_progress_bar();
}

Related

Randomly Generate 4 digit Code & Check if exist, Then re-generate

My code randomly generates a 4- or 5-digit code along with a 3-character pre-defined prefix and checks it against the database; if it already exists, the code regenerates it and saves it into the database.
But sometimes the queries get stuck and become slower once each pre-defined keyword has around 1000 records.
Let's take one keyword, "XYZ", with deal ID = 100, and say it has 8000 records in the database. The do-while loop takes a lot of time.
$keyword = "XYZ"; // It is unique for each deal id.
$dealID = 100;    // It is a foreign key to another table.
$initialLimit = 1;
$maxLimit = 9999;
do {
    $randomNo = rand($initialLimit, $maxLimit);
    $coupon = $keyword . str_pad($randomNo, 4, '0', STR_PAD_LEFT);
    $findRecord = DB::table('codes')
        ->where('code', $coupon)
        ->where('deal_id', $dealID)
        ->exists();
} while ($findRecord);
As soon as the do-while loop ends, the record is inserted into the database. But the code above takes too much time.
The queries it issues look like the following in MySQL. For the example deal id, which already has over 8000 records, the code keeps querying until it finds a free code; when traffic is high, the app becomes slow.
select exists(select * from `codes` where `code` = 'XYZ1952' and `deal_id` = '100');
select exists(select * from `codes` where `code` = 'XYZ2562' and `deal_id` = '100');
select exists(select * from `codes` where `code` = 'XYZ7159' and `deal_id` = '100');
Multiple queries like this get stuck in the database. The codes table has around 500,000 records across multiple deal ids, but each deal id has fewer than 10,000 records; only a few have more than 10,000.
Any suggestions on how I can improve the above code?
Or should I use the MAX() function to find the highest code, add 1, and insert that into the db?
When 80% of the numbers are taken, it takes time to find one that is not taken. The first test loses 80% of the time; the second loses 64% of the time, then 51%, etc. Once in a while, it will take a hundred tries to find a free number. And if the random number generator is "poor", it could literally never finish.
Let's make a very efficient way to generate the numbers.
Instead of looking for a free one, pre-determine all 9999 values. Do this by taking all the numbers 1..9999 and shuffle them. Store that list somewhere and "pick the next one" (instead of scrounging for an unused one).
This could be done with an extra column (indexed). Or an extra table. Or any number of other ways.
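A minimal sketch of the pre-shuffle idea above, assuming the same XYZ keyword and the 1..9999 range from the question (the function name is illustrative):

```php
<?php
// Instead of probing random codes against the database, generate every
// possible code for a deal once, shuffle the list, and hand codes out in
// order. No retry loop, no duplicate checks per request.

function buildCouponPool(string $keyword, int $min = 1, int $max = 9999): array
{
    $numbers = range($min, $max);
    shuffle($numbers); // randomize the order once, up front
    return array_map(
        fn (int $n) => $keyword . str_pad((string) $n, 4, '0', STR_PAD_LEFT),
        $numbers
    );
}

// Persist this pool (an extra table, or an extra indexed column) and pop
// the next unused entry on each request.
$pool = buildCouponPool('XYZ');
echo $pool[0] . "\n"; // e.g. XYZ4821 - every entry in $pool is unique
```

Each code is produced exactly once, so the "keep guessing until free" loop disappears entirely.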

How to limit mysql rows to select newest 50 rows

How can I limit a MySQL query to select the newest 50 rows, with a Next button so that the next 50 rows are selected, without knowing the exact number of rows?
I mean the number of rows in the table may grow. To explain clearly: I am developing a web app as my project, a document management system using PHP, MySQL, and HTML. Everything is set up, but when retrieving the documents there may be thousands of them.
All the documents in my info table are retrieved at once on the home page, which does not look good. So I would like to add paging, such that only the newest 50 documents appear on the first page, the next 50 on the second, and so on.
But how can I know the exact number of rows every time? I cannot change the code every time a new document is added, so numrows may not be useful, I think.
Help me out please.
What you are looking for is called pagination, and the easiest way to implement simple pagination is using LIMIT x, y in your SQL queries.
You don't really need the total amount of rows you have; you just need two numbers:
The amount of elements you have already queried, so you know where the next query has to continue.
The amount of elements you want to list per query (for example 50, as you suggested).
Let's say you want to query the first 50 elements: you would append LIMIT 0,50 to your query. After that you need to store somewhere the fact that you have already queried 50 elements, so the next time you change the limit to LIMIT 50,50 (start at element number 50 and query the 50 following elements).
The order depends on the fields you set when the entries are inserted. Normally you can alter your table to add the field created TIMESTAMP DEFAULT CURRENT_TIMESTAMP and then just use ORDER BY created, because from then on your entries will store the exact time they were created, letting you look for the most recent ones (if you have an AUTO_INCREMENT id you can look for the greater values as well).
This could be an example of this system using PHP and MySQL:
$page = 1;
if (!empty($_GET['page'])) {
    $page = filter_input(INPUT_GET, 'page', FILTER_VALIDATE_INT);
    if (false === $page) {
        $page = 1;
    }
}
// set the number of items to display per page
$items_per_page = 50;
// build query (newest first, using the created column described above)
$offset = ($page - 1) * $items_per_page;
$sql = "SELECT * FROM your_table ORDER BY created DESC LIMIT " . $offset . "," . $items_per_page;
I found this post really useful when I first tried to build this pagination system, so I recommend you check it out (it is the source of the example as well).
Hope this helped you, and sorry I couldn't provide a better example since I don't have your code.
Search for pagination using PHP & MySQL; that may come in handy with your problem.
To limit a MySQL query to fetch 50 rows, use the LIMIT keyword. You may need to find and store the last row id (the 50th row) so that you can continue with the 51st to 100th rows on the next page.
Post what you have done with your code. Please refer to whathaveyoutried[dot]com
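The "find & store the last row id" suggestion above is known as keyset pagination; a sketch of it, assuming a documents table with an AUTO_INCREMENT id (table and function names are illustrative):

```php
<?php
// Build the query for the next page given the smallest id the user has
// already seen. Unlike LIMIT offset,count, this never needs the total row
// count and stays fast no matter how deep the user pages.

function nextPageQuery(int $lastSeenId, int $pageSize = 50): string
{
    // Newest-first: ids smaller than the last one shown, newest of those first.
    // Both values are ints, so interpolation here is injection-safe.
    return "SELECT * FROM documents WHERE id < $lastSeenId "
         . "ORDER BY id DESC LIMIT $pageSize";
}

// First page: no last id yet, so just take the newest 50.
$firstPage = "SELECT * FROM documents ORDER BY id DESC LIMIT 50";
// After rendering it, remember the smallest id shown and build the next page:
echo nextPageQuery(951) . "\n";
```

The "Next" button only has to carry that last id forward; no total row count is ever needed.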
Check this example from another post https://stackoverflow.com/a/2616715/6257039 — you could add an ORDER BY id, or creation_date DESC, to your query.

What is the difference between while and BETWEEN, and which is the better choice in my case?

I have a table with hundreds of rows, and I want to get specific rows. I used LIMIT, and BETWEEN id AND id.
In my application I have two text inputs, one for the START NUMBER and one for the END NUMBER. When I use LIMIT, I need to tell the user to do the right calculation to get the right start and end.
So for example, if I have a table of 3000 rows and I want to select the 100 rows after row 2000, the query will be:
Select * from table LIMIT 2000,100
This will select the 100 rows after row 2000.
The BETWEEN method:
In the BETWEEN method, I'm running a while loop over the whole table and using an if statement to get the right ids. Here is what I'm doing:
$datastart = $_POST["datastart"];
$dataend = $_POST["dataend"];
$firstid = 0;
$lastid = 0;
$varcount6 = 1;
$sql = "select ID from users_info";
$sqlread = mysqli_query($conn, $sql);
while ($row = mysqli_fetch_assoc($sqlread)) {
    if ($datastart == $varcount6) {
        $firstid = $row["ID"];
    }
    if ($varcount6 >= $dataend) {
        $lastid = $row["ID"];
        break;
    }
    $varcount6++;
}
So now I have the first id and the last id, and I use another SQL query:
Select * from table where id between $firstid and $lastid
Both worked.
My question is: what should I use if I'm loading a huge amount of data each time?
Should I go with the while loop, or will LIMIT get the job done?
To begin with, you should not use PHP to work out the id range; stick to doing that solely in SQL, as the PHP pass over the whole table is never needed.
The LIMIT query you're using will not cut it for what you're trying to do, because it does not care what ids the entries have; so if your ids are not 100% consecutive, you will not get the desired result.
You should use the BETWEEN query you show at the bottom of your post.
Since you haven't provided your full code I cannot say whether or not you sanitized that input, but that is always a good thing to keep in mind. It's preferable to use parameterized queries as well.
If you're sure the ids are consecutive, use SELECT * FROM t WHERE id BETWEEN a AND b ORDER BY id ASC
If you use LIMIT, the SQL engine has to scan and order all the preceding results.
(And index the id field.)
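Since parameterized queries were recommended above, here is a sketch of the BETWEEN approach as a mysqli prepared statement, assuming the users_info table from the question (the function name is illustrative):

```php
<?php
// Fetch the requested id range in a single parameterized query, replacing
// both the PHP-side while loop and the string-interpolated BETWEEN query.

function fetchRange(mysqli $conn, int $firstid, int $lastid): array
{
    $stmt = $conn->prepare(
        'SELECT * FROM users_info WHERE ID BETWEEN ? AND ? ORDER BY ID ASC'
    );
    $stmt->bind_param('ii', $firstid, $lastid); // both bound as integers
    $stmt->execute();
    return $stmt->get_result()->fetch_all(MYSQLI_ASSOC);
}

// Usage: $rows = fetchRange($conn, (int) $_POST['datastart'], (int) $_POST['dataend']);
```

One round trip, no per-row iteration in PHP, and the user input never touches the SQL string.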

mysql - deleting large number of rows with a limit (php nightly cron)

Not sure what the best way to handle this is. For my particular situation I have numerous tables where I want to delete any rows with a timestamp more than 3 months old... aka only keep records for the last 3 months.
Very simply, it would be something like this:
//set binding cutoff timestamp
$binding = array(
    'cutoff_time' => strtotime('-3 months')
);
//##run through all the logs and delete anything before the cutoff time
//app
$stmt = $db->prepare("
    DELETE
    FROM app_logs
    WHERE app_logs.timestamp < :cutoff_time
");
$stmt->execute($binding);
//more tables after this
Every table I will be deleting from has an indexed timestamp column. I am concerned about the point down the road when the number of rows to delete is large. What would be the best practice for limiting the deletes to chunks in a loop? All I can think of is doing an initial SELECT to find whether there are any rows that need to be deleted, then running the DELETE if there are, and repeating until the initial query finds no results. That adds an extra count query for each iteration of the loop.
What is the standard/recommended practice here?
EDIT:
quick writeup of what I was thinking
//set binding cutoff timestamp
$binding = array(
    'cutoff_time' => strtotime('-3 months')
);
//set limit value
$limit = 1000;
//##run through all the logs and delete anything before the cutoff time
//get the total count
$stmt = $db->prepare("
    SELECT COUNT(*)
    FROM app_logs
    WHERE app_logs.timestamp < :cutoff_time
");
$stmt->execute($binding);
//get total results count from above
$found_count = $stmt->fetch(PDO::FETCH_COLUMN, 0);
// loop deletes
$stmt = $db->prepare("
    DELETE
    FROM app_logs
    WHERE app_logs.timestamp < :cutoff_time
    LIMIT :limit
");
// LIMIT must be bound as an integer; passing it through execute() binds it
// as a string and breaks the query
$stmt->bindValue(':cutoff_time', $binding['cutoff_time'], PDO::PARAM_INT);
$stmt->bindValue(':limit', $limit, PDO::PARAM_INT);
while ($found_count > 0) {
    $stmt->execute();
    $found_count = $found_count - $limit;
}
It depends on your table size and its workload, so try a few iterations:
Just delete everything that is older than 3 months. Look at whether its timing is good enough. Is there performance degradation, or are there table locks? How does your app handle the period of data deletion?
In case it behaves badly, consider deleting with a limit of 10k or so. Check it as above. Add proper indexes.
If it's still bad, consider selecting the PKs before the delete, and then deleting by PK with a 10k limit and pauses between queries.
Still bad? Add a new "to delete" column and perform the operation on it, with all the requirements above.
There are a lot of tricks for rotating tables. Try something and you will find what meets your needs.
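The "delete with a limit, with pauses" iteration can be sketched without the extra COUNT(*) per loop by checking how many rows each DELETE actually removed; the function name, chunk size, and pause length here are assumptions:

```php
<?php
// Delete expired rows in chunks until a chunk comes back short, which means
// the backlog is gone. rowCount() on the DELETE statement replaces the
// separate COUNT(*) query entirely.

function purgeOldLogs(PDO $db, int $limit = 10000): void
{
    $stmt = $db->prepare(
        'DELETE FROM app_logs WHERE app_logs.timestamp < :cutoff_time LIMIT :limit'
    );
    // Bind LIMIT as an integer - a string-bound value breaks the query.
    $stmt->bindValue(':cutoff_time', strtotime('-3 months'), PDO::PARAM_INT);
    $stmt->bindValue(':limit', $limit, PDO::PARAM_INT);

    do {
        $stmt->execute();
        $deleted = $stmt->rowCount(); // rows removed by this chunk
        usleep(100000);               // brief pause so other queries get a turn
    } while ($deleted === $limit);    // a short final chunk means we are done
}
```

Run it per table from the nightly cron; each iteration holds locks only for one small chunk.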

How to check for existing records before running update code?

We currently have some php code that allows an end user to update records in our database by filling out a few fields on a form. The database has a field called sequentialNumber, and the user can enter a starting number and ending number, and an update query runs in the background on all records with a sequentialNumber between the starting and ending numbers. This is a pretty quick query.
We've been running into some problems lately where people are trying to update records that don't exist. Now, this isn't an issue on the database end, but they want to be notified if records don't exist, and if so, which ones don't exist.
The only way I can think of to do this is to run a select query in a loop:
for ($i = $startNum; $i <= $endNum; $i++) {
    //run select query: select sequentialNumber from myTable where sequentialNumber = $i;
}
The problem is, our shared host has a timeout limit on scripts, and if the sequentialNumber batch is large enough, the script will time out. Is there a better way of checking for the existence of the records before running the update query?
EDIT:
It just occurred to me that I could do another kind of test: get the number of records they're trying to update ($endNum - $startNum + 1) and then do a count query:
select count(sequentialNumber) from myTable where sequentialNumber between $startNum and $endNum
If the result of the query is not the same as that number, then we know that not all the records are there. It wouldn't be able to tell us WHICH ones are missing, but at least we'd know something wasn't as expected. Is this a valid way to do things?
You could try
select sequentialNumber from myTable where sequentialNumber between $startNum and $endNum
This will return all known numbers in that range. Then you can use the array_search function to find out if a certain number is known or not. This should be faster than doing a lot of queries to the db.
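A sketch of that idea using array_diff rather than repeated array_search calls, which directly reports which numbers are missing (the function name is illustrative):

```php
<?php
// Compare the full expected range against the numbers that actually exist.
// array_diff keeps the values of the first array not present in the second,
// so the result is exactly the list of missing sequential numbers.

function findMissing(int $startNum, int $endNum, array $existing): array
{
    return array_values(array_diff(range($startNum, $endNum), $existing));
}

// $existing would come from the single range query:
//   SELECT sequentialNumber FROM myTable
//   WHERE sequentialNumber BETWEEN $startNum AND $endNum
$missing = findMissing(100, 105, [100, 101, 103, 105]);
print_r($missing); // lists 102 and 104
```

One query plus one in-memory diff, so the script never loops over the database and cannot hit the host's timeout.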
$count_result = mysql_query("SELECT COUNT(*) FROM x WHERE id BETWEEN startNum AND endNum");
$count = mysql_fetch_array($count_result);
// mysql_affected_rows() takes the link, not a result, so run the UPDATE first
mysql_query("UPDATE x SET col = 'value' WHERE id BETWEEN startNum AND endNum");
$modified = mysql_affected_rows();
if ($count[0] > $modified) {
    // panic now
}
