I am creating an application that inserts (or updates) values in MySQL daily. A simplified recordset with headers is:
ItemName,ItemNumber,ItemQty,Date
test1,1,5,2016/01/01
test1,1,3,2016/01/02
test2,2,7,2016/01/01
test2,2,5,2016/01/02
A simple INSERT of the above recordset (16 columns, 216,000 records, covering a week of values) takes about 4 minutes in PHP/MySQL. Of course, if I import the same recordset again I get duplicates. I am trying to find a way to effectively disallow duplicate entries.
The aim: when I import a recordset every day that covers dates for the current week, only the rows for the new dates should be added.
The only thing that might change in consecutive imports is the ItemQty.
In PHP I added logic that queries the DB for ItemName, ItemNumber, and Date with the values I am about to insert. If the SELECT returns a result, I skip the row; if it doesn't, I proceed with inserting a new row.
The problem is that with this logic the import no longer takes 4 minutes, but a couple of hours. (It works, though.)
Any ideas?
I was thinking that when I insert, I could also insert a checksum column, for example md5(ItemName . ItemNumber . ItemQty . Date), and then check this checksum rather than the SELECT * FROM $table WHERE ItemName = value AND ItemNumber = value AND ItemQty = value AND Date = value that I currently run.
My problem is that the records I insert have basically nothing unique; uniqueness only comes from a group of fields compared against the dataset being imported. If I manage to get uniqueness somehow, I'll also solve my other problem: deleting or updating a row when the ItemQty changes.
What you are looking for is a UNIQUE constraint. You can add all of the identifying columns to the constraint, and when an inserted row matches an existing row on all of those columns, MySQL will refuse the insert as a duplicate.
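A minimal sketch, assuming the table is called items and that ItemName, ItemNumber and Date identify a row (ItemQty is left out of the key because it is the value that may change):

ALTER TABLE items
    ADD UNIQUE KEY uq_item_date (ItemName, ItemNumber, Date);

After this, a plain INSERT of an already-imported (ItemName, ItemNumber, Date) combination fails with a duplicate-key error, or is silently skipped if you use INSERT IGNORE, so re-importing a week adds only the new dates.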
A few options:
1) In PHP, iterate over the records, mapping the duplicates and keeping only the newest:
$itemsArray = []; // The array where you have stored your data
$uniqueItems = [];
foreach ($itemsArray as $item) {
    if (isset($uniqueItems[$item['ItemName']])) {
        $oldRecord = $uniqueItems[$item['ItemName']];
        $newTimeStamp = strtotime($item['Date']); // Might not work with your date format
        $currentTimeStamp = strtotime($oldRecord['Date']);
        if ($newTimeStamp > $currentTimeStamp) {
            $uniqueItems[$item['ItemName']] = $item;
        }
    } else {
        $uniqueItems[$item['ItemName']] = $item;
    }
}
// $uniqueItems now holds only one record per ItemName (the newest one)
2) Sort the data in PHP by date in ascending order (before inserting into the database), then use ON DUPLICATE KEY UPDATE in your INSERT statement, as sketched below. This makes MySQL update the records that hit a duplicate key: the older records are inserted first and the latest records last, overwriting the old data.
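A minimal example of such a statement, assuming the unique key over (ItemName, ItemNumber, Date) suggested above:

INSERT INTO items (ItemName, ItemNumber, ItemQty, Date)
VALUES ('test1', 1, 3, '2016-01-02')
ON DUPLICATE KEY UPDATE ItemQty = VALUES(ItemQty);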
Related
This is a rough example of my MySQL query (note: this is inside another loop that goes through all users):
$query = $db->query("SELECT * FROM table WHERE userid = $uid AND reminded = 0");
while ($row = $query->fetch()) {
    // send personalized reminder email to the user
    $db->query("UPDATE table SET reminded = 1 WHERE userid = $uid");
}
The field reminded is set to 1 for all instances for that user.
My question is:
Is the query/while (fetch) already loaded into memory based on the original terms (reminded = 0), or will the remaining while loop behave according to those updates (reminded = 1)?
Let's say the user had 50 rows where reminded is 0, and the query selects those: Are they still existing with the value 0 in the rest of the while loop even though they were all changed to 1 during the loop?
I'm assuming the code and SQL you posted are only an example (because you could update directly, without a PHP loop).
With an unbuffered query, the fetch is executed on the DB row by row. So if one or more of these rows are updated inside the while loop, later iterations will retrieve (and update again) the rows that were already updated.
I think you have to be careful "only" if you are updating a field that is part of an index, or a field that the SQL uses to retrieve the data (e.g. a field used in the ORDER BY, etc.).
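If the real code matches the example, here is a loop-free sketch, assuming $db is a PDO-like connection and using the table and columns from the question:

// Fetch the rows that still need a reminder
$rows = $db->query("SELECT * FROM table WHERE userid = $uid AND reminded = 0")->fetchAll();
foreach ($rows as $row) {
    // send personalized reminder email to the user
}
// One single UPDATE instead of one UPDATE per fetched row
$db->query("UPDATE table SET reminded = 1 WHERE userid = $uid AND reminded = 0");

Because the SELECT completes before the UPDATE runs, the question of mid-loop visibility does not arise at all.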
I have a while loop that adds a key and a variable to an array like this:
$array = [];
while ($row = mysqli_fetch_assoc($results)) {
    $array[$row['user']] = $row['variable'];
}
and the table it is fetching the data from has over 1,000 rows, with duplicate rows per user that may carry a more recent variable value.
What I am wondering is: how does the array handle duplicate keys? Does it:
1) replace $row['variable'] with the new one when it comes across the same key,
OR
2) skip the assignment, since the key $row['user'] already exists?
The answer to this will inform how I set up my mysqli_query with either ORDER BY date ASC or ORDER BY date DESC since I want to ensure the most recent variable inserted is linked to the user.
If you are asking how PHP behaves here: assigning to an existing key always replaces the previous value, so the last occurrence of a key wins. That's your first option.
But you can test that yourself in any available PHP sandbox with a simple array assignment, for example:
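A minimal demonstration:

$array = [];
$array['alice'] = 'old value';
$array['alice'] = 'new value'; // same key: the value is overwritten
echo $array['alice'];          // prints "new value"

So with ORDER BY date ASC the most recent row is assigned last and wins, which is what you want.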
I am hoping someone can help me, because I am attempting something beyond my limits. I don't even know whether a function for this exists in PHP or MySQL, so my Google search hasn't been very productive.
I am using PHPWord in my PHP/MySQL project; the intention is to create a Word document based on a template.
I have used this guide, which is also on Stack Exchange.
However, this approach requires the number of rows and the values to be hard-coded, i.e. in his example he uses cloneRow('first_name', 3), which clones the table row so there are 3 rows, and then goes on to manually define the tags, i.e.
$doc->setValue('first_name#1', 'Jeroen');
$doc->setValue('last_name#1', 'Moors');
$doc->setValue('first_name#2', 'John');
I am trying to make this dynamic. In my case I am building a timetable, and one of the child tables is exactly that, so my query looks up how many entries there are and returns a count; this $count is then used to create the correct number of rows dynamically. This is the count I am using:
$rs10 = CustomQuery("select count(*) as count FROM auditplanevents where AuditModuleFk='" . $result["AuditModulePk"]."'");
$data10 = db_fetch_array($rs10);
$Count = $data10["count"];
I then use $document->cloneRow('date', $Count); to execute the cloneRow function, which works great, and my document now looks something like this.
So, so far so good.
What I now want is a way to append each row's values from the query into the document, so that rather than manually setting the tag value, i.e. $doc->setValue('first_name#1', 'Jeroen');, I could use something like $doc->setValue('first_name#1', $name from row 1);. I suspect this will involve a foreach loop, but I'm not too sure.
I hope the above makes sense, but please feel free to ask me for anything else and become my personal hero. Thanks
Update: Just for sake of clarity, what I would like is for the output to look something like this:
In my example there are 5 results and therefore 5 rows created; I want to set the values in the following way:
${date#1} = date column from the query's 1st row
${date#2} = date column from the query's 2nd row
${date#3} = date column from the query's 3rd row
${date#4} = date column from the query's 4th row
${date#5} = date column from the query's 5th row
I was able to sort this out by inserting the records from the query into a temp table with an auto-increment (AI) row ID, then using:
//update timetable with events from temp table
$rs14 = CustomQuery("select * FROM tempauditplan where AuditModuleFk='" . $result["AuditModulePk"]."'");
while ($data14 = db_fetch_array($rs14)) {
    $document->setValue('date#' . $data14["rowid"], date('d/m/y', strtotime($data14["date"])));
    $document->setValue('time#' . $data14["rowid"], date('H:i', strtotime($data14["time"])));
    $document->setValue('auditor#' . $data14["rowid"], $data14["auditor"]);
    $document->setValue('area#' . $data14["rowid"], $data14["area"]);
    $document->setValue('notes#' . $data14["rowid"], $data14["notes"]);
    $document->setValue('contact#' . $data14["rowid"], $data14["contact"]);
}
The trick is to also have a function that truncates the temp table after use, so it can be used over again.
Might not be the most efficient way, but it works!
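For what it's worth, a variant that skips the temp table entirely is to keep a row counter in PHP while looping over the original auditplanevents query; this is only a sketch, reusing the CustomQuery/db_fetch_array helpers from the question:

$rs = CustomQuery("select * FROM auditplanevents where AuditModuleFk='" . $result["AuditModulePk"] . "'");
$i = 1; // cloneRow numbers its tags from 1: date#1, date#2, ...
while ($data = db_fetch_array($rs)) {
    $document->setValue('date#' . $i, date('d/m/y', strtotime($data["date"])));
    $document->setValue('auditor#' . $i, $data["auditor"]);
    // ...and so on for the other columns
    $i++;
}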
I have the following call to my database to retrieve the last row ID from an AUTO_INCREMENT column, which I use to find the next row ID:
$result = $mysqli->query("SELECT articleid FROM article WHERE articleid=(SELECT MAX(articleid) FROM article)");
$row = $result->fetch_assoc();
$last_article_id = $row["articleid"];
$last_article_id = $last_article_id + 1;
$result->close();
I then use $last_article_id as part of a filename system.
This works perfectly... until I delete a row, at which point the call retrieves an ID further down the order than the one I actually want.
An example would be:
ID
0
1
2
3
4-(deleted row)
5-(deleted row)
6-(next ID to be used for INSERT call)
I'd like the filename to be something like 6-0.jpg; however, the filename ends up being 4-0.jpg, as the call targets ID 3 and adds 1, etc.
Any thoughts on how I can get the next MySQL row ID when any number of previous rows have been deleted?
You are making a significant error by trying to predict the next auto-increment value. If you want your system to scale, you do not have a choice: you have to either insert the row first or rename the file later.
This is a classic oversight I see developers make -- you are coding this as if there would only ever be a single user on your site. It is extremely likely that at some point two articles will be created at almost the same time. Both queries will "predict" the same id, both will use the same filename, and one of the files will disappear, one of the table entries may point to the wrong file, and the other entry will reference a file that does not exist. And you'll be scratching your head asking "how did this happen?!"
Predicting auto-increment values is bad practice. Don't do it. Plan for concurrency.
Also, the information_schema tables are not really tables... they are server internals exposed to the SQL interface. Queries against the "tables" table and SHOW TABLE STATUS are expensive calls that you do not want to make in production... so don't be tempted to use something you find there.
You can read insert_id from the connection after you insert the new row to retrieve the new key:
$mysqli->query($yourQueryHere);
$newId = $mysqli->insert_id; // a property on mysqli, not a method
That requires the id field to be an AUTO_INCREMENT column, though.
As for the filename, you could store the data in a variable, run the INSERT, read the new ID, and only then build the name and write the file.
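A minimal sketch of that insert-first flow, using the article table from the question ($imageData is a hypothetical variable holding the uploaded image):

$mysqli->query("INSERT INTO article (title) VALUES ('My new article')");
$newId = $mysqli->insert_id;              // the id MySQL actually assigned
$filename = $newId . "-0.jpg";            // e.g. "6-0.jpg", even after deletes
file_put_contents($filename, $imageData); // write the file only after the row exists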
I want to return the row number of a particular row (so how many rows before a given row).
The problem is that the primary keys are not sequential; there are 'gaps' in them, because sometimes I have to DELETE rows.
id = 1
id = 2
id = 5
id = 9
id = 10
So the only option to get the row number is to use a COUNT(*):
SELECT COUNT(*) FROM table WHERE id < selected_row_id;
But for a given page I have to perform this operation multiple times, so one solution is to use a foreach loop, like:
foreach ($foo as $item) {
    mysql_query("SELECT COUNT(*) FROM table WHERE id < {$item['id']}");
    //...
}
But I don't think that is optimal if you have thousands of rows and the foreach above runs 80-100 iterations.
Another solution would be to rebuild the entire id column after DELETING a row, but because of foreign key constraints/references I don't think that is a good step either.
So if COUNT(*) in a foreach is not viable, has anyone faced this type of problem, and what would be the optimal solution?
Thanks for your time, and sorry for my bad English.
I recently had a similar issue where, for a given set of results, I wanted to know the position # of a specific result in that set.
There's an elegant solution that will give you sequential row numbers for a resultset, based on incrementing a user variable inside the query.
See http://craftycodeblog.com/2010/09/13/rownum-simulation-with-mysql
Hope that helps you out
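A minimal sketch of that technique, using the id column from the question (@rownum is a MySQL user variable initialized in the derived table):

SELECT t.id, @rownum := @rownum + 1 AS rownum
FROM table t, (SELECT @rownum := 0) r
ORDER BY t.id;

Every row comes back with its sequential position, so you can read all the positions from one resultset instead of running one COUNT(*) per row.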
Instead of rebuilding the id column after each delete or insert, how about just adding a new column to the table to store the data you need?
I think you want to try this:
SELECT parent.Id, COUNT(DISTINCT child.Id) AS previous
FROM table AS parent, table AS child
WHERE child.Id < parent.Id
GROUP BY parent.Id
I don't think this is efficient from a database perspective, but it should be better than iterating in code. The caveat is that you need to build the required PHP code yourself, since I am not proficient at that, but it should be easy; a rough sketch follows.
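A rough sketch of that PHP side, keeping the question's mysql_* style and the column alias from the query above:

$result = mysql_query("SELECT parent.Id, COUNT(DISTINCT child.Id) AS previous
                       FROM table AS parent, table AS child
                       WHERE child.Id < parent.Id
                       GROUP BY parent.Id");
$positions = [];
while ($row = mysql_fetch_assoc($result)) {
    $positions[$row['Id']] = $row['previous']; // row number of each id
}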
There are a couple of ways to improve this:
You can sort $items by id, then keep track of the number of rows above the last item, and add to it the number of rows between the last item and the current one: $last_id <= id AND id < $item['id'].
Context switching is expensive. It is better to grab all the IDs in a single query, then process the information in PHP, for example:
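A sketch of that single-query approach, again in the question's mysql_* style:

// One round trip: fetch every id in order
$result = mysql_query("SELECT id FROM table ORDER BY id");
$rowNumber = [];
$n = 0;
while ($row = mysql_fetch_assoc($result)) {
    $rowNumber[$row['id']] = $n++; // position of each id in the table
}
// Each lookup is now an array access instead of a COUNT(*) query
foreach ($foo as $item) {
    $position = $rowNumber[$item['id']];
    //...
}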