Consider the following table
+-------------+-------------+------+-----+---------+----------------+
| Field       | Type        | Null | Key | Default | Extra          |
+-------------+-------------+------+-----+---------+----------------+
| id          | int(11)     | NO   | PRI | NULL    | auto_increment |
| date        | date        | NO   |     | NULL    |                |
| sku         | varchar(10) |      |     | NULL    |                |
| impressions | int(11)     | NO   |     | NULL    |                |
| sales       | int(11)     | NO   |     | NULL    |                |
+-------------+-------------+------+-----+---------+----------------+
The table gets populated daily from a bulk download of the previous day's sales records.
Each day's download contains not only the previous day's sales data but also all data from the last 90 days (possibly 50k+ records).
However, the data for earlier days may have changed since the original insert due to matters outside our control, e.g.
Day 1.
Date: 2015-01-01
SKU: ABCD
Impressions: 100
Sales: 0
Day 2.
Date: 2015-01-01
SKU: ABCD
Impressions: 100
Sales: 3
Date: 2015-01-02
SKU: ABCD
Impressions: 105
Sales: 0
So for any given record from the data download it could be
a) Already seen and the same as before - ignore
b) New - add to database
c) Already seen but new data - Update
Arguably this could be trivially solved by checking each row like so:
while (!$file->eof()) {
    $row = $file->fgets();
    $data = explode("\t", $row);
    $sku = $data[0];
    $date = $data[1];
    $impressions = $data[2];
    $sales = $data[3];
    $order = $em->getRepository('Orders')->findOneBy(['sku' => $sku, 'date' => $date]);
    if (!$order) {
        // b) new record
        ... create new model
    } elseif ($order->getImpressions() != $impressions || $order->getSales() != $sales) {
        // c) already seen, but either value has changed
        $order->setImpressions($impressions);
        $order->setSales($sales);
    }
    // a) already seen and unchanged: nothing to update
    $em->persist($order);
}
However the rows which will have updated data will be minimal and doing a select for each and every row would mean this job would be incredibly slow due to sheer number of rows.
So my question is what patterns could be used to solve this problem as efficiently as possible?
Any ideas welcome
I would suggest you completely replace the previous 90 days' data with the newly downloaded data.
The reasoning is simple:
The processing time to do this will be trivial. 50,000 rows is tiny in database terms. I would probably do this even if it were a million rows.
Trying to replace only the changed rows is complicated and could introduce errors.
When you say "same as before" it seems like the keys are date and sku (combined), and sales and impressions are the fields that could be updated. If that's correct, then the most efficient way to do this in MySQL is to use an INSERT ... ON DUPLICATE KEY UPDATE query:
1. Create a unique key on the date and sku columns.
2. In your PHP script, pre-parse all data from the file (or do it in batches if you'd like).
3. Run a query similar to this (substitute the actual values parsed in step 2):
INSERT INTO
    mytable (`date`, sku, impressions, sales)
VALUES
    ('2015-01-01', 'ABCD', 100, 3),
    ('2015-01-02', 'ABCD', 100, 3),
    ...
ON DUPLICATE KEY UPDATE
    impressions = VALUES(impressions),
    sales = VALUES(sales)
A couple of notes:
check out the documentation for this syntax
if the next day's download contained only the increment for an earlier date rather than the full new value, you could do sales = sales + VALUES(sales), but I don't think that's the case for you
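As a minimal sketch of step 2 and step 3 together, the batched statement can be built from the parsed rows with placeholders instead of inlined values. The function name buildUpsertQuery and the row array layout are assumptions made for illustration; the table name mytable and the column names come from the query above, and a unique key on (date, sku) is assumed to exist.

```php
<?php
// Hypothetical helper: builds one batched upsert statement plus its bound parameters.
// Assumes a unique key on (`date`, sku) and the `mytable` name used in the answer.
function buildUpsertQuery(array $rows): array
{
    $placeholders = [];
    $params = [];
    foreach ($rows as $row) {
        // One "(?, ?, ?, ?)" group per parsed record.
        $placeholders[] = '(?, ?, ?, ?)';
        array_push(
            $params,
            $row['date'],
            $row['sku'],
            (int) $row['impressions'],
            (int) $row['sales']
        );
    }
    $sql = 'INSERT INTO mytable (`date`, sku, impressions, sales) VALUES '
        . implode(', ', $placeholders)
        . ' ON DUPLICATE KEY UPDATE impressions = VALUES(impressions), sales = VALUES(sales)';

    return [$sql, $params];
}

// Sample rows taken from the "Day 2" download in the question.
$rows = [
    ['date' => '2015-01-01', 'sku' => 'ABCD', 'impressions' => 100, 'sales' => 3],
    ['date' => '2015-01-02', 'sku' => 'ABCD', 'impressions' => 105, 'sales' => 0],
];
[$sql, $params] = buildUpsertQuery($rows);
// With an open PDO connection in $pdo (not shown), this would run as:
// $pdo->prepare($sql)->execute($params);
```

For a 50k-row download it is probably worth splitting the rows with array_chunk (for example 1,000 rows per statement, an arbitrary figure) so that no single statement grows too large.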
Related
1- I am sorry for the title, I couldn't describe my complex situation better.
2- I have a table for a double-entry accounting system where I am trying to calculate the balance at a specific date and up to a specific transaction, and due to specific constraints in the front-end I need to get the result in a single query.
Table example is like that:
| id | date | amount |
| --- | ---------- | ------ |
| 93 | 2018-03-02 | -200 |
| 94 | 2018-01-23 | 250 |
| 108 | 2018-03-05 | 400 |
| 120 | 2018-01-23 | 720 |
| 155 | 2018-03-02 | -500 |
| 170 | 2018-03-02 | 100 |
And here is my simple query that I am using inside a loop of every transaction, because I want to show the new BALANCE after every transaction is made:
... for ...
Transactions::where('date', '<=', $item->date)->get()
... end ...
That query is returning the balance at the END of the day, means until the last transaction made that day, and I don't want this result.
Desired result is achieved by something like:
... for ...
Transactions::where('date', '<=', $item->date)
-> and transaction is < index of current $item
->get()
... end ...
Of course I can't use the ID because the ID is not related in this situation, as the whole ordering and calculation operations are date related.
So basically what I want is a query that gets all the transactions from the beginning of time until a specific date BUT excludes all the transactions made after the CURRENT one (in the loop).
For example, in the above table situation the query for:
Transaction ID # 93 should return: 93
Transaction ID # 94 should return: 94
Transaction ID # 108 should return: 94,120,93,155,170,108
Transaction ID # 120 should return: 94,120
Transaction ID # 155 should return: 94,120,155
..
...
....
The last transaction to get should be the current transaction.
I hope I explained it clearly. I spent 3 days searching for a solution and came up with this slow method:
$currentBalance = Transaction::where('date', '<=', $item->date)->get(['id']);
$array = array();
foreach ($currentBalance as $a) {
$array[] = $a->id;
}
$newBalanceA = array_slice($array, 0, array_search($item->id, $array) + 1);
$currentBalance = Transaction::whereIn('id', $newBalanceA)->sum('amount');
return $currentBalance;
It is slow and dirty. I would appreciate a simple solution in 1 query, if this is possible.
I want to achieve something like this using php and mysql
If the customer has an account with rewards and wants to spend rewards, how do I update the table so that it subtracts the spent reward from the table?
It should take the customer's reward rows by customer_id and subtract the spent reward from them. If the first row (reward) is less than the spent value, it sets that row to 0, then goes to the next row and subtracts what is left over from the previous row, until the remaining spent value is equal to 0.
sample:
spent = 60
id_customer = 2
I have a table like this
id | id_customer | reward
1 | 2 | 50
2 | 2 | 20
3 | 3 | 100
4 | 4 | 5
the result should be something like this:
1st row: 50 (value of first row) - 60 = 0 (with 10 still left to spend)
2nd row: 20 (value of 2nd row) - 10 (remaining spend from first row) = 10
id | id_customer | reward
1 | 2 | 0
2 | 2 | 10
3 | 3 | 100
4 | 4 | 5
Hope that makes sense. Thanks
This is the logic of my solution (of course, there may be other and better ones):
Get the set of rows for that customer (id_customer = 2) and loop through the rows returned.
In each iteration, compare the value of the reward field against the amount you would like to subtract (60).
If the actual value is >= 60, update that row and exit. If not, update it with 0, reduce the remaining amount (60 - row value) and go to the next item in the iteration, doing the same action.
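The loop described above can be sketched in plain PHP as follows. The function name spendRewards and the id => reward array layout are assumptions for illustration, and the up-front balance check is an addition that the answer does not mention; writing the new values back to the table is left out.

```php
<?php
// Hypothetical helper: $rows maps row id => reward for ONE customer, in id order.
// Returns the new reward value per row id after spending $spent points.
function spendRewards(array $rows, int $spent): array
{
    // Assumption: spending more than the customer owns is an error.
    if ($spent > array_sum($rows)) {
        throw new InvalidArgumentException('Not enough reward points');
    }
    foreach ($rows as $id => $reward) {
        if ($spent === 0) {
            break; // nothing left to subtract, later rows stay untouched
        }
        // Take as much as this row can give, but no more than is still owed.
        $used = min($reward, $spent);
        $rows[$id] = $reward - $used;
        $spent -= $used;
    }

    return $rows;
}

// Customer 2 from the question: rows 1 and 2, spending 60 points.
$result = spendRewards([1 => 50, 2 => 20], 60);
// $result is [1 => 0, 2 => 10], matching the expected table in the question.
```

Each changed entry would then need one UPDATE per row id, ideally inside a transaction so that a failure halfway through does not leave the rows partly spent.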
In MySQL, I think the best way to do this is with variables. This should work:
set @spent := 60;
update tablelikethis
    set reward = (case when @spent = 0 then reward
                       when @spent >= reward
                       then (case when (@tmp := @spent) is null then NULL
                                  when (@spent := @spent - reward) is null then NULL
                                  else 0
                             end)
                       else (case when (@tmp := @spent) is null then NULL
                                  when (@spent := 0) is null then NULL
                                  else reward - @tmp
                             end)
                  end)
    where id_customer = 2
    order by id;
MySQL makes this a little hard to do in a single update, because you cannot use order by with a join. The variable version just has to deal with logic on whether the amount remaining for the reward is bigger or less than the amount remaining being spent.
I'd structure your database in a different way. You could create a column called "reward_points" in the customer table, and have a separate reward table. The structure is:
REWARD_TABLE
----------------------------------
reward_id | customer_id | reward
----------------------------------
1 | 2 | 50
2 | 2 | 20
3 | 3 | 100
4 | 4 | 5
CUSTOMER_TABLE
-------------------------------------------------
customer_id | name | reward_points
-------------------------------------------------
1 | Eddard Stark | 0
2 | Jaime Lannister | 70
3 | Joffrey Baratheon | 100
4 | Theon Greyjoy | 5
Then you could just update the CUSTOMER_TABLE with the new value. You could keep the REWARD_TABLE as a 'reward history'. Even better: upon purchase, you could add a negative transaction to the REWARD_TABLE, so that when you run SELECT SUM(reward) AS Reward FROM reward_table WHERE customer_id = 2 GROUP BY customer_id, it counts all the negative transactions as well, which gives you the current balance and is close to your concept.
I have a SQL table being created daily from a CSV file of product info that is downloaded from a supplier's website. Everything is being created correctly and all the tables have an identical structure. The problem I need solved is comparing the tables for today and yesterday (table names are the dates in the format mm-dd-yyyy). I need to compare a few different columns for different things:
I need to know all products that are in today's data that weren't in yesterday's (can be checked by supplier SKU)
I need to know all products that were in yesterday's data that are no longer in today's
I need to know when the price went up from yesterday's data
I need to know when the price has gone down from yesterday's data
I need to know when a sale has started based on yesterday's data, as well as stopped
These need to show the following labels in the table that will show the changes
regular up
regular down
miscellaneous change (description change or change to fields that aren't a priority)
promo on (discount added from supplier)
promo off (discount taken off by supplier)
delete (no record of the product in new list {probably been deleted})
new item (new record of product in new list)
out of stock
I have been searching everywhere for the answer for these issues and have found stuff that kind of shows me how to do this using union and join but I don't fully understand how to use them based on this scenario.
I have tried different PHP solutions by going through each piece of data and searching for the sku in the new table and vice versa, then checking for any changes if they exist in both tables, but this is taking a really long time and I have over 200,000 products in these tables. I am hoping that I can do this in fewer queries by letting the SQL server do more of the work than the PHP script.
Thanks for all the help!
Yesterday's Table
__________________________________________________________
| id | price | sale | description | qty | sku |
---------------------------------------------------------
| 1 | 12.50 | 0.00 | description product 1 | 12 | 12345 |
| 2 | 22.99 | 20.99 | describe the problem | 1 | 54321 |
| 3 | 192.99 | 0.00 | description ftw | 5 | 53421 |
| 4 | 543.52 | 0.00 | description | 15 | 45121 |
----------------------------------------------------------
Today's Table
__________________________________________________________
| id | price | sale | description | qty | sku |
---------------------------------------------------------
| 1 | 12.50 | 0.00 | description product 1 | 12 | 12345 |
| 2 | 22.99 | 0.00 | describe the problem | 1 | 54321 |
| 3 | 192.99 | 50.00 | description ftw | 5 | 53421 |
| 4 | 523.99 | 0.00 | description | 15 | 45123 |
----------------------------------------------------------
I need the new table to look like the following
_____________________________________________________________
| id | sku | label | description | price |
-------------------------------------------------------------
| 1 | 54321 | promo off | describe the problem | 22.99 |
| 2 | 53421 | promo on | description ftw | 192.99|
| 3 | 45123 | new item | description | 523.99|
| 4 | 45121 | delete | description | 543.52|
-------------------------------------------------------------
The following is the code I have for the deleted and new items currently. I am using int for the label/status in the example below and just signifying the different numbers.
$deleted = mysql_query("SELECT * FROM `test1`") or die(mysql_error());
while($skus= mysql_fetch_array($deleted))
{
$query = mysql_num_rows(mysql_query("SELECT * FROM `test2` WHERE SKU='".$skus['sku']."'"));
if($query < 1)
{
$tata= mysql_query("INSERT INTO `gday` (Contract_Price, SKU, status) VALUES (".$skus['price'].", ".$skus['sku'].", 1)");
}
}
$deleted = mysql_query("SELECT * FROM `test2`") or die(mysql_error());
while($skus= mysql_fetch_array($deleted))
{
$query = mysql_num_rows(mysql_query("SELECT * FROM `test1` WHERE SKU='".$skus['sku']."'"));
if($query < 1)
{
$tata= mysql_query("INSERT INTO `gday` (Contract_Price, SKU, status) VALUES (".$skus['price'].", ".$skus['sku'].", 2)");
}
}
EDIT:
The Following is the true table that all the data will be going into. I originally didn't want to muddy the water with the large table but by request I have included it.
ID
Status
DiscountDate
Price
Discount
DiscountEndDate
Desc1
Desc2
Desc3
Warranty
Qty1
Qty2
PricingUnit
PriceUpdate
Vendor
Category
UPC
Weight
WeightUnit
Because of the size of your database, I would propose an SQL solution.
When you work with a lot of data, PHP can be slow for several reasons.
You can try writing some SQL functions or triggers instead.
You just have to store the status data in another table. This is the documentation for writing a trigger:
http://dev.mysql.com/doc/refman/5.0/en/triggers.html
Be careful: if you have a lot of operations on your table, I would suggest using PostgreSQL instead of MySQL, because I found it simpler to write functions, triggers, etc. using PL/pgSQL.
EDIT: You just have to write a simple function.
I'm working on it at the moment. Think about using SELECT CASE like in this answer
EDIT: Core function
DELIMITER |
CREATE PROCEDURE export()
BEGIN
(SELECT today.id, today.sku,
    CASE
        WHEN today.price = yesterday.price THEN 'nothing'
        WHEN today.price < yesterday.price THEN 'promo on'
        ELSE 'promo off' END AS label
FROM today, yesterday WHERE yesterday.sku = today.sku)
UNION
(
SELECT today.id, today.sku,
'new item' AS label
FROM today LEFT JOIN yesterday ON yesterday.sku = today.sku WHERE yesterday.sku IS NULL)
UNION
(
SELECT yesterday.id, yesterday.sku,
'delete' AS label
FROM yesterday LEFT JOIN today ON today.sku = yesterday.sku WHERE today.sku IS NULL
);
END|
DELIMITER ;
To call just do:
CALL export();
Here is an example of a possible core function. Be careful: in this case the id values coming from the two tables could be the same, so in the function you'll have to generate your own id for the first column.
If you need performance to display it faster in PHP, think about APC cache
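For comparison, the labelling rules from the question can also be sketched in PHP once both days are loaded into arrays keyed by SKU (two queries in total instead of one per product). The function name labelChanges, the array layout, and the priority order of the checks are assumptions; the "out of stock" label is left out because the question does not define it.

```php
<?php
// Hypothetical helper: both arguments map sku => ['price', 'sale', 'description'].
// Returns sku => label for every product that changed between the two days.
function labelChanges(array $yesterday, array $today): array
{
    $labels = [];
    foreach ($today as $sku => $new) {
        if (!isset($yesterday[$sku])) {
            $labels[$sku] = 'new item';
            continue;
        }
        $old = $yesterday[$sku];
        // Assumed priority: promo changes first, then price, then anything else.
        if ($old['sale'] == 0 && $new['sale'] > 0) {
            $labels[$sku] = 'promo on';
        } elseif ($old['sale'] > 0 && $new['sale'] == 0) {
            $labels[$sku] = 'promo off';
        } elseif ($new['price'] > $old['price']) {
            $labels[$sku] = 'regular up';
        } elseif ($new['price'] < $old['price']) {
            $labels[$sku] = 'regular down';
        } elseif ($new['description'] !== $old['description']) {
            $labels[$sku] = 'miscellaneous change';
        }
    }
    // Anything present yesterday but missing today has been deleted.
    foreach (array_diff_key($yesterday, $today) as $sku => $old) {
        $labels[$sku] = 'delete';
    }

    return $labels;
}

// Sample data copied from the two tables in the question.
$yesterday = [
    '12345' => ['price' => 12.50, 'sale' => 0.00, 'description' => 'description product 1'],
    '54321' => ['price' => 22.99, 'sale' => 20.99, 'description' => 'describe the problem'],
    '53421' => ['price' => 192.99, 'sale' => 0.00, 'description' => 'description ftw'],
    '45121' => ['price' => 543.52, 'sale' => 0.00, 'description' => 'description'],
];
$today = [
    '12345' => ['price' => 12.50, 'sale' => 0.00, 'description' => 'description product 1'],
    '54321' => ['price' => 22.99, 'sale' => 0.00, 'description' => 'describe the problem'],
    '53421' => ['price' => 192.99, 'sale' => 50.00, 'description' => 'description ftw'],
    '45123' => ['price' => 523.99, 'sale' => 0.00, 'description' => 'description'],
];
$labels = labelChanges($yesterday, $today);
```

With 200,000 products, holding both days in memory may or may not be acceptable, so this is a sketch of the rules rather than a replacement for doing the comparison in SQL.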
I have this data from data base.
+----+---------------------+----------+
| id | date time           | duration |
+----+---------------------+----------+
| 3  | 2012-12-20 09:28:53 | ?        |
| 1  | 2012-12-20 19:44:10 | ?        |
| 2  | 2012-12-23 16:25:15 |          |
| 4  | 2012-12-23 18:26:16 |          |
| 4  | 2012-12-24 08:01:27 |          |
| 5  | 2012-12-29 20:57:33 |          |
| 5  | 2012-12-29 20:57:33 |          |
+----+---------------------+----------+
duration for id #1 should be equal to the date id #2 - id #1
duration for id #2 should be equal to the date id #3 - id #2
While if the id is the same, it will be added.
Sorry this is only in my mind, still very new in php so I don't know how to start.
Any help is appreciated.
Thanks
edited sorted by date. duration or total time for 1st record = 2nd record - 1st record
If I understand you correctly, there is something with id=3 that starts at "2012-12-20 09:28:53" and ends at "2012-12-20 19:44:10", and in the same second something with id=1 starts, and you want to know how long each one lasts?
I would loop over all records, but I'd start from the end (in SQL: ...ORDER BY date_time DESC), assuming that the last one (i.e. id=5, which started on 2012-12-29 20:57:33) ends now. Then I calculate the duration as the subtraction of dates (end minus beginning), then I take the event's beginning as the end of the previous one, and so on.
An example (not tested):
$end=time(); // current Unix timestamp
$dbh=new PDO(...); // here you need to connect to your db, see manual
// run the query once, then fetch row by row (calling query() inside the loop condition would restart it every time)
$stmt=$dbh->query('SELECT id, datetime FROM table_name ORDER BY datetime DESC');
while($record=$stmt->fetchObject()){
    $start=strtotime($record->datetime); // convert the datetime string to a timestamp
    echo $duration=$end-$start; // duration in seconds
    $end=$start; // here I say that next (in sense of loop, in fact it is previous) record will end at the moment where this record started
}
This doesn't sum because I don't know how are you going to store your data, but I think you will manage with this.
EDITED
At the beginning I define some variables:
$durations=array(); // this will hold durations
$ids=array(); // this will hold `id`-s
$last_id=-1; // a value that does not exist as an id
Then the code follows and instead of echo I put this:
$duration=$end-$start;
if($last_id==$record->id){ // is this the same id as before?
    $durations[count($durations)-1]+=$duration; // if yes, add to previous value
}
else { // a new id
    $durations[]=$duration; // add new duration to array of durations
    $ids[]=$record->id; // add new id to array of ids
    $last_id=$record->id; // update $last_id
}
and then $end=$start as above.
To view all durations and ids simply
for($i=0;$i<count($durations);$i++){
echo 'id='.$ids[$i].', duration='.$durations[$i].'<br />';
}
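The same idea can be sketched as a self-contained function that works on an array instead of a database result. It differs from the loop above in two labelled ways: it walks the records in ascending order, and it adds up all durations per id rather than only merging neighbouring rows. The function name sumDurations and the array layout are assumptions.

```php
<?php
// Fix the timezone so strtotime() gives the same result on every machine.
date_default_timezone_set('UTC');

// Hypothetical helper. $records: list of ['id' => int, 'datetime' => string],
// sorted ascending by datetime. $now: Unix timestamp taken as the end of the
// last record (the same assumption the answer makes).
// Returns id => total duration in seconds.
function sumDurations(array $records, int $now): array
{
    $totals = [];
    $count = count($records);
    for ($i = 0; $i < $count; $i++) {
        $start = strtotime($records[$i]['datetime']);
        // A record ends where the next one starts; the last one ends at $now.
        $end = ($i + 1 < $count) ? strtotime($records[$i + 1]['datetime']) : $now;
        $id = $records[$i]['id'];
        $totals[$id] = ($totals[$id] ?? 0) + ($end - $start);
    }

    return $totals;
}

// First three rows of the question's table; "now" is set to the start of the
// fourth row purely to keep the example deterministic.
$records = [
    ['id' => 3, 'datetime' => '2012-12-20 09:28:53'],
    ['id' => 1, 'datetime' => '2012-12-20 19:44:10'],
    ['id' => 2, 'datetime' => '2012-12-23 16:25:15'],
];
$totals = sumDurations($records, strtotime('2012-12-23 18:26:16'));
// $totals[3] is 36917 seconds, i.e. 10 hours, 15 minutes and 17 seconds.
```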
Note these tables are in reverse order.
I have a `log` table that saves log records (amount earned, etc.) of employees, and code that separates the data into tables grouped under each employee id:
Empid: 0001
---------------------------
| Logid | Hours | Pay |
---------------------------
| 1001 | 10 | 50 |
---------------------------
| 1002 | 2 | 10 |
---------------------------
Empid: 0003
---------------------------
| Logid | Hours | Pay |
---------------------------
| 1003 | 3 | 9 |
---------------------------
| 1004 | 6 | 18 |
---------------------------
I managed this with the following semi-pseudocode:
$query = mysql_query("SELECT * FROM `log` ORDER BY empid");
$id = 0;
while ($list = mysql_fetch_assoc($query)) {
    if ($id != $list['empid']) {
        create header (Logid, Hours, Pay)
        $id = $list['empid'];
    }
    add each data row for the empid
}
But now I would like to add the total of the Pay column and put it at the bottom of each table for each empid.
By putting the code $total_pay = $total_pay + $list['pay'] in the while loop I can get the total pay but I can't figure out how I might be able to show the total at the bottom.
Would really appreciate any advice on this!
This should do it. You basically sum up until the id is changing.
$sum = 0;
$id = null;
while ($list = mysql_fetch_assoc($query)) {
    if ($id != $list['empid']) {
        if ($id !== null) {
            //create the totals using $sum !!!
        }
        // after that re-set sum to 0
        $sum = 0;
        //create header (Logid, Hours, Pay)
        $id = $list['empid'];
    }
    $sum += $list['pay'];
    //add each data row for the empid
}
//create the totals for the last empid using $sum
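A runnable sketch of the same "sum until the id changes" loop, using a plain array in place of the database result so it can be tried without a connection. The function name buildPayReport, the lowercase column keys, and the plain-text output format are assumptions; the rows must already be sorted by empid.

```php
<?php
// Hypothetical helper: $rows must be sorted by empid, as the query's ORDER BY does.
// Returns text lines: a header per employee, its log rows, then a total line.
function buildPayReport(array $rows): array
{
    $lines = [];
    $currentId = null;
    $sum = 0;
    foreach ($rows as $row) {
        if ($row['empid'] !== $currentId) {
            // A new employee starts: close the previous table first, if any.
            if ($currentId !== null) {
                $lines[] = "Total pay: $sum";
            }
            $lines[] = "Empid: {$row['empid']}";
            $currentId = $row['empid'];
            $sum = 0;
        }
        $lines[] = "{$row['logid']} | {$row['hours']} | {$row['pay']}";
        $sum += $row['pay'];
    }
    // The loop ends before the last employee's total is written.
    if ($currentId !== null) {
        $lines[] = "Total pay: $sum";
    }

    return $lines;
}

// Sample rows copied from the two tables in the question.
$rows = [
    ['empid' => '0001', 'logid' => 1001, 'hours' => 10, 'pay' => 50],
    ['empid' => '0001', 'logid' => 1002, 'hours' => 2, 'pay' => 10],
    ['empid' => '0003', 'logid' => 1003, 'hours' => 3, 'pay' => 9],
    ['empid' => '0003', 'logid' => 1004, 'hours' => 6, 'pay' => 18],
];
$lines = buildPayReport($rows);
echo implode("\n", $lines), "\n";
```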
Also...
Please, don't use mysql_* functions in new code. They are no longer maintained and are officially deprecated. See the red box? Learn about prepared statements instead, and use PDO, or MySQLi - this article will help you decide which. If you choose PDO, here is a good tutorial.
There are two ways that you can do this.
PHP
Keep a running total of all of the "pay" values, and add it into your table at the bottom. For example:
$i = 0;
$total_pay = 0;
while ($list = mysql_fetch_assoc($query)) { // for each row in your results
    if ($id != $list['EmployeeId']) { // Only true when the EmployeeId doesn't equal $id: either $id isn't set yet, or a new employee has started
        $i++; // count the employees seen so far
        if ($i > 1) { // Skip this for the first employee: there is no previous table yet, so a footer row wouldn't make any sense
            create_footer(EmployeeId, Hours, $total_pay); // Log Id is irrelevant here
        }
        // reset your variables here
        $id = $list['EmployeeId']; // set $id to the first or the new Employee ID
        $total_pay = $list['pay']; // This is the first row for this employee, so start a fresh total instead of adding to the old one
        create_header(EmployeeId, Hours, Pay); // Create the top half of your table
    } else { // Same employee as before: we only need to update the running total
        $total_pay = $total_pay + $list['pay'];
    }
    // add a data row for each LogId. This executes every time we go through the loop
    create_normal_row(LogId, EmployeeId, Hours, Pay);
}
// At this point, every employee has a header and all data rows. However, we left the loop before we could add the last employee's footer row
// Let's add one more footer row for the last employee
create_footer(EmployeeId, Hours, $total_pay);
SQL
MySQL has a GROUP BY modifier called WITH ROLLUP that does something very similar to what you are trying to do. You can read more about it here:
http://dev.mysql.com/doc/refman/5.0/en/group-by-modifiers.html
Basically, you would change your query to work like this:
SELECT LogId, EmployeeId, SUM(Hours), SUM(Pay) FROM `log`
GROUP BY EmployeeId, LogId WITH ROLLUP
This query will return a dataset that looks like this:
---------------------------------------
| Logid | EmployeeId| Hours | Pay |
---------------------------------------
| 1001 | 1 | 10 | 50 |
---------------------------------------
| 1002 | 1 | 2 | 10 |
---------------------------------------
| NULL | 1 | 12 | 60 |
---------------------------------------
| 1003 | 2 | 3 | 9 |
---------------------------------------
| 1004 | 2 | 6 | 18 |
---------------------------------------
| NULL | 2 | 9 | 27 |
---------------------------------------
| NULL | NULL | 21 | 87 |
---------------------------------------
Whenever $list['Logid'] is null, you know that you have a "total" row. Be careful though, this will add a "sum of all employees" row at the bottom of your dataset. If $list['EmployeeId'] is null, then you know you're in this "total" row.
On a related note (I'm not sure if this is what you're asking for), you can show this stuff in a table by using HTML <table> elements.
Each row would look like this:
<table> <!-- shown at the beginning of each table -->
<tr> <!-- shown at the beginning of each row -->
<td> <!-- shown at the beginning of each table cell -->
Your text goes here
</td> <!-- shown at the end of each table cell -->
<td>
More text can go here
</td>
</tr> <!-- shown at the end of each row -->
</table> <!-- shown at the end of each table -->
<tr>s can be repeated indefinitely within each <table>, and <td>s can be repeated within <tr>s.