I'm looking for a little bit of direction for how to analyze a problem. I work for a small manufacturing company. We paint about 150 items per day. Those items then go to Quality Control. About 70% pass QC. The remaining 30% have to be repaired in some way.
We have 5 different repair categories: Repaint, Reclear, Remake, Reglaze, Fix
Every time an order gets QC'd, my system inserts a row into a "Repairs" MySQL table. If it passes QC, it's given a category of Great. Its structure is like this:
id | Repair | Date
5 | repaint| 2013-01-01
6 | reclear| 2013-01-01
5 | great | 2013-01-02 ...etc
I need to be able to perform analysis on what actions are happening. I'd like to know what 'paths' items are going down.
For example: what percentage of items have the categories Reclear->Repaint->Great? What percentage have Repaint->Repaint->Remake->Great? (Every item should eventually end with 'Great'.)
I'm kind of stuck on where to start in figuring out how to analyze this.
Should I be keeping track of the repair number in the table? If I did that, then maybe I could use a self join to select orders where repairnum=1 AND repair='Repaint', joined with repairnum=2 AND repair='Great'. This would tell me which orders went down the path Repaint->Great. I'm a little hesitant to go this route because 1) I don't want to have to run a query to get the repair number before I insert a new row into the table, and 2) it seems like I'd need some pretty nasty queries to analyze items that have 5 or 6 (or more) repairs.
Perhaps someone can point me in the right direction?
My app is in PHP and MySQL.
You don't need a separate "repair number", because you have the date when each repair was made, so can order by that (assuming you store time as well if more than one repair can be made in a day).
The "path" for an item is the list of its repairs, in order of date. If you just say SELECT repair FROM repairs WHERE id=5 ORDER BY date ASC you'll get them as rows.
The trick is to turn these into a single value representing the whole path, using GROUP_CONCAT - SELECT GROUP_CONCAT(repair ORDER BY date ASC SEPARATOR '->') FROM repairs WHERE id=5
Once you have that, you can run that for all products in the DB using a GROUP BY, and then look for patterns in it with HAVING:
SELECT
id,
GROUP_CONCAT(repair ORDER BY date ASC SEPARATOR '->') as path
FROM
repairs
GROUP BY
id
HAVING
path = 'Repaint->Repaint->Remake->Great'
Note that I don't have a copy of MySQL to try this out with, so I may have made a mistake, but the manual suggests that the above should work.
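The question also asks for the percentage of items on each path. As a rough sketch of that step outside the database, the same grouping can be done in Python (rather than the PHP the poster uses). The sample rows and the function name `path_percentages` are made up for illustration, and the code assumes dates are stored as ISO strings so that sorting them gives repair order.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows shaped like the Repairs table: (id, repair, date).
rows = [
    (5, "repaint", "2013-01-01"),
    (6, "reclear", "2013-01-01"),
    (5, "great", "2013-01-02"),
    (6, "repaint", "2013-01-02"),
    (6, "great", "2013-01-03"),
    (7, "repaint", "2013-01-01"),
    (7, "great", "2013-01-03"),
]


def path_percentages(rows):
    """Return {path: percentage of items that followed that path}."""
    by_item = defaultdict(list)
    # ISO dates sort correctly as strings, so this yields each item's repair order.
    for item_id, repair, _date in sorted(rows, key=lambda r: r[2]):
        by_item[item_id].append(repair)
    # Same idea as GROUP_CONCAT: one '->'-joined string per item.
    counts = Counter("->".join(steps) for steps in by_item.values())
    total = len(by_item)
    return {path: 100.0 * n / total for path, n in counts.items()}


percentages = path_percentages(rows)
```

If more than one repair can happen on the same day, the date alone is not enough to order them, which is the same caveat as in the answer above.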
I am trying to build a simple randomised voting system for a site. Currently, and I believe wrongly, I have the following setup:
As the user goes to the random voting section of the site, he is presented with a votable item that, firstly, was last voted on the longest time ago; secondly, isn't something the user has voted for before; and thirdly, is relevant to them based on their list of relevant subjects. As a side note, if the user skips the vote, the same item can't be shown to them again later in the list, and every single vote must be recorded to produce statistics.
Currently, the way I am doing this is by holding a serialized array against their account in the database containing a list of vote item Id numbers that they have previously voted for.
I would like to say at this point that I don't condone this, and inserting a serialized array into the database was silly, and I regret my actions ;) .
Nevertheless, I couldn't figure out another way to do this. At this point, the array grows as users vote, which adds to the list. But the query would become massive if I were to continue and someone had up to 100 things they could vote for. I am currently using this query to get the next item in the list:
//$finarr is the received unserialized list of values from the database
$sql = "SELECT * FROM vote_item_headers WHERE Id NOT IN (".$finarr.") ORDER BY LastVoted DESC LIMIT 1";
//execute query and display results
I should also note that at this point, I haven't even bothered adding in the third requirement above because frankly, I didn't even know where to begin or anything when I had to make a query that needed to encompass all of the requirements above.
A bit more information you might find relevant:
I have some other tables which are explained below.
core_users = list of users and their interests
core_interests = a list of all interests
core_language = a list of the different possible languages that an item could fall into
vote_item_headers = a list of all the votable items with a reference in the interests and lang tables to define extra properties of the votable item.
core_votes = the master list of people's votes
I am really sorry if this is too vague for you guys, all I really want is guidance in this instance when dealing with large amounts of information that needs to be combined to get a result.
Any suggestions welcome. I am happy to restructure the entire thing just to get it right.
voted
user_id | Id
----------+------
1 | 12
1 | 14
1 | 187
2 | 23
SELECT * FROM vote_item_headers
WHERE Id NOT IN (
SELECT Id FROM voted WHERE user_id=1234
)
ORDER BY LastVoted DESC LIMIT 1
But you also want relevant posts:
relevant_posts
user_id | Id
----------+------
1 | 342
1 | 253
1 | 32
2 | 53
SELECT vote_item_headers.* FROM vote_item_headers
# cut down the amount returned with relevant_posts table
INNER JOIN relevant_posts ON (
vote_item_headers.Id=relevant_posts.Id
AND user_id=1234
)
WHERE vote_item_headers.Id NOT IN (
SELECT Id FROM voted WHERE user_id=1234
)
ORDER BY LastVoted DESC LIMIT 1
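To try the combined query without a MySQL server, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The table contents and the helper name `next_item` are invented for the example. One deliberate difference: it orders by LastVoted ASC, on the assumption that "last voted on the longest time ago" means the oldest LastVoted value should come first; the queries above use DESC, as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal versions of the three tables discussed above.
conn.executescript("""
CREATE TABLE vote_item_headers (Id INTEGER PRIMARY KEY, LastVoted TEXT);
CREATE TABLE voted (user_id INTEGER, Id INTEGER);
CREATE TABLE relevant_posts (user_id INTEGER, Id INTEGER);
INSERT INTO vote_item_headers VALUES
    (1, '2013-01-05'), (2, '2013-01-01'), (3, '2013-01-03'), (4, '2012-12-01');
INSERT INTO voted VALUES (1, 2);
INSERT INTO relevant_posts VALUES (1, 1), (1, 2), (1, 3);
""")


def next_item(conn, user_id):
    """Return the Id of the next item to show this user, or None."""
    row = conn.execute(
        """
        SELECT h.Id
        FROM vote_item_headers AS h
        INNER JOIN relevant_posts AS r
            ON r.Id = h.Id AND r.user_id = ?
        WHERE h.Id NOT IN (SELECT Id FROM voted WHERE user_id = ?)
        ORDER BY h.LastVoted ASC  -- oldest vote first (assumption, see above)
        LIMIT 1
        """,
        (user_id, user_id),
    ).fetchone()
    return row[0] if row else None
```

Passing the user id as a bound parameter, rather than concatenating it into the SQL string, also avoids the injection risk of building the query by hand.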
I have an online shopping cart; at checkout the user enters his zipcode.
There are 2 payment methods, cash-on-delivery (COD) and net-banking. The courier service ships only to certain areas (identified by zipcode), and the allowed lists of zipcodes for COD and net-banking differ. (Length of list = about 2,500 for COD, and about 10,000 for net-banking.)
Should I store these lists in a database or a flat file?
For the database, I will be querying using SELECT, and for the file, I can read the entire (or partial) list into an array and then do a binary search on it.
Which one would be faster, considering the following points:
There is only one courier service now, but in future there will be more, with different lists of their own. So I need to search in multiple lists.
It is mostly reads; writes would be much less frequent. Also, the list should be customisable at a later point.
I would have selected Database, but I don't know if it would make things slower, and I don't want to spend time designing database, when a file might be better.
EDIT:
Say there are 2 courier companies ABC and DEF.
For file I will have 4 files (say) ABC_COD.txt, ABC_net.txt, DEF_COD.txt, DEF_net.txt. So if a customer goes for COD, I search ABC_COD, if not in there, I search DEF_COD and so on. So ok this seems to be costly, but it is also easily extensible.
Now consider database, I will have a table Allowed_zipcodes, with five columns : zipcode(int/varchar(6)), ABC_COD(boolean), ABC_net(boolean), DEF_COD(boolean), DEF_net(boolean). If the x company offers cod for y code, the corresponding column has true, otherwise false.
While this seems good for lookup, adding a company involves a change in schema.
Please consider future changes and design as well.
Database, without any hint of a doubt. More logical, and more scalable.
For some reason I think you should look at the Magento framework; isn't it already in some of the packages?
But if you want to do it yourself: Just to give you a starting point on the database model:
carrier
id(int) | name (varchar)
zipcodes
start(int) | end(int) | carrier(fk::carrier.id)
For instance:
carrier
1 | UPS
2 | fedex
zipcodes
1000 | 1199 | 2
1000 | 1099 | 1
Querying your zipcode and available carriers:
SELECT carrier.name
FROM zipcodes
LEFT JOIN carrier ON zipcodes.carrier = carrier.id
WHERE
zipcodes.end >= :code
AND
zipcodes.start <= :code
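The range lookup above can be sketched in plain Python to check the logic. The data mirrors the example rows; the extra `method` field is an assumption added here (it is not part of the model above) to cover the question's separate COD and net-banking lists, and the function name `carriers_for` is hypothetical.

```python
# Hypothetical data mirroring the carrier and zipcodes tables above.
carriers = {1: "UPS", 2: "fedex"}

# (start, end, carrier_id, method); "method" is an assumed extra column.
zip_ranges = [
    (1000, 1199, 2, "cod"),
    (1000, 1099, 1, "cod"),
    (1000, 1199, 1, "net"),
]


def carriers_for(code, method):
    """Names of carriers that serve this zipcode for this payment method."""
    return sorted(
        carriers[carrier_id]
        for start, end, carrier_id, range_method in zip_ranges
        if range_method == method and start <= code <= end
    )
```

With this shape, adding a courier means adding rows rather than columns, so no schema change is needed.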
I'm working on a horse rating system and I need to assign values to each horse based on the value of another (already filled) field (all this is stored in a MySQL db).
Consider the following simplified example:-
A four horse race where the odds for each horse are as follows:-
Horse A - 2/1
Horse B - 3/1
Horse C - 3/1
Horse D - 5/1
As Horse A has the lowest price, I want to give it a value of 1.
However, Horse B and C have the same price and so I want to give them both 2.
Horse D has the next highest price and so I want to give it the value of 3.
When I first started to do this, I thought it would be easy but it has now reached the stage where the loops are driving me loopy. Any suggestions would be greatly appreciated.
Many thanks in advance.
In view of the response I received below from Daan then I should also add that my problem is further compounded by the fact that my table has several subsets (i.e. it contains more than one race on any given day and they need to be ranked individually).
My table is currently:-
racedate | racetime | racecourse | horsename | forecast | forecast_rate | id
The racedate for the purposes of this will always be the same. The racetime and racecourse together identify the race in question.
forecast is the price given to each horse (this has already been entered at this stage) and this is what needs to have the shared ranking done on it to be stored in forecast_rate.
id is just the unique index for each entry in the table.
This is what I have now got to (and it doesn't work... surprise...)
$testdude=mysql_query("SELECT DISTINCT racecourse,racetime FROM picking") or die(mysql_error());
while($rih=mysql_fetch_array($testdude)){
$testdude1=mysql_query("SELECT s1.forecast, s1.horsename, COUNT(DISTINCT s2.forecast) AS rank FROM picking s1 JOIN picking s2 ON (s1.forecast <= s2.forecast) GROUP BY s1.horsename;");
while($rih1=mysql_fetch_array($testdude1)){
mysql_query("UPDATE picking SET forecast_rate='$testdude1[rank]' where horsename='$testdude1[horsename]'") or die(mysql_error());
}
}
This is called shared ranking and it's easiest to do this in MySQL. Take a look at this tutorial and see whether you can get that to work. If not, please provide more details about your table lay-out, and I'll get you a tailored example :)
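For reference, the per-race shared ranking described in the question can be sketched in Python as below. This is only an illustration of the logic, not the tutorial's method. It assumes forecast is stored as a fraction string such as "3/1", and the sample rows, course names and the function name `shared_ranks` are invented.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical rows: (racetime, racecourse, horsename, forecast).
runners = [
    ("14:00", "Ascot", "Horse A", "2/1"),
    ("14:00", "Ascot", "Horse B", "3/1"),
    ("14:00", "Ascot", "Horse C", "3/1"),
    ("14:00", "Ascot", "Horse D", "5/1"),
    ("14:30", "York", "Horse E", "7/2"),
    ("14:30", "York", "Horse F", "6/4"),
]


def shared_ranks(runners):
    """Return {((racetime, racecourse), horsename): rank}, ranked within each race."""
    races = defaultdict(list)
    for racetime, racecourse, horse, forecast in runners:
        # racetime and racecourse together identify the race.
        races[(racetime, racecourse)].append((horse, Fraction(forecast)))

    ranks = {}
    for race, entries in races.items():
        # Rank the distinct prices, so equal prices share a rank with no gaps.
        distinct_prices = sorted({price for _, price in entries})
        rank_of = {price: i + 1 for i, price in enumerate(distinct_prices)}
        for horse, price in entries:
            ranks[(race, horse)] = rank_of[price]
    return ranks


ranks = shared_ranks(runners)
```

The key point is that the ranking is computed inside each (racetime, racecourse) group; the query in the question joins the whole table to itself without restricting to one race.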
I have data such as...
ID | Amount
-----------------
1 | 50.00
2 | 40.00
3 | 15.35
4 | 70.50
etc. And I have a value I'm working up to, in this case let's say 100.00. I want to get all records up to 100.00 in order of the ID. And I want to grab one more than that, because I want to fill it up all the way to the value I'm aiming for.
That is to say, I want to get, in this example, records 1, 2, and 3. The first two total up to 90.00, and 3 pushes the total over 100.00. So I want a query to do that for me. Does such a thing exist in MySQL, or am I going to have to resort to PHP array looping?
Edit:
To put it in English terms: Let's say they have $100 in their account. I want to know which of their requests can be paid, either in toto or partially. So I can pay off the $50 and the $40, and part of the $15.35. I don't care, at this point in the program, about the partialness; I only want to find out which qualify in any way.
Yes, it is possible:
set @total:=0;
select * from
(
select *, if(@total>100, 0, 1) as included, @total:=@total+Amount
from your_table
order by id
) as alls
where included=1
order by id;
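If the looping ends up in application code instead, the same running-total idea is short. Below is a sketch in Python (the poster would write the equivalent in PHP); the function name `payable` is made up, and it stops as soon as the balance is fully used, so a request is only included while some money is still left.

```python
from decimal import Decimal

# The sample data from the question: (ID, Amount).
requests = [
    (1, Decimal("50.00")),
    (2, Decimal("40.00")),
    (3, Decimal("15.35")),
    (4, Decimal("70.50")),
]


def payable(requests, balance):
    """IDs, in ID order, of requests that can be paid in full or in part."""
    selected = []
    running = Decimal("0")
    for request_id, amount in sorted(requests):
        if running >= balance:
            break  # nothing left to put towards this or any later request
        selected.append(request_id)
        running += amount
    return selected


selected = payable(requests, Decimal("100.00"))
```

For the sample data this selects records 1, 2 and 3, matching the outcome described in the question.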
Referring to the last sentence: doesn't MySQL's SUM cut it?
I have a table in Postgres called workorders. It has various columns; the ones I am interested in are labor, date_out and ident. This table ties up with wo_parts (workorder parts), where the columns I am interested in are part and workorder. Both are integers (part is an auto number). The final table is part2vendor, and its columns of interest are retail and cost. Basically, what happens is this: I create a workorder (invoice). This calls a part from part2vendor. I enter it and invoice it off. In workorders a row is created and saved, and it is given an ident. In wo_parts, the part I used is recorded, as well as the workorder number and the qty used. What I want to do is create a report in PHP that pulls all this info together on one page. I.e., if I choose dates 2009-10-01 to 2009-10-31, it will pull all workorders in this range and tell me the total labor sold and then the PROFIT (retail less cost) of the parts I sold, using these 3 tables. I hope I have explained this as clearly as possible. If you have any questions, please ask me. Thank you very much for your time.
You will want to read up on SQL - keywords to look for include "aggregate", "SUM" and "GROUP BY".
Your query will look something like this (but this will certainly need correcting):
SELECT
SUM(wo.labor) AS tot_labor,
SUM(p2v.retail - p2v.cost) AS tot_profit
FROM
workorders AS wo
JOIN wo_parts AS wp ON wo.ident=wp.ident [?]
JOIN part2vendor AS p2v ON ...something...
WHERE
date_out BETWEEN '2009-10-01'::date AND '2009-10-31'::date;
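One thing to watch for when joining workorders to their parts is that labor gets repeated once per part row, which inflates SUM(labor). The sketch below illustrates the intended arithmetic in Python with invented data. It rests on several assumptions that the question does not confirm: that part2vendor can be looked up by the part number stored in wo_parts, that wo_parts has a qty column, and that profit per line is (retail - cost) * qty. The function name `report` is hypothetical.

```python
from datetime import date

# Hypothetical rows for the three tables described in the question.
workorders = [
    {"ident": 1, "labor": 100.0, "date_out": date(2009, 10, 5)},
    {"ident": 2, "labor": 50.0, "date_out": date(2009, 10, 20)},
    {"ident": 3, "labor": 70.0, "date_out": date(2009, 11, 2)},
]
wo_parts = [
    {"workorder": 1, "part": 10, "qty": 2},
    {"workorder": 1, "part": 11, "qty": 1},
    {"workorder": 2, "part": 10, "qty": 1},
    {"workorder": 3, "part": 11, "qty": 4},
]
# Assumed to be keyed by part number.
part2vendor = {
    10: {"retail": 30.0, "cost": 20.0},
    11: {"retail": 15.0, "cost": 9.0},
}


def report(start, end):
    """Return (total labor, total parts profit) for workorders in the date range."""
    in_range = {w["ident"]: w for w in workorders if start <= w["date_out"] <= end}
    # Labor is summed once per workorder, not once per part line.
    total_labor = sum(w["labor"] for w in in_range.values())
    total_profit = sum(
        (part2vendor[p["part"]]["retail"] - part2vendor[p["part"]]["cost"]) * p["qty"]
        for p in wo_parts
        if p["workorder"] in in_range
    )
    return total_labor, total_profit


totals = report(date(2009, 10, 1), date(2009, 10, 31))
```

In SQL the same separation can be achieved by aggregating the parts profit per workorder in a subquery before joining it to workorders.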