The site supports 5 currencies. The sales prices of goods and delivery are set in euros, but the customer can choose a preferred currency in which prices are displayed.
Which is the better solution: keeping prices in all currencies in the "prices" table, or converting dynamically?
Structure of table "prices":
id
currency
variant_id
value
original_value
Dynamically. It would be a simple JOIN plus a calculation.
However, it would be tempting to ROUND(..,2) when displaying. This works for many Western currencies but does not work well for some of the Gulf states, which need 3 decimal places, or Korea, which does not really use jeon any more.
If you need to round to a different number of places, use a Stored Function. And/or perhaps the JOIN also gives the number of decimal places.
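As a rough sketch of the dynamic approach with per-currency decimal places, something like the following could work. The `CURRENCIES` table, its rates and the `displayPrice()` function are hypothetical names and made-up values for illustration; in practice the rate and the number of decimals would come from the JOIN mentioned above.

```php
<?php
// Hypothetical lookup table: exchange rate against EUR and decimal places.
// Rates are made-up illustration values, not real market rates.
const CURRENCIES = [
    'EUR' => ['rate' => 1.0,    'decimals' => 2],
    'USD' => ['rate' => 1.10,   'decimals' => 2],
    'KWD' => ['rate' => 0.34,   'decimals' => 3], // Kuwaiti dinar: 3 decimals
    'KRW' => ['rate' => 1450.0, 'decimals' => 0], // Korean won: no decimals
];

// Convert a price stored in EUR and format it for the chosen currency.
function displayPrice(float $eurPrice, string $currency): string
{
    if (!array_key_exists($currency, CURRENCIES)) {
        throw new InvalidArgumentException("Unknown currency: $currency");
    }
    $info = CURRENCIES[$currency];
    $converted = $eurPrice * $info['rate'];
    // Round to the currency's own number of decimals, no thousands separator.
    return number_format($converted, $info['decimals'], '.', '');
}

echo displayPrice(20.00, 'USD'), "\n"; // 22.00
echo displayPrice(20.00, 'KWD'), "\n"; // 6.800
echo displayPrice(20.00, 'KRW'), "\n"; // 29000
```

Keeping the decimal count next to the rate means the display code never hard-codes 2 decimal places.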
The answer depends on the business rule. If a price can be calculated by a simple conversion, then store only the price in some reference currency and convert it dynamically. However, you need to store a date with each rate to keep a rate history. When a price cannot be calculated dynamically, you should store it. A combined approach may be used, too.
Related
So I have a MySQL database, managed through phpMyAdmin, which stores all the information about a big list of products: the name, description, the image file name and the price. Some products have decimal numbers in their price, so I changed the datatype for productprice from int to float. Some products cost, for instance, 199.90; when I enter that value in the table, it is stored as 199.90, just as I want. But when I output the price on the website with PHP, it only outputs 199.9; the last zero is missing. Even though the two numbers have the same value, 199.9 looks much less appealing than 199.90, and I hope that is understandable. So, what am I supposed to do?
You can use number formatting in PHP:
$Text = number_format($Amount, $Decimals, $DecimalSeparator, $ThousandsSeparator);
But remember that floats are not always the best type for storing prices. Normally prices are stored as an integer number of cents. The answer to the sum 100 - 50 is a definite 50, but floating-point arithmetic is not always exact: 0.1 + 0.2, for example, gives 0.30000000000000004 rather than 0.3. This is just a simple example; it gets worse with more complicated calculations. A float is OK as long as storing a price is the only thing you do. See also: Best data type to store money values in MySQL
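A minimal sketch of both points, formatting with `number_format()` and calculating in integer cents. The helper name `formatCents()` is an assumption made for this example, not something from the question.

```php
<?php
// Display: always two decimals, '.' as decimal point, ',' as thousands separator.
echo number_format(199.9, 2, '.', ','), "\n"; // 199.90

// Store and calculate in integer cents; convert only when displaying.
function formatCents(int $cents): string
{
    return number_format($cents / 100, 2, '.', ',');
}

$priceCents = 19990;            // 199.90 stored as cents
$totalCents = $priceCents * 3;  // exact integer arithmetic
echo formatCents($totalCents), "\n"; // 599.70

// Float arithmetic is not exact, so comparisons like this one fail.
var_dump(0.1 + 0.2 === 0.3); // bool(false)
```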
I'm trying to figure out how to build a specific algorithm (ultimately implemented in PHP, but that's less important), but I'm having a hard time wrapping my head around the best way to do the math. Instead of defining a complex industry-specific process, I'll use a crazy metaphor here (the math is what's important). Imagine you're trying to identify the percent chance a specific make of car is parked in a store's parking lot based on the items sold within the store. To begin you take a physical survey of 100,000 store parking lots, recording each unique car make spotted outside, each unique item sold within the store, and a fixed percent relevance that item has to the store (ex: lumber has an 89% relevance to Home Depot, but pencils only have a 23% relevance to Walmart).
There are two parts to what I’m trying to solve. First, I’m trying to figure out the best way to roll-up this data to a specific item, while respecting each relevance percent and the number of confirmed observations (so one spotting doesn’t equal 100% chance, similar to http://www.evanmiller.org/how-not-to-sort-by-average-rating.html ). In other words, if a brand new, never-before-seen store is selling Waterford glasses and cashmere sweaters, from those items we can predict there’s an 89% chance a Mercedes is in the parking lot.
So to recap:
Each item has been seen a specific number of times in a store. For each of those times, there is a different product/store relevance percentage and a list of all car makes in the parking lot. How do I best mathematically calculate the percent chance a specific make is in the parking lot of a brand new store, only based on the items within?
Now the second part of this is getting a bit more complicated by adding another layer of abstraction. If a single person visits 50 stores, and we aggregate all the items in all those stores, we can predict what type of car they drive (ex: lots of camping and hiking stores, so they have a 67% chance of driving a Jeep). Then if they visit a new store and are exposed to a brand new item, for which we have no data, I need to apply that 67% Jeep onto the new item (still respecting the relevance of that item to the store). Then use that item’s less-than-certain Jeep statistic to influence our predictions of parking lots that contain that new item (which was never directly measured). Perhaps this requires us to add a confidence interval of some kind? Or how can we represent that uncertainty, without every one of the millions of items we analyze eventually averaging out to 50%?
I REALLY appreciate your help on this!
I think you need to build a cross-correlation matrix, where rows are goods and columns are car types. Each cell contains a normalized coefficient describing how strongly some good (e.g. a diamond ring) is related to a car type (Geo or Mercedes).
For details, see:
http://en.wikipedia.org/wiki/Cross-correlation
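One possible way to fill such a goods-by-car-type matrix is sketched below. Note that this computes a relevance-weighted co-occurrence share, which is a simpler stand-in for a true cross-correlation coefficient, and it does not correct for small sample sizes. The function name `buildMatrix()` and the observation layout are assumptions made for this example.

```php
<?php
/**
 * Build a normalized item-by-make matrix from survey observations.
 * Each observation is assumed to look like:
 *   ['items' => [itemName => relevance (0..1)], 'makes' => [makeName, ...]]
 * Cell value: relevance-weighted share of an item's sightings
 * in which the given make was also present.
 */
function buildMatrix(array $observations): array
{
    $weight = []; // $weight[item][make]: summed relevance when make was seen
    $total  = []; // $total[item]: summed relevance over all sightings
    foreach ($observations as $obs) {
        $makes = array_unique($obs['makes']);
        foreach ($obs['items'] as $item => $relevance) {
            $total[$item] = ($total[$item] ?? 0.0) + $relevance;
            foreach ($makes as $make) {
                $weight[$item][$make] = ($weight[$item][$make] ?? 0.0) + $relevance;
            }
        }
    }
    $matrix = [];
    foreach ($weight as $item => $row) {
        foreach ($row as $make => $w) {
            $matrix[$item][$make] = $total[$item] > 0 ? $w / $total[$item] : 0.0;
        }
    }
    return $matrix;
}

$matrix = buildMatrix([
    ['items' => ['lumber' => 0.8, 'pencils' => 0.2], 'makes' => ['Jeep', 'Ford']],
    ['items' => ['lumber' => 0.8],                   'makes' => ['Jeep']],
]);
// lumber was seen with a Jeep in both lots, with a Ford in only one of them.
print_r($matrix['lumber']);
```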
This is kind of a neat problem and I've enjoyed thinking it through...
Assume that you run a "Widget Rental" website, and on your application and you want to allow prospective purchasers to sort the widgets based on prices. (Low to high or high to low).
Each widget can have a different price based on the time of year. Some widgets will have dozens of different prices depending on the season as you get "high" seasons and "low" seasons.
However, the sellers of the "Widgets" are especially mischievous, and have realised that if they set their widget to be really expensive for one day of the year, and also really cheap one day of the year, then they can easily appear at the low and high sort ranges.
Currently, I use a very naive solution to calculate the "lowest price" for a Widget, which is to just take the lowest value from the dataset.
What I would like is to get a "lowest from price" for a widget which accurately portrays the price from which it could be rented, with the lower/higher-band outliers removed.
Take a look at this chart, with values:
[Chart: price over time. X axis: time (each significant interval is a day); Y axis: price.]
The X axis is time, and the Y axis is the price. Now, this contains a normal distribution, and there aren't any real statistical outliers in that dataset. It's common to see the price between the lowest value and the upper value to fluctuate as much as 200%.
However, take a look at this second chart... It contains a single-day tariff, which is only 20 euros...
I've played around with using Grubbs' test and it seems to work quite well.
The important thing is that I want to get a "from price". That is to say, I want to be able to say, "You can rent this widget from XXXX". So it should reflect the overall pricing taken as a whole and ignore clear outliers.
PHP bonus points if you point me in the direction of anything that already exists. (But I'm happy to code this myself in PHP).
One issue is that there are multiple definitions of what an outlier actually is. However, for this purpose a straightforward solution seems sufficient.
You could remove outliers by limiting the range of values to either +- some percentage or +- some number of standard deviations (probably one or two, but it could vary) from the average price. You'd probably want to use a combination of both: if the prices don't vary much, then a discount could be viewed as an outlier, which may or may not be appropriate. In any case, you'd likely have to do some experimenting to determine how sensitive it is. Chances are you'd want to set it so outliers must be at least some percentage away from the mean, even if it's only 5-20 percent. Below are a few percentage-based limits for an average of $500.
90%: $50 to $950
75%: $125 to $875
50%: $250 to $750
30%: $350 to $650
25%: $375 to $625
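The percentage limits above, and a single filtering pass based on them, could be sketched in PHP roughly as follows. The function names `percentBand()` and `fromPrice()` are made up for this example, and the choice of band width is left to experimentation as described above.

```php
<?php
// Lower and upper limit for "mean +- percent".
function percentBand(float $mean, float $percent): array
{
    $delta = $mean * $percent / 100;
    return [$mean - $delta, $mean + $delta];
}

// Lowest price left after one pass that drops values outside the band.
// Returns null when there are no prices or nothing survives the filter.
function fromPrice(array $prices, float $percent): ?float
{
    if (count($prices) === 0) {
        return null;
    }
    $mean = array_sum($prices) / count($prices);
    [$low, $high] = percentBand($mean, $percent);
    $kept = array_filter($prices, fn($p) => $p >= $low && $p <= $high);
    return count($kept) > 0 ? min($kept) : null;
}

print_r(percentBand(500.0, 25.0)); // limits 375 and 625

// Daily prices with one single-day tariff of 20.
$prices = [400.0, 450.0, 500.0, 550.0, 600.0, 20.0];
var_dump(fromPrice($prices, 50.0)); // float(400): the 20 is outside the band
```

Because the mean itself is pulled towards the outlier (420 here instead of 500), the band shifts as well, which is one reason the sensitivity needs experimenting.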
If multiple passes are used, then it would be easier to sort the prices, then remove the price that is farthest from the average (perhaps considering the highest price as well as the lowest price) as long as it exceeds the range. This ends up being O(N*D log D) to obtain the result of continuous single passes until they have no effect, instead of O(N*D) for a single pass, where N is the number of items to rent and D is the number of days considered.
You also might find the Ramer–Douglas–Peucker algorithm useful for finding points of interest after a bit of experimenting with how to define the value of epsilon.
I have a question about the best way to calculate and store sales tax in the US. I am creating an invoice program that can have multiple line items. Here is an example of the issue I am running into.
One of my invoice line items looks like this.
quantity 2
amount 1133.67
tax rate 7.5% (.075)
If I add 1133.67 to 1133.67 and multiply by .075, the tax is 170.05.
However, if I take each quantity 1133.67 and apply tax to it individually first, the amount of tax totals up to 170.06.
Obviously, when I tax each individual quantity, the tax on each one is rounded up. But when I total the quantities and then tax the total, there is no rounding up.
I can probably solve this problem by simply editing my table field to allow for 3 decimal places instead of 2.
This may be a question only I can answer, but does it make sense to store tax amounts for each line item or not? I was thinking the data could be useful in reports later down the road.
Wondering what others are doing.
Thanks in advance.
Having worked at a successful sales tax startup, I can tell you "it depends". Local laws vary on whether you calculate sales tax at the line item or invoice level. For some discussion on the topic see:
https://money.stackexchange.com/questions/15051/sales-tax-rounded-then-totaled-or-totaled-then-rounded
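To make the difference concrete, here is a small sketch of the two calculations using the figures from the question, done in integer cents. The helper name `taxCents()` is hypothetical, and which of the two results is the legally correct one depends on the jurisdiction, as noted above.

```php
<?php
// Tax in cents for an amount in cents, rounded to the nearest cent.
function taxCents(int $amountCents, float $rate): int
{
    return (int) round($amountCents * $rate);
}

$unitCents = 113367; // 1133.67 per unit
$quantity  = 2;
$rate      = 0.075;  // 7.5%

// Invoice level: tax the total once, then round.
$invoiceLevel = taxCents($unitCents * $quantity, $rate);

// Unit level: round the tax on each unit, then add up.
$unitLevel = taxCents($unitCents, $rate) * $quantity;

echo $invoiceLevel, "\n"; // 17005, i.e. 170.05
echo $unitLevel, "\n";    // 17006, i.e. 170.06
```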
I can also tell you that US sales tax is extremely complicated. There are over 10,000 jurisdictions that can levy a tax (state, county, city, and special districts such as stadium districts, metropolitan transport districts, water districts, etc.). The boundaries of those districts are not well defined in a readily available public source, and certainly do not conform to ZIP code boundaries.
If you want to get it right, your best bet is to use a sales tax calculation service. There are several SaaS solutions that are well-suited for web apps.
I have code that searches for cars depending on your price range:
$category = "SELECT * FROM tbl_listings WHERE price between '$c[0]' AND '$c[1]'";
For some reason, that code doesn't work perfectly. It showed me a couple cars in the right range, but also showed one that is 200,000 when I was searching between 5,000 and 20,000.
Also, what is a good way to search when some cars have a price with a dollar sign in the database and some have commas? The search form is not returning anything with a dollar sign or commas.
Stop storing prices as strings? A price is typically stored as one of two types:
integer: number of cents
decimal: dollars and cents, with the number of decimal places set to 2 (for example DECIMAL(10,2) in MySQL)
One doesn't generally store prices as strings (like "$14,999.99") in the database because you can't do range queries, like the one you're trying to do now.
You also can't do arithmetic, like a query that totals the prices of a particular subset of cars.
If the data you're pulling in has formatted strings like this, use NumberFormatter::parseCurrency() to get a float from the string you're given before shoving it in the DB. http://php.net/manual/en/numberformatter.parsecurrency.php
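Where the intl extension (which provides NumberFormatter) is not installed, a simple fallback for cleaning such strings might look like the sketch below. It assumes US-style formatting with "," as the thousands separator and "." as the decimal point, and the function name `priceToCents()` is made up for this example.

```php
<?php
// Convert a US-style price string such as '$14,999.99' to integer cents.
// Returns null when no usable number can be extracted.
function priceToCents(string $raw): ?int
{
    // Drop everything except digits and the decimal point ("$", commas, spaces).
    $clean = preg_replace('/[^0-9.]/', '', $raw);
    if ($clean === '' || !is_numeric($clean)) {
        return null;
    }
    return (int) round((float) $clean * 100);
}

var_dump(priceToCents('$14,999.99')); // int(1499999)
var_dump(priceToCents('200,000'));    // int(20000000)
var_dump(priceToCents('call us'));    // NULL
```

Once every price is stored as a number, a BETWEEN range query compares values numerically instead of character by character.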
Your statement that "some cars have a price with a dollar sign in the database and some have commas" makes me think the column in the database does not have a numeric datatype. This can be an issue, even if your $c[0] is correctly the lower bound and $c[1] is the upper bound.