PHP actuarial-type probability formula - php

How would I convert the following formula to php to solve for "p":
(1-p)^365=.80
What I'm trying to do is calculate the daily probability of an event based on the annual probability. For example, let's say the likelihood that John Doe is going to die in a particular year is 20%. So what would be the likelihood on any given day of that year? I know the answer for that example scenario is 0.0611165% (that is, p is about 0.000611) using the formula above (where .80 is the 80% annual survival probability). What I need is a formula in which I can substitute various annual probabilities and the result would be the corresponding daily probabilities.
Can anyone help?

To solve for p you need to take the 365-th root of both sides, then solve for p.
<?php
$prob = 0.80;
$p = 1 - pow($prob, 1/365);
?>
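To substitute different annual probabilities, the same calculation can be wrapped in a function. This is only a sketch: the name daily_probability() and the $days parameter are made up for illustration, and it assumes the daily probability is the same every day and that days are independent, which is what the (1-p)^365 formula already implies.

```php
<?php
// Sketch: convert an annual event probability into a per-day probability.
// daily_probability() and $days are illustrative names, not from the question.
function daily_probability(float $annual, int $days = 365): float
{
    if ($annual < 0.0 || $annual > 1.0) {
        throw new InvalidArgumentException('Probability must be between 0 and 1.');
    }
    // (1 - p)^days = 1 - annual   =>   p = 1 - (1 - annual)^(1 / days)
    return 1.0 - pow(1.0 - $annual, 1.0 / $days);
}

// 20% annual probability of the event, i.e. 80% annual survival
printf("%.9f\n", daily_probability(0.20));
```

For a 20% annual probability this should come out at roughly 0.000611, i.e. about 0.0611% per day.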

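Well, you'd take the 365th root of both sides, which is equivalent to raising to the power 1/365. That gives you 1 - p, so you then subtract the result from 1:
<?php
$prob = 0.80;
$root = pow($prob, 1/365); // 365th root of the annual survival probability, i.e. 1 - p
$p = 1 - $root; // daily probability of the event
echo $p;
?>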

Related

Cosine similarity result above one

I am coding cosine similarity in PHP. Sometimes the formula gives a result above one. In order to derive a degree from this number using inverse cos, it needs to be between 1 and 0.
I know that I don't need a degree, as the closer it is to 1, the more similar they are, and the closer to 0 the less similar.
However, I don't know what to make of a number above 1. Does it just mean it is totally dissimilar? Is 2 less similar than 0?
Could you say that the order of similarity kind of goes:
Closest to 1 from below down to 0 - most similar as it moves from 0 to one.
Closest to 1 from above - less and less similar the further away it gets.
Thank you!
My code, as requested is:
$norm1 = 0;
foreach ($dict1 as $value) {
$valuesq = $value * $value;
$norm1 = $norm1 + $valuesq;
}
$norm1 = sqrt($norm1);
$dot_product = array_sum(array_map('bcmul', $dict1, $dict2));
$cospheta = ($dot_product)/($norm1*$norm2);
To give you an idea of the kinds of values I'm getting:
0.9076645291077
2.0680991116095
1.4015600717928
1.0377360186767
1.8563586243689
1.0349674872379
1.2083865384822
2.3000034036913
0.84280491429133
Your math is good but I'm thinking you're missing something calculating the norms. It works great if you move that math to its own function as follows:
<?php
function calc_norm($arr) {
$norm = 0;
foreach ($arr as $value) {
$valuesq = $value * $value;
$norm = $norm + $valuesq;
}
return(sqrt($norm));
}
$dict1 = array(5,0,97);
$dict2 = array(300,2,124);
$dot_product = array_sum(array_map('bcmul', $dict1, $dict2));
$cospheta = ($dot_product)/(calc_norm($dict1)*calc_norm($dict2));
print_r($cospheta);
?>
I don't know if I'm missing something but I think you are not applying the sum and the square root to the values in the dict2 (the query I assume).
If you do not normalise per query, you can get results greater than one. However, this is sometimes done deliberately, because the result is rank-equivalent (proportional) to the correct one and is quicker to compute.
I hope this helps.
Due to the vagaries of floating point arithmetic, you could have calculations which, when represented in the binary form that computers use, are not exact. If a result is only slightly above 1, you can probably just round it down to 1. Likewise, round numbers slightly less than zero up to zero.
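Putting the answers above together, here is a minimal sketch that computes both norms and clamps away tiny floating-point overshoot. The function names vector_norm() and cosine_similarity() are illustrative, and it uses plain multiplication instead of bcmul, on the assumption that the vectors hold ordinary ints or floats.

```php
<?php
// Euclidean length of a vector (illustrative helper name).
function vector_norm(array $v): float
{
    $sum = 0.0;
    foreach ($v as $x) {
        $sum += $x * $x;
    }
    return sqrt($sum);
}

// Cosine similarity of two equal-length numeric vectors.
function cosine_similarity(array $a, array $b): float
{
    $a = array_values($a);
    $b = array_values($b);
    if (count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must have the same length.');
    }
    $dot = 0.0;
    foreach ($a as $i => $x) {
        $dot += $x * $b[$i];
    }
    // Both norms are needed; dividing by only one gives values above 1.
    $denominator = vector_norm($a) * vector_norm($b);
    if ($denominator == 0.0) {
        throw new InvalidArgumentException('Cosine similarity is undefined for a zero vector.');
    }
    // Clamp to [-1, 1] so rounding error cannot break acos() later.
    return max(-1.0, min(1.0, $dot / $denominator));
}

echo cosine_similarity(array(5, 0, 97), array(300, 2, 124)), "\n";
```

With both norms in the denominator, a value well above 1 (like 2.3) should no longer be possible; only rounding-level overshoot remains, and the clamp handles that.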

Calculating overall rating

If I have a series of 10 objects, each with a rating from 1 to 10, how can I calculate the overall rating?
For example if i have a list like this:
Entertainment - 8/10
Fun - 9/10
Comedy - 6/10
Dance - 8/10
and so on, for 10 objects like this. Tell me how to calculate the overall rating out of 10.
Overall - ?/10
I am very weak in maths. I was told by someone to add the total and if I got 83 as the answer, then the overall rating will be 8.3/10. Is this correct?
I am doing this for my PHP website. So if someone knows how to write a query for this, that would be very helpful for me.
Average the total rating and you will get the answer.
The method you were told is correct if there are 10 criteria being scored.
SELECT avg(score) FROM tbl
There is a built-in function available for this.
See:
http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html#function_avg
Yes, to get the average, add them all together and divide by the number of ratings. Example:
//do a MySQL query instead of this
$result_out_of_10 = array(
'fun' => 9,
'comedy' => 6,
'dance' => 8
);
$total = 0;
$total_results = 0;
foreach( $result_out_of_10 as $result )
{
$total += $result;
$total_results++;
}
$final_average_out_of_10 = $total / $total_results;
print "Average rating: $final_average_out_of_10 out of 10.";
EDIT: Meherzad has a better way - using the MySQL AVG() function - which I didn't know about. Use his way instead (although mine still works, it's more code than necessary).
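If the ratings are already in a PHP array, the same average can be written more compactly with array_sum() and count(). A small sketch; overall_rating() is just an illustrative name:

```php
<?php
// Average of a list of ratings (each out of 10).
function overall_rating(array $ratings): float
{
    if (count($ratings) === 0) {
        throw new InvalidArgumentException('At least one rating is required.');
    }
    return array_sum($ratings) / count($ratings);
}

$ratings = array('entertainment' => 8, 'fun' => 9, 'comedy' => 6, 'dance' => 8);
// Round to one decimal place for display, e.g. "x.y/10"
echo 'Overall: ' . round(overall_rating($ratings), 1) . "/10\n";
```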

php game, formula to calculate a level based on exp

I'm making a browser-based PHP game, and my database has a record of each player's total EXP or experience.
What I need is a formula to translate that exp into a level or rank, out of 100.
So they start off at level 1, and when they hit say, 50 exp, go to level 2, then when they hit maybe 125/150, level 3.
Basically a formula that steadily makes each level longer (more exp)
Can anyone help? I'm not very good at maths :P
Many formulas may suit your needs, depending on how fast you want the required exp to go up.
In fact, you really should make this configurable (or at least easily changed in one central location), so that you can balance the game later. In most games these (and other) formulas are determined only after playtesting and trying out several options.
Here's one formula: First level-up happens at 50 exp; second at 150exp; third at 300 exp; fourth at 500 exp; etc. In other words, first you have to gather 50 exp, then 100 exp, then 150exp, etc. It's an Arithmetic Progression.
For levelup X then you need 25*X*(1+X) exp.
Added: To get it the other way round you just use basic math. Like this:
y=25*X*(1+X)
0=25*X*X+25*X-y
That's a standard Quadratic equation, and you can solve for X with:
X = (-25±sqrt(625+100y))/50
Now, since we want both X and Y to be greater than 0, we can drop one of the answers and are left with:
X = (sqrt(625+100y)-25)/50
So, for example, if we have 300 exp, we see that:
(sqrt(625+100*300)-25)/50 = (sqrt(30625)-25)/50 = (175-25)/50 = 150/50 = 3
Now, this is the 3rd levelup, so that means level 4.
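As a rough PHP sketch of this, with the step kept configurable as suggested above: the function names and the $step parameter are made up for illustration, and the two while loops are only there as a guard against floating-point rounding right at a threshold.

```php
<?php
// Total exp needed for the X-th level-up: (step / 2) * X * (X + 1).
// With step = 50 this is 25 * X * (1 + X): 50, 150, 300, 500, ...
function exp_for_levelup(int $x, int $step = 50): float
{
    return ($step / 2) * $x * ($x + 1);
}

// Level for a given total exp, starting at level 1 with 0 exp.
function level_from_exp(int $exp, int $step = 50): int
{
    $exp = max(0, $exp);
    // Positive root of X^2 + X - 2 * exp / step = 0
    $x = (int) floor((sqrt(1 + 8 * $exp / $step) - 1) / 2);
    // Correct any off-by-one caused by floating-point rounding.
    while (exp_for_levelup($x + 1, $step) <= $exp) {
        $x++;
    }
    while ($x > 0 && exp_for_levelup($x, $step) > $exp) {
        $x--;
    }
    return $x + 1; // X level-ups completed means level X + 1
}

echo level_from_exp(300), "\n"; // third level-up reached, so level 4
```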
If you wanted the following:
Level 1 # 0 points
Level 2 # 50 points
Level 3 # 150 points
Level 4 # 300 points
Level 5 # 500 points etc.
An equation relating experience (X) with level (L) is:
X = 25 * L * L - 25 * L
To calculate the level for a given experience use the quadratic equation to get:
L = (25 + sqrt(25 * 25 - 4 * 25 * (-X) ))/ (2 * 25)
This simplifies to:
L = (25 + sqrt(625 + 100 * X)) / 50
Then round down using the floor function to get your final formula:
L = floor((25 + sqrt(625 + 100 * X)) / 50)
Where L is the level, and X is the experience points
It really depends on how you want the exp to scale for each level.
Let's say
LvL1 : 50 Xp
Lvl2: LvL1*2=100Xp
LvL3: LvL2*2=200Xp
Lvl4: LvL3*2=400Xp
This means you have a geometric progression
The Xp required to complete level n would be
`XPn=base*Q^(n-1)`
In my example base is the initial 50 xp and Q is 2 (ratio).
Provided a player starts at lvl1 with no xp:
when he dings lvl2 he would have 50 total Xp
at lvl3 150xp
at lvl4 350xp
and so forth
The total xp a player has after completing n levels (that is, on reaching level n+1) would be:
base*(Q^n-1)/(Q-1)
In your case you already know how much xp the player has. For a ratio of 2 the formula gets simpler:
base * (2^n-1)=total xp on reaching level n+1
To find the level for a given xp amount, solve for n and add 1, since the player starts at level 1:
$playerLevel=floor(log($playerXp/50+1,2))+1;
But with a geometric progression it will get harder and harder and harder for players to level.
To display the XP required for next level you can just calculate total XP for next level.
$totalXpNextLevel=50*(pow(2,$playerLevel)-1);
$reqXp=$totalXpNextLevel - $playerXp;
Check start of the post:
to get from lvl1 -> lvl2 you need 50 xp
lvl2 ->lvl3 100xp
to get from lvl x to lvl(x+1)
you would need
$totalXprequired=50*pow(2,$playerLevel-1);
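Here is a small sketch of the geometric scheme that avoids log() (and its floating-point edge cases) by looping over the integer totals. It counts levels from 1, matching the "starts at lvl1 with no xp" description. The function names and the $base / $ratio parameters are illustrative, and $ratio is assumed to be at least 2.

```php
<?php
// Total XP a player holds on reaching $level:
// base * (ratio^(level - 1) - 1) / (ratio - 1), i.e. 0, 50, 150, 350, ...
function total_xp_for_level(int $level, int $base = 50, int $ratio = 2): int
{
    return intdiv($base * ($ratio ** ($level - 1) - 1), $ratio - 1);
}

// Current level for a given total XP (level 1 at 0 XP).
function level_for_xp(int $xp, int $base = 50, int $ratio = 2): int
{
    $level = 1;
    while (total_xp_for_level($level + 1, $base, $ratio) <= $xp) {
        $level++;
    }
    return $level;
}

// XP still missing until the next level-up.
function xp_to_next_level(int $xp, int $base = 50, int $ratio = 2): int
{
    $next = level_for_xp($xp, $base, $ratio) + 1;
    return total_xp_for_level($next, $base, $ratio) - $xp;
}

echo level_for_xp(160), ' ', xp_to_next_level(160), "\n";
```

Because the totals double each level, integer overflow becomes a concern long before level 100 with these defaults, which is another reason a geometric curve may need a gentler ratio in practice.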
Google gave me this:
function experience($L) {
$a=0;
for($x=1; $x<$L; $x++) {
$a += floor($x+300*pow(2, ($x/7)));
}
return floor($a/4);
}
for($L=1;$L<100;$L++) {
echo 'Level '.$L.': '.experience($L).'<br />';
}
It is supposed to be the formula that RuneScape uses; you might be able to modify it to your needs.
Example output:
Level 1: 0
Level 2: 83
Level 3: 174
Level 4: 276
Level 5: 388
Level 6: 512
Level 7: 650
Level 8: 801
Level 9: 969
Level 10: 1154
Here is a fast solution I used for a similar problem. You will likely wanna change the math of course, but it will give you the level from a summed xp.
$n = -1;
$L = 0;
while($n < $xp){
$n += pow(($L+1),3)+30*pow(($L+1),2)+30*($L+1)-50;
$L++;
}
echo("Current XP: " .$xp);
echo("Current Level: ".$L);
echo("Next Level: " .$n);
I take it what you're looking for is the amount of experience to decide what level they are on? Such as:
Level 1: 50exp
Level 2: 100exp
Level 3: 150exp ?
if that's the case you could use a loop something like:
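$currentExp = 120; // example value; load the player's exp from your database
$currentLevel = 100; // level cap, used if the exp is beyond every threshold below
for ($i = 1; $i < 100; $i++) {
    // Level $i covers exp below $i * 50, so the first threshold above the exp is the level
    if ($i * 50 > $currentExp) {
        $currentLevel = $i;
        break;
    }
}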
This is as simple as I can make an algorithm for levels, I haven't tested it so there could be errors.
Let me know if you do use this, cool to think an algorithm I wrote could be in a game!
The original was built on a base of 50, hence the 25 scattered across the equation.
This is the answer as a general equation. Just supply your multiplier (base) and you're in business.
// pow() is used for squaring: in PHP, ^ is bitwise XOR, not exponentiation
$_level = floor(
    ( ($_multiplier / 2)
    + sqrt( pow($_multiplier / 2, 2) + ($_multiplier * 2) * $_score )
    )
    / $_multiplier
);

Calculate average without being thrown by strays

I am trying to calculate an average without being thrown off by a small set of far off numbers (ie, 1,2,1,2,3,4,50) the single 50 will throw off the entire average.
If I have a list of numbers like so:
19,20,21,21,22,30,60,60
The average is 31
The median is 30
The mode is 21 & 60 (averaged to 40.5)
But anyone can see that the majority is in the range 19-22 (5 in, 3 out) and if you get the average of just the major range it's 20.6 (a big difference than any of the numbers above)
I am thinking that you can get this like so:
c+d-r
Where c is the count of numbers, d is the number of distinct values, and r is the range. Then you can apply this to all the possible ranges, and the highest score is the optimal range to get an average from.
For example 19,20,21,21,22 would be 5 numbers, 4 distinct values, and the range is 3 (22 - 19). If you plug this into my equation you get 5+4-3=6
If you applied this to the entire number list it would be 8+6-41=-27
I think this works pretty well, but I have to create a huge loop to test against all possible ranges. In just my small example there are 21 possible ranges:
19-19, 19-20, 19-21, 19-22, 19-30, 19-60, 20-20, 20-21, 20-22, 20-30, 20-60, 21-21, 21-22, 21-30, 21-60, 22-22, 22-30, 22-60, 30-30, 30-60, 60-60
I am wondering if there is a more efficient way to get an average like this.
Or if someone has a better algorithm all together?
You might get some use out of standard deviation here, which basically measures how concentrated the data points are. You can define an outlier as anything more than 1 standard deviation (or whatever other number suits you) from the average, throw them out, and calculate a new average that doesn't include them.
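A minimal sketch of that idea, assuming the population standard deviation and a cut-off of 1 standard deviation (both are choices you may want to change). The function names and the $maxDeviations parameter are illustrative.

```php
<?php
// Plain arithmetic mean.
function mean(array $values): float
{
    return array_sum($values) / count($values);
}

// Mean of the values lying within $maxDeviations standard deviations of the mean.
function mean_without_outliers(array $values, float $maxDeviations = 1.0): float
{
    if (count($values) === 0) {
        throw new InvalidArgumentException('At least one value is required.');
    }
    $avg = mean($values);
    $sumSquares = 0.0;
    foreach ($values as $v) {
        $sumSquares += ($v - $avg) ** 2;
    }
    $stdDev = sqrt($sumSquares / count($values)); // population standard deviation
    $kept = array_filter($values, function ($v) use ($avg, $stdDev, $maxDeviations) {
        return abs($v - $avg) <= $maxDeviations * $stdDev;
    });
    // With a very small cut-off nothing may survive; fall back to the plain mean.
    return count($kept) > 0 ? mean($kept) : $avg;
}

echo mean_without_outliers(array(19, 20, 21, 21, 22, 30, 60, 60)), "\n";
```

On the list from the question, the two 60s fall outside one standard deviation and are dropped, while 30 stays in, so the result lands a little above 22 rather than at 20.6.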
Here's a pretty naive implementation that you could fix up for your own needs. I purposely kept it pretty verbose. It's based on the five-number-summary often used to figure these things out.
function get_median($arr) {
sort($arr);
$c = count($arr) - 1;
if ($c%2) {
$b = round($c/2);
$a = $b-1;
return ($arr[$b] + $arr[$a]) / 2 ;
} else {
return $arr[($c/2)];
}
}
function get_five_number_summary($arr) {
    sort($arr);
    $c = count($arr) - 1; // index of the last (largest) element
    $fns = array();
    if ($c%2) {
        // Even number of values: split the array into two equal halves
        $b = (int) round($c/2); // index of the first element of the upper half
        $lower_quartile = array_slice($arr, 0, $b);
        $upper_quartile = array_slice($arr, $b);
        $fns = array($arr[0], get_median($lower_quartile), get_median($arr), get_median($upper_quartile), $arr[$c]);
        return $fns;
    }
    else {
        // Odd number of values: leave the middle element out of both halves
        $b = (int) ($c/2); // index of the middle element
        $lower_quartile = array_slice($arr, 0, $b);
        $upper_quartile = array_slice($arr, $b+1);
        $fns = array($arr[0], get_median($lower_quartile), get_median($arr), get_median($upper_quartile), $arr[$c]);
        return $fns;
    }
}
function find_outliers($arr) {
    $fns = get_five_number_summary($arr);
    $interquartile_range = $fns[3] - $fns[1];
    // Conventional fences: 1.5 interquartile ranges beyond the quartiles
    $low = $fns[1] - 1.5 * $interquartile_range;
    $high = $fns[3] + 1.5 * $interquartile_range;
    foreach ($arr as $v) {
        if ($v > $high || $v < $low)
            echo "$v is an outlier<br>";
    }
}
//$numbers = array( 19,20,21,21,22,30,60 ); // 60 is an outlier
$numbers = array( 1,230,239,331,340,800); // 1 is an outlier, 800 is an outlier
find_outliers($numbers);
Note that this method, albeit much simpler to implement than standard deviation, will not find the two 60 outliers in your example, but it works pretty well. Use the code for whatever, hopefully it's useful!
To see how the algorithm works and how I implemented it, go to: http://www.mathwords.com/o/outlier.htm
This, of course, doesn't calculate the final average, but it's kind of trivial after you run find_outliers() :P
Why don't you use the median? It's not 30, it's 21.5.
You could put the values into an array, sort the array, and then find the median, which is usually a better number than the average anyway because it discounts outliers automatically, giving them no more weight than any other number.
You might sort your numbers, choose your preferred subrange (e.g., the middle 90%), and take the mean of that.
There is no one true answer to your question, because there are always going to be distributions that will give you a funny answer (e.g., consider a biased bi-modal distribution). This is why many statistics are often presented using box-and-whisker diagrams showing mean, median, quartiles, and outliers.
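The "sort, keep a middle subrange, take the mean" suggestion is usually called a trimmed mean. A rough sketch, where trimmed_mean() and $trimFraction are illustrative names; $trimFraction is the share dropped from each end, so 0.05 keeps the middle 90%:

```php
<?php
// Mean of the values left after dropping $trimFraction of them from each end.
function trimmed_mean(array $values, float $trimFraction = 0.05): float
{
    if (count($values) === 0 || $trimFraction < 0.0 || $trimFraction >= 0.5) {
        throw new InvalidArgumentException('Need values and a trim fraction in [0, 0.5).');
    }
    sort($values);
    $n = count($values);
    $drop = (int) floor($n * $trimFraction); // how many to drop per end
    $middle = array_slice($values, $drop, $n - 2 * $drop);
    return array_sum($middle) / count($middle);
}

// Dropping a quarter from each end of the question's list keeps 21, 21, 22, 30
echo trimmed_mean(array(19, 20, 21, 21, 22, 30, 60, 60), 0.25), "\n";
```

Note that on short lists a small trim fraction drops nothing at all (8 values at 5% rounds down to zero per end), so the fraction has to be chosen with the list size in mind.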

How do I pick a selection of rows with the minimum date difference

The question was difficult to phrase. Hopefully this will make sense.
I have a table of items in my INVENTORY.
Let's call the items Apple, Orange, Pear, Potato. I want to pick a basket of FRUIT (1 x Apple,1 x Orange, 1 x Pear).
Each item in the INVENTORY has a different date for availability. So that...
Apple JANUARY
Apple FEBRUARY
Apple MARCH
Orange APRIL
Apple APRIL
Pear MAY
I don't want to pick the items in the order they appear in the inventory. Instead I want to pick them according to the minimum date range in which all items can be picked. ie Orange & Apple in APRIL and the pear in MAY.
I'm not sure if this is a problem for MYSQL or for some PHP arrays. I'm stumped. Thanks in advance.
If array of fruits isn't already sorted by date, let's sort it.
Now, the simple O(n^2) solution would be to check all possible ranges. Pseudo-code in no particular language:
for (int i = 0; i < inventory.length; ++i)
hash basket = {}
for (int j = i; j < inventory.length; ++j) {
basket.add(inventory[j]);
if (basket.size == 3) { // or whatever's the number of fruits
// found all fruits
// compare range [i, j] with the best range
// update best range, if necessary
break;
}
}
end
You may find it's good enough.
Or you could write a bit more complicated O(n) solution. It's just a sliding window [first, last]. On each step, we move either left border (excluding one fruit from the basket) or right (adding one fruit to the basket).
int first = 0;
int last = 0;
hash count = {};
count[inventory[0]] = 1;
while (true) {
if (count[inventory[first]] > 1) { // leftmost fruit also occurs later in the window, so drop it
--count[inventory[first]];
++first;
} else if (last + 1 < inventory.length) {
++last;
++count[inventory[last]];
} else {
break;
}
if (date[last] - date[first] < min_range
&& count.number_of_nonzero_elements == 3) {
// found new best answer
min_range = date[last] - date[first]
}
}
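For reference, here is one way the sliding-window idea could look in PHP. It is a sketch under assumptions: the inventory is an array of rows with hypothetical 'item' and 'date' keys, dates are plain numbers (month numbers here), and min_date_window() is an invented name. It needs PHP 7.4 or later for the arrow function.

```php
<?php
// Smallest date span that contains at least one of every wanted item.
// Returns ['span' => int, 'rows' => array] or null if the basket cannot be filled.
function min_date_window(array $inventory, array $wanted): ?array
{
    if (count($wanted) === 0) {
        return null;
    }
    // Sort by date so every contiguous run of rows is a date range.
    usort($inventory, fn($a, $b) => $a['date'] <=> $b['date']);

    $needed = array_fill_keys($wanted, true);
    $counts = array(); // how many of each wanted item are inside the window
    $have = 0;         // number of distinct wanted items inside the window
    $first = 0;
    $best = null;

    foreach ($inventory as $last => $row) {
        $item = $row['item'];
        if (isset($needed[$item])) {
            $counts[$item] = ($counts[$item] ?? 0) + 1;
            if ($counts[$item] === 1) {
                $have++;
            }
        }
        // While the window is complete, record it and shrink it from the left.
        while ($have === count($needed)) {
            $span = $row['date'] - $inventory[$first]['date'];
            if ($best === null || $span < $best['span']) {
                $best = array(
                    'span' => $span,
                    'rows' => array_slice($inventory, $first, $last - $first + 1),
                );
            }
            $left = $inventory[$first]['item'];
            if (isset($needed[$left]) && --$counts[$left] === 0) {
                $have--;
            }
            $first++;
        }
    }
    return $best;
}

$inventory = array(
    array('item' => 'Apple', 'date' => 1),
    array('item' => 'Apple', 'date' => 2),
    array('item' => 'Apple', 'date' => 3),
    array('item' => 'Orange', 'date' => 4),
    array('item' => 'Apple', 'date' => 4),
    array('item' => 'Pear', 'date' => 5),
);
print_r(min_date_window($inventory, array('Apple', 'Orange', 'Pear')));
```

The returned rows are everything inside the best window, so they can include spare items; picking one row per fruit from them is a simple follow-up step.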
Given your inventory table is structured as:
fruit, availability
apple, 3 // apples in march
//user picks the availability month maybe?
$this_month = 5 ;
//or generate it for today
$this_month = date('n') ;
// sql
"select distinct fruit from inventory where availability = $this_month";
Sounds quite complicated. The way that I would approach the problem is to group each fruit into its availability month group and see how many are in each group.
JANUARY (1)
FEBRUARY (1)
MARCH (1)
APRIL (2)
MAY (1)
From this we can see that the most fruits fall within APRIL, so APRIL is our preferred month.
I would then remove the items from months with duplicates (Apples in your example), which would remove MARCH as an option. This step could either be done now, or after the next step depending on your data and the results you get.
I would then look at the next most popular month and calculate how far away that month is (eg. JAN is 3 away from APRIL, MARCH is 1 etc). If you then had a tie then it shouldn't matter which you choose. In this example though you would end up choosing the 2 fruits from APRIL and 1 fruit from MAY as you requested.
This approach may not work if the most popular month doesn't actually result in the "best" selection.
