Finding the nearest data point and sunshine data in php - php

I have a big problem, described below.
1) I have a .dat file (a plain data file). It is a 25 MB file that contains
over 200K rows of latitudes and longitudes. An example of one such row is
-59.083 -26.583 9.4 5.2 3.3 4.3 8.1 6.6 5.3 8.4 8.3 10.0 9.1 5.1
Each row starts with the latitude and longitude, followed by the sunshine data (hours) for January, February, and so on through December.
I have more than 200K rows like the one above.
My task is to calculate the sunshine hours at a particular latitude and longitude. Suppose I have latitude = 3.86082 and longitude = 100.50509. My task is then to find the average sunshine hours per month at this latitude and longitude. The second problem is that the given latitude and longitude will not exactly match any of the ones in the file, so I first have to find the nearest point and then calculate the sunshine hours.
I am using the following code to find the nearest point, but it takes a huge amount of time because of the bulk of data in the file.
$file_name='grid_10min_sunp.dat';
$handle = fopen($file_name, "r");
$lat1=13.86082;
$lan1=100.50509;
$lat_lon_sunshines = make_sunshine_dict($file_name);
$closest = 500;
for($c=0;$c<count($lat_lon_sunshines);$c++)
{
$lat2=$lat_lon_sunshines[$c]['lat'];
$lan2=$lat_lon_sunshines[$c]['lan'];
$sunshines=$lat_lon_sunshines[$c]['sunshine'];
$lat_diff = abs(round((float)($lat1), 4)-$lat2);
if ($lat_diff < $closest)
{
$diff = $lat_diff + abs(round((float)($lan1), 4)-$lan2);
if($diff < $closest)
{
$closest = $diff;
$sunshinesfinal=$sunshines;
}
}
$sunshines='';
}
print_r($sunshinesfinal);die;
The function make_sunshine_dict($file_name) also goes through each line of the file and prepares an array as follows:
$sunshines_dict = array();
$f = file_get_contents($file_name);
$handle = fopen($file_name, "r");
while($buffer = fgets($handle))
{
$tok = strtok($buffer, " \n\t");
$lat=$tok;
$latArray[]=$tok;
$tok = strtok(" \n\t");
$months = '';
$months = array();
for ($k = 0; $tok !== false; $k+=1)
{
if($k==0)
{
$lan=$tok;
$lanArray[]=$tok;
}
if($k!=0)
{
$months[] = $tok ;
"month $k : ".$months[$k]."<br>";
}
$tok = strtok(" \n\t");
}
$data[$kkk]['lat']=$lat;
$data[$kkk]['lan']=$lan;
foreach($months as $m=>$sunshine)
{
$sunshines=array();
$sumD = 0;
$iteration= 31;
for($n=1;$n<=$iteration;$n++)
{
$J = ($m+1)*$n;
$P = asin(.39795*cos(.2163108 + 2*atan(.9671396*tan(.00860*($J-186)))));
$value=(sin(0.8333*pi/180) + sin($lat*pi/180)*sin($P))/(cos($lat*pi/180)*cos($P));
/* $value ? ($value > 1 and 1) : $value;
$value ? ($value < -1 and -1): $value;*/
$D = 24 - ((24/pi) * acos($value));
$sumD = $sumD + $D;
}
$sunshinesdata=(($sumD/30)*(float)($sunshine)*.01);
$data[$kkk]['sunshine'][$m]=$sunshinesdata;
$sunshines='';
}
}
return $data;
Please help, and please let me know if you require more information.
Please also remember that I cannot use PHP's default functions for sunshine information here, because I am also taking cloud cover and other factors into consideration.

A lot of the code looks wrong (in terms of just figuring out the sunshine hours). Although, I'll admit, I'm lost as to what you are doing in the make_sunshine_dict function.
In terms of helping you speed things up, you are reading your file twice:
$f = file_get_contents($file_name);
$handle = fopen($file_name, "r");
Also, you can map Lat/Long coordinates into a grid. In other words, make a 2 dimensional array where each row and each column represents 1 degree of latitude and 1 degree of longitude, respectively. As you read in your file, dump entries into the correct lat/long bucket in the grid. When you need to find the location closest to a given lat/lon then you only need to compare that location against the points in its bucket.
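The bucketing idea above can be sketched roughly as follows. This is a minimal illustration, not the asker's code: the helper names cellKey, buildGrid and nearestPoint are hypothetical, the row layout (latitude, longitude, then 12 monthly values) is taken from the question, and distance is measured as plain squared degrees. It also checks the 8 neighbouring cells, which goes slightly beyond the answer, because the nearest point can sit just across a cell border. Wrap-around at the 180 degree meridian is not handled.

```php
<?php
// Hypothetical helpers illustrating a 1-degree lat/lon bucket grid.

/** Key of the 1-degree cell that contains a coordinate. */
function cellKey(float $lat, float $lon): string
{
    return (int) floor($lat) . ':' . (int) floor($lon);
}

/** Read the file once and group its rows by cell. */
function buildGrid(string $fileName): array
{
    $grid = [];
    $handle = fopen($fileName, 'r');
    if ($handle === false) {
        return $grid;
    }
    while (($line = fgets($handle)) !== false) {
        $fields = preg_split('/\s+/', trim($line));
        if (count($fields) < 14) {
            continue; // skip blank or malformed rows
        }
        $lat = (float) $fields[0];
        $lon = (float) $fields[1];
        $grid[cellKey($lat, $lon)][] = [
            'lat' => $lat,
            'lon' => $lon,
            'sunshine' => array_map('floatval', array_slice($fields, 2, 12)),
        ];
    }
    fclose($handle);
    return $grid;
}

/** Nearest stored point, searching the target cell and its 8 neighbours. */
function nearestPoint(array $grid, float $lat, float $lon): ?array
{
    $best = null;
    $bestDist = INF;
    for ($dLat = -1; $dLat <= 1; $dLat++) {
        for ($dLon = -1; $dLon <= 1; $dLon++) {
            $key = cellKey($lat + $dLat, $lon + $dLon);
            foreach ($grid[$key] ?? [] as $point) {
                // Squared distance in degrees (an assumption; not great-circle distance).
                $dist = ($point['lat'] - $lat) ** 2 + ($point['lon'] - $lon) ** 2;
                if ($dist < $bestDist) {
                    $bestDist = $dist;
                    $best = $point;
                }
            }
        }
    }
    return $best; // null if the 9 cells hold no points
}
```

With the grid built once (and ideally cached), each lookup only compares against the handful of points in nine cells instead of all 200K rows.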

Load the .dat into a SQL database every x interval in a Cron job and then just query the database like you always would.

Related

Create a dictionary (array) from CSV data [duplicate]

I need to write a function that takes temperature as an input and returns a dictionary with years as keys and number of days as values.
CSV file looks like this (year, month, day, hour, temperature):
2019,1,1,0,0.1
2019,1,1,1,0.4
2019,1,1,2,0.8
2019,1,1,3,1.3
2019,1,1,4,1.8
...
2020,1,1,0,-3.9
The number of days is calculated by another function which I already have. It takes a year and a temperature and returns how many days in a given year the temperature was equal to or below the given temperature. Since the data is about hours, not days, the number of hours is found and then divided by 24.
The function:
function getDaysUnderTemp(int $targetYear, float $targetTemp): float {
$file = fopen("data/temperatures-filtered.csv", "r");
$hours = 0;
while ($data = fgetcsv($file)) {
if ($data[0] == $targetYear and $data[4] <= $targetTemp) {
$hours ++;
}
}
fclose($file);
return $hours / 24;
}
So as an example getDaysUnderTemp(2019, -10) returns 13.92.
This is a function I am asking about as I'm not sure how it might be done:
function getDaysUnderTempDictionary(float $targetTemp): array {
$file = fopen("data/temperatures-filtered.csv", "r");
while ($data = fgetcsv($file)) {
???
}
fclose($file);
return [];
}
The problem is I don't understand how an already written function could be implemented in this new one, and then create a required dictionary from all this data. Without using classes.
Desired output of getDaysUnderTempDictionary(-10):
Array
(
[2019] => 3.88
[2020] => 0.21
[2021] => 13.92
)
Instead of just storing hours as a single variable, store it in an array indexed by year. Once done with the CSV, loop over the array and divide each by 24, same as you did before.
This code has a mock CSV file for demo purposes, but is otherwise the same as yours:
function getDaysUnderTempDictionary(float $targetTemp): array {
//This is just for mocking a CSV file
$dataString = <<<EOT
2019,1,1,0,0.1
2019,1,1,1,0.4
2019,1,1,2,0.8
2019,1,1,3,1.3
2019,1,1,4,1.8
2020,1,1,0,-3.9
EOT;
$stream = fopen('php://memory', 'r+');
fwrite($stream, $dataString);
rewind($stream);
$years = [];
while ($data = fgetcsv($stream)) {
$year = $data[0];
if ($data[4] <= $targetTemp) {
if(!isset($years[$year])){
$years[$year] = 0;
}
$years[$year]++;
}
}
foreach($years as $year => $hours){
$years[$year] = $hours / 24;
}
return $years;
}
Demo: https://3v4l.org/RWFPK

Iterating through a column in a CSV file (PHP)

I need to write a function that takes a year and a temperature as an input and returns how many days in a given year the temperature was equal to or below the given temperature. Since the data is about hours, not days, need to find the number of hours and divide it by 24.
Example: getDaysUnderTemp(2019, -10) returns 13.92.
CSV file looks like this (year, month, day, hour, temperature):
2019,1,1,0,0.1
2019,1,1,1,0.4
2019,1,1,2,0.8
2019,1,1,3,1.3
2019,1,1,4,1.8
...
2020,1,1,0,-3.9
So far my code looks like this (I'm new to php):
function getDaysUnderTemp(int $targetYear, float $targetTemp): float {
$inputFile = fopen("data/temperatures-debug.csv", "r");
while (!feof($inputFile)) {
$file = fgetcsv($inputFile);
$hours = intval($file[3]);
if (intval($file[0]) === $targetYear and floatval($file[4]) <= $targetTemp) {
???
return ??? / 24;
}
fclose($inputFile);
}
return "error";
}
The problem is I don't know how to iterate through a column which represents the hours of a specific year, and then add them all together.
You are not far from the solution.
function getDaysUnderTemp($targetYear, $targetTemp): float
{
$hours = 0;
if ((($handle = fopen("data.csv", "r")) !== FALSE)) {
while (($data = fgetcsv($handle)) !== FALSE) {
// [0] -> year [1] -> month [2] -> day [3] -> hours [4] -> temp
if ($data[0] == $targetYear && $data[4] <= $targetTemp) {
$hours++; // each matching row is one hour; $data[3] is the hour of day, not a duration
}
}
fclose($handle);
}
return $hours / 24;
}
The logic you had was correct; I just refactored the code and fixed some small issues:
Your function is declared to return a float, but your code returns the string "error" when no match is found, which would have thrown a TypeError at some point. It now returns $hours / 24, so that problem is solved;
You closed the file pointer by calling fclose($inputFile) inside the loop, after every iteration. This causes problems because feof needs a valid, open stream to work properly. Now fclose is called only once, when the work is done;
For printing the results, I suggest using the printf function as follows:
printf("%.2f", getDaysUnderTemp(2019, -10));

Calculate g-force from acceleration for 1 second interval

I have extracted a CSV file with accelerometer data (in m/s2) from GoPro metadata file (github library).
One second of accelerometer contains ~200 samples of data on 3 axis. A sample of this file looks like this:
In PHP, for each instantaneous value on X axis, I convert m/s2 like this:
function convert_meters_per_second_squared_to_g($ms2) {
// 1g = 9.80665 m/s2
return $ms2 * 0.101971621297793; // 1 / 9.80665 == 0.101971621297793
}
Sample code for 200 rows (1 second) of CSV file:
$acc_x_summed_up = 0;
if (($handle = fopen($filepath, "r")) !== FALSE) {
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
list ($millis, $acc_x, $acc_y, $acc_z) = $data;
$acc_x_summed_up += $acc_x;
}
}
$g_force = convert_meters_per_second_squared_to_g($acc_x_summed_up);
But how do I show the g-force value for each second on X axis? I tried to sum up the values and convert them, but the result is clearly wrong, as I get values up to 63 G.
[ UPDATE: ]
The instant g-force values (all 3 axis, separated) are displayed on a graph (using highcharts). The gopro video file is displayed (using YouTube javascript API) side-by-side with the graph and played real time.
The graph and video are already working fine side by side. Only the g-force values are wrong.
Note: The video file has a g-force overlay (embedded in it) showing 2 axes (x, y).
I have rewarded #Joseph_J just because it seemed a good solution and because I'm forced to give the reward (over the weekend) by SO system. Thanks everyone for your answers!
I believe you are treating each instantaneous value as if it has occurred over 1 second, rather than instantaneously.
I'd say your best bet is to do each calculation by multiplying $acc_x by the resolution of your data divided by gravity's acceleration. So in your case, the resolution of your data is 5ms or one two-hundredth of a second, meaning your calculation should be $acc_x * 0.005/9.80665.
Using the information you provided, the 63G result that you got should be more like 0.315G. This seems more appropriate, though I'm not sure the context of the data.
EDIT: I forgot to mention that you should still sum all values that you receive from $acc_x * 0.005/9.80665 over 200 values, (you can choose to do this in blocks, or do it in running, doing in blocks will be less taxing on the system, but running will be more accurate). Pointed out by #Joseph_J
EDIT 2: As per your request for a source, I could not find much on calculating the average acceleration (and therefore g-force), but you can use the same principle as for average velocity from velocity-over-time graphs. I did, however, find a scenario similar to yours here: Source and Source 2
Hope this helps!
As per my comment, summing it up doesn't work because force is not additive over time. What you want is to calculate the average acceleration:
function convert_meters_per_second_squared_to_g($acc_array) {
$acc_average = array_sum($acc_array)/count($acc_array);
return $acc_average * 0.101971621297793;
}
$acc_x_array = [];
if (($handle = fopen($filepath, "r")) !== FALSE) {
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
list ($millis, $acc_x, $acc_y, $acc_z) = $data;
$acc_x_array[] = $acc_x;
}
}
$g_force = convert_meters_per_second_squared_to_g($acc_x_array);
Maybe your question can be seen as equivalent to asking for the net change in velocity between samples at one-second intervals?
In that sense, what you need to do is to integrate-up all the small accelerations in your 5ms intervals, so as to compute the net change in velocity over a period of one second (i.e. 200 samples). That change in velocity, divided by the 1-second interval, represents an average acceleration during that 1-second period.
So, in your case, what you'd need to do is to add up all the AcclX, AcclY & AcclZ values over a one-second period, and multiply by 0.005 to get the vector representing the change in velocity (in units of metres per second). If you then divide that by the one-second total extent of the time window, and by 9.80665m/s^2, you'll end up with the (vector) acceleration in units of G. If you want the (scalar) acceleration you can then just compute the magnitude of that vector, as sqrt(ax^2+ay^2+az^2).
You could apply the same principle to get an average acceleration over a different time window, so long as you divide the sum of AcclX, AcclY, AcclZ (after multiplying by the 0.005s inter-sample time) by the duration of the time window over which you've integrated. This is just like approximating the time-derivative of a function f(t) by (f(t+d) - f(t))/d. In fact, this is a better approximation to the derivative at the midpoint of the time interval, namely t+d/2. For example, you could sum up the values over a 2s window to get an average value at the centre of that 2s timespan. There's no need to report these average accelerations only every two seconds; instead you could simply move the window along by 0.5s to get the next reported average acceleration 0.5s later.
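The windowed average described above could look something like this. It is only a sketch under stated assumptions: the function name windowedG and its parameters are made up for illustration, rows are assumed to be [millis, x, y, z] in m/s^2 as in the question's CSV, and samples are assumed to be evenly spaced.

```php
<?php
const STANDARD_GRAVITY = 9.80665; // m/s^2 per g

/**
 * Average acceleration in g over sliding windows (hypothetical helper).
 *
 * @param array $rows   rows of [millis, x, y, z], accelerations in m/s^2
 * @param float $dt     seconds between samples (0.005 in the question)
 * @param float $window window length in seconds
 * @param float $step   seconds between reported values
 */
function windowedG(array $rows, float $dt, float $window, float $step): array
{
    $perWindow = (int) round($window / $dt);
    $perStep = max(1, (int) round($step / $dt));
    $out = [];
    if ($perWindow < 1) {
        return $out;
    }
    for ($start = 0; $start + $perWindow <= count($rows); $start += $perStep) {
        $sum = [0.0, 0.0, 0.0];
        for ($i = $start; $i < $start + $perWindow; $i++) {
            $sum[0] += (float) $rows[$i][1];
            $sum[1] += (float) $rows[$i][2];
            $sum[2] += (float) $rows[$i][3];
        }
        $g = [];
        foreach ($sum as $s) {
            // sum * dt = change in velocity; / window = mean acceleration; / g0 = value in g
            $g[] = $s * $dt / $window / STANDARD_GRAVITY;
        }
        $out[] = [
            'seconds' => ($start + $perWindow / 2) * $dt, // centre of the window
            'x' => $g[0],
            'y' => $g[1],
            'z' => $g[2],
            'magnitude' => sqrt($g[0] ** 2 + $g[1] ** 2 + $g[2] ** 2),
        ];
    }
    return $out;
}
```

For example, windowedG($data, 0.005, 2.0, 0.5) would report a 2-second average every half second, each stamped with the centre of its window.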
THE UPDATED UPDATED SOLUTION
This solution will take your CSV and create an array containing your time, Ax, Ay, & Az values after they have been converted to G's. You should be able to take this array and feed it right into your graph.
The value displayed at each interval will be the average acceleration "at" the interval, not before or after it.
I added a parameter to the function to allow for you to define how many intervals per second that you want to display on your graph. This will help smooth out your graph a bit.
I also set the initial and final values. Since this finds the average acceleration at the interval it needs data on both sides of the interval. Obviously at 0 we are missing the left hand side and on the last interval we are missing the right hand side.
I chose to use all the data from one interval to the next, so half the values overlap from one interval to the next. This smooths out the averages (reduces the noise) instead of picking up from one interval where the other left off. I added a parameter that lets you toggle the overlap on and off.
Hope it works for you!
function formatAccelData($data, $split, $scale, $overlap = TRUE){
if(!$data || !$split || !$scale || !is_int($split) || !is_int($scale)){
return FALSE;
}
$g = 9.80665;
$round = 3;
$value1 = 1;
$value2 = 2;
if(!$overlap){ //Toggle overlapping data.
$value1 = 2;
$value2 = 1;
}
//Set the initial condition at t=0;
$results = array();
$results[0]['seconds'] = 0;
$results[0]['Ax'] = round(($data[0][1])/$g, $round);
$results[0]['Ay'] = round(($data[0][2])/$g, $round);
$results[0]['Az'] = round(($data[0][3])/$g, $round);
$count = 1;
$interval = (int)(1000/$split)/$scale;
for($i = $interval; $i < count($data); $i += $interval){
$Ax = $Ay = $Az = 0;
for($j = $i - ($interval/$value1); $j < $i + ($interval/$value1); $j++){
$Ax += $data[$j][1];
$Ay += $data[$j][2];
$Az += $data[$j][3];
}
$results[$count]['seconds'] = round($count/$scale, $round);
$results[$count]['Ax'] = round(($Ax/($interval * $value2))/$g, $round);
$results[$count]['Ay'] = round(($Ay/($interval * $value2))/$g, $round);
$results[$count]['Az'] = round(($Az/($interval * $value2))/$g, $round);
$count++;
}
array_pop($results); //We do this because the last interval
//will not have enough data to be calculated.
//Set the final condition with the data from the end of the last complete interval.
$results[$count - 1]['seconds'] = round(($count - 1)/$scale, $round);
$results[$count - 1]['Ax'] = round(($data[$i - $interval][1])/$g, $round);
$results[$count - 1]['Ay'] = round(($data[$i - $interval][2])/$g, $round);
$results[$count - 1]['Az'] = round(($data[$i - $interval][3])/$g, $round);
return $results;
}
To use:
$data = array_map('str_getcsv', file($path));
$split = 5; //(int) - # of milliseconds inbetween datapoints.
$scale = 4; // (int) # of data points per second you want to display.
$overlap = TRUE; //(Bool) - Overlap data from one interval to the next.
$results = formatAccelData($data, $split, $scale, $overlap);
print_r($results);
THE OLD UPDATED SOLUTION
Remember, this function takes the average leading up to the interval. So it's really a half an interval behind.
function formatAccelData($data, $step){
$fps = 1000/$step;
$second = 1;
$frame = 0;
$count = 0;
for($i = 0; $i < count($data); $i += $fps){
$Ax = $Ay = $Az = 0;
for($j = 0; $j < $fps; $j++){
$Ax += $data[$frame][1];
$Ay += $data[$frame][2];
$Az += $data[$frame][3];
$frame++;
}
$results[$count]['seconds'] = $second;
$results[$count]['Ax'] = ($Ax/$fps) * 0.101971621297793;
$results[$count]['Ay'] = ($Ay/$fps) * 0.101971621297793;
$results[$count]['Az'] = ($Az/$fps) * 0.101971621297793;
$second++;
$count++;
}
return $results;
}
How to use:
$data = array_map('str_getcsv', file($path));
$step = 5; //milliseconds
$results = formatAccelData($data, $step);
print_r($results);

Exponential Moving Average in php

I want to calculate the EMA (Exponential Moving Average) value in PHP.
I've tried the following code, but it gives me a 500 error.
$real = array(12,15,17,19,21,25,28,12,15,16);
$timePeriod = 3;
$data = trader_ema($real,$timePeriod);
var_dump($data);
PHP: EMA calculation function trader-ema
I spent a long time Googling but could not find any help on this for PHP, so I have no clue what needs to be done to calculate the EMA value.
Edit-1: Installed extensions
I've installed all the necessary extensions, and now I am getting output.
But it doesn't seem to be the proper output.
I think the PHP function for calculating EMA is not working properly.
Any help in this would be greatly appreciated.
I recommend to use the math library from:
https://github.com/markrogoyski/math-php
public static function exponentialMovingAverage(array $numbers, int $n): array
{
$m = count($numbers);
$α = 2 / ($n + 1);
$EMA = [];
// Start off by seeding with the first data point
$EMA[] = $numbers[0];
// Each day after: EMAtoday = α⋅xtoday + (1-α)EMAyesterday
for ($i = 1; $i < $m; $i++) {
$EMA[] = ($α * $numbers[$i]) + ((1 - $α) * $EMA[$i - 1]);
}
return $EMA;
}
The trader extension for PHP actually looks quite promising. The underlying code looks very mature, and I notice that, at the time of writing, the latest stable PHP module (0.5.1) was released at the start of this year with support for PHP 8.
It may take some reading of the documentation, for example the note around trader_set_unstable_period, and god-forbid, the trader source code to become proficient.
If I do a quick installation of the trader module in a PHP Docker container
apt-get update
pecl install trader
docker-php-ext-enable trader
using the article from here as a benchmark
and put together a simple test script comparing the function supplied by #Tryke and trader_ema
function exponentialMovingAverage(array $numbers, int $n): array
{
$m = count($numbers);
$α = 2 / ($n + 1);
$EMA = [];
// Start off by seeding with the first data point
$EMA[] = $numbers[0];
// Each day after: EMAtoday = α⋅xtoday + (1-α)EMAyesterday
for ($i = 1; $i < $m; $i++) {
$EMA[] = ($α * $numbers[$i]) + ((1 - $α) * $EMA[$i - 1]);
}
return $EMA;
}
function merge_results($input, $avgs) {
$results = [];
$empty = count($input) - count($avgs);
foreach($input as $i => $price) {
$results[] = $i < $empty ? [$price, null] : [$price, round($avgs[$i], 2)];
}
return $results;
}
$real = [
22.27,
22.19,
22.08,
22.17,
22.18,
22.13,
22.23,
22.43,
22.24,
22.29,
22.15,
22.39,
22.38,
22.61,
23.36,
24.05,
23.75,
23.83,
23.95,
23.63,
23.82,
23.87,
23.65,
23.19,
23.10,
23.33,
22.68,
23.10,
22.40,
22.17
];
$timePeriod = 10;
$traderData = trader_ema($real,$timePeriod);
echo "trader ema\n";
var_dump(merge_results($real, $traderData));
$phpData = exponentialMovingAverage($real, 3);
echo "\n\nphp ema\n";
var_dump(merge_results($real, $phpData));
The results of trader_ema match the benchmark exactly. The results from Tryke's function do not. It seems to produce results starting on the first day, whereas my expectation (which the output of trader_ema and the benchmark numbers reflect) is that there are no results until the $timePeriod has elapsed. See this note from the Investopedia article on EMA:
Calculating the EMA requires one more observation than the SMA.
Suppose that you want to use 20 days as the number of observations for
the EMA. Then, you must wait until the 20th day to obtain the SMA. On
the 21st day, you can then use the SMA from the previous day as the
first EMA for yesterday.
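The seeding rule in that note can be sketched in plain PHP as follows. This is an illustration only: the function name emaSeededWithSma is hypothetical, and it is an assumption (not verified here) that this reproduces trader_ema's output for every input. The SMA of the first $n values is used as the first EMA, so no value is produced before position $n - 1.

```php
<?php
/**
 * EMA seeded with the SMA of the first $n values (hypothetical helper).
 * Result keys match the input positions, so the first key is $n - 1.
 */
function emaSeededWithSma(array $numbers, int $n): array
{
    $numbers = array_values($numbers);
    if ($n < 1 || count($numbers) < $n) {
        return []; // not enough data for even one value
    }
    $alpha = 2 / ($n + 1);
    // Seed: simple moving average over the first $n observations.
    $previous = array_sum(array_slice($numbers, 0, $n)) / $n;
    $ema = [$n - 1 => $previous];
    for ($i = $n; $i < count($numbers); $i++) {
        // EMA today = alpha * x today + (1 - alpha) * EMA yesterday
        $previous = $alpha * $numbers[$i] + (1 - $alpha) * $previous;
        $ema[$i] = $previous;
    }
    return $ema;
}
```

Because the keys line up with the input positions, the result can be passed to a helper such as merge_results above, which fills the positions before the first key with null.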

Compare two charts and get percentage of similarity

Question: How do I compare two chart ranges and get a percentage as the result?
I want to compare two chart parts (ranges of 60 values) and get the percentage of difference as the result, so I can find very similar chart curves.
(Example: get all charts that are 90% similar to this one.)
For every range, the data is stored in an array of 60 numbers.
Every range starts with 0, and all following numbers represent the chart value from that moment ("+" means the chart goes up and "-" means the chart goes down).
$range1 = array(0.00,-0.90,2.10,0.10,-3.40,-4.30,-1.90,-0.30,0.00,0.10,-0.60,-0.20,-0.30,-0.30,1.00,-0.90,-0.50,1.00,2.80,5.00,5.50,5.20,6.70,5.50,5.70,7.30,6.00,5.10,5.30,11.10,10.90,9.00,7.10,6.60,7.00,5.50,5.50,12.60,15.60,14.30,18.50,16.60,16.60,20.30,20.60,18.10,16.10,19.10,14.40,18.70,17.40,17.80,17.20,19.90,20.60,17.70,17.00,17.50,16.70,14.70);
$range2 = array(0.00,-2.90,-3.60,-3.10,-3.90,-5.90,-11.80,-8.40,-8.00,-8.40,-8.20,-7.00,-7.60,-7.30,-5.10,-7.20,-7.30,-7.40,-7.70,-8.90,-9.30,-9.30,-9.90,-7.50,-11.70,-12.20,-19.80,-19.60,-19.90,-19.00,-22.10,-19.10,-20.10,-18.90,-19.70,-19.90,-16.50,-23.70,-26.60,-24.20,-28.30,-27.00,-28.60,-28.90,-22.90,-24.00,-25.10,-24.30,-18.40,-31.70,-29.80,-29.00,-29.50,-28.30,-35.50,-27.60,-34.00,-32.80,-36.00,-34.40,);
$result = some_specific_manual_written_function($range1, $range2);
//as a result I want to get a percentage, or something else that tells me how similar the charts are
I will read the data from CSV files and then store it in a db, so it can be done with PHP or Python.
Try this
function some_specific_manual_written_function($range1, $range2)
{
$match = 0;
for($i=0; $i<count($range1); $i++)
{
if(in_array($range1[$i],$range2))
{
$match++;
}
}
$percentage_match = ($match/count($range1)) * 100;
echo "Percentage Match is : ".$percentage_match."%";
return $percentage_match;
}
function similiar($range1, $range2) {
$i = 0;
foreach ($range1 as $k => $v) {
if ($range2[$k] == $v) {
$i++;
}
}
return ($i/count($range1))*100;
}
