Basic PHP/MySQL math example - php

I have a PHP form that grabs user-entered data and then posts it to a MySQL database. How can I take the mathematical difference between two fields and post it to a third field in the database?
For example, I'd like to subtract "travel_costs" from "show_1_price" and write the difference to the "total_cost" variable. What's the best way to do this? Thanks so much.

You can run a SELECT query afterwards: SELECT show_1_price - travel_costs AS pricediff FROM my_table; then grab the value in PHP and do another insert query...

This should be simple to do on the PHP side of things. How about:
$query = sprintf("INSERT INTO `table` VALUES (%d, %d, %d)", $travel_costs,
    $show_1_price, $show_1_price - $travel_costs);
Generally, though, it is bad form to store a value in a database that can be calculated from other values. The reason is that you may never access this value again, yet you are using storage for it. CPU cycles are much more abundant today, so calculate the value when it is needed. This is not a golden rule, though: there are times when it is more efficient to store the calculated value, although this is not usually the case.
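A minimal sketch of doing the subtraction on the PHP side before the insert. The helper name buildShowRow() and the $pdo connection in the usage comment are hypothetical, and the amounts are assumed to be whole numbers (as the %d placeholders above suggest):

```php
<?php
// Hypothetical helper: derive total_cost from the two submitted fields.
// Assumes whole-number amounts; switch to decimals if your columns need them.
function buildShowRow(array $form): array
{
    $price  = (int) $form['show_1_price'];
    $travel = (int) $form['travel_costs'];

    return [
        'show_1_price' => $price,
        'travel_costs' => $travel,
        'total_cost'   => $price - $travel, // the difference asked about
    ];
}

// Possible usage with an existing PDO connection (not executed here):
// $row  = buildShowRow($_POST);
// $stmt = $pdo->prepare(
//     'INSERT INTO my_table (show_1_price, travel_costs, total_cost)
//      VALUES (:show_1_price, :travel_costs, :total_cost)'
// );
// $stmt->execute($row);
```

If you follow the advice about not storing derived values, drop total_cost from the insert and compute it in the SELECT instead.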

Related

php : speed up levenshtein comparing, 10k + records

In my MySQL table I have the field name, which is unique. However, the contents of the field are gathered from different places, so because of spelling errors I can end up with two records with very similar names instead of the second one being discarded.
Now I want to find those entries that are very similar to another one. For that I loop through all my records, and compare the name to other entries by looping through all the records again. Problem is that there are over 15k records which takes way too much time. Is there a way to do this faster?
This is my code:
for($x=0;$x<count($serie1);$x++)
{
    for($y=0;$y<count($serie2);$y++)
    {
        $sim=levenshtein($serie1[$x]['naam'],$serie2[$y]['naam']);
        if($sim==1)
            print("{$serie1[$x]['naam']} --> {$serie2[$y]['naam']} = {$sim}<br>");
    }
}
A preamble: such a task will always be time consuming, and there will always be some pairs that slip through.
Nevertheless, a few ideas :
1. Actually, the algorithm can be improved (a bit)
Assuming that $serie1 and $serie2 have the same values in the same order, you don't need to loop over the whole second array in the inner loop every time. In this use case you only need to evaluate each pair once: levenshtein('a', 'b') is sufficient, you don't need levenshtein('b', 'a') as well (and you don't need levenshtein('a', 'a') either).
Under these assumptions, you can write your loop like this:
for($x=0;$x<count($serie1);$x++)
{
    for($y=$x+1;$y<count($serie2);$y++) // <-- $y doesn't need to start at 0
    {
        $sim=levenshtein($serie1[$x]['naam'],$serie2[$y]['naam']);
        if($sim==1)
            print("{$serie1[$x]['naam']} --> {$serie2[$y]['naam']} = {$sim}<br>");
    }
}
2. Maybe MySQL is faster
There are examples on the net of levenshtein() implemented as a MySQL function. An example on SO is here: How to add levenshtein function in mysql?
If you are comfortable with complex(ish) SQL, you could delegate the heavy lifting to MySQL and at least gain a bit of performance, because you aren't fetching all 16k rows into the PHP runtime.
3. Don't do everything at once / save your results
Of course you have to run the function once for every record, but after the initial run you only have to check the entries added since the last run. Schedule a cron job that checks all new records once every day/week/month. You would need an inserted_at column in your table, and you would still need to compare the new names with every other name entry.
3.5 Do some of the work on insert
a) If the wait is acceptable, run the check whenever a new record is about to be inserted, so that you can either write it to a log or give direct feedback to the user. (A tangent: this could be a good use case for an asynchronous task queue like http://gearman.org/ -> start a new process for the check in the background and return the success message for the insert immediately.)
b) PHP has two other functions that help with finding similar strings: metaphone() and soundex(). These functions generate keys that represent how a string sounds when spoken. You could generate one or both of these keys on each insert, store them in a separate column in your table, and use simple SQL comparisons to find records with matching keys.
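Ideas 1 and 3.5 b) can be combined. The following is only a sketch, with a hypothetical helper name and a plain list of strings as input: it buckets the names by their metaphone() key and runs levenshtein() only inside each bucket, visiting each pair once. Be aware that this trades completeness for speed, since a typo that changes the phonetic key puts the two names in different buckets and the pair is missed.

```php
<?php
// Hypothetical helper: find pairs of names within $maxDistance edits of each other,
// comparing only names that share the same metaphone() key.
function findNearDuplicates(array $names, int $maxDistance = 1): array
{
    // Bucket the names by phonetic key.
    $buckets = [];
    foreach ($names as $name) {
        $buckets[metaphone($name)][] = $name;
    }

    $pairs = [];
    foreach ($buckets as $bucket) {
        $count = count($bucket);
        for ($x = 0; $x < $count; $x++) {
            // Start at $x + 1 so every pair is evaluated only once.
            for ($y = $x + 1; $y < $count; $y++) {
                $distance = levenshtein($bucket[$x], $bucket[$y]);
                if ($distance > 0 && $distance <= $maxDistance) {
                    $pairs[] = [$bucket[$x], $bucket[$y], $distance];
                }
            }
        }
    }

    return $pairs;
}

$pairs = findNearDuplicates(['Dexter', 'Lost', 'Dextar']);
foreach ($pairs as [$a, $b, $distance]) {
    print("{$a} --> {$b} = {$distance}<br>");
}
```

With the keys stored in their own column, the bucketing step could be done by the database instead (GROUP BY on the key column), so PHP only ever sees the small groups.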
The trouble with levenshtein is that it only compares string a to string b. I once built a spelling corrector that put all the strings a into a big trie, which functioned as a dictionary. It would then look up any string b in that dictionary, finding all nearest-matching words. I did it first in Fortran (!), then in Pascal. It would be easier in a more modern language, but I suspect PHP would not make it easy. Look here.

MySQL Query Between Two Ranges

I need help with a query. I am taking input from a user where they enter a range between 1-100. So it could be like 30-40 or 66-99. Then I need a query to pull data from a table that has a high_range and a low_range to find a match to any number in their range.
So if a user did 30-40 and the table had entries for 1-80, 21-33, 32-40, 40-41, 66-99, and 1-29 it would find all but the last two in the table.
What is the easiest way to do this?
Thanks
If I understood correctly (i.e. you want any range that overlaps the one entered by the user), I'd say:
SELECT * FROM table WHERE low <= $high AND high >= $low
What I understood is that the range is stored in this format low-high. If that is the case, then this is a poor design. I suggest splitting the values into two columns: low, and high.
If you already have the values split, you can use some statement like:
SELECT * FROM myTable WHERE low <= $needleHigherBound AND high >= $needleLowerBound
If you have the values stored in one column and insist that they stay that way, you might find MySQL's SUBSTRING_INDEX function useful. But in that case you'll have to write a complicated query that parses the values of every row and then compares them to your search values. It seems like a lot of effort to cover up a design flaw.
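To see why the condition low <= $high AND high >= $low catches every overlap, here is the same test as a small PHP sketch run against the ranges from the question. The function name is made up for illustration, and the rows are hard-coded instead of coming from the table:

```php
<?php
// Two closed ranges overlap when each one starts before (or where) the other ends.
function rangesOverlap(int $low, int $high, int $searchLow, int $searchHigh): bool
{
    return $low <= $searchHigh && $high >= $searchLow;
}

// Table entries from the question, as [low_range, high_range].
$rows = [[1, 80], [21, 33], [32, 40], [40, 41], [66, 99], [1, 29]];

// The user searches for 30-40.
$matches = [];
foreach ($rows as [$low, $high]) {
    if (rangesOverlap($low, $high, 30, 40)) {
        $matches[] = "{$low}-{$high}";
    }
}

print(implode(', ', $matches)); // 1-80, 21-33, 32-40, 40-41
```

In the real query, pass the two user-supplied bounds as bound parameters of a prepared statement rather than interpolating $low and $high into the SQL string.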

Advanced statistics in PHP and MySQL

I have a slight problem. I have a dataset, which contains values measured by a weather station, which I want to analyze further using MySQL database and PHP.
Basically, the first column of the db contains the date and the other columns temperature, humidity, pressure etc.
Calculating the mean, st.dev., max, min etc. is quite simple. However, there are no built-in commands for other parameters which I need, such as kurtosis.
What I need is for example to calculate the skewness, mean, stdev etc. for the individual months, then days etc.
For the built-in functions it is easy, for example finding some of the parameters for the individual months would be:
SELECT AVG(Temp), STD(Temp), MAX(Temp)
FROM database
GROUP BY YEAR(Date), MONTH(Date)
Obviously I cannot use this for the more advanced parameters. I thought about ways of achieving this and could only think of one solution. I wrote a function by hand which processes the values and calculates things such as kurtosis using the particular formulae. But that means I would need to create arrays of data for each month, day, etc., depending on what I am currently calculating. So, for example, I would first need to take the data and split it into arrays, let's say Jan11, Feb11, Mar11, ..., where each array contains the data for that month. Then I would apply the function to those arrays and create new variables with the results (let's say kurtosis_jan11, kurtosis_feb11 etc.)
Now to my question. I need help with splitting the data. The problem is that I don't know in advance in which month the data starts and in which it ends, so I cannot set fixed variables for this. The program first has to check the first month and then create a new array for each month, day etc. until it reaches the last record.
That of course would be maybe one solution but if anyone has any other ideas about how to go around this problem I would very much appreciate your help.
You can do more complex queries to achieve this. Here are some examples http://users.drew.edu/skass/sql/ , including Skew
SELECT AVG(Temp), STD(Temp), MAX(Temp)
FROM database
WHERE Date BETWEEN date_from AND date_to
GROUP BY YEAR(Date), MONTH(Date)
I think you want to group the data within a date range.
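If you would rather do the splitting in PHP, as described in the question, you don't need to know the first and last month in advance: use the month as an array key and let the groups appear as the rows are read. This is a sketch only. The function names are made up, the rows are assumed to be associative arrays with a 'Date' string in YYYY-MM-DD format and a 'Temp' value, and the formulae are the population (biased) skewness and excess kurtosis, which may differ from the estimator you need:

```php
<?php
// Group the Temp values by 'YYYY-MM'. Assumes Date strings look like '2011-01-15'.
function groupByMonth(array $rows): array
{
    $groups = [];
    foreach ($rows as $row) {
        $key = substr($row['Date'], 0, 7); // e.g. '2011-01'
        $groups[$key][] = (float) $row['Temp'];
    }
    return $groups;
}

// Mean, population skewness and excess kurtosis of a non-empty list of numbers.
function moments(array $values): array
{
    $n = count($values);
    $mean = array_sum($values) / $n;

    $m2 = $m3 = $m4 = 0.0;
    foreach ($values as $value) {
        $d = $value - $mean;
        $m2 += $d ** 2;
        $m3 += $d ** 3;
        $m4 += $d ** 4;
    }
    $m2 /= $n;
    $m3 /= $n;
    $m4 /= $n;

    if ($m2 == 0.0) {
        // All values identical: skewness and kurtosis are undefined.
        return ['mean' => $mean, 'skewness' => null, 'kurtosis' => null];
    }

    return [
        'mean'     => $mean,
        'skewness' => $m3 / ($m2 ** 1.5),
        'kurtosis' => $m4 / ($m2 ** 2) - 3.0,
    ];
}

// Usage: one result per month, keyed by 'YYYY-MM'.
// $stats = array_map('moments', groupByMonth($rows));
```

Grouping by day works the same way with substr($row['Date'], 0, 10) as the key.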

Are date calculations faster in PHP or MySQL?

A while back a database administrator mentioned to me that some server-side programmers don't utilize SQL as often as they should. For instance, when it comes to making time-based calculations, he claims that SQL is better suited.
I didn't give that much consideration since it didn't really affect what I was doing. However, now I am making considerable time-based calculations. Typically, I have used PHP for this in the past. For the sake of performance, I am curious as to whether SQL would be more efficient.
For example, these are some of the tasks I have been doing:
$todaysDate = date("d-m-Y");
$todayStamp = strtotime($todaysDate); //Convert date to unix timestamp for comparison
$verifyStamp = strtotime($verifyDate); //Convert submitted date to unix timestamp for comparison
//The date comparison
if((strtotime($lbp) <= $verifyStamp) && ($verifyStamp <= $todayStamp)){
return true;
}
else {
$invalid = "$verifyDate is outside the valid date range: $lbp - $todaysDate.";
return $invalid;
}
The variables aren't that important - it's just to illustrate that I am making comparisons, adding time to current dates, etc.
Would it be beneficial if I were to translate some or all of these tasks to SQL? Note that my connection to my database is via PDO and that I usually have to create a new connection. Also, my date calculations typically will be inserted into a database. So when I say that I'm making comparisons or adding time to a current date, I mean that I'm making these calculation before adding whatever results from them to a query:
i.e. $result = something...INSERT INTO table VALUE = $result
The calculations could just as easily be INSERT INTO table VALUE = DATE_ADD(...
Any input is appreciated.
The overhead of talking to the database would negate any and all advantages it may or may not have. It's simple: if you're in PHP anyway, do the calculations in PHP. If the data you want to do calculations on is in the database, do it in the database. Don't switch between systems just for the sake of it unless you can really prove that it saves you a ton of time to do so (most likely it doesn't). What you're showing is child's play in either system; it can hardly get any faster than it already is.
When you compare SQL with a general-purpose programming language, SQL is usually preferable for calculations on data that already lives in the database.
For PHP and SQL specifically, this is what I have concluded from my own analysis.
PHP runs in a client-server architecture: the client sends an HTTP request to the server, and the server responds to the client with an HTTP response.
On the back end, the server uses the dynamic PHP code to generate a plain, static HTML page.
Now the total time is:
HTTP-Request + SQL-Query + Fetching data from SQL Query + Data Manipulation of SQL Data + Php-to-HTMLGeneration + HTTP-Response
But if the calculations are done within the SQL query itself, the time for manipulating the SQL data in PHP is saved, because PHP no longer has to process the data explicitly.
So the total time would be:
HTTP-Request + SQL-Query + Fetching data from SQL Query + Php-to-HTMLGeneration + HTTP-Response
The two may look almost equal if you are dealing with a small amount of data. But if, for instance, you are dealing with 1000 rows in one query, a PHP loop that runs 1000 times would take more time than a single query that processes all 1000 rows in one command.
One thing to consider is how many date calculations you are performing and where in the query your conversion takes place. Say you are searching a DB of 10 million records, converting a DateTime field into a Unix timestamp inside the WHERE clause for every single record, and ending up with only 100 records in the result. Performing that conversion in SQL on 10 million records would be less efficient than using PHP to convert the DateTime values into timestamps for only the 100 resulting records.
Granted, if you put the conversion in the SELECT list instead, only the 100 resulting records would be converted anyway, so it would be pretty much the same.
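If the comparison stays in PHP, as the first answer suggests, the check from the question can be written with DateTimeImmutable objects, which compare directly without converting to timestamps by hand. This is only a sketch: the function name is made up, and the dates are assumed to be strings that DateTimeImmutable can parse (ISO YYYY-MM-DD in the example).

```php
<?php
// Returns true when $verifyDate lies between $lowerBound and $today (inclusive).
// Assumes all three arguments are date strings DateTimeImmutable can parse.
function isDateInRange(string $verifyDate, string $lowerBound, string $today): bool
{
    $verify = new DateTimeImmutable($verifyDate);

    return new DateTimeImmutable($lowerBound) <= $verify
        && $verify <= new DateTimeImmutable($today);
}

var_dump(isDateInRange('2024-03-15', '2024-03-01', '2024-03-31')); // bool(true)
var_dump(isDateInRange('2024-04-02', '2024-03-01', '2024-03-31')); // bool(false)
```

The same idea helps with the WHERE clause case above: convert the search bounds once in PHP and compare the column against them, rather than converting the column for every row.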

MySQLi query vs PHP Array, which is faster?

I'm developing an algorithm for intense calculations on multiple huge arrays. Right now I use PHP arrays to do the job, but it seems slower than I need it to be. I was thinking of converting the PHP arrays into rows of a MySQL table (via MySQLi) and then running the calculations there to solve the speed issue.
At the very first step, converting a 20*10 PHP array into 200 database rows containing zeros took a long time. Here is the code (it basically generates a zero matrix, if you're interested):
$stmt = $mysqli->prepare("INSERT INTO `table` (`Row`, `Col`, `Value`) VALUES (?, ?, '0')");
for($i=0;$i<$rowsNo;$i++){
for($j=0;$j<$colsNo;$j++){
//$myArray[$j]=array_fill(0,$colsNo,0);
$stmt->bind_param("ii", $i, $j);
$stmt->execute();
}
}
$stmt->close();
The commented-out line "$myArray[$j]=array_fill(0,$colsNo,0);" generates the array very quickly, while filling the table with the next two lines takes much longer.
Array time: 0.00068 seconds
MySQLi time: 25.76 seconds
There is a lot more calculating left to do, and I'm worried that even after modifying numerous parts it may get worse. I searched a lot but couldn't find an answer on whether arrays or MySQL tables are the better choice. Has anybody done, or does anybody know of, a benchmark on this?
I really appreciate any help.
Thanks in advance
UPDATE:
I did the following test for a 273*273 matrix. I created two versions of the same data: first a two-dimensional PHP array, and second a table with 273*273=74529 rows, both containing the same data. The following are the speed test results for retrieving the same data from both [here, finding out which column(s) of a certain row have a value equal to 1; the other columns are zero]:
It took 0.00021 seconds for the array.
It took 0.0026 seconds for mysqli table. (more than 10 times slower)
My conclusion is sticking to the arrays instead of converting them into database tables.
One last thing: if the data is stored in the database table in the first place, generating an array from it and then using the array would be much, much slower, as shown below (slower because of the data retrieval from the database):
It took 0.9 seconds for the array. (more than 400 times slower)
It took 0.0021 seconds for mysqli table.
The main reason is not that the database itself is slower. The main reason is that the database accesses the hard drive to store data, while PHP functions use only RAM for this procedure, which is faster than the hard drive.
Although there is a way to speed up your insert queries (most likely you are using an InnoDB table without a transaction), the very premise of the question is wrong.
A database is intended, in the first place, to store data. To store it permanently. It does that well. It can do calculations too, but again: before doing any calculations there is one necessary step, which is to store the data.
If you want to do your calculations on stored data, it's OK to use a database.
If you want to push your data into a database only to calculate on it, it doesn't make much sense.
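For what it's worth, the "way to speed up your insert queries" mentioned above usually comes down to not paying the per-statement commit cost 200 times. One possible sketch is to build a single multi-row INSERT and run it inside one transaction. The helper name is made up, the table and column names are taken from the question, and the mysqli usage is shown only as comments because it needs a live connection:

```php
<?php
// Build one multi-row INSERT for a zero matrix instead of rows*cols single inserts.
// Returns [$sql, $params]; both dimensions are assumed to be at least 1.
function buildZeroMatrixInsert(int $rowsNo, int $colsNo): array
{
    $placeholders = [];
    $params = [];
    for ($i = 0; $i < $rowsNo; $i++) {
        for ($j = 0; $j < $colsNo; $j++) {
            $placeholders[] = '(?, ?, 0)';
            $params[] = $i; // Row
            $params[] = $j; // Col
        }
    }

    $sql = 'INSERT INTO `table` (`Row`, `Col`, `Value`) VALUES '
        . implode(', ', $placeholders);

    return [$sql, $params];
}

// Possible usage with an existing $mysqli connection (not executed here):
// [$sql, $params] = buildZeroMatrixInsert(20, 10);
// $mysqli->begin_transaction();
// $stmt = $mysqli->prepare($sql);
// $stmt->bind_param(str_repeat('i', count($params)), ...$params);
// $stmt->execute();
// $mysqli->commit();
```

For very large matrices you would split this into chunks, since a single statement is limited by max_allowed_packet. None of this changes the point above: if the data only exists to be calculated on, keep it in arrays.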
In my case, as shown on the update part of the question, I think arrays have better performance than mysql databases.
Array usage showed 10 times faster response even when I search through the cells to find desired values in a row. Even good indexing of the table couldn't beat the array functionality and speed.
