I'm developing a game and I want a script to run every minute (via a cron job, I guess).
The script should take each character's maximum HP (a column in a database table), calculate 10% of that value, and add it to the character's current HP (another column in the same table), iterating over all rows in the table.
E.g. consider the following table:
charname    current_hp    max_hp
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
player1     20            30
player2     15            64
player3     38            38
After the script has been run, I want the table to look like this:
charname    current_hp    max_hp
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
player1     23            30
player2     21            64
player3     38            38
So I know how I could technically implement this, e.g.:
$characters = $db->getCharacterList();
foreach ($characters as $character) {
    $maxHp = $character['max_hp'];
    $curHp = $character['current_hp'];
    $hpToAdd = $maxHp / 10;
    if (($curHp + $hpToAdd) >= $maxHp) {
        $db->update('characters', $character['id'], array(
            'current_hp' => $maxHp
        ));
    } else {
        $db->update('characters', $character['id'], array(
            'current_hp' => ($curHp + $hpToAdd)
        ));
    }
}
My only question is: is the solution posted above an efficient way to implement this? Will it work on a table with, say, 10,000 rows, or will it take too long?
Thanks in advance.
Are you using a SQL database? If you have 10,000 rows, your solution will hit the database 10,000 times. A SQL statement like this will only make one database hit:
UPDATE characters
SET current_hp = LEAST(max_hp, current_hp + (max_hp / 10))
The database will still have to do the I/O necessary to update all 10,000 rows, but it will happen more efficiently than thousands of individual queries.
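For the cron side, the whole script then collapses to one query. A minimal sketch, assuming a mysqli connection; the connection details are placeholders:

<?php
// regen_hp.php - run every minute from cron; credentials are placeholders
$db = new mysqli('localhost', 'user', 'pass', 'game');
// One statement regenerates every character, capped at max_hp
$db->query('UPDATE characters SET current_hp = LEAST(max_hp, current_hp + (max_hp / 10))');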
One SQL statement will do everything you need. For more performance, remove any indexes on this table to keep it fast:
UPDATE test
SET current_hp = CASE
    WHEN ((max_hp / 10) + current_hp) >= max_hp THEN max_hp
    ELSE ((max_hp / 10) + current_hp)
END
So I wrote a script to extract data from raw genome files. Here's what a raw genome file looks like:
# rsid chromosome position genotype
rs4477212 1 82154 AA
rs3094315 1 752566 AG
rs3131972 1 752721 AG
rs12124819 1 776546 AA
rs11240777 1 798959 AG
rs6681049 1 800007 CC
rs4970383 1 838555 AC
rs4475691 1 846808 CT
rs7537756 1 854250 AG
rs13302982 1 861808 GG
rs1110052 1 873558 TT
rs2272756 1 882033 GG
rs3748597 1 888659 CT
rs13303106 1 891945 AA
rs28415373 1 893981 CC
rs13303010 1 894573 GG
rs6696281 1 903104 CT
rs28391282 1 904165 GG
rs2340592 1 910935 GG
The raw text file has hundreds of thousands of these rows, but I only need specific ones (about 10,000 of them). I have a list of rsids, and I just need the genotype from each matching line. So I loop through the rsid list and use preg_match to find each line I need:
$rawData = file_get_contents('genome_file.txt');
$rsids = $this->get_snps();
while ($row = $rsids->fetch_assoc()) {
    $searchPattern = "~rs{$row['rsid']}\t(.*?)\t(.*?)\t(.*?)\n~i";
    if (preg_match($searchPattern, $rawData, $matchedGene)) {
        $genotype = $matchedGene[3];
        // Do something with genotype
    }
}
NOTE: I stripped out a lot of code to show just the regexp extraction I'm doing. I'm also inserting each row into a database as I go along. Here's the code with the database work included:
$rawData = file_get_contents('genome_file.txt');
$rsids = $this->get_snps();

$query = "INSERT INTO wp_genomics_results (file_id,snp_id,genotype,reputation,zygosity) VALUES (?,?,?,?,?)";
$stmt = $ngdb->prepare($query);
$stmt->bind_param("iissi", $file_id, $snp_id, $genotype, $reputation, $zygosity);

$ngdb->query("START TRANSACTION");
while ($row = $rsids->fetch_assoc()) {
    $searchPattern = "~rs{$row['rsid']}\t(.*?)\t(.*?)\t(.*?)\n~i";
    if (preg_match($searchPattern, $rawData, $matchedGene)) {
        $genotype = $matchedGene[3];
        $stmt->execute();
        $insert++;
    }
}
$stmt->close();
$ngdb->query("COMMIT");
$snps->free();
$ngdb->close();
So unfortunately my script runs very slowly: 50 iterations take 17 seconds, so you can imagine how long 18,000 iterations will take. I'm looking into ways to optimise this.
Is there a faster way to extract the data I need from this huge text file? What if I exploded it into an array of lines and used preg_grep(); would that be any faster?
Something I tried is combining all 18,000 rsids into a single alternation (i.e. (rs123|rs124|rs125|...)), like this:
$rsids = get_rsids();
$rsid_group = implode('|', $rsids);
$pattern = "~({$rsid_group})\t(.*?)\t(.*?)\t(.*?)\n~i";
preg_match($pattern, $rawData, $matches);
But unfortunately it gave me an error message about exceeding the PCRE expression limit; the needle was way too big. Another thing I tried is adding the S modifier to the expression. I read that it analyses the pattern in order to increase performance, but it didn't speed things up at all. Maybe my pattern isn't compatible with it?
So the second thing I need to try to optimise is the database inserts. I added a transaction hoping that would speed things up, but it didn't help at all. So I'm thinking maybe I should group the inserts together, inserting multiple rows at once rather than one at a time.
Another idea is something I read about: using LOAD DATA INFILE to load rows from a text file. In that case I'd just need to generate the text file first; I wonder whether that would work out faster.
EDIT: It seems like what's taking up most of the time is the regular expressions. Running that part of the program by itself takes a really long time: 10 rows take 4 seconds.
This is slow because you're searching a vast array of data over and over again.
It looks like you have a text file, not a dbms table, containing lines like these:
rs4477212 1 82154 AA
rs3094315 1 752566 AG
rs3131972 1 752721 AG
rs12124819 1 776546 AA
It looks like you have some other data structure containing a list of values like rs4477212. I think that's already in a table in the dbms.
I think you want exact matches for the rsxxxx values, not prefix or partial matches.
I think you want to process many different files of raw data, and extract the same batch of rsxxxx values from each of them.
So, here's what you do, in pseudocode. Don't load the whole raw data file into memory, rather process it line by line.
Read your rows of rsid values from the dbms, just once, and store them in an associative array.
for each file of raw data....
for each line of data in the file...
split the line of data to obtain the rsid. In php, $array = explode("\t", $line, 2); will yield your rsid in $array[0], and do it fast. (The raw file is tab-separated, so split on the tab character.)
Look in your array of rsid values for this value. In php, if ( array_key_exists( $array[0], $rsid_array )) { ... will do this.
If the key does exist, you have a match.
extract the last column from the raw text line ('GG' or whatever)
write it to your dbms.
Notice how this avoids regular expressions, and how it processes your raw data line by line. You only have to touch each line of raw data once. That's good, because your raw data is also your largest quantity of data. It exploits php's associative array feature to do the matching. All that will be much faster than your method.
To speed up the process of inserting tens of thousands of rows into a table, read Optimizing InnoDB Insert Queries.
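If you do keep the prepared-statement inserts, batching rows into multi-row INSERT statements also cuts round trips. A rough sketch, reusing $ngdb and wp_genomics_results from the question; $rowsToInsert and the batch size are illustrative, not from the original code:

$batchSize = 500; // illustrative; keep each statement under max_allowed_packet
$values = array();
foreach ($rowsToInsert as $r) {
    // assumes the values were escaped earlier (e.g. with $ngdb->real_escape_string())
    $values[] = "({$r['file_id']}, {$r['snp_id']}, '{$r['genotype']}', '{$r['reputation']}', {$r['zygosity']})";
    if (count($values) == $batchSize) {
        $ngdb->query("INSERT INTO wp_genomics_results (file_id, snp_id, genotype, reputation, zygosity) VALUES " . implode(',', $values));
        $values = array();
    }
}
if (count($values) > 0) { // flush the final partial batch
    $ngdb->query("INSERT INTO wp_genomics_results (file_id, snp_id, genotype, reputation, zygosity) VALUES " . implode(',', $values));
}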
+1 to @Ollie Jones' answer. He posted while I was working on mine, so here's some code to get you started:
$rsids = $this->get_snps();
$rsidHash = array();
while ($row = $rsids->fetch_assoc()) {
    $key = 'rs' . $row['rsid'];
    $rsidHash[$key] = true;
}

$rawDataFd = fopen('genome_file.txt', 'r');
while ($rawData = fgetcsv($rawDataFd, 80, "\t")) {
    if (array_key_exists($rawData[0], $rsidHash)) {
        $genotype = $rawData[3];
        // do something with genotype
    }
}
I wanted to give the LOAD DATA INFILE approach a try to see how well it works, so I came up with what I thought was a nice, elegant approach. Here's the code:
$file = 'C:/wamp/www/nutri/wp-content/plugins/genomics/genome/test';
$data_query = "
LOAD DATA LOCAL INFILE '$file'
INTO TABLE wp_genomics_results
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
IGNORE 18 ROWS
(#rsid,#chromosome,#locus,#genotype)
SET file_id = '$file_id',
snp_id = (SELECT id FROM wp_pods_snp WHERE rsid = SUBSTR(#rsid,2)),
genotype = #genotype
";
$ngdb->query($data_query);
I put a foreign key constraint on the snp_id column (that's the ID for my table of RSIDs) so that it only enters genotypes for rsids that I need. Unfortunately this foreign key constraint caused some kind of error which locked the tables. Ah well. It might not have been a good approach anyhow, since there are on average 200,000 rows in each of these genome files. I'll go with Ollie Jones' approach, since that seems to be the most effective and viable one I've come across.
I'm attempting to work with two sets of data from the same MySQL table in a PHP script. The idea is that data is scraped from an API into a database hourly; a second script then pulls the information out of the database and displays a rolling 6-hour delta.
I've run into a bit of a problem trying to create the delta from the two datasets. I need to run two MySQL queries to get the data I need (current, and from 6 hours ago), but I can't think of a way to make the script work without putting the queries inside the loops that output each entry. (These can run up to a couple of hundred times, and I don't think having that many MySQL queries running would be good.)
This is what I have so far:
// Select the system table and pull data acquired within the last hour.
$sql = "SELECT system, vp, vpthreshold, owner, time FROM SysData WHERE time > DATE_SUB(NOW(),INTERVAL 1 HOUR)";
$result = $conn->query($sql);

if ($result->num_rows > 0) {
    // Output data of each row.
    while ($row = $result->fetch_assoc()) {
        $vpthreshold = $row["vpthreshold"];
        $vp = $row["vp"];

        // Catch potential divide-by-zeroes; echo that the system is stable.
        if ($vp == 0.0) {
            echo $row["system"] . " " . "is Stable<br>";
        }
        // Else calculate and output the contested percentage with the system name in a readable format.
        else {
            $currentcontested = $vp / $vpthreshold * 100;
            echo $row["system"] . " " . "{$currentcontested}%" . "<br>";
        }
    }
}
There's a broadly identical statement that pulls and echoes the second set of information underneath this. How can I get these two sets together so I can work with them? I'm very new to PHP and learning on the fly here.
You can look into nested queries. Something like the following:
SELECT (data_now.somevalue - data_prev.somevalue) AS deltavalue
FROM (
    (your first select statement) AS data_now,
    (your 6 hours ago select statement) AS data_prev
);
This lets you select data from other select statements all in one go.
Replace the two inner select statements with your respective queries. The results are put temporarily into data_now and data_prev, which you can then use like normal tables in the outer select statement.
EDIT: To be more specific to what you want, here is an updated example:
SELECT (data_now.vp/data_now.vpthreshold - data_prev.vp/data_prev.vpthreshold) as deltavalue FROM
(
(SELECT system, vp, vpthreshold, owner, time FROM SysData WHERE time > DATE_SUB(NOW(),INTERVAL 1 HOUR)) as data_now,
(your 6 hours ago select statement) as data_prev
);
In your PHP code remember to reference the result as:
$row["deltavalue"]
or whatever you put after "as" in the outer SELECT.
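One thing to watch: as written, the two derived tables are cross-joined, so every current row pairs with every old row. To line each system's current reading up with its own six-hour-old reading, join the derived tables on the system name. A sketch, assuming one row per system per hourly scrape; the window for the old snapshot is illustrative:

SELECT data_now.system,
       (data_now.vp / data_now.vpthreshold - data_prev.vp / data_prev.vpthreshold) AS deltavalue
FROM (SELECT system, vp, vpthreshold FROM SysData
      WHERE time > DATE_SUB(NOW(), INTERVAL 1 HOUR)) AS data_now
JOIN (SELECT system, vp, vpthreshold FROM SysData
      WHERE time BETWEEN DATE_SUB(NOW(), INTERVAL 7 HOUR)
                     AND DATE_SUB(NOW(), INTERVAL 6 HOUR)) AS data_prev
  ON data_prev.system = data_now.system;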
I have a database table with various fields involving jobs done on ships, including a field named created which uses the DATE format. The result I want to achieve is a unique reference number for each job. The format I want for this reference number is:
Example: let's say the date of the job is 23/11/2013, like today. Then the number would be 1311/1, the next job 1311/2, and so on. If the month changes and the date of the next job is, for example, 15/12/2013, the reference number I'd like for the first job of that month is 1312/1.
So the first two digits of my reference number would show the year, the next two the month, and the number after the slash should be an auto-increment number that resets each month. My code so far is:
$job_num = 1;
foreach ($random as $rand) {
    $vak = $rand->created;
    $gas = $rand->id;
    $vak1 = substr($vak, 2, 2);
    $vak2 = substr($vak, 5, -3);
    $vak3 = substr($vak, 8, 10);
    if (date('j') > 1) {
        echo $vak1 . $vak2 . '/' . $job_num . '<br>';
        $job_num++;
    } else {
        $job_num = 1;
        echo $vak1 . $vak2 . '/' . $job_num . '<br>';
        $job_num++;
    }
}
So as you can see, I want to achieve all this inside a foreach statement. And although the above code kind of works, the problem I have is that on the 1st of any month (in other words, when date('j') == 1), if I insert more than one job into my database the $job_num variable resets as many times as the number of jobs I have inserted, resulting in identical reference numbers.
I am really new to programming and PHP, so if anyone could help me solve this I would really appreciate it.
Thanks in advance :)
You can't do this with the auto-increment mechanism if you use InnoDB, which is MySQL's default storage engine.
You can do it with the MyISAM storage engine, but you really shouldn't use MyISAM, for many reasons.
So you'll have to assign the repeating numbers yourself. This means you have to lock the table while you check what is the current maximum number for the given month, then insert a new row with the next higher number.
If that seems like it would impair concurrent access to the table, you're right. Keep in mind that MyISAM does a table-lock during insert/update/delete of any row.
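A sketch of that lock-then-insert sequence, assuming a jobs table with a yr_mo column for the month prefix and a seq column for the per-month counter (the names are illustrative):

LOCK TABLES jobs WRITE;
INSERT INTO jobs (yr_mo, seq, created)
SELECT 1311, COALESCE(MAX(seq), 0) + 1, NOW()
FROM jobs
WHERE yr_mo = 1311;   -- 1311 = current year/month, computed in PHP
UNLOCK TABLES;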
If you can use the MyISAM engine, you can get this behavior without procedural code.
create table demo (
yr_mo integer not null,
id integer auto_increment,
other_columns char(1) default 'x',
primary key (yr_mo, id)
) engine=MyISAM;
insert into demo (yr_mo) values (1311);
insert into demo (yr_mo) values (1311);
insert into demo (yr_mo) values (1311);
insert into demo (yr_mo) values (1311);
insert into demo (yr_mo) values (1312);
The last INSERT statement starts a new month.
Now, if you look at the autoincrement values the MyISAM engine assigned:
select * from demo;
yr_mo  id  other_columns
-----  --  -------------
1311    1  x
1311    2  x
1311    3  x
1311    4  x
1312    1  x
This is MyISAM's documented behavior; look for "MyISAM Notes".
If you want the form yymm/n for presentation, use something like this.
select concat(yr_mo, '/', id) as cat_key
from demo;
I have created a MySQL database table which has two fields: serial # (primary key, auto-increment) and version.
What I want to do is, for every insert into this table, set the version value to 1, then 2, 3, 4, 5, and then back down to 1 again.
I can imagine doing this by pulling out the last entry in the table before doing an insert. How can I do this? Is there an easier way? I am using PHP.
You can use MySQL's built-in triggers: http://dev.mysql.com/doc/refman/5.0/en/triggers.html
You can create a database trigger to do what you want. The syntax is something like this:
CREATE
TRIGGER `event_name` BEFORE/AFTER INSERT/UPDATE/DELETE
ON `database`.`table`
FOR EACH ROW BEGIN
-- trigger body
-- this code is applied to every
-- inserted/updated/deleted row
END;
Edit:
Triggers are created directly on the database side. You can connect using the MySQL CLI or phpMyAdmin to run the query.
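For this question's 1-to-5 cycle specifically, here is a sketch of such a trigger; the table name demo_versions and the columns serial (auto-increment) and version are assumptions, not from the question:

DELIMITER //
CREATE TRIGGER version_cycle BEFORE INSERT ON demo_versions
FOR EACH ROW
BEGIN
    DECLARE last_version INT;
    -- read the most recent row; stays NULL when the table is empty
    SELECT version INTO last_version
    FROM demo_versions ORDER BY serial DESC LIMIT 1;
    SET NEW.version = (COALESCE(last_version, 0) % 5) + 1;
END//
DELIMITER ;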
Also, a lazy solution may be to use a query similar to what aaaaaa123456789 suggested:
SELECT (
(SELECT version FROM <table name here> ORDER BY id DESC LIMIT 0, 1) % 5
) + 1 as version;
The modulo (%) operator returns the remainder of dividing the two numbers and can be used to limit your increments.
Ex.:
1 % 5 = 1
2 % 5 = 2
3 % 5 = 3
5 % 5 = 0
7 % 5 = 2
10 % 5 = 0
PS: I haven't tried this query, because I'm not on my dev machine, but as far as I can tell it works.
The easiest way is, as rcdmk suggested, to write a trigger in the database itself that updates the version field with the corresponding value.
If a trigger is out of the question for any reason, then you could just do SELECT version FROM <table name here> ORDER BY id DESC LIMIT 0, 1 to get the last value. In fact, bracketing that (putting it between parentheses) and sending 1 + (query here) to the INSERT query as the insertion value is probably the best way to do it, since it avoids race conditions.
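One wrinkle with the subquery-in-INSERT idea: MySQL rejects a subquery in the VALUES list that reads the very table being inserted into (error 1093). Wrapping the subquery in a derived table sidesteps that. A sketch, with the same assumed table and column names as the trigger example above:

INSERT INTO demo_versions (version)
SELECT (COALESCE(v.last_version, 0) % 5) + 1
FROM (SELECT (SELECT version
              FROM demo_versions
              ORDER BY serial DESC
              LIMIT 1) AS last_version) AS v;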
$next_increment = 0;
$qShowStatus = "SHOW TABLE STATUS LIKE 'yourtable'";
$qShowStatusResult = mysql_query($qShowStatus) or die ( "Query failed: " . mysql_error() . "<br/>" . $qShowStatus );
$row = mysql_fetch_assoc($qShowStatusResult);
$next_increment = $row['Auto_increment'];
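(Presumably this snippet feeds the version calculation: Auto_increment from SHOW TABLE STATUS is the id the next row will receive. A sketch of that last step, assuming serials start at 1 and have no gaps, since deleted or rolled-back rows would throw the cycle off:)

// hypothetical: map the upcoming auto-increment id onto the 1..5 cycle
$version = (($next_increment - 1) % 5) + 1;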
I need to come up with a way to make a large task faster to beat the timeout.
I have very limited access to the server due to the restrictions of the hosting company.
I have a system set up where a cron visits a PHP file that grabs a csv that contains data on some products. The csv does not contain all of the fields that the product would have. Just a handful of essential ones.
I've read a fair number of articles on timeouts and handling CSVs, and currently (in an attempt to shave time) I have made a table (let's call it csv_data) to hold the CSV data. I have a script that truncates the csv_data table and then inserts data from the CSV, so each night the latest recordset from the CSV is in that table (the CSV file gets updated nightly). So far, no timeout problems; the task only takes about 4-5 seconds.
The timeouts occur when I have to sift through the data to make updates to the products table. The steps it runs through right now are like this:
1. Get the sku from the csv_data table (which holds thousands of records).
2. SELECT * FROM products WHERE products.sku = csv.sku (the products table also holds thousands of records to loop through).
3. Get numrows.
If numrows == 0 {no record in products, so skip}.
If numrows > 1 {duplicate entries; don't change anything, but report the sku later on}.
If numrows == 1 {update selected fields in the products table with the csv data}.
4. Go to the next record in csv_data and start all over again.
(I figured outlining the process is shorter and easier than dropping in the code.)
I looked into MySQL views and stored procedures, but I am not skilled enough with them to know whether they can handle the 'if' statement portion.
Is there anything I can do to make this faster to avoid the timeouts?
edit:
I should mention that set_time_limit(0); isn't doing it. And if it helps, the server uses IIS7 and FastCGI.
Thanks for your help.
Update after using suggestions from Jakob and Shawn:
I'm doing something wrong. The speed is definitely faster and the csv sku is incrementing, but when I tried to implement Shawn's solution the query gives me a PHP Warning: mysql_result() expects parameter 1 to be resource, boolean given error.
Can you help me spot what I am doing wrong?
Here is the section of code:
$csvdata = "SELECT * FROM csv_update";
$csvdata_result = mysql_query($csvdata);
mysql_query($csvdata);
$csvdata_num = mysql_num_rows($csvdata_result);
$i = 0;
while ($i < $csvdata_num) {
    $csv_code = @mysql_result($csvdata_result, $i, "skucode");
    $datacheck = NULL;
    $datacheck = substr($csv_code, 0, 1);
    if ($datacheck >= '0' && $datacheck <= '9') {
        $csv_price        = @mysql_result($csvdata_result, $i, "price");
        $csv_retail       = @mysql_result($csvdata_result, $i, "retail");
        $csv_stock        = @mysql_result($csvdata_result, $i, "stock");
        $csv_weight       = @mysql_result($csvdata_result, $i, "weight");
        $csv_manufacturer = @mysql_result($csvdata_result, $i, "manufacturer");
        $csv_misc1        = @mysql_result($csvdata_result, $i, "misc1");
        $csv_misc2        = @mysql_result($csvdata_result, $i, "misc2");
        $csv_selectlist   = @mysql_result($csvdata_result, $i, "selectlist");
        $csv_level5       = @mysql_result($csvdata_result, $i, "level5");
        $csv_frontpage    = @mysql_result($csvdata_result, $i, "frontpage");
        $csv_level3       = @mysql_result($csvdata_result, $i, "level3");
        $csv_minquantity  = @mysql_result($csvdata_result, $i, "minquantity");
        $csv_quantity1    = @mysql_result($csvdata_result, $i, "quantity1");
        $csv_discount1    = @mysql_result($csvdata_result, $i, "discount1");
        $csv_quantity2    = @mysql_result($csvdata_result, $i, "quantity2");
        $csv_discount2    = @mysql_result($csvdata_result, $i, "discount2");
        $csv_quantity3    = @mysql_result($csvdata_result, $i, "quantity3");
        $csv_discount3    = @mysql_result($csvdata_result, $i, "discount3");

        $count_check = "SELECT COUNT(*) AS totalCount FROM products WHERE skucode = '$csv_code'";
        $count_result = mysql_query($count_check);
        mysql_query($count_check);
        $totalCount = @mysql_result($count_result, 0, 'totalCount');
        $loopCount = ceil($totalCount / 25);

        for ($j = 0; $j < $loopCount; $j++) {
            $prod_check = "SELECT skucode FROM products WHERE skucode = '$csv_code' LIMIT ($loopCount*25), 25;";
            $prodresult = mysql_query($prod_check);
            mysql_query($prod_check);
            $prodnum = @mysql_num_rows($prodresult);
            $prod_id = @mysql_result($prodresult, 0, "catalogid");

            if ($prodnum < 1) {
                echo "NOT FOUND:$csv_code<br>";
                $count_sku_not_found = $count_sku_not_found + 1;
                $list_sku_not_found = $list_sku_not_found . " $csv_code";
            }
            if ($prodnum > 1) {
                echo "DUPLICATE:$csv_ccode<br>";
                $count_duplicate_skus = $count_duplicate_skus + 1;
                $list_duplicate_skus = $list_duplicate_skus . " $csv_code";
            }
            if ($prodnum == 1) {
                // This prevents an overwrite from happening if the csv file doesn't produce properly.
                if ($csv_price != "" OR $csv_price != NULL) { $sql_price = 'price="' . $csv_price . '"'; }
                if ($csv_retail != "" OR $csv_retail != NULL) { $sql_retail = ',retail="' . $csv_retail . '"'; }
                if ($csv_stock != "" OR $csv_stock != NULL) { $sql_stock = ',stock="' . $csv_stock . '"'; }
                if ($csv_weight != "" OR $csv_weight != NULL) { $sql_weight = ',weight="' . $csv_weight . '"'; }
                if ($csv_manufacturer != "" OR $csv_manufacturer != NULL) { $sql_manufacturer = ',manufacturer="' . $csv_manufacturer . '"'; }
                if ($csv_misc1 != "" OR $csv_misc1 != NULL) { $sql_misc1 = ',misc1="' . $csv_misc1 . '"'; }
                if ($csv_misc2 != "" OR $csv_misc2 != NULL) { $sql_pother2 = ',pother2="' . $csv_misc2 . '"'; }
                if ($csv_selectlist != "" OR $csv_selectlist != NULL) { $sql_selectlist = ',selectlist="' . $csv_selectlist . '"'; }
                if ($csv_level5 != "" OR $csv_level5 != NULL) { $sql_level5 = ',level5="' . $csv_level5 . '"'; }
                if ($csv_frontpage != "" OR $csv_frontpage != NULL) { $sql_frontpage = ',frontpage="' . $csv_frontpage . '"'; }

                $import = "UPDATE products SET $sql_price $sql_retail $sql_stock $sql_weight $sql_manufacturer $sql_misc1 $sql_misc2 $sql_selectlist $sql_level5 $sql_frontpage $sql_in_stock WHERE skucode='$csv_code'";
                mysql_query($import) or die(mysql_error("error updating in products table"));
                echo "Update " . $csv_code . " successful ($i)<br>";
                $count_success_update_skus = $count_success_update_skus + 1;
                $list_success_update_skus = $list_success_update_skus . " $csv_code";

                // Empty out variables.
                $sql_price = '';
                $sql_retail = '';
                $sql_stock = '';
                $sql_weight = '';
                $sql_manufacturer = '';
                $sql_misc1 = '';
                $sql_misc2 = '';
                $sql_selectlist = '';
                $sql_level5 = '';
                $sql_frontpage = '';
                $sql_in_stock = '';
                $prodnum = 0;
            }
        }
    }
    $i++;
}
Is it timing out before the first row is returned, or between rows during the read? One good practice would be to handle your query in chunks: do a count first to see how many records you are dealing with for the SKU, then loop through smaller chunks (the chunk size would depend on how many things you have to do with each row). Your updated workflow would look more like this (a code sketch follows the steps):
Get the next SKU from the CSV.
Get a total count: SELECT COUNT(*) AS totalCount FROM products WHERE products.sku = csv.sku
Determine the chunk size (using 25 for this demo): loopCount = ceil(totalCount / 25)
Loop through all results using a loop like this: for($i = 0; $i < $loopCount; $i++)
Inside your loop you should be running a query like this: SELECT * FROM products WHERE products.sku = csv.sku LIMIT ($i*25), 25
You will want to use a constant order for your SELECT chunks; your unique ID would probably be best.
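A rough sketch of those steps in PHP, using the question's products table and the old mysql_* API the question already uses. Note that MySQL only accepts literal numbers in LIMIT, so the offset is computed in PHP before the query string is built:

$count_sql = "SELECT COUNT(*) AS totalCount FROM products WHERE skucode = '$csv_code'";
$totalCount = mysql_result(mysql_query($count_sql), 0, 'totalCount');
$chunk = 25;
$loopCount = ceil($totalCount / $chunk);
for ($i = 0; $i < $loopCount; $i++) {
    $offset = $i * $chunk; // advances with the loop counter, not with $loopCount
    $sql = "SELECT * FROM products WHERE skucode = '$csv_code' ORDER BY catalogid LIMIT $offset, $chunk";
    $result = mysql_query($sql);
    while ($row = mysql_fetch_assoc($result)) {
        // update this product row from the csv values
    }
}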
I think you can solve this problem with cron (http://en.wikipedia.org/wiki/Cron). A script started by cron is not subject to the web server's timeout.
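For instance, a crontab entry like this (the script path is illustrative) runs the import through the PHP CLI, where the web server's timeout doesn't apply:

# run the nightly import at 02:00 via the PHP CLI
0 2 * * * /usr/bin/php /path/to/csv_import.php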