I have the following code in one of my pages. Prior to this I execute a query that returns multiple rows keyed off of alias_code. This code creates an array of string arrays to be echoed into a JavaScript function for populating points on a graph. I've profiled this multiple times, but I still have the feeling that there's a more efficient way to do this. I do realize that I run the risk of running out of memory if my strings are too big, but I'll constrain this in my query, since I'd like to avoid an additional sub-array or the use of implode/join. Does anyone have any thoughts on speeding this up?
$detailArray = array();
$prevAliasCode = '';
$valuesStr = '';
while ($detailRow = mysqli_fetch_array($detailResult)) {
    $aliasCode = $detailRow['alias_code'];
    if ($aliasCode != $prevAliasCode) {
        if ($valuesStr != '') {
            $detailArray[$prevAliasCode] = $valuesStr;
        }
        $valuesStr = '';
    }
    if ($valuesStr != '') {
        $valuesStr .= ', ';
    }
    $valuesStr .= "['" . $detailRow['as_of_date'] . "', " .
        $detailRow['difficulty'] . ", " .
        $detailRow['price_usd'] . "]";
    $prevAliasCode = $aliasCode;
}
$detailArray[$prevAliasCode] = $valuesStr;
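For contrast, here is a minimal sketch of the sub-array approach the question rules out; it trades the manual string building for one json_encode call at output time (the casts are added so the numeric columns aren't emitted as quoted strings):

$detailArray = array();
while ($detailRow = mysqli_fetch_array($detailResult)) {
    // group one [date, difficulty, price] point per row under its alias
    $detailArray[$detailRow['alias_code']][] = array(
        $detailRow['as_of_date'],
        (float) $detailRow['difficulty'],
        (float) $detailRow['price_usd'],
    );
}
// later, when emitting the JavaScript:
// echo json_encode($detailArray[$aliasCode]);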
I have this query which works successfully in one of my PHP scripts:
$sql = "UPDATE Units SET move_end = $currentTime, map_ID = $mapID, attacking = $attackStartTime, unit_ID_affected = $enemy, updated = now() WHERE unit_ID IN ($attackingUnits);";
$attackingUnits is an imploded array of anywhere between 1 and 100 integers.
What I'd like to do is also add arrays with different values for $currentTime and $mapID which correspond with the values for $attackingUnits. Something like this:
$sql = "UPDATE Units SET move_end = " . $attackingUnits['move_end'] . ", map_ID = " . $attackingUnits['map_ID'] . ", attacking = $attackStartTime, unit_ID_affected = $enemy, updated = now() WHERE unit_ID IN ($attackingUnits);";
Obviously that won't work the way I want it to, because $attackingUnits['move_end'] and $attackingUnits['map_ID'] are just single values, not arrays, but I'm stumped as to how to write this query. I know I could run one query for each element of $attackingUnits, but that is precisely what I'm trying to avoid, since I'd like to use one UPDATE for as many elements as required.
How would I write this query?
The key parts of the PHP script are:
$attackStartTime = time(); // the time the units started attacking the enemy (i.e. the current time)
// create a proper multi-dimensional array as the client only sends a string of comma-delimited unitID values
$data = array();
// add the enemy unit ID to the start of the selectedUnits CSV and call it allUnits. we then run the same query for all units in the selectedUnits array. this avoids two separate queries for pretty much the same columns
$allUnits = $enemy . "," . $selectedUnits;
// get the current enemy and unit data from the database
$sql = "SELECT user_ID, unit_ID, type, map_ID, moving, move_end, destination, attacking, unit_ID_affected, current_health FROM Units WHERE unit_ID IN ($allUnits);";
$result = mysqli_query($conn, $sql);
// convert the CSV strings to arrays for processing in this script
$selectedUnits = explode(',', $selectedUnits);
$allUnits = explode(',', $allUnits);
while ($row = mysqli_fetch_assoc($result)) {
    $data[] = $row;
}
$result->close();
$increment = 0; // set an increment value outside of the foreach loop so that we can use the pointer value at each loop
// check each selected unit to see if it can validly attack the enemy unit, otherwise remove them from selected units and send an error back for that specific unit
foreach ($data as &$unit) {
    // do a whole bunch of checking stuff here
}
// convert the attacking units (i.e. the unit ids from selected units which have passed the attacking tests) to a CSV string for processing on the database
$attackingUnits = implode(',', $selectedUnits);
// update each attacking unit with the start time of the attack and the unit id we are attacking, as well as any change in movement data
// HERE IS MY PROBLEMATIC QUERY
$sql = "UPDATE Units SET moving = " . $unit['moving'] . ", move_end = " . $unit['move_end'] . ", map_ID = " . $unit['map_ID'] . ", attacking = $attackStartTime, unit_ID_affected = $enemy, updated = now() WHERE unit_ID IN ($attackingUnits);";
$result = mysqli_query($conn, $sql);
// send back the full data array - should only be used for testing and not in production!
echo json_encode($data);
mysqli_close($conn);
OK, after some more web research I found a link that helped me out:
https://stuporglue.org/update-multiple-rows-at-once-with-different-values-in-mysql/
I updated his code to mysqli and after a lot of testing, it works! I can now successfully UPDATE hundreds of rows with one query, rather than sending hundreds of small updates via PHP. Here are the relevant parts of my code for anyone who's interested:
$updateValues = array(); // the array we are going to build out of all the unit values that need to be updated
// build up the query string
$updateValues[$unit["unit_ID"]] = array(
    "moving" => $unit["new_start_time"],
    "move_end" => $unit["new_end_time"],
    "map_ID" => "`destination`",
    "destination" => $unit["new_destination"],
    "attacking" => 0,
    "unit_ID_affected" => 0,
    "updated" => "now()"
);
// start of the query
$updateQuery = "UPDATE Units SET ";
// columns we will be updating
$columns = array(
    "moving" => "`moving` = CASE ",
    "move_end" => "`move_end` = CASE ",
    "map_ID" => "`map_ID` = CASE ",
    "destination" => "`destination` = CASE ",
    "attacking" => "`attacking` = CASE ",
    "unit_ID_affected" => "`unit_ID_affected` = CASE ",
    "updated" => "`updated` = CASE "
);
// build up each column's CASE statement
foreach ($updateValues as $id => $values) {
    $columns['moving'] .= "WHEN `unit_ID` = " . mysqli_real_escape_string($conn, $id) . " THEN " . mysqli_real_escape_string($conn, $values['moving']) . " ";
    $columns['move_end'] .= "WHEN `unit_ID` = " . mysqli_real_escape_string($conn, $id) . " THEN " . mysqli_real_escape_string($conn, $values['move_end']) . " ";
    $columns['map_ID'] .= "WHEN `unit_ID` = " . mysqli_real_escape_string($conn, $id) . " THEN " . mysqli_real_escape_string($conn, $values['map_ID']) . " ";
    $columns['destination'] .= "WHEN `unit_ID` = " . mysqli_real_escape_string($conn, $id) . " THEN " . mysqli_real_escape_string($conn, $values['destination']) . " ";
    $columns['attacking'] .= "WHEN `unit_ID` = " . mysqli_real_escape_string($conn, $id) . " THEN " . mysqli_real_escape_string($conn, $values['attacking']) . " ";
    $columns['unit_ID_affected'] .= "WHEN `unit_ID` = " . mysqli_real_escape_string($conn, $id) . " THEN " . mysqli_real_escape_string($conn, $values['unit_ID_affected']) . " ";
    $columns['updated'] .= "WHEN `unit_ID` = " . mysqli_real_escape_string($conn, $id) . " THEN " . mysqli_real_escape_string($conn, $values['updated']) . " ";
}
// add a default case, here we are going to use whatever value was already in the field
foreach ($columns as $columnName => $queryPart) {
    $columns[$columnName] .= " ELSE `$columnName` END ";
}
// build the WHERE part. since we keyed our updateValues off the database keys, this is pretty easy
// $where = " WHERE `unit_ID` = '" . implode("' OR `unit_ID` = '", array_keys($updateValues)) . "'";
$where = " WHERE unit_ID IN ($unitIDs);"; // $unitIDs is the imploded list of keys from $updateValues
// join the statements with commas, then run the query
$updateQuery .= implode(', ', $columns) . $where;
$result = mysqli_query($conn, $updateQuery);
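For illustration, with two units the generated statement has roughly this shape (the IDs and values here are hypothetical, and the remaining columns are elided):

UPDATE Units SET
`moving` = CASE WHEN `unit_ID` = 7 THEN 0 WHEN `unit_ID` = 9 THEN 0 ELSE `moving` END,
`move_end` = CASE WHEN `unit_ID` = 7 THEN 1000 WHEN `unit_ID` = 9 THEN 2000 ELSE `move_end` END,
...
`updated` = CASE WHEN `unit_ID` = 7 THEN now() WHEN `unit_ID` = 9 THEN now() ELSE `updated` END
WHERE unit_ID IN (7,9);

Each row picks its own value from its CASE branch, so a single round trip updates them all.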
This will significantly reduce the load on my database as these events can happen every second (think of hundreds of players at once, attacking hundreds of enemy units with hundreds of their own units). I hope this helps someone out.
I need to update the tags column so each cell has content like this:
2-5-1-14-5
or
3-9-14-19-23
or similar (five integers, each in the range 1-25).
The id column is not consecutive from 1-117, but in any case the min id is 1 and the max is 117.
$arr = [];
$str = '';
$id = 1;
for ($x = 1; $x <= 25; $x++) {
    array_push($arr, $x);
}
while ($id < 117) {
    shuffle($arr);
    array_splice($arr, 5, 25);
    foreach ($arr as $el) {
        $str .= $el . '-';
    }
    $str = rtrim($str, '-');
    $db->query("update posts set tags = '" . $str . "' where id = " . $id);
    $id += 1;
}
I'm not sure how to describe the final result, but it seems that the majority of cells end up written with the same values multiple times.
Any help?
To combine my comments into one piece of code:
$full = range(1, 25);
$id = 1;
while ($id <= 117) { // <= so that the max id, 117, is included
    shuffle($full);
    $section = array_slice($full, 0, 5);
    $str = implode('-', $section);
    $db->query("update posts set tags = '" . $str . "' where id = " . $id);
    $id += 1;
}
So the reset of $str is no longer needed, since the implode() now builds the whole string in one go. The other bits of code could probably be improved as well.
Two warnings:
Using PHP variables directly in queries is not a good idea. Please use parameter binding (see the sketch after these warnings). This particular piece of code might not be vulnerable to SQL injection, but if you do the same elsewhere it might be.
Your database doesn't seem to be normalized. This might cause trouble for you in the long run when you expand your application.
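For example, here is a minimal sketch of that binding, reworking the loop from the code above with a prepared statement (assuming $db is a mysqli connection, which the $db->query() calls suggest):

$full = range(1, 25);
$id = 1;
$stmt = $db->prepare("UPDATE posts SET tags = ? WHERE id = ?");
while ($id <= 117) {
    shuffle($full);
    $str = implode('-', array_slice($full, 0, 5));
    $stmt->bind_param('si', $str, $id); // 's' = string, 'i' = integer
    $stmt->execute();
    $id += 1;
}
$stmt->close();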
I'm trying to slim down the code:
I want to make a for loop out of this part, but it wouldn't work.
$line1 = $frage1[0] . '|' . $frage1[1] . '|' . $frage1[2] . '|' . $frage1[3];
$line2 = $frage2[0] . '|' . $frage2[1] . '|' . $frage2[2] . '|' . $frage2[3];
$line3 = $frage3[0] . '|' . $frage3[1] . '|' . $frage3[2] . '|' . $frage3[3];
$line4 = $frage4[0] . '|' . $frage4[1] . '|' . $frage4[2] . '|' . $frage4[3];
$line5 = $frage5[0] . '|' . $frage5[1] . '|' . $frage5[2] . '|' . $frage5[3];
This is my attempt:
for ($i = 1; $i < 6; $i++) {
    ${line.$i} = ${frage.$i}[0] . '|' . ${frage.$i}[1] . '|' . ${frage.$i}[2] . '|' . ${frage.$i}[3];
}
EDIT:
This is the solution that works (just so simple :-p):
for ($i = 1; $i < 18; $i++) {
    ${"line".$i} = implode("|", ${"frage".$i});
    fwrite($antworten, ${"line".$i});
}
for ($i = 1; $i < 6; $i++) {
    $line[$i] = ${"frage".$i}[0] . '|' . ${"frage".$i}[1] . '|' . ${"frage".$i}[2] . '|' . ${"frage".$i}[3];
}
This will insert the same thing you posted above into $line[1] through $line[5]. Is this what you are looking for?!
for ($i = 1; $i < 6; $i++) {
    ${"line$i"} = ${"frage$i"}[0] . '|' . ${"frage$i"}[1] . '|' . ${"frage$i"}[2] . '|' . ${"frage$i"}[3];
}
That works too!
TL;DR
You might find the documentation on variable variables useful but it looks like the problem is that PHP cannot handle the unquoted string inside your curly brackets.
So instead of ${frage.$i} you need ${"frage$i"}.
Another Approach
However, this is probably not the clearest way to solve this problem. It certainly gives me a bit of a headache trying to work out what this code is trying to do. Instead I would recommend adding all of your $frage to an array first and then looping as follows:
$lines = array();
$frages = array($frage1, $frage2, $frage3, $frage4, $frage5);
foreach ($frages as $frage) {
    $lines[] = join('|', $frage);
}
Note in particular that you can use join to concatenate each $frage with a | in between each element, rather than doing the concatenation manually. You could always use array_slice if you really did only want the first 4 elements of the array to be joined, as sketched below.
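For example, this one-liner joins only the first four elements, mirroring the original code, which stopped at index 3:

$lines[] = join('|', array_slice($frage, 0, 4));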
If you have a significant number of $frage variables and don't want to add them to an array manually then:
$frages = array();
for ($i = 1; $i < x; $i++) {
    $frages[] = ${"frage$i"};
}
If you really need each $line variable rather than an array of lines, then you can use extract, although this will give you variables like $line_1 rather than $line1:
extract($lines, EXTR_PREFIX_ALL, "line");
However, I would recommend taking a serious look at why you have these numbered $frage being generated and why you need these numbered $line as your output. I would be very surprised if your code could not be re-factored to just use arrays instead, which would make your code much simpler, less surprising and easier to maintain.
I have a task where I need to parse an extremely big file and write the results into a MySQL database. "Extremely big" means we are talking about 1.4 GB of sort-of-CSV data, totalling approx. 10 million lines of text.
The question is not HOW to do it, but how to do it FAST. My first approach was to just do it in PHP without any speed optimization and then let it run for a few days until it's done. Unfortunately, it's been running for 48 hours straight right now and has processed only 2% of the total file. Therefore, that's not an option.
the file format is as follows:
A:1,2
where the amount of comma-separated numbers following the ":" can be 0-1000. The example dataset has to go into a table as follows:
| A | 1 |
| A | 2 |
So right now, I did it like this:
$fh = fopen("file.txt", "r");
$line = ""; // buffer for the data
$i = 0; // line counter
$start = time(); // benchmark
while($line = fgets($fh))
{
$i++;
echo "line " . $i . ": ";
//echo $i . ": " . $line . "<br>\n";
$line = explode(":", $line);
if(count($line) != 2 || !is_numeric(trim($line[0])))
{
echo "error: source id [" . trim($line[0]) . "]<br>\n";
continue;
}
$targets = explode(",", $line[1]);
echo "node " . $line[0] . " has " . count($targets) . " links<br>\n";
// insert links in link table
foreach($targets as $target)
{
if(!is_numeric(trim($target)))
{
echo "line " . $i . " has malformed target [" . trim($target) . "]<br>\n";
continue;
}
$sql = "INSERT INTO link (source_id, target_id) VALUES ('" . trim($line[0]) . "', '" . trim($target) . "')";
mysql_query($sql) or die("insert failed for SQL: ". mysql_error());
}
}
echo "<br>\n--<br>\n<br>\nseconds wasted: " . (time() - $start);
This is obviously not optimized for speed in ANY way. Any hints for a fresh start? Should I switch to another language?
The first optimization would be to insert inside a transaction - commit every 100 or 1000 lines and begin a new transaction. Obviously you'd have to use a storage engine that supports transactions.
Then observe the CPU usage with the top command - if you have multiple cores and the mysql process isn't doing much while the PHP process does most of the work, rewrite the script to accept a parameter that skips n lines from the beginning and imports only 10000 lines or so. Then start multiple instances of the script, each with a different starting point.
The third solution would be to convert the file into a CSV with PHP (no INSERTs at all, just writing to a file) and then use LOAD DATA INFILE, as m4t1t0 suggested.
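A minimal sketch of that last approach, assuming the file format from the question and that the MySQL server can read the generated file (the paths are placeholders):

// Pass 1: expand "A:1,2"-style lines into tab-separated pairs - no SQL involved.
$in  = fopen("file.txt", "r");
$out = fopen("/tmp/links.tsv", "w");
while ($line = fgets($in)) {
    $parts = explode(":", trim($line));
    if (count($parts) != 2) continue; // skip malformed lines
    foreach (explode(",", $parts[1]) as $target) {
        fwrite($out, trim($parts[0]) . "\t" . trim($target) . "\n");
    }
}
fclose($in);
fclose($out);

Pass 2 is then a single statement (tab-separated fields and newline-terminated lines are LOAD DATA's defaults):

LOAD DATA INFILE '/tmp/links.tsv' INTO TABLE link (source_id, target_id);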
as promised, attached you'll find the solution i went for in this post. i benchmarked it and it turned out, that it is 40 times (!) faster than the old one :)
sure - there's still much room for optimization, but it's fast enough for me right now :)
$db = mysqli_connect(/*...*/) or die("could not connect to database");
$fh = fopen("data", "r");
$line = ""; // buffer for the data
$i = 0; // line counter
$start = time(); // benchmark timer
$node_ids = array(); // all (source) node ids
mysqli_autocommit($db, false);
while ($line = fgets($fh))
{
    $i++;
    echo "line " . $i . ": ";
    $line = explode(":", $line);
    $line[0] = trim($line[0]);
    if (count($line) != 2 || !is_numeric($line[0]))
    {
        echo "error: source node id [" . $line[0] . "] - skipping...\n";
        continue;
    }
    else
    {
        $node_ids[] = $line[0];
    }
    $targets = explode(",", $line[1]);
    echo "node " . $line[0] . " has " . count($targets) . " links\n";
    // insert links in link table
    foreach ($targets as $target)
    {
        if (!is_numeric(trim($target))) // trim, since the last target on a line still carries the newline
        {
            echo "line " . $i . " has malformed target [" . trim($target) . "]\n";
            continue;
        }
        $sql = "INSERT INTO link (source_id, target_id) VALUES ('" . $line[0] . "', '" . trim($target) . "')";
        mysqli_query($db, $sql) or die("insert failed for SQL: " . mysqli_error($db));
    }
    if ($i % 1000 == 0)
    {
        $node_ids = array_unique($node_ids);
        foreach ($node_ids as $node)
        {
            $sql = "INSERT INTO node (node_id) VALUES ('" . $node . "')";
            mysqli_query($db, $sql);
        }
        $node_ids = array();
        mysqli_commit($db);
        mysqli_autocommit($db, false);
        echo "committed to database\n\n";
    }
}
echo "<br>\n--<br>\n<br>\nseconds wasted: " . (time() - $start);
I find your description rather confusing - and it doesn't match up with the code you've provided.
if(count($line) != 2 || !is_numeric(trim($line[0])))
The trim here is redundant - whitespace doesn't change the behaviour of is_numeric. But you've said elsewhere that the start of the line is a letter - therefore this will always fail.
If you want to speed it up then switch to using stream processing of the input rather than message processing (PHP arrays can be very slow) or use a different language and aggregate the insert statements into multi-line inserts.
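For the multi-line inserts, here is a minimal sketch building on the variables from the script above ($line, $targets, $db); the batch size of 500 is arbitrary:

$values = array();
foreach ($targets as $target) {
    $target = trim($target);
    if (!is_numeric($target)) continue;
    $values[] = "('" . $line[0] . "', '" . $target . "')";
    if (count($values) >= 500) { // flush a full batch as one INSERT
        mysqli_query($db, "INSERT INTO link (source_id, target_id) VALUES " . implode(",", $values));
        $values = array();
    }
}
if ($values) { // flush the remainder
    mysqli_query($db, "INSERT INTO link (source_id, target_id) VALUES " . implode(",", $values));
}

In practice you would keep $values alive across input lines and flush every N accumulated rows, so short lines still end up batched together.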
I would first just use the script to create a SQL file. Then lock the table using this: http://dev.mysql.com/doc/refman/5.0/en/lock-tables.html by placing the appropriate commands at the start/end of the SQL file (you could get your script to do this).
Then just use the command-line tool to inject the SQL into the database (preferably on the machine where the database resides).
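The generated SQL file might then look like this (a sketch, with the table name taken from the earlier posts):

LOCK TABLES link WRITE;
INSERT INTO link (source_id, target_id) VALUES (1, 2), (1, 3);
-- ... more multi-row inserts ...
UNLOCK TABLES;

and could be loaded with the mysql command-line client, e.g. mysql -u user -p dbname < inserts.sql.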
I have two tables, each with 160k+ rows. Between the two, some UUIDs are shared. I'm using a foreach loop over the "new" table with an embedded foreach searching the "old" table. When a UUID match is found, the "old" table is updated with data from the "new" table.
Both tables have an index on the ID.
My problem is that this operation is extremely time-intensive; does anyone know a more efficient way to do this search for matching UUIDs? Side note: we are using the MySQLi extension for PHP 5.3.
Example code:
$oldCounter = 0;
$newCounter = 1;
//loop
foreach ($accounts as $accKey => $accValue)
{
    echo("New ID - " . $oldCounter++ . ": " . $accValue['id'] . "\n");
    foreach ($accountContactData as $acdKey => $acdValue)
    {
        echo("Old ID - " . $newCounter++ . ": " . $acdValue['id'] . " \n");
        if ($accValue['id'] == $acdValue['id'] && (
            $accValue['phone_office'] == "" || $accValue['phone_office'] == NULL || $accValue['phone_office'] == 0)
        ) {
            echo("ID match found\n");
            //when match found update accounts with accountsContact info
            $query = '
                UPDATE `accounts`
                SET
                    `phone_fax` = "' . $acdValue['fax'] . '",
                    `phone_office` = "' . $acdValue['telephone1'] . '",
                    `phone_alternate` = "' . $acdValue['telephone2'] . '"
                WHERE
                    `id` = "' . $acdValue['id'] . '"
            ';
            echo "" . $query . "\n\n";
            $DB->query($query);
            break 1;
        }
    }
}
unset($oldCounter);
unset($newCounter);
Thank you in advance.
Do this all in SQL.
There is nothing that I see in your code that requires PHP.
UPDATE allows JOIN. JOIN the new and old tables and have your WHERE conditions match those in your description. Should be pretty straightforward and significantly faster.
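A sketch of such a statement, reusing the column names from the question's query and assuming the old data lives in a table called account_contact_data (adjust to your real table name):

UPDATE accounts AS a
JOIN account_contact_data AS c ON c.id = a.id
SET a.phone_fax = c.fax,
    a.phone_office = c.telephone1,
    a.phone_alternate = c.telephone2
WHERE a.phone_office = '' OR a.phone_office IS NULL OR a.phone_office = 0;

One statement replaces the entire nested loop, and the id indexes you already have make the join cheap.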
I wrote a function some months ago. You can modify it as you want; it can improve the speed of your search:
public static function search($query)
{
    $result = array();
    $all = custom_query::getNumRows("bar");
    $quarter = floor(0.25 * $all) + 1;
    $all = 0;
    for ($i = 0; $i < 4; $i++) {
        custom_query::condition("", "limit $all, $quarter");
        $data = custom_query::FetchAll("bar");
        foreach ($data as $v) {
            foreach ($v as $_v) {
                if (count(explode($query, $_v)) > 1) { // more than one piece means $_v contains $query
                    $result[] = $v["bar_id"];
                }
            }
        }
        $all += $quarter;
    }
    return $result;
}
It returns the IDs of the records the search matched.
This method divides the table into 4 parts, and each iteration fetches only a quarter of it...
You can change this number to, for example, 10 or 20 for more speed...
Some methods belong to the class and you can easily write them yourself...