I have a script that opens a huge XLSX file and reads 3000 rows of data, saving it to a two-dimensional array. Of all places for Apache to crash, it does so in a simple loop that builds a MySQL query. I know this because if I remove the following lines from my application, it runs without issue:
$query = "INSERT INTO `map.lmds.dots` VALUES";
foreach($data as $i => $row)
{
$id = $row["Abonnementsid"];
$eier = $row["Eier"];
$status = $row["Status"];
if($i !== 0) $query .= "\n,";
$query .= "('$id', '$eier', '$status', '0', '0')";
}
echo $query;
I can't see a thing wrong with the code.
I'm using PHPExcel and dBug.php
Why is this script crashing Apache?
EDIT: Perhaps I should elaborate on what I mean by crash. I mean a classic Windows "Program has stopped working" dialog.
EDIT: Another attempt inspired by one of the answers. Apache still crashes:
$query = "INSERT INTO `map.lmds.dots` VALUES";
$records = array();
foreach($data as $i => &$row)
{
$id = $row["Abonnementsid"];
$eier = $row["Eier"];
$status = $row["Status"];
$records[] = "('$id', '$eier', '$status', '0', '0')";
}
echo $query . implode(",", $records);
EDIT: I have narrowed it down further. As soon as I add a foreach loop, Apache crashes.
foreach($data as $i => $row) {};
Like the other respondents have said, this is most likely a memory issue. You should check both your Apache error logs and your PHP error logs for more info.
Assuming this is a memory problem, I suggest you change your code so that you execute multiple INSERT statements inside the foreach loop rather than storing the whole thing in one big string and sending it to the database all at once. Of course, this means you're making 3000+ calls to the database rather than just one, so I'd expect it to be a bit slower; you can mitigate this by using a prepared statement, which should be a bit more efficient. If this is still too slow, try changing your loop so that you only call the database every N times round the loop.
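As a rough sketch of the prepared-statement variant (this assumes a mysqli connection in $mysqli and the five-column layout from your VALUES list; adjust names to your setup):
// Prepare once, execute per row. $mysqli and the column layout are assumptions.
$stmt = $mysqli->prepare("INSERT INTO `map.lmds.dots` VALUES (?, ?, ?, '0', '0')");
foreach ($data as $row) {
    $stmt->bind_param('sss', $row["Abonnementsid"], $row["Eier"], $row["Status"]);
    $stmt->execute();
}
$stmt->close();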
The amount of string concatenation and the amount of string data involved could be too much to handle at once within the permitted execution time and memory limit.
You could try to just collect the values in an array and put them together at the end:
$query = "INSERT INTO `map.lmds.dots` VALUES";
$records = array();
foreach($data as $i => $row) {
$records[] = "('".mysql_real_escape_string($row["Abonnementsid"])."', '".mysql_real_escape_string($row["Eier"])."', '".mysql_real_escape_string($row["Status"])."', '0', '0')";
}
$query .= implode("\n,", $records);
Or insert the records in chunks:
$query = "INSERT INTO `map.lmds.dots` VALUES";
$records = array();
foreach($data as $i => $row) {
$records[] = "('".mysql_real_escape_string($row["Abonnementsid"])."', '".mysql_real_escape_string($row["Eier"])."', '".mysql_real_escape_string($row["Status"])."', '0', '0')";
if ($i % 1000 === 999) {
mysql_query($query . implode("\n,", $records));
$records = array();
}
}
if (!empty($records)) {
mysql_query($query . implode("\n,", $records));
}
Also try using a reference in the foreach so that no internal copy of the array is made:
foreach($data as $i => &$row) {
// …
}
This does sound like a memory issue. Maybe it has nothing to do with the loop building the SQL query; it could be related to reading the "very" large file before that, with the loop then pushing memory usage over the limit. Did you try freeing up memory after reading the file?
You can use memory_get_peak_usage() and memory_get_usage() to get some more info about consumed memory.
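For example, a quick check you can drop in before and after the suspect code (just a sketch):
// Print current and peak memory usage in megabytes
echo 'Current: ' . round(memory_get_usage() / 1048576, 2) . " MB\n";
echo 'Peak:    ' . round(memory_get_peak_usage() / 1048576, 2) . " MB\n";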
If that doesn't solve your issue, install a debugger like Xdebug or Zend Debugger and do some profiling.
Alright, it turns out that updating PHP from 5.3.1 to 5.3.5 made the problem go away. I still have no idea as to what made PHP crash in the first place, but I suppose my PHP could simply have been broken and in need of a reinstall.
I have searched for hours to try and find a simple answer to this query I am having. I'm very sure it is covered in many ways, at least partially, by other answers. I need a clear answer specific to what I am doing, because I'm having difficulty putting together something that works for me from a bunch of other differently formatted and structured questions/answers.
I have a form which is posting a range of results intended to be used as keys for a mysql lookup loop. This is working fine - my post results in a successful array. For example:
$payarr = $_POST['pay'];
print_r($payarr);
Results in:
Array ( [0] => 12 [1] => 7 [2] => 1 )
Which is good, given that I want to run a mysqli query to select the rows where claim_id is each of the values in $payarr. I then want to use fputcsv to write each of these rows, in full, to a CSV file that is uniquely named with the date and time this is all run.
I have fragments of code that don't work and my frustration at trying to cobble something together that does work is getting a bit nuts. Can someone please show me how to do this effectively?
At the moment my code looks like this (but fails miserably):
<?php
include ("../conf/dbconfig.php");
include ("../conf/funcs.php");
include ("../conf/privs.php");
if($privs == 500) {
$view = $_POST['view'];
$payarr = $_POST['pay'];
$ts = date('Ymd-His');
print_r($payarr); //Testing to make sure our array to select from has arrived here ok.
//ob_start();
$fp = fopen('../csv/ResultsFile_'.$ts.'.csv', 'w');
foreach($payarr as $val) {
$result = mysqli_query($con, "'SELECT * FROM claims' WHERE claim_id='$val'");
//$row = mysqli_fetch_array($result, MYSQLI_ASSOC);
while ($row = mysqli_fetch_array($result)) {
echo $payarr;
echo $val;
//fputcsv($fp, $row);
print_r($row);
}
}
fclose($fp);
// return ob_get_clean();
} else {
header("Location: http://www.google.com.au/");
die;
}
?>
I'm willing (and happy) to rewrite the entire thing just as long as it works, so any help and suggestions are very appreciated!
Thanks in advance.
BTW - The $view value is not important in this process but is here as it will be passed back to the resulting header once this all works.
A few rough suggestions (you may need to adapt):
1) Change your query to use IN for the possible values, so you only have to execute one DB query:
$result = mysqli_query($con, "SELECT * FROM claims WHERE claim_id IN (" . implode(",",$payarr) . ")");
2) Make sure you are actually getting a DB result, instead of just assuming:
if(!$result || mysqli_num_rows($result) == 0) {
die("We didn't get any results from the DB!"); // Obviously you'll want better error handling
}
3) Now you can open your file, knowing that you need it. Make sure to verify it worked, since you could easily hit a permissions issue:
$fp = fopen('../csv/ResultsFile_'.$ts.'.csv', 'w');
if(!$fp) {
die("We couldn't open the CSV file for writing, check permissions!"); // Obviously you'll want better error handling
}
4) Now loop through your DB results and store them:
while ($row = mysqli_fetch_array($result)) {
fputcsv($fp, $row);
}
5) Does your CSV file need a header row? If so, change mysqli_fetch_array() to mysqli_fetch_assoc() and use this version of the loop:
while ($row = mysqli_fetch_assoc($result)) {
if(!isset($header)) {
$header = array_keys($row);
fputcsv($fp, $header);
}
fputcsv($fp, $row);
}
6) Only now should you close your file (in your code you do it inside the foreach loop):
fclose($fp);
7) Sanitize your $payarr. It could be as simple as:
$payarr = is_array($_POST['pay']) ? array_map('intval', $_POST['pay']) : array();
You may want to do more than that. But at least you're guaranteed to have an array with only integer values (and as long as you have no claim_id values of 0, it's harmless to have zeros in the array).
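Putting those pieces together, a rough sketch of the whole flow might look like this (assuming $con is your open mysqli connection and $ts is the timestamp from your original code):
$payarr = is_array($_POST['pay']) ? array_map('intval', $_POST['pay']) : array();
if (empty($payarr)) {
    die("Nothing was selected!"); // Again, use real error handling here
}
$result = mysqli_query($con, "SELECT * FROM claims WHERE claim_id IN (" . implode(",", $payarr) . ")");
if (!$result || mysqli_num_rows($result) == 0) {
    die("We didn't get any results from the DB!");
}
$fp = fopen('../csv/ResultsFile_' . $ts . '.csv', 'w');
if (!$fp) {
    die("We couldn't open the CSV file for writing, check permissions!");
}
while ($row = mysqli_fetch_assoc($result)) {
    if (!isset($header)) {
        $header = array_keys($row); // write the header row once
        fputcsv($fp, $header);
    }
    fputcsv($fp, $row);
}
fclose($fp);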
Hope that helps you at least track down where your code is failing.
I ran into the following question while writing a PHP script. I need to store the first two integers from an array of variable length into a database table, remove them, and repeat this until the array is empty. I could do it with a while loop, but I read that you should avoid writing SQL statements inside a loop because of the performance hit.
A simplified example:
while(count($array) > 0){
if ($sql = $db_connect->prepare("INSERT INTO table (number1, number2) VALUES (?,?)")){
$sql->bind_param('ii',$array[0],$array[1]);
$sql->execute();
$sql->close();
}
array_shift($array);
array_shift($array);
}
Is this the best way, and if not, what's a better approach?
You can do something like this, which is way faster as well.
Pseudo code:
$stack = array();
while(count($array) > 0){
array_push($stack, "(" . $array[0] . ", " . $array[1] . ")");
array_shift($array);
array_shift($array);
}
if ($sql = $db_connect->prepare("INSERT INTO table (number1, number2)
VALUES " . implode(',', $stack))){
$sql->execute();
$sql->close();
}
The only issue here is that it's not a "MySQL safe" insert; you will need to fix that!
This will generate an array that holds the values and insert them all at once within one query, which takes less MySQL time.
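One way to make it safe in this particular case (a sketch, assuming the values in $array really are integers as in the question) is to cast them while building the stack:
$stack = array();
while (count($array) > 0) {
    $a = (int) array_shift($array); // casting to int removes the injection risk here
    $b = (int) array_shift($array);
    $stack[] = "($a, $b)";
}
if ($sql = $db_connect->prepare("INSERT INTO table (number1, number2) VALUES " . implode(',', $stack))) {
    $sql->execute();
    $sql->close();
}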
Whether you run them one by one or all at once, an INSERT statement is not going to make a noticeable performance hit, in my experience.
The database connection is only opened once, so it is not a huge issue. I guess if you are doing some insane amount of queries, it could be.
I think as long as your loop condition is safe (it will break in time) and you get something out of it, it's OK.
You would be better off writing a bulk insert statement; fewer hits on MySQL:
$sql = "INSERT INTO table(number1, number2) VALUES";
$params = array();
foreach( $array as $item ) {
$sql .= "(?,?),\n";
$params[] = $item;
}
$sql = rtrim( $sql, ",\n" ) . ';';
$sql = $db_connect->prepare( $sql );
foreach( $params as $param ) {
$sql->bind_param( 'ii', $param[ 0 ], $param[ 1 ] );
}
$sql->execute();
$sql->close();
In ColdFusion you can put your loop inside the query instead of the other way around. I'm not a PHP programmer, but my general belief is that most things that can be done in language A can also be done in language B. This code shows the concept. You should be able to figure out a PHP version.
<cfquery>
insert into mytable
(field1, field2)
select null, null
from SomeSmallTable
where 1=2
<cfloop from="1" to="#arrayLen(myArray)#" index="i">
union
select <cfqueryparam value="#myArray[i][1]#">
, <cfqueryparam value="#myArray[i][2]#">
from SomeSmallTable
</cfloop>
</cfquery>
When I've looked at this approach myself, I've found it to be faster than a query inside a loop with Oracle and SQL Server. I found it to be slower with RedBrick.
There is a limitation with this approach: SQL Server has a maximum number of parameters it will accept and a maximum query length. Other DB engines might have limits as well; I've just not discovered them yet.
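For reference, a rough PHP take on the same idea (a sketch, assuming a mysqli connection $con and a two-column $myArray like the one in the ColdFusion example; MySQL allows SELECT without a FROM clause, so no helper table is needed):
// Build one INSERT ... SELECT ... UNION ALL SELECT ... statement
$selects = array();
foreach ($myArray as $row) {
    $selects[] = "SELECT '" . mysqli_real_escape_string($con, $row[0]) . "', '"
                            . mysqli_real_escape_string($con, $row[1]) . "'";
}
$sql = "INSERT INTO mytable (field1, field2) " . implode(" UNION ALL ", $selects);
mysqli_query($con, $sql);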
I got this code
$i = -1;
$random_string = array();
while (sizeof($random_string) < 1600000) {
$i++;
$zmienna = generatePassword();
if (!in_array($zmienna, $random_string))
$random_string[$i] = $zmienna;
else
continue;
}
//print_r($random_string);
foreach ($random_string as $value) {
$sql = "INSERT INTO `kody`(`kod`) VALUES ('$value')";
mysql_query($sql, $con);
}
But it will take many hours to insert it into the database, or even to build the array. Does someone know how to improve this code?
Well, in_array() is rather expensive. Use a hash instead of a simple array, and then you can use isset() instead of in_array().
Also, don't use things like sizeof() and count() as loop conditions. Instead, just use a simple for ($i = 0; $i < 1600000; ++$i) { ... } loop.
Depending on your web host permissions, another significant optimization would be to use fputcsv() to write your array to disk and then make use of MySQL's LOAD DATA INFILE to load the contents into your database, instead of generating 1.6 million queries.
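A rough sketch of that approach (the temporary file path is an assumption; $random_string, $con and the `kody`/`kod` names come from the question):
// Write the generated codes to a temporary CSV, one code per line
$csv = sys_get_temp_dir() . '/kody.csv';
$fp = fopen($csv, 'w');
foreach ($random_string as $value) {
    fputcsv($fp, array($value));
}
fclose($fp);
// LOCAL lets the client read the file; it must be enabled on the server
mysql_query("LOAD DATA LOCAL INFILE '" . addslashes($csv) . "'
    INTO TABLE `kody`
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
    LINES TERMINATED BY '\\n'
    (`kod`)", $con) or die(mysql_error());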
Yes, use one query to insert all of them at once with an SQL multi-insert:
$values = "('" . implode( "'), ('", $random_string) . "')";
$sql="INSERT INTO `kody`(`kod`) VALUES " . $values;
mysql_query($sql,$con);
As drrcknlsn very correctly points out, in_array() is inefficient, as it performs a linear O(n) search on the array. Here is how you can fix that (which is a hash implementation):
while( sizeof($random_string) < 1600000) {
$i++;
$zmienna = generatePassword();
if( !isset( $random_string[$zmienna]))
$random_string[$zmienna] = $zmienna;
else
continue;
}
Now, you can use the above code to generate a single SQL query, and this should run much, much faster.
The problem is probably that it's trying to update the INDEX after each insert. Try using transactions. This will only update the INDEX once (after COMMIT is called). This will also let you ROLLBACK if something goes wrong.
mysql_query("SET AUTOCOMMIT=0");
mysql_query("START TRANSACTION");
foreach($random_string as $value)
{
$sql="INSERT INTO `kody`(`kod`) VALUES ('$value')";
mysql_query($sql,$con);
}
mysql_query("COMMIT");
Hey guys, I'm currently learning PHP and I need to do this:
$connection = mysql_open();
$likes= array();
foreach($likes as $like)
{
$insert3 = "insert into ProfileInterests " .
"values ('$id', '$like', null)";
$result3 = # mysql_query ($insert3, $connection)
or showerror();
}
mysql_close($connection)
or showerror();
For some reason this does not work =/ I don't know why. $likes is an array which came from user input. I need it to insert into the table multiple times until all of the things in the array are in.
EDIT: I fixed the issue where I was closing the connection in my foreach loop. mysql_open is my own function, BTW.
Any ideas?
For one, $likes is an empty array in your example; I am assuming you fix that in the code you run.
The second is that you close the MySQL connection the first time the loop runs, which would prevent subsequent MySQL queries from running.
There's no such function as mysql_open; you may need mysql_connect.
Also, the $likes variable is empty, so no foreach iterations will execute.
You close the connection within the foreach loop.
Here is properly formatted code to insert data. You can use this:
// DATABASE CONNECTION
$conn=mysql_connect(HOST,USER,PASS);
$link=mysql_select_db(DATABASE_NAME,$conn);
// function to insert data ... here $tableName is the name of the table and $valuesArray is an array of user input (column => value)
function insertData($tableName,$valuesArray) {
$sqlInsert="";
$sqlValues="";
$arrayKeys = array_keys($valuesArray);
for($i=0;$i < count($arrayKeys);$i++)
{
$sqlInsert .= $arrayKeys[$i].",";
$sqlValues .= '"'.mysql_real_escape_string($valuesArray[$arrayKeys[$i]]).'",';
}
if($sqlInsert != "")
{
$sqlInsert = substr($sqlInsert,0,strlen($sqlInsert)-1);
$sqlValues = substr($sqlValues,0,strlen($sqlValues)-1);
}
$sSql = "INSERT INTO $tableName ($sqlInsert) VALUES ($sqlValues)";
$inser_general_result=mysql_query($sSql) or die(mysql_error());
$lastID=mysql_insert_id();
$_false="0";
$_true="1";
if(mysql_affected_rows()=='0')
{
return $_false;
}
else
{
return $lastID;
}
}
// End Of Function
While many PHP newbies (myself included) begin working with databases from good ole' mysql_connect/query/etc., I can't help but suggest that you look into PDO, PHP Data Objects. Depending on your prior knowledge and programming background, there may be a steeper learning curve. However, it's much more powerful, extensible, etc.; I use PDO for all my production code database wheelings-and-dealings now.
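For example, the insert from the question could look roughly like this with PDO (a sketch; the DSN and credentials are placeholders, and the ProfileInterests column layout is taken from the question):
$pdo = new PDO('mysql:host=localhost;dbname=YOUR_DB', 'YOUR_USER', 'YOUR_PASS');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$stmt = $pdo->prepare("INSERT INTO ProfileInterests VALUES (?, ?, NULL)");
foreach ($likes as $like) {
    $stmt->execute(array($id, $like)); // the prepared statement is reused for each like
}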
I have a CSV file that has 3.5 million codes in it.
I should point out that this is only EVER going to be run once.
The csv looks like
age9tlg,
rigfh34,
...
Here is my code:
ini_set('max_execution_time', 600);
ini_set("memory_limit", "512M");
$file_handle = fopen("Weekly.csv", "r");
while (!feof($file_handle)) {
$line_of_text = fgetcsv($file_handle);
if (is_array($line_of_text))
foreach ($line_of_text as $col) {
if (!empty($col)) {
mysql_query("insert into `action_6_weekly` Values('$col', '')") or die(mysql_error());
}
} else {
if (!empty($line_of_text)) {
mysql_query("insert into `action_6_weekly` Values('$line_of_text', '')") or die(mysql_error());
}
}
}
fclose($file_handle);
Is this code going to die part way through on me?
Will my memory and max execution time be high enough?
NB:
This code will be run on my localhost, and the database is on the same PC, so latency is not an issue.
Update:
Here is another possible implementation.
This one does it in bulk inserts of 2000 records:
$file_handle = fopen("Weekly.csv", "r");
$i = 0;
$vals = array();
while (!feof($file_handle)) {
$line_of_text = fgetcsv($file_handle);
if (is_array($line_of_text))
foreach ($line_of_text as $col) {
if (!empty($col)) {
if ($i < 2000) {
$vals[] = "('$col', '')";
$i++;
} else {
$vals = implode(', ', $vals);
mysql_query("insert into `action_6_weekly` Values $vals") or die(mysql_error());
$vals = array();
$i = 0;
}
}
} else {
if (!empty($line_of_text)) {
if ($i < 2000) {
$vals[] = "('$line_of_text', '')";
$i++;
} else {
$vals = implode(', ', $vals);
mysql_query("insert into `action_6_weekly` Values $vals") or die(mysql_error());
$vals = array();
$i = 0;
}
}
}
}
fclose($file_handle);
If I were to use this method, what is the highest value I could set it to insert at once?
Update 2
So, I've found I can use:
LOAD DATA LOCAL INFILE 'C:\\xampp\\htdocs\\weekly.csv' INTO TABLE `action_6_weekly` FIELDS TERMINATED BY ';' ENCLOSED BY '"' ESCAPED BY '\\' LINES TERMINATED BY ','(`code`)
But the issue now is that I was wrong about the CSV format; it is actually 4 codes and then a line break, so:
fhroflg,qporlfg,vcalpfx,rplfigc,
vapworf,flofigx,apqoeei,clxosrc,
...
So I need to be able to specify two LINES TERMINATED BY values.
This question has been branched out to Here.
Update 3
Setting it to do bulk inserts of 20k rows, using
while (!feof($file_handle)) {
$val[] = fgetcsv($file_handle);
$i++;
if($i == 20000) {
//do insert
//set $i = 0;
//$val = array();
}
}
//do insert (for the last few rows that don't reach 20k)
But it dies at this point because, for some reason, $val contains 75k rows. Any idea why?
Note the above code is simplified.
I doubt this will be the popular answer, but I would have your PHP application run mysqlimport on the CSV file. Surely it is optimized far beyond what you will do in PHP.
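A rough sketch of shelling out to it (credentials and paths are placeholders; note that mysqlimport loads the file into a table named after the file, so the CSV would need to be named action_6_weekly.csv here):
// Hypothetical sketch: user, password, database and path are assumptions
$cmd = 'mysqlimport --local --fields-terminated-by=, '
     . '--user=YOUR_USER --password=YOUR_PASS your_database '
     . escapeshellarg('C:\\xampp\\htdocs\\action_6_weekly.csv');
exec($cmd . ' 2>&1', $output, $status);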
Is this code going to die part way through on me? Will my memory and max execution time be high enough?
Why don't you try and find out?
You can adjust both the memory (memory_limit) and execution time (max_execution_time) limits, so if you really have to use that, it shouldn't be a problem.
Note that MySQL supports delayed and multiple row insertion:
INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);
http://dev.mysql.com/doc/refman/5.1/en/insert.html
Make sure there are no indexes on your table, as indexes will slow down inserts (add the indexes after you've done all the inserts).
Rather than creating a new SQL statement in each iteration of the loop, try preparing the SQL statement outside the loop and executing that prepared statement with parameters inside the loop. Depending on the database, this can be heaps faster.
I've done the above when importing a large Access database into Postgres using perl and got the insert time down to 30 seconds. I would have used an importer tool, but I wanted perl to enforce some rules when inserting.
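In PHP, a sketch of that pattern for this import could look like this (assuming a PDO connection in $pdo; the table and file names come from the question):
// Prepare once, outside the loop
$stmt = $pdo->prepare("INSERT INTO `action_6_weekly` VALUES (?, '')");
$file_handle = fopen("Weekly.csv", "r");
while (($line_of_text = fgetcsv($file_handle)) !== false) {
    if (!is_array($line_of_text)) {
        continue;
    }
    foreach ($line_of_text as $col) {
        if ($col !== '' && $col !== null) {
            $stmt->execute(array($col)); // execute the prepared statement per code
        }
    }
}
fclose($file_handle);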
You should accumulate the values and insert them into the database all at once at the end, or in batches every x records. Doing a single query for each row means 3.5 million SQL queries, each carrying quite some overhead.
Also, you should run this on the command line, where you won't need to worry about execution time limits.
The real answer, though, is evilclown's answer; importing CSV into MySQL is already a solved problem.
I hope there is not a web client waiting for a response on this. Other than calling the import utility already referenced, I would start this as a job and return feedback to the client almost immediately. Have the insert loop update a percentage-complete somewhere so the end user can check the status, if you absolutely must do it this way.
2 possible ways.
1) Batch the process, then have a scheduled job import the file, while updating a status. This way, you can have a page that keeps checking the status and refreshes itself if the status is not yet 100%. Users will have a live update of how much has been done. But for this you need access to the OS to be able to set up the scheduled task. And the task will be running idle when there is nothing to import.
2) Have the page handle 1000 rows (or any N number of rows... you decide), then send a JavaScript snippet to the browser to make it refresh itself with a new parameter telling the script to handle the next 1000 rows. You can also display a status to the user while this is happening. The only problem is that if the page somehow does not refresh, then the import stops.