function cpanel_populate_database($dbname)
{
    // populate database ($mysqli is a connected mysqli instance, created elsewhere)
    $sql = file_get_contents(dirname(__FILE__) . '/PHP-Point-Of-Sale/database/database.sql');
    $mysqli->multi_query($sql);
    $mysqli->close();
}
The SQL file is a direct export from phpMyAdmin, and about 95% of the time it runs without issue: all the tables are created and the data is inserted. (I am creating the database from scratch.)
The other 5% of the time, only the first table, or sometimes the first 4 tables, are created, but none of the other tables (there are 30 tables in total).
I have decided NOT to use multi_query because it seems buggy, and to see whether the bug still occurs when I run each statement (split at the semicolons) with a plain mysql_query instead. Has anyone run into issues like this?
Fast and effective
system('mysql -h #hostname# -u #username# -p #database# < #dump_file#');
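If the credentials come from variables rather than being hard-coded, it is safer to build that command with escapeshellarg. A minimal sketch only; the variable values are placeholders, and the dump path is the one from the question:
// Placeholder credentials; substitute your own.
$host = 'localhost';
$user = 'dbuser';
$pass = 'secret';
$db   = 'pointofsale';
$dump = dirname(__FILE__) . '/PHP-Point-Of-Sale/database/database.sql';

// Build the command with every argument shell-escaped.
$cmd = sprintf(
    'mysql --host=%s --user=%s --password=%s %s < %s',
    escapeshellarg($host),
    escapeshellarg($user),
    escapeshellarg($pass),
    escapeshellarg($db),
    escapeshellarg($dump)
);

system($cmd, $exitCode);
if ($exitCode !== 0) {
    echo "Import failed with exit code $exitCode\n";
}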
I've seen similar issues when using multi_query with queries that can create or alter tables. In particular, I tend to get InnoDB 1005 errors that seem to be related to foreign keys; it's like MySQL doesn't completely finish one statement before moving on to the next, so the foreign keys lack a proper referent.
In one system, I split the problematic statements into their own files. In another, I have indeed run each command separately, splitting on semicolons:
function load_sql_file($basename, $db) {
    // Todo: Trim comments from the end of a line
    log_upgrade("Attempting to run the `$basename` upgrade.");
    $filename = dirname(__FILE__)."/sql/$basename.sql";
    if (!file_exists($filename)) {
        log_upgrade("Upgrade file `$filename` does not exist.");
        return false;
    }
    $file_content = file($filename);
    $query = '';
    foreach ($file_content as $sql_line) {
        $tsl = trim($sql_line);
        if ($sql_line and (substr($tsl, 0, 2) != '--') and (substr($tsl, 0, 1) != '#')) {
            $query .= $sql_line;
            if (substr($tsl, -1) == ';') {
                set_time_limit(300);
                $sql = trim($query, "\0.. ;");
                $result = $db->execute($sql);
                if (!$result) {
                    log_upgrade("Failure in `$basename` upgrade:\n$sql");
                    if ($error = $db->lastError()) {
                        log_upgrade("$error");
                    }
                    return false;
                }
                $query = '';
            }
        }
    }
    $remainder = trim($query);
    if ($remainder) {
        log_upgrade("Trailing text in `$basename` upgrade:\n$remainder");
        if (DEBUG) trigger_error('Trailing text in upgrade script: '.$remainder, E_USER_WARNING);
        return false;
    }
    log_upgrade("`$basename` upgrade successful.");
    return true;
}
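If the intermittent failures really are foreign-key related, another technique worth trying (not from the original answer) is to disable foreign key checks around the import and to drain every result set that multi_query produces, so errors actually surface. A minimal sketch, assuming a connected $mysqli and the dump file from the question:
// Relax foreign key checks so statement order doesn't matter during the import.
$mysqli->query('SET FOREIGN_KEY_CHECKS = 0');

$sql = file_get_contents(dirname(__FILE__) . '/PHP-Point-Of-Sale/database/database.sql');

if ($mysqli->multi_query($sql)) {
    // multi_query() only runs the first statement up front; drain every
    // result set so the later statements execute and errors are reported.
    do {
        if ($result = $mysqli->store_result()) {
            $result->free();
        }
    } while ($mysqli->more_results() && $mysqli->next_result());
}

if ($mysqli->errno) {
    echo 'Import error: ' . $mysqli->error;
}

$mysqli->query('SET FOREIGN_KEY_CHECKS = 1');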
I have never resorted to multi_query. When I needed something like that, I moved over to mysqli. Also, if you do not need any results from the query, passing the script to mysql_query will also work. You'll also get those errors if the export is in an incorrect order, so that statements clash with the tables required by foreign keys and other constraints.
I think the approach of breaking the SQL file into single queries would be a good idea, even if it's just for comparison purposes (to see if it solves the issue).
Also, I'm not sure how big your file is, but I've had a couple of cases where the file was incredibly big and splitting it into batches did the job.
I need to compare a table column coming from two different types of databases (MySQL and SQL Server) in PHP.
For background, the column values are not stored in the same format, so I am converting them first for a proper comparison. For example, the value BN014 in MySQL is equivalent to BN00000014 in SQL Server (just an extra 4 to 5 zeros after the first two characters). I am using the following function for the required conversion, and this part seems to be working fine:
function convertWR($wr)
{
    $mysqli = $this->con;
    $wr_number = $wr;
    $prod_zeros = '';
    $p_cat = substr($wr_number, 0, 2);
    $p_num = substr($wr_number, 2, strlen($wr_number) - 2);
    $p_num_len = strlen($p_num);
    $ctrl = 8;
    while ($ctrl > $p_num_len) {
        $prod_zeros .= "0";
        $ctrl--;
    }
    $converted_id = $p_cat.$prod_zeros.$p_num;
    return $converted_id;
}
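As an aside, the padding loop can be expressed more compactly with str_pad. A minimal equivalent sketch, assuming the prefix is always the first two characters and the numeric part is padded to 8 digits, as in the example above:
function convertWR($wr)
{
    // e.g. "BN014" -> "BN" . "00000014" = "BN00000014"
    $prefix = substr($wr, 0, 2);
    $number = substr($wr, 2);
    return $prefix . str_pad($number, 8, '0', STR_PAD_LEFT);
}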
For the comparison part, I am storing the results from both databases in separate arrays named $wr_from_mssql and $wr_from_mysql respectively, then comparing them with array_diff():
function compareData()
{
    // SQL Server data
    $mssql = $this->mssql;
    if ($mssql)
    {
        //echo "connected";
        if (($result = sqlsrv_query($mssql, "SELECT [ItemId] FROM [dbo].[Item]")) !== false)
        {
            $count_mssql = 1;
            $wr_from_mssql = array();
            while ($obj = sqlsrv_fetch_array($result))
            {
                $clean_itemid = str_replace(' ', '', $obj['ItemId']);
                $wr_from_mssql[] = $clean_itemid;
                $count_mssql++;
            }
        }
    } else {
        die(print_r(sqlsrv_errors(), true));
    }
    //-----------------------------------------------------------------------//
    // MySQL data
    $mysqli = $this->con;
    $count_mysql = 1;
    $wr_from_mysql = array();
    $sql = "SELECT WR FROM products";
    $result = mysqli_query($mysqli, $sql);
    while ($row = mysqli_fetch_array($result)) {
        // for the required conversion, e.g. BN014 to BN00000014
        $wr = $this->convertWR($row['WR']);
        $clean_wr = str_replace(' ', '', $wr);
        $wr_from_mysql[] = $clean_wr;
        $count_mysql++;
    }
    //----------------------------------------------------------------------//
    // To get the difference
    $result = array_diff($wr_from_mssql, $wr_from_mysql);
    $count_diff = 1;
    foreach ($result as $diff)
    {
        echo $count_diff.") ".$diff."</br>";
        $count_diff++;
    }
    mysqli_close($mysqli);
    $result2 = array_diff($result, $wr_from_mysql);
    mysqli_close($mysqli);
}
However, I am not getting the expected results: the output still contains 3000+ values that exist in both tables. I am expecting around 7,000 results but getting over 10,000.
There are no whitespace characters or regular spaces (though I still tried removing them) that could affect this comparison, as we clean the strings before inserting them into our tables.
Any idea what could be going wrong, or any other possible method for this type of comparison? I need it for a report, so I cannot use the free tools available.
Probably not the best approach, as suggested by @GordonLinoff, but I was able to get the desired results :) by creating a temporary table with the converted values and then using the above code, of course with some modifications to the MySQL query so that it compares the column values between the new temporary table and the SQL Server table.
I was unable to use a linked server due to a network issue, which I will resolve in the future. I have some deadlines to meet.
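For reference, the temporary-table step can be done entirely in MySQL with LPAD, so the conversion does not have to go through PHP at all. A rough sketch, assuming the products table and WR column from the question (the products_converted name is only illustrative), run via mysqli_query like the rest of the code:
// Build a temporary table holding the WR values already converted to the
// SQL Server format (e.g. BN014 -> BN00000014).
$sql = "CREATE TEMPORARY TABLE products_converted AS
        SELECT CONCAT(LEFT(WR, 2), LPAD(SUBSTRING(WR, 3), 8, '0')) AS WR_converted
        FROM products";
mysqli_query($mysqli, $sql) or die(mysqli_error($mysqli));

// The comparison code can now simply read the pre-converted values.
$result = mysqli_query($mysqli, "SELECT WR_converted FROM products_converted");
$wr_from_mysql = array();
while ($row = mysqli_fetch_array($result)) {
    $wr_from_mysql[] = $row['WR_converted'];
}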
Reason: I was assigned to run a script that advances a website. It's a fantasy football site, and there are several instances of the site located on different domains. Some have more than 80k users, and each user is supposed to have a team that consists of 15 players. Hence some tables have (number of users) x (number of players) rows.
However, sometimes the script fails and the result gets corrupted, therefore I must back up the 10 tables in question before I execute the script. Nevertheless, I still need to back up the tables to keep a historical record of user actions, because the football season may last for 50+ game weeks.
Task: To duplicate db tables using a PHP script. When I started, I used to back up the tables with SQLyog. It works, but it's time consuming, since I have to wait for each table to be duplicated. Besides, the SQLyog application crashes while duplicating large tables, which can be very annoying.
Current solution: I have created a simple application with an interface that does the job, and it works great. It consists of three files: one for the db connection, a 2nd for db manipulation, and a 3rd for the user interface and to use the 2nd file's code.
The thing is, sometimes it gets stuck in the middle of the table-duplication process.
Objective: To create an application to be used by an admin to facilitate backing up the database using MySQL + PHP.
My Question: How do I ensure that the duplicating script will back up the tables completely, without hanging the server or interrupting the script?
Below I include my code for the duplication function, but basically these are the two crucial lines where I think the problem lies:
//duplicate tables structure
$query = "CREATE TABLE $this->dbName.`$newTableName` LIKE $this->dbName.`$oldTable`";
//duplicate tables data
$query = "INSERT INTO $this->dbName.`$newTableName` SELECT * FROM $this->dbName.`$oldTable`";
The rest of the code is solely for validation in case errors occur. If you wish to take a look at the whole code, be my guest. Here's the function:
private function duplicateTable($oldTable, $newTableName) {
    if ($this->isExistingTable($oldTable))
    {
        $this->printLogger("Original table is valid -table exists- : $oldTable ");
    }
    else
    {
        $this->printrR("Original table is invalid -table does not exist- : $oldTable ");
        return false;
    }
    if (!$this->isExistingTable($newTableName)) // make sure the new table does not already exist
    {
        $this->printLogger("Destination table name is valid -no table with this name- : $newTableName");
        $query = "CREATE TABLE $this->dbName.`$newTableName` LIKE $this->dbName.`$oldTable`";
        $result = mysql_query($query) or $this->printrR("Error in query. Query:\n $query\n Error: " . mysql_error());
    }
    else
    {
        $this->printrR("Destination table is invalid. -table already exists- $newTableName");
        $this->printr("Now checking if tables actually match: $oldTable => $newTableName \n");
        $varifyStatus = $this->varifyDuplicatedTables($oldTable, $newTableName);
        if ($varifyStatus >= 0)
        {
            $this->printrG("Tables match, it seems they were duplicated before $oldTable => $newTableName");
        }
        else
        {
            $this->printrR("The duplicate table exists, yet it doesn't match the original! $oldTable => $newTableName");
        }
        return false;
    }
    if ($result)
    {
        $this->printLogger("Query executed 1/2");
    }
    else
    {
        $this->printrR("Something went wrong in duplicateTable\nQuery: $query\n\n\nMySql_Error: " . mysql_error());
        return false;
    }
    if (!$this->isExistingTable($newTableName)) // validate that the table has been created
    {
        $this->printrR("Attempt to duplicate table structure failed: $newTableName table was not found after creating!");
        return false;
    }
    else
    {
        $this->printLogger("Table created successfully: $newTableName");
        // Now checking table structure
        $this->printLogger("Now comparing indexes ... ");
        $autoInc = $this->checkAutoInc($oldTable, $newTableName);
        if ($autoInc == 1)
        {
            $this->printLogger("Auto inc seems ok");
        }
        elseif ($autoInc == 0)
        {
            $this->printLogger("No inc key for either table. Continue anyway");
        }
        elseif ($autoInc == -1)
        {
            $this->printLogger("Inc keys do not match!");
        }
        $time = $oldTable == 'team_details' ? 5 : 2;
        $msg = $oldTable == 'team_details' ? "This may take a while for team_details. Please wait." : "Please wait.";
        $this->printLogger("Sleep for $time ...\n");
        sleep($time);
        $this->printLogger("Preparing to copy data ...\n");
        $query = "INSERT INTO $this->dbName.`$newTableName` SELECT * FROM $this->dbName.`$oldTable`";
        $this->printLogger("Processing data copy query. $msg...\n\n\n");
        $result = mysql_query($query) or $this->printrR("Error in query. Query:\n $query\n Error: " . mysql_error());
        // ERROR usually happens here with large tables
        sleep($time); // to let the db process the current request
        $this->printLogger("Query executed 2/2");
        sleep($time); // to let the db process the current request
        if ($result)
        {
            $this->printLogger("Table created ($newTableName) and data has been copied!");
            $this->printLogger("Confirming number of rows ... ");
            /////////////////////////////////
            // start checking count
            $numRows = $this->checkCountRows($oldTable, $newTableName);
            if ($numRows)
            {
                $this->printLogger("Table duplicated successfully ");
                return true;
            }
            else
            {
                $this->printLogger("Table duplicated, but please check num rows of $newTableName");
                return -3;
            }
            // end of checking count
            /////////////////////////////////
        } // end of if ($result) for query 2/2
        else
        {
            $this->printrR("Something went wrong in duplicateTable\nINSERT INTO $oldTable -> $newTableName\n\n$query\n mysql_error() \n " . mysql_error());
            return false;
        }
    }
}
As you noticed, the function only duplicates one table; that's why there is another function that takes an array of table names from the user and passes them one by one to duplicateTable().
If any other function should be included for this question, please let me know.
One solution pops into my mind: would duplicating tables part by part add any improvement? I'm not sure how INSERT INTO ... SELECT works internally, but maybe if I could insert, let's say, 25% of the rows at a time, it might help?
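For what it's worth, chunked copying is possible when the table has an auto-increment primary key: you copy key ranges rather than the whole table in one INSERT ... SELECT. A rough sketch only, assuming a numeric `id` primary key (adjust to your schema) and the same old mysql_* API used above; the function name is just illustrative:
// Copy $oldTable into $newTable in chunks of $chunkSize rows, keyed on `id`.
function copyTableInChunks($dbName, $oldTable, $newTable, $chunkSize = 50000) {
    mysql_query("CREATE TABLE $dbName.`$newTable` LIKE $dbName.`$oldTable`")
        or die(mysql_error());

    $row = mysql_fetch_row(mysql_query("SELECT MAX(id) FROM $dbName.`$oldTable`"));
    $maxId = (int) $row[0];

    for ($start = 0; $start < $maxId; $start += $chunkSize) {
        $end = $start + $chunkSize;
        set_time_limit(60); // reset the PHP timeout for every chunk
        mysql_query("INSERT INTO $dbName.`$newTable`
                     SELECT * FROM $dbName.`$oldTable`
                     WHERE id > $start AND id <= $end")
            or die(mysql_error());
    }
}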
However, sometimes the script fails and the result gets corrupted, therefore I must back up the 10 tables in question before I execute the script.
Probably you need another solution here: transactions. You need to wrap all the queries in the failing script in a transaction. If the transaction fails, all the data will be the same as at the beginning of the operation. If the queries execute correctly, you are OK.
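A minimal sketch of that idea, assuming the tables involved use the InnoDB engine (MyISAM does not support transactions) and a mysqli connection in $mysqli; the statement shown is a placeholder for whatever the failing script actually runs:
$mysqli->query('START TRANSACTION');

// Placeholder statement standing in for the script's real queries.
$ok = $mysqli->query("UPDATE team_details SET points = points + 1 WHERE week = 12");

if ($ok) {
    $mysqli->query('COMMIT');   // everything succeeded, make the changes permanent
} else {
    $mysqli->query('ROLLBACK'); // something failed, restore the tables to their previous state
    echo 'Script failed, changes rolled back: ' . $mysqli->error;
}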
Why are you duplicating the table every time?
CLUSTERS are a good option: they can keep duplicate copies of your table in a distributed manner and are much more reliable and secure.
When I run my script I receive the following error before processing all rows of data.
maximum execution time of 30 seconds exceeded
After researching the problem, it appears I should be able to extend max_execution_time, which should resolve the problem.
But being in my PHP programming infancy, I would like to know if there is a more optimal way of writing my script below, so I do not have to rely on "get out of jail" cards.
The script is:
1. Taking a CSV file
2. Cherry-picking some columns
3. Trying to insert 10k rows of CSV data into a MySQL table
In my head I think I should be able to insert in chunks, but that is so far beyond my skill set that I do not even know how to write one line :\
Many thanks in advance
<?php
function processCSV()
{
    global $uploadFile;
    include 'dbConnection.inc.php';
    dbConnection("xx","xx","xx");
    $rowCounter = 0;
    $loadLocationCsvUrl = fopen($uploadFile, "r");
    if ($loadLocationCsvUrl <> false)
    {
        while ($locationFile = fgetcsv($loadLocationCsvUrl, ','))
        {
            $officeId = $locationFile[2];
            $country = $locationFile[9];
            $country = trim($country);
            $country = htmlspecialchars($country);
            $open = $locationFile[4];
            $open = trim($open);
            $open = htmlspecialchars($open);
            $insString = "insert into countrytable set officeId='$officeId', countryname='$country', status='$open'";
            switch ($country)
            {
                case $country <> 'Country':
                    if (!mysql_query($insString))
                    {
                        echo "<p>error " . mysql_error() . "</p>";
                    }
                    break;
            }
            $rowCounter++;
        }
        echo "$rowCounter inserted.";
    }
    fclose($loadLocationCsvUrl);
}
processCSV();
?>
First, in 2011 you do not use mysql_query. You use mysqli or PDO and prepared statements. Then you do not need to figure out how to escape strings for SQL. You used htmlspecialchars, which is totally wrong for this purpose. Next, you could use a transaction to speed up many inserts. MySQL also supports multi-row inserts.
But the best bet would be to use the CSV storage engine; read here: http://dev.mysql.com/doc/refman/5.0/en/csv-storage-engine.html. You can instantly load everything into SQL and then manipulate it there as you wish. The article also shows the LOAD DATA INFILE command.
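For the prepared-statement route mentioned above, here is a minimal sketch using PDO with a transaction; the DSN and credentials are placeholders, while the table and column indexes are taken from the question:
// Placeholder connection details.
$pdo = new PDO('mysql:host=localhost;dbname=yourdb;charset=utf8', 'user', 'password');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$stmt = $pdo->prepare(
    'INSERT INTO countrytable (officeId, countryname, status) VALUES (?, ?, ?)'
);

$fh = fopen($uploadFile, 'r');
$pdo->beginTransaction();
try {
    while (($row = fgetcsv($fh)) !== false) {
        $country = trim($row[9]);
        if ($country === 'Country') {
            continue; // skip the header row
        }
        // Bound parameters: no manual escaping (and no htmlspecialchars) needed.
        $stmt->execute(array($row[2], $country, trim($row[4])));
    }
    $pdo->commit();   // one commit for all rows is much faster than autocommit per row
} catch (PDOException $e) {
    $pdo->rollBack();
    echo 'Import failed: ' . $e->getMessage();
}
fclose($fh);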
Well, you could create a single query like this.
$query = "INSERT INTO countrytable (officeId, countryname, status) VALUES ";
$entries = array();
while ($locationFile = fgetcsv($loadLocationCsvUrl, ',')) {
// your code
$entries[] = "('$officeId', '$country', '$open')";
}
$query .= implode(', ', $enties);
mysql_query($query);
But this depends on how long your query will be and what the server's limit is set to.
But as you can read in other posts, there are better ways for your requirements. I thought I should share the approach you were already thinking about, though.
You can try calling the following function before inserting. This will set the time limit to unlimited instead of the 30 sec default time.
set_time_limit( 0 );
I have a CSV file that has 3.5 million codes in it.
I should point out that this is only EVER going to be done once.
The csv looks like
age9tlg,
rigfh34,
...
Here is my code:
ini_set('max_execution_time', 600);
ini_set("memory_limit", "512M");
$file_handle = fopen("Weekly.csv", "r");
while (!feof($file_handle)) {
    $line_of_text = fgetcsv($file_handle);
    if (is_array($line_of_text)) {
        foreach ($line_of_text as $col) {
            if (!empty($col)) {
                mysql_query("insert into `action_6_weekly` Values('$col', '')") or die(mysql_error());
            }
        }
    } else {
        if (!empty($line_of_text)) {
            mysql_query("insert into `action_6_weekly` Values('$line_of_text', '')") or die(mysql_error());
        }
    }
}
fclose($file_handle);
Is this code going to die part way through on me?
Will my memory and max execution time be high enough?
NB:
This code will be run on my localhost, and the database is on the same PC, so latency is not an issue.
Update:
Here is another possible implementation.
This one does bulk inserts of 2,000 records at a time:
$file_handle = fopen("Weekly.csv", "r");
$i = 0;
$vals = array();
while (!feof($file_handle)) {
$line_of_text = fgetcsv($file_handle);
if (is_array($line_of_text))
foreach ($line_of_text as $col) {
if (!empty($col)) {
if ($i < 2000) {
$vals[] = "('$col', '')";
$i++;
} else {
$vals = implode(', ', $vals);
mysql_query("insert into `action_6_weekly` Values $vals") or die(mysql_error());
$vals = array();
$i = 0;
}
}
} else {
if (!empty($line_of_text)) {
if ($i < 2000) {
$vals[] = "('$line_of_text', '')";
$i++;
} else {
$vals = implode(', ', $vals);
mysql_query("insert into `action_6_weekly` Values $vals") or die(mysql_error());
$vals = array();
$i = 0;
}
}
}
}
fclose($file_handle);
If I were to use this method, what is the highest number of rows I could insert at once?
Update 2
So, I've found I can use
LOAD DATA LOCAL INFILE 'C:\\xampp\\htdocs\\weekly.csv' INTO TABLE `action_6_weekly` FIELDS TERMINATED BY ';' ENCLOSED BY '"' ESCAPED BY '\\' LINES TERMINATED BY ','(`code`)
but the issue now is that I was wrong about the CSV format;
it is actually 4 codes and then a line break,
so:
fhroflg,qporlfg,vcalpfx,rplfigc,
vapworf,flofigx,apqoeei,clxosrc,
...
so I need to be able to specify two LINES TERMINATED BY values.
This question has been branched out to Here.
Update 3
Setting it to do bulk inserts of 20k rows, using
while (!feof($file_handle)) {
    $val[] = fgetcsv($file_handle);
    $i++;
    if ($i == 20000) {
        //do insert
        //set $i = 0;
        //$val = array();
    }
}
//do insert (for the last few rows that don't reach 20k)
but it dies at this point because, for some reason, $val contains 75k rows. Any idea why?
note the above code is simplified.
I doubt this will be the popular answer, but I would have your PHP application run mysqlimport on the CSV file. Surely it is optimized far beyond what you will do in PHP.
Is this code going to die part way through on me? Will my memory and max execution time be high enough?
Why don't you try and find out?
You can adjust both the memory (memory_limit) and execution time (max_execution_time) limits, so if you really have to use that, it shouldn't be a problem.
Note that MySQL supports delayed and multiple row insertion:
INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);
http://dev.mysql.com/doc/refman/5.1/en/insert.html
Make sure there are no indexes on your table, as indexes will slow down inserts (add the indexes after you've done all the inserts).
Rather than creating a new SQL statement on each iteration of the loop, prepare the SQL statement outside the loop and execute that prepared statement with parameters inside the loop. Depending on the database, this can be much faster.
I've done the above when importing a large Access database into Postgres using Perl and got the insert time down to 30 seconds. I would have used an importer tool, but I wanted Perl to enforce some rules while inserting.
You should accumulate the values and insert them into the database all at once at the end, or in batches every x records. Doing a single query for each row means 3.5 million SQL queries, each carrying quite some overhead.
Also, you should run this on the command line, where you won't need to worry about execution time limits.
The real answer, though, is evilclown's answer: importing to MySQL from CSV is already a solved problem.
I hope there is not a web client waiting for a response on this. Other than calling the import utility already referenced, I would start this as a job and return feedback to the client almost immediately. Have the insert loop update a percentage-complete somewhere so the end user can check the status, if you absolutely must do it this way.
Two possible ways:
1) Batch the process, then have a scheduled job import the file while updating a status. This way, you can have a page that keeps checking the status and refreshes itself if the status is not yet 100%. Users will have a live update of how much has been done. But for this you need access to the OS to be able to set up the scheduled task, and the task will sit idle when there is nothing to import.
2) Have the page handle 1,000 rows (or any N number of rows; you decide), then send JavaScript to the browser to refresh the page with a new parameter telling the script to handle the next 1,000 rows. You can also display a status to the user while this is happening. The only problem is that if the page somehow does not refresh, the import stops.
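A rough sketch of option 2; all names here (import_chunk.php, the offset parameter, the chunk size) are illustrative rather than taken from the question:
<?php
// import_chunk.php?offset=N  -- processes $chunkSize rows per request,
// then tells the browser to reload itself for the next chunk.
$chunkSize = 1000;
$offset = isset($_GET['offset']) ? (int) $_GET['offset'] : 0;

$rows = file($uploadFile); // $uploadFile as in the question's script
$total = count($rows);

$processed = 0;
for ($i = $offset; $i < min($offset + $chunkSize, $total); $i++) {
    $cols = str_getcsv($rows[$i]);
    // ... insert $cols into the database as in the original script ...
    $processed++;
}

$next = $offset + $processed;
echo "<p>Imported $next of $total rows.</p>";

if ($next < $total) {
    // Ask the browser to come back for the next chunk.
    echo '<script>window.location = "import_chunk.php?offset=' . $next . '";</script>';
} else {
    echo '<p>Import complete.</p>';
}
?>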
I'm writing a semi-simple database wrapper class and want to have a fetching method which would operate automagically: it should prepare each different statement only the first time around and just bind and execute the query on successive calls.
I guess the main question is: how does re-preparing the same MySQL statement work? Will PDO magically recognize the statement (so I don't have to) and skip the re-preparation?
If not, I'm planning to achieve this by generating a unique key for each different query and keeping the prepared statements in a private array in the database object, under their unique keys. I'm planning to obtain the array key in one of the following ways (none of which I like). In order of preference:
have the programmer pass an extra, always-the-same parameter when calling the method, something along the lines of basename(__FILE__, ".php") . __LINE__ (this would work only if our method is called within a loop, which is the case most of the time this functionality is needed)
have the programmer pass a totally random string (most likely generated beforehand) as an extra parameter
use the passed query itself to generate the key - getting the hash of the query or something similar
achieve the same as the first bullet (above) by calling debug_backtrace
Does anyone have similar experience? Although the system I'm working on does deserve some attention to optimization (it's quite large and growing by the week), perhaps I'm worrying about nothing and there is no performance benefit in doing what I'm doing?
MySQL (like most DBMS) will cache execution plans for prepared statements, so if user A creates a plan for:
SELECT * FROM some_table WHERE a_col=:v1 AND b_col=:v2
(where v1 and v2 are bind vars) and then sends values to be interpolated by the DBMS, and user B then sends the same query (but with different values for interpolation), the DBMS does not have to regenerate the plan. That is, it's the DBMS which finds the matching plan, not PDO.
However, this means that each operation on the database requires at least 2 round trips (the first to present the query, the second to present the bind vars), as opposed to a single round trip for a query with literal values, so this introduces additional network costs. There is also a small cost involved in dereferencing (and maintaining) the query/plan cache.
The key question is whether this cost is greater than the cost of generating the plan in the first place.
While (in my experience) there definitely seems to be a performance benefit to using prepared statements with Oracle, I'm not convinced that the same is true for MySQL; however, a lot will depend on the structure of your database and the complexity of the query (or, more specifically, how many different options the optimizer can find for resolving the query).
Try measuring it yourself (hint: you might want to set the slow query threshold to 0 and write some code to convert literal values back into anonymous representations for the queries written to the logs).
Believe me, I've done this before, and after building a cache of prepared statements the performance gain was very noticeable; see this question: Preparing SQL Statements with PDO.
And this is the code I came up with afterwards, with cached prepared statements:
function DB($query)
{
    static $db = null;
    static $result = array();

    if (is_null($db) === true)
    {
        $db = new PDO('sqlite:' . $query, null, null, array(PDO::ATTR_ERRMODE => PDO::ERRMODE_WARNING));
    }
    else if (is_a($db, 'PDO') === true)
    {
        $hash = md5($query);
        if (empty($result[$hash]) === true)
        {
            $result[$hash] = $db->prepare($query);
        }
        if (is_a($result[$hash], 'PDOStatement') === true)
        {
            if ($result[$hash]->execute(array_slice(func_get_args(), 1)) === true)
            {
                if (stripos($query, 'INSERT') === 0)
                {
                    return $db->lastInsertId();
                }
                else if (stripos($query, 'SELECT') === 0)
                {
                    return $result[$hash]->fetchAll(PDO::FETCH_ASSOC);
                }
                else if ((stripos($query, 'UPDATE') === 0) || (stripos($query, 'DELETE') === 0))
                {
                    return $result[$hash]->rowCount();
                }
                else if (stripos($query, 'REPLACE') === 0)
                {
                }
                return true;
            }
        }
        return false;
    }
}
Since I don't need to worry about collisions in queries, I've ended up using md5() instead of sha1().
OK, since I've been bashing methods of keying the queries for the cache other than simply using the query string itself, I've done a naive benchmark. The following compares using the plain query string as the key vs. first creating an MD5 hash of it:
$ php -v
PHP 5.3.0-3 with Suhosin-Patch (cli) (built: Aug 26 2009 08:01:52)
...
$ php benchmark.php
PHP hashing: 0.19465494155884 [microtime]
MD5 hashing: 0.57781004905701 [microtime]
799994
The code:
<?php
error_reporting(E_ALL);

$queries = array("SELECT",
                 "INSERT",
                 "UPDATE",
                 "DELETE",
);
$query_length = 256;
$num_queries = 256;
$iter = 10000;

for ($i = 0; $i < $num_queries; $i++) {
    $q = implode('',
        array_map("chr",
            array_map("rand",
                array_fill(0, $query_length, ord("a")),
                array_fill(0, $query_length, ord("z")))));
    $queries[] = $q;
}
echo count($queries), "\n";

$cache = array();
$side_effect1 = 0;
$t = microtime(true);
for ($i = 0; $i < $iter; $i++) {
    foreach ($queries as $q) {
        if (!isset($cache[$q])) {
            $cache[$q] = $q;
        }
        else {
            $side_effect1++;
        }
    }
}
echo microtime(true) - $t, "\n";

$cache = array();
$side_effect2 = 0;
$t = microtime(true);
for ($i = 0; $i < $iter; $i++) {
    foreach ($queries as $q) {
        $md5 = md5($q);
        if (!isset($cache[$md5])) {
            $cache[$md5] = $q;
        }
        else {
            $side_effect2++;
        }
    }
}
echo microtime(true) - $t, "\n";
echo $side_effect1 + $side_effect2, "\n";
To my knowledge, PDO does not reuse already-prepared statements, as it does not analyse the query by itself, so it does not know if it is the same query.
If you want to create a cache of prepared queries, the simplest way, IMHO, would be to md5-hash the query string and generate a lookup table.
OTOH: How many queries are you executing (per minute)? If fewer than a few hundred, you only complicate the code; the performance gain will be minor.
Using an MD5 hash as a key, you could eventually get two queries that result in the same MD5 hash. The probability is not high, but it could happen. Don't do it. Lossy hashing algorithms like MD5 are just meant as a way to tell, with high certainty, whether two objects are different; they are not a safe means of identifying something.