I have a CSV file upload feature. It works when the CSV has up to roughly 30k rows, but as soon as the file has more than 30k rows the bulk insert stops working. Below is my code for reading the CSV and inserting into the table.
$csvfile = fopen($file, 'r');
$i = 0;
$data4 = "";
while (!feof($csvfile))
{
    $csv_data[] = fgets($csvfile, 1024);
    $csv_array = explode(";", $csv_data[$i]);
    $data4 .= "('".$csv_array[2]."', '".$csv_array[4]."', '".$csv_array[0]."','".$csv_array[1]."'),";
    $i++;
}
fclose($csvfile);
$data4 = substr($data4, 0, -1);
$sql = "INSERT INTO csv_table(`column1`,`column2`,`column3`,`column4`) VALUES $data4";
mysqli_query($mysqliConn, $sql);
The issue only occurs when there are more than 30k records. Please suggest what I should change here.
Thanks in advance
Pro tip: "Not working," of course, can mean anything from "my server caught fire and my data center burned to the ground," to "all my values were changed to 42," to "the operation had no effect." Understand your errors. Check the errors that come back from operations like mysqli_query().
That being said...
You're slurping up your entire CSV file's contents and jamming it into a single text string. It's likely that method falls over when the csv file is too long.
There's a limit on the length of a MySQL query. It's large, but not infinite, and it's governed by the max_allowed_packet setting on both the server and the client. Read this: https://dev.mysql.com/doc/refman/5.7/en/packet-too-large.html
PHP can run out of memory as well.
How to fix? Process your CSV file not all at once, but in chunks of fifty rows or so. Once you've read fifty rows, do an insert.
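Here's a minimal sketch of that chunked approach, reusing the $mysqliConn, $file and semicolon-delimited column layout from your question; fgetcsv() replaces the fgets()/explode() pair, and the chunk size of 50 is arbitrary:
$chunkSize = 50; // rows per INSERT; keeps each query well under max_allowed_packet
$values = array();
$csvfile = fopen($file, 'r') or die('Unable to open CSV file');

while (($row = fgetcsv($csvfile, 1024, ';')) !== false) {
    if (count($row) < 5) {
        continue; // skip blank or short lines
    }
    // Escape every value before it goes into the SQL string
    $c1 = mysqli_real_escape_string($mysqliConn, $row[2]);
    $c2 = mysqli_real_escape_string($mysqliConn, $row[4]);
    $c3 = mysqli_real_escape_string($mysqliConn, $row[0]);
    $c4 = mysqli_real_escape_string($mysqliConn, $row[1]);
    $values[] = "('$c1', '$c2', '$c3', '$c4')";

    if (count($values) >= $chunkSize) {
        $sql = "INSERT INTO csv_table(`column1`,`column2`,`column3`,`column4`) VALUES " . implode(',', $values);
        mysqli_query($mysqliConn, $sql) or die('Insert failed: ' . mysqli_error($mysqliConn));
        $values = array(); // start the next chunk
    }
}

// Flush whatever is left after the loop
if (count($values) > 0) {
    $sql = "INSERT INTO csv_table(`column1`,`column2`,`column3`,`column4`) VALUES " . implode(',', $values);
    mysqli_query($mysqliConn, $sql) or die('Insert failed: ' . mysqli_error($mysqliConn));
}
fclose($csvfile);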
Pro tip 2: Sanitize your input data. What will happen to your table if somebody puts a row like this in an uploaded CSV file?
"bwahahahaha!'; -- DROP TABLE csv_table;", "something", "somethingelse"
You may be OK. But do you want to run a risk like this?
Be aware that the public net is crawling with cybercriminals, and somebody will detect and exploit this kind of vulnerability in days if you leave it running.
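If you would rather not build SQL strings by hand at all, a prepared statement sidesteps the quoting problem entirely. A rough sketch (not the original poster's code), again assuming the $mysqliConn connection, $file and column order from the question:
$stmt = mysqli_prepare($mysqliConn,
    "INSERT INTO csv_table(`column1`,`column2`,`column3`,`column4`) VALUES (?, ?, ?, ?)");
$c1 = $c2 = $c3 = $c4 = null;
mysqli_stmt_bind_param($stmt, 'ssss', $c1, $c2, $c3, $c4); // bound by reference, reused for every row

$csvfile = fopen($file, 'r');
while (($row = fgetcsv($csvfile, 1024, ';')) !== false) {
    if (count($row) < 5) {
        continue;
    }
    // Same mapping as the original query: column1 = field 2, column2 = field 4, column3 = field 0, column4 = field 1
    list($c3, $c4, $c1, , $c2) = $row;
    mysqli_stmt_execute($stmt) or die(mysqli_stmt_error($stmt));
}
fclose($csvfile);
mysqli_stmt_close($stmt);
This executes one INSERT per row, so it is slower than the chunked query above, but the data never touches the SQL string, so a malicious field cannot break out of it.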
Related
I am trying to import a CSV file into my SQL database. This is what I have:
if ($_FILES['csvFile']['size'] > 0)
{
    $file = $_FILES['csvFile']['tmp_name'];
    $handle = fopen($file, "r");
    do {
        if ($data[0])
        {
            $insert_query = "REPLACE INTO `teacherNames` SET
                `schoolName` = '".addslashes($schoolname)."',
                `teacherName` = '".addslashes($data[0])."'
                ;";
            $result = mysql_query($insert_query);
            echo $insert_query; // SEE RESULTING QUERY BELOW
            echo $data[0]." added\n<br />";
        }
    }
    while ($data = fgetcsv($handle, 1000, ",", "'"));
The CSV file has 3 records and it looks correct. The procedure works to an extent but for some reason it is not reading the CSV file correctly and the resulting query is like this:
REPLACE INTO `teacherNames` SET `schoolName` = 'Brooks', `teacherName` = 'RMG JMC PMC';
When I would expect to get 3 separate queries - one for each record. It does not seem to be reading the CSV file as 3 separate records but as 1. Can anyone see why?
UPDATE:
The CSV contents are:
RMG
JMC
PMC
The answer from Julio Martins is better if the file is on the same computer as the MySQL server.
But if you need to read the file from inside PHP, there is a note on PHP.net at http://php.net/manual/en/function.fgetcsv.php :
Note: If PHP is not properly recognizing the line endings when reading
files either on or created by a Macintosh computer, enabling the
auto_detect_line_endings run-time configuration option may help
resolve the problem.
What are the line endings in your file? Since all the lines are being read as one, I would guess this is your case.
To turn auto_detect_line_endings on, use ini_set("auto_detect_line_endings", true); as Pistachio suggests at http://php.net/manual/en/filesystem.configuration.php#107333
Use while instead of do-while:
while ($data = fgetcsv($handle,1000,",","'")) {
//...
}
Try LOAD DATA:
LOAD DATA INFILE '{$filepath}'
INTO TABLE `{$table}`
FIELDS TERMINATED BY ','
It is cleaner.
So I have a script that reads a text file, organizes it into an array, then uses this code to loop through the data and insert it into the proper columns/rows in a MySQL server:
$size = sizeof($str)/14;
$x=0;
$a=0; $b=1; $c=2; $d=3; $e=4; $f=5; $g=6; $h=7; $i=8; $j=9; $k=10; $l=11; $m=12; $n=13;
mysql_query('TRUNCATE TABLE scores');
do {
$query = "INSERT INTO scores (serverid,resetid, rank,number,countryname,land,networth,tag,gov,gdi,protection,vacation,alive,deleted)
VALUES ('$str[$a]','$str[$b]','$str[$c]','$str[$d]','$str[$e]','$str[$f]','$str[$g]','$str[$h]',
'$str[$i]','$str[$j]','$str[$k]','$str[$l]','$str[$m]','$str[$n]')";
mysql_query($query,$conn);
$a=$a+14; $b=$b+14; $c=$c+14; $d=$d+14; $e=$e+14; $f=$f+14; $g=$g+14; $h=$h+14; $i=$i+14; $j=$j+14; $k=$k+14; $l=$l+14; $m=$m+14; $n=$n+14;
$x++;
} while ($x != $size);
mysql_close($conn);
This code figures out how large the file is and loops through all 14 columns until it reaches the last row in the text file. Each time it is run, it clears the table and loads the new data (as intended).
My question is: is this a good way of doing it? Or is there a faster more clean way to do the same thing as my code above?
Could I use LOAD DATA LOCAL INFILE '$myFile' INTO TABLE ranksfeed_temp FIELDS TERMINATED BY ',' to do the same job in a more efficient manner? What are your thoughts? I'm trying to make my code more efficient and faster.
LOAD DATA would be faster and more efficient for importing a character-separated file like CSV. LOAD DATA is optimized for importing large files into your MySQL table, whereas you are currently running one query per row of your text file, which is incredibly slow to execute.
Please pay attention to the fact that the LOCAL option is only for files located on the client side of your MySQL client-server connection. If you can, load the file directly from the machine that acts as the MySQL server.
Disabling the keys on your table before inserting can give you extra speed during the import. Try it with keys disabled and with them enabled, and benchmark the results.
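For example, on a MyISAM table you can defer the rebuild of the non-unique indexes around the import (a sketch only; the table name ranksfeed_temp comes from the question and the file path is a placeholder):
mysql_query("ALTER TABLE ranksfeed_temp DISABLE KEYS", $conn);

mysql_query("LOAD DATA LOCAL INFILE '/path/to/scores.txt'
             INTO TABLE ranksfeed_temp
             FIELDS TERMINATED BY ','", $conn) or die(mysql_error());

// Rebuild the indexes once, after all rows are in
mysql_query("ALTER TABLE ranksfeed_temp ENABLE KEYS", $conn);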
Hi, I need to import a CSV file of 15000 lines.
I'm using the fgetcsv function and parsing each and every line,
but I get a timeout error every time.
The process is too slow and the data is only partially imported.
Is there any way to make the data import faster and more efficient?
if(isset($_POST['submit']))
{
    $fname = $_FILES['sel_file']['name'];
    $var = 'Invalid File';
    $chk_ext = explode(".", $fname);

    if(strtolower($chk_ext[1]) == "csv")
    {
        $filename = $_FILES['sel_file']['tmp_name'];
        $handle = fopen($filename, "r");
        $res = mysql_query("SELECT * FROM vpireport");
        $rows = mysql_num_rows($res);
        if($rows >= 0)
        {
            mysql_query("DELETE FROM vpireport") or die(mysql_error());
            for($i = 1; ($data = fgetcsv($handle, 10000, ",")) !== FALSE; $i++)
            {
                if($i == 1)
                    continue;

                $sql = "INSERT into vpireport
                        (item_code,
                        company_id,
                        purchase,
                        purchase_value)
                        values
                        (".$data[0].",
                        ".$data[1].",
                        ".$data[2].",
                        ".$data[3].")";
                //echo "$sql";
                mysql_query($sql) or die(mysql_error());
            }
        }
        fclose($handle);
        ?>
        <script language="javascript">
            alert("Successfully Imported!");
        </script>
        <?
    }
The problem is that every time it gets stuck partway through the import process and displays the following errors:
Error 1 :
Fatal Error: Maximum time limit of 30 seconds exceeded at line 175.
Error 2 :
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'S',0,0)' at line 1
I am not able to figure out this error...
The file is only partially imported each time, only around 200-300 lines out of 10000.
If you are currently doing a MySQL insert for each line, you can instead build a batch insert string for every 500 lines of CSV and execute it all at once. That will be faster.
Another solution is to read the file with an offset (a rough sketch follows this list):
Read the first 500 lines,
insert them into the database,
redirect to csvimporter.php?offset=500,
then return to step 1 and read the 500 lines starting at offset 500 this time.
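A rough sketch of that offset approach (csvimporter.php, $filename, and the batch size of 500 are illustrative assumptions; the actual insert logic is elided):
$batchSize = 500;
$offset = isset($_GET['offset']) ? (int) $_GET['offset'] : 0;

$handle = fopen($filename, "r");
// Skip the lines already handled by previous passes
for ($skipped = 0; $skipped < $offset && fgetcsv($handle, 10000, ",") !== false; $skipped++);

$processed = 0;
while ($processed < $batchSize && ($data = fgetcsv($handle, 10000, ",")) !== false) {
    // ... insert $data into vpireport here, as in the original loop ...
    $processed++;
}
fclose($handle);

if ($processed == $batchSize) {
    // More lines may remain: hand off to the next pass before the time limit hits
    header("Location: csvimporter.php?offset=" . ($offset + $batchSize));
    exit;
}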
Another solution would be setting the timeout limit to 0 with:
set_time_limit(0);
Set this at the top of the page:
set_time_limit(0);
It will let the page run without a time limit. However, that is not recommended; use it only if you have no other option.
You can consult the set_time_limit() documentation for details.
To make it faster, you need to check the various SQL statements you are sending and see whether you have proper indexes created.
If you are calling user-defined functions that refer to global variables, you can reduce the time taken even further by passing those variables to the functions as parameters and changing the code so that the functions use the passed values. Referring to global variables is slower than using local variables.
You can make use of LOAD DATA INFILE, which is a MySQL facility; it is much faster than fgetcsv.
More information is available at
http://dev.mysql.com/doc/refman/5.1/en/load-data.html
Simply use this at the beginning of your PHP import page:
ini_set('max_execution_time',0);
PROBLEM:
There is a huge performance impact in the way you INSERT data into your table. For every one of your records you send an INSERT request to the server; 15000 INSERT requests is huge!
SOLUTION:
You should group your data the way mysqldump does. In your case you only need three INSERT statements, not 15000, as below:
before the loop write:
$q = "INSERT into vpireport(item_code,company_id,purchase,purchase_value)values";
And inside the loop concatenate the records to the query as below:
$q .= "($data[0],$data[1],$data[2],$data[3]),";
Inside the loop, check whether the counter has reached 5000, 10000, or 15000; if so, insert the accumulated data into the vpireport table and then reset $q to the INSERT INTO... prefix again (a full sketch of the loop follows).
Run the query and enjoy!
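Putting it together, a minimal sketch of the batched loop might look like this (the flush size of 5000 is arbitrary; escaping and quoting have been added, which also avoids the kind of unquoted-value syntax error shown above):
$prefix = "INSERT into vpireport(item_code,company_id,purchase,purchase_value) values";
$q = $prefix;
$count = 0;

for ($i = 1; ($data = fgetcsv($handle, 10000, ",")) !== FALSE; $i++) {
    if ($i == 1) {
        continue; // skip the header row, as in the original code
    }
    $q .= "('" . mysql_real_escape_string($data[0]) . "','"
               . mysql_real_escape_string($data[1]) . "','"
               . mysql_real_escape_string($data[2]) . "','"
               . mysql_real_escape_string($data[3]) . "'),";
    $count++;

    if ($count == 5000) { // flush every 5000 rows
        mysql_query(rtrim($q, ",")) or die(mysql_error());
        $q = $prefix; // start a new batch
        $count = 0;
    }
}

if ($count > 0) { // flush the final partial batch
    mysql_query(rtrim($q, ",")) or die(mysql_error());
}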
If this is a one-time exercise, phpMyAdmin supports importing via CSV.
import-a-csv-file-to-mysql-via-phpmyadmin
He also notes the option of leveraging MySQL's LOAD DATA LOCAL INFILE. This is a very fast way to import data into a database table; see the LOAD DATA page in the MySQL docs.
EDIT:
Here is some pseudo-code:
// perform the file upload
$absolute_file_location = upload_file();
// connect to your MySQL database as you would normally
your_mysql_connection();
// execute the query
$query = "LOAD DATA LOCAL INFILE '" . $absolute_file_location .
"' INTO TABLE `table_name`
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(column1, column2, column3, etc)";
$result = mysql_query($query);
Obviously, you need to ensure good SQL practices to prevent injection, etc.
Hello again everybody on here :)
So we just finally got GPS data to go from an Android device to a database, using PHP to pass the array into the table. Awesome.
Now I need to pull the data out of the table and do one of two things that I can think of.
I can modify my existing PHP page to use fwrite() to write the incoming data to another PHP page, which would act like a DB table itself... which kind of defeats the purpose of having things stored in the DB.
I can create a PHP page that the device hits when it wants to pull all the GPS marks from the DB. I'm not sure how to do this at the moment.
My problem is that for the first method I'm a little unsure how space-efficient it would be vs. just pulling data from the DB, and I'm also not too sure how to set up the variables so that fwrite() actually writes them in the JSON format I need.
In order to use fwrite() and get things written to the new page correctly, would I use it like this?
$myFile = "testFile.php";
$fh = fopen($myFile, 'w') or die("can't open file");
$stringData = "\"lat\": " . $lat . " \n";
fwrite($fh, $stringData);
$stringData = "\"long\": " . $long . " \n";
fwrite($fh, $stringData);
fclose($fh);
For the second method I'm not really sure where to start. I have the GPS info stored in a table within a MySQL database, so I would need a PHP script that returns all the GPS marks in the table to the user so they appear on the device. Also, very soon we'll be moving to a PostgreSQL database, so help with doing that too would be great.
Once again any help would be much appreciated :)
Out of your two options, option 2 is the simplest, and the best. The PHP is dead simple:
<?php
mysql_connect("localhost","USERNAME","PASSWORD");
mysql_select_db("yourdb");
$sql = " select location_name, location_lat, location_lon from locations ";
$res = mysql_query($sql);
$locations = array();
for ($i = 0; $i < mysql_num_rows($res); $i++){
$row = mysql_fetch_assoc($res);
$locations[] = (object) $row;
}
echo json_encode($locations);
?>
So you want a PHP script that takes GPS data from the database and outputs it as JSON for the Android app to consume?
The high-level steps and the corresponding sub-questions you'll need to find answers to (a rough sketch follows the list):
querying database
"How to write SQL queries?"
"How to execute SQL query against MySQL and PostgreSQL using PHP?"
formatting query results as JSON
"How to serialize data to JSON using PHP?"
I have an 800MB text file with 18,990,870 lines in it (each line is a record). I need to pick out certain records and, if there is a match, write them into a database.
It is taking an age to work through them, so I wondered if there was a way to do it any quicker?
My PHP is reading a line at a time as follows:
$fp2 = fopen('download/pricing20100714/application_price','r');
if (!$fp2) {echo 'ERROR: Unable to open file.'; exit;}
while (!feof($fp2)) {
$line = stream_get_line($fp2,128,$eoldelimiter); //use 2048 if very long lines
if ($line[0] === '#') continue; //Skip lines that start with #
$field = explode ($delimiter, $line);
list($export_date, $application_id, $retail_price, $currency_code, $storefront_id ) = explode($delimiter, $line);
if ($currency_code == 'USD' and $storefront_id == '143441'){
// does application_id exist?
$application_id = mysql_real_escape_string($application_id);
$query = "SELECT * FROM jos_mt_links WHERE link_id='$application_id';";
$res = mysql_query($query);
if (mysql_num_rows($res) > 0 ) {
echo $application_id . "application id has price of " . $retail_price . "with currency of " . $currency_code. "\n";
} // end if exists in SQL
} else
{
// no, application_id doesn't exist
} // end check for currency and storefront
} // end while statement
fclose($fp2);
At a guess, the performance issue is that you issue a separate query for every application_id that matches USD and your storefront.
If space and IO aren't an issue, you might just blindly write all 19M records into a new staging DB table, add indices and then do the matching with a filter?
Don't try to reinvent the wheel; it's been done. Use a database to search through the file's content. You can load that file into a staging table in your database and query your data using indexes for fast access, if they add value. Most if not all databases have import/loading tools to get a file into the database relatively fast.
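For example, with MySQL the whole job can be pushed into the database (a sketch only; the staging table definition, column types, and tab delimiter are assumptions about the layout of the application_price file, and any '#' comment lines would need to be stripped first):
CREATE TABLE staging_application_price (
    export_date    VARCHAR(32),
    application_id VARCHAR(32),
    retail_price   DECIMAL(10,2),
    currency_code  CHAR(3),
    storefront_id  VARCHAR(16),
    KEY (application_id)
);

LOAD DATA LOCAL INFILE 'download/pricing20100714/application_price'
INTO TABLE staging_application_price
FIELDS TERMINATED BY '\t';

-- One indexed join instead of millions of single-row lookups
SELECT s.application_id, s.retail_price, s.currency_code
FROM   staging_application_price s
JOIN   jos_mt_links j ON j.link_id = s.application_id
WHERE  s.currency_code = 'USD' AND s.storefront_id = '143441';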
19M rows in the DB will slow it down if the DB is not designed properly. You can still use text files if they are partitioned properly. Recreating multiple smaller files, split on certain parameters and stored in a properly sorted way, might work.
Anyway, PHP is not the best language for file IO and processing; it is much slower than Java for this task, while plain old C would be one of the fastest for the job. PHP should be restricted to generating dynamic web output, while core processing should be done in Java/C. Ideally it would be a Java/C service that generates the output, with PHP using that feed to generate the HTML output.
You are parsing the input line twice by doing two explodes in a row. I would start by removing the first line:
$field = explode ($delimiter, $line);
list($export_date, ...., $storefront_id ) = explode($delimiter, $line);
Also, if you are only using the query to test for a match based on your condition, don't use SELECT *; use something like this:
"SELECT 1 FROM jos_mt_links WHERE link_id='$application_id';"
You could also, as Brandon Horsley suggested, buffer a set of application_id values in an array and modify your select statement to use the IN clause thereby reducing the number of queries you are performing.
Have you tried profiling the code to see where it's spending most of its time? That should always be your first step when trying to diagnose performance problems.
Preprocess with sed and/or awk ?
Databases are built and designed to cope with large amounts of data, PHP isn't. You need to re-evaluate how you are storing the data.
I would dump all the records into a database, then delete the records you don't need. Once you have done that, you can copy those records wherever you want.
As others have mentioned, the expense is likely in your database query. It might be faster to load a batch of records from the file (instead of one at a time) and perform one query to check multiple records.
For example, load 1000 records that match the USD currency and storefront at a time into an array and execute a query like:
'select link_id from jos_mt_links where link_id in (' . implode(',', $application_id_array) . ')'
This will return a list of those records that are in the database. Alternatively, you could change the SQL to use NOT IN to get a list of those records that are not in the database.
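A rough sketch of that batching approach, reusing the file-reading loop from the question (the batch size of 1000 and the check_batch() helper are illustrative):
function check_batch($batch) {
    // Escape and quote each id before building the IN (...) list
    $ids = array_map('mysql_real_escape_string', $batch);
    $sql = "SELECT link_id FROM jos_mt_links WHERE link_id IN ('" . implode("','", $ids) . "')";
    $res = mysql_query($sql);
    while ($row = mysql_fetch_assoc($res)) {
        echo $row['link_id'] . " exists in jos_mt_links\n";
    }
}

$batch = array();
while (!feof($fp2)) {
    $line = stream_get_line($fp2, 128, $eoldelimiter);
    if ($line[0] === '#') continue; // skip comment lines, as before
    list($export_date, $application_id, $retail_price, $currency_code, $storefront_id) = explode($delimiter, $line);
    if ($currency_code == 'USD' && $storefront_id == '143441') {
        $batch[] = $application_id;
        if (count($batch) >= 1000) { // one query per 1000 candidates instead of one per row
            check_batch($batch);
            $batch = array();
        }
    }
}
if (count($batch) > 0) {
    check_batch($batch); // final partial batch
}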