I have Excel data with more than 5k rows and 17 columns. I use a nested-loop technique in PHP, but it takes a long time: processing the data in xls format takes 45 minutes, while the csv format takes 30 minutes. Is there a technique to speed up uploading files from Excel into the database (I use PostgreSQL)?
I use a nested loop because the number of columns depends on the parameters, and whether each row is INSERTed or UPDATEd in the database also depends on the parameters.
Here is my code for the import process:
<?php
$row = 5000; // estimated number of rows
$col = 17;   // estimated number of columns
for ($i = 1; $i <= $row; $i += 1) {
    for ($j = 1; $j <= $col; $j += 1) {
        $custno = $sheetData[$i][0];
        // one SELECT per cell to check whether the customer exists
        $getId = "SELECT id FROM data WHERE custno = $custno";
        if ($getId) {
            $update = "UPDATE data SET address = 'address 1' WHERE custno = $custno";
        } else {
            $insert = "INSERT INTO data (address) VALUES ('address jon')";
        }
    }
}
I use the PhpSpreadsheet library
First, try to find the root of the issue: is reading the file itself slow, or are too many SQL queries being executed in the meantime?
Bear in mind that running queries inside a loop is always asking for performance trouble. Maybe you can avoid that by fetching the data you need before processing the file? You may not be able to determine exactly which data are needed at that point, but fetching more than you need can still be faster than making separate queries one by one. I would also encourage you to limit INSERT and UPDATE queries; they are usually slower than SELECTs. Try to collect the data for the database write operations and run them once after the loop, as in the sketch below.
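A minimal sketch of that idea, assuming a PDO connection to PostgreSQL in $pdo, the rows already read into $sheetData, and the table/column names (data, custno, address) from the question; the column indexes are assumptions:
<?php
// Minimal sketch: one SELECT up front, batched writes afterwards.
$existing = $pdo->query("SELECT custno FROM data")->fetchAll(PDO::FETCH_COLUMN);
$existing = array_flip($existing);       // fast lookup by custno

$inserts = [];
$updates = [];
foreach ($sheetData as $rowData) {
    $custno  = $rowData[0];
    $address = $rowData[1];              // column index is an assumption
    if (isset($existing[$custno])) {
        $updates[] = [$address, $custno];
    } else {
        $inserts[] = [$custno, $address];
    }
}

$pdo->beginTransaction();

// one prepared statement, re-used for every update
$upd = $pdo->prepare("UPDATE data SET address = ? WHERE custno = ?");
foreach ($updates as $params) {
    $upd->execute($params);
}

// one multi-row INSERT instead of thousands of single-row ones
if ($inserts) {
    $placeholders = rtrim(str_repeat('(?, ?),', count($inserts)), ',');
    $ins = $pdo->prepare("INSERT INTO data (custno, address) VALUES $placeholders");
    $ins->execute(array_merge(...$inserts));
}

$pdo->commit();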
For CSV operations I would prefer basic PHP functions like fgetcsv() and str_getcsv() over a separate library, as long as the file is not overcomplicated. If you are keen to check some alternatives to PhpSpreadsheet, take a look at Spout by Box; it looks promising, but I have never used it.
I'm sure that you can improve your performance by using PHP generators; they are perfect every time you have to read a file's content. Here are some more links, and a short sketch follows them:
https://www.sitepoint.com/memory-performance-boosts-with-generators-and-nikiciter/
https://www.sitepoint.com/generators-in-php/
https://riptutorial.com/php/example/5441/reading-a-large-file-with-a-generator/
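A minimal sketch of such a generator, assuming the data has already been exported to CSV; the file name and function name are only illustrative:
<?php
// Minimal sketch: yield one CSV row at a time so the whole file
// never has to sit in memory.
function readCsvRows(string $path): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    while (($row = fgetcsv($handle)) !== false) {
        yield $row;
    }
    fclose($handle);
}

foreach (readCsvRows('import.csv') as $row) {
    // process one row at a time here
}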
If not using PHP for this operation is an option for you, try exporting the spreadsheet as CSV and importing the file using COPY. It won't take more than a few seconds.
If your database is installed locally you just need to execute COPY in a client of your choice, e.g. pgAdmin. Check this answer for more information.
COPY your_table FROM '/home/user/file.csv' DELIMITER ',' CSV HEADER;
Keep in mind that the postgres user on your system must have the necessary permissions to access the CSV file. Check how to do that in your operating system, e.g. chown on Linux.
In case your database is installed on a remote server, you have to use the STDIN facility of COPY via psql:
$ cat file.csv | psql your_db -c "COPY your_table FROM STDIN;"
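If you want to stay inside PHP, the pgsql PDO driver can run the same COPY for you. A rough sketch, assuming a PDO connection and the file path used above; note that pgsqlCopyFromFile reads the file on the client side and does not skip a header row, so strip that first if your CSV has one:
<?php
// Rough sketch: COPY driven from PHP via the pgsql PDO driver.
// DSN, credentials, table name and file path are placeholders.
$pdo = new PDO('pgsql:host=localhost;dbname=your_db', 'user', 'password');
$ok  = $pdo->pgsqlCopyFromFile('your_table', '/home/user/file.csv', ',');
if (!$ok) {
    throw new RuntimeException('COPY failed');
}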
Related
I am using PHPSpreadsheet to take some spreadsheets a user can upload, add a column with certain values, save the file as CSV, and use the following query to import the csv file:
LOAD DATA LOCAL INFILE '{$file}'
INTO TABLE {$table}
FIELDS TERMINATED by ',' ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
Alternatively I can do something like:
foreach ($rows as $row) {
    // INSERT $row INTO table
}
The spreadsheets will all have the same columns/data-types.
What would be the most efficient way to do this? Going from Xlsx -> CSV -> MySQL Import seems like I am adding extra steps.
MySQL's direct CSV import is usually the fastest option; however, it is not without limitations. One is that the import is all or nothing, and you won't know how far along it is until it's done. As some imports can take hours, even days, you may not know where it's at. The entire insert operation on an InnoDB table takes place atomically for performance reasons, which means it's not visible until fully committed.
Another is that the file must be present on the server. The LOCAL option is a quirky feature of the mysql command-line tool and probably doesn't work in your database driver unless emulated.
Inserting row by row with a CSV parser is almost always slower. If you must go that route, be sure to prepare an INSERT statement once and re-use it in the loop, or do a "multi-INSERT" with as many rows as you can fit in your maximum statement size; see the sketch below.
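A minimal sketch of the "prepare once, execute many" approach with PDO; the DSN, credentials, table and column names are placeholders:
<?php
// Minimal sketch: one prepared INSERT re-used for every CSV row,
// wrapped in a single transaction.
$pdo = new PDO('mysql:host=localhost;dbname=your_db', 'user', 'password');
$pdo->beginTransaction();

$stmt = $pdo->prepare('INSERT INTO your_table (col_a, col_b) VALUES (?, ?)');

$fh = fopen('file.csv', 'r');
fgetcsv($fh);                               // skip the header row
while (($row = fgetcsv($fh)) !== false) {
    $stmt->execute([$row[0], $row[1]]);     // statement is parsed only once
}
fclose($fh);

$pdo->commit();                             // one commit instead of one per row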
I have an Excel sheet that contains almost 67,000 rows, and I tried to convert it to MySQL using excel_reader, but it does not support that large a number of items. Please help me solve this issue.
Also try the EasyXLS Excel library. You can import large amounts of data from Excel with it. It includes a COM component that can be used from PHP. COM objects are a little slower, but you can still obtain a reasonable importing time.
Use this link as starting point:
https://www.easyxls.com/manual/FAQ/import-excel-to-mysql.html
A viable option (but certainly not the easiest) would be to construct a script using PHP. Note this would be the loop itself; you would still need your DB connection, etc.
<?php
$file = fopen("import.csv", "r");
while (($line = fgetcsv($file)) !== false)
{
    // MySQL INSERT statement here, using $line[0], $line[1], ...
}
fclose($file);
?>
That reads each line into an array, so you can use the array positions in your INSERT statement, which will be repeated roughly 67,000 times. It wouldn't take excessively long, and may be a better approach than using, say, phpMyAdmin if that is timing out on you. A rough sketch follows.
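A rough sketch of what that loop could look like with a prepared mysqli statement; the connection details, table and column names are assumptions:
<?php
// Rough sketch: one prepared statement, bound to CSV columns by position.
// Host, credentials, table and column names are placeholders.
$mysqli = new mysqli("localhost", "user", "password", "your_db");
$stmt = $mysqli->prepare("INSERT INTO your_table (col_a, col_b) VALUES (?, ?)");

$file = fopen("import.csv", "r");
while (($line = fgetcsv($file)) !== false) {
    $stmt->bind_param("ss", $line[0], $line[1]);
    $stmt->execute();
}
fclose($file);
$stmt->close();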
I'm working with the Royal Mail PAF database in CSV format (approx 29 million lines), and need to load the data into SQL Server using PHP.
Can anyone recommend the best method for this to prevent timeout?
Here is a sample of the data: https://gist.github.com/anonymous/8278066
To disable the script execution time limit, start your script off with this:
set_time_limit(0);
Another problem you will likely run into is a memory limit. Make sure you are reading your file line by line or in chunks, rather than the whole file at once. You can do this with fgets(), for example:
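A minimal sketch of that, assuming the PAF export has already been saved as paf.csv (the name is an example):
<?php
// Minimal sketch: read the CSV line by line so memory use stays flat
// regardless of file size.
$handle = fopen('paf.csv', 'r');
while (($line = fgets($handle)) !== false) {
    $fields = str_getcsv($line);   // split the line into columns
    // insert $fields into SQL Server here
}
fclose($handle);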
Start your script with
ini_set('max_execution_time', 0);
The quickest way I found was to use SQL Server's BULK INSERT to load the data directly and unchanged from the CSV files into matching import tables in the database, and then do my own manipulation and population of the application-specific tables from those import tables.
I found BULK INSERT will import the main CSV PAF file, containing nearly 31 million address records, in just a few minutes.
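For reference, a rough sketch of issuing such a BULK INSERT from PHP through the sqlsrv PDO driver; the server, credentials, table name and file path are all assumptions, and the path must be readable by the SQL Server service itself, not by PHP:
<?php
// Rough sketch: BULK INSERT driven from PHP (pdo_sqlsrv driver).
// DSN, credentials, table name and file path are placeholders.
$pdo = new PDO('sqlsrv:Server=localhost;Database=your_db', 'user', 'password');

$pdo->exec("
    BULK INSERT dbo.paf_import
    FROM 'C:\\data\\paf.csv'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\\n', FIRSTROW = 2)
");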
So my situation is the following: I obtain data from a certain source through PHP every 24 hours. During that time the data gets changed: new data is added and some is updated. When I run a query to insert or update this data every 24 hours, it takes a long time to execute. This would not be a problem if the execution time were moderate, as I am planning on making this a cron job, but the execution time is currently way too high.
So I was thinking about writing this data from PHP to a CSV file when I obtain it, and then using that CSV file with the MySQL LOAD DATA function, which supposedly inserts a lot of data from a file very quickly.
So the questions are: is it possible to write a CSV file from PHP and format it in a way that suits the LOAD DATA INFILE function? How can I, every 24 hours, delete that CSV and create a new one with the newly obtained data? And how would I go about properly using the LOAD DATA INFILE function with this particular CSV file? Oh, and can I make a cron job out of all of this? Thanks.
Assume you receive data from your source and prepare this array:
$x=array(1=>array('1','2','3','4'), 2=>array('5','6','7','8'));
Create a CSV file like this:
$file = fopen(<DIRECTORY_PATH>."file.csv", "w");
if (!$file) {
    // error handling here
}
$csv_data = "";
foreach ($x as $row) {
    foreach ($row as $element) {
        $csv_data .= $element . ",";
    }
    $csv_data = rtrim($csv_data, ",") . "\n"; // remove the trailing comma
}
fwrite($file, $csv_data);
fclose($file);
$query = "LOAD DATA INFILE '" . <DIRECTORY_PATH> . "file.csv' INTO TABLE your_table FIELDS TERMINATED BY ','";
if (!$mysqli->query($query)) {
    printf("Error: %s\n", $mysqli->error);
}
As you can read, the LOAD DATA statement is very flexible: you can specify the field and row delimiters and many other options, which allows you to produce your CSV in whatever format you want. LOAD DATA is indeed the fastest way to insert data into MySQL.
In PHP you can very simply write data to a file, as shown here.
After this you will indeed need a cron job that loads the file into MySQL; this should be helpful. A possible crontab entry is shown below.
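For example, a crontab entry like the following would run the export-and-load script once every 24 hours; the PHP binary location, script path and time of day are placeholders:
# run the import script every day at 03:00
0 3 * * * /usr/bin/php /path/to/import_csv.php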
One of my clients has all of his product information handled by an outside source. They have provided this to me in a CSV file, which they will regularly update and upload to an FTP folder of my specification, say every week.
Within this CSV file is all of the product information: product name, spec, image location, etc.
The site which I have built for my client is running a MySQL database, which I thought would be holding all of the product information, and thus has been built to handle all of the product data.
My question is this: How would I go about creating and running a script that would find a newly added CSV file from the specified FTP folder, extract the data, and replace all of the data within the relevant MySQL table, all done automatically?
Is this even possible?
Any help would be greatly appreciated as I don't want to use the IFrame option, S.
This should be pretty straightforward, depending on the CSV file. Some CSV files have quotes around text fields, some don't; some have commas inside the quoted fields, and so on. Depending on your level of PHP skill this should be reasonably easy.
You can get the modification timestamp of the file to see whether it is new:
http://nz.php.net/manual/en/function.lstat.php
Open the file and import the data:
http://php.net/manual/en/function.fgetcsv.php
Insert it into the database:
http://nz.php.net/manual/en/function.mysql-query.php
If the CSV is difficult to parse with fgetcsv, then you could try something like the PHPExcel project, which has CSV reading capabilities:
http://phpexcel.codeplex.com
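A rough sketch of reading the CSV through PHPExcel's CSV reader, in case you go that route; the include path, delimiter and file name are assumptions:
<?php
// Rough sketch: read a CSV via PHPExcel's CSV reader.
// Adjust the include path to your installation; file name is a placeholder.
require_once 'Classes/PHPExcel/IOFactory.php';

$reader = PHPExcel_IOFactory::createReader('CSV');
$reader->setDelimiter(',');

$workbook = $reader->load('products.csv');
$rows = $workbook->getActiveSheet()->toArray(); // each row as a plain array

foreach ($rows as $row) {
    // $row[0], $row[1], ... insert into MySQL here
}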
You can just write a script which reads the CSV file using PHP's fgetcsv() function, extracts each row, and formats it into an array to insert into the database.
$fileTemp = "path-of-the-file.csv";
$fp = fopen($fileTemp, 'r');
$datas = array();
while (($data = fgetcsv($fp)) !== FALSE)
{
    $data['productName'] = trim($data[0]);
    $data['spec'] = trim($data[1]);
    $data['imageLocation'] = trim($data[2]);
    $datas[] = $data;
}
fclose($fp);
Now you have a prepared array $datas, which you can insert into the database with iterations, for example as sketched below.
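A possible sketch of that insert step, batching the rows into multi-row INSERTs so the loop does not issue one query per row; the PDO connection, table and column names are assumptions:
<?php
// Possible sketch: insert the prepared $datas array in batches of 500 rows.
// DSN, credentials, table and column names are placeholders.
$pdo = new PDO('mysql:host=localhost;dbname=your_db', 'user', 'password');

foreach (array_chunk($datas, 500) as $chunk) {
    $placeholders = rtrim(str_repeat('(?, ?, ?),', count($chunk)), ',');
    $params = [];
    foreach ($chunk as $data) {
        $params[] = $data['productName'];
        $params[] = $data['spec'];
        $params[] = $data['imageLocation'];
    }
    $stmt = $pdo->prepare(
        "INSERT INTO products (product_name, spec, image_location) VALUES $placeholders"
    );
    $stmt->execute($params);
}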
All you need is:
Store the last file's mtime somewhere (let's say, for simplicity, in another file)
A script that runs every X minutes via cron
In this script, you simply compare the mtime of the CSV file with the stored value. If the mtime differs, you run an SQL query that looks like this:
LOAD DATA LOCAL INFILE '/var/www/tmp/file.csv' REPLACE INTO TABLE mytable COLUMNS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n'
Optionally, you can just touch your file to know when you've performed the last data load. If the CSV file's mtime is greater than your "helper" file's, you should touch it and perform the query.
Documentation on LOAD DATA INFILE SQL statement is here
Of course there is room for query errors, but I hope you will handle that (you just need to be sure the data loaded properly, and only in that case touch the file or write the new mtime). A sketch of the whole check-and-load flow follows.
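A minimal sketch of that flow, using a touch()ed helper file as the marker; paths, DSN, credentials and table name are placeholders, and LOCAL INFILE must be enabled on both client and server:
<?php
// Minimal sketch: reload the CSV only when it is newer than the marker file.
$csv    = '/var/www/tmp/file.csv';
$marker = '/var/www/tmp/file.csv.loaded';

if (!file_exists($marker) || filemtime($csv) > filemtime($marker)) {
    $pdo = new PDO('mysql:host=localhost;dbname=your_db', 'user', 'password',
                   [PDO::MYSQL_ATTR_LOCAL_INFILE => true]);

    $pdo->exec("LOAD DATA LOCAL INFILE '$csv' REPLACE INTO TABLE mytable
                COLUMNS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
                LINES TERMINATED BY '\\r\\n'");

    touch($marker);   // remember that this version has been loaded
}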
Have you had a look at fgetcsv()? You will probably have to set up a cron job to check for a new file at regular intervals.