So my situation is the following. I obtain data from a certain source through PHP every 24 hours. During that time the data changes: new rows are added and some existing ones are updated. When I run a query to insert or update this data every 24 hours, it takes a long time to execute. That would not be a problem if the execution time were moderate, since I am planning on making this a cron job, but right now the execution time is way too high.
So I was thinking about writing this data from PHP to a CSV file when I obtain it, and then using that CSV file with MySQL's LOAD DATA statement, which supposedly inserts a lot of data from a file very quickly.
So the questions are: is it possible to write a CSV file from PHP formatted in a way that suits LOAD DATA INFILE? How can I delete that CSV every 24 hours and create a new one with the freshly obtained data? And how would I go about properly using LOAD DATA INFILE with this particular CSV file? Oh, and can I make a cron job out of all of this? Thanks.
Assume you receive data from your source and prepare this array:
$x=array(1=>array('1','2','3','4'), 2=>array('5','6','7','8'));
Create a CSV file like this:
$file = fopen(<DIRECTORY_PATH>."file.csv", "w");
if (!$file) {
    // error handling here
}
$csv_data = "";
foreach ($x as $row) {
    // join the elements with commas so there is no trailing comma at the end of the line
    $csv_data .= implode(",", $row) . "\n";
}
fwrite($file, $csv_data);
fclose($file);
$query = "LOAD DATA INFILE '".<DIRECTORY_PATH>."file.csv"."' INTO TABLE your_table";
if (!$mysqli->query($query))
    printf("Error: %s\n", $mysqli->error);
As you can read in the LOAD DATA documentation, the statement is very flexible: you can specify the field and line delimiters and many other options, which lets you produce your CSV in whatever format you want. LOAD DATA is indeed the fastest way to insert data into MySQL.
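For example, if you keep the comma separator from the snippet above, the query could look roughly like this; the column list and delimiters are assumptions you would adapt to your own table and file (same <DIRECTORY_PATH> placeholder as before):
$query = "LOAD DATA INFILE '".<DIRECTORY_PATH>."file.csv"."' INTO TABLE your_table "
       . "FIELDS TERMINATED BY ',' "
       . "LINES TERMINATED BY '\\n' "
       . "(col1, col2, col3, col4)";
if (!$mysqli->query($query))
    printf("Error: %s\n", $mysqli->error);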
In PHP you can very simply write data to a file, as shown here.
After this you will indeed need a cron job that loads the file into MySQL; this should be helpful.
I have Excel data with more than 5k rows and 17 columns. I use the nested-loop technique in PHP, but it takes a long time: processing the data takes 45 minutes with the xls file format and 30 minutes with the csv file format. Is there a technique to speed up uploading files from Excel to the database (I use PostgreSQL)?
I use a nested loop because the number of columns depends on the parameters, and whether the database operation is an INSERT or an UPDATE also depends on the parameters.
Here is my code for the import process
<?php
$row = 5000; // estimated number of rows
$col = 17;   // estimated number of columns
for ($i = 1; $i <= $row; $i += 1) {
    for ($j = 1; $j <= $col; $j += 1) {
        $custno = $sheetData[$i][0];
        // note: these query strings are built here but executed elsewhere
        $getId = "SELECT id FROM data WHERE custno = $custno";
        if ($getId) {
            $update = "UPDATE data SET address = 'address 1' WHERE custno = $custno";
        } else {
            $insert = "INSERT INTO data (address) VALUES ('address jon')";
        }
    }
}
I use the PhpSpreadsheet library
First, try to find out the root of the issue: is it that reading the file is slow, or that too many SQL queries are being executed in the meantime?
Bear in mind that running queries in a loop is always asking for performance trouble. Maybe you can avoid that by fetching the needed data before processing the file? You may not be able to tell exactly which data you need at that step, but fetching more than you need could still be faster than making separate queries one by one. Also, I would encourage you to limit INSERT and UPDATE queries; they are usually slower than SELECTs. Try to collect the data for the database write operations and run them once after the loop.
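A minimal sketch of that idea, assuming PostgreSQL 9.5+ with a unique index on custno and a PDO connection in $pdo; the table and column names are placeholders, not your real schema, and the ON CONFLICT upsert also removes the per-row SELECT:
// collect the values first, then write them in one pass after the loop
$rows = [];
foreach ($sheetData as $r) {
    $rows[] = [$r[0], $r[1]];   // e.g. custno and address
}

// a single prepared upsert executed inside one transaction
$pdo->beginTransaction();
$stmt = $pdo->prepare(
    "INSERT INTO data (custno, address) VALUES (:custno, :address)
     ON CONFLICT (custno) DO UPDATE SET address = EXCLUDED.address"
);
foreach ($rows as [$custno, $address]) {
    $stmt->execute(['custno' => $custno, 'address' => $address]);
}
$pdo->commit();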
For CSV operations I would prefer basic PHP functions like fgetcsv() and str_getcsv() over a separate library, as long as the file is not overly complicated. If you are keen to check some alternatives to PhpSpreadsheet, take a look at Spout by box.com; it looks promising but I have never used it.
I'm sure that you can improve your performance by using PHP generators; they are perfect whenever you have to read a file's content. Here are some more links:
https://www.sitepoint.com/memory-performance-boosts-with-generators-and-nikiciter/
https://www.sitepoint.com/generators-in-php/
https://riptutorial.com/php/example/5441/reading-a-large-file-with-a-generator/
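A minimal sketch of the generator idea applied to a CSV file (the path and the processing step are placeholders):
function readRows(string $path): Generator {
    $handle = fopen($path, 'r');
    while (($row = fgetcsv($handle)) !== false) {
        yield $row;   // hands back one row at a time instead of loading the whole file into memory
    }
    fclose($handle);
}

foreach (readRows('data.csv') as $row) {
    // process the row here
}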
If not using PHP for this operation is an option for you, try exporting the spreadsheet as CSV and importing the file using COPY. It won't take more than a few seconds.
If your database is installed locally, you just need to execute COPY in a client of your choice, e.g. pgAdmin. Check this answer for more information.
COPY your_table FROM '/home/user/file.csv' DELIMITER ',' CSV HEADER;
Keep in mind that the user postgres in your system must have the necessary permissions to access the CSV file. Check how to do that in your operating system, e.g. chown in Linux.
If your database is installed on a remote server, you have to use the STDIN facility of COPY via psql:
$ cat file.csv | psql your_db -c "COPY your_table FROM STDIN;"
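If you would rather stay in PHP, the pgsql extension can feed COPY for you. A rough sketch, assuming simple comma-separated data with no quoted fields; the connection string, file path and table name are placeholders:
$conn = pg_connect("host=localhost dbname=your_db user=postgres password=secret");
$lines = file('/home/user/file.csv', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
array_shift($lines);                              // drop the CSV header row
pg_copy_from($conn, 'your_table', $lines, ',');   // equivalent of COPY ... FROM STDIN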
I'm working on a project using Laravel. I currently have a cron that processes data and generates 2 CSV files: one is sorted (call it file A) and the other one is not (file B).
Each file is ~10 MB, the cron is triggered daily, and we don't need the database to store the results.
I'm currently doing something like this:
The cron runs and stores all the results into a temp table in the database
After the processing finishes, read the data from that table and write it to file B (the normal file, without sorting)
Then, run another query to get the sorted data from the table again, and write it to file A
After that, empty the temp table
Because the result data is too big to hold in an array (we run out of memory), we ended up using the temp table to store it.
My question is: Is there any better approach for this scenario?
Thanks
We solved some "out of memory" errors during the export of large data as CSV by using cursors.
The "out of memory" error occurred both with the classic Eloquent method (Model::where(...)->get()) and with the chunk function.
You can use cursors this way:
use App\Models\Flight;
foreach (Flight::where('destination', 'Zurich')->cursor() as $flight) {
// write the record in your csv file
}
You can find some more documentation here: https://laravel.com/docs/8.x/eloquent#cursors
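Combined with fputcsv() this keeps memory usage flat while writing the file. A rough sketch based on the Flight example from the docs; the file path, the columns and the sort column are assumptions:
use App\Models\Flight;

$handle = fopen(storage_path('app/file_a.csv'), 'w');
// let the database do the sorting for file A instead of sorting in PHP
foreach (Flight::where('destination', 'Zurich')->orderBy('number')->cursor() as $flight) {
    fputcsv($handle, [$flight->number, $flight->destination]);
}
fclose($handle);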
I have created a table in my db and filled in all the records using a CSV file.
I need to do this weekly to keep the table updated.
I want to upload the new records into the same table from a CSV without disturbing the old ones.
[I have to pick up the data from a remote host and upload it locally to my server; I don't have access to the remote db]
Kindly guide me.
You can upload records from a CSV into a table VERY quickly using the load data infile syntax (http://dev.mysql.com/doc/refman/5.1/en/load-data.html)
The syntax is pretty simple, but also flexible. This is an example:
LOAD DATA INFILE 'data.txt' INTO TABLE table2
FIELDS TERMINATED BY '\t';
You can kick these off from a console or via code.
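Via code it could look roughly like this (credentials, path and table name are placeholders):
$mysqli = new mysqli('localhost', 'user', 'password', 'your_db');
$sql = "LOAD DATA INFILE '/path/to/data.txt' INTO TABLE table2 FIELDS TERMINATED BY '\\t'";
if (!$mysqli->query($sql)) {
    echo $mysqli->error;
}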
This will append to the table, not replace it, so as long as you don't truncate it first, it should work like a charm.
You can of course also load the data manually by parsing the CSV file in your code and creating an insert statement for each line of the file, but if the format is fixed already, LOAD DATA will be quicker and more efficient.
Edit: It appends the data. By default, no database will delete data from a table unless you specifically tell it to; an INSERT statement is exactly the "append" you are thinking of.
I'll be importing data from another website into my db; the external data I'll be reading before importing it into my database is XML.
I'll be running the script every 15 minutes to check the xml file.
What is the best way to go about inserting/modifying/deleting data?
I know I can just delete all the data from the database table before importing the data from the XML, but there has to be a more efficient way of doing this.
I appreciate the help.
Why not simply use the Truncate keyword in your statement?
TRUNCATE TableName;
INSERT INTO ....
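A rough sketch of that from PHP, assuming mysqli, placeholder names, and that $xmlRows already holds the rows parsed from the XML feed:
$mysqli = new mysqli('localhost', 'user', 'password', 'your_db');
$mysqli->query("TRUNCATE TableName");   // wipe the old snapshot

$stmt = $mysqli->prepare("INSERT INTO TableName (col1, col2) VALUES (?, ?)");
foreach ($xmlRows as $row) {
    $stmt->bind_param('ss', $row['col1'], $row['col2']);
    $stmt->execute();
}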
regards
One of my clients has all of his product information handled by an outside source. They have provided this to me in a CSV file which they will regularly update and upload to an FTP folder of my specification, say every week.
Within this CSV file is all of the product information; product name, spec, image location etc.
The site which I have built for my client is running a MySQL database, which I thought would be holding all of the product information, and thus has been built to handle all of the product data.
My question is this: How would I go about creating and running a script that would find a newly added CSV file from the specified FTP folder, extract the data, and replace all of the data within the relevant MySQL table, all done automatically?
Is this even possible?
Any help would be greatly appreciated as I don't want to use the IFrame option, S.
Should be pretty straightforward, depending on the CSV file.
Some CSV files have quotes around text (""), some don't.
Some have commas inside the quoted fields, etc.
Depending on your level of PHP skills this should be reasonably easy:
you can get a modified timestamp from the file to see if it is new
http://nz.php.net/manual/en/function.lstat.php
open the file and import the data
http://php.net/manual/en/function.fgetcsv.php
insert into the database (see the sketch after these links)
http://nz.php.net/manual/en/function.mysql-query.php
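A rough sketch tying those three steps together, using mysqli rather than the old mysql_* functions from the last link; the paths, credentials and column mapping are placeholders:
$csvFile   = '/path/to/ftp/products.csv';
$stateFile = '/path/to/last_import_time.txt';

$lastImport = file_exists($stateFile) ? (int) file_get_contents($stateFile) : 0;
if (filemtime($csvFile) > $lastImport) {
    $mysqli = new mysqli('localhost', 'user', 'password', 'your_db');
    $stmt = $mysqli->prepare("INSERT INTO products (name, spec, image) VALUES (?, ?, ?)");

    $fp = fopen($csvFile, 'r');
    while (($row = fgetcsv($fp)) !== false) {
        $stmt->bind_param('sss', $row[0], $row[1], $row[2]);
        $stmt->execute();
    }
    fclose($fp);

    file_put_contents($stateFile, (string) time());   // remember that this file has been imported
}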
If the CSV is difficult to parse with fgetcsv, then you could try something like the PHPExcel project, which has CSV reading capabilities:
http://phpexcel.codeplex.com
You can just write a script which reads the CSV file using PHP's fgetcsv() function, extracts each row, and formats it into an array to insert into the database.
$fileTemp = "path-of-the-file.csv";
$fp = fopen($fileTemp, 'r');
$datas = array();
while (($data = fgetcsv($fp)) !== FALSE)
{
    $data['productName']   = trim($data[0]);
    $data['spec']          = trim($data[1]);
    $data['imageLocation'] = trim($data[2]);
    $datas[] = $data;
}
fclose($fp);
Now you have the prepared array $datas, which you can insert into the database.
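For example, one way is to build a single multi-row INSERT from $datas instead of one query per row; this assumes a mysqli connection in $mysqli, and the table and column names are placeholders:
$values = [];
foreach ($datas as $d) {
    $values[] = sprintf(
        "('%s', '%s', '%s')",
        $mysqli->real_escape_string($d['productName']),
        $mysqli->real_escape_string($d['spec']),
        $mysqli->real_escape_string($d['imageLocation'])
    );
}
if ($values) {
    $mysqli->query("INSERT INTO products (name, spec, image) VALUES " . implode(',', $values));
}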
All you need is:
Store the file's last mtime somewhere (let's say, for simplicity, in another file)
A script that runs every X minutes via cron
In this script you simply compare the mtime of the CSV file with the stored value. If the mtime differs, you run an SQL query that looks like this:
LOAD DATA LOCAL INFILE '/var/www/tmp/file.csv' REPLACE INTO TABLE mytable COLUMNS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n'
Optionally, you can just touch a file to record when you last performed the data load. If the CSV file's mtime is greater than your "helper" file's, you should touch the helper file and perform the query.
Documentation on the LOAD DATA INFILE SQL statement is here.
Of course there is room for query errors, but I hope you will handle that (you just need to be sure the data loaded properly, and only in that case touch the file or write the new mtime).
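A rough PHP sketch of that check; the paths, credentials and helper-file location are placeholders, and the query is the one shown above:
$csv    = '/var/www/tmp/file.csv';
$marker = '/var/www/tmp/last_load';   // the "helper" file whose mtime records the last load

if (!file_exists($marker) || filemtime($csv) > filemtime($marker)) {
    $mysqli = new mysqli('localhost', 'user', 'password', 'your_db');
    $sql = "LOAD DATA LOCAL INFILE '/var/www/tmp/file.csv' REPLACE INTO TABLE mytable "
         . "COLUMNS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' LINES TERMINATED BY '\\r\\n'";
    if ($mysqli->query($sql)) {
        touch($marker);   // only record the load if the query succeeded
    }
}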
Have you had a look at fgetcsv()? You will probably have to set up a cron job to check for a new file at regular intervals.