One of my clients has all of his product information handled by an outside source. They have provided this to me in a CSV file which they will regularly update and upload to an FTP folder of my specification, say every week.
Within this CSV file is all of the product information: product name, spec, image location, etc.
The site which I have built for my client is running a MySQL database, which I thought would be holding all of the product information, and thus has been built to handle all of the product data.
My question is this: How would I go about creating and running a script that would find a newly added CSV file from the specified FTP folder, extract the data, and replace all of the data within the relevant MySQL table, all done automatically?
Is this even possible?
Any help would be greatly appreciated, as I don't want to use the iframe option. S.
This should be pretty straightforward, depending on the CSV file:
some CSV files have quotes around text fields ("") and some don't;
some have commas inside the quoted fields, etc.
Depending on your level of PHP skill this should be reasonably easy.
You can get a modified timestamp from the file to see if it is new:
http://nz.php.net/manual/en/function.lstat.php
Open the file and import the data:
http://php.net/manual/en/function.fgetcsv.php
Insert it into the database:
http://nz.php.net/manual/en/function.mysql-query.php
If the CSV is difficult to parse with fgetcsv, you could try something like the PHPExcel project, which has CSV reading capabilities:
http://phpexcel.codeplex.com
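For illustration, a rough sketch of those three steps, assuming the CSV has three columns (name, spec, image location) and no header row; the paths, credentials, and table/column names are placeholders, and PDO is used instead of the old mysql_* functions:
<?php
$csvFile   = '/home/ftp/products/products.csv';   // assumed upload location
$stateFile = '/home/ftp/products/.last_import';   // stores the mtime of the last import

// Only import if the CSV has changed since the last run.
$mtime = filemtime($csvFile);
$last  = is_file($stateFile) ? (int) file_get_contents($stateFile) : 0;
if ($mtime === false || $mtime <= $last) {
    exit; // nothing new to import
}

$pdo = new PDO('mysql:host=localhost;dbname=shop;charset=utf8mb4', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->beginTransaction();
$pdo->exec('DELETE FROM products');   // replace all existing product rows
$stmt = $pdo->prepare('INSERT INTO products (name, spec, image_location) VALUES (?, ?, ?)');

$fp = fopen($csvFile, 'r');
while (($row = fgetcsv($fp)) !== false) {
    $stmt->execute([trim($row[0]), trim($row[1]), trim($row[2])]);
}
fclose($fp);
$pdo->commit();

file_put_contents($stateFile, $mtime);   // remember what was imported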
You can just write a script which reads the CSV file using PHP's fgetcsv() function, extracts each row, formats it into an array, and inserts it into the database.
$fileTemp = "path-of-the-file.csv";
$fp = fopen($fileTemp, 'r');
$datas = array();
while (($data = fgetcsv($fp)) !== FALSE)
{
    $row = array();
    $row['productName']   = trim($data[0]);
    $row['spec']          = trim($data[1]);
    $row['imageLocation'] = trim($data[2]);
    $datas[] = $row;
}
fclose($fp);
Now you have the prepared array $datas, which you can insert into the database by iterating over it, as sketched below.
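For example, a possible insert loop using mysqli prepared statements (the credentials and the products table/column names are assumptions):
$mysqli = new mysqli('localhost', 'user', 'pass', 'shop');   // placeholder credentials
$stmt = $mysqli->prepare('INSERT INTO products (product_name, spec, image_location) VALUES (?, ?, ?)');
foreach ($datas as $row) {
    $stmt->bind_param('sss', $row['productName'], $row['spec'], $row['imageLocation']);
    $stmt->execute();
}
$stmt->close();
$mysqli->close();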
All you need is:
Store the last file's mtime somewhere (let's say, for simplicity, in another file)
A script that runs every X minutes via cron
In this script you simply compare the mtime of the CSV file with the stored value. If the mtime differs, you run an SQL query that looks like this:
LOAD DATA LOCAL INFILE '/var/www/tmp/file.csv' REPLACE INTO TABLE mytable COLUMNS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n'
Optionally, you can just touch a helper file to record when you last performed a data load. If the CSV file's mtime is greater than that of your "helper" file, you should touch the helper file and perform the query.
Documentation on LOAD DATA INFILE SQL statement is here
Of course there is room for query errors, but I hope you will handle them (you just need to be sure the data loaded properly, and only in that case touch the file or write the new mtime).
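A possible shape for that cron script, using the touch-file variant (the paths, credentials, and database name are placeholders):
<?php
$csv    = '/var/www/tmp/file.csv';
$marker = '/var/www/tmp/file.csv.loaded';   // the "helper" file touched after each load

if (!is_file($marker) || filemtime($csv) > filemtime($marker)) {
    $pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass', array(
        PDO::ATTR_ERRMODE            => PDO::ERRMODE_EXCEPTION,
        PDO::MYSQL_ATTR_LOCAL_INFILE => true,   // needed for LOAD DATA LOCAL
    ));
    $pdo->exec("LOAD DATA LOCAL INFILE '/var/www/tmp/file.csv'
                REPLACE INTO TABLE mytable
                COLUMNS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
                LINES TERMINATED BY '\\r\\n'");
    touch($marker);   // record that this version has been loaded
}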
have you had a look at fgetcsv? You will probably have to set up a cron job to check for a new file at regular intervals.
I have Excel data with more than 5k rows and 17 columns, and I use a nested-loop technique in PHP, but this takes a long time: processing the data in the xls file format takes 45 minutes, while the csv file format takes 30 minutes. Is there a technique to speed up uploading files from Excel to the database (I use PostgreSQL)?
I use a nested loop because the number of columns depends on the parameters, and the INSERT or UPDATE process to the database also depends on the parameters.
Here is my code for the import process:
<?php
$row = 5000; // estimated rows
$col = 17;   // estimated columns
for ($i = 1; $i <= $row; $i += 1) {
    for ($j = 1; $j <= $col; $j += 1) {
        $custno = $sheetData[$i][0];
        $getId = "SELECT id FROM data WHERE custno = '$custno'";
        if ($getId) {
            $update = "UPDATE data SET address = 'address 1' WHERE custno = '$custno'";
        } else {
            $insert = "INSERT INTO data (address) VALUES ('address jon')";
        }
    }
}
I use the PhpSpreadsheet library
First, try to find out what the root of the issue is: is operating over the file slow, or are too many SQL queries being executed in the meantime?
Bear in mind that running queries in a loop is always asking for performance trouble. Maybe you can avoid that by fetching the data you need before processing the file? You may not be able to define exactly which data are needed at that step, but fetching more than you need can still be faster than making separate queries one by one. Also, I would encourage you to limit INSERT and UPDATE queries; they are usually slower than a SELECT. Try to collect the data for database write operations and run them once, after the loop, as sketched below.
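A rough illustration of that idea, reusing the data table and custno/address columns from the question's snippet; the PDO connection, the column positions in $sheetData, and everything else are assumptions:
// Fetch all existing customer numbers once, instead of one SELECT per row.
$existing = array_flip($pdo->query('SELECT custno FROM data')->fetchAll(PDO::FETCH_COLUMN));

$inserts = array();
$updates = array();
foreach ($sheetData as $r) {
    $custno  = $r[0];
    $address = $r[1];   // assumed column position
    if (isset($existing[$custno])) {
        $updates[] = array($address, $custno);
    } else {
        $inserts[] = array($custno, $address);
    }
}

// One transaction, two prepared statements, all writes after the loop.
$pdo->beginTransaction();
$ins = $pdo->prepare('INSERT INTO data (custno, address) VALUES (?, ?)');
foreach ($inserts as $p) { $ins->execute($p); }
$upd = $pdo->prepare('UPDATE data SET address = ? WHERE custno = ?');
foreach ($updates as $p) { $upd->execute($p); }
$pdo->commit();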
For CSV operations I would prefer basic PHP functions like fgetcsv() and str_getcsv() over a separate library, as long as the file is not overcomplicated. If you are keen to check some alternatives to PhpSpreadsheet, take a look at Spout by box.com; it looks promising, but I have never used it.
I'm sure that you can improve your performance by using PHP generators; they are perfect whenever you have to read a file's content. Here are some more links (a short sketch follows them):
https://www.sitepoint.com/memory-performance-boosts-with-generators-and-nikiciter/
https://www.sitepoint.com/generators-in-php/
https://riptutorial.com/php/example/5441/reading-a-large-file-with-a-generator/
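A minimal sketch of the generator approach (the file path is a placeholder):
// Yields one parsed CSV row at a time, so the whole file never sits in memory.
function readCsvRows(string $path): Generator
{
    $fp = fopen($path, 'r');
    if ($fp === false) {
        return;
    }
    while (($row = fgetcsv($fp)) !== false) {
        yield $row;
    }
    fclose($fp);
}

foreach (readCsvRows('/path/to/big-file.csv') as $row) {
    // process or collect $row here
}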
If not using PHP for this operation is an option for you, try exporting this spreadsheet as CSV and importing the file using COPY. It won't take more than a few seconds.
If your database is installed locally you just need to execute COPY in a client of your choice, e.g. pgAdmin. Check this answer for more information.
COPY your_table FROM '/home/user/file.csv' DELIMITER ',' CSV HEADER;
Keep in mind that the user postgres in your system must have the necessary permissions to access the CSV file. Check how to do that in your operating system, e.g. chown in Linux.
In case your database is installed on a remote server, you have to use the STDIN facility of COPY via psql:
$ cat file.csv | psql your_db -c "COPY your_table FROM STDIN;"
The server I'm using does not allow me to use LOAD DATA INFILE or even save a file as a .txt file. I have a .csv file that is just one column, with a name in each row. I want to insert this data into the name column of a table named people with name as one column and location as the other… where the locations will be updated at a later time.
It's a weird way to do this, but I need to do it this way and not with those commands.
Does anybody have any ideas? This has been giving me a problem for many hours. I can't figure out how to load the csv into my table column without using those previously mentioned methods that I can't use on my server.
Based on your issue and lack of general permissions you will have to do the following:
Replace the DOS carriage returns with Unix newlines:
$contents=preg_replace('/(\r\n|\r|\n)/s',"\n",$contents);
Save the contents to the file, and then loop through each line, building an INSERT command that you execute against MySQL, for example as sketched below.
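A possible shape for that loop, reusing the people table from the question (the file path and credentials are placeholders):
$contents = file_get_contents('/path/to/names.csv');            // assumed location of the one-column CSV
$contents = preg_replace('/(\r\n|\r|\n)/s', "\n", $contents);   // normalise line endings

$pdo  = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8mb4', 'user', 'pass');
$stmt = $pdo->prepare('INSERT INTO people (name) VALUES (?)');

foreach (explode("\n", $contents) as $line) {
    $name = trim($line);
    if ($name !== '') {
        $stmt->execute([$name]);
    }
}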
So my situation is the following. I obtain data from a certain source through PHP every 24 hours. During that time the data gets changed: new data is added and some is updated. When I run a query to insert or update this data every 24 hours, it takes a long time to execute. This would not be a problem if the execution time were moderate, as I am planning on making this a cron job; however, the execution time is currently way too high.
So I was thinking about writing this data from PHP to a CSV file, when I obtain it. And then using this CSV file together with the MySQL LOAD DATA function which supposedly inserts a lot of data from a file in a very quick manner.
So the question is: is it possible to write a CSV file from PHP and have it formatted in a way that suits the LOAD DATA INFILE function? How can I delete that CSV every 24 hours and create a new one with the newly obtained data, and how would I go about properly using the LOAD DATA INFILE function with this particular CSV file? Oh, and can I make a cron job out of all of this? Thanks.
Assume you receive data from your source and prepare this array:
$x=array(1=>array('1','2','3','4'), 2=>array('5','6','7','8'));
Create a CSV file like this:
$file = fopen(<DIRECTORY_PATH>."file.csv", "w");
if (!$file) {
    // error
}
$csv_data = "";
foreach ($x as $row) {
    // join the elements with commas so there is no trailing comma
    $csv_data .= implode(",", $row)."\n";
}
fwrite($file, $csv_data);
fclose($file);
$query="load data infile '".<DIRECTORY_PATH>."file.csv"."' into table your_table";
if (!$mysqli->query($query))
printf("Error: %s\n", $mysqli->error);
As you can read in the LOAD DATA documentation, the statement is very flexible: you can specify the field and line delimiters and many other options, which allows you to produce your CSV in whatever format you want. LOAD DATA is indeed the fastest way to insert data into MySQL.
In PHP you can very simply write data to a file, as shown here (see also the fputcsv() sketch below).
After this you indeed need a cron job that will load the file into MySQL; this should be helpful.
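As an alternative to concatenating the fields by hand, PHP's fputcsv() handles the delimiters and quoting for you; a small sketch (the output path is a placeholder):
$x = array(1 => array('1', '2', '3', '4'), 2 => array('5', '6', '7', '8'));

$file = fopen('/tmp/file.csv', 'w');   // assumed path, readable by the MySQL server
foreach ($x as $row) {
    fputcsv($file, $row);              // writes "1,2,3,4\n", quoting fields when needed
}
fclose($file);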
I am trying to write a script that imports a CSV file, parses it, and then inserts it into a MySQL database. I came across this article, but it seems to be written for MS SQL. Does anyone know of a tutorial for doing this with MySQL, or better still a library or script that can do it?
Thanks :-)
http://blog.sqlauthority.com/2008/02/06/sql-server-import-csv-file-into-sql-server-using-bulk-insert-load-comma-delimited-file-into-sql-server/
Using the LOAD DATA INFILE SQL statement
Example:
LOAD DATA LOCAL INFILE '/importfile.csv'
INTO TABLE test_table
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(field1, field2, field3);
If you are looking for a script / library, I'd refer you to the link below. There you can find:
PHP script to import csv data into mysql
This is a simple script that will allow you to import CSV data into your database. It comes in handy because you can simply edit the appropriate fields, upload it along with the CSV file, call it from the web, and it will do the rest.
It allows you to specify the delimiter used in the CSV file, whether it is a comma, a tab, etc. It also allows you to choose the line separator and to save the output to a file (known as an SQL data dump).
It also permits you to include an empty field at the beginning of each row, which is usually an auto increment integer primary key.
This script is useful mainly if you don’t have phpmyadmin, or you don’t want the hassle of logging in and prefer a few clicks solution, or you simply are a command prompt guy.
Just make sure the table is already created before trying to dump the data.
Kindly post a comment if you find any bugs.
My task is to parse a large .txt file (circa 15,000 lines) into a MySQL database. The problem is that I'm working with a 30-second maximum execution time. I've tried using this:
$handle = @fopen('http://www.someothersiteyouknow.com/bigfile.txt', 'r');
if ($handle) {
    while (!feof($handle)) {
        $lines[] = fgets($handle, 4096);
    }
    fclose($handle);
}
I can then access the $lines array and parse the data however I need to, but it takes too long for the script to finish running. My feeling is that I should read the file in chunks, maybe 1,000 lines at a time, but I only understand how to read from the beginning of the .txt file. Could you share some ideas for doing this correctly? Just to clarify, I don't require specific code examples, just ideas for how to parse large .txt files using PHP.
This doesn't seem like the best idea, to be honest. What if multiple users access the same page at, or around, the same time? You'll have (number of users*large text file) being processed concurrently.
Suggest you bring the file local (save it locally if the file doesn't already exist), and work with the local file. This should help reduce your transaction time
This should help bring you into the 30s limit ... if the file doesn't take longer than 30s to download!
Consider putting a set_time_limit inside your loop.
Also, if this is a once-off thing, you could look at doing it with MySQL's LOAD DATA INFILE.
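A small sketch of the set_time_limit() idea, combined with processing the file line by line instead of collecting everything into $lines first; note that some hosts disable set_time_limit(), and the URL is taken from the question:
$handle = @fopen('http://www.someothersiteyouknow.com/bigfile.txt', 'r');
if ($handle) {
    while (($line = fgets($handle, 4096)) !== false) {
        set_time_limit(30);   // restart the 30-second timer for each line
        // parse $line and queue the row for insertion here
    }
    fclose($handle);
}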
If you can put your file on the server, then you may try the LOAD DATA INFILE query. It has plenty of options for parsing the input and works reasonably fast. Start by experimenting with a small portion of your file. If the server ends up inserting everything into a single row, then tune the LINES TERMINATED BY part by specifying '\n' or '\r\n'. Then double-check the number of rows against the number of lines in the file, and SELECT some of them to see what ended up in the table.