I have a table with over a million rows. The PHP export script (headers plus AJAX) that I normally use to give users a CSV export through the user interface cannot handle that many rows and times out, so I am looking for an alternative.
After a few days of digging I collated the script below from the internet. It downloads wonderfully, but only into the MySQL folder on the local server.
What I am looking for is a PHP/MySQL script that lets users download large tables as CSVs through the PHP user interface itself:
SELECT 'a', 'b', 'c'
UNION ALL
SELECT a, b, c
INTO OUTFILE '2026.csv'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM table;
You need to fetch from the DB in paginated batches.
Run SELECT * FROM a LIMIT 0, 100; and put the result of this query into a variable, then write it out as described here:
Export to CSV via PHP
After you have written the first 100 rows to the CSV, fetch rows 100 to 200, reopen the CSV, append them, and so on.
Finally, send the CSV to the user.
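A minimal sketch of that batching loop, assuming a PDO connection, an id column to order by, and the placeholder table name a; adjust the query, batch size, and headers to your own schema:

<?php
// Sketch only: stream a large table to the browser as CSV in paginated batches.
// The PDO credentials, the table name `a`, and the batch size are assumptions.
header('Content-Type: text/csv');
header('Content-Disposition: attachment; filename="export.csv"');

$pdo    = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8mb4', 'user', 'pass');
$out    = fopen('php://output', 'w');   // write straight into the response, no temp file
$batch  = 1000;
$offset = 0;

do {
    $rows = $pdo->query(sprintf('SELECT * FROM a ORDER BY id LIMIT %d, %d', $offset, $batch))
                ->fetchAll(PDO::FETCH_ASSOC);

    if ($offset === 0 && $rows) {
        fputcsv($out, array_keys($rows[0]));   // header row taken from the column names
    }
    foreach ($rows as $row) {
        fputcsv($out, $row);
    }
    $offset += $batch;
} while (count($rows) === $batch);

fclose($out);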
I have a rather large product feed which I split up into multiple CSV files of 20k lines each:
feed-0.csv
feed-1.csv
etc
I load each of these CSVs into a temp table and export three new CSV files, which I later on load into separate tables:
products.csv
attributes.csv
prices.csv
Each of the files above of course also contains 20k lines, just like the (split) source. So far so good, all going well.
Another part of the script loads the above three CSV files into their respective tables: db.products, db.attributes and db.prices. When I select one file (be it feed-0.csv or feed-9.csv, any split file will do), the database is updated with 20k rows in each respective table. Still no problem there.
Now I create a loop that goes through the split CSV files and adds 20k rows to each table on every iteration.
This works well until I hit the third iteration; then I get mismatching numbers in the database, e.g.
db.products - 57569 rows
db.attributes - 58661 rows
db.prices - 52254 rows
So while after the previous iteration all three tables held 40k rows, I now have mismatching numbers.
I have checked products.csv, attributes.csv and prices.csv on each iteration, and each of them has the 20k lines it should have.
I have tried with random split feeds, e.g. feed-1.csv, feed-5.csv, feed-7.csv or feed-1.csv, feed-8.csv and feed-3.csv. So I changed the files and I changed the order, but every time, on the third and later iterations, I get this problem.
I have also tried importing split files from different feeds, but each time the third iteration produces incorrect numbers. The source files should be fine: when I run just one or two files in any sequence the results are good.
I suspect I am hitting some limitation somewhere. I thought it might be an InnoDB buffer issue, so I restarted the server, but the issue remains (the InnoDB buffer is at around 25% after the third iteration).
I am using MariaDB 10.1.38, PHP 7.3.3, and an InnoDB buffer pool size of 500 MB.
Any pointers on where to look for a solution would be welcome!
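For illustration, the import loop is roughly of this shape (a simplified sketch: file paths, column handling and the export step are placeholders, not the actual script):

<?php
// Simplified sketch of the loop described above; names and paths are placeholders.
$mysqli = new mysqli('localhost', 'user', 'pass', 'db');

foreach (glob('feeds/feed-*.csv') as $feedFile) {
    // 1) load the split feed into the temp table
    $mysqli->query('TRUNCATE temp_feed');
    $mysqli->query("LOAD DATA LOCAL INFILE '" . $mysqli->real_escape_string($feedFile) . "'
                    INTO TABLE temp_feed
                    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
                    LINES TERMINATED BY '\\n'");

    // 2) export products.csv, attributes.csv and prices.csv from temp_feed (omitted here)

    // 3) load the three exports into their target tables
    foreach (['products', 'attributes', 'prices'] as $table) {
        $mysqli->query("LOAD DATA LOCAL INFILE '$table.csv'
                        INTO TABLE $table
                        FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
                        LINES TERMINATED BY '\\n'");
        echo $table . ': ' . $mysqli->affected_rows . " rows loaded\n";   // per-iteration check
    }
}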
My issue is this:
I have a table in my database with more than 1 million rows. Sometimes I need an SQL dump of this table on my PC, and exporting the whole table takes very long.
Actually I only need the exported table with, for example, the last 5000 rows.
So is there a way to export a MySQL table by selecting only the last X rows?
I know some ways to do it with terminal commands, but I need a plain MySQL query if that is possible.
Thanks
If I understand correctly, you could try the INTO OUTFILE functionality provided by MySQL. Of course I don't know your current query, but you can easily swap your own structure into mine:
SELECT *
INTO OUTFILE '/tmp/your_dump_table.csv'
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM table_name
ORDER BY id DESC
LIMIT 5000;
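Keep in mind that the file is written by the MySQL server process on the server host, so the target directory must be writable by the server and allowed by the secure_file_priv setting; you can check the permitted directory with:

SHOW VARIABLES LIKE 'secure_file_priv';
-- empty string: any directory is allowed; NULL: INTO OUTFILE is disabled;
-- otherwise only the listed directory may be used.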
Since the user #Hayk Abrahamyan has expressed a preference for exporting the dump as a .sql file, let's analyze a valid alternative:
We can run the query from phpMyAdmin or (certainly a better solution) from the MySQL Workbench SQL editor console and save the result by pressing the export button (see the picture below):
As the .sql output file you will get something like the structure below:
/*
-- Query: SELECT * FROM mytodo.todos
ORDER BY id DESC
LIMIT 5000
-- Date: 2018-01-07 13:15
*/
INSERT INTO `todos` (`id`,`description`,`completed`) VALUES (3,'Eat something',1);
INSERT INTO `todos` (`id`,`description`,`completed`) VALUES (2,'Buy something',1);
INSERT INTO `todos` (`id`,`description`,`completed`) VALUES (1,'Go to the store',0);
I have a very large database table (more than 700k records) that I need to export to a .csv file. Before exporting it, I need to check some options (provided by the user via the GUI) and filter the records. Unfortunately this filtering cannot be achieved with SQL alone (for example, a column contains serialized data, so I need to unserialize it and then check whether the record "passes" the filtering rules).
Doing all the records at once leads to memory-limit issues, so I decided to break the process into chunks of 50k records. So instead of loading 700k records at once, I load 50k, apply the filters, append to the .csv file, then load the next 50k and so on (until the 700k records are done). This way I avoid the memory issue, but it takes around 3 minutes (and this time will increase if the number of records grows).
Is there any other way of doing this process (better in terms of time) without changing the database structure?
Thanks in advance!
The best thing one can do is get PHP out of the mix as much as possible. That is always the case for loading CSV data, or exporting it.
In the example below, I have a 26 million row student table and I will export 200K rows of it. Granted, the column count in the student table is small; I use it mostly for testing other things I do with campus info for students. But you will get the idea, I hope. The issue will be how long it takes for your:
... and then check if the record "passes" the filtering rules.
which in theory could happen inside the db engine without PHP. "Without PHP" should be the mantra, but whether that is possible here is yet to be determined. The point is: get PHP processing out of the equation. PHP is many things; an adequate partner in DB processing it is not.
select count(*) from students;
-- 26.2 million
select * from students limit 1;
+----+-------+-------+
| id | thing | camId |
+----+-------+-------+
| 1 | 1 | 14 |
+----+-------+-------+
drop table if exists xOnesToExport;
create table xOnesToExport
( id int not null
);
insert xOnesToExport (id) select id from students where id>1000000 limit 200000;
-- 200K rows, 5.1 seconds
alter table xOnesToExport ADD PRIMARY KEY(id);
-- 4.2 seconds
SELECT s.id,s.thing,s.camId INTO OUTFILE 'outStudents_20160720_0100.txt'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
FROM students s
join xOnesToExport x
on x.id=s.id;
-- 1.1 seconds
The above file, timestamped 1 AM, with 200K rows was exported as a CSV via the join. It took 1 second.
LOAD DATA INFILE and SELECT INTO OUTFILE are companion statements that, for one thing, cannot be beaten for speed short of raw table moves. Secondly, people rarely seem to use the latter. They are flexible too, once you look into all they can do with various use cases and tricks.
For Linux, use LINES TERMINATED BY '\n'. I am on a Windows machine at the moment with the code blocks above; the only differences tend to be the paths to the file and the line terminator.
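For completeness, the companion import of such an export might look like the sketch below (the target table is an assumption, not part of the answer above):

-- Sketch: re-importing the exported CSV into a table with matching columns.
LOAD DATA INFILE 'outStudents_20160720_0100.txt'
INTO TABLE students_copy            -- assumed target table with columns (id, thing, camId)
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'          -- use '\n' for files produced on Linux
(id, thing, camId);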
Unless you tell it to do otherwise, PHP slurps your entire result set into RAM at once. This is called a buffered query. It doesn't work when your result set contains more than a few hundred rows, as you have discovered.
PHP's designers made it use buffered queries to make life simpler for web-site developers who need to read a few rows of data and display them.
You need an unbuffered query to do what you're doing. Your PHP program will then read and process one row at a time. But be careful to make your program read all the rows of that unbuffered result set; you can really foul things up if you leave a partial result set dangling in limbo between MySQL and your PHP program.
You didn't say whether you're using mysqli or PDO. Both of them offer mode settings to make your queries unbuffered; a sketch of both follows below. If you're using the old-school mysql_ interface, you're probably out of luck.
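For reference, a minimal sketch of both options (connection details and the query are placeholders):

<?php
// mysqli: request an unbuffered result with MYSQLI_USE_RESULT.
$mysqli = new mysqli('localhost', 'user', 'pass', 'db');
$result = $mysqli->query('SELECT * FROM big_table', MYSQLI_USE_RESULT);
while ($row = $result->fetch_assoc()) {
    // process one row at a time
}
$result->close();   // read/free the whole result before issuing another query

// PDO: turn off buffering on the MySQL driver.
$pdo = new PDO('mysql:host=localhost;dbname=db', 'user', 'pass');
$pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
$stmt = $pdo->query('SELECT * FROM big_table');
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    // process one row at a time
}
$stmt->closeCursor();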
I am trying to understand how to write a query to import my goodreads.com book list (csv) into my own database. I'm using PHP.
$query = "load data local infile 'uploads/filename.csv' into table books
RETURN 1 LINES
FIELDS TERMINATED BY ','
ENCLOSED BY '\"'
LINES TERMINATED BY '\n'
(Title, Author, ISBN)";
This results in data in the wrong columns. I also can't get the ISBN no matter what I try; I'm guessing the equals sign (=) is causing some problems.
Here are the first two lines of the CSV:
Book Id,Title,Author,Author l-f,Additional Authors,ISBN,ISBN13,My Rating,Average Rating,Publisher,Binding,Number of Pages,Year Published,Original Publication Year,Date Read,Date Added,Bookshelves,Bookshelves with positions,Exclusive Shelf,My Review,Spoiler,Private Notes,Read Count,Recommended For,Recommended By,Owned Copies,Original Purchase Date,Original Purchase Location,Condition,Condition Description,BCID
11487807,"Throne of the Crescent Moon (The Crescent Moon Kingdoms, #1)","Saladin Ahmed","Ahmed, Saladin","",="0756407117",="9780756407117",0,"3.87","DAW","Hardcover","288",2012,2012,,2012/02/19,"to-read","to-read (#51)","to-read",,"",,,,,0,,,,,
What I want to do is import only the Title, Author, and ISBN for all books into a table that has the following fields: id (auto), title, author, isbn. I also want to exclude the first row because I don't want the column headers, but whenever I have tried that it fails.
What's wrong with my query?
Unfortunately, if you want to use the native MySQL CSV parser, you will have to create a table that matches the column count of the CSV file. Simply drop the unnecessary columns once the import is finished.
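A rough sketch of that staging approach for the Goodreads export above (the staging table, its generic column names and the final copy are illustrative assumptions):

-- Staging table with one generic column per CSV field (the export has 31 columns).
CREATE TABLE books_staging (
  c1 TEXT, c2 TEXT, c3 TEXT, c4 TEXT, c5 TEXT, c6 TEXT, c7 TEXT, c8 TEXT,
  c9 TEXT, c10 TEXT, c11 TEXT, c12 TEXT, c13 TEXT, c14 TEXT, c15 TEXT, c16 TEXT,
  c17 TEXT, c18 TEXT, c19 TEXT, c20 TEXT, c21 TEXT, c22 TEXT, c23 TEXT, c24 TEXT,
  c25 TEXT, c26 TEXT, c27 TEXT, c28 TEXT, c29 TEXT, c30 TEXT, c31 TEXT
);

LOAD DATA LOCAL INFILE 'uploads/filename.csv'
INTO TABLE books_staging
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;                      -- skip the header row

-- c2 = Title, c3 = Author, c6 = ISBN in the export above; strip the ="..." wrapper.
INSERT INTO books (title, author, isbn)
SELECT c2, c3, REPLACE(REPLACE(c6, '="', ''), '"', '')
FROM books_staging;

DROP TABLE books_staging;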
I have a CSV file which is generated weekly and loaded into a MySQL database. I need to make a report which will include various statistics on the imported records. The first such statistic is how many records were imported.
I use PHP to interface with the database, and will be using PHP to generate a page showing those statistics.
However, the CSV files are imported via a MySQL script, quite separate from any PHP.
Is it possible to count the records that were imported and store the number in a different field/table, or in some other way?
Adding an additional time field to work out which rows were added after a certain time is not possible, as the structure of the database cannot be changed.
Is there a query I can use while importing from the MySQL script, or a better way to generate/count the number of imported records from within PHP?
You can get the number of records in a table using the following query.
SELECT COUNT(*) FROM tablename
So what you can do is count the number of records before and after the import and then take the difference, like so:
$before_count = mysql_fetch_assoc(mysql_query("SELECT COUNT(*) AS c FROM tablename"));
// Run mysql script
$after_count = mysql_fetch_assoc(mysql_query("SELECT COUNT(*) AS c FROM tablename"));
$records_imported = $after_count['c'] - $before_count['c'];
You could do all of this from the MySQL script if you like, but I think doing it in PHP turns out to be a bit cleaner.
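Note that the mysql_* functions above were removed in PHP 7; on a current PHP version the same idea might look like this (connection details assumed):

<?php
// Same before/after counting with mysqli (the mysql_* extension is gone in PHP 7+).
$mysqli = new mysqli('localhost', 'user', 'pass', 'db');

$before_count = $mysqli->query('SELECT COUNT(*) AS c FROM tablename')->fetch_assoc();
// Run mysql script
$after_count  = $mysqli->query('SELECT COUNT(*) AS c FROM tablename')->fetch_assoc();

$records_imported = $after_count['c'] - $before_count['c'];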
A bit of a barrel-scraper, but depending on permissions you could edit the cron-executed MySQL script to output some pre-update stats into a file using INTO OUTFILE, and then parse the resulting file in PHP. You'd then have the 'before' stats and could execute the stats queries via PHP to obtain the 'after' stats.
However, as with many of these approaches, it will be next to impossible to detect updates to existing rows this way. (New rows, though, should be trivial to detect.)
Not really sure what you're after, but here's a bit more detail:
1. Get MySQL to export the relevant stats to a known directory using SELECT ... INTO OUTFILE ... This directory needs to be readable/writable by the MySQL user/group and by your web server's user/group (or whatever user/group you're running PHP as) if you're going to automate the CLI via cron on a weekly basis. The file should be in CSV format and datestamped as "stats_export_YYYYMMDD.csv".
2. Get PHP to scan the export directory for files beginning with "stats_export_", perhaps using the "scandir" function with a simple substr test (see the sketch after this list). You can then add each matching filename to an array. Once you've run out of files, sort the array to ensure it's in date order.
3. Read the stats data from each of the files listed in the array in turn using fgetcsv. It would be wise to place this data into a clean array which also contains the relevant datestamp as extracted from the filename.
4. At this point you'll have a summary of the stats at the end of each day in an array. You can then execute the relevant stats SQL queries again (if required) directly from PHP and add the results to the data array.
5. Compare/contrast and output as required.
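A rough sketch of steps 2 and 3 (the directory name, filename pattern and CSV layout are assumptions):

<?php
// Sketch: collect stats_export_YYYYMMDD.csv files and read them in date order.
$exportDir = '/var/lib/mysql-exports';             // assumed export directory
$files = [];

foreach (scandir($exportDir) as $file) {
    if (substr($file, 0, 13) === 'stats_export_') {
        $files[] = $file;
    }
}
sort($files);                                      // YYYYMMDD datestamps sort correctly as strings

$stats = [];
foreach ($files as $file) {
    $date = substr($file, 13, 8);                  // datestamp extracted from the filename
    $fh = fopen($exportDir . '/' . $file, 'r');
    while (($row = fgetcsv($fh)) !== false) {
        $stats[$date][] = $row;
    }
    fclose($fh);
}
// $stats now holds one entry per export day, ready to compare against fresh queries.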
Load the files using PHP and 'LOAD DATA INFILE ... INTO TABLE ...', then get the number of imported rows using mysqli_affected_rows() (or mysql_affected_rows()).
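A minimal sketch of that approach (connection details, file path and table name are assumptions):

<?php
// Sketch: run the import from PHP so the imported row count is available right away.
$mysqli = new mysqli('localhost', 'user', 'pass', 'db');

$mysqli->query("LOAD DATA INFILE '/path/to/weekly.csv'
                INTO TABLE weekly_data
                FIELDS TERMINATED BY ','
                LINES TERMINATED BY '\\n'
                IGNORE 1 LINES");              // drop IGNORE 1 LINES if the CSV has no header row

echo 'Imported rows: ' . $mysqli->affected_rows;   // rows inserted by LOAD DATA INFILE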