I have records in a .CSV file and I want to import them into a MySQL database.
Whenever I import the .CSV I get the message "Import has been successfully finished...", but only 79 out of 114 records are inserted into the database.
When I try to import the .CSV file with 411 records, just 282 are inserted. The CSV file with 411 records contains two categories of records, Active and Sold, of which 114 records are Active.
Has anyone run into this kind of problem? If so, what should be done?
I wrote my own CSV importer in PHP. I use the PHP function fgetcsv() to read the CSV file and then run a MySQL INSERT in a loop.
$mydata = array();
$delimiter = ($this->fieldDelimiter == 'TAB') ? chr(9) : $this->fieldDelimiter;

$handle = fopen($this->file, "r");
while (($data = fgetcsv($handle, 10000, $delimiter)) !== FALSE)
{
    $mydata[] = $data;
}
fclose($handle);

if ($this->CSVhasTitle)
{
    $mydata = array_slice($mydata, 1); // drop the header row
}
Then I loop through the array and run a MySQL INSERT for each row:
foreach ($mydata as $value)
{
    // INSERT INTO $table (...) VALUES (....)
}
But I add the exact column names into the array before the loop; I have an array of all column names.
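For reference, here is a minimal sketch of how such a loop could be written with a prepared statement; the $pdo connection, the $table name and the $columnNames array are assumptions for illustration, not the original code:

// Hedged sketch: $pdo is an existing PDO connection, $table is a trusted/whitelisted
// table name, and $columnNames lists the CSV columns in the same order as each row.
$columnList   = '`' . implode('`, `', $columnNames) . '`';
$placeholders = implode(', ', array_fill(0, count($columnNames), '?'));
$stmt = $pdo->prepare("INSERT INTO $table ($columnList) VALUES ($placeholders)");

foreach ($mydata as $value) {
    $stmt->execute($value); // $value is one CSV row, in the same column order
}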
I had this problem too. Even though the question is a bit old and the recommendations here go in one direction, here is the solution I found. I was building a large database and importing data when the same thing happened. After a lot of trial and error I realized that an extra key (index) had somehow been assigned to the table; I didn't notice it because I was using the new skin, whereas I always use the old one. I removed the key that had been assigned without my doing it, left only the primary key, and the data upload went through.
I also had issues with the table view and request timeouts, which relates to your question. I raised the number of rows displayed per page to several thousand and am now stuck: I can't access the config file anywhere in my control panel, the page hangs, and hosting support is too lazy to read my concerns and override it on their end, so I will have to remove the whole table instead of running any DROP, because even running SQL freezes under the load. So, food for thought: keep your table view (rows per page) low. That's a pain in my case, because I need to visually scan 17,000 records quickly to make sure my .csv was correct rather than rely on functions; if there are issues I can then spot and fix them in the control panel, which makes more sense to me anyway.
Take a look at your CSV file. It very likely contains something like
1,2,"some data",1
2,5,"data,with,comma",2
If you don't specify that columns are OPTIONALLY ENCLOSED BY '"' (that is: single quote, double quote, single quote), then the commas embedded in the string data in the second row, third column will not be imported properly.
Check the CSV to see what enclosure character is being used and specify that in the phpmyadmin interface.
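If you are parsing the file yourself in PHP (as in the fgetcsv importer above), the same idea applies: pass the enclosure character explicitly to fgetcsv(). A minimal sketch, assuming a comma delimiter and double-quote enclosure:

// Hedged sketch: comma-delimited, double-quote-enclosed CSV (assumed format).
$fh = fopen('data.csv', 'r');
while (($row = fgetcsv($fh, 0, ',', '"')) !== false) {
    // "data,with,comma" arrives as a single field, commas intact
    var_dump($row);
}
fclose($fh);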
Related
I'm parsing a 1,000,000-line CSV file in PHP to recover these data: IP address, DNS name, and the cipher suites used.
In order to know whether some DNS names (having several mail servers) use different cipher suites across their servers, I store in an array an object containing the DNS name, a list of the IP addresses of its servers, and a list of the cipher suites it uses. At the end I have an array of 1,000,000 elements. To count the DNS names that have different cipher suite configurations on their servers I do:
$res = 0;
foreach ($this->allDNS as $dnsObject) {
    if (count($dnsObject->getCiphers()) > 1) { // it has several different configurations
        $res++;
    }
}
return $res;
Problem: this consumes too much memory, so I can't run my code on the 1,000,000-line CSV (if I don't store these data in an array, parsing the CSV file takes about 20 seconds). Is there a way to get around this problem?
NB: I already put
ini_set('memory_limit', '-1');
but that line just bypasses the memory error; it doesn't solve the underlying problem.
Saving all of that CSV data in memory will definitely take its toll.
One logical solution to your problem is to use a database to store all of that data.
You may refer to this link for a tutorial on parsing a CSV file and storing it in a database.
Write the processed data (for each line separately) into one file (or a database):
file_put_contents('data.txt', $parsingresult, FILE_APPEND);
FILE_APPEND will append $parsingresult at the end of the file's contents.
Then you can access the processed data with file_get_contents() or file().
Anyway, I think using a database and some pre-processing would be the best solution if this is needed more often.
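A minimal sketch of that per-line approach; the input file name and the stand-in processing step are assumptions for illustration:

// Hedged sketch: process one CSV line at a time and append one result line per record.
$in = fopen('input.csv', 'r');
while (($row = fgetcsv($in)) !== false) {
    $parsingresult = implode(';', $row) . PHP_EOL; // stand-in for the real per-line processing
    file_put_contents('data.txt', $parsingresult, FILE_APPEND);
}
fclose($in);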
You can use fgetcsv() to read and parse the CSV file one line at a time. Keep the data you need and discard the line:
// Store the useful data here
$data = array();
// Open the CSV file
$fh = fopen('data.csv', 'r');
// The first line probably contains the column names
$header = fgetcsv($fh);
// Read and parse one data line at a time
while ($row = fgetcsv($fh)) {
// Get the desired columns from $row
// Use $header if the order or number of columns is not known in advance
// Store the gathered info into $data
}
// Close the CSV file
fclose($fh);
This way it uses the minimum amount of memory needed to parse the CSV file.
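Applied to the cipher-suite question above, the "keep only what you need" idea could look like the sketch below; the column positions are assumptions, not the original file's actual layout:

// Hedged sketch: assumes each CSV line is "ip,dns,cipher" in that order.
$ciphersByDns = array();

$fh = fopen('scan.csv', 'r');
while (($row = fgetcsv($fh)) !== false) {
    list($ip, $dns, $cipher) = $row;
    $ciphersByDns[$dns][$cipher] = true; // remember only the distinct cipher suites per DNS name
}
fclose($fh);

// Count the DNS names whose servers use more than one cipher suite configuration.
$res = 0;
foreach ($ciphersByDns as $ciphers) {
    if (count($ciphers) > 1) {
        $res++;
    }
}
echo $res;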
I generate JSON files which I load into datatables, and these JSON files can contain thousands of rows from my database. To generate them, I need to loop through every row in the database and add each database row as a new row in the JSON file. The problem I'm running into is this:
Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 262643 bytes)
What I'm doing is: I read the JSON file with file_get_contents($json_file), decode it into an array, add a new row to the array, then encode the array back into JSON and write it to the file with file_put_contents($json_file).
Is there a better way to do this? Is there a way I can prevent the memory from increasing with each loop iteration? Or is there a way I can clear the memory before it reaches the limit? I need the script to run to completion, but with this memory problem it barely gets to 5% completion before crashing.
I can keep rerunning the script, and each time I rerun it, it adds more rows to the JSON file. So if this memory problem is unavoidable, is there a way to automatically rerun the script numerous times until it's finished? For example, could I detect the memory usage, detect when it's about to reach the limit, then exit the script and restart it? I'm on WP Engine, so they won't allow security-risky functions like exec().
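For clarity, the pattern described above looks roughly like the following (a reconstruction for illustration, not the actual code; $databaseRows stands for the rows being exported). Decoding the entire, ever-growing file on every iteration is what exhausts the memory:

// Reconstruction of the described (problematic) approach, for illustration only.
foreach ($databaseRows as $row) {                                    // $databaseRows: rows being exported
    $data   = json_decode(file_get_contents($json_file), true);     // whole file decoded into memory
    $data[] = $row;                                                  // append one new row
    file_put_contents($json_file, json_encode($data));              // re-encode and rewrite everything
}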
So I switched to using CSV files and it solved the memory problem. The script runs vastly faster too. jQuery DataTables doesn't have built-in support for CSV files, so I wrote a function to convert the CSV file to JSON:
public function csv_to_json($post_type) {
    // DataTables-style response skeleton
    $data = array(
        "recordsTotal"    => $this->num_rows,
        "recordsFiltered" => $this->num_rows,
        "data"            => array()
    );

    if (($handle = fopen($this->csv_file, 'r')) === false) {
        die('Error opening file');
    }

    // The first line of the tab-delimited file holds the column names
    $headers = fgetcsv($handle, 1024, "\t");
    $complete = array();
    while ($row = fgetcsv($handle, 1024, "\t")) {
        $complete[] = array_combine($headers, $row); // key each row by column name
    }
    fclose($handle);

    $data['data'] = $complete;
    file_put_contents($this->json_file, json_encode($data, JSON_PRETTY_PRINT));
}
So the result is I create a CSV file and a JSON file much faster than creating a JSON file alone, and there are no issues with memory limits.
Personally as I said in the comments, I would use CSV files. They have several advantages.
you can read / write one line at a time so you only manage the memory for one line
you can just append new data into the file.
PHP has plenty of built-in support, using either fputcsv() or the SPL file objects.
you can load them directly into the database using "Load Data Infile" (see the sketch after the list of cons below)
http://dev.mysql.com/doc/refman/5.7/en/load-data.html
The only cons are
keep the same schema through the whole file
no nested data structures
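As a rough illustration of the "Load Data Infile" point above (a sketch only: the file path, table name, column names and connection details are made-up examples, and LOCAL INFILE must be enabled on both client and server):

// Hedged sketch: bulk-load a CSV into MySQL from PHP via PDO.
// The table `csv_rows` and its columns are hypothetical examples.
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass', array(
    PDO::MYSQL_ATTR_LOCAL_INFILE => true,
));

$sql = "LOAD DATA LOCAL INFILE '/path/to/data.csv'
        INTO TABLE csv_rows
        FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
        LINES TERMINATED BY '\\n'
        IGNORE 1 LINES
        (id, name, amount)";

$pdo->exec($sql); // one statement loads the whole file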
The issue with JSON is (as far as I know) that you have to keep the whole thing in memory as a single data set. Therefore you cannot stream it (line by line) like a normal text file. There is really no solution besides limiting the size of the JSON data, which may or may not even be easy to do. You can increase the memory limit some, but that is just a temporary fix if you expect the data to continue to grow.
We use CSV files in a production environment and I regularly deal with datasets that are 800k or 1M rows. I've even seen one that was 10M rows. We have a single table of 60M rows (MySQL) that is populated from CSV uploads. So it will work and be robust.
If you're set on JSON, then I would just come up with a fixed number of rows that works and design your code to only run that many rows at a time. It's impossible for me to guess how to do that without more details.
Using PHP, is it possible to load just a single record / row from a CSV file?
In other words, I would like to treat the file as an array, but don't want to load the entire file into memory.
I know this is really what a database is for, but I am just looking for a down and dirty solution to use during development.
Edit: To clarify, I know exactly which row contains the info I am looking for.
I would just like to know if there is a way to get it without having to read the entire file into memory.
As I understand it, you are looking for a row containing certain data. Therefore you could probably implement the following logic:
(1) scan the file for the given data (e.g. a value that is in the row you are trying to find),
(2) load only that line of the file,
(3) perform your operations on that line.
fgetcsv() operates on a file resource handle, so if you can obtain the byte position of the line, you can fseek() the resource to that position and use fgetcsv() normally.
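A minimal sketch of that, assuming $bytePosition is a byte offset you recorded earlier (for example with ftell()):

// Hedged sketch: $bytePosition is a previously recorded byte offset (e.g. from ftell()).
$fh = fopen('data.csv', 'r');
fseek($fh, $bytePosition); // jump straight to the start of the desired line
$row = fgetcsv($fh);       // parse just that one record
fclose($fh);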
If you don't know which line you are looking for until after you have read it, your best bet is reading records until you find the one you want by testing the array that is returned.
$fp = fopen('data.csv', 'r');
while (false !== ($data = fgetcsv($fp, 0, ','))) {
    // fgetcsv() returns a numerically indexed array; 0 here stands for the column you want to test
    if ($data[0] === 'somevalue') {
        echo 'Hurray';
        break;
    }
}
fclose($fp);
If you are looking to read a specific line, use the SplFileObject class and seek to the record number. current() will then return that line as a string, which you must convert to an array:
$file = new SplFileObject('data.csv');
$file->seek(2);                 // zero-based: this is the third line of the file
$record = $file->current();     // the raw line as a string
$data = explode(",", $record);  // simple split; use str_getcsv() if fields may contain quoted commas
I have a CSV file with 104 fields, but I only need 4 of them to use in a MySQL database. Each file has about a million rows.
Could somebody tell me an efficient way to do this? Reading each line into an array takes a long time.
Thanks
You have to read every line in its entirety by definition. This is necessary to find the delimiter for the next record (i.e. the newline character). You only need to discard the data you have read that you don't need. E.g.:
$data = array();
$fh = fopen('data.csv', 'r');
$headers = fgetcsv($fh);
while ($row = fgetcsv($fh)) {
$row = array_combine($headers, $row);
$data[] = array_intersect_key($row, array_flip(array('foo', 'bar', 'baz')));
// alternatively, if you know the column index, something like:
// $data[] = array($row[1], $row[45], $row[60]);
}
This only retains the columns foo, bar and baz and discards the rest. The reading from file (fgetcsv) is about as fast as it gets. If you need it any faster, you'll have to implement your own CSV tokenizer and parser which skips over the columns you don't need without even temporarily storing them in memory; how much of a performance boost this brings vs. development time necessary to implement this bug free is very debatable.
A simple Excel macro can drop all the unnecessary columns (100 out of 104)
within a second. I am looking for a similar solution.
That is because Excel, once a file is opened, has all data in memory and can act on it very quickly. For an accurate comparison you need to compare the time it takes to open the file in Excel + dropping of the columns, not just dropping the columns.
I am using PHP to expose vehicle GPS data from a CSV file. This data is captured at least every 30 seconds for over 70 vehicles and includes 19 columns of data. This produces several thousand rows of data and file sizes around 614 kB. New data is appended to the end of the file. I need to pull out the last row of data for each vehicle, which should represent the most recent status. I am able to pull out one row for each unit; however, since the CSV file is in chronological order, I am pulling out the oldest data in the file instead of the newest. Is it possible to read the CSV from the end to the beginning? I have seen some solutions, however they typically involve loading the entire file into memory and then reversing it, which sounds very inefficient. Do I have any other options? Thank you for any advice you can offer.
EDIT: I am using this data to map real-time locations on-the-fly. The data is only provided to me in CSV format, so I think importing into a DB is out of the question.
With fseek() you can set the file pointer relative to the end of the file and use a negative offset to read the file backwards.
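A minimal sketch of that idea; the chunk size and the assumption that column 0 identifies the vehicle are guesses, not part of the original data format:

// Hedged sketch: read only the tail of the file and keep the last row per vehicle.
$chunkSize = 65536;                          // assumed large enough to cover the newest rows
$fh = fopen('gps.csv', 'r');
if (fseek($fh, -$chunkSize, SEEK_END) === -1) {
    rewind($fh);                             // file smaller than the chunk: read it all
} else {
    fgets($fh);                              // discard the (probably partial) first line
}

$latestByVehicle = array();
while (($line = fgets($fh)) !== false) {
    $row = str_getcsv($line);
    $latestByVehicle[$row[0]] = $row;        // assumed: column 0 is the vehicle ID; later rows overwrite earlier ones
}
fclose($fh);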
If you must use csv files instead of a database, then perhaps you could read the file line-by-line. This will prevent more than the last line being stored in memory (thanks to the garbage collector).
$handle = #fopen("/path/to/yourfile.csv", "r");
if ($handle) {
while (($line = fgets($handle)) !== false) {
// old values of $last are garbage collected after re-assignment
$last = $line;
// you can perform optional computations on past data here if desired
}
if (!feof($handle)) {
echo "Error: unexpected fgets() fail\n";
}
fclose($handle);
// $last will now contain the last line of the file.
// You may now do whatever with it
}
edit: I did not see the fseek() post. If all you need is the last line, then that is the way to go.