PHP: How to handle/parse csv files that have missing columns

I have many csv files generated by a third party, for which I have no say or control.
So each day I must import this csv data into mysql.
Some files have a column count in every row that matches the header.
Others do not.
Even when I used a prepared statement, it still did not import.
I tried to create a repair csv function that adds extra columns to each row whose column count is less than the header's column count.
As part of this project I am using the composer package league/csv.
https://csv.thephpleague.com/
But here is my function code:
public function repaircsv(string $filepath) {
// make sure incoming file exists
if (!file_exists($filepath)) {
// return nothing
return;
}
// setup variables
$tempfile = pathinfo($filepath,PATHINFO_DIRNAME).'temp.csv';
$counter = 0;
$colcount = 0;
$myline = '';
// check if temp file exists if it does delete it
if (file_exists($tempfile)) {
// delete the temp file
unlink($tempfile);
}
// C:\Users\admin\vendor\league\csv
require('C:\Users\admin\vendor\league\csv\autoload.php');
// step one get header column count
$csv = Reader::createFromPath($filepath);
// set the header offset
$csv->setHeaderOffset(0);
//returns the CSV header record
$header = $csv->getHeader();
// get the header column count
$header_count = count($header);
// check if greater than zero and not null
if ($header_count < 1 || empty($header_count)) {
// return nothing
return $header_count;
}
// loop thru csv file
// now read file line by line skipping line 1
$file = fopen($filepath, 'r');
$temp = fopen($tempfile, 'w');
// loop thru each line
while (($line = fgetcsv($file)) !== FALSE) {
// if first row just straight append
if ($counter = 0) {
// append line to temp file
fputcsv($temp, $line);
}
// if all other rows compare column count to header column count
if ($counter > 0) {
// get column count for normal rows
$colcount = count($line);
// compare to header column count
$coldif = $header_count - $colcount;
// loop til difference is zero
while ($colcount != $header_count) {
// add to line extra comma
$line .= ',';
// get new column count
$colcount = count($line);
}
// append to temp file
fputcsv($temp, $line);
// show each line
$myline .= 'Line: ['.$line.']<br/><br/>';
}
// increment counter
$counter++;
}
// check file size of temp file
$fs = filesize($tempfile);
// if below 200 ignore and do not copy
if ($fs > 200) {
// copy temp to original filename
copy($tempfile,$filepath);
}
return $myline;
}
The logic is to copy the original csv file to a new temp csv file and add extra commas to rows of data that have missing columns.
Thank you for any help.
Edit: The various csv files contain private data, so I cannot share them.
But let us say, for example, I download multiple csvs for different data daily.
Each csv has a header row and data.
If the number of columns in any row isn't exactly the same as the number of columns in the header, the import errors out.
If there are any special characters, it errors out.
There are thousands of rows of data.
The code above is my first attempt to fix rows that have missing columns.
Here is an example:
FirstName, LastName, Email
Steve,Jobs
,Johnson,sj#johns.com
Just a very small example.
I have no control over how the csvs are created; I do control the download process and the import process.
I then use the csv data to update mysql tables.
I have tried LOAD DATA INFILE, but that errors out too.
So I need to fix the csv files after they are downloaded.
Any ideas?

Do not mix array and string. fgetcsv() returns an array, so instead of
$line .= ',';
do
$line[] = '';
Also fix:
$myline .= 'Line: ['.implode(',', $line).']<br/><br/>';
Suggestion: you can replace your inner while loop with:
$line = array_pad($line, $header_count, ''); // append missing items
$line = array_slice($line, 0, $header_count); // remove eventual excess items
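Putting those two calls together, a minimal sketch of the whole repair step could look like the code below. The function names normalizeRow and repairCsvStream are made up for this example and are not part of league/csv. It assumes the first row is the header and that the files are plain comma-separated with double-quote enclosures.

```php
<?php
// Hypothetical helper: force a parsed CSV row to exactly $width fields.
function normalizeRow(array $row, int $width): array
{
    $row = array_pad($row, $width, '');   // append missing items
    return array_slice($row, 0, $width);  // drop any excess items
}

// Hypothetical helper: copy CSV rows from stream $in to stream $out,
// padding or trimming every row to the width of the first (header) row.
// Returns the number of rows written.
function repairCsvStream($in, $out): int
{
    $width = null;
    $written = 0;
    // Separator, enclosure and escape are spelled out explicitly.
    while (($row = fgetcsv($in, 0, ',', '"', '\\')) !== false) {
        if ($row === [null]) {
            continue; // fgetcsv reports a blank line as [null]
        }
        if ($width === null) {
            $width = count($row); // the header defines the expected width
        }
        fputcsv($out, normalizeRow($row, $width), ',', '"', '\\');
        $written++;
    }
    return $written;
}
```

To use it on real files, open the source with fopen($filepath, 'r') and a temp file with fopen($tempfile, 'w'), call repairCsvStream, close both, and then replace the original. Separately, note that `if ($counter = 0)` in the question assigns instead of comparing, so `$counter` is reset to 0 on every pass and neither branch ever writes a row; it needs `==` or `===`.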

Related

Importing CSV to mysql table and add date to each row using PHP execution time

I wrote PHP code that does all the work of finding CSV files in a given directory and importing each CSV file into the right table. The problem is that I have a CSV file that contains 1M rows! Yes, 1M rows :/ So it takes more than 15 minutes to import it. This is the issue. How can I improve the execution time?
$csv = new SplFileObject($file, 'r');
$csv->setFlags(SplFileObject::READ_CSV);
// get columns name
$tableColumns = $db->getColumns('daily_transaction');
print_r($tableColumns);
// get each line from the csv file, skipping the first one
foreach(new LimitIterator($csv, 1) as $line){
$i = 0;
$data = array();
foreach ($tableColumns as $Columns) {
// print($line[$i]."<br>");
$data[$Columns] = $line[$i];
$i++;
}
$data['file_name'] = $infoNAME;
$data['file_date'] = $file_date;
$insert_id = $db->arrayToInsert('daily_transaction', $data);
}
}
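One common thing to try when row-by-row inserts are slow is to send many rows per statement. The sketch below is only an illustration of that idea, not a tested fix for this script: buildBatchInsert and flattenBatch are hypothetical helpers, and it assumes the table and column names are trusted identifiers and that the database is reached through something that accepts positional placeholders (PDO, for example).

```php
<?php
// Hypothetical helper: build one multi-row INSERT with "?" placeholders.
// $table and $columns are assumed to be trusted identifiers, not user input.
function buildBatchInsert(string $table, array $columns, int $rowCount): string
{
    $oneRow = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';
    return sprintf(
        'INSERT INTO %s (%s) VALUES %s',
        $table,
        implode(', ', $columns),
        implode(', ', array_fill(0, $rowCount, $oneRow))
    );
}

// Hypothetical helper: flatten a batch of rows into a single parameter list
// in the same order as the placeholders produced above.
function flattenBatch(array $rows): array
{
    if ($rows === []) {
        return [];
    }
    return array_merge(...array_map('array_values', array_values($rows)));
}
```

With PDO, a batch of a few hundred rows collected inside the foreach could then be sent with `$pdo->prepare(buildBatchInsert(...))->execute(flattenBatch($batch))`, ideally inside one transaction per file. How much this helps depends on the table, indexes and server settings, so it is worth measuring.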

Php Upload CSV and Get Column data

What I am trying to do is upload a CSV file with PHP. The first line holds the column names and below that is the data (of course). The column names can change depending on what the end user uploads, so the main columns we need can change position (A1 or B1 etc...). So let's say the column I need is B1 and I need to get all the data in B. I'm not sure how to go about it. So far this is what I have. Any ideas?
ini_set("allow_url_fopen", 1);
$handle = fopen($_FILES['fileToUpload']['tmp_name'], 'r') or die ('cannot open the file');
while(!feof($handle)) {
$data[] = fgetcsv($handle);
}
var_dump($data);
fclose($handle);
UPDATE:
I am importing this file from .CSV to PHP
I need to search for column header that starts with “SKU” and then “COST”
From there once those are found then I want the whole column… B, E. But those column letters can change, depends on how it is being exported by the end user. I do not need the rows, just columns.
Once the file is uploaded to the server, use something like the following code to parse it and actually use each row as an array:
Code:
$filename = "upload/sample.csv";
if (($handle = fopen($filename, 'r')) !== FALSE){
// 1000 is the maximum line length fgetcsv will read
while (($row = fgetcsv($handle, 1000, ",")) !== FALSE){
print_r($row);
}
fclose($handle);
}
That's one way of doing it.
If you want the value of a specific column for each row then you need to loop through the results and pick it out. It looks like you are getting an array of arrays so...(EDITED to get the column based on the header name):
$header = $data[0];
unset($data[0]); // delete the header row so those values don't show in results
$sku_index = '';
$cost_index = '';
// get the index of the desired columns by name
for($i=0; $i < count($header); $i++) {
if($header[$i] == 'SKU') {
$sku_index = $i;
}
if($header[$i] == 'COST') {
$cost_index = $i;
}
}
// loop through each row and grab the values for the desired columns
foreach($data as $row) {
echo $row[$sku_index];
echo $row[$cost_index];
}
Should get what you want.
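Since the update asks for headers that start with "SKU" and "COST" rather than match exactly, a prefix lookup may fit better than `==`. The following is a rough sketch under that assumption; findColumnIndex and extractColumn are names invented for this example, and the matching rule (case-insensitive, leading whitespace ignored) is a guess at what the uploads need.

```php
<?php
// Hypothetical helper: index of the first header cell that starts with
// $prefix (case-insensitive, surrounding whitespace ignored), or null.
function findColumnIndex(array $header, string $prefix): ?int
{
    foreach ($header as $index => $name) {
        if (stripos(trim((string) $name), $prefix) === 0) {
            return $index;
        }
    }
    return null;
}

// Hypothetical helper: collect one column from already-parsed data rows.
function extractColumn(array $rows, int $index): array
{
    $values = [];
    foreach ($rows as $row) {
        $values[] = $row[$index] ?? null; // short or invalid rows yield null
    }
    return $values;
}
```

With the `$data` array from the question, `findColumnIndex($data[0], 'SKU')` would give the column position, and `extractColumn(array_slice($data, 1), $position)` the values below the header. Checking for a null result first avoids reading a column that is not there.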

Read specific column in CSV to Array

I am trying to read certain data from my csv file and transfer it to an array. What I want is to get all the data of a certain column, but starting at a certain row (let's say, for example, row 5). Is there a possible way to do it? What I have now gets all the data in a specific column; I want it to start at row 5 but can't think of any way to do it. Hope you guys can help me out. Thanks!
<?php
//this is column C
$col = 2;
// open file
$file = fopen("example.csv","r");
while(! feof($file))
{
echo fgetcsv($file)[$col];
}
// close connection
fclose($file);
?>
Yes, you can define a counter to track the row number. Have a look at the solution below. It will start printing from the 5th row, and you can access a column by its index. E.g. for the second column you can use $row[1].
$start_row = 5; //define start row
$i = 1; //define row count flag
$file = fopen("myfile.csv", "r");
while (($row = fgetcsv($file)) !== FALSE) {
if($i >= $start_row) {
print_r($row);
//do your stuff
}
$i++;
}
// close file
fclose($file);
You have no guarantee that your file exists or that you can read it or ....
From the PHP Manual: "Similar to fgets() except that fgetcsv() parses the line it reads for fields in CSV format and returns an array containing the fields read."
//this is column C
$col = 2;
// open file
$file = fopen("example.csv","r");
if (!$file) {
// log your error ....
}
else {
while( ($row = fgetcsv($file)) !== FALSE){
if (!isset($row[$col])) {
// field doesn't exist ...
}
else {
print_r ($row[$col]);
}
}
// close file (only if it was opened successfully)
fclose($file);
}
Depending on the quality and volume of your incoming data, you may wish to use iterated conditions to build your output array or you may prefer to dump all of the csv data into a master array and then filter it to the desired structure.
To clarify the numeracy in my snippets, the 5th row of data will be located at index [4]. The same indexing is used for column targeting -- the 4th column is at index [3].
A functional approach (assumes no newlines in values and is not set up with any extra csv parsing flags):
$starting_index = 4;
$target_column = 3;
var_export(
array_column(
array_slice(
array_map(
'str_getcsv',
file('example.csv')
),
$starting_index
),
$target_column
)
);
A language construct approach with leading row exclusions based on a decrementing counter:
$disregard_rows = 4;
$target_column = 3;
$column_data = []; // stays an empty array if no rows qualify
$file = fopen("example.csv", "r");
while (($row = fgetcsv($file)) !== false) {
if ($disregard_rows) {
--$disregard_rows;
} else {
$column_data[] = $row[$target_column];
}
}
fclose($file);
var_export($column_data);

PHP mysql INSERT causing blank first row

I can't seem to figure this out but my code is inserting 1 blank row (1st row). The blank row has blank car name, blank car brand, and only has "0.00" in car price. This code is for uploading a csv file and getting the data from that csv file and inserting to database. The first row is the column headers and I was assuming that the first call of $GetHeaders = fgetcsv($file); would have been for the headers.
$file = fopen($_FILES['fileupload']['tmp_name'],"r");
$GetHeaders = fgetcsv($file);
$CarName = array_search('Car Name', $GetHeaders);
$CarBrand = array_search('Car Brand', $GetHeaders);
$CarPrice = array_search('Car Price', $GetHeaders);
$theQue = "";
while(! feof($file))
{
$GetHeaders = fgetcsv($file);
$theQue .= "INSERT INTO cardb (CarName, CarBrand, Carprice) VALUES ('$GetHeaders[$CarName]', '$GetHeaders[$CarBrand]', '$GetHeaders[$CarPrice]')";
}
fclose($file);
if (mysqli_multi_query($connection, $theQue))
{
echo "Success";
}
Posting this finding to close the question, so others who encounter this issue with fgetcsv will get the hint.
This is weird, but after some more testing I found that fgetcsv reads each row from the CSV file, and then one additional row is read, which is NULL.
It's as if there's an invisible row and it's the last row. It only shows up as the first row in the database, probably because of automatic sorting or something, but again the last row fgetcsv got is NULL. I thought it should have detected NULL as EOF?
What I did to detect the bug is something I should have done in the first place, which is to use var_dump, which displayed all the car names; the last car was named "NULL".
Thank you for the help guys, each of you gave me ideas which led me to finding this prick lol
A couple of things could be causing this. First of all, try deleting the header row in your csv file. Next, put in a check that the data row is not blank or null before writing it to the database.
<?php
//open the csv file for reading
$jhandle = fopen($file_path, 'r');
//maximum line length for fgetcsv to read
$row_limit=1000;
while (($jdata = fgetcsv($jhandle, $row_limit, ",")) !== FALSE) {
//default to empty values so short or blank rows don't raise notices
$car_name = $jdata[0] ?? '';
$car_brand = $jdata[1] ?? '';
$car_price = $jdata[2] ?? 0;
if(($car_name != '')||($car_brand != '')||($car_price > 0)){
//write to your database here.
}
//close your while statement
}
fclose($jhandle);
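For what it's worth, the "invisible row" is most likely just how fgetcsv signals the end of the file: it returns false once nothing is left, and a blank line comes back as an array holding a single null. A `while (!feof($file))` loop runs one more time after the last real row, so that extra return value gets used as data. Below is a small sketch of a loop that avoids this; readCsvRows is a hypothetical helper, and skipping rows whose width differs from the header is an assumption made for brevity (they could be padded instead).

```php
<?php
// Hypothetical helper: read all data rows from an open CSV stream, keyed by
// the header names, skipping blank lines and rows of the wrong width.
function readCsvRows($handle): array
{
    $header = fgetcsv($handle, 0, ',', '"', '\\');
    if ($header === false || $header === [null]) {
        return []; // empty file or blank first line: nothing to map
    }
    $rows = [];
    // Testing the return value (not feof) stops exactly at end of file.
    while (($fields = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($fields === [null]) {
            continue; // blank line
        }
        if (count($fields) !== count($header)) {
            continue; // assumption: skip malformed rows rather than pad them
        }
        $rows[] = array_combine($header, $fields);
    }
    return $rows;
}
```

Each returned row can then be read as `$row['Car Name']`, `$row['Car Brand']` and `$row['Car Price']` and bound to a prepared statement, which also avoids building SQL from raw CSV values.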

MySQL to MySQLi Query issue when joining arrays

I'm trying to convert some MySQL queries to MySQLi, but I'm having an issue. Below is the part of the script I am having issues with; the script turns a query result into a csv:
$columns = (($___mysqli_tmp = mysqli_num_fields($result)) ? $___mysqli_tmp : false);
// Build a header row using the mysql field names
$rowe = mysqli_fetch_assoc($result);
$acolumns = array_keys($rowe);
$csvstring = '"=""' . implode('""","=""', $acolumns) . '"""';
$header_row = $csvstring;
// Below was used for MySQL, Above was added for MySQLi
//$header_row = '';
//for ($i = 0; $i < $columns; $i++) {
// $column_title = $file["csv_contain"] . stripslashes(mysql_field_name($result, $i)) . $file["csv_contain"];
// $column_title .= ($i < $columns-1) ? $file["csv_separate"] : '';
// $header_row .= $column_title;
// }
$csv_file .= $header_row . $file["csv_end_row"]; // add header row to CSV file
// Build the data rows by walking through the results array one row at a time
$data_rows = '';
while ($row = mysqli_fetch_array($result)) {
for ($i = 0; $i < $columns; $i++) {
// clean up the data; strip slashes; replace double quotes with two single quotes
$data_rows .= $file["csv_contain"] .$file["csv_equ"] .$file["csv_contain"] .$file["csv_contain"] . preg_replace('/'.$file["csv_contain"].'/', $file["csv_contain"].$file["csv_contain"], stripslashes($row[$i])) . $file["csv_contain"] .$file["csv_contain"] .$file["csv_contain"];
$data_rows .= ($i < $columns-1) ? $file["csv_separate"] : '';
}
$data_rows .= $this->csv_end_row; // add data row to CSV file
}
$csv_file .= $data_rows; // add the data rows to CSV file
if ($this->debugFlag) {
echo "Step 4 (repeats for each attachment): CSV file built. \n\n";
}
// Return the completed file
return $csv_file;
The problem I am having is when building a header row for the column titles mysqli doesn't use field_names so I am fetching the column titles by using mysqli_fetch_assoc() and then implode() the array, adding the ,'s etc for the csv.
This works but when I produce the csv I am deleting the first data row when the header is active, when I remove my header part of the script and leave the header as null I get all data rows and a blank header (As expected).
So I must be missing something when joining my header to array to the $csv_file.
Can anyone point me in the right direction?
Many Thanks
Ben
A third alternative is to refactor the loop body as a function, then also call this function on the first row before entering the loop. You can use fputcsv as this function.
$csv_stream = fopen('php://temp', 'r+');
if ($row = $result->fetch_assoc()) {
fputcsv($csv_stream, array_keys($row));
fputcsv($csv_stream, $row);
while ($row = $result->fetch_row()) {
fputcsv($csv_stream, $row);
}
fseek($csv_stream, 0);
}
$csv_data = stream_get_contents($csv_stream);
if ($this->debugFlag) {
echo "Step 4 (repeats for each attachment): CSV file built. \n\n";
}
// Return the completed file
return $csv_data;
This basically does the same thing as a do ... while loop, which would make more sense to use. I bring up this alternative to present the loop body refactoring technique, which can be used when a different kind of loop doesn't make sense.
Best of all would be to use both mysqli_result::fetch_fields and fputcsv:
$csv_stream = fopen('php://temp', 'r+');
$fields = $result->fetch_fields();
foreach ($fields as &$field) {
$field = $field->name;
}
unset($field); // break the reference left over from the foreach
fputcsv($csv_stream, $fields);
while ($row = $result->fetch_row()) {
fputcsv($csv_stream, $row);
}
fseek($csv_stream, 0);
$csv_data = stream_get_contents($csv_stream);
if ($this->debugFlag) {
echo "Step 4 (repeats for each attachment): CSV file built. \n\n";
}
// Return the completed file
return $csv_data;
If you can require that PHP be at least version 5.3, you can replace the foreach that generates the header line with a call to array_map. There admittedly isn't much advantage to this; I just find the functional approach more interesting.
fputcsv($csv_stream,
array_map(function($field) {return $field->name;},
$result->fetch_fields()));
As you observe, you're using the first row to obtain the field names but then not using the data from the row. Evidently, you need to change your code so that you get both of those things.
There are a number of ways you might do this. The most appropriate one is to use mysqli_fetch_fields() instead to get the field metadata from the result object.
http://www.php.net/manual/en/mysqli-result.fetch-fields.php
Alternatively, you could make the loop lower down in the code a do... while instead of a while.
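A rough sketch of that do ... while variant is shown below. To keep it independent of a live database connection, it takes a callable instead of a mysqli result; rowsToCsv and the `$fetchRow` parameter are inventions for this example. With mysqli, the assumption is that you would pass `[$result, 'fetch_assoc']`, which returns null once the rows run out.

```php
<?php
// Hypothetical helper: $fetchRow is any callable that returns the next row
// as an associative array, or null/false when there are no more rows.
function rowsToCsv(callable $fetchRow): string
{
    $stream = fopen('php://temp', 'r+');
    $row = $fetchRow();
    if ($row) {
        // The first row supplies the header names...
        fputcsv($stream, array_keys($row), ',', '"', '\\');
        // ...and do/while writes that same row before fetching the next,
        // so it is not lost the way it is with a plain while loop.
        do {
            fputcsv($stream, $row, ',', '"', '\\');
        } while ($row = $fetchRow());
    }
    rewind($stream);
    $csv = stream_get_contents($stream);
    fclose($stream);
    return $csv;
}
```

The trade-off compared with fetch_fields is that an empty result produces no header line at all, because the header is derived from the first row.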
