Delete blank lines in CSV file with PHP or PHPExcel?

I am trying to programmatically delete blank lines in CSV files using PHP. Files are uploaded to a site and converted to CSV using PHPExcel. One particular type of file produces a CSV with blank lines between the data rows, and I have had no luck cleaning them up with PHP. Here is an example of what this CSV looks like: https://gist.github.com/vinmassaro/467ea98151e26a79d556
I need to load the CSV, remove the blank lines, and save it, using either PHPExcel or standard PHP functions. Thanks in advance.
EDIT:
Here is a snippet from how it is currently converted with PHPExcel. This is part of a Drupal hook, acting on a file that has just been uploaded. I couldn't get the PHPExcel removeRow method working because it didn't seem to work on blank lines, only empty data rows.
// Load the PHPExcel IOFactory.
require_once(drupal_realpath(drupal_get_path('module', 'custom')) . '/PHPExcel/Classes/PHPExcel/IOFactory.php');
// Load the uploaded file into PHPExcel for normalization.
$loaded_file = PHPExcel_IOFactory::load(drupal_realpath($original_file->uri));
$writer = PHPExcel_IOFactory::createWriter($loaded_file, 'CSV');
$writer->setDelimiter(",");
$writer->setEnclosure("");
// Get path to files directory and build a new filename and filepath.
$files_directory = drupal_realpath(variable_get('file_public_path', conf_path() . '/files'));
$new_filename = pathinfo($original_file->filename, PATHINFO_FILENAME) . '.csv';
$temp_filepath = $files_directory . '/' . $new_filename;
// Save the file with PHPExcel to the temp location. It will be deleted later.
$writer->save($temp_filepath);

If you want to use PHPExcel, find CSV.php (the CSV writer) and edit the loop that writes the rows:
// Write rows to file
for ($row = 1; $row <= $maxRow; ++$row) {
// Convert the row to an array...
$cellsArray = $sheet->rangeToArray('A'.$row.':'.$maxCol.$row, '', $this->preCalculateFormulas);
// edit by ger
// if this is the last row, no line break will be added
$ifMaxRow = ($row == $maxRow);
// ... and write to the file
$this->writeLine($fileHandle, $cellsArray[0], $ifMaxRow);
}
Then, at the end of the file, edit the writeLine method:
/**
* Write line to CSV file
*
* edit by ger
*
* @param mixed $pFileHandle PHP filehandle
* @param array $pValues Array containing values in a row
* @throws PHPExcel_Writer_Exception
*/
private function writeLine($pFileHandle = null, $pValues = null, $ifMaxRow = false)
{
...
// Add enclosed string
$line .= $this->enclosure . $element . $this->enclosure;
}
and insert the following:
if($ifMaxRow == false){
// Add line ending
$line .= $this->lineEnding;
}

Using str_replace as in Mihai Iorga's comment will work. Something like:
$csv = file_get_contents('path/file.csv');
$no_blanks = str_replace("\r\n\r\n", "\r\n", $csv);
file_put_contents('path/file.csv', $no_blanks);
I copied the text from the example you posted, and this worked, although I had to change the "find" parameter to "\r\n \r\n" instead of "\r\n\r\n" because each of the blank-looking lines contains a single space.
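Because the blank-looking lines may hold a space (or nothing at all), a literal search string is fragile. Here is a minimal sketch that drops every line consisting only of whitespace, whatever the line ending. The function name remove_blank_lines is made up for illustration, and the sketch assumes no quoted field contains an embedded line break.

```php
<?php
// Hypothetical helper: drop lines that are empty or contain only whitespace.
function remove_blank_lines(string $csv, string $eol = "\r\n"): string
{
    // Split on \r\n, \n or \r so mixed line endings are handled.
    $lines = preg_split('/\r\n|\n|\r/', $csv);
    // Keep only the lines that still contain something after trimming.
    $kept = array_filter($lines, function ($line) {
        return trim($line) !== '';
    });
    // Re-join with one consistent line ending and terminate the last row.
    return $kept ? implode($eol, $kept) . $eol : '';
}
```

It could be used as file_put_contents('path/file.csv', remove_blank_lines(file_get_contents('path/file.csv')));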

Try this:
<?php
$handle = fopen("test.csv", 'r'); //your csv file
$clean = fopen("clean.csv", 'a+'); //new file with no empty rows
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
$num = count($data);
if($num > 1)
fputcsv($clean, $data ,";");
}
fclose($handle);
fclose($clean);
?>
Tested on my localhost.
Output Data:
Initial File:
col1,col2,col3,col4,col5,col6,col7,col8
0,229.500,7.4,3165.5,62,20.3922,15.1594,0
1,229.600,8.99608,3156.75,62,15.6863,16.882,0
2,229.700,7.2549,3130.25,62,16.8627,15.9633,0
3,229.800,7.1098,3181,62,17.2549,14.1258,0
Clean Csv File:
col1 col2 col3 col4 col5 col6 col7 col8
0 229.500 7.4 3165.5 62 203.922 151.594 0
1 229.600 899.608 3156.75 62 156.863 16.882 0
2 229.700 72.549 3130.25 62 168.627 159.633 0
3 229.800 71.098 3181 62 172.549 141.258 0

Related

fputcsv - Combining multiple CSVs into one; empty enclosures "" only output with delimiter

I'm coding a plugin that runs every day at 5am. It combines multiple CSV files (which have a .txt extension).
Currently, it is working... HOWEVER, the output format is incorrect.
The input will look like this:
"","","","","email@gmail.com","PARK PLACE 109 AVE","SOME RANDOM DATA","","","",""
And so on. This is only a partial row.
The output of this code does not return the same format. It produces something like this, without the "" in columns that have no data:
,,,,email@gmail.com,"PARK PLACE 109 AVE","SOME RANDOM DATA",,,,
Here is the part of the function that combines everything:
function combine_and_email_csv_files() {
// Get the current time and date
$now = new DateTime();
$date_string = $now->format('Y-m-d_H-i-s');
// Get the specified directories
$source_directory = get_option('csv_file_combiner_source_directory');
$destination_directory = get_option('csv_file_combiner_destination_directory');
// Load the CSV files from the source directory
$csv_files = glob("$source_directory/*.txt");
// Create an empty array to store the combined CSV data
$combined_csv_data = array();
// Loop through the CSV files
foreach ($csv_files as $file) {
// Load the CSV data from the file
$csv_data = array_map('str_getcsv', file($file));
// Add the CSV data to the combined CSV data array
$combined_csv_data = array_merge($combined_csv_data, $csv_data);
}
// Create the combined CSV file
$combined_csv_file = fopen("$destination_directory/$date_string.txt", 'w');
// Write the combined CSV data to the file
foreach ($combined_csv_data as $line) {
fputcsv($combined_csv_file, $line);
}
// Close the combined CSV file
fclose($combined_csv_file);
}
No matter what I've tried, it's not working. I know I'm missing something simple.
Thank you Nigel!
The thread "Forcing fputcsv to Use Enclosure For *all* Fields" helped me get there.
Using fputs instead of fputcsv and forcing "" on empty values is the short answer for me. It works beautifully; the code is below:
function combine_and_email_csv_files() {
// Get the current time and date
$now = new DateTime();
$date_string = $now->format('Y-m-d_H-i-s');
// Get the specified directories
$source_directory = get_option('csv_file_combiner_source_directory');
$destination_directory = get_option('csv_file_combiner_destination_directory');
// Load the CSV files from the source directory
$csv_files = glob("$source_directory/*.txt");
// Create an empty array to store the combined CSV data
$combined_csv_data = array();
// Loop through the CSV files
foreach ($csv_files as $file) {
// Load the CSV data from the file
$csv_data = array_map('str_getcsv', file($file));
// Add the CSV data to the combined CSV data array
$combined_csv_data = array_merge($combined_csv_data, $csv_data);
}
// Create the combined CSV file
$combined_csv_file = fopen("$destination_directory/$date_string.txt", 'w');
// Write the combined CSV data to the file
foreach ($combined_csv_data as $line) {
// Enclose each value in double quotes
$line = array_map(function($val) {
// Only null or '' becomes an empty pair of quotes; empty() would also swallow "0"
if ($val === null || $val === '') {
return "\"\"";
}
return "\"$val\"";
}, $line);
// Convert the line array to a CSV formatted string
$line_string = implode(',', $line) . "\n";
// Write the string to the file
fputs($combined_csv_file, $line_string);
}
// Close the combined CSV file
fclose($combined_csv_file);
}
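Wrapping values by hand like this leaves any double quote inside a value unescaped. A small sketch of a line builder that quotes every field and doubles embedded quotes, which is the usual CSV convention. The name csv_line_all_quoted is hypothetical and not part of PHP.

```php
<?php
// Hypothetical helper: build one CSV line with every field enclosed in quotes.
function csv_line_all_quoted(array $fields, string $delimiter = ','): string
{
    $quoted = array_map(function ($value) {
        // Cast null to '' and double any embedded quote characters.
        return '"' . str_replace('"', '""', (string) $value) . '"';
    }, $fields);
    return implode($delimiter, $quoted) . "\n";
}
```

Inside the write loop this would replace the array_map and implode steps: fputs($combined_csv_file, csv_line_all_quoted($line));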
Thank you Sammitch
After much haggling with this problem, Sammitch asked why I don't just concatenate the files. Simplicity is the ultimate sophistication, right?
*Note: this will only work for my specific circumstance. All I'm doing now is concatenating the files, making sure each file ends with a newline, and skipping the CSV manipulation entirely.
Code below:
function combine_and_email_csv_files() {
// Get the current time and date
$now = new DateTime();
$date_string = $now->format('Y-m-d_H-i-s');
// Get the specified directories
$source_directory = get_option('csv_file_combiner_source_directory');
$destination_directory = get_option('csv_file_combiner_destination_directory');
// Load the files from the source directory
$files = glob("$source_directory/*.txt");
// Create the combined file
$combined_file = fopen("$destination_directory/$date_string.txt", 'w');
// Loop through the files
foreach ($files as $file) {
// Read the contents of the file
$contents = file_get_contents($file);
// Ensure that the file ends with a newline character
if (substr($contents, -1) != "\n") {
$contents .= "\n";
}
// Write the contents of the file to the combined file
fwrite($combined_file, $contents);
}
// Close the combined file
fclose($combined_file);
}

How to optimize loops for large CSV files data extraction

I have a question about code optimization.
I haven't coded anything besides simple loops in over ten years.
I created the code below, which works fine but is super slow for my needs.
In essence, I have 2 CSV files:
a source CSV file that has about 500 000 records, let's say: att1, att2, source_id, att3, att4 (in reality there are about 40 columns)
a main CSV file that has about 120 million records, let's say: att1, att2, att3, main_id, att4 (in reality there are about 120 columns)
For each source_id in the source file, my code scans the main file for all the lines where main_id == source_id and writes each of those lines to a new file.
Do you have any suggestion on how I could optimize the code, to go much much faster?
<?php
$mf = "main.csv";
$mf_max_line_length = "512";
$mf_id = "main_id";
$sf = "source.csv";
$sf_max_line_length = "884167";
$sf_id = "source_id";
if (($mf_handle = fopen($mf, "r")) !== FALSE)
{
// Read the first line of the main CSV file
// and look for the position of main_id
$mf_data = fgetcsv($mf_handle, $mf_max_line_length, ",");
$mf_id_pos = array_search ($mf_id, $mf_data);
// Create a new main CSV file
if (($nmf_handle = fopen("new_main.csv", "x")) !== FALSE)
{
fputcsv($nmf_handle,$mf_data);
} else {
echo "Cannot create file: new_main.csv";
exit(1); // break is only valid inside a loop or switch
}
}
// Open the source CSV file
if (($sf_handle = fopen($sf, "r")) !== FALSE)
{
// Read the first line of the source CSV file
// and look for the position of source_id
$sf_data = fgetcsv($sf_handle, $sf_max_line_length, ",");
$sf_id_pos = array_search ($sf_id, $sf_data);
// Go through the whole source CSV file
while (($sf_data = fgetcsv($sf_handle, $sf_max_line_length, ",")) !== FALSE)
{
// Open the main CSV file
if (($mf_handle = fopen($mf, "r")) !== FALSE)
{
// Go through the whole main CSV file
while (($mf_data = fgetcsv($mf_handle, $mf_max_line_length, ",")) !== FALSE)
{
// If the source_id matches the main_id
// then we write it into the new_main CSV file
if ($mf_data[$mf_id_pos] == $sf_data[$sf_id_pos])
{
fputcsv($nmf_handle,$mf_data);
}
}
fclose($mf_handle);
}
}
fclose($sf_handle);
fclose($nmf_handle);
}
?>
Sounds like a job for MySQL.
First, you'll need to create tables based on all your fields. See here
Then, you'll load your data. See here
Finally, you'll create a query like:
SELECT * INTO OUTFILE '/tmp/something.csv'
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM source_table INNER JOIN main_table ON
source_table.source_id=main_table.main_id;
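If loading the data into a database is not an option, the same join can be approximated in PHP by reading the source IDs into array keys (used as a hash set) and then scanning the main file only once, instead of once per source row. This is a sketch under assumptions: the function name filter_main_by_source_ids is hypothetical, both files are assumed to start with a header row containing the named ID column, and the set of source IDs is assumed to fit in memory.

```php
<?php
// Hypothetical helper: copy every row of the main file whose ID occurs in the source file.
function filter_main_by_source_ids(
    string $sourcePath,
    string $sourceIdCol,
    string $mainPath,
    string $mainIdCol,
    string $outPath
): int {
    // Pass 1: read the source IDs into array keys so each lookup is a hash probe.
    $src = fopen($sourcePath, 'r');
    $pos = array_search($sourceIdCol, fgetcsv($src));
    if ($pos === false) {
        throw new RuntimeException("Column $sourceIdCol not found");
    }
    $ids = [];
    while (($row = fgetcsv($src)) !== false) {
        if (isset($row[$pos])) {
            $ids[$row[$pos]] = true;
        }
    }
    fclose($src);

    // Pass 2: stream the main file once and keep the matching rows.
    $main = fopen($mainPath, 'r');
    $out = fopen($outPath, 'w');
    $header = fgetcsv($main);
    $mainPos = array_search($mainIdCol, $header);
    if ($mainPos === false) {
        throw new RuntimeException("Column $mainIdCol not found");
    }
    fputcsv($out, $header);
    $written = 0;
    while (($row = fgetcsv($main)) !== false) {
        if (isset($row[$mainPos], $ids[$row[$mainPos]])) {
            fputcsv($out, $row);
            $written++;
        }
    }
    fclose($main);
    fclose($out);
    return $written;
}
```

Note that the output follows the row order of the main file rather than being grouped by source row, and a main row is written once even if its ID appears several times in the source file.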

Move first row from a csv to another using php

Every time I run the PHP script, I need to move the first row of testdata.csv to another file named testdata_new.csv (appending the data).
This is an example of data that includes Name, Age, Job
Example data testdata.csv:
John,32,Scientist
Mary,25,Employer
Nick,36,Designer
Miky,46,Sales
Alex,29,Logistics
This is what the script should do when run:
Cut the first row from testdata.csv (John,32,Scientist) and paste it into testdata_new.csv under the first row (the header), which will always be "Name Age Job".
Save testdata_new.csv and testdata.csv with the remaining rows.
I did some tests but I'm still far away from the solution.
<?php
$file = "testdata.csv";
$f = fopen($file, "r");
$i = 0;
$file2 = str_replace(".csv", "_new.csv", $file);
$f2 = fopen($file2,"a");
while ($i<2) {
$record = fgetcsv($f);
foreach($record as $field) {
echo $field . "<br>";
}
$i++;
}
fwrite($f2,fread($f, filesize($file)));
fclose($f);
fclose($f2);
?>
Executing the script displays the first row of the testdata.csv file
and produces another file named testdata_new.csv with the following rows:
Mary,25,Employer
Nick,36,Designer
Miky,46,Sales
Alex,29,Logistics
What I really need in testdata_new.csv is only the first row that was displayed:
John,32,Scientist
And then save testdata.csv again without the first row, since the idea is to cut and paste the rows, as follows:
Mary,25,Employer
Nick,36,Designer
Miky,46,Sales
Alex,29,Logistics
Thank you all in advance for your help!
As easy as this ;-)
$old_file = 'testdata.csv';
$new_file = 'testdata_new.csv';
$file_to_read = file_get_contents($old_file); // Read the entire file
if ( trim($file_to_read) == '' ) die('EOF'); // No data left to move
$lines_to_read = explode("\n", $file_to_read); // Create an array of lines
$line_to_append = array_shift( $lines_to_read ); // Extract the first line
$file_to_append = file_get_contents($new_file); // Read the entire file
if ( substr($file_to_append, -1, 1) != "\n" ) $file_to_append.="\n"; // If the new file doesn't end with a newline, add one
// Writing files
file_put_contents($new_file, $file_to_append . $line_to_append . "\n");
file_put_contents($old_file, implode("\n", $lines_to_read));
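The same cut-and-append step can also be sketched as a small function that appends instead of rewriting the destination file. The name move_first_row is hypothetical, and the sketch assumes the destination file already ends with a newline (for example after its header row).

```php
<?php
// Hypothetical helper: cut the first row of $oldFile and append it to $newFile.
function move_first_row(string $oldFile, string $newFile): bool
{
    // file() returns one array element per line; the flags strip line endings
    // and skip empty lines.
    $lines = file($oldFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false || count($lines) === 0) {
        return false; // nothing left to move
    }
    $first = array_shift($lines);
    // FILE_APPEND adds to the end of the file; LOCK_EX guards against concurrent runs.
    file_put_contents($newFile, $first . "\n", FILE_APPEND | LOCK_EX);
    // Rewrite the old file with the remaining rows.
    file_put_contents($oldFile, $lines ? implode("\n", $lines) . "\n" : '', LOCK_EX);
    return true;
}
```

Calling move_first_row('testdata.csv', 'testdata_new.csv') once per run moves one row and returns false when testdata.csv is empty.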

Converting CSV Array

I am trying to convert a CSV file into a PHP array, but somehow everything ends up joined in a single string. I want each row to start at the * sign and end at the trailing ",,", with the fields separated by ",".
Here are parts of the csv:
*,Alerts,Alert,Type,Text,,,,,,phr_ccr,alert,Type,,
*,Alerts,Alert,Type,Code,Value,,,,,phr_ccr,alert,Type,,
*,Alerts,Alert,Type,Code,CodingSystem,,Text,,,phr_ccr,alert,Type,,
*,Alerts,Alert,Agent,Products,Product,Description,Code,Value,,phr_ccr,alert,Product_Name_CD,,
*,Alerts,Alert,Agent,Products,Product,Description,Code,CodingSystem,,phr_ccr,alert,Product_Name_CDS,,
*,Alerts,Alert,Agent,Products,Product,Description,ProductName,*,,phr_ccr,alert,Product_Name,,
Try this code and see if it helps:
$csv_name = 'your_file.csv'; # placeholder: the name of your CSV file
$raw_data = array();
$count = 0;
$csv = fopen($csv_name, 'r');
# The second argument of fgetcsv is the maximum line length (0 = no limit); the delimiter comes third
while (($csv_data = fgetcsv($csv, 0, ",")) !== FALSE)
{
# your CSV columns:
$alert = $csv_data[0];
$alerttype = $csv_data[1];
# and so on
$raw_data[$count] = $alert."#".$alerttype."#so on according to your need";
$count++;
}
fclose($csv);
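If the goal is one PHP array per CSV row, rather than a joined string, a minimal sketch could split the text into lines and parse each with str_getcsv. The function name csv_to_rows is invented for illustration, and the sketch assumes no field contains an embedded line break.

```php
<?php
// Hypothetical helper: turn CSV text into an array of rows (one array per line).
function csv_to_rows(string $text): array
{
    $rows = [];
    foreach (preg_split('/\r\n|\n|\r/', $text) as $line) {
        if (trim($line) === '') {
            continue; // skip blank lines
        }
        // str_getcsv splits one line into its fields; empty fields become ''.
        $rows[] = str_getcsv($line);
    }
    return $rows;
}
```

With the sample data, csv_to_rows(file_get_contents($csv_name)) would give one array per line, each starting with "*" and ending with the two empty fields produced by the trailing ",,".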

Edit CSV field value for entire column

I have a CSV that is downloaded from the wholesaler every night with updated prices.
What I need to do is edit the price column (2nd column) and multiply the current value by 1.3 (a 30% markup).
My code to read the provided CSV and take just the columns I need is below; however, I can't figure out how to edit the price column.
<?php
// open the csv file in write mode
$fp = fopen('var/import/tb_prices.csv', 'w');
// read csv file
if (($handle = fopen("var/import/Cbl_4036_2408.csv", "r")) !== FALSE) {
$targetColumns = array(1, 2, 3); // get data from the 1st, 4th and 15th column
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
$targetData = array(); // array that hold target data
foreach($targetColumns as $column){ // loop through the targeted columns array
if($column[2]){
$data[$column] = $data[0] * 1.3;
}
$targetData[] = $data[$column]; // get the data from the column
}
# Populate the multidimensional array.
$csvarray[$nn] = $targetData; // add target data to csvarray
// write csv file
fputcsv($fp, $targetData);
}
fclose($handle);
fclose($fp);
echo "CSV File Written Successfully!";
}
?>
Could somebody point me in the right direction please, explaining how you've worked out the function too so I can learn at the same time.
You are always computing the new price as $data[0] * 1.3, that is, from the first column.
That is probably wrong here.
Other views:
If this is a one-off job for this CSV data, try solving it in MySQL alone. Create a table whose columns match the CSV, import the .csv data into that table, and then operate on it with SQL as you want.
No loops, no coding, no file reads or writes, and precise control over what you change with UPDATE. You just need to be aware of the delimiters (line separators, e.g. \r\n; column separators, e.g. comma, tab or semicolon) and whether the data is enclosed in double or single quotes.
Once you have modified your data, you can export it back to CSV again.
If you want to handle the .csv file itself, open it in one handle (read-only mode) and write to another file, so the original data is preserved.
You say that the price is in the second column, but then you use index zero. Anyway, the whole thing can be simpler:
$handle = fopen("test.csv", "r");
if ( $handle !== FALSE) {
$out = "";
while (($data = fgetcsv($handle, 1000, ";")) !== FALSE) {
$data[1] = ((float)$data[1] * 1.3);
$out .= implode(";",$data) . "\n";
}
fclose($handle);
file_put_contents("test2.csv", $out);
}
This code opens a CSV file that uses a semicolon as the separator.
It then reads every line and, for each one, multiplies the second column (index 1) by 1.3.
This line
$out .= implode(";",$data) . "\n";
generates a line for the new CSV file; see implode in the official documentation.
Afterwards I close the file handle. There is no point in keeping two files open when the second file can be written in one go, although that only holds for small files.
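For larger files, or when fields may need quoting, a streaming variant can write each row with fputcsv as soon as it is read. This is only a sketch: the function name scale_csv_column is hypothetical, prices are assumed to be plain decimal numbers, and formatting the result to two decimals is an assumption rather than something the question specifies.

```php
<?php
// Hypothetical helper: copy a CSV file, multiplying one column by a factor.
function scale_csv_column(string $in, string $out, int $col, float $factor, string $sep = ','): int
{
    $reader = fopen($in, 'r');
    $writer = fopen($out, 'w');
    $changed = 0;
    while (($row = fgetcsv($reader, 0, $sep)) !== false) {
        if ($row === [null]) {
            continue; // fgetcsv returns [null] for a blank line
        }
        // Only touch rows where the column holds a number (this skips a header row).
        if (isset($row[$col]) && is_numeric($row[$col])) {
            $row[$col] = number_format((float) $row[$col] * $factor, 2, '.', '');
            $changed++;
        }
        // fputcsv takes care of quoting fields that contain the separator.
        fputcsv($writer, $row, $sep);
    }
    fclose($reader);
    fclose($writer);
    return $changed;
}
```

For the price file in the question this might be called as scale_csv_column('var/import/Cbl_4036_2408.csv', 'var/import/tb_prices.csv', 1, 1.3);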
