I am trying to convert a tab-delimited file to CSV. The problem is that it's a huge file, with more than 100,000 records, and I want only specific columns from it. The file is generated by Amazon, not by me, so I can't control the format.
The code I wrote works fine, but I need to drop some columns, or rather keep only a few of them. How do I do that without affecting the performance of the conversion from txt to csv?
$file = fopen($file_name.'.txt','w+');
fwrite($file,$report);
fclose($file);
$handle = fopen($file_name.".txt", "r");
$lines = [];
$row_count=0;
$array_count = 0;
$uid = array($user_id);
if (($handle = fopen($file_name.".txt", "r")) !== FALSE)
{
while (($data = fgetcsv($handle, 100000, "\t")) !== FALSE)
{
if($row_count>0)
{
$lines[] = str_replace(",","<c>",$data);
array_push($lines[$array_count],$user_id);
$array_count++;
}
$row_count++;
}
fclose($handle);
}
$fp = fopen($file_name.'.csv', 'w');
foreach ($lines as $line)
{
fputcsv($fp, $line);
}
fclose($fp);
I am currently using unset to remove a column. Is there a better way to do this for multiple columns?
I would do that by checking keys. For example:
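// column keys you want to keep
$keys = array(0, 1, 3, 4, 7, 9);
// FILE_IGNORE_NEW_LINES stops the trailing newline from ending up in the last column
$lines = file($file_name, FILE_IGNORE_NEW_LINES);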
$result_lines = array();
foreach ($lines as $line) {
$tmp = array();
$tabs = explode("\t", $line);
foreach($tabs as $key => $value){
if(in_array($key, $keys)){
$tmp[] = $value;
}
}
$result_lines[] = implode(",", $tmp);
}
$finalString = implode("\n", $result_lines);
// Then write string to file
Hope it helps.
Cheers,
Siniša
In its simplest form, i.e. without removing any columns from the output, this reads one line and writes one line at a time, so there is no need to keep memory-hungry arrays.
$file_name = 'tst';
if ( ($f_in = fopen($file_name.".txt", "r")) === FALSE) {
echo 'Cannot find input file';
exit;
}
if ( ($f_out = fopen($file_name.'.csv', 'w')) === FALSE ) {
echo 'Cannot open output file';
exit;
}
while ($data = fgetcsv($f_in, 8000, "\t")) {
fputcsv($f_out, $data, ',', '"');
}
fclose($f_in);
fclose($f_out);
This is one way of removing the unwanted columns
$file_name = 'tst';
if ( ($f_in = fopen($file_name.".txt", "r")) === FALSE) {
echo 'Cannot find input file';
exit;
}
if ( ($f_out = fopen($file_name.'.csv', 'w')) === FALSE ) {
echo 'Cannot open output file';
exit;
}
$unwanted = [26,27]; //index of unwanted columns
while ($data = fgetcsv($f_in, 8000, "\t")) {
// remove unwanted columns
foreach($unwanted as $i) {
unset($data[$i]);
}
fputcsv($f_out, $data, ',', '"');
}
fclose($f_in);
fclose($f_out);
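If you would rather list the columns you want to keep than the ones to drop, the same streaming idea can be turned around. This is only a sketch: the function name tsv_to_csv_columns, its parameters and the $skipHeader switch are made up for illustration, and it assumes the input is plain tab-delimited text that fgetcsv can parse.

```php
<?php
// Hypothetical helper (not from the original posts): stream a tab-delimited
// file to CSV one row at a time, writing only the column indexes in $keep.
function tsv_to_csv_columns(string $inPath, string $outPath, array $keep, bool $skipHeader = false): int
{
    $in = fopen($inPath, 'r');
    if ($in === false) {
        throw new RuntimeException("Cannot open input file: $inPath");
    }
    $out = fopen($outPath, 'w');
    if ($out === false) {
        fclose($in);
        throw new RuntimeException("Cannot open output file: $outPath");
    }

    $written = 0;
    $isFirst = true;
    // Length 0 means "no line length limit"; only one row is held in memory.
    while (($data = fgetcsv($in, 0, "\t", '"', '\\')) !== false) {
        // fgetcsv returns array(null) for a blank line.
        $skip = ($isFirst && $skipHeader) || $data === [null];
        $isFirst = false;
        if ($skip) {
            continue;
        }
        $row = [];
        foreach ($keep as $i) {
            // A column missing from a short row is written as an empty field.
            $row[] = $data[$i] ?? '';
        }
        fputcsv($out, $row, ',', '"', '\\');
        $written++;
    }

    fclose($in);
    fclose($out);
    return $written; // number of rows written
}
```

Calling tsv_to_csv_columns('tst.txt', 'tst.csv', [0, 1, 3, 4, 7, 9], true) would keep six columns and drop the header row. The columns are written in the order given in $keep, so the same list can also be used to reorder them.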
I've got a CSV file which contains product data and prices from two distributors.
There are 67 columns (keys) in this file.
Now I want to find every EAN that appears twice in this file and get the cheapest price.
After that, the line with the higher price should be deleted.
The CSV has a key for the merchant.
I made a test CSV for easier viewing:
artno;name;ean;price;merchant
1;ipad;1654213154;499.00;merchant1
809;ipad;1654213154;439.00;merchant2
23;iphone;16777713154;899.00;merchant2
90;iphone;16777713154;799.00;merchant1
After the script runs through, the csv should look like (writing to new file):
artno;name;ean;price;merchant
809;ipad;1654213154;439.00;merchant2
90;iphone;16777713154;799.00;merchant1
I played around with fgetcsv. Looping through the CSV is not a problem, but how can I search for the EAN in key 2?
$filename = './test.csv';
$file = fopen($filename, 'r');
$fileline = 1;
while (($data = fgetcsv($file, 0, ";")) !== FALSE) {
if($fileline == "1"){ $fileline++; continue; }
$search = $data[2];
$lines = file('./test.csv');
$line_number = false;
$count = 0;
while (list($key, $line) = each($lines) and !$line_number) {
$line_number = (strpos($line, $search) !== FALSE) ? $key : $line_number;
$count++;
}
if($count > 2){
echo "<pre>",print_r(str_getcsv($lines[$line_number], ";")),"</pre>";
}
}
I think this is what you are looking for:
<?php
$filename = './test.csv';
$file = fopen($filename, 'r');
$lines = file('./test.csv');
$headerArr = str_getcsv($lines[0], ";");
$finalrawData = [];
$cheapeastPriceByProduct = [];
$dataCounter = 0;
while (($data = fgetcsv($file, 0, ";")) !== FALSE) {
if($dataCounter > 0) {
$raw = str_getcsv($lines[$dataCounter], ";");
$tempArr = [];
foreach( $raw as $key => $val) {
$tempArr[$headerArr[$key]] = $val;
}
$finalrawData[] = $tempArr;
}
$dataCounter++;
}
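foreach($finalrawData as $idx => $dataRow ) {
// group by EAN, since the question asks for duplicates by EAN rather than by name
if(!isset($cheapeastPriceByProduct[$dataRow['ean']])) {
$cheapeastPriceByProduct[$dataRow['ean']] = $dataRow;
}
else {
// compare as float: an (int) cast would truncate 439.50 and 439.00 to the same value
if(((float)$dataRow['price'])< ((float)$cheapeastPriceByProduct[$dataRow['ean']]['price'])) {
$cheapeastPriceByProduct[$dataRow['ean']] = $dataRow;
}
}
}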
echo "<pre>";
print_r($finalrawData);
print_r($cheapeastPriceByProduct);
I just added the $finalrawData array to store the parsed data and associated every row with its header keys; you can then compare and filter the data based on your criteria.
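To go all the way to the new file the question asks for, something along these lines might work. It is a sketch under a few assumptions: the function name keep_cheapest_per_ean is invented here, the header row is assumed to contain columns named ean and price, and when two rows have the same price the first one seen is kept.

```php
<?php
// Hypothetical helper: keep only the cheapest row per EAN and write the
// result to a new file. Only one row per distinct EAN is held in memory.
function keep_cheapest_per_ean(string $inPath, string $outPath, string $delimiter = ';'): int
{
    $in = fopen($inPath, 'r');
    if ($in === false) {
        throw new RuntimeException("Cannot open input file: $inPath");
    }
    $header = fgetcsv($in, 0, $delimiter, '"', '\\');
    if ($header === false) {
        fclose($in);
        throw new RuntimeException('Input file is empty');
    }
    // Locate the columns by name instead of hard-coding their positions.
    $eanCol = array_search('ean', $header, true);
    $priceCol = array_search('price', $header, true);
    if ($eanCol === false || $priceCol === false) {
        fclose($in);
        throw new RuntimeException('Header must contain "ean" and "price"');
    }

    $cheapest = [];
    while (($row = fgetcsv($in, 0, $delimiter, '"', '\\')) !== false) {
        if (!isset($row[$eanCol], $row[$priceCol])) {
            continue; // blank or short line
        }
        $ean = $row[$eanCol];
        // Compare as float so decimal prices are not truncated.
        if (!isset($cheapest[$ean]) || (float)$row[$priceCol] < (float)$cheapest[$ean][$priceCol]) {
            $cheapest[$ean] = $row;
        }
    }
    fclose($in);

    $out = fopen($outPath, 'w');
    if ($out === false) {
        throw new RuntimeException("Cannot open output file: $outPath");
    }
    fputcsv($out, $header, $delimiter, '"', '\\');
    foreach ($cheapest as $row) {
        fputcsv($out, $row, $delimiter, '"', '\\');
    }
    fclose($out);
    return count($cheapest); // number of distinct EANs written
}
```

With the test CSV from the question, keep_cheapest_per_ean('./test.csv', './cheapest.csv') should leave the header plus the 439.00 ipad line and the 799.00 iphone line. The output file name is just an example.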
I have an existing CSV file with the following values:
column1 column2
Fr-fc Fr-sc
Sr-fc Sr-sc
I want to add 2 new columns to it and end up with the following format:
column1 column2 column3 column4
Fr-fc Fr-sc 1 2
Sr-fc Sr-sc 1 2
If I use the following code, it writes the column header value into every row of the newly created columns:
$a = file('amit.csv');// get array of lines
$new = '';
foreach($a as $line){
$line = trim($line);// remove end of line
$line .=";column3";// append new column
$line .=";column4";// append new column
$new .= $line.PHP_EOL;//append end of line
}
file_put_contents('amit2.csv', $new);// overwrite the same file with new data
How can I achieve the above?
Instead of reinventing the wheel, you can use PHP's built-in CSV functions fgetcsv and fputcsv to ease your work. First read each row with fgetcsv and store the data in a multidimensional array:
$delimiter = "\t"; //your column separator
$csv_data = array();
$row = 1;
if (($handle = fopen('test.csv', 'r')) !== FALSE) {
while (($data = fgetcsv($handle, 1000, $delimiter)) !== FALSE) {
$csv_data[] = $data;
$row++;
}
fclose($handle);
}
Next edit the rows to add the extra columns using array_merge:
$extra_columns = array('column3' => 1, 'column4' => 2);
foreach ($csv_data as $i => $data) {
if ($i == 0) {
$csv_data[$i] = array_merge($data, array_keys($extra_columns));
} else {
$csv_data[$i] = $data = array_merge($data, $extra_columns);
}
}
Finally, use fputcsv to write each row to the CSV file.
if (($handle = fopen('test.csv', 'w')) !== FALSE) {
foreach ($csv_data as $data) {
fputcsv($handle, $data, $delimiter);
}
fclose($handle);
}
You can combine these steps to make your code more efficient by reducing the number of loops.
This approach needs less code:
<?php
$inFile = fopen('test.csv','r');
$outFile = fopen('output.csv','w');
$line = fgetcsv($inFile);
while ($line !== false) {
$line[] = 'third column';
$line[] = 'fourth column';
fputcsv($outFile, $line);
$line = fgetcsv($inFile);
}
fclose($inFile);
fclose($outFile);
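Note that the loop above appends the same two values to every line, header included. If the header row should get the new column names and the data rows the new values, a streaming variant could look like the sketch below. The function name append_columns and the shape of $extra (column name => value) are assumptions for illustration.

```php
<?php
// Hypothetical helper: append new columns while streaming. The first row
// receives the new column names, every later row receives the values.
function append_columns(string $inPath, string $outPath, array $extra, string $delimiter = ','): void
{
    $in = fopen($inPath, 'r');
    if ($in === false) {
        throw new RuntimeException("Cannot open input file: $inPath");
    }
    $out = fopen($outPath, 'w');
    if ($out === false) {
        fclose($in);
        throw new RuntimeException("Cannot open output file: $outPath");
    }

    $isHeader = true;
    while (($row = fgetcsv($in, 0, $delimiter, '"', '\\')) !== false) {
        if ($row === [null]) {
            continue; // blank line
        }
        // Header row gets the keys of $extra, data rows get its values.
        $row = array_merge($row, $isHeader ? array_keys($extra) : array_values($extra));
        $isHeader = false;
        fputcsv($out, $row, $delimiter, '"', '\\');
    }

    fclose($in);
    fclose($out);
}
```

For example, append_columns('test.csv', 'output.csv', ['column3' => 1, 'column4' => 2]). The input and output paths have to differ, because opening a file with 'w' empties it before anything has been read.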
I would like to convert a CSV file that has duplicate rows: I want to sum the quantity and keep the price without summing it.
file.csv :
code,qty,price
001,2,199
001,1,199
002,2,159
002,2,159
My current PHP sums the quantity and produces one row per unique code with the total qty:
<?php
$tsvFile = new SplFileObject('file.csv');
$tsvFile->setFlags(SplFileObject::READ_CSV);
$tsvFile->setCsvControl("\t");
$file = fopen('file.csv', 'w');
$header = array('sku', 'qty');
fputcsv($file, $header, ',', '"');
foreach ($tsvFile as $line => $row) {
if ($line > 0) {
if (isset($newData[$row[0]])) {
$newData[$row[0]]+= $row[1];
} else {
$newData[$row[0]] = $row[1];
}
}
}
foreach ($newData as $key => $value) {
fputcsv($file, array($key, $value), ',', '"');
}
fclose($file);
?>
the result for this is:
code,qty
001,3
002,4
I would also like to add the price, but without summing it.
The result i need is:
code,qty,price
001,3,199
002,4,159
I haven't tested this yet, but I think this is what you are looking for:
<?php
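$tsvFile = new SplFileObject('file.csv');
$tsvFile->setFlags(SplFileObject::READ_CSV);
// the sample data is comma-separated, not tab-separated
$tsvFile->setCsvControl(',');
// write to a separate file (example name): opening file.csv with 'w' would empty it before it is read
$file = fopen('result.csv', 'w');
$header = array('code', 'qty', 'price');
fputcsv($file, $header, ',', '"');
$newData = array();
foreach ($tsvFile as $line => $row) {
// skip the header row and any blank line
if ($line > 0 && isset($row[1])) {
if(!isset($newData[$row[0]])) {
// the price is taken from the first row seen for each code
$newData[$row[0]] = array('qty'=>0, 'price'=>$row[2]);
}
$newData[$row[0]]['qty'] += $row[1];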
}
}
foreach ($newData as $key => $arr) {
fputcsv($file, array($key, $arr['qty'], $arr['price']), ',', '"');
}
fclose($file);
?>
To start with, there's a nice function on the PHP manual page for str_getcsv which will help you end up with a more legible array to work with:
function csv_to_array($filename='', $delimiter=',') {
if(!file_exists($filename) || !is_readable($filename))
return FALSE;
$header = NULL;
$data = array();
if (($handle = fopen($filename, 'r')) !== FALSE) {
while (($row = fgetcsv($handle, 1000, $delimiter)) !== FALSE) {
if(!$header)
$header = $row;
else
$data[] = array_combine($header, $row);
}
fclose($handle);
}
return $data;
}
This is purely for legibility's sake, but now comes the code which lets you work on the array.
$aryInput = csv_to_array('file.csv', ',');
$aryTemp = array();
foreach($aryInput as $aryRow) {
if(isset($aryTemp[$aryRow['code']])) {
$aryTemp[$aryRow['code']]['qty'] += $aryRow['qty'];
} else {
$aryTemp[$aryRow['code']] = $aryRow;
}
}
The above code simply:
Loops through the input
Checks whether the key exists in a temporary array
If it does, it just adds the new quantity
If it doesn't, it adds the entire row
Now you can write out your expected CSV file :)
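Writing the temporary array back out could be done roughly like this. It is a sketch, not part of the answer above: the function name write_assoc_rows_csv is made up, and it assumes every row is an associative array with the same keys in the same order, as csv_to_array produces.

```php
<?php
// Hypothetical helper: write an array of associative rows to a CSV file,
// using the keys of the first row as the header line.
function write_assoc_rows_csv(string $outPath, array $rows): void
{
    $out = fopen($outPath, 'w');
    if ($out === false) {
        throw new RuntimeException("Cannot open output file: $outPath");
    }
    $headerWritten = false;
    foreach ($rows as $row) {
        if (!$headerWritten) {
            fputcsv($out, array_keys($row), ',', '"', '\\');
            $headerWritten = true;
        }
        fputcsv($out, array_values($row), ',', '"', '\\');
    }
    fclose($out);
}
```

With the array built above, that would be write_assoc_rows_csv('result.csv', $aryTemp), where result.csv is only an example name. Each row keeps the price of the first line seen for its code, because only qty is added to afterwards.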
I've searched the web and looked through existing answers but can't find a solution to this one. I want to use PHP to do the following task. Here are my files:
csv file 1: member.csv
member1|john|smith|2009
member2|adam|jones|2007
member3|susie|rose|2002
csv file 2: classes.csv
member1|massage|swimming|weights
member2|gym|track|pilates
member3|yoga|running|stretches
I want to output a third file called file3.csv which merges the two files above based on the key field, which is the member number. The output should look like this:
member1|john|smith|2009|massage|swimming|weights
member2|adam|jones|2007|gym|track|pilates
member3|susie|rose|2002|yoga|running|stretches
The delimiter is a bar character. I want to do this using only PHP, no other languages.
I would be very grateful for a solution.
Matt
Read both files and store the data in arrays keyed by member ID: member1, ...
Then write the lines of the new file in a loop:
foreach ($firstArray as $key => $value1) {
$value2 = $secondArray[$key];
// ...
}
<?php
$data = array();
if (($handle = fopen('file1.csv', 'r')) !== FALSE) {
while (($line = fgetcsv($handle, 0, '|')) !== FALSE) {
$memberId = $line[0];
unset($line[0]);
$data[$memberId] = $line;
}
fclose($handle);
}
if (($handle = fopen('file2.csv', 'r')) !== FALSE) {
while (($line = fgetcsv($handle, 0, '|')) !== FALSE) {
$memberId = $line[0];
unset($line[0]);
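// a member that appears only in the second file starts from an empty row
$data[$memberId] = array_merge(isset($data[$memberId]) ? $data[$memberId] : array(), $line);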
}
fclose($handle);
}
ksort($data); // not needed, but puts records in order by member
if (($handle = fopen('file3.csv', 'w')) !== FALSE) {
foreach($data as $key => $value) {
fwrite($handle, "$key|" . implode('|', $value) . "\n");
}
fclose($handle);
}
Try this. It is untested.
$arr_one = array();
if (($fp = fopen("member.csv", "r")) !== FALSE) {
// the input files are bar-delimited, so read them with '|' rather than ','
while (($data = fgetcsv($fp, 1000, "|")) !== FALSE) {
$arr_one[$data[0]] = $data;
}
fclose($fp);
}
$arr_two = array();
if (($fp = fopen("classes.csv", "r")) !== FALSE) {
while (($data = fgetcsv($fp, 1000, "|")) !== FALSE) {
$arr_two[$data[0]] = $data;
}
fclose($fp);
}
$classes_field_count = sizeof(current($arr_two));
$members = array_keys($arr_one);
foreach ($members as $key) {
if (!isset($arr_two[$key])) {
$arr_two[$key] = range(0, ($classes_field_count - 1));
}
unset($arr_two[$key][0]);
$result_arr[$key] = array_merge($arr_one[$key], $arr_two[$key]);
}
if (($fp = fopen("file3.csv", "w")) !== FALSE) {
foreach ($result_arr as $fields) {
fputcsv($fp, $fields, '|');
}
fclose($fp);
}
I would like to convert a CSV file to JSON, using the header row as the keys and each line as an object. How do I go about doing this?
----------------------------------CSV---------------------------------
InvKey,DocNum,CardCode
11704,1611704,BENV1072
11703,1611703,BENV1073
---------------------------------PHP-----------------------------------
if (($handle = fopen('upload/BEN-new.csv'. '', "r")) !== FALSE) {
while (($row_array = fgetcsv($handle, 1024, ","))) {
while ($val != '') {
foreach ($row_array as $key => $val) {
$row_array[] = $val;
}
}
$complete[] = $row_array;
}
fclose($handle);
}
echo json_encode($complete);
Just read the first line separately and merge it into every row:
if (($handle = fopen('upload/BEN-new.csv', 'r')) === false) {
die('Error opening file');
}
$headers = fgetcsv($handle, 1024, ',');
$complete = array();
while ($row = fgetcsv($handle, 1024, ',')) {
$complete[] = array_combine($headers, $row);
}
fclose($handle);
echo json_encode($complete);
I find myself converting csv strings to arrays or objects every few months.
I created a class because I'm lazy and don't like copy/pasting code.
This class will convert a csv string to custom class objects:
Convert csv string to arrays or objects in PHP
$feed="https://gist.githubusercontent.com/devfaysal/9143ca22afcbf252d521f5bf2bdc6194/raw/ec46f6c2017325345e7df2483d8829231049bce8/data.csv";
//Read the csv and return as array
$data = array_map('str_getcsv', file($feed));
//Get the first row as the keys
$keys = array_shift($data);
//Add label to each value
$newArray = array_map(function($values) use ($keys){
return array_combine($keys, $values);
}, $data);
// Print it out as JSON
header('Content-Type: application/json');
echo json_encode($newArray);
Main gist:
https://gist.github.com/devfaysal/9143ca22afcbf252d521f5bf2bdc6194
For those who'd like things spelled out a little more, with some room to further parse any row or column without additional loops:
function csv_to_json_byheader($filename){
$json = array();
if (($handle = fopen($filename, "r")) !== FALSE) {
$rownum = 0;
$header = array();
while (($row = fgetcsv($handle, 1024, ",")) !== FALSE) {
if ($rownum === 0) {
for($i=0; $i < count($row); $i++){
// maybe you want to strip special characters or merge duplicate columns here?
$header[$i] = trim($row[$i]);
}
} else {
if (count($row) === count($header)) {
$rowJson = array();
foreach($header as $i=>$head) {
// maybe handle special row/cell parsing here, per column header
$rowJson[$head] = $row[$i];
}
array_push($json, $rowJson);
}
}
$rownum++;
}
fclose($handle);
}
return $json;
}