PHP problem when removing columns from a CSV - php

I use the PHP script below to remove some columns from a CSV, reorder the remaining ones and save the result as a new file.
It works for the file I wrote it for.
Now I need to do the same with another CSV, but I don't know what's wrong: I always get a comma before the data in the first column.
This is what I have, but it doesn't really work.
<?php
$input = 'http://***/original.csv';
$output = 'new.csv';
if (false !== ($ih = fopen($input, 'r'))) {
$oh = fopen($output, 'w');
while (false !== ($data = fgetcsv($ih))) {
// this is where you build your new row
$outputData = array($data[4], $data[0]);
fputcsv($oh, $outputData);
}
fclose($ih);
fclose($oh);
}
The original.csv looks like this:
subproduct_number barcode stock week_number qty_estimate productid variantid
05096470000 4024144513543 J 3 6 35016
ae214 848518017215 N 23 0 7 35015
05097280000 4024144513727 J 1 32 34990
The separator is ';'. The same separator is used in the file that works.
But here it goes wrong, because my saved new.csv looks like this:
subproduct_number barcode stock week_number qty_estimate productid variantid
,05096470000 4024144513543 J 3 6 35016
,ae214 848518017215 N 23 0 7 35015
,05097280000 4024144513727 J 1 32 34990
But what I need is a new CSV that looks like this:
qty_estimate subproduct_number
3 05096470000
0 ae214
1 05097280000
As you can see, I only need the 5th column ($data[4]) as the first column and the first column ($data[0]) as the second one.
I hope someone can point me in the right direction.
Thanks

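fgetcsv() splits on commas by default, so each semicolon-separated line arrives as a single field in $data[0]. You can split that field yourself and write the output with ';' as the separator: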
while (false !== ($data = fgetcsv($ih))) {
$data = explode(';', $data[0]);
$outputData = array($data[4], $data[0]);
fputcsv($oh, $outputData, ';');
}
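An alternative sketch, assuming the file really is semicolon-separated: pass ';' to fgetcsv() as the separator so every row is split into columns directly and the extra explode() is not needed. The helper name reorder_columns and its parameters are made up for illustration.

```php
<?php
// Hypothetical helper: copy the given columns (zero-based indexes) from a
// semicolon-separated file into a new file, in the given order.
// Returns the number of rows written.
function reorder_columns(string $input, string $output, array $columns, string $sep = ';'): int
{
    $ih = fopen($input, 'r');
    if ($ih === false) {
        return 0;
    }
    $oh = fopen($output, 'w');
    if ($oh === false) {
        fclose($ih);
        return 0;
    }
    $rows = 0;
    // A length of 0 means "no line length limit"; the separator is passed explicitly.
    while (false !== ($data = fgetcsv($ih, 0, $sep, '"', '\\'))) {
        if ($data === [null]) {
            continue; // fgetcsv() returns [null] for a blank line
        }
        $outputData = [];
        foreach ($columns as $index) {
            $outputData[] = $data[$index] ?? ''; // missing column becomes an empty field
        }
        fputcsv($oh, $outputData, $sep, '"', '\\');
        $rows++;
    }
    fclose($ih);
    fclose($oh);
    return $rows;
}
```

For the case in the question, the call would be reorder_columns('original.csv', 'new.csv', [4, 0]).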

Related

PHP - Count Distinct Value in a CSV file

I have a CSV file with a very large number of items (5000 lines) in this format:
storeId,bookId,nb
124,48361,0
124,48363,6
125,48362,8
125,48363,2
126,28933,4
142,55433,6
142,55434,10
171,55871,7
171,55872,6
I need to count the number of stores in the file, so for example with the lines above the result should be 5. But I need to do it with 5000 lines, so I can't just loop.
How can I achieve that?
I also need to return the max quantity, so 10.
I began by converting the file into an array:
if (file_exists($file)) {
$csv = array_map('str_getcsv', file($file));
#Stores
$storeIds = array_column($csv, 0);
$eachStoreNb = array_count_values($storeIds);
$storeCount = count($eachStoreNb);
}
print_r($storeCount);
Is there a better way to do it? A faster one? Maybe without using the array?
Any speed gain here would be a micro-optimization, but you can see an improvement in memory usage.
You could read the file line by line instead of collecting all store IDs in an array and then calling array_count_values(). That saves an extra loop and the unnecessary linear storage of all duplicate values.
Each store ID is simply used as a key of an associative array.
For the max nb, keep a variable that tracks the largest value seen so far, using the max() function. The rest is self-explanatory.
Snippet:
<?php
$file = 'test.csv';
if (file_exists($file)) {
    $fp = fopen($file, 'r');
    $max_nb = 0;
    $store_set = [];
    fgetcsv($fp); // ignoring headers
    // fgetcsv() returns false at end of file, which ends the loop
    while (($row = fgetcsv($fp)) !== false) {
        if ($row === [null]) {
            continue; // skip blank lines
        }
        $store_set[$row[0]] = true; // store ID as key, so duplicates collapse
        $max_nb = max($max_nb, (int) end($row));
    }
    fclose($fp);
    echo "Num Stores : ", count($store_set), "<br/>";
    echo "Max NB : ", $max_nb;
} else {
    echo "No such CSV file found.";
}
Note: For profiling, I suggest trying both scripts with Xdebug.
What if you looped through the file line by line?
I mean ...
$datas = [];
$handle = fopen("filename.csv", "r");
fgetcsv($handle); // read and discard the first line (header)
while (($csvLine = fgetcsv($handle)) !== false) {
    if ($csvLine === [null]) {
        continue; // skip blank lines
    }
    $datas[] = $csvLine[0]; // store ID
}
fclose($handle);
echo "all row: " . count($datas);
echo "\nnum store: " . count(array_unique($datas));
What 'nice_dev' says, but a little more compact.
$fp = fopen('<your_file>', 'r');
fgets($fp); // skip first line
$stores = [];
while($row = fgetcsv($fp)) {
$stores[$row[0]] = max([($stores[$row[0]] ?? 0), $row[2]]);
}
Working example.
An answer with awk would be:
awk -F, 'BEGIN {getline}
{ a[$1]++; m=$3>m?$3:m }
END{ for (i in a){ print i, a[i] };
print "Number of stores",length(a), "max:",m}' testfile
getline to skip the first line
increment the element of array a keyed by the first column ($1), and keep the max value of the third column in m
loop over the array a and print all counts (optional)
print the total 'Number of stores', and the max value.
output:
124 52
125 52
126 26
142 52
171 52
Number of stores 5 max: 10
Solution in AWK, for comparison. This includes the count of each store as well. AWK should handle 5000 lines almost instantly and copes well with far larger files. I use the same approach to filter duplicates from a file.
BEGIN{ # Set some variables initially
FS="," # field separator for INPUT
mymax=0 # init variable mymax
}
NR>1 { # skip the header line, this matches line 2 onwards
mycount[$1]++ # increase associative array at that position
if ($3>mymax){ # compare with max
mymax=$3
}
}
END{ # finally print results
for (i in mycount){
if (length(i)>0){
print "value " i " has " mycount[i]
}
}
print "Maximum value is " mymax
}

array_keys returning higher number than end($array)

I have a CSV file that contains around 8500 lines but I'm getting a really weird "bug".
I'm validating the data inside the CSV to make sure it is fine to import into the database. I currently just log the data errors to a log file, but when I open it I see error reports for rows up to 8800 (give or take).
I did some basic debugging to see what's what and did this to begin with:
foreach ($csv as $key => $row)
{
if ($key > 8500) {
echo '<pre>';
print_r($row);
echo '</pre>';
}
}
That only printed about 50 to 60 more rows, which is fine as the total number of rows is around that.
I then tried doing this to get the end array result:
$last = end($csv);
print_r($last);
and that showed an array with data as expected. However when I do this:
var_dump(array_keys($csv));
then it shows 8800 (give or take) values. Doing count($csv) returns the same number.
I've tried going into the actual CSV, highlighting everything below the last row and hitting clear, but it still has the same effect.
Here's how I build my $csv array:
$skus = $csv = [];
if (($handle = fopen($fileTmp, 'r')) !== false) {
set_time_limit(0);
$i = 0;
while (($csvData = fgetcsv($handle, 1000, ',')) !== false)
{
$colCount = count($csvData);
$csv[$i]['sku'] = $csvData[0];
$csv[$i]['desc'] = $csvData[1];
$csv[$i]['ean'] = $csvData[2];
$csv[$i]['rrp_less_vat'] = $csvData[3];
$csv[$i]['rrp_inc_vat'] = $csvData[4];
$csv[$i]['stock'] = $csvData[5];
$csv[$i]['est_delivery'] = $csvData[6];
$csv[$i]['img_name'] = $csvData[7];
$csv[$i]['vatable'] = $csvData[8];
$csv[$i]['obsolete'] = $csvData[9];
$csv[$i]['dead'] = $csvData[10];
$csv[$i]['replacement_product'] = $csvData[11];
$csv[$i]['brand'] = $csvData[12];
$csv[$i]['ext_desc'] = $csvData[13];
$i++;
}
fclose($handle);
}
Am I doing something wrong that I can't see in building the array or is this unexpected behaviour?
PHP version: 7.1
OS: Linux Mint
You have lines that are longer than the $length argument you are passing to fgetcsv(). From the documentation, emphasis mine:
Must be greater than the longest line (in characters) to be found in the CSV file (allowing for trailing line-end characters). Otherwise the line is split in chunks of length characters, unless the split would occur inside an enclosure.
The easiest fix is to stop limiting the length of the line to 1000:
while (($csvData = fgetcsv($handle)) !== false)
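A minimal sketch of the build loop without the length limit, assuming the goal is one named array per CSV row. The helper name read_rows and the column-count check are illustrative additions, not part of the original code; the length is passed as 0, which the documentation describes as "not limited".

```php
<?php
// Hypothetical helper: read every CSV row without a line-length limit and
// label the columns with the supplied names.
function read_rows(string $path, array $names): array
{
    $rows = [];
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return $rows;
    }
    // Length 0 = unlimited, so long lines are never split into extra rows.
    while (($csvData = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        // Skip blank lines and rows whose column count does not match.
        if (count($csvData) !== count($names)) {
            continue;
        }
        $rows[] = array_combine($names, $csvData);
    }
    fclose($handle);
    return $rows;
}
```

With the columns from the question, $names would be ['sku', 'desc', 'ean', ...] in file order, and count() of the result should then match the number of lines in the file.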

working with data in TXT file

I have a text file that contain data like this (EURUSD quotes)
19710104,000000,0.53690,0.53690,0.53690,0.53690,1
19710105,000000,0.53660,0.53660,0.53660,0.53660,1
19710106,000000,0.53650,0.53650,0.53650,0.53650,1
19710107,000000,0.53680,0.53680,0.53680,0.53680,1
19710108,000000,0.53710,0.53710,0.53710,0.53710,1
19710111,000000,0.53710,0.53710,0.53710,0.53710,1
19710112,000000,0.53710,0.53710,0.53710,0.53710,1
I want to move some data to another file like
0.53690,0.53690,0.53690,0.53690
and add some different calculated numbers to each line (moving average, RSI, Stoch ...) so the file can be used to train a neural network. The final file must look like this:
OPEN, HIGH, LOW, CLOSE, VOL, MA50, MA20, RSI14, StochMain, StochSignal,
So I need some hints.
You should use the PHP functions fgetcsv and fputcsv. See working example below that you can tweak to your needs.
It assumes that your input values given are in the format OPEN, CLOSE, HIGH, LOW, VOL. Introduce RSI and Stochastic etc in the same way that the Moving Average works.
<?php
// Prepare variables
$row = 1;
$output = array();
// Attempt to open the file quotes.txt
if (($handle = fopen("quotes.txt", "r")) !== FALSE) {
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
$row++;
// This part of the while loop will be hit for each line in the quotes.txt
// $data is an array that contains the values of this line from the csv
// $output is a new array that you are generating
// Create a new sub-array in the output array (add a new line to the main output)
$output[$row] = array();
// Add the third value of the input array to the start of the new array (the opening price)
$output[$row][] = $data[2];
// Add the fourth value of the input array to the new array (the closing price)
$output[$row][] = $data[3];
// Add the fifth value of the input array to the new array (the high price)
$output[$row][] = $data[4];
// Add the sixth value of the input array to the new array (the low price)
$output[$row][] = $data[5];
// Add the seventh value of the input array to the new array (the volume)
$output[$row][] = $data[6];
// Add moving average to the new array
$output[$row][] = calculate_ma($output, $row);
}
fclose($handle);
}
// Create new file or open existing to save the output to
$handle = fopen('output.csv', 'w');
// Flatten the arrays and save the csv data
foreach ($output as $file) {
$result = [];
array_walk_recursive($file, function($item) use (&$result) {
$result[] = $item;
});
fputcsv($handle, $result);
}
/**
* Calculate the value for the MA using the values in $output.
*/
function calculate_ma($output, $row) {
// For this example we will just say that the MA is equal to the closing price of this period
// and the previous four periods, divided by 5.
$ma = 0;
for ($x = 0; $x < 5; $x++) {
// Periods before the first row do not exist and count as 0
$ma += $output[$row - $x][1] ?? 0;
}
$ma = $ma / 5;
return $ma;
}
?>
The output of the above code, using the same input as you have pasted in your question, will be:
0.53690,0.53690,0.53690,0.53690,1,0.10738
0.53660,0.53660,0.53660,0.53660,1,0.2147
0.53650,0.53650,0.53650,0.53650,1,0.322
0.53680,0.53680,0.53680,0.53680,1,0.42936
0.53710,0.53710,0.53710,0.53710,1,0.53678
0.53710,0.53710,0.53710,0.53710,1,0.53682
0.53710,0.53710,0.53710,0.53710,1,0.53692
Bear in mind that the first four moving averages will be incorrect, as they do not have 5 periods of data from which to calculate the 5-period MA.
To calculate a larger MA (50-period) without a huge bunch of code, replace the MA function with:
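function calculate_ma($output, $row) {
    $period = 50;
    $ma = 0; // start the running sum at zero
    for ($x = 0; $x < $period; $x++) {
        // Periods before the first row do not exist and count as 0
        $ma = $ma + ($output[$row - $x][1] ?? 0);
    }
    $ma = $ma / $period;
    return $ma;
}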
I strongly believe that you should open the file, read it line by line, explode each line on ',', store every line in some kind of map, do your calculations and finally save the result to another file.
http://php.net/manual/en/function.explode.php
How to read a file line by line in php
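The line-by-line approach could be sketched as follows. This is only an illustration under assumptions: the helper name add_moving_average is made up, the four price columns are taken to be open, high, low, close (as in the header the question asks for), and the average is a plain simple moving average of the close that is left empty until a full window is available.

```php
<?php
// Hypothetical helper: read comma-separated quote lines, keep the four price
// columns plus volume, and append a simple moving average of the close.
function add_moving_average(string $input, string $output, int $period): void
{
    $in = fopen($input, 'r');
    $out = fopen($output, 'w');
    $window = []; // the most recent $period closing prices
    while (($line = fgets($in)) !== false) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip blank lines
        }
        $fields = explode(',', $line);
        // Fields 2..6 hold the four prices and the volume; date and time are dropped.
        $row = array_slice($fields, 2, 5);
        $window[] = (float) $fields[5]; // assumed to be the close
        if (count($window) > $period) {
            array_shift($window); // forget the oldest price
        }
        // Leave the average empty until a full window of data is available.
        $row[] = count($window) === $period
            ? number_format(array_sum($window) / $period, 5, '.', '')
            : '';
        fwrite($out, implode(',', $row) . "\n");
    }
    fclose($in);
    fclose($out);
}
```

Only the last $period prices are held in memory, so the input file can be arbitrarily long. RSI or Stochastic columns could be added the same way, each with its own window.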

How to sum column text file

Hello everyone, and apologies in advance: I have seen various threads on the site, but unfortunately my knowledge is still insufficient to complete my project.
I have a text file and I need the sum of each column (I just need the totals):
1003|name1|1208.00|2.00 |96.00 |0.00|0.00|0.00|0.00|98.00 |90.95 |7.05 |8516.40
1011|name2|1450.00|2.00 |49.00 |0.00|0.00|0.00|0.00|51.00 |44.62 |6.38 |9243.7
1004|name3|1450.00|25.00|170.00|0.00|0.00|0.00|0.00|195.00|175.75|19.25|27912.5
1002|name4|765.00 |1.00 |17.00 |0.00|0.00|0.00|0.00|18.00 |15.13 |2.87 |2193.26
I need to get this (the file is on Linux, so we can use Bash, PHP, MySQL ...):
1003|name1|1208.00|2.00 |96.00 |0.00|0.00|0.00|0.00|98.00 |90.95 |7.05 |8516.40
1011|name2|1450.00|2.00 |49.00 |0.00|0.00|0.00|0.00|51.00 |44.62 |6.38 |9243.7
1004|name3|1450.00|25.00|170.00|0.00|0.00|0.00|0.00|195.00|175.75|19.25|27912.5
1002|name4|765.00 |1.00 |17.00 |0.00|0.00|0.00|0.00|18.00 |15.13 |2.87 |2193.26
xxxx|Total |4873.00|30.00|332.00|0.00|0.00|0.00|0.00|362.00 |326.45|35.55|47865.86
Where xxxx is the Id number (No sum here).
I've been trying to do this in PHP and MySQL -- No luck so far.
try something like:
$file = '/path/to/your_file.txt';
if ( ($file = fopen($file, "r")) !== FALSE) {
$total = 0;
$row_1 = 0;
while (($line = fgetcsv($file, 1000, "|")) !== FALSE) {
// brutal dirt sanitization
foreach ( $line as $k => $v ) {
$line[$k] = (float) preg_replace('#[^0-9\.]#','', $v);
}
$total = $total + array_sum(array_slice($line, 2));
$row_1 = $row_1 + array_sum(array_slice($line, 2, 1));
//...
}
echo $total.' | '.$row_1; //...
}
else echo 'error ...';
Also, you can sanitize each row by replacing the foreach loop with array_map() and a callback function.
Pseudocode:
open source file for reading
open destination file for writing
initialise totalling array to zero values
while not EOF
read in line from file
explode line into working array
for x=2 ; x<13; x++
add floatval( working array[x] ) to totalling array[x]
write line out to destination file
close read file
write out totalling array to destination file
close destination file
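The pseudocode above could be turned into PHP roughly like this. It is a sketch under assumptions: the helper name append_totals is made up, columns 0 and 1 are taken to be the id and the name, and the totals row uses the "xxxx|Total" prefix from the question with two decimals.

```php
<?php
// Hypothetical helper: copy a pipe-separated file and append a row of column totals.
// Returns the totals keyed by column index.
function append_totals(string $source, string $destination): array
{
    $in = fopen($source, 'r');
    $out = fopen($destination, 'w');
    $totals = [];
    while (($line = fgets($in)) !== false) {
        $line = rtrim($line, "\r\n");
        if ($line === '') {
            continue; // skip blank lines
        }
        $fields = explode('|', $line);
        // Columns 0 and 1 hold the id and the name, so totals start at column 2.
        for ($x = 2; $x < count($fields); $x++) {
            $totals[$x] = ($totals[$x] ?? 0.0) + floatval(trim($fields[$x]));
        }
        fwrite($out, $line . "\n"); // copy the original line unchanged
    }
    fclose($in);
    $formatted = array_map(function ($t) {
        return number_format($t, 2, '.', '');
    }, $totals);
    fwrite($out, 'xxxx|Total|' . implode('|', $formatted) . "\n");
    fclose($out);
    return $totals;
}
```

The column padding of the original file is not reproduced in the totals row. Lines that still contain markup would need cleaning first.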
Try to get the text file data into an Excel spreadsheet and then add up the columns.
You can use VB to get the text into Excel and then add up the values of each column.
1) replace all | chars with , using str_replace
2) Use str_getcsv to create array out of the above resulting csv string
3) use foreach and loop through each row and calculate total
some PHP code
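$str = file_get_contents('myfile.txt');
$str = str_replace('|', ',', $str);
// str_getcsv() parses a single line, so split the text into lines first
$csv = array_map('str_getcsv', explode("\n", trim($str)));
$totals = array(0,0,0,0);
foreach ($csv as $row) {
    $totals[0] += (float) trim($row[0]);
    $totals[1] += (float) trim($row[2]);
    $totals[2] += (float) trim($row[3]);
    $totals[3] += (float) trim($row[4]);
}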
the $totals array contains all totals!

Script to trim 7 columns to 5 ( csv file )

How can I trim these 7 columns down to 5 columns, by running some script of sorts.
I recall you can do this using regex / php but buggered if I can recall how we did it.
Example data (from a GeoIP DB, 115,000 lines):
"3231296768","3231297023","ripencc","702518400","EU","EU","European Union"
"3231297024","3231297279","ripencc","441763200","EU","EU","European Union"
"3231297280","3231297535","ripencc","702518400","EU","EU","European Union"
"3231297536","3231297791","ripencc","702518400","EU","EU","European Union"
"3231297792","3231298047","ripencc","702518400","EU","EU","European Union"
"3231298048","3231298303","ripencc","702518400","EU","EU","European Union"
"3231298304","3231298559","ripencc","702518400","EU","EU","European Union"
I need to remove columns 3 and 4 from every line.
Any help appreciated.
While jimw's answer is the best answer in general, if you want a pure PHP solution I would suggest the following:
$input = 'input.txt';
$output = 'output.txt';
if (false !== ($ih = fopen($input, 'r'))) {
$oh = fopen($output, 'w');
while (false !== ($data = fgetcsv($ih))) {
// this is where you build your new row
$outputData = array($data[0], $data[1], $data[4], $data[5], $data[6]);
fputcsv($oh, $outputData);
}
fclose($ih);
fclose($oh);
}
From 'a script of sorts' and 'regex/PHP', I infer that you just want this done, and don't care what language is used. If you're on *nix:
cut -d, -f1,2,5,6,7 file.csv
'cut' is a standard unix command-line utility, found on everything from OS X to AIX. The arguments I've used are:
-d, # this sets the 'delimiter' to a comma, for CSV
-f1,2,5,6,7 # this selects which fields to print
So together, it takes a file where each line consists of fields separated by commas and prints out fields 1, 2, 5, 6 and 7 of it, which drops columns 3 and 4.
The same effect can be achieved in any programming language. I don't know PHP very well, so I won't attempt to produce it in PHP.
Edit: From the PHP docs, adapted slightly:
function apply_quotes($string) {
return '"'.$string.'"';
}
$row = 1;
if (($handle = fopen("test.csv", "r")) !== FALSE) {
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
$data = array_map("apply_quotes", $data);
echo join(",", array($data[0], $data[1], $data[4], $data[5], $data[6]))."\n";
}
fclose($handle);
}
