Using array_combine to store data read from csv - php

From a csv file I need to extract the header and the values. Both are later accessed in frontend.
$header = array();
$contacts = array();
if ($request->isMethod('POST')) {
    if (($handle = fopen($_FILES['file']['tmp_name'], "r")) !== FALSE) {
        $header = fgetcsv($handle, 1000, ",");
        while (($values = fgetcsv($handle, 1000, ",")) !== FALSE) {
            // array_combine
            // Creates an array by using one array for keys
            // and another for its values
            $contacts[] = array_combine($header, $values);
        }
        fclose($handle);
    }
}
It works with csv files that look like this
Name,Firstname,Organisation,
Bar,Foo,SO,
I just exported my gmail contacts and tried to read them using the above code but I get following error
Warning: array_combine() [function.array-combine]: Both
parameters should have an equal number of elements
The gmail csv looks like this
Name,Firstname,Organisation
Bar,Foo,SO
Is the last missing , the reason for the error? What is wrong and how to fix it?
I found this on SO
function array_combine2($arr1, $arr2) {
$count = min(count($arr1), count($arr2));
return array_combine(array_slice($arr1, 0, $count),
array_slice($arr2, 0, $count));
}
This works but it skips the Name field and not all fields are combined. Is this because the Gmail CSV is not really valid? Any suggestions?

I managed this by padding or slicing each row to match the size of the header before combining.
if (count($header) > count($values)) {
    $values = array_pad($values, count($header), null);
} else if (count($header) < count($values)) {
    $values = array_slice($values, 0, count($header));
}
$contacts[] = array_combine($header, $values);
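The pad-or-slice idea can be wrapped in a small helper so that every row is forced to the header's width before array_combine() sees it. This is only a minimal sketch: combine_row() is a hypothetical name, and the sample rows are made up to show one row that is too long (trailing comma) and one that is too short.

```php
<?php
// Hypothetical helper: force a data row to the header's length, then combine.
function combine_row(array $header, array $values): array
{
    $width = count($header);
    if (count($values) < $width) {
        // Short row: fill the missing columns with null.
        $values = array_pad($values, $width, null);
    } elseif (count($values) > $width) {
        // Long row (for example a trailing comma): drop the extra columns.
        $values = array_slice($values, 0, $width);
    }
    return array_combine($header, $values);
}

// Made-up sample data; the explicit delimiter, enclosure and escape
// arguments are just the usual defaults spelled out.
$header = str_getcsv('Name,Firstname,Organisation', ',', '"', '\\');
$long   = combine_row($header, str_getcsv('Bar,Foo,SO,', ',', '"', '\\'));
$short  = combine_row($header, str_getcsv('Bar,Foo', ',', '"', '\\'));

print_r($long);
print_r($short);
```

Whether silently padding or truncating is acceptable depends on the data; for an import where a shifted column would be harmful, it may be safer to log and skip rows whose length differs from the header.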

Although this isn't the answer to the question you asked, it might be the answer to the source of the problem. I recently had this problem and realized I was making a silly error because I didn't understand the parameters of fgetcsv(), as in fgetcsv($handle, 1000, ","):
That 1000 is the maximum length of a single line in the CSV you're reading. According to the documentation, a longer line is split into chunks of that length, so a long row comes back with the wrong number of fields, and array_combine() then complains. I don't know why the version given in the examples is so stingy, but it's not required; setting it to 0 allows fgetcsv() to read lines of any length. (The documentation warns this is slightly slower. For most use cases of fgetcsv() I can hardly imagine it's slow enough to notice.)
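To see the effect of the length argument, here is a small sketch that writes a single very long line to a temporary file and reads it back with the limit removed. The file contents are made up for the demonstration; only the call with length 0 is shown, since that is the setting being recommended.

```php
<?php
// Write a one-line CSV whose line is far longer than 1000 bytes (made-up data).
$path = tempnam(sys_get_temp_dir(), 'csv');
$fields = array_fill(0, 300, 'abcdefghij');   // 300 fields, roughly 3300 bytes
file_put_contents($path, implode(',', $fields) . "\n");

$handle = fopen($path, 'r');
// Length 0 means "no limit on line length".
$row = fgetcsv($handle, 0, ',', '"', '\\');
fclose($handle);
unlink($path);

echo count($row), " fields read\n";
```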

Related

PHP Array Processing Ability Decreases

I need help processing files holding about 46k lines or more than 30MB of data.
My original idea was to open the file and turn each line into an array element. This worked the first time as the array held about 32k values total.
The second time, the process was repeated, the array only held 1011 elements, and finally, the third time it could only hold 100.
I'm confused and don't know much about the backend array processes. Can someone explain what is happening and fix the code?
function file_to_array($cvsFile){
$handle = fopen($cvsFile, "r");
$path = fread($handle, filesize($cvsFile));
fclose($handle);
//Turn the file into an array and separate lines to elements
$csv = explode(",", $path);
//Remove common double spaces
foreach ($csv as $key => $line){
$csv[$key] = str_replace(' ', '', str_getcsv($line));
}
array_filter($csv);
//get the row count for the file and array
$rows = count($csv);
$filerows = count(file($cvsFile)); //this no longer works
echo "File has $filerows and array has $rows";
return $csv;
}
The approach here can be split in 2.
Optimized file reading and processing
Proper storage solution
Optimized file processing can be done like so:
$handle = fopen($cvsFile, "r");
$csv = array();
$rowsSucceed = 0;
$rowsFailed = 0;
if ($handle) {
    while (($line = fgets($handle)) !== false) { // Reading file by line
        // Process CSV line and check if it was parsed correctly
        // And count as you go
        $line = rtrim($line, "\r\n");
        $parsedLine = ($line === '') ? array() : str_getcsv($line);
        if (!empty($parsedLine)) {
            $csv[] = $parsedLine;
            $rowsSucceed++;
        } else {
            $rowsFailed++;
        }
    }
    fclose($handle);
} else {
    // Error handling
}
$totalLines = $rowsSucceed + $rowsFailed;
Also, you can avoid array_filter() simply by not adding a processed line if it is empty.
That keeps memory usage down during script execution.
Proper storage
Proper storage matters when you need to perform operations on this amount of data. Re-reading the file is inefficient and expensive. A simple file-based database like SQLite can help a lot and improve the overall performance of your script.
For this purpose you should probably load your CSV directly into the database and then run the count on the parsed data, avoiding extra file line counts etc.
It also gives you the further advantage of working with the data without keeping it all in memory.
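As a rough illustration of the storage idea, the sketch below streams a CSV into SQLite through PDO and lets the database do the counting. Everything named here is an assumption for the example: the helper import_csv_to_sqlite(), the table csv_rows, and the shortcut of storing each row as a JSON string rather than in real columns. It also assumes the pdo_sqlite extension is available.

```php
<?php
// Hypothetical helper: stream a CSV into a SQLite table, one row at a time.
function import_csv_to_sqlite(PDO $db, string $csvFile): int
{
    $db->exec('CREATE TABLE IF NOT EXISTS csv_rows (line_no INTEGER, payload TEXT)');
    $insert = $db->prepare('INSERT INTO csv_rows (line_no, payload) VALUES (?, ?)');

    $handle = fopen($csvFile, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $csvFile");
    }

    $lineNo = 0;
    $db->beginTransaction();            // a single transaction keeps the inserts fast
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $lineNo++;
        if ($row === array(null)) {     // fgetcsv() returns array(null) for a blank line
            continue;
        }
        // Assumption for brevity: the whole row is stored as one JSON string.
        $insert->execute(array($lineNo, json_encode($row)));
    }
    $db->commit();
    fclose($handle);

    // Count in the database instead of re-reading the file.
    return (int) $db->query('SELECT COUNT(*) FROM csv_rows')->fetchColumn();
}
```

A call might look like import_csv_to_sqlite(new PDO('sqlite:/path/to/rows.db'), $cvsFile), where the database path is a placeholder. In a real import the table would more likely have one column per CSV field so that the data can be queried directly.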
Your question says you want to "turn each line into an array element" but that is definitely not what you are doing. The code is quite clear; it reads the entire file into $path and then uses explode() to make one massive flat array of every element on every line. Then later you're trying to run str_getcsv() on each item, which of course isn't going to work; you've already exploded all the commas away.
Looping over the file using fgetcsv() makes more sense:
function file_to_array($cvsFile) {
    $csv = array();
    $filerows = 0;
    $handle = fopen($cvsFile, "r");
    while ($line = fgetcsv($handle)) {
        $filerows++;
        // skip empty lines
        if ($line[0] === null) {
            continue;
        }
        //Remove common double spaces
        $csv[] = str_replace(' ', '', $line);
    }
    //get the row count for the file and array
    $rows = count($csv);
    echo "File has $filerows and array has $rows";
    fclose($handle);
    return $csv;
}

Edit CSV field value for entire column

I have a CSV that is downloaded from the wholesaler every night with updated prices.
What I need to do is edit the price column (2nd column) and multiply the current value by 1.3 (a 30% markup).
My code to read the provided CSV and take just the columns I need is below, however I can't seem to figure out how to edit the price column.
<?php
// open the csv file in write mode
$fp = fopen('var/import/tb_prices.csv', 'w');
// read csv file
if (($handle = fopen("var/import/Cbl_4036_2408.csv", "r")) !== FALSE) {
$targetColumns = array(1, 2, 3); // get data from the 1st, 4th and 15th column
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
$targetData = array(); // array that hold target data
foreach($targetColumns as $column){ // loop throught the targeted columns array
if($column[2]){
$data[$column] = $data[0] * 1.3;
}
$targetData[] = $data[$column]; // get the data from the column
}
# Populate the multidimensional array.
$csvarray[$nn] = $targetData; // add target data to csvarray
// write csv file
fputcsv($fp, $targetData);
}
fclose($handle);
fclose($fp);
echo "CSV File Written Successfully!";
}
?>
Could somebody point me in the right direction please, explaining how you've worked out the function too so I can learn at the same time.
You always calculate the new price as $data[0] * 1.3, which reads the first column.
The price is in the second column, so this is probably where it goes wrong.
Other views:
If this is a one-off job for this CSV data, try to solve it in MySQL alone. Create a table that matches the CSV's columns, import the .csv data into that table, and then run whatever SQL you need.
No loops, no coding, no file read/write, and precise control over what you change with UPDATE. You just need to be aware of the delimiters: line separators (e.g. \r\n), column separators (e.g. comma, tab or semicolon), and whether values are enclosed in double or single quotes.
Once you have modified your data, you can export it back to CSV again.
If you want to handle the .csv file itself, open it with one handle (read-only mode) and write to another file, so the original data is preserved.
You say that the column containing the price is the second one, but then you use index zero. Anyway, the whole thing can be simpler:
$handle = fopen("test.csv", "r");
if ($handle !== FALSE) {
    $out = "";
    while (($data = fgetcsv($handle, 1000, ";")) !== FALSE) {
        $data[1] = ((float)$data[1] * 1.3);
        $out .= implode(";",$data) . "\n";
    }
    fclose($handle);
    file_put_contents("test2.csv", $out);
}
This code opens a CSV file that uses a semicolon as separator (pass "," to fgetcsv() and implode() if your file is comma-separated).
It then reads every line, and for every line it multiplies the second column (index 1) by 1.3.
This line
$out .= implode(";",$data) . "\n";
generates a line for the new CSV file; see implode() in the official documentation.
After the loop I close the file. It is unnecessary to keep two files open at once when you can write the second file in one fell swoop, although that only holds for small files.
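For larger files, or for values that may themselves contain the separator or quotes, a streaming variant with fputcsv() avoids both the big string and the manual implode(). This is only a sketch under assumptions: markup_csv() is a hypothetical helper, the file is taken to be comma-separated, and rounding to two decimals is a choice made for the example.

```php
<?php
// Hypothetical helper: copy a CSV row by row, multiplying one column by a factor.
function markup_csv(string $in, string $out, int $priceIndex, float $factor): int
{
    $src = fopen($in, 'r');
    $dst = fopen($out, 'w');
    if ($src === false || $dst === false) {
        throw new RuntimeException('Cannot open input or output file');
    }

    $changed = 0;
    while (($data = fgetcsv($src, 0, ',', '"', '\\')) !== false) {
        // Only touch rows with a numeric price; header rows pass through unchanged.
        if (isset($data[$priceIndex]) && is_numeric($data[$priceIndex])) {
            $data[$priceIndex] = round((float) $data[$priceIndex] * $factor, 2);
            $changed++;
        }
        fputcsv($dst, $data, ',', '"', '\\');   // fputcsv() handles quoting
    }
    fclose($src);
    fclose($dst);

    return $changed;   // number of rows whose price was updated
}
```

With the question's layout this would be called as markup_csv('var/import/Cbl_4036_2408.csv', 'var/import/tb_prices.csv', 1, 1.3), since the price sits in the second column (index 1).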

selecting and manipulating individual CSV columns in php

I am trying to use a function much like this.....
$file = fopen("/tmp/$importedFile.csv","r");
while ($line = fgetcsv($file))
{
$csv_data[] = $line;
}
fclose($file);
...to load CSV values. This is gravy but now I wish to select individual columns by their array number. I believe I want to select it with something like this, but cannot find any clarity.
$csv_data[2] = $line;
This however just shows second (third) row of data rather than column.
Regards
Do you need the whole file in memory or will you be processing the lines individually?
Processing individually:
$line is already an array. If you want the 3rd column, use $line[2]
Processing after reading the whole file:
$csv_data[$lineNo][$columnNo]
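A short sketch of both access patterns, using made-up rows in place of what the fgetcsv() loop would collect in $csv_data. It also shows array_column() (available since PHP 5.5), which is one way to pull a whole column out of the loaded rows.

```php
<?php
// Made-up rows standing in for what the fgetcsv() loop collects in $csv_data.
$csv_data = array(
    array('id', 'name', 'price'),
    array('1', 'widget', '9.99'),
    array('2', 'gadget', '19.50'),
);

// One cell: row index first, column index second.
$cell = $csv_data[1][2];                     // '9.99'

// A whole column: array_column() pulls the same index from every row.
$thirdColumn = array_column($csv_data, 2);   // array('price', '9.99', '19.50')

print_r($thirdColumn);
```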
$inputfiledelimiter = ",";
if (($handle = fopen($PathOfCsvFile, "r")) !== FALSE)
{
while (($data = fgetcsv($handle, 0, $inputfiledelimiter)) !== FALSE)
{
//get data from $data
}
}
Well, your CSV file is now split up in lines, that is all.
No concept of columns yet in that structure.
So you need to split the lines into columns.
Or, much better, let PHP do that for you: Have a look at fgetcsv() and the associated functions:
http://nl.php.net/manual/en/function.fgetcsv.php

Using fseek to start reading a CSV after a certain number of lines

I am using the current code to read a csv file and add it to an array:
echo "starting CSV import<br>";
$current_row = 1;
$handle = fopen($csv, "r");
while ( ($data = fgetcsv($handle, 10000, ",") ) !== FALSE )
{
$number_of_fields = count($data);
if ($current_row == 1) {
//Header line
for ($c=0; $c < $number_of_fields; $c++)
{
$header_array[$c] = $data[$c];
}
} else {
//Data line
for ($c=0; $c < $number_of_fields; $c++)
{
$data_array[$header_array[$c]] = $data[$c];
}
array_push($products, $data_array);
}
$current_row++;
}
fclose($handle);
echo "finished CSV import <br>";
However when using a very large CSV this times out on the server, or has a memory limit error.
I'd like a way to do it in stages, so after the first say 100 lines it will refresh the page, starting at line 101.
I will probably be doing this with a meta refresh and a URL parameter.
I just need to know how to adapt that code above to start at the line I tell it to.
I have looked into fseek() but I'm not sure how to implement this here.
Can you please help?
The timeout can be circumvented using
ignore_user_abort(true);
set_time_limit(0);
When experiencing problems with the memory limit, it may be wise to take a step back and look at what you're actually doing with the data you're processing. Are you pushing the data into a database? Calculating something from the data without needing to store the actual data? …
Do you really need to push (array_push($products, $data_array);) the rows into an array (for later processing)? Can you instead write to the database directly? Or calculate directly? Or build an HTML <table> directly? Or do whatever the hell you're doing right then and there, within the while() loop, without pushing everything into an array first?
If you're able to chunk the processing, I guess you don't need that array at all. Otherwise you'd have to restore the array for every chunk - not solving the memory issue one bit.
If you can manage to change your processing algorithm to waste less memory / time, you should seriously consider that over any chunked processing requiring a round-trip to the browser (for so many performance and security reasons…).
Anyways, you can, at any time, identify the current stream offset with ftell() and re-set to that position using fseek(). You'd only need to pass that integer to your next iteration.
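The ftell()/fseek() hand-over could look roughly like the sketch below. read_csv_chunk() is a hypothetical helper; it assumes a comma-separated file, and it assumes the caller passes the returned offset along to the next request (for example as the URL parameter mentioned in the question, after validating it as an integer).

```php
<?php
// Hypothetical helper: read up to $limit rows starting at byte $offset and
// report the offset at which the next call should resume.
function read_csv_chunk(string $csvFile, int $offset, int $limit): array
{
    $handle = fopen($csvFile, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $csvFile");
    }
    fseek($handle, $offset);            // jump to where the previous chunk stopped

    $rows = array();
    while (count($rows) < $limit
        && ($data = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $rows[] = $data;
    }
    $next = ftell($handle);             // byte position just after the last row read
    fclose($handle);

    // An empty 'rows' array tells the caller that the file is exhausted.
    return array('rows' => $rows, 'next' => $next);
}
```

The first call would use offset 0 and each later call the 'next' value from the previous one. Note that the header line is only seen by the first chunk, so the column names have to be carried over separately (or re-read from the top of the file) if later chunks need them.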
Also there is no need for your inner for() loops. This should produce the same results:
<?php
$products = array();
$cols = null;
$first = true;
$handle = fopen($csv, "r");
while (($data = fgetcsv($handle, 10000, ",")) !== false) {
if ($first) {
$cols = $data;
$first = false;
} else {
$products[] = array_combine($cols, $data);
}
}
fclose($handle);
echo "finished CSV import <br>";

fgetcsv causing infinite loop

Trying to use fgetcsv to parse a CSV file and do stuff with it, using the following code found all over the Internet, including the PHP function definition page:
if (($handle = fopen("test.csv", "r")) !== FALSE) {
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
print_r($data);
}
fclose($handle);
}
But the code gives me an infinite loop of warnings on the $data = line:
PHP Warning: fgetcsv() expects parameter 1 to be resource, boolean given in...
I know the file I'm opening is a valid file, because if I add a dummy character to the file name I get a different error and no loop.
The file is in a folder with full permissions.
I'm not using a CSV generated by an Excel on Mac (there's a quirky error there)
PHP version 5.1.6, so there should be no problem with the function
I know the file's not too big, or malformed, because I kept shrinking the original file to see if that was a problem and finally just created a custom file in Notepad with nothing more than two lines like:
Value1A,Value1B,Value1C,Value1D
Still looping and giving no data. Here's the full code I'm working with now (using a variable that's greater than the number of lines so I can prove that it would loop infinitely without actually giving my server an infinite loop)
if ($handle = fopen($_SERVER['DOCUMENT_ROOT'].'/tmp/test-csv-file.csv', 'r') !== FALSE) {
while ((($data = fgetcsv($handle, 1000, ',')) !== FALSE) && ($row < 10)) {
print_r($data);
$row++;
}
fclose($handle);
}
So I really have two questions.
1) What could I possibly be overlooking that is causing this loop? I'm half-convinced it's something really "face-palm" simple...
2) Why is the recommended code for this function something that can cause an infinite loop if the file exists but there is some unknown problem? I would have thought the purpose of the !== FALSE and so forth would be to prevent that kind of stuff.
There's no question about what's going on here: the file is not opened successfully. That's why $handle is a bool instead of a resource (var_dump($handle) to confirm this yourself).
fgetcsv then returns null (not false!) because there's an error, and your test doesn't pick this up because you are testing with !== false. As the documentation states:
fgetcsv() returns NULL if an invalid handle is supplied or FALSE on
other errors, including end of file.
I agree that returning null and false for different error conditions is not ideal, and furthermore that it's against the precedent established by lots of other functions, but that's just how it is (and things could be worse). As things stand, you can simply change the test to
while ($data = fgetcsv($handle, 1000, ","))
and it will work correctly in both cases.
Update:
You are the victim of assignment inside an if condition:
if ($handle = fopen($_SERVER['DOCUMENT_ROOT'].'/tmp/test-csv-file.csv', 'r') !== FALSE)
should have been
// wrap the assignment to $handle inside parens!
if (($handle = fopen($_SERVER['DOCUMENT_ROOT'].'/tmp/test-csv-file.csv', 'r')) !== FALSE)
I'm sure you understand what went wrong here. This is the reason why I choose to never, ever, make assignments inside conditionals. I don't care that it's possible. I don't care that it's shorter. I don't even care that sometimes it's somewhat less "elegant" to write the loop if the assignment is taken out. If you value your sanity, consider doing the same.
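The precedence problem is easy to reproduce in isolation. The sketch below uses a temporary file with made-up contents: the first assignment shows what $handle ends up holding when the parentheses are missing, and the second part shows the same read with the assignment taken out of the condition.

```php
<?php
// Temporary file with made-up contents, so that fopen() succeeds.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "Value1A,Value1B,Value1C,Value1D\n");

// Without the inner parentheses, !== binds tighter than =, so $handle
// receives the result of the comparison (a boolean), not the file handle.
$handle = fopen($path, 'r') !== false;
$isResource = is_resource($handle);
var_dump($handle);        // bool(true)
var_dump($isResource);    // bool(false)

// With the assignment outside the condition there is nothing to mis-parenthesise.
$rows = array();
$handle = fopen($path, 'r');
if ($handle !== false) {
    while (($data = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $rows[] = $data;
    }
    fclose($handle);
}
unlink($path);

print_r($rows);
```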
$row = 1;
if (($handle = fopen($_FILES['csv-file']['tmp_name'], "r")) !== FALSE) {
$data = fgetcsv($handle , 1000 , ",");
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
$num = count($data);
echo "<p> $num fields in line $row: <br /></p>\n";
$row++;
for ($c=0; $c < $num; $c++) {
echo $data[$c] . "<br />\n";
}
}
fclose($handle);
}
Try the code snippet above; as far as I can tell, your code is missing some important things.
