I have a 260k-line CSV file with two columns. I read the file with fgetcsv and have a while loop that reads every line. Inside the loop I am trying to add the values from the second column to an array.
When the line that appends to the array is in place, my PHP script freezes and never finishes. I have done some debugging and the values are being added to the array, so I know the append and the while loop work, but I do not know why it freezes.
If I remove that line, the while loop gets through all 260k lines and the rest of the script runs.
Here is my code:
$amountRecords = 0;
$totalValue = 0;
$valueArray = array();

// reads in csv file
$handle = fopen('Task1-DataForMeanMedianMode.csv', 'r');

// to skip the header names/values
fgetcsv($handle);

// creates array containing variables from csv file
while(($row = fgetcsv($handle, "\r")) != FALSE)
{
    /*
    echo "ROW CONTAINS: ";
    var_dump($row[1]);
    echo "<br />";
    */
    $valueArray[] = $row[1];
    /*
    echo "VALUEARRAY NOW CONTAINS: ";
    var_dump($valueArray);
    echo "<br />";
    */
    $totalValue = $totalValue + $row[1];
    $amountRecords++;
}
And a sample of the CSV file:
ID,Value
1,243.00
2,243.00
3,243.00
4,243.00
5,123.11
6,243.00
7,180.00
8,55.00
9,243.00
10,55.00
With an out-of-memory error, there are two general approaches. As usual with these choices, you can pick easy-but-wrong and hard-but-right. The easy-but-wrong solution is to increase your memory limit to an appropriate level:
ini_set('memory_limit', '64M');
The better (although harder) solution is to re-engineer your algorithm so it doesn't need as much memory. This is clearly the more sustainable and robust approach. To do it properly, you need to evaluate what you actually have to do with the array you are building. For instance, I have written similar scripts that imported rows into a database. Instead of building a huge array and then inserting it, I did the work in batches: I built an array of 50-100 rows, inserted those, then cleared the array (freeing the memory for re-use).
Pseudo-code:
for (each row in file) {
    $rows_cache[] = $row[1];
    if (count($rows_cache) >= 50) {
        insert_these($rows_cache);
        $rows_cache = array();
    }
}
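For reference, a minimal runnable sketch of that batching idea, assuming a hypothetical insert_these() helper that flushes each batch to the database (the helper name and the batch size of 50 are illustrative, not from the original code):

<?php
// Stream the CSV and flush in batches, so the whole 260k-row file never sits in memory at once.
$handle = fopen('Task1-DataForMeanMedianMode.csv', 'r');
fgetcsv($handle); // skip the header row

$rows_cache = array();
while (($row = fgetcsv($handle)) !== false) {
    $rows_cache[] = $row[1];
    if (count($rows_cache) >= 50) {
        insert_these($rows_cache); // hypothetical helper, e.g. one multi-row INSERT
        $rows_cache = array();     // clear the batch, freeing memory for re-use
    }
}
if (!empty($rows_cache)) {
    insert_these($rows_cache); // flush the final partial batch
}
fclose($handle);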
Your first row is a string, so maybe try adding a numeric check:
while (($row = fgetcsv($handle)) !== false)
{
    if (is_numeric($row[1]))
    {
        $valueArray[] = $row[1];
        $totalValue = $totalValue + $row[1];
        $amountRecords++;
    }
}
Why not drop the line:
$totalValue = $totalValue + $row[1];
from inside your loop, and instead use:
$totalValue = array_sum($valueArray);
after completing your loop?
Not really the problem, but
while(($row = fgetcsv($handle, "\r")) != FALSE)
can be rewritten as
while($row = fgetcsv(...))
instead. There's no need for the explicit false check: if fgetcsv() returns false, the while loop terminates anyway. This version is also more legible and less risky. If you forget the parentheses around the fgetcsv portion, you end up with the equivalent of $row = (fgetcsv() != false), i.e. $row is simply a boolean value.
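As a quick illustration of that precedence pitfall (a hypothetical snippet, not from the question's code):

<?php
$handle = fopen('data.csv', 'r'); // hypothetical file

// With the inner parentheses, $row is the array returned by fgetcsv():
while (($row = fgetcsv($handle)) !== false) {
    echo $row[1], "\n";
}

// Without them, != binds tighter than =, so $row would just be a boolean:
// while ($row = fgetcsv($handle) != false) { ... } // $row is true, not the row

fclose($handle);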
I am new to PHP and am trying to loop over and concatenate CSV column values.
My input is:
and my expected output is:
Can somebody help me loop over the column values to get the expected output? Thanks in advance.
Basically, you need to read the CSV file line by line, store the parsed data in an array, and then do some nested loops.
This code will do the job for your example (it only works for 3 columns):
<?php
$rows = [];

$file = fopen("your-file.csv", "r");         // open your file
while (($row = fgetcsv($file)) !== false) {  // read the file till the end
    if ($row !== [null]) {                   // skip blank lines
        array_push($rows, $row);             // store the row in the array
    }
}
fclose($file);

for ($r1 = 0; $r1 < count($rows); $r1++) {
    for ($r2 = 0; $r2 < count($rows); $r2++) {
        for ($r3 = 0; $r3 < count($rows); $r3++) {
            echo $rows[$r1][0] . '_' . $rows[$r2][1] . '_' . $rows[$r3][2] . PHP_EOL;
        }
    }
}
It's a pretty dirty solution; I hope you can make it cleaner by using recursion instead of nested loops if you need to handle an unknown number of columns (a sketch of that idea follows below).
If you have another fixed number of columns, just add more nested loops (for $r4, $r5, and so on).
More info about how to read a CSV file in PHP is on w3schools.com.
Documentation for the join() function, which is an alias of implode(), is on php.net.
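For what it's worth, here is a hedged sketch of that recursive idea for an arbitrary number of columns; the buildCombos() helper name is made up for illustration, and implode() (i.e. join()) does the concatenation:

<?php
$rows = [];
$file = fopen("your-file.csv", "r");
while (($row = fgetcsv($file)) !== false) {
    if ($row !== [null]) {        // skip blank lines
        $rows[] = $row;
    }
}
fclose($file);

// Recursively pick a value for each column from any row, then print the joined combination.
function buildCombos(array $rows, int $col, array $parts, int $numCols): void
{
    if ($col === $numCols) {
        echo implode('_', $parts) . PHP_EOL; // join the collected column values
        return;
    }
    foreach ($rows as $row) {
        buildCombos($rows, $col + 1, array_merge($parts, [$row[$col]]), $numCols);
    }
}

if (!empty($rows)) {
    buildCombos($rows, 0, [], count($rows[0]));
}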
I have a large database that contains results of an experiment for 1500 individuals. Each individual has 96 data points. I wrote the following script to summarize and then format the data so it can be used by the analysis software. All was good until I had more than 500 individuals; now I am running out of memory.
I was wondering if anyone has a suggestion on how to overcome the memory limit problem without sacrificing speed.
This is how the table looks in the database:
fishId assayId allele1 allele2
14_1_1 1 A T
14_1_1 2 A A
$mysql = new PDO('mysql:host=localhost; dbname=aquatech_DB', $db_user, $db_pass);
$query = $mysql->prepare("SELECT genotyped.fishid, genotyped.assayid, genotyped.allele1, genotyped.allele2, fishId.sex, " .
"fishId.role FROM `fishId` INNER JOIN genotyped ON genotyped.fishid=fishId.catId WHERE fishId.projectid=:project");
$query->bindParam(':project', $project, PDO::PARAM_INT);
$query->execute();
So this is the call to the database. It is joining information from two tables to build the file I need.
if(!$query){
    $error = $query->errorInfo();
    print_r($error);
} else {
    $data = array();
    $rows = array();
    if($results = $query->fetchAll()){
        foreach($results as $row)
        {
            $rows[] = $row[0];
            $role[$row[0]] = $row[5];
            $data[$row[0]][$row[1]]['alelleY'] = $row[2];
            $data[$row[0]][$row[1]]['alelleX'] = $row[3];
        }
        $rows = array_unique($rows);
        foreach($rows as $ids)
        {
            $col2 = $role[$ids];
            $alelleX = $alelleY = $content = "";
            foreach($snp as $loci)
            {
                $alelleY = convertAllele($data[$ids][$loci]['alelleY']);
                $alelleX = convertAllele($data[$ids][$loci]['alelleX']);
                $content .= "$alelleY\t$alelleX\t";
            }
            $body .= "$ids\t$col2\t" . substr($content, 0, -1) . "\n";
This parses the data. In the output file I need one row per individual rather than 96 rows per individual; that is why the data has to be reformatted. At the end of the script I just write $body to a file.
I need the output file to be:
FishId Assay 1 Assay 2
14_1_1 A T A A
$location = "results/" . "$filename" . "_result.txt";
$fh = fopen("$location", 'w') or die ("Could not create destination file");
if(fwrite($fh, $body))
Instead of reading the whole result from your database query into a variable with fetchAll(), fetch it row by row:
while($row = $query->fetch()) { ... }
fetchAll() fetches the entire result in one go, which has its uses but is greedy with memory. Why not just use fetch() which handles one row at a time?
You seem to be indexing the rows by the first column, creating another large array, and then removing duplicate items. Why not use SELECT DISTINCT in the query to remove duplicates before they get to PHP?
I'm not sure what the impact would be on speed - fetch() may be slower than fetchAll() - but you don't have to remove duplicates from the array which saves some processing.
I'm also not sure what your second foreach is doing but you should be able to do it all in a single pass. I.e. a foreach loop within a fetch loop.
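A rough sketch of that single-pass idea, assuming the same $query, $snp, $filename and convertAllele() as in the code above, and assuming the query is ordered by fishid so each individual's rows arrive together; each individual's line is written as soon as it is complete instead of accumulating everything in $body:

<?php
// Assumes $query was prepared and executed as above, ideally with "ORDER BY genotyped.fishid" appended.
$fh = fopen("results/{$filename}_result.txt", 'w') or die("Could not create destination file");

$currentId = null;
$col2 = '';
$alleles = array();               // assayid => "alleleY\talleleX" for the current individual

$flush = function () use (&$currentId, &$col2, &$alleles, $fh, $snp) {
    if ($currentId === null) {
        return;                   // nothing buffered yet
    }
    $content = '';
    foreach ($snp as $loci) {
        $content .= ($alleles[$loci] ?? "\t") . "\t";
    }
    fwrite($fh, "$currentId\t$col2\t" . substr($content, 0, -1) . "\n");
};

while ($row = $query->fetch(PDO::FETCH_NUM)) {
    if ($row[0] !== $currentId) { // new individual: write out the previous one
        $flush();
        $currentId = $row[0];
        $col2 = $row[5];
        $alleles = array();
    }
    $alleles[$row[1]] = convertAllele($row[2]) . "\t" . convertAllele($row[3]);
}
$flush();                         // write the last individual
fclose($fh);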
Other observations on your code above:
the $role array seems to do the same indexing job as $rows - using $row[0] as the key effectively removes the duplicates in a single pass. Removing the duplicates by SELECT DISTINCT is probably better but, if not, do you need the $rows array and the array_unique function at all?
if the same value of $row[0] can have different values of $row[5] then your indexing method will be discarding data - but you know what's in your data so I guess you've already thought of that (the same could be true of the $data array)
I've got my PHP working to count the number of rows in the full CSV document, but I'm trying to get the number of rows that have data in a particular column. Is this possible with PHP?
$fp = file('test.csv');
echo count($fp);
You can use the fgetcsv function and check the data in every row.
For example, if you want to check how many rows have data in the second column, run this:
$data_found = 0;
$handle = fopen("test.csv", "r");

while (($data = fgetcsv($handle)) !== false)
{
    if (isset($data[1]) && $data[1] !== '')
    {
        // data found in the second column
        $data_found++;
    }
}
fclose($handle);

echo 'Rows with data in the second column: ' . $data_found;
I'm not really familiar with exporting to Excel or CSV from PHP, but I'm using PHP and MySQL for a local point of sale.
The code below actually works, but not in the way it should: all records end up in a single row inside the CSV file. How can I fix that? Also, how would I stop overwriting the same file? When I click a button to export the CSV, it should check whether a CSV file already exists and, if one does, create a new one.
Thank you
require_once('connect_db.php');

$items_array = array();

$result = mysql_query("SELECT * FROM sold_items");
while($row = mysql_fetch_array($result))
{
    $items_array[] = $row['item_no'];
    $items_array[] = $row['qty'];
}

$f = fopen('C:/mycsv.csv', 'w');
fputcsv($f, $items_array);
fclose($f);
fputcsv writes only one row/record per call, and terminates it with a row/record separator. You will need to call fputcsv once for each line of the report.
dbf's solution of sequential file naming works well in many cases. Personally, I've found appending a timestamp helpful, as it requires less IO when there is already a collection of existing files. It also makes it possible to know when a report is from without having to open each one, even when the file has since been modified/copied/touched.
Minor detail: I adjusted the query to select just the columns you're using.
<?php
require_once('connect_db.php');

$result = mysql_query("SELECT item_no, qty FROM sold_items");

$timestamp = date('Ymd-His');
$f = fopen("C:/mycsv-{$timestamp}.csv", 'w');

// Headers
fputcsv($f, array('Item No', 'Qty'));

while($row = mysql_fetch_row($result))
{
    fputcsv($f, $row);
}

fclose($f);
First of all, build $items_array as an array of rows, one sub-array per record:
$items_array[] = array($row['item_no'], $row['qty']);
Second, use a variable to store the file name:
$filename = $name = "myscsv";
$index = 1;
while(file_exists($filename.".csv")) {
$filename = $name.$index;
$index++;
}
now you can save it ;)
$f = fopen("C:/{$filename}.csv", 'w');