I know that yield can be used to create a data iterator, e.g. to read data from a CSV file.
function csv_generator($file) {
    $handle = fopen($file, "r");
    while (!feof($handle)) {
        yield fgetcsv($handle);
    }
    fclose($handle);
}
But the Generator::send() method suggests that I can do the same for sequential writing, instead of reading.
E.g. I want to use the thing like this:
function csv_output_generator($file) {
    $handle = fopen($file, 'w');
    while (null !== $row = yield) {
        fputcsv($handle, $row);
    }
    fclose($handle);
}
$output_generator = csv_output_generator($file);
$output_generator->send($rows[0]);
$output_generator->send($rows[1]);
$output_generator->send($rows[2]);
// Close the output generator.
$output_generator->send(null);
The above will work, I think.
But $output_generator->send(null); for closing seems wrong, or at least not ideal. It means that I can never send a literal null, which is OK for CSV writing, but maybe there is a use case for sending null.
Is there any "best practice" for using php generators for sequential writing?
Not saying this is a marvelous idea but if you're talking semantics, this 'feels' great.
Check against a class: pass in an object of a particular class to terminate the generator. Like so:
// should probably use namespacing here.
class GeneratorUtilClose {}

class GeneratorUtil {
    public static function close() {
        return new GeneratorUtilClose;
    }
}
function csv_output_generator($file) {
    $handle = fopen($file, 'w');
    while (!(($row = yield) instanceof GeneratorUtilClose)) {
        fputcsv($handle, $row);
    }
    fclose($handle);
}
$output_generator = csv_output_generator($file);
$output_generator->send($rows[0]);
$output_generator->send(GeneratorUtil::close());
Added a little factory in here for extra semantic sugar.
Not ideal either, but it works without creating any other class:
function csv_output_generator($file) {
    $handle = fopen($file, 'w');
    try {
        while ($row = yield) {
            fputcsv($handle, $row);
        }
    } catch (ClosedGeneratorException $e) {
        // closing generator
    }
    fclose($handle);
}
$output_generator = csv_output_generator($file);
$output_generator->send($rows[0]);
$output_generator->send($rows[1]);
$output_generator->send($rows[2]);
// Close the output generator.
$output_generator->throw(new ClosedGeneratorException());
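Note that ClosedGeneratorException is not a built-in PHP class, so this approach assumes you define a small marker exception yourself, for example:
// Hypothetical marker exception, used only to signal that no more rows will be sent.
class ClosedGeneratorException extends \Exception {}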
I have a pretty simple test case to write that parses data from an uploaded CSV file. Google suggests a library that can be helpful in unit tests to mock the real file system, but I would like to do it in some simpler way if possible.
I covered the basic part of the method, but it seems that code coverage stops at the fopen() call:
function fn_smdl_parse_csv($file, $max_line_size = 65536, $delimiter = ',')
{
    $all_data = false;
    if (!empty($file) && file_exists($file)) {
        if ($f = fopen($file, 'r')) {
            while (!feof($f)) {
                $data = fgetcsv($f, $max_line_size, $delimiter);
                if (!empty($data)) {
                    $all_data[] = $data;
                }
            }
            fclose($f);
        }
    }
    return $all_data;
}
I tried with:
public function testFnSmdlParseCsv()
{
    $result = fn_smdl_parse_csv('file.csv');
    self::assertFalse($result);
}
It is not much, but I cannot find a way to do more. I want to enter and cover the second if() statement. Is there a way to do it with a usual mock and without using external libraries?
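One way to cover that branch without any mocking library is to let the test create a real temporary CSV file, so that fopen() succeeds. A minimal sketch, assuming PHPUnit; the expected array of course depends on the CSV contents you write:
public function testFnSmdlParseCsvReadsRows()
{
    // A real temporary file makes the second if() branch execute.
    $file = tempnam(sys_get_temp_dir(), 'csv');
    file_put_contents($file, "a,b,c\n1,2,3\n");

    $result = fn_smdl_parse_csv($file);
    self::assertSame([['a', 'b', 'c'], ['1', '2', '3']], $result);

    unlink($file);
}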
I have a file with around 100 records for now.
The file has one user in JSON format per line, e.g.:
{"user_id" : 1,"user_name": "Alex"}
{"user_id" : 2,"user_name": "Bob"}
{"user_id" : 3,"user_name": "Mark"}
Note: this is just a very simple example; I have more complex JSON values per line in the file.
I am reading the file line by line and storing the lines in an array, which obviously will be big if there are a lot of items in the file:
public function read(string $file) : array
{
    // Open the file in "reading only" mode.
    $fileHandle = fopen($file, "r");

    // If we failed to get a file handle, throw an Exception.
    if ($fileHandle === false) {
        throw new Exception('Could not get file handle for: ' . $file);
    }

    $lines = [];

    // While we haven't reached the end of the file.
    while (!feof($fileHandle)) {
        // Read the current line in.
        $lines[] = json_decode(fgets($fileHandle));
    }

    // Finally, close the file handle.
    fclose($fileHandle);

    return $lines;
}
Next, I'll process this array, take only the parameters I need (some parameters might be further processed), and then export the result to CSV:
public function processInput($users) {
    $data = [];
    foreach ($users as $key => $user) {
        $data[$key]['user_id'] = $user->user_id;
        $data[$key]['user_name'] = strtoupper($user->user_name);
    }
    // Call export to csv $data.
}
What should be the best way to read the file (in case we have a big file)?
I know file_get_contents() is not an optimized way to do it, and that fgets() is a better approach instead.
Is there a much better way, considering a big file has to be read and then written to CSV?
You need to modify your reader to make it more "lazy" in some sense. For example consider this:
public function read(string $file, callable $rowProcessor) : void
{
    // Open the file in "reading only" mode.
    $fileHandle = fopen($file, "r");

    // If we failed to get a file handle, throw an Exception.
    if ($fileHandle === false) {
        throw new Exception('Could not get file handle for: ' . $file);
    }

    // While we haven't reached the end of the file.
    while (!feof($fileHandle)) {
        // Read the current line in and hand it to the callback.
        $line = json_decode(fgets($fileHandle));
        $rowProcessor($line);
    }

    // Finally, close the file handle.
    fclose($fileHandle);
}
Then you will need different code that works with this:
function processAndWriteJson($filename) { // Names are hard
    $writer = fopen('output.csv', 'w');

    read($filename, function ($row) use ($writer) {
        // Do the processing of the single row here to build $processedRow from $row.
        fputcsv($writer, $processedRow);
    });

    fclose($writer);
}
If you want to get the same result as before with your read method you can do:
$lines = [];
read($filename, function ($row) use (&$lines) {
    $lines[] = $row;
});
It does provide some more flexibility. Unfortunately, it does mean you can only process one line at a time, and scanning up and down the file is harder.
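If you prefer to keep a plain foreach on the calling side, the same laziness can be achieved with a generator instead of a callback. A sketch under the same assumptions (JSON lines in, CSV out; readLazy is a made-up name):
function readLazy(string $file): \Generator
{
    $fileHandle = fopen($file, "r");
    if ($fileHandle === false) {
        throw new Exception('Could not get file handle for: ' . $file);
    }

    // Yield one decoded row at a time; only the current line is held in memory.
    while (($line = fgets($fileHandle)) !== false) {
        yield json_decode($line);
    }

    fclose($fileHandle);
}

$writer = fopen('output.csv', 'w');
foreach (readLazy($filename) as $user) {
    fputcsv($writer, [$user->user_id, strtoupper($user->user_name)]);
}
fclose($writer);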
I am reading a file containing around 50k lines using the file() function in PHP. However, it's giving an out-of-memory error, since the contents of the file are stored in memory as an array. Is there any other way?
Also, the lengths of the lines stored are variable.
Here's the code. Also, the file is 700 kB, not MB.
private static function readScoreFile($scoreFile)
{
    $file = file($scoreFile);
    $relations = array();

    for ($i = 1; $i < count($file); $i++) {
        $relation = explode("\t", trim($file[$i]));
        $relation = array(
            'pwId_1' => $relation[0],
            'pwId_2' => $relation[1],
            'score'  => $relation[2],
        );

        if ($relation['score'] > 0) {
            $relations[] = $relation;
        }
    }

    unset($file);
    return $relations;
}
Use fopen, fread and fclose to read a file sequentially:
$handle = fopen($filename, 'r');
if ($handle) {
    while (!feof($handle)) {
        echo fread($handle, 8192);
    }
    fclose($handle);
}
EDIT after the update of the question and the comments to fabjoa's answer:
There is definitely something fishy if a 700 kB file eats up 140 MB of memory with the code you gave (you could unset $relation at the end of each iteration, though). Consider using a debugger to step through it to see what happens. You might also want to consider rewriting the code to use SplFileObject's CSV functions (or their procedural cousins):
SplFileObject::setCsvControl example
$file = new SplFileObject("data.csv");
$file->setFlags(SplFileObject::READ_CSV);
$file->setCsvControl('|');

foreach ($file as $row) {
    list($fruit, $quantity) = $row;
    // Do something with values
}
For an OOP approach to iterate over the file, try SplFileObject:
SplFileObject::fgets example
$file = new SplFileObject("file.txt");
while (!$file->eof()) {
    echo $file->fgets();
}
SplFileObject::next example
// Read through the file line by line
$file = new SplFileObject("misc.txt");
while (!$file->eof()) {
    echo $file->current();
    $file->next();
}
or even
foreach (new SplFileObject("misc.txt") as $line) {
    echo $line;
}
Pretty much related (if not duplicate):
How to save memory when reading a file in Php?
If you don't know the maximum line length, and you are not comfortable using a magic number for it, then you'll need to do an initial scan of the file to determine the max line length.
Other than that the following code should help you out:
// $length is a large number, or is calculated from an initial scan of the file
while (!feof($handle)) {
    $buffer = fgets($handle, $length);
    echo $buffer;
}
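For completeness, a minimal sketch of that initial scan, assuming the file path is in $scoreFile (as in the question) and $handle is opened afterwards:
// Find the longest line so fgets() can use it as a safe buffer size.
$length = 0;
$scan = fopen($scoreFile, 'r');
while (($line = fgets($scan)) !== false) {
    $length = max($length, strlen($line));
}
fclose($scan);

// fgets() reads at most length - 1 bytes, so add one.
$length++;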
Old question, but since I haven't seen anyone mention it: PHP generators are a great way to reduce memory consumption.
For example:
function read($fileName)
{
    $fileHandler = fopen($fileName, 'rb');

    while (($line = fgets($fileHandler)) !== false) {
        yield rtrim($line, "\r\n");
    }

    fclose($fileHandler);
}
foreach (read(__DIR__ . '/filenameHere') as $line) {
    echo $line;
}
Allocate more memory during the operation, maybe with something like ini_set('memory_limit', '16M');. Don't forget to go back to the initial memory allocation once the operation is done.
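A small sketch of that save-and-restore pattern (the '256M' value is just an example):
// Remember the current limit, raise it for the heavy operation, then restore it.
$previousLimit = ini_get('memory_limit');
ini_set('memory_limit', '256M');

// ... read and process the large file here ...

ini_set('memory_limit', $previousLimit);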
I sense that I am almost there.
Here is a .txt file, which is about 60 Kbytes and full of German words. Every word is on a new line.
I want to iterate through it with this code:
<?php
$file = "GermanWords.txt";
$f = fopen($file, "r");
$parts = explode("\n", $f);
foreach ($parts as &$v)
{
    echo $v;
}
?>
When I execute this code, I get: Resource id #2
The word "Resource" is not in the .txt file; I do not know where it comes from.
How can I manage to show all the words in the txt?
No need for fopen(); just use file_get_contents():
$file = "GermanWords.txt";
$contents = file_get_contents($file);
$lines = explode("\n", $contents); // this is your array of words
foreach($lines as $word) {
echo $word;
}
fopen() just opens the file, it doesn't read it -- In your code, $f contains a file handle, not the file contents. This is where the word "Resource" comes from; it's PHP's internal name for the file handle.
One answer would be to replace fopen() with file_get_contents(). This opens and reads the file in one action. This would solve the problem, but if the file is big, you probably don't want to read the whole thing into memory in one go.
So I would suggest instead using SplFileObject(). The code would look like this:
<?php
$file = "GermanWords.txt";
$parts = new SplFileObject($file);

foreach ($parts as $line) {
    echo $line;
}
?>
It only reads one line into memory at a time, so you don't have to worry about the size of the file.
Hope that helps.
See the PHP manual for more info: http://php.net/manual/en/splfileobject.construct.php
$f, the result of fopen(), is a resource, not the contents of the file. If you just want an array of the lines contained in the file, you can use file():
$parts = file('GermanWords.txt');
foreach ($parts as $v) {
    echo $v;
}
Alternatively, if you want to stick with fopen you can use fread to read the content:
$f = fopen('GermanWords.txt', 'r');

// read the entire file into $contents
$contents = fread($f, filesize('GermanWords.txt'));
fclose($f);

$parts = explode("\n", $contents);
SplFileObject provides a way to do that:
$file = new SplFileObject("file.txt");
while (!$file->eof()) {
    echo $file->fgets();
}
And if you prefer the foreach loop, you can create a generator function for that:
function lines($filename) {
    $file = new SplFileObject($filename);
    while (!$file->eof()) {
        yield $file->fgets();
    }
}

foreach (lines('German.txt') as $line) {
    echo $line;
}
Reading the entire content of the file (with file_get_contents) before treating it can be memory-consuming.
If you want to treat a file line by line, this class might help you.
It implements Iterator (see the PHP docs about it), so it can be walked through in a foreach loop. Only the last line read is stored in memory.
class TxtFileIterator implements \Iterator
{
    protected $fileHandler;
    protected $key;
    protected $current;
    protected $fileName;

    function __construct($fileName)
    {
        $this->fileHandler = fopen($fileName, "r") or die("Unable to open file!");
        $this->fileName = $fileName;
        $this->key = 0;
    }

    function __destruct()
    {
        fclose($this->fileHandler);
    }

    // Iterator interface

    public function current()
    {
        return $this->current;
    }

    public function key()
    {
        return $this->key;
    }

    public function next()
    {
        $this->current = fgets($this->fileHandler);
        $this->key++;
    }

    public function rewind()
    {
        $this->__destruct();
        $this->__construct($this->fileName);
        // Pre-read the first line so current() is usable right after rewinding.
        $this->current = fgets($this->fileHandler);
    }

    public function valid()
    {
        // fgets() returns false once the end of the file has been reached.
        return $this->current !== false;
    }
}
Usage:
$iterator = new TxtFileIterator("German.txt");

foreach ($iterator as $line) {
    echo $line; // or do whatever you want with the line
}