How can I create an array by parsing a large file? [duplicate] - php

I want to read a file line by line, but without completely loading it in memory.
My file is too large to open in memory, and if I try to do so I always get out of memory errors.
The file size is 1 GB.

You can use the fgets() function to read the file line by line:
$handle = fopen("inputfile.txt", "r");
if ($handle) {
    while (($line = fgets($handle)) !== false) {
        // process the line read
    }
    fclose($handle);
}

if ($file = fopen("file.txt", "r")) {
    while (!feof($file)) {
        $line = fgets($file);
        # do some stuff with $line
    }
    fclose($file);
}

You can use an object-oriented interface for the file - SplFileObject: http://php.net/manual/en/splfileobject.fgets.php (PHP 5 >= 5.1.0)
<?php
$file = new SplFileObject("file.txt");
// Loop until we reach the end of the file.
while (!$file->eof()) {
    // Echo one line from the file.
    echo $file->fgets();
}
// Unset the file to call __destruct(), closing the file handle.
$file = null;
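If you don't want each line to keep its trailing newline, SplFileObject can strip it via flags; a small sketch:
$file = new SplFileObject("file.txt");
// DROP_NEW_LINE strips the line ending; SKIP_EMPTY skips blank lines
$file->setFlags(SplFileObject::DROP_NEW_LINE | SplFileObject::SKIP_EMPTY);
foreach ($file as $line) {
    echo $line, "\n";
}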

If you want to use foreach instead of while when opening a big file, you probably want to encapsulate the while loop inside a Generator to avoid loading the whole file into memory:
/**
 * @return Generator
 */
$fileData = function () {
    $file = fopen(__DIR__ . '/file.txt', 'r');
    if (!$file) {
        return; // die() is a bad practice, better to use return
    }
    while (($line = fgets($file)) !== false) {
        yield $line;
    }
    fclose($file);
};
Use it like this:
foreach ($fileData() as $line) {
    // $line contains the current line
}
This way you can process individual file lines inside the foreach().
Note: generators require PHP 5.5 or newer.

There is a file() function that returns an array of the lines contained in the file. Note, however, that it reads the whole file into memory at once, so it is not suitable for the 1 GB file from the question.
foreach (file('myfile.txt') as $line) {
    echo $line . "\n";
}

The obvious answer wasn't in any of the responses: PHP has a neat streaming delimiter parser made for exactly this purpose, stream_get_line(). Note that, unlike fgets(), it does not include the delimiter in the returned line.
$fp = fopen("/path/to/the/file", "r");
while (($line = stream_get_line($fp, 1024 * 1024, "\n")) !== false) {
    echo $line;
}
fclose($fp);

Use buffering techniques to read the file. In this snippet, $old and $new are placeholders for whatever replacement you need to perform; note that a fixed-size buffer can split a match across two reads:
$filename = "test.txt";
$source_file = fopen($filename, "r") or die("Couldn't open $filename");
while (!feof($source_file)) {
    $buffer = fread($source_file, 4096); // use a buffer of 4 KB
    $buffer = str_replace($old, $new, $buffer);
    // ...
}
fclose($source_file);

foreach (new SplFileObject(__FILE__) as $line) {
    echo $line;
}

One of the popular solutions to this question will have issues with the newline character. It can be fixed pretty easily with a simple str_replace.
$handle = fopen("some_file.txt", "r");
if ($handle) {
    while (($line = fgets($handle)) !== false) {
        $line = str_replace("\n", "", $line);
    }
    fclose($handle);
}
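If the file may have Windows line endings, a slightly more robust sketch uses rtrim(), which strips both "\n" and "\r\n":
$line = rtrim($line, "\r\n");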

This is how I manage very big files (tested with up to 100 GB), and it's faster than fgets():
$block = 1024 * 1024; // 1 MB, or could be anything higher than HDD block_size * 2
if ($fh = fopen("file.txt", "r")) {
    $left = '';
    while (!feof($fh)) { // read the file
        $temp = fread($fh, $block);
        $lines = explode("\n", $temp);
        $lines[0] = $left . $lines[0];
        // keep the last, possibly incomplete, line for the next block
        if (!feof($fh)) $left = array_pop($lines);
        foreach ($lines as $k => $line) {
            // do something with $line
        }
    }
    fclose($fh);
}

Be careful with the while (!feof(...)) / fgets() pattern: fgets() can hit an error (returning false) and loop forever without ever reaching the end of the file. codaddict's answer was closest to correct, but when your while/fgets loop ends, check feof(); if it is not true, you had an error.
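A minimal sketch of that check:
$handle = fopen("inputfile.txt", "r");
if ($handle) {
    while (($line = fgets($handle)) !== false) {
        // process the line read
    }
    if (!feof($handle)) {
        // fgets() returned false before end-of-file: a read error occurred
        echo "Error: unexpected fgets() failure\n";
    }
    fclose($handle);
}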

SplFileObject is useful when it comes to dealing with large files.
function parse_file($filename)
{
    try {
        $file = new SplFileObject($filename);
    } catch (LogicException $exception) {
        die('SplFileObject : ' . $exception->getMessage());
    }
    while ($file->valid()) {
        $line = $file->fgets();
        // do something with $line
    }
    // don't forget to free the file handle
    $file = null;
}

<?php
echo '<meta charset="utf-8">';
$k = 1;
$f = 1;
$fp = fopen("texttranslate.txt", "r");
// split the file into chunks of 1500 lines each
while (!feof($fp)) {
    $contents = '';
    for ($i = 1; $i <= 1500; $i++) {
        $line = fgets($fp); // read each line only once
        echo $k . ' -- ' . $line . '<br>';
        $k++;
        $contents .= $line;
    }
    echo '<hr>';
    file_put_contents('Split/new_file_' . $f . '.txt', $contents);
    $f++;
}
?>

Function to read a file and return its contents as an array:
function read_file($filename = '') {
    $buffer = array();
    $source_file = fopen($filename, "r") or die("Couldn't open $filename");
    while (!feof($source_file)) {
        // note: each entry is a 4 KB chunk, not a single line
        $buffer[] = fread($source_file, 4096);
    }
    return $buffer;
}

Related

How to remove a specific line from a CSV file by its line number?

I'm trying to delete one line from a CSV file by its line number, which I get as a parameter in the URL.
I saw some discussions here, but they were mainly about "delete a line by its id stored in the first column" and so on. I tried to do it the same way as others in those discussions, but it does not work. I only changed the condition.
if (isset($_GET['remove']))
{
    $RowNo = $_GET['remove']; // getting the row number
    $row = 1;
    if (($handle = fopen($FileName, "w+")) !== FALSE)
    {
        while (($data = fgetcsv($handle, 1000, ";")) !== FALSE)
        {
            // Here, I don't understand why this condition does not work.
            if ($row != $RowNo)
            {
                fputcsv($handle, $data, ';');
            }
            $row++;
        }
        fclose($handle);
    }
}
I supposed that it should work for me too, because only the condition was changed. But it does not; it clears the whole file. Could you help me with it, please?
Thank you very much for any advice. Daniel.
You could load the file as an array of lines by using file(), then remove the line and write the file back. (As an aside, your code empties the file because fopen() with mode "w+" truncates the file to zero length before anything is read from it.)
// read the file into an array
$fileAsArray = file($FileName);
// the line to delete is the line number minus 1, because arrays begin at zero
$lineToDelete = $_GET['remove'] - 1;
// check that the line to delete actually exists in the file
if ($lineToDelete >= sizeof($fileAsArray)) {
    throw new Exception("Given line number was not found in file.");
}
// remove the line
unset($fileAsArray[$lineToDelete]);
// open the file for writing
if (!is_writable($FileName) || !$fp = fopen($FileName, 'w+')) {
    // print an error
    throw new Exception("Cannot open file ($FileName)");
}
// if $fp is valid
if ($fp) {
    // write the array to the file
    foreach ($fileAsArray as $line) {
        fwrite($fp, $line);
    }
    // close the file
    fclose($fp);
}
If you have a Unix system you could also use the sed command. Note that sed needs -i to edit the file in place, and that its line addresses are 1-based, so pass the original line number rather than the zero-based array index:
exec("sed -i -e '{$RowNo}d' {$FileName}");
Remember to escape command parameters if user input is used:
https://www.php.net/manual/de/function.escapeshellcmd.php
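A minimal sketch of that sanitization, casting the line number to an integer and quoting the filename with escapeshellarg():
// cast the user-supplied line number and quote the filename
$rowNo = (int) $_GET['remove'];
$command = sprintf("sed -i -e '%dd' %s", $rowNo, escapeshellarg($FileName));
exec($command);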
Option if your CSV fits in memory:
// Read the CSV into an array in memory
$lines = file($fileName, FILE_SKIP_EMPTY_LINES | FILE_IGNORE_NEW_LINES);
// Remove the element from the array
unset($lines[$rowNo - 1]); // validate that the element exists!
// Rewrite your CSV file (foreach, because unset() leaves a gap in the numeric keys)
$handle = fopen($fileName, "w+");
foreach ($lines as $line) {
    fwrite($handle, $line . PHP_EOL);
}
fclose($handle);
Option if your CSV does not fit in memory:
Use the code from the question, but write to a separate file and replace the original with it afterwards:
$handle = fopen($FileName, "r");
$row = 1;
// Read the file while not end-of-file
while (($line = fgets($handle)) !== false) {
    if ($row != $RowNo) {
        file_put_contents($FileName . '.tmp', $line, FILE_APPEND);
    }
    $row++;
}
fclose($handle);
// Remove the old file and rename .tmp to the original name
unlink($FileName);
rename($FileName . '.tmp', $FileName);

Read a file part by part in 1000-byte chunks in PHP

I want to read a file line by line and append each line to a variable until its string length reaches 1000 bytes. The file is relatively large, hence what I am doing is:
if (file_exists($file))
{
    $fh = fopen($file, "r");
    while (!feof($fh) or strlen($chunk) < 10001)
    {
        $line = fgets($fh, 1000);
        $chunk = $chunk . "**" . $line;
    }
}
The issue is: how do I store each chunk into an array index until I encounter the end of the file?
What about this:
if (file_exists($file))
{
    $fh = fopen($file, "r");
    $chunks = array();
    while (!feof($fh))
    {
        $line = fgets($fh, 1000);
        // add the line to the buffer
        $chunks[] = $line;
    }
}
? Or am I missing something?
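If the goal is chunks of roughly 1000 bytes rather than an array of single lines, a sketch that accumulates lines into a buffer and flushes it once the threshold from the question is reached:
$chunks = array();
if (file_exists($file)) {
    $fh = fopen($file, "r");
    $chunk = '';
    while (($line = fgets($fh)) !== false) {
        $chunk .= $line;
        if (strlen($chunk) >= 1000) {
            $chunks[] = $chunk; // flush the full buffer into the array
            $chunk = '';
        }
    }
    if ($chunk !== '') {
        $chunks[] = $chunk; // keep the final partial chunk
    }
    fclose($fh);
}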

PHP: text file sentence iterator

I would like to ask about known PHP libraries that may help me parse *.txt files into sentences.
I have to parse very large text files, so I decided to make a stream parser (sentence by sentence).
I thought it would be nice to iterate over the file by sentences, something like:
foreach (new SentenceIterator("./data/huge.txt") as $sentence)
{
    // do something...
}
The main idea is that the file should not be loaded into memory completely.
What I have tried:
$f = fopen("./data/huge.txt", "r");
$dataBytes = 64;
$buffer = '';
while (!feof($f))
{
    $data = fread($f, $dataBytes);
    $dotPosition = strpos($data, '.');
    if (false !== $dotPosition)
    {
        $sentence = $buffer . substr($data, 0, $dotPosition);
        // correct cursor position
        fseek($f, -1 * $dotPosition, SEEK_CUR);
        // clear buffer
        $buffer = '';
        continue;
    }
    $buffer .= $data;
}
But in this case I get corrupted (chopped) sentences.
Could someone suggest some existing libraries, or maybe how to fix my code?
Thanks in advance.
After some digging I have found a solution, which is... the SPL library.
The iterator is SplFileObject, which implements Iterator, RecursiveIterator and SeekableIterator, and it allows reading a file line by line.
The updated, working code is:
$file = new SplFileObject('./data/test.txt');
$file->setFlags(SplFileObject::DROP_NEW_LINE | SplFileObject::SKIP_EMPTY);
$buffer = '';
foreach ($file as $lineNumber => $line)
{
    $dotPos = strpos($line, '.');
    if (false !== $dotPos)
    {
        $sentence = $buffer . substr($line, 0, $dotPos);
        echo $sentence . "\n";
        // keep the text after the dot for the next sentence (skip the dot itself)
        $buffer = substr($line, $dotPos + 1);
        continue;
    }
    $buffer .= $line;
}
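For completeness, a hedged sketch of the iteration the question asked for, written as a generator (the SentenceIterator class from the question is hypothetical, so a plain function is used instead); unlike the code above, it also handles several sentences on one line:
function sentences($path)
{
    $file = new SplFileObject($path);
    $file->setFlags(SplFileObject::DROP_NEW_LINE | SplFileObject::SKIP_EMPTY);
    $buffer = '';
    foreach ($file as $line) {
        $buffer .= ' ' . $line;
        // emit every complete sentence currently in the buffer
        while (($dotPos = strpos($buffer, '.')) !== false) {
            yield trim(substr($buffer, 0, $dotPos + 1));
            $buffer = substr($buffer, $dotPos + 1);
        }
    }
    if (trim($buffer) !== '') {
        yield trim($buffer); // trailing text without a final dot
    }
}

foreach (sentences('./data/huge.txt') as $sentence) {
    // do something...
}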


PHP: readfile() has been disabled for security reasons

I wrote a PHP script that outputs HTML files to the screen using readfile($htmlFile);
however, on the web hosting I have purchased, readfile() has been disabled for security reasons.
Is there any substitute (other PHP functions) for readfile(), or do I have no choice but to ask the admin to enable it for me?
Thanks
You can check which functions are disabled by using:
var_dump(ini_get('disable_functions'));
You can try to use fopen() and fread() instead:
http://nl2.php.net/manual/en/function.fopen.php
http://nl2.php.net/manual/en/function.fread.php
$file = fopen($filename, 'rb');
if ($file !== false) {
    while (!feof($file)) {
        echo fread($file, 4096);
    }
    fclose($file);
}
Or fopen() with fpassthru()
$file = fopen($filename, 'rb');
if ($file !== false) {
    fpassthru($file);
    fclose($file);
}
Alternatively, you can use fwrite() to write the content.
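A minimal sketch of that, streaming the file to the php://output stream with fwrite():
$in = fopen($filename, 'rb');
$out = fopen('php://output', 'wb');
if ($in !== false && $out !== false) {
    while (!feof($in)) {
        fwrite($out, fread($in, 4096)); // copy the file in 4 KB pieces
    }
    fclose($in);
    fclose($out);
}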
You can also try to use file_get_contents()
http://nl2.php.net/file_get_contents
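For example, assuming the file fits in memory:
$contents = file_get_contents($filename);
if ($contents !== false) {
    echo $contents;
}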
Or you can use file()
http://nl2.php.net/manual/en/function.file.php
I wouldn't recommend this method though, but if nothing works...
$data = file($filename);
if ($data !== false) {
    echo implode('', $data);
}
If it's disabled, then you could do something like the following as an alternative:
$file = fopen($yourFileNameHere, 'rb');
if ($file !== false) {
    while (!feof($file)) {
        echo fread($file, 4096);
    }
    fclose($file);
}
// OR
$contents = file_get_contents($yourFileNameHere); // for smaller files
Hope it helps.
You can try:
$path = '/some/path/to/file.html';
$file_string = '';
$file_content = file($path);
// here is the loop
foreach ($file_content as $row) {
    $file_string .= $row;
}
// finally, print it
echo $file_string;
