Reading large files from end

Can I read a file in PHP from the end, for example if I want to read only the last 10-20 lines?
Also, when the file is larger than 10 MB I start getting errors.
How can I prevent these errors?
For reading a normal file, we use this code:
if ($handle) {
while (($buffer = fgets($handle, 4096)) !== false) {
$i1++;
$content[$i1]=$buffer;
}
if (!feof($handle)) {
echo "Error: unexpected fgets() fail\n";
}
fclose($handle);
}
My file might grow beyond 10 MB, but I just need to read the last few lines. How do I do it?
Thanks

You can use fopen and fseek to navigate backwards through the file from the end. For example:
$fp = @fopen($file, "r");
$pos = -2;
while (fgetc($fp) != "\n") {
fseek($fp, $pos, SEEK_END);
$pos = $pos - 1;
}
$lastline = fgets($fp);
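The same backwards walk can be sketched with explicit handling for a trailing newline and for files that contain no newline at all. This is only a sketch, not the code from the answer above: read_last_line() is a hypothetical helper name, and it assumes "\n" (optionally preceded by "\r") as the line ending.

```php
<?php
// Sketch: return the last line of a file without loading the whole file.
// read_last_line() is a hypothetical name; returns null if the file cannot be opened.
function read_last_line(string $path): ?string
{
    $fp = @fopen($path, 'rb');
    if ($fp === false) {
        return null;
    }
    fseek($fp, 0, SEEK_END);
    $pos = ftell($fp);
    // Skip trailing line-ending bytes so a final "\n" is not treated as an empty last line.
    while ($pos > 0) {
        fseek($fp, $pos - 1);
        $c = fgetc($fp);
        if ($c !== "\n" && $c !== "\r") {
            break;
        }
        $pos--;
    }
    $end = $pos;
    // Walk backwards until the previous newline or the start of the file.
    while ($pos > 0) {
        fseek($fp, $pos - 1);
        if (fgetc($fp) === "\n") {
            break;
        }
        $pos--;
    }
    fseek($fp, $pos);
    $line = $end > $pos ? fread($fp, $end - $pos) : '';
    fclose($fp);
    return $line;
}
```

It still reads one byte at a time, so it is only reasonable when the last line is short.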

It's not pure PHP, but a common solution is to use the tac command, which is the reverse of cat and outputs the file's lines in reverse order. Use exec() or passthru() to run it on the server and then read the results. Example usage:
<?php
$myfile = 'myfile.txt';
$command = "tac " . escapeshellarg($myfile) . " > /tmp/myfilereversed.txt";
exec($command);
$currentRow = 0;
$numRows = 20; // stops after this number of rows
$handle = fopen("/tmp/myfilereversed.txt", "r");
while (!feof($handle) && $currentRow < $numRows) {
$currentRow++;
$buffer = fgets($handle, 4096);
echo $buffer."<br>";
}
fclose($handle);
?>

It depends how you interpret "can".
If you are wondering whether you can do this directly (with a single PHP function) without reading all the preceding lines, then the answer is: no, you cannot.
A line ending is an interpretation of the data, and you can only know where line endings are if you actually read the data.
If it is a really big file, I would not read all of it, though.
It would be better to scan the file starting from the end, gradually reading blocks from the end of the file towards the beginning.
Update
Here's a PHP-only way to read the last n lines of a file without reading through all of it:
function last_lines($path, $line_count, $block_size = 512){
$lines = array();
// we will always have a fragment of a non-complete line
// keep this in here till we have our next entire line.
$leftover = "";
$fh = fopen($path, 'r');
// go to the end of the file
fseek($fh, 0, SEEK_END);
do{
// need to know whether we can actually go back
// $block_size bytes
$can_read = $block_size;
if(ftell($fh) < $block_size){
$can_read = ftell($fh);
}
// go back as many bytes as we can
// read them to $data and then move the file pointer
// back to where we were.
fseek($fh, -$can_read, SEEK_CUR);
$data = fread($fh, $can_read);
$data .= $leftover;
fseek($fh, -$can_read, SEEK_CUR);
// split lines by \n. Then reverse them,
// now the last line is most likely not a complete
// line which is why we do not directly add it, but
// append it to the data read the next time.
$split_data = array_reverse(explode("\n", $data));
$new_lines = array_slice($split_data, 0, -1);
$lines = array_merge($lines, $new_lines);
$leftover = $split_data[count($split_data) - 1];
}
while(count($lines) < $line_count && ftell($fh) != 0);
if(ftell($fh) == 0){
$lines[] = $leftover;
}
fclose($fh);
// Usually, we will read too many lines, correct that here.
return array_slice($lines, 0, $line_count);
}

Following snippet worked for me.
$file = popen("tac $filename",'r');
while ($line = fgets($file)) {
echo $line;
}
Reference: http://laughingmeme.org/2008/02/28/reading-a-file-backwards-in-php/

If your code is not working and reports an error, you should include the error in your post!
The reason you are getting an error is that you are trying to store the entire contents of the file in PHP's memory space.
The most efficient way to solve the problem would be, as Greenisha suggests, to seek to the end of the file and then go back a bit. But Greenisha's mechanism for going back a bit is not very efficient.
Consider instead the method for getting the last few lines from a stream (i.e. where you can't seek):
while (($buffer = fgets($handle, 4096)) !== false) {
$i1++;
$content[$i1]=$buffer;
unset($content[$i1-$lines_to_keep]);
}
So if you know that your max line length is 4096, then you would:
if (4096*$lines_to_keep<filesize($input_file)) {
fseek($fp, -4096*$lines_to_keep, SEEK_END);
}
Then apply the loop I described previously.
Since C has some more efficient methods for dealing with byte streams, the fastest solution (on a POSIX/Unix/Linux/BSD system) would be simply:
exec("tail -n " . (int)$lines_to_keep . " " . escapeshellarg($input_file), $last_lines);
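The keep-only-the-last-N loop described in this answer can be sketched as a self-contained function. This is a minimal sketch: last_lines_from_stream() is a hypothetical name, and it uses array_shift() rather than unset() so the array never holds more than $keep lines.

```php
<?php
// Sketch: keep only the last $keep lines while reading a stream forward.
// Works on non-seekable streams too, because it never calls fseek().
function last_lines_from_stream($handle, int $keep): array
{
    $window = [];
    if ($keep < 1) {
        return $window;
    }
    while (($line = fgets($handle)) !== false) {
        $window[] = $line;
        if (count($window) > $keep) {
            array_shift($window); // drop the oldest line
        }
    }
    return $window;
}
```

Memory use stays bounded by $keep lines, but every byte of the stream is still read, so on a seekable file it pays to fseek() close to the end first.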

For Linux you can do
$linesToRead = 10;
exec("tail -n{$linesToRead} {$myFileName}" , $content);
You will get an array of lines in $content variable
Pure PHP solution
$f = fopen($myFileName, 'r');
$maxLineLength = 1000; // Real maximum length of your records
$linesToRead = 10;
fseek($f, -$maxLineLength*$linesToRead, SEEK_END); // Moves cursor back from the end of file
$res = array();
while (($buffer = fgets($f, $maxLineLength)) !== false) {
$res[] = $buffer;
}
$content = array_slice($res, -$linesToRead);
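One detail to watch in the pure PHP variant: if the file is shorter than $maxLineLength * $linesToRead bytes, the negative SEEK_END offset points before the start of the file and fseek() reports failure (it returns -1). A minimal sketch that clamps the offset instead, assuming tail_bounded() as a hypothetical name:

```php
<?php
// Sketch: read the last $linesToRead lines, assuming no line is longer than
// $maxLineLength bytes including its newline. tail_bounded() is a hypothetical name.
function tail_bounded(string $path, int $linesToRead, int $maxLineLength): array
{
    if ($linesToRead < 1) {
        return [];
    }
    $f = fopen($path, 'rb');
    if ($f === false) {
        return [];
    }
    // Take the size from the handle itself, then clamp the start offset to 0.
    fseek($f, 0, SEEK_END);
    $size = ftell($f);
    $start = max(0, $size - $maxLineLength * $linesToRead);
    fseek($f, $start);
    $res = [];
    while (($buffer = fgets($f)) !== false) {
        $res[] = rtrim($buffer, "\r\n");
    }
    fclose($f);
    // The first element may be a partial line; slicing from the end drops it
    // whenever at least $linesToRead complete lines were read.
    return array_slice($res, -$linesToRead);
}
```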

If you know about how long the lines are, you can avoid a lot of the black magic and just grab a chunk of the end of the file.
I needed the last 15 lines from a very large log file, and altogether they were about 3000 characters. So I just grab the last 8000 bytes to be safe, then read the file as normal and take what I need from the end.
$fh = fopen($file, "r");
fseek($fh, -8192, SEEK_END);
$lines = array();
while($lines[] = fgets($fh)) {}
This is possibly even more efficient than the highest rated answer, which reads the file character by character, compares each character, and splits based on newline characters.

Here is another solution. It doesn't have line length control in fgets(), but you can add it.
/* Read file from end line by line */
$fp = fopen( dirname(__FILE__) . '\\some_file.txt', 'r');
$lines_read = 0;
$lines_to_read = 1000;
fseek($fp, 0, SEEK_END); //goto EOF
$eol_size = 2; // for windows is 2, rest is 1
$eol_char = "\r\n"; // mac=\r, unix=\n
while ($lines_read < $lines_to_read) {
if (ftell($fp)==0) break; //break on BOF (beginning...)
do {
fseek($fp, -1, SEEK_CUR); //seek 1 by 1 char from EOF
$eol = fgetc($fp) . fgetc($fp); //search for EOL (remove 1 fgetc if needed)
fseek($fp, -$eol_size, SEEK_CUR); //go back for EOL
} while ($eol != $eol_char && ftell($fp)>0 ); //check EOL and BOF
$position = ftell($fp); //save current position
if ($position != 0) fseek($fp, $eol_size, SEEK_CUR); //move for EOL
echo fgets($fp); //read LINE or do whatever is needed
fseek($fp, $position, SEEK_SET); //set current position
$lines_read++;
}
fclose($fp);

Well, while searching for the same thing, I came across the following and thought it might be useful to others as well, so I am sharing it here:
/* Read file from end line by line */
function tail_custom($filepath, $lines = 1, $adaptive = true) {
// Open file
$f = @fopen($filepath, "rb");
if ($f === false) return false;
// Sets buffer size, according to the number of lines to retrieve.
// This gives a performance boost when reading a few lines from the file.
if (!$adaptive) $buffer = 4096;
else $buffer = ($lines < 2 ? 64 : ($lines < 10 ? 512 : 4096));
// Jump to last character
fseek($f, -1, SEEK_END);
// Read it and adjust line number if necessary
// (Otherwise the result would be wrong if file doesn't end with a blank line)
if (fread($f, 1) != "\n") $lines -= 1;
// Start reading
$output = '';
$chunk = '';
// While we would like more
while (ftell($f) > 0 && $lines >= 0) {
// Figure out how far back we should jump
$seek = min(ftell($f), $buffer);
// Do the jump (backwards, relative to where we are)
fseek($f, -$seek, SEEK_CUR);
// Read a chunk and prepend it to our output
$output = ($chunk = fread($f, $seek)) . $output;
// Jump back to where we started reading
fseek($f, -mb_strlen($chunk, '8bit'), SEEK_CUR);
// Decrease our line counter
$lines -= substr_count($chunk, "\n");
}
// While we have too many lines
// (Because of buffer size we might have read too many)
while ($lines++ < 0) {
// Find first newline and remove all text before that
$output = substr($output, strpos($output, "\n") + 1);
}
// Close file and return
fclose($f);
return trim($output);
}

As Einstein said, everything should be made as simple as possible, but no simpler. At this point you need a data structure: a LIFO data structure or, simply put, a stack.

A more complete example of the "tail" suggestion above is provided here. This seems to be a simple and efficient method -- thank-you. Very large files should not be an issue and a temporary file is not required.
$buf = array();
$ret = null;
// capture the last 30 lines of the log file into a buffer
exec('tail -30 ' . $weatherLog, $buf, $ret);
if ( $ret == 0 ) {
// process the captured lines one at a time
foreach ($buf as $line) {
$n = sscanf($line, "%s temperature %f", $dt, $t);
if ( $n > 0 ) $temperature = $t;
$n = sscanf($line, "%s humidity %f", $dt, $h);
if ( $n > 0 ) $humidity = $h;
}
printf("<tr><th>Temperature</th><td>%0.1f</td></tr>\n",
$temperature);
printf("<tr><th>Humidity</th><td>%0.1f</td></tr>\n", $humidity);
}
else {
// something bad happened
}
In the above example, the code reads 30 lines of text output and displays the last temperature and humidity readings in the file (that's why the printf's are outside of the loop, in case you were wondering). The file is filled by an ESP32 which adds to the file every few minutes even when the sensor reports only nan. So thirty lines gets plenty of readings so it should never fail. Each reading includes the date and time so in the final version the output will include the time the reading was taken.

Related

How can I read newly appended lines from a LARGE (4GB+) open file?

Using PHP 7.3, I'm trying to achieve "tail -f" functionality: open a file, waiting for some other process to write to it, then read those new lines.
Unfortunately, it seems that fgets() caches the EOF condition. Even when there's new data available (filemtime changes), fgets() returns a blank line.
The important part: I cannot simply close, reopen, then seek, because the file size is tens of gigs in size, well above the 32 bit limit. The file must stay open in order to be able to read new data from the correct position.
I've attached some code to demonstrate the problem. If you append data to the input file, filemtime() detects the change, but fgets() reads nothing new.
fread() does seem to work, picking up the new data but I'd rather not have to come up with a roll-your-own "read a line" solution.
Does anyone know how I might be able to poke fgets() into realising that it's not the EOF?
$fn = $argv[1];
$fp = fopen($fn, "r");
fseek($fp, -1000, SEEK_END);
$filemtime = 0;
while (1) {
if (feof($fp)) {
echo "got EOF\n";
sleep(1);
clearstatcache();
$tmp = filemtime($fn);
if ($tmp != $filemtime) {
echo "time $filemtime -> $tmp\n";
$filemtime = $tmp;
}
}
$l = trim(fgets($fp, 8192));
echo "l=$l\n";
}
Update: I tried excluding the call to feof (thinking that may be where the state becomes cached) but the behaviour doesn't change; once fgets reaches the original file pointer position, any further fgets reads will return false, even if more data is subsequently appended.
Update 2: I ended up rolling my own function that will continue returning new data after the first EOF is reached (in fact, it has no concept of EOF, just data available / data not available). Code not heavily tested, so use at your own risk. Hope this helps someone else.
*** NOTE this code was updated 20th June 2021 to fix an off-by-one error. The comment "includes line separator" was incorrect up to this point.
define('FGETS_TAIL_CHUNK_SIZE', 4096);
define('FGETS_TAIL_SANITY', 65536);
define('FGETS_TAIL_LINE_SEPARATOR', 10);
function fgets_tail($fp) {
// Get complete line from open file which may have additional data written to it.
// Returns string (including line separator) or FALSE if there is no line available (buffer does not have complete line, or is empty because of EOF)
global $fgets_tail_buf;
if (!isset($fgets_tail_buf)) $fgets_tail_buf = "";
if (strlen($fgets_tail_buf) < FGETS_TAIL_CHUNK_SIZE) { // buffer not full, attempt to append data to it
$t = fread($fp, FGETS_TAIL_CHUNK_SIZE);
if ($t != false) $fgets_tail_buf .= $t;
}
$ptr = strpos($fgets_tail_buf, chr(FGETS_TAIL_LINE_SEPARATOR));
if ($ptr !== false) {
$rv = substr($fgets_tail_buf, 0, $ptr + 1); // includes line separator
$fgets_tail_buf = substr($fgets_tail_buf, $ptr + 1); // may reduce buffer to empty
return($rv);
} else {
if (strlen($fgets_tail_buf) < FGETS_TAIL_SANITY) { // line separator not found, try to append some more data
$t = fread($fp, FGETS_TAIL_CHUNK_SIZE);
if ($t != false) $fgets_tail_buf .= $t;
}
}
return(false);
}

The author found the solution himself: how to build a PHP tail viewer for giant log files of 4+ GB in size.
So that this question is marked as answered, here is a summary: the solution is the fgets_tail() function from Update 2 of the question, which buffers fread() output and returns complete lines as they become available.
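The same buffering idea can be sketched without the global variable, so that several files can be followed at once. This is a minimal sketch rather than the author's code: LineTailer is a hypothetical name, there is no sanity cap on the buffer, and each call reads at most one chunk, so a very long line needs several calls before it is returned.

```php
<?php
// Sketch: per-handle line buffering built on fread(), which (as noted in the
// question) keeps picking up appended data after EOF was reached.
final class LineTailer
{
    private $fp;
    private $chunkSize;
    private $buf = '';

    public function __construct($fp, int $chunkSize = 4096)
    {
        $this->fp = $fp;
        $this->chunkSize = $chunkSize;
    }

    // Returns the next complete line (including "\n"), or null if none is available yet.
    public function nextLine(): ?string
    {
        $nl = strpos($this->buf, "\n");
        if ($nl === false) {
            // No complete line buffered: try to pull in one more chunk.
            $chunk = fread($this->fp, $this->chunkSize);
            if ($chunk !== false && $chunk !== '') {
                $this->buf .= $chunk;
                $nl = strpos($this->buf, "\n");
            }
        }
        if ($nl === false) {
            return null;
        }
        $line = substr($this->buf, 0, $nl + 1);
        $this->buf = substr($this->buf, $nl + 1);
        return $line;
    }
}
```

A caller would typically loop, sleeping briefly whenever nextLine() returns null.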

Tailing Log File and Write results to new file

I'm not sure how to word this so I'll type it out and then edit and answer any questions that come up..
Currently on my local network device (PHP4 based) I'm using this to tail a live system log file: http://commavee.com/2007/04/13/ajax-logfile-tailer-viewer/
This works well: every second it loads an external page (logfile.php) that runs tail -n 100 logfile.log. The script doesn't do any buffering, so the results it displays on screen are the last 100 lines of the log file.
The logfile.php contains :
<?
// logtail.php
$cmd = "tail -10 /path/to/your/logs/some.log";
exec("$cmd 2>&1", $output);
foreach($output as $outputline) {
echo ("$outputline\n");
}
?>
This part is working well.
I have adapted the logfile.php page to write the $outputline to a new text file, simply using fwrite($fp,$outputline."\n");
Whilst this works, I am having issues with duplication in the new file that is created.
Obviously, each time tail -n 100 runs it produces results, and the next run can return some of the same lines; as this repeats I end up with many duplicated lines in the new text file.
I can't directly compare the line I'm about to write to previous lines, as there could be identical matches.
Is there any way I can compare the current block of 100 lines with the previous block and then only write the lines that do not match? Again, a possible issue is that blocks A and B will contain identical lines that are needed...
Is it possible to update logfile.php to note the position it last looked at in my log file, and then only read the next 100 lines from there and write those to the new file?
The log file could be up to 500 MB, so I don't want to read it all in each time.
Any advice or suggestions welcome..
Thanks
UPDATE @ 16:30
I've sort of got this working using :
$file = "/logs/syst.log";
$handle = fopen($file, "r");
if(isset($_SESSION['ftell'])) {
clearstatcache();
fseek($handle, $_SESSION['ftell']);
while ($buffer = fgets($handle)) {
echo $buffer."<br/>";
@ob_flush(); @flush();
}
fclose($handle);
@$_SESSION['ftell'] = ftell($handle);
} else {
fseek($handle, -1024, SEEK_END);
fclose($handle);
@$_SESSION['ftell'] = ftell($handle);
}
This seems to work, but it loads the entire file first and then just the updates.
How would I get it start with the last 50 lines and then just the updates ?
Thanks :)
UPDATE 04/06/2013
Whilst this works it's very slow with large files.
I've tried this code and it seems faster, but it doesn't just read from where it left off.
function last_lines($path, $line_count, $block_size = 512){
$lines = array();
// we will always have a fragment of a non-complete line
// keep this in here till we have our next entire line.
$leftover = "";
$fh = fopen($path, 'r');
// go to the end of the file
fseek($fh, 0, SEEK_END);
do{
// need to know whether we can actually go back
// $block_size bytes
$can_read = $block_size;
if(ftell($fh) < $block_size){
$can_read = ftell($fh);
}
// go back as many bytes as we can
// read them to $data and then move the file pointer
// back to where we were.
fseek($fh, -$can_read, SEEK_CUR);
$data = fread($fh, $can_read);
$data .= $leftover;
fseek($fh, -$can_read, SEEK_CUR);
// split lines by \n. Then reverse them,
// now the last line is most likely not a complete
// line which is why we do not directly add it, but
// append it to the data read the next time.
$split_data = array_reverse(explode("\n", $data));
$new_lines = array_slice($split_data, 0, -1);
$lines = array_merge($lines, $new_lines);
$leftover = $split_data[count($split_data) - 1];
}
while(count($lines) < $line_count && ftell($fh) != 0);
if(ftell($fh) == 0){
$lines[] = $leftover;
}
fclose($fh);
// Usually, we will read too many lines, correct that here.
return array_slice($lines, 0, $line_count);
}
Is there any way this can be amended so that it reads from the last known position?
Thanks

Introduction
You can tail a file by tracking the last position.
Example
$file = __DIR__ . "/a.log";
$tail = new TailLog($file);
$data = $tail->tail(100) ;
// Save $data to new file
TailLog is a simple class I wrote for this task. Here is a simple example to show that it is actually tailing the file:
Simple Test
$file = __DIR__ . "/a.log";
$tail = new TailLog($file);
// Some Random Data
$data = array_chunk(range("a", "z"), 3);
// Write Log
file_put_contents($file, implode("\n", array_shift($data)));
// First Tail (2) Run
print_r($tail->tail(2));
// Run Tail (2) Again
print_r($tail->tail(2));
// Write Another data to Log
file_put_contents($file, "\n" . implode("\n", array_shift($data)), FILE_APPEND);
// Call Tail Again after writing Data
print_r($tail->tail(2));
// See the full content
print_r(file_get_contents($file));
Output
// First Tail (2) Run
Array
(
[0] => c
[1] => b
)
// Run Tail (2) Again
Array
(
)
// Call Tail Again after writing Data
Array
(
[0] => f
[1] => e
)
// See the full content
a
b
c
d
e
f
Real Time Tailing
while(true) {
$data = $tail->tail(100);
// write data to another file
sleep(5);
}
Note: Tailing 100 lines does not mean it will always return 100 lines. It returns the newly added lines; 100 is just the maximum number of lines to return. This might not be efficient if you have heavy logging of more than 100 lines per second.
Tail Class
class TailLog {
private $file;
private $data;
private $timeout = 5;
private $lock;
function __construct($file) {
$this->file = $file;
$this->lock = new TailLock($file);
}
public function tail($lines) {
$pos = - 2;
$t = $lines;
$fp = fopen($this->file, "r");
$break = false;
$line = "";
$text = array();
while($t > 0) {
$c = "";
// Search for end of line
while($c != "\n" && $c != PHP_EOL) {
if (fseek($fp, $pos, SEEK_END) == - 1) {
$break = true;
break;
}
if (ftell($fp) < $this->lock->getPosition()) {
break;
}
$c = fgetc($fp);
$pos --;
}
if (ftell($fp) < $this->lock->getPosition()) {
break;
}
$t --;
$break && rewind($fp);
$text[$lines - $t - 1] = fgets($fp);
if ($break) {
break;
}
}
// Move to end
fseek($fp, 0, SEEK_END);
// Save Position
$this->lock->save(ftell($fp));
// Close File
fclose($fp);
return array_map("trim", $text);
}
}
Tail Lock
class TailLock {
private $file;
private $lock;
private $data;
function __construct($file) {
$this->file = $file;
$this->lock = $file . ".tail";
touch($this->lock);
if (! is_file($this->lock))
throw new Exception("can't Create Lock File");
$this->data = json_decode(file_get_contents($this->lock));
// Check if the file is valid JSON
// Check that data in the original file has not been deleted
// You expect the data to increase, not decrease
if (! $this->data || $this->data->size > filesize($this->file)) {
$this->reset();
}
}
function getPosition() {
return $this->data->position;
}
function reset() {
$this->data = new stdClass();
$this->data->size = filesize($this->file);
$this->data->modification = filemtime($this->file);
$this->data->position = 0;
$this->update();
}
function save($pos) {
$this->data = new stdClass();
$this->data->size = filesize($this->file);
$this->data->modification = filemtime($this->file);
$this->data->position = $pos;
$this->update();
}
function update() {
return file_put_contents($this->lock, json_encode($this->data, 128));
}
}

Not really clear on how you want to use the output but would something like this work ....
$dat = file_get_contents("tracker.dat");
$fp = fopen("/logs/syst.log", "r");
fseek($fp, (int) $dat, SEEK_SET);
ob_start();
// alternatively you can do a while fgets if you want to interpret the file or do something
fpassthru($fp);
$pos = ftell($fp);
fclose($fp);
echo nl2br(ob_get_clean());
file_put_contents("tracker.dat", $pos);
tracker.dat is just a text file that contains the read position from the previous run. I'm just seeking to that position and piping the rest to the output buffer.
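To also cover the "start with the last few lines, then only the updates" part of the question, the tracker idea can be sketched as one function. This is only a sketch: read_new_data(), $stateFile and $firstRunBytes are hypothetical names, and the first run returns the last $firstRunBytes bytes rather than an exact number of lines.

```php
<?php
// Sketch: return only the data appended since the previous call, remembering
// the byte offset in a side file between runs.
function read_new_data(string $logFile, string $stateFile, int $firstRunBytes = 1024): string
{
    $fp = fopen($logFile, 'rb');
    if ($fp === false) {
        return '';
    }
    fseek($fp, 0, SEEK_END);
    $size = ftell($fp);

    if (is_file($stateFile)) {
        $pos = (int) file_get_contents($stateFile);
        if ($pos > $size) {
            $pos = 0; // the log was truncated or rotated: start over
        }
    } else {
        // First run: begin a fixed number of bytes before the end.
        $pos = max(0, $size - $firstRunBytes);
    }

    fseek($fp, $pos);
    $data = $size > $pos ? stream_get_contents($fp, $size - $pos) : '';
    fclose($fp);
    // Store the size seen at the start, so data appended meanwhile is not skipped.
    file_put_contents($stateFile, (string) $size);
    return $data === false ? '' : $data;
}
```

On the first run the returned data may begin in the middle of a line; dropping everything up to the first "\n" avoids showing a fragment.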

Use tail -c <number of bytes> instead of a number of lines, and then check the file size. The rough idea is:
$old_file_size = 0;
$max_bytes = 512;
function last_lines($path) {
global $old_file_size, $max_bytes;
clearstatcache(true, $path);
$new_file_size = filesize($path);
$pending_bytes = $new_file_size - $old_file_size;
if ($pending_bytes > $max_bytes) $pending_bytes = $max_bytes;
exec("tail -c " . $pending_bytes . " " . escapeshellarg($path), $output);
$old_file_size = $new_file_size;
return $output;
}
The advantage is that you can do away with all the special processing and get good performance. The disadvantage is that you have to split the output into lines yourself, and you could end up with an unfinished last line. But this isn't a big deal: you can work around it by omitting the last line from the output (and subtracting the number of bytes in that line from old_file_size).
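The "omit the unfinished last line" step could look roughly like this. It is a minimal sketch: complete_lines() is a hypothetical helper that takes a chunk of bytes (from tail -c or fread()) and reports how many bytes belong to complete lines.

```php
<?php
// Sketch: split a chunk into complete lines and report how many bytes were
// consumed, so an unfinished last line can be read again on the next run.
function complete_lines(string $chunk): array
{
    $lastNl = strrpos($chunk, "\n");
    if ($lastNl === false) {
        return ['lines' => [], 'consumed' => 0]; // no complete line yet
    }
    $complete = substr($chunk, 0, $lastNl);
    return [
        'lines' => explode("\n", $complete),
        'consumed' => $lastNl + 1, // bytes up to and including the last newline
    ];
}
```

Advancing the stored offset by 'consumed' instead of by the full chunk length means the partial line is picked up, complete, next time.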

PHP: Retrieving lines from the end of a large text file

I've searched for an answer for quite a while, and haven't found anything that works correctly.
I have log files, some reaching 100MB in size, around 140,000 lines of text.
With PHP, I am trying to get the last 500 lines of the file.
How would I get the 500 lines? With most functions, the file is read into memory, and that isn't a plausible case for this matter. I would preferably stay away from executing system commands.

If you are on a 'nix machine, you should be able to use shell escaping and the tool 'tail'.
It's been a while, but something like this:
$lastLines = `tail -n 500 $logFile`;
Notice the use of backticks, which execute the string in a shell (Bash or similar) and return the output. Here $logFile stands for the path of your log file.

I wrote this function which seems to work quite nicely to me. It returns an array of lines just like file. If you want it to return a string like file_get_contents, then just change the return statement to return implode('', array_reverse($lines));:
function file_get_tail($filename, $num_lines = 10){
$file = fopen($filename, "r");
fseek($file, -1, SEEK_END);
for ($line = 0, $lines = array(); $line < $num_lines && false !== ($char = fgetc($file));) {
if($char === "\n"){
if(isset($lines[$line])){
$lines[$line][] = $char;
$lines[$line] = implode('', array_reverse($lines[$line]));
$line++;
}
}else
$lines[$line][] = $char;
fseek($file, -2, SEEK_CUR);
}
fclose($file);
if($line < $num_lines)
$lines[$line] = implode('', array_reverse($lines[$line]));
return array_reverse($lines);
}
Example:
file_get_tail('filename.txt', 500);

If you want to do it in PHP:
<?php
/**
Read last N lines from file.
@param $filename string path to file. must support seeking
@param $n int number of lines to get.
@return array up to $n lines of text
*/
function tail($filename, $n)
{
$buffer_size = 1024;
$fp = fopen($filename, 'r');
if (!$fp) return array();
fseek($fp, 0, SEEK_END);
$pos = ftell($fp);
$input = '';
$line_count = 0;
while ($line_count < $n + 1)
{
// read the previous block of input
$read_size = $pos >= $buffer_size ? $buffer_size : $pos;
fseek($fp, $pos - $read_size, SEEK_SET);
// prepend the current block, and count the new lines
$input = fread($fp, $read_size).$input;
$line_count = substr_count(ltrim($input), "\n");
// if $pos is == 0 we are at start of file
$pos -= $read_size;
if (!$pos) break;
}
fclose($fp);
// return the last $n lines found
return array_slice(explode("\n", rtrim($input)), -$n);
}
var_dump(tail('/var/log/syslog', 50));
This is largely untested, but should be enough for you to get a fully working solution.
The buffer size is 1024, but can be changed to be bigger or smaller. (You could even set it dynamically based on $n * an estimate of the line length.) This should be better than seeking character by character, although it does mean we need to call substr_count() to look for newlines.

PHP: How to read a file live that is constantly being written to

I want to read a log file that is constantly being written to. It resides on the same server as the application. The catch is the file gets written to every few seconds, and I basically want to tail the file on the application in real-time.
Is this possible?

You need to loop with sleep:
$file='/home/user/youfile.txt';
$lastpos = 0;
while (true) {
usleep(300000); //0.3 s
clearstatcache(false, $file);
$len = filesize($file);
if ($len < $lastpos) {
//file deleted or reset
$lastpos = $len;
}
elseif ($len > $lastpos) {
$f = fopen($file, "rb");
if ($f === false)
die();
fseek($f, $lastpos);
while (!feof($f)) {
$buffer = fread($f, 4096);
echo $buffer;
flush();
}
$lastpos = ftell($f);
fclose($f);
}
}
(tested.. it works)

Yes, you need to sleep some time in the loop but you don't have to reopen the file. I was just looking for a similar problem. I wanted to read a file that might have been changed since last read.
The problem is that the resource has reached end of file (EOF). And does not continue to read. The solution is to reset the pointer with fseek($fh, ftell($fh)).
A complete program that waits for input in a text file might look like this one:
<?php
$fh = fopen('/var/log/system', 'r');
while (true) {
$line = fgets($fh);
if ($line !== false) {
// show the line or send it via email or to a websocket..
} else {
// sleep for 0.1 seconds (or more?)
usleep(0.1 * 1000000);
fseek($fh, ftell($fh));
}
}
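The keep-the-handle-open loop can be wrapped in a generator so that callers just iterate over new lines. This is a sketch under assumptions: follow_file() is a hypothetical name, and $maxIdlePolls exists only so the loop can end (pass PHP_INT_MAX to follow indefinitely).

```php
<?php
// Sketch: yield lines as they are appended to a file, without reopening it.
function follow_file(string $path, int $sleepMicros = 100000, int $maxIdlePolls = 10): Generator
{
    $fh = fopen($path, 'rb');
    if ($fh === false) {
        return;
    }
    $idle = 0;
    $partial = '';
    while ($idle < $maxIdlePolls) {
        $line = fgets($fh);
        if ($line === false) {
            $idle++;
            usleep($sleepMicros);
            fseek($fh, ftell($fh)); // reset the EOF state so fgets() tries again
            continue;
        }
        $idle = 0;
        $partial .= $line;
        // Only hand out finished lines; a writer may be halfway through one.
        if (substr($partial, -1) === "\n") {
            yield rtrim($partial, "\r\n");
            $partial = '';
        }
    }
    fclose($fh);
}
```

Usage would be a plain foreach (follow_file('/var/log/system') as $line) { ... } loop.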

For example :
$log_file = '/tmp/test/log_file.log';
$f = fopen($log_file, 'a+');
$fr = fopen($log_file, 'r' );
for ( $i = 1; $i < 10; $i++ )
{
fprintf($f, "Line: %u\n", $i);
sleep(2);
echo fread($fr, 1024) . "\n";
}
fclose($fr);
fclose($f);
//Or if you want use tail
$f = fopen($log_file, 'a+');
for ( $i = 1; $i < 10; $i++ )
{
fprintf($f, "Line: %u\n", $i);
sleep(2);
$result = array();
exec( 'tail -n 1 ' . $log_file, $result );
echo "\n".$result[0];
}
fclose($f);

You can close the file handle when it is not in use (once a portion of data has been written). Or you can use a buffer to store the data and write it to the file only when the buffer is full. This way you won't have the file open all the time.
If you want to get everything that is written to the file as soon as it is written, you might need to extend the code that writes the data so that it also outputs to other places (screen, some variable, another file...).

<?php
$fp = fopen('/var/log/syslog', 'r');// Read only
while (true) {
$line = stream_get_line($fp, 1024 * 1024, "\n");// Full line found ? (searches for a line break)
if ($line === false) {
usleep(100000);// 100ms
continue;
}
echo 'line:' . $line . PHP_EOL;
}
// -- Code impossible to reach --
// fclose($fp);

Just an idea..
Did you think of using the *nix tail command? execute the command from php (with a param that will return a certain number of lines) and process the results in your php script.

Is this the most efficient way to get and remove first line in file?

I have a script which, each time is called, gets the first line of a file. Each line is known to be exactly of the same length (32 alphanumeric chars) and terminates with "\r\n".
After getting the first line, the script removes it.
This is done in this way:
$contents = file_get_contents($file);
$first_line = substr($contents, 0, 32);
file_put_contents($file, substr($contents, 32 + 2)); //+2 because we remove also the \r\n
Obviously it works, but I was wondering whether there is a smarter (or more efficient) way to do this?
In my simple solution I basically read and rewrite the entire file just to take and remove the first line.

I came up with this idea yesterday:
function read_and_delete_first_line($filename) {
$file = file($filename);
$output = $file[0];
unset($file[0]);
file_put_contents($filename, $file);
return $output;
}

There is no more efficient way to do this other than rewriting the file.

No need to create a second temporary file, nor put the whole file in memory:
if ($handle = fopen("file", "c+")) { // open the file in reading and editing mode
if (flock($handle, LOCK_EX)) { // lock the file, so no one can read or edit this file
while (($line = fgets($handle, 4096)) !== FALSE) {
if (!isset($write_position)) { // move the line to previous position, except the first line
$write_position = 0;
} else {
$read_position = ftell($handle); // get actual line
fseek($handle, $write_position); // move to previous position
fputs($handle, $line); // put actual line in previous position
fseek($handle, $read_position); // return to actual position
$write_position += strlen($line); // set write position to the next loop
}
}
fflush($handle); // write any pending change to file
ftruncate($handle, $write_position); // drop the repeated last line
flock($handle, LOCK_UN); // unlock the file
}
fclose($handle);
}

This will shift the first line off a file. You don't need to load the entire file into memory, as you do with the file() function. For small files it may be a bit slower than file() (maybe, but I bet it is not), but it can handle very large files without problems.
$firstline = false;
if($handle = fopen($logFile,'c+')){
if(!flock($handle,LOCK_EX)){fclose($handle);}
$offset = 0;
$len = filesize($logFile);
while(($line = fgets($handle,4096)) !== false){
if(!$firstline){$firstline = $line;$offset = strlen($firstline);continue;}
$pos = ftell($handle);
fseek($handle,$pos-strlen($line)-$offset);
fputs($handle,$line);
fseek($handle,$pos);
}
fflush($handle);
ftruncate($handle,($len-$offset));
flock($handle,LOCK_UN);
fclose($handle);
}

You can iterate over the file instead of putting it all in memory:
$file = "file";
$handle = fopen($file, "r");
$first = fgets($handle,2048); #get first line.
$outfile="temp";
$o = fopen($outfile,"w");
while (!feof($handle)) {
$buffer = fgets($handle,2048);
fwrite($o,$buffer);
}
fclose($handle);
fclose($o);
rename($outfile,$file);

I wouldn't usually recommend opening up a shell for this sort of thing, but if you're doing this infrequently on really large files, there's probably something to be said for:
$lines = (int) `wc -l < myfile` - 1;
`tail -n $lines myfile > newfile`;
It's simple, and it doesn't involve reading the whole file into memory.
I wouldn't recommend this for small files, or extremely frequent use though. The overhead's too high.

You could store positional info into the file itself. For example, the first 8 bytes of the file could store an integer. This integer is the byte offset of the first real line in the file.
So, you never delete lines anymore. Instead, deleting a line means altering the start position. fseek() to it and then read lines as normal.
The file will grow big eventually. You could periodically clean up the orphaned lines to reduce the file size.
But seriously, just use a database and don't do stuff like this.
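For what it's worth, the offset-header idea can be sketched in a few lines. Everything here is an assumption for illustration: HEADER_SIZE, queue_init() and queue_shift() are hypothetical names, the header is written with pack('J', ...) (an unsigned 64-bit big-endian integer, which needs a 64-bit PHP build), and the file is assumed to have been created by queue_init().

```php
<?php
// Sketch: the first 8 bytes of the file hold the byte offset of the first live line.
const HEADER_SIZE = 8;

function queue_init(string $path, array $lines): void
{
    $body = $lines ? implode("\n", $lines) . "\n" : '';
    file_put_contents($path, pack('J', HEADER_SIZE) . $body);
}

// Returns the first live line and advances the header past it, or null if none is left.
function queue_shift(string $path): ?string
{
    $fp = fopen($path, 'r+b');
    if ($fp === false) {
        return null;
    }
    flock($fp, LOCK_EX);
    $offset = unpack('J', fread($fp, HEADER_SIZE))[1];
    fseek($fp, $offset);
    $line = fgets($fp);
    if ($line !== false) {
        $next = ftell($fp);
        fseek($fp, 0);
        fwrite($fp, pack('J', $next)); // "delete" the line by moving the start offset
    }
    flock($fp, LOCK_UN);
    fclose($fp);
    return $line === false ? null : rtrim($line, "\r\n");
}
```

Removing a line rewrites only 8 bytes, at the price of a file that keeps growing until the orphaned lines are compacted away.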

Here's one way:
$contents = file($file, FILE_IGNORE_NEW_LINES);
$first_line = array_shift($contents);
file_put_contents($file, implode("\r\n", $contents));
There are countless other ways to do that, but all of them involve separating the first line somehow and saving the rest. You cannot avoid rewriting the whole file. An alternative take:
list($first_line, $contents) = explode("\r\n", file_get_contents($file), 2);
file_put_contents($file, $contents);

My problem was large files. I just needed to edit or remove the first line. This is the solution I used. It doesn't require loading the complete file into a variable. It currently echoes the result, but you could always save the contents instead.
$fh = fopen($local_file, 'rb');
echo "add\tfirst\tline\n"; // add your new first line.
fgets($fh); // moves the file pointer to the next line.
echo stream_get_contents($fh); // flushes the remaining file.
fclose($fh);

I think this is best for any file size
$myfile = fopen("yourfile.txt", "r") or die("Unable to open file!");
$ch=1;
while(!feof($myfile)) {
$dataline= fgets($myfile) . "<br>";
if($ch == 2){
echo str_replace(' ', ' ', $dataline)."\n";
}
$ch = 2;
}
fclose($myfile);

The solutions here didn't perform well for me.
My solution grabs the last line (not the first line; in my case it did not matter whether I got the first or the last line) from the file and removes it from that file.
This is very quick even with very large files (>150000000 lines).
function file_pop($file)
{
if ($fp = @fopen($file, "c+")) {
if (!flock($fp, LOCK_EX)) {
fclose($fp);
}
$pos = -1;
$found = 0;
while ($found < 2) {
if (fseek($fp, $pos--, SEEK_END) < 0) { // can not seek to position
rewind($fp); // rewind to the beginning of the file
break;
};
if (ord(fgetc($fp)) == 10) { // newline
$found++;
}
}
$lastpos = ftell($fp); // get current position of file
$lastline = fgets($fp); // get current line
ftruncate($fp, $lastpos); // truncate file to last position
flock($fp, LOCK_UN); // unlock
fclose($fp); // close the file
return trim($lastline);
}
}

You could use file() method.
Gets the first line
$content = file('myfile.txt');
echo $content[0];
