PHP read large text file log

I have a text log file, about 600 MB.
I want to read it using PHP and display the data on an HTML page, but I only need the last 18 lines that were added each time I run the script.
Since it's a large file, I can't read it all in and then flip the array as I had hoped. Is there another way?

Use fopen, filesize and fseek to open the file and start reading it only near the end of the file.
Comments on the fseek manual page include full code to read the last X lines of a large file.
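The same idea in a hedged sketch (tailLines() is a made-up helper, not the code from the manual comments; it assumes the chunk size comfortably covers a few lines):
function tailLines($path, $count = 18, $chunk = 4096)
{
    $fh = fopen($path, 'rb');
    if ($fh === false) {
        return array();
    }
    $pos = filesize($path);
    $data = '';
    // Keep prepending chunks read from the end until we have enough
    // newlines to cover $count lines, or we hit the start of the file.
    while ($pos > 0 && substr_count($data, "\n") <= $count) {
        $read = min($chunk, $pos);
        $pos -= $read;
        fseek($fh, $pos, SEEK_SET);
        $data = fread($fh, $read) . $data;
    }
    fclose($fh);
    $lines = explode("\n", rtrim($data, "\n"));
    return array_slice($lines, -$count);
}
On the 600 MB log this only ever touches the last few kilobytes of the file.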

Loading a file of that size into memory would probably not be a good idea. This should get you around that:
$file = escapeshellarg($file); // make the path safe to put in a shell command
$line = 'tail -n 18 ' . $file;
system($line);                 // prints the last 18 lines directly
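If you want the lines in an array to build the HTML page, rather than echoing them straight out with system(), a variation using shell_exec might look like this (a sketch; $logPath is the un-escaped path and is an assumption):
$output = shell_exec('tail -n 18 ' . escapeshellarg($logPath));
$lines  = ($output === null) ? array() : explode("\n", rtrim($output, "\n"));
foreach ($lines as $line) {
    echo htmlspecialchars($line) . "<br>\n";
}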

You can stream it backwards with tac:
$file = popen('tac ' . escapeshellarg($filename), 'r');
while (($line = fgets($file)) !== false) {
    echo $line;
}
pclose($file);

The best way to do this is to use fseek and fgets to read line by line; this is extremely fast, as only one line is read at a time and not the whole file.
An example of usage would be:
$handle = fopen("/logs/log.txt", "r");
if ($handle)
{
    // fseek offsets are in bytes, not lines, so seek back far enough to
    // cover 18 lines (here an estimate of up to 256 bytes per line).
    fseek($handle, -18 * 256, SEEK_END);
    fgets($handle); // throw away the first, probably partial, line
    while (!feof($handle))
    {
        echo fgets($handle, 4096); // make sure your lines are shorter than 4096 bytes, otherwise raise this
    }
    fclose($handle);
}

For the record, I had the same problem and tried every solution here.
It turns out Dagon's popen "tac $filename" approach is the fastest and the one with the lowest memory and CPU load.
Tested with a 2 GB log file, reading 500, 1000 and 2000 lines each time. Smooth. Thank you.

Related

What is the most efficient PHP way to read first and last line of a file?

I'm trying to open a file and determine whether it is valid. It's valid if the first line is START and the last line is END.
I've seen different ways of getting the last line of a file, but none of them pay particular attention to the first line as well.
How should I go about this? I was thinking of loading the file contents into an array and checking $array[0] and $array[x] for START and END, but that seems wasteful given all the junk that could be in the middle.
If it's a valid file, I will be reading/processing the contents of the file between START and END.
Don't read the entire file into an array if you don't need to. If the file can be big, you can do it this way:
$h = fopen('text.txt', 'r');
$firstLine = fgets($h);
fseek($h, -3, SEEK_END); // seek to the last three bytes (a bit further if the file ends with a newline)
$lastThreeChars = fgets($h);
fclose($h);
The memory footprint is much lower.
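Putting both checks together, a hedged sketch (isValidFile() is an invented name; it reads a few extra bytes at the end to tolerate a trailing newline, and assumes the file is at least 16 bytes long):
function isValidFile($path)
{
    $h = fopen($path, 'rb');
    if ($h === false) {
        return false;
    }
    $firstLine = rtrim(fgets($h));                  // first line without its newline
    fseek($h, -16, SEEK_END);                       // a handful of bytes is enough to hold "END" plus a newline
    $tailLines = explode("\n", rtrim(fread($h, 16)));
    fclose($h);
    return $firstLine === 'START' && end($tailLines) === 'END';
}
If the file can be shorter than 16 bytes, the seek offset would need adjusting.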
That's from me:
$lines = file($pathToFile, FILE_IGNORE_NEW_LINES); // strip trailing newlines, or the comparison will never match
if ($lines[0] == 'START' && end($lines) == 'END') {
    // do stuff
}
Reading the whole file with fgets will be efficient for small files. If your file is big, then:
open it and read the first line
use the tail function (I didn't check it, but it looks OK) that I found on php.net in the fseek documentation

Concatenate files in PHP

I'd like to know if there is a faster way of concatenating two text files in PHP than the usual way of opening txt1 in a+ mode, reading txt2 line by line and copying each line to txt1.
If you want to use a pure-PHP solution, you could use file_get_contents to read the whole file into a string and then write that out (no error checking, just to show how you could do it):
$fp1 = fopen("txt1", 'a+');
$file2 = file_get_contents("txt2");
fwrite($fp1, $file2);
It's probably much faster to use the cat program on Linux, if PHP is allowed to run shell commands:
system('cat txt1 txt2 > txt3');
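If the filenames come from user input, the same command can be built safely with escapeshellarg (a small sketch; the variable names are just examples):
$cmd = sprintf('cat %s %s > %s',
    escapeshellarg($src1), escapeshellarg($src2), escapeshellarg($dest));
system($cmd);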
$content = file_get_contents("file1");
file_put_contents("file2", $content, FILE_APPEND);
I have found using *nix cat to be the most effective here, but if for whatever reason you don't have access to it, and you are concatenating large files, then you can use this line-by-line function (error handling stripped for simplicity):
function catFiles($arrayOfFiles, $outputPath) {
    $dest = fopen($outputPath, "a");
    foreach ($arrayOfFiles as $f) {
        $FH = fopen($f, "r");
        $line = fgets($FH);
        while ($line !== false) {
            fputs($dest, $line);
            $line = fgets($FH);
        }
        fclose($FH);
    }
    fclose($dest);
}
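A call might look like this (the filenames are just examples):
catFiles(array('jan.log', 'feb.log'), 'q1.log');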
While the fastest way is undoubtedly to use OS commands like cp or cat, this is hardly advisable for compatibility reasons.
The fastest "PHP only" way is using file_get_contents, which reads the whole source file in one shot, but it also has drawbacks: it requires a lot of memory for large files, and for this reason it may fail depending on the memory assigned to PHP.
A universal, clean and fast solution is to use fread and fwrite with a large buffer.
If the file is smaller than the buffer, all reading happens in one burst, so speed is optimal; otherwise reading happens in big chunks (the size of the buffer), so the overhead is minimal and speed is quite good.
Reading line by line with fgets instead has to test every character, one by one, to see whether it's a newline or line feed.
Also, reading a file with many short lines line by line with fgets will be slower, as you will read many little pieces of different sizes, depending on where the newlines fall.
fread is faster, as it only checks for EOF (which is easy) and reads the file in fixed-size chunks that you decide, so it can be tuned for your OS, disk or kind of files (say you have many files under 12 KB: you can set the buffer size to 16 KB so each is read in one shot).
// Code is untested (written on a mobile phone inside Stack Overflow); it is based on various examples you can also check online.
<?php
$BUFFER_SIZE = 1 * 1024 * 1024; // 1 MB; bigger is faster, depending on file sizes and count

$dest = fopen($fileToAppendTo, "a+");
if (FALSE === $dest) die("Failed to open destination");

$handle = fopen("source.txt", "rb");
if (FALSE === $handle) {
    fclose($dest);
    die("Failed to open source");
}

while (!feof($handle)) {
    fwrite($dest, fread($handle, $BUFFER_SIZE));
}

fclose($handle);
fclose($dest);
?>

PHP using fwrite and fread with input stream

I'm looking for the most efficient way to write the contents of the PHP input stream to disk, without using much of the memory granted to the PHP script. For example, if the maximum file size that can be uploaded is 1 GB but PHP only has 32 MB of memory:
define('MAX_FILE_LEN', 1073741824); // 1 GB in bytes
$hSource = fopen('php://input', 'r');
$hDest = fopen(UPLOADS_DIR.'/'.$MyTempName.'.tmp', 'w');
fwrite($hDest, fread($hSource, MAX_FILE_LEN));
fclose($hDest);
fclose($hSource);
Does an fread inside an fwrite, as the above code shows, mean that the entire file will be loaded into memory?
For doing the opposite (writing a file to the output stream), PHP offers a function called fpassthru, which I believe does not hold the contents of the file in the PHP script's memory.
I'm looking for something similar but in reverse (writing from the input stream to a file). Thank you for any assistance you can give.
Yep - fread used in that way would read up to 1 GB into a string first, and then write that back out via fwrite. PHP just isn't smart enough to create a memory-efficient pipe for you.
I would try something akin to the following:
$hSource = fopen('php://input', 'r');
$hDest = fopen(UPLOADS_DIR . '/' . $MyTempName . '.tmp', 'w');
while (!feof($hSource)) {
    /*
     * I'm going to read in 1K chunks. You could make this
     * larger, but as a rule of thumb I'd keep it to 1/4 of
     * your php memory_limit.
     */
    $chunk = fread($hSource, 1024);
    fwrite($hDest, $chunk);
}
fclose($hSource);
fclose($hDest);
If you wanted to be really picky, you could also unset($chunk); within the loop after the fwrite to absolutely ensure that PHP frees up the memory, but that shouldn't be necessary, as the next iteration will overwrite whatever memory $chunk is using at that time.
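As an aside not from the original answer: PHP's built-in stream_copy_to_stream() copies between two open streams and should give a similarly low memory footprint with less code; a sketch using the question's own constants:
$hSource = fopen('php://input', 'r');
$hDest   = fopen(UPLOADS_DIR . '/' . $MyTempName . '.tmp', 'w');
// Copies up to MAX_FILE_LEN bytes from source to destination without
// building the whole payload as one string in PHP userland.
$bytesCopied = stream_copy_to_stream($hSource, $hDest, MAX_FILE_LEN);
fclose($hSource);
fclose($hDest);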

PHP - how to read big remote files efficiently and use buffer in loop

I would like to understand how to use the buffer of a read file.
Assume we have a big file with a list of emails, one per line (the delimiter is a classic \n).
Now, we want to compare each line with each record of a table in our database, with a check like line_of_file == table_row.
This is a simple task if you have a normal file, but with a huge file the server usually stops the operation after a few minutes.
So what's the best way of doing this kind of thing with the file buffer?
What I have so far is something like this:
$buffer = file_get_contents('file.txt');
while ($row = mysql_fetch_array($result)) {
    if (preg_match('/' . $email . '/im', $buffer)) {
        echo $row_val;
    }
}

$buffer = file_get_contents('file.txt');
$lines = preg_split('/\n/', $buffer);
// or: $lines = explode("\n", $buffer);  (double quotes, so \n is a real newline)
while ($row = mysql_fetch_array($result)) {
    if (in_array($email, $lines)) {
        echo $row_val;
    }
}
As already suggested in my close votes to your question (hence CW):
You can use SplFileObject, which implements Iterator, to iterate over a file line by line and save memory. See my answers to
Least memory intensive way to read a file in PHP and
How to save memory when reading a file in Php?
for examples.
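Applied to the email-comparison task, a hedged sketch (it reuses the question's $result and mysql_* calls; the email column name is a guess, and it assumes the table's emails fit comfortably in memory):
// Load the table's emails once into a lookup set.
$emailsInDb = array();
while ($row = mysql_fetch_array($result)) {
    $emailsInDb[$row['email']] = true;
}

// Then walk the big file one line at a time; SplFileObject never
// holds more than the current line in memory.
$file = new SplFileObject('file.txt');
$file->setFlags(SplFileObject::READ_AHEAD | SplFileObject::SKIP_EMPTY | SplFileObject::DROP_NEW_LINE);
foreach ($file as $line) {
    if (isset($emailsInDb[trim($line)])) {
        echo $line, "\n"; // this email exists in both the file and the table
    }
}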
Don't use file_get_contents for large files; it pulls the entire file into memory all at once. You have to read it in pieces:
$fp = fopen('file.txt', 'r');
while (!feof($fp)) {
    // get one line
    $buffer = fgets($fp);
    // do your stuff with $buffer here
}
fclose($fp);
Open the file with fopen() and read it incrementally, probably one line at a time with fgets().
file_get_contents reads the whole file into memory, which is undesirable if the file is larger than a few megabytes.
Depending on how long this takes, you may also need to worry about the PHP execution time limit, or about the browser timing out if it doesn't receive any output for 2 minutes.
Things you might try:
set_time_limit(0) to avoid running into the PHP time limit
output some data every 30 seconds or so, so the browser doesn't time out; make sure to flush() and possibly ob_flush() so your output is actually sent over the network (this is a kludge; a minimal version of it is sketched after this list)
start a separate process (e.g. via exec()) to run this in the background; honestly, anything that takes more than a second or two is best run in the background
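A minimal sketch of that keep-alive pattern (the 30-second interval and the single-space "ping" are arbitrary choices, not from the answer):
set_time_limit(0);                 // no PHP execution time limit
$lastPing = time();

$fp = fopen('file.txt', 'r');
while (!feof($fp)) {
    $buffer = fgets($fp);
    // ... compare $buffer against the database here ...

    if (time() - $lastPing >= 30) {
        echo " ";                  // send something so the browser keeps waiting
        ob_flush();                // flush PHP's output buffer (if one is active)
        flush();                   // push the buffered output to the client
        $lastPing = time();
    }
}
fclose($fp);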

How to save memory when reading a file in Php?

I have a 200 KB file which I use on multiple pages, but on each page I need only 1-2 lines of that file. How can I read only the lines I need, if I know the line number?
For example, if I only need the 10th line, I don't want to load all the lines into memory, just the 10th line.
Sorry for my bad English!
Try SplFileObject
echo memory_get_usage(), PHP_EOL; // 333200
$file = new SplFileObject('bible.txt'); // 996kb
$file->seek(5000); // jump to line 5000 (zero-based)
echo $file->current(), PHP_EOL; // output current line
echo memory_get_usage(), PHP_EOL; // 342984 vs 3319864 when using file()
For outputting the current line, you can either use current() or just echo $file; I find it clearer to use the method, though. You can also use fgets(), but that would get the next line.
Of course, you only need the middle three lines. I've added the memory_get_usage() calls just to prove this approach eats almost no memory.
Unless you know the offset of the line, you will need to read every line up to that point. You can just throw away the old lines (the ones you don't want) by looping through the file with something like fgets(). (EDIT: rather than fgets(), I would suggest @Gordon's solution.)
A better solution may be to use a database, as the database engine will do the grunt work of storing the strings and allow you to (very efficiently) get a certain "line" (it wouldn't be a line but a record with a numeric ID; however, it amounts to the same thing) without having to read the records before it.
Do the contents of the file change? If it's static, or relatively static, you can build a list of offsets for the lines you want to read. For instance, if the file changes once a year but you read it hundreds of times a day, then you can pre-compute the offsets of the lines you want and jump to them directly, like this:
$offsets = array();
$fh = fopen('file.txt', 'rb');
for ($lineNo = 1; !feof($fh); $lineNo++) {
    if ($lineNo == 10 || $lineNo == 20) {
        $offsets[$lineNo] = ftell($fh); // byte offset where line 10 / line 20 starts
    }
    fgets($fh); // consume the line
}
fclose($fh);
and so on for any other lines you need. Afterwards, you can trivially jump to that line's location like this:
$fh = fopen('file.txt', 'rb');
fseek($fh, $offsets[20]); // jump straight to line 20
echo fgets($fh);          // read it
But this could easily be overkill. Try benchmarking the operations: compare how long an old-fashioned "read 20 lines" takes versus precompute-and-jump.
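If the file really does change rarely, the pre-computed offsets can also be cached between requests in a small side file (offsets.json is just an example name):
// Save the pre-computed offsets once...
file_put_contents('offsets.json', json_encode($offsets));

// ...and on later requests skip the scan entirely.
$offsets = json_decode(file_get_contents('offsets.json'), true);
$fh = fopen('file.txt', 'rb');
fseek($fh, $offsets[20]);
echo fgets($fh);
fclose($fh);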
<?php
$lines = array(1, 2, 10); // zero-based line numbers, since $i starts at 0
$handle = @fopen("/tmp/inputfile.txt", "r");
if ($handle) {
    $i = 0;
    while (!feof($handle)) {
        $line = stream_get_line($handle, 1000000, "\n");
        if (in_array($i, $lines)) {
            echo $line;
            $line = ''; // Don't forget to clean the buffer!
        }
        if ($i > end($lines)) {
            break;
        }
        $i++;
    }
    fclose($handle);
}
?>
Just loop through them without storing anything, e.g.:
$i = 1;
$file = fopen('file.txt', 'r');
while (!feof($file)) {
    $line = fgets($file); // gets a whole line from the file
    if ($i == 10) {
        break; // stop on the tenth line, which is now in $line
    }
    $i++;
}
fclose($file);
The above example keeps only the last line it read from the file in memory, so it is the most memory-efficient way to do it.
Use fgets() 10 times :-) That way you won't keep all 10 lines in memory.
Why are you only trying to load the first ten lines? Do you know that loading all those lines is in fact a problem?
If you haven't measured, then you don't know that it's a problem. Don't waste your time optimizing non-problems. Chances are that any performance change from not loading the entire 200 KB file will be imperceptible, unless you know for a fact that loading that file is indeed a bottleneck.
