How to echo random line from text file - php

My text file format is :
This is first line.
This is second line.
This is third line.
There could be more lines in the text file. How can I echo one random line from the text file on each refresh with PHP?
All comments are appreciated. Thanks

How big of a file are we talking about? The easy approach is to load the entire file into memory as an array of strings, pick a random array index from 0 to N-1, and show that line.
If the file can get really big, you'd have to implement some sort of streaming solution.
Streaming Solution Explained!
The following solution will yield a uniformly distributed random line from a relatively large file with an adjustable max line size per file.
<?php
function rand_line($fileName, $maxLineLength = 4096) {
$handle = @fopen($fileName, "r"); // @ suppresses the warning if the file cannot be opened
if ($handle) {
$random_line = null;
$line = null;
$count = 0;
while (($line = fgets($handle, $maxLineLength)) !== false) {
$count++;
// P(1/$count) probability of picking current line as random line
if(rand() % $count == 0) {
$random_line = $line;
}
}
if (!feof($handle)) {
echo "Error: unexpected fgets() fail\n";
fclose($handle);
return null;
} else {
fclose($handle);
}
return $random_line;
}
}
// usage
echo rand_line("myfile.txt");
?>
Let's say the file has N lines. Write P(k) for the probability that the first line is still the selected one after k lines have been read:
P(1) = 1
P(2) = 1/2 * P(1)
P(3) = 2/3 * P(2)
P(N) = (N-1)/N * P(N-1) = 1/N
The same argument applies to every other line, which ultimately gives us a uniformly distributed random line from a file of any size without ever holding the entire file in memory.
I hope it will help.
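The recurrence in this answer can be written out for an arbitrary line. This is only a sketch of the argument, and it assumes every `rand() % $count == 0` test is independent and unbiased, which is only approximately true for `rand()` combined with modulo. Line k (1 ≤ k ≤ N) is picked when it is read with probability 1/k, and it then has to survive each later line j, which replaces it with probability 1/j:

```latex
P(\text{line } k \text{ is returned})
  = \frac{1}{k} \prod_{j=k+1}^{N} \frac{j-1}{j}
  = \frac{1}{k} \cdot \frac{k}{N}
  = \frac{1}{N}
```

The product telescopes, so the result does not depend on k.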

A generally good approach to this kind of situation is to:
Read the lines into an array using file()
echo out a random array value using array_rand()
Your code could look something like this:
$lines = file('my_file.txt');
echo $lines[array_rand($lines)];
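A slightly more defensive variant of the same idea, as a minimal sketch: it strips the trailing newline from each line, skips empty lines, and returns null instead of failing when the file is missing or empty. The function name `random_line_or_null` is made up for this example.

```php
<?php
// Hypothetical helper: returns one random non-empty line, or null if there is none.
function random_line_or_null(string $path): ?string
{
    if (!is_readable($path)) {
        return null;
    }
    // FILE_IGNORE_NEW_LINES strips the trailing newline from each element;
    // FILE_SKIP_EMPTY_LINES drops lines that are empty after that.
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false || $lines === []) {
        return null; // array_rand() must not be called on an empty array
    }
    return $lines[array_rand($lines)];
}
```

Like the two-liner above, this still loads the whole file into memory, so it only suits files that comfortably fit.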

Related

How to read a certain line from a string via PHP? [duplicate]

I am working on reading a file in php.
I need to read specific lines of the file.
I used this code:
fseek($file_handle,$start);
while (!feof($file_handle))
{
///Get and read the line of the file pointed at.
$line = fgets($file_handle);
$lineArray .= $line."LINE_SEPARATOR";
processLine($lineArray, $linecount, $logger, $xmlReply);
$counter++;
}
fclose($file_handle);
However I realized that the fseek() takes the number of bytes and not the line number.
Does PHP have other function that bases its pointer in line numbers?
Or do I have to read the file from the start every time, and have a
counter until my desired line number is read?
I'm looking for an efficient algorithm, stepping over 500-1000 Kb file to get to the desired line seems inefficient.
Use SplFileObject::seek
$file = new SplFileObject('yourfile.txt');
$file->seek(123); // seek to line 124 (0-based)
Does this work for you?
$file = "name-of-my-file.txt";
$lines = file( $file );
echo $lines[67]; // echoes line 68 (line numbers start at 0; replace 67 with whatever index you need)
You would obviously need to check that the line exists before printing, though. Any good?
You could do like:
$lines = file($filename); //file in to an array
echo $lines[1]; //line 2
OR
$line = 0;
$fh = fopen($myFile, 'r');
while (($buffer = fgets($fh)) !== FALSE) {
if ($line == 1) {
// $buffer is the second line.
break;
}
$line++;
}
You must read from the beginning. But if the file never/rarely changes, you could cache the line offsets somewhere else, perhaps in another file.
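The offset-caching idea could look roughly like this. It is a sketch under assumptions: the helper names `build_line_index` and `read_line_at` are invented here, and the index is kept in a PHP array rather than in a second file.

```php
<?php
// Hypothetical helper: scans the file once and records the byte offset
// at which each line starts.
function build_line_index(string $path): array
{
    $offsets = [];
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return $offsets;
    }
    while (true) {
        $pos = ftell($handle);          // offset where the next line starts
        if (fgets($handle) === false) { // nothing left to read
            break;
        }
        $offsets[] = $pos;
    }
    fclose($handle);
    return $offsets;
}

// Hypothetical helper: jumps straight to a line using the index.
// $lineNumber is 0-based, like an array index.
function read_line_at(string $path, array $offsets, int $lineNumber): ?string
{
    if (!isset($offsets[$lineNumber])) {
        return null;
    }
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return null;
    }
    fseek($handle, $offsets[$lineNumber]);
    $line = fgets($handle);
    fclose($handle);
    return $line === false ? null : $line;
}
```

To persist the index between requests you could, for example, write it out with json_encode() and rebuild it whenever filemtime() of the data file changes. That part is left out here.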
Try this, the simplest one:
$buffer=explode("\n",file_get_contents("filename"));//Split data to array for each "\n"
Now $buffer is an array, and each array element contains one line.
To get the 5th line:
echo $buffer[4];
You could also use the function file($filename). This function reads the data from the file into an array.

Get nth number from a string

I have a very large file with only a single line. It contains about 2.6 million numbers. The file is about 15 MB.
My goal is to find the nth number in this single-line string.
I tried to read the file into a string (remember, it is a single-line file). Then I exploded the string into an array, at which point I ran out of memory (Allowed memory size of 268435456 bytes exhausted (tried to allocate 71 bytes)).
Am I doing it right? Or is there another, easier way to find the nth value in a very large string?
$file = file_get_contents ('a.txt', true);
$array = explode(" ", $file, -1);
echo $array[$nth];
Create a counter variable; read the file using fopen and loop it in a while with feof and fgets (with the desired buffer size); within the loop, check how many spaces are present in the bit you just read (I'm assuming your entries are separated by spaces, it could be commas or whatever); finally increment the counter and go on until you reach the part you want (after a n number of spaces, you have the [n+1]th entry you are looking for).
I include some tested (with a 16 MB file) proof-of-concept code. I don't know if there are better ways to do it; this is the only one that came to my mind and it works. memory_get_usage reports a memory usage of ~8 kb.
<?php
$counter = 0; // number of spaces (separators) seen so far
$nth = 49959;
$handle = @fopen('numbers.txt', 'r'); // File containing numbers from 1 to 2130829, size ~16 MB.
if ($handle) {
while (($buffer = fgets($handle, 128)) !== false) {
$spaces = substr_count($buffer, ' ');
if ($counter + $spaces > $nth) {
$numbers = explode(' ', $buffer);
$key = $nth - $counter;
echo $numbers[$key]; // print '49959'
exit;
}
else {
$counter += $spaces;
}
}
if (!feof($handle)) {
echo "Error: unexpected fgets() fail\n";
}
fclose($handle);
}
?>
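One thing the proof of concept does not handle is a number that straddles two 128-byte reads. A variant that carries the unfinished piece over to the next chunk might look like the sketch below. The function name `nth_token` and its parameters are made up for this example, and it assumes tokens are separated by exactly one space.

```php
<?php
// Hypothetical helper: returns the $nth (0-based) space-separated token, or null.
function nth_token(string $path, int $nth, int $chunkSize = 8192): ?string
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return null;
    }
    $carry = ''; // incomplete token left over from the previous chunk
    $seen = 0;   // number of complete tokens consumed so far
    while (($chunk = fread($handle, $chunkSize)) !== false && $chunk !== '') {
        $tokens = explode(' ', $carry . $chunk);
        $carry = array_pop($tokens); // last piece may continue in the next chunk
        if ($nth < $seen + count($tokens)) {
            fclose($handle);
            return $tokens[$nth - $seen];
        }
        $seen += count($tokens);
    }
    fclose($handle);
    // Whatever is left in $carry is the final token (minus a trailing newline).
    $last = rtrim($carry, "\r\n");
    return ($nth === $seen && $last !== '') ? $last : null;
}
```

Only one chunk plus one partial token is held in memory at a time, so memory use stays flat regardless of file size.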

How do I choose a specific line from a file?

I'm trying to make (as immature as this sounds) an application online that prints random insults. I have a list that is 140 lines long, and I would like to print one entire line. There is mt_rand(min,max) but when I use that alongside fgets(file, "line") It doesn't give me the line of the random number, it gives me the character. Any help? I have all the code so far below.
<?php
$file = fopen("Insults.txt","r");
echo fgets($file, (mt_rand(1, 140)));
fclose($file);
?>
Try this, it's an easier version of what you want to do:
$file = file('Insults.txt');
echo $file[array_rand($file)];
$lines = file("Insults.txt");
echo $lines[array_rand($lines)];
Or within a function:
function random_line($filename) {
$lines = file($filename) ;
return $lines[array_rand($lines)] ;
}
$insult = random_line("Insults.txt");
echo $insult;
Use file() for this. It returns an array with the lines of the file:
$lines = file($filename);
$line = mt_rand(0, count($lines) - 1); // mt_rand() is inclusive, so the highest valid index is count - 1
echo $lines[$line];
First: you are not using fgets() correctly; please refer to the manual for the meaning of the second parameter (it is plainly not what you think it is: it is a maximum length in bytes, not a line number).
Second: the file() solution will work... until the file exceeds a certain size and exhausts the PHP memory limit. Keep in mind: file() reads the complete file into an array.
You might be better off reading line by line, even if that means you'll have to discard most of the data you read.
$fp = fopen(...);
$line = 129;
// read (and ignore) the first 128 lines in the file
$i = 1;
while ($i < $line) {
fgets($fp);
$i++;
}
// at last: this is the line we wanted
$theLine = fgets($fp);
(not tested!)

Determine if a file has more than X lines?

As I was not able to find a function that returns the number of lines in a file,
do I need to use
$handle = fopen("file.txt", "r");
for ($line = 1; $line <= 10; $line++) {
fgets($handle);
}
if (feof($handle)) {
echo "File has less than 10 lines.";
} else {
echo "File has 10 lines or more.";
}
fclose($handle);
or something similar? All I want to know is whether the file has more than 10 lines or not :-).
Thanks in advance!
You can get the number of lines using:
$file = 'smth.txt';
$num_lines = count(file($file));
Faster and more memory-efficient:
$file = new SplFileObject('file.txt');
$file->seek(9);
if ($file->eof()) {
echo 'File has less than 10 lines.';
} else {
echo 'File has 10 lines or more.';
}
SplFileObject
Bigger problems will occur if you have a LARGE file, as PHP tends to slow down somewhat. Why not run an exec command and let the system return the number? Then you do not have to worry about the PHP overhead of reading the file.
$count = exec("wc -l /path/to/file");
Or if you want to get a bit more fancy:
$count = exec("awk '// {++x} END {print x}' /path/to/file");
If you have a big file, it is better to read it in segments and count the "\n" characters, or whatever the line-ending character is; on some systems you may also need to count "\r", for example.
$lineCounter=0;
$myFile =fopen('/pathto/file.whatever','r');
while ($stringSegment = fread($myFile, 4096000)) {
$lineCounter += substr_count($stringSegment, "\n");
}

Efficiently counting the number of lines of a text file. (200mb+)

I have just found out that my script gives me a fatal error:
Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 440 bytes) in C:\process_txt.php on line 109
That line is this:
$lines = count(file($path)) - 1;
So I think it is having difficulty loading the file into memory and counting the number of lines. Is there a more efficient way I can do this without having memory issues?
The text files that I need to count the number of lines for range from 2MB to 500MB. Maybe a Gig sometimes.
Thanks all for any help.
This will use less memory, since it doesn't load the whole file into memory:
$file="largefile.txt";
$linecount = 0;
$handle = fopen($file, "r");
while(!feof($handle)){
$line = fgets($handle);
$linecount++;
}
fclose($handle);
echo $linecount;
fgets loads a single line into memory (if the second argument $length is omitted it will keep reading from the stream until it reaches the end of the line, which is what we want). This is still unlikely to be as quick as using something other than PHP, if you care about wall time as well as memory usage.
The only danger with this is if any lines are particularly long (what if you encounter a 2GB file without line breaks?). In that case you're better off slurping it in chunks and counting end-of-line characters:
$file="largefile.txt";
$linecount = 0;
$handle = fopen($file, "r");
while(!feof($handle)){
$line = fgets($handle, 4096);
$linecount = $linecount + substr_count($line, PHP_EOL);
}
fclose($handle);
echo $linecount;
Using a loop of fgets() calls is a fine solution and the most straightforward to write, however:
even though internally the file is read using a buffer of 8192 bytes, your code still has to call that function for each line;
it's technically possible that a single line may be bigger than the available memory if you're reading a binary file.
This code reads a file in chunks of 8kB each and then counts the number of newlines within that chunk.
function getLines($file)
{
$f = fopen($file, 'rb');
$lines = 0;
while (!feof($f)) {
$lines += substr_count(fread($f, 8192), "\n");
}
fclose($f);
return $lines;
}
If the average length of each line is at most 4kB, you will already start saving on function calls, and those can add up when you process big files.
Benchmark
I ran a test with a 1GB file; here are the results:
+---------+-------------+------------------+---------+
|         | This answer | Dominic's answer | wc -l   |
+---------+-------------+------------------+---------+
| Lines   | 3550388     | 3550389          | 3550388 |
+---------+-------------+------------------+---------+
| Runtime | 1.055       | 4.297            | 0.587   |
+---------+-------------+------------------+---------+
Time is measured in seconds of real (wall-clock) time.
True line count
While the above works well and returns the same results as wc -l, if the file ends without a newline, the line number will be off by one; if you care about this particular scenario, you can make it more accurate by using this logic:
function getLines($file)
{
$f = fopen($file, 'rb');
$lines = 0; $buffer = '';
while (!feof($f)) {
$buffer = fread($f, 8192);
$lines += substr_count($buffer, "\n");
}
fclose($f);
if (strlen($buffer) > 0 && $buffer[-1] != "\n") {
++$lines;
}
return $lines;
}
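A variation on the "true line count" logic, as a sketch only: instead of inspecting the last buffer after the loop, it remembers the last byte that was actually read. That avoids depending on negative string offsets (which, as far as I know, need PHP 7.1 or later) and on the final fread() call returning a non-empty string. The name `count_lines_chunked` is invented for this example.

```php
<?php
// Hypothetical helper: counts lines so that a final line without a
// trailing "\n" still counts. Returns null if the file cannot be opened.
function count_lines_chunked(string $path, int $chunkSize = 8192): ?int
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return null;
    }
    $lines = 0;
    $lastByte = "\n"; // an empty file is treated as ending in a newline, giving 0 lines
    while (($chunk = fread($handle, $chunkSize)) !== false && $chunk !== '') {
        $lines += substr_count($chunk, "\n");
        $lastByte = substr($chunk, -1); // remember the last byte actually read
    }
    fclose($handle);
    if ($lastByte !== "\n") {
        $lines++; // unterminated final line
    }
    return $lines;
}
```

For a file containing just the word "testing" with no newline this gives 1, whereas a plain newline count (and wc -l) gives 0.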
Simple object-oriented solution
$file = new \SplFileObject('file.extension');
while($file->valid()) $file->fgets();
var_dump($file->key());
Update
Another way to do this is to pass PHP_INT_MAX to the SplFileObject::seek method.
$file = new \SplFileObject('file.extension', 'r');
$file->seek(PHP_INT_MAX);
echo $file->key();
If you're running this on a Linux/Unix host, the easiest solution would be to use exec() or similar to run the command wc -l $path. Just make sure you've sanitized $path first to be sure that it isn't something like "/path/to/file ; rm -rf /".
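For the sanitizing step, escapeshellarg() is the usual tool. A minimal sketch, assuming a Unix-like host where wc is on the PATH and exec() has not been disabled; the function name `wc_line_count` is made up here:

```php
<?php
// Hypothetical helper: asks wc for the line count of a single file.
// Returns null if the file is not readable or the command fails.
function wc_line_count(string $path): ?int
{
    if (!is_file($path) || !is_readable($path)) {
        return null;
    }
    // escapeshellarg() quotes the path so shell metacharacters such as ";" are inert.
    // Redirecting with "<" makes wc print only the number, not the file name.
    $output = exec('wc -l < ' . escapeshellarg($path), $allLines, $exitCode);
    if ($output === false || $exitCode !== 0) {
        return null;
    }
    return (int) trim($output);
}
```

Keep in mind that wc -l counts newline characters, so a final line without a trailing newline is not included in the result.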
I found a faster way that does not require looping through the entire file in PHP.
It works only on *nix systems; there might be a similar way on Windows ...
$file = '/path/to/your.file';
//Get number of lines
$totalLines = intval(exec('wc -l ' . escapeshellarg($file))); // escapeshellarg() keeps odd file names from breaking the command
If you're using PHP 5.5 you can use a generator. This will NOT work in any version of PHP before 5.5 though. From php.net:
"Generators provide an easy way to implement simple iterators without the overhead or complexity of implementing a class that implements the Iterator interface."
// This function implements a generator to load individual lines of a large file
function getLines($file) {
$f = fopen($file, 'r');
// read each line of the file without loading the whole file to memory
while (($line = fgets($f)) !== false) { // strict check so a line containing only "0" does not end the loop
yield $line;
}
fclose($f);
}
// Since generators implement simple iterators, I can quickly count the number
// of lines using the iterator_count() function.
$file = '/path/to/file.txt';
$lineCount = iterator_count(getLines($file)); // the number of lines in the file
If you're under Linux you can simply do:
$number_of_lines = intval(trim(shell_exec("wc -l " . escapeshellarg($file_name) . " | awk '{print $1}'")));
You just have to find the right command if you're using another OS.
Regards
This is an addition to Wallace Maxter's solution
It also skips empty lines while counting:
function getLines($file)
{
$file = new \SplFileObject($file, 'r');
$file->setFlags(SplFileObject::READ_AHEAD | SplFileObject::SKIP_EMPTY |
SplFileObject::DROP_NEW_LINE);
$file->seek(PHP_INT_MAX);
return $file->key() + 1;
}
The most succinct cross-platform solution that only buffers one line at a time.
$file = new \SplFileObject(__FILE__);
$file->setFlags($file::READ_AHEAD);
$lines = iterator_count($file);
Unfortunately, we have to set the READ_AHEAD flag otherwise iterator_count blocks indefinitely. Otherwise, this would be a one-liner.
private static function lineCount($file) {
$linecount = 0;
$handle = fopen($file, "r");
while(!feof($handle)){
if (fgets($handle) !== false) {
$linecount++;
}
}
fclose($handle);
return $linecount;
}
I wanted to add a little fix to the function above...
In a specific example where I had a file containing the word 'testing', the function returned 2 as a result, so I needed to add a check for whether fgets returned false or not :)
Have fun :)
Based on Dominic Rodger's solution,
here is what I use (it uses wc if available, otherwise it falls back to Dominic Rodger's solution).
class FileTool
{
public static function getNbLines($file)
{
$linecount = 0;
$m = exec('which wc');
if ('' !== $m) {
$cmd = 'wc -l < "' . str_replace('"', '\\"', $file) . '"';
$n = exec($cmd);
return (int)$n + 1;
}
$handle = fopen($file, "r");
while (!feof($handle)) {
$line = fgets($handle);
$linecount++;
}
fclose($handle);
return $linecount;
}
}
https://github.com/lingtalfi/Bat/blob/master/FileTool.php
The number of lines can be counted with the following code:
<?php
$fp= fopen("myfile.txt", "r");
$count=0;
while (($line = fgets($fp)) !== false) // fgets() reads one line; the original fgetss() also stripped HTML tags but was removed in PHP 8.0
$count++;
echo "Total number of lines are ".$count;
fclose($fp);
?>
You have several options. The first is to increase the available memory allowed, which is probably not the best way to do things given that you state the file can get very large. The other way is to use fgets to read the file line by line and increment a counter, which should not cause any memory issues at all as only the current line is in memory at any one time.
There is another answer that I thought might be a good addition to this list.
If you have perl installed and are able to run things from the shell in PHP:
$lines = exec('perl -pe \'s/\r\n|\n|\r/\n/g\' ' . escapeshellarg('largetextfile.txt') . ' | wc -l');
This should handle most line breaks whether from Unix or Windows created files.
TWO downsides (at least):
1) It is not a great idea to have your script so dependent upon the system it's running on (it may not be safe to assume Perl and wc are available).
2) Just a small mistake in escaping and you have handed over access to a shell on your machine.
As with most things I know (or think I know) about coding, I got this info from somewhere else:
John Reeve Article
public function quickAndDirtyLineCounter()
{
echo "<table>";
// no trailing backslash here: it would escape the closing quote
$folders = ['C:\wamp\www\qa\abcfolder',
];
foreach ($folders as $folder) {
$files = scandir($folder);
foreach ($files as $file) {
$path = $folder . DIRECTORY_SEPARATOR . $file;
if ($file == '.' || $file == '..' || !file_exists($path)) {
continue;
}
$handle = fopen($path, "r");
if ($handle === false) {
continue; // skip entries that cannot be opened
}
$linecount = 0;
while (!feof($handle)) {
$line = fgets($handle);
$linecount++;
}
fclose($handle);
echo "<tr><td>" . $folder . "</td><td>" . $file . "</td><td>" . $linecount . "</td></tr>";
}
}
echo "</table>";
}
I use this method purely for counting how many lines are in a file. What is the downside of doing this versus the other answers? I'm seeing many lines as opposed to my two-line solution. I'm guessing there's a reason nobody does this.
$lines = count(file('your.file'));
echo $lines;
this is a bit late but...
Here is my solution for a text log file I have which uses \n to separate each line.
$data = file_get_contents("myfile.txt");
$numlines = strlen($data) - strlen(str_replace("\n","",$data));
It does load the file into memory but doesn't need to cycle through an unknown number of lines. It may be unsuitable if the file is GB in size but for smaller files with short lines of data it works a treat for me.
It just removes the "\n" characters from the file and works out how many were removed by comparing the length of the data in the file to the length after removing all the line breaks ("\n" chars in my case). If your line delimiter is a different character, replace the "\n" with whatever your line delimiter is.
I know it is not the best answer for all occasions but is something I have found quick and simple for my purposes where each line of the log is only a few hundred chars and total log file is not too large.
For just counting the lines use:
$handle = fopen("file","r");
$b = 0;
while (fgets($handle) !== false) {
$b++;
}
fclose($handle);
echo $b;
