I never have any luck with reading text files. I have a small script that reads a log file (updated in real time), and I now want to send some of the data to a database.
The problem is that if I don't start reading from the end of the file, I get duplicate entries in the database, which must not happen.
// Keep alive
for (;;)
{
$handle = fopen("data.log", "r");
if (!$handle) die("Open error - data.log");
while (!feof($handle))
{
$line = fgets($handle, 4096);
// If match with, I output the result
if (strpos($line, ':gshop_trade:') > 0)
{
if (!preg_match('/([\d\-: ]+)\s*.*\sformatlog:gshop_trade:userid=(\d+):(.*)item_id=(\d+):expire=(\d+):item_count=(\d+):cash_need=(\d+):cash_left=(\d+).*$/', $line, $data))
{
echo "Parsing error on line: {$line}";
}
// show the data
}
}
sleep(5);
}
This script works, but as mentioned above I need to send the data to the DB, and I also need to leave the script running. With the current code the script matches the wanted string, but instead of waiting for new entries in data.log it starts reading the whole file again.
I saw this question here and tested it, but it doesn't work. I'll start the script when I start the service that generates "data.log", but to prevent duplicate entries in the database I need to read only the latest lines.
How can I do that?
Keep track of the file offset from the previous read: store the result of ftell() in a variable, and when you re-open the file for the next read, jump to that offset with fseek().
$lastPos = 0;
for (;;)
{
$handle = fopen("data.log", "r");
if (!$handle) die("Open error - data.log");
fseek($handle, $lastPos); // <--- jump to last read position
while (!feof($handle))
{
$line = fgets($handle, 4096);
$lastPos = ftell($handle); // <--- maintain last read position
// If match with, I output the result
if (strpos($line, ':gshop_trade:') > 0)
{
if (!preg_match('/([\d\-: ]+)\s*.*\sformatlog:gshop_trade:userid=(\d+):(.*)item_id=(\d+):expire=(\d+):item_count=(\d+):cash_need=(\d+):cash_left=(\d+).*$/', $line, $data))
{
echo "Parsing error on line: {$line}";
}
// show the data
}
}
fclose($handle); // close the handle, otherwise a new one leaks on every pass
sleep(5);
}
Maybe you can use file_get_contents, explode and read the array backwards?
$arr = explode(PHP_EOL, file_get_contents("data.log")); // or file("data.log");
$arr = array_reverse($arr);
foreach($arr as $line){
// do stuff here in reverse order
}
Based on the comments above, I suggest this method to process only the new data.
It reads your log plus a text file that holds whatever was read last time,
removes what was read last time, and uses only the new data in the code.
$logfile = file_get_contents("data.log");
$ReadData = file_get_contents("readdata.txt");
$newdata = str_replace($ReadData, "", $logfile); // this is what is new since last run.
file_put_contents("readdata.txt", $logfile); // save what has been read.
$arr = explode(PHP_EOL, $newdata);
foreach($arr as $line){
// do your stuff here with the new data.
}
?>
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="refresh" content="5"> <!-- This will run the page every five seconds. -->
</head>
</html>
This code takes 35 seconds to execute. How can I reduce the execution time? What should I change in the source code?
$file_handle = fopen("WMLG2_2017_07_11.log", "r");
while (!feof($file_handle)) {
$line = fgets($file_handle);
if (strpos($line, 'root#CLA-0 [WMLG2] >') !== false) {
$namafileA = explode('> ', $line);
$namafile = str_replace(' ', '_', $namafileA[1]);
$filenameExtension = $namafile.".txt";
$file = preg_replace('/[^A-Za-z0-9\-_.]/', '', $filenameExtension); // remove special characters except ".", "_" and "-"
} else {
$newfile = fopen("show_command_file_Tes2/$file", "a");
fwrite($newfile, $line);
}
}
fclose($file_handle);
I found some mistakes in the original code that could hurt performance, though I am not sure by how much.
If I understand correctly, you are opening a log file and sorting its messages into separate files.
You have not pasted a sample of the log file, but I assume several lines share the same target file rather than every line having its own.
Your code opens the target files but never closes them, so the handles stay open for the whole script run. The garbage collector does not close file handles held in the outer scope; you have to close them yourself to release the resources.
Because of that, you should store the file handles and reuse one that is already open (or at least close each one after use). As written, the script opens roughly one new handle per line of the log file and never closes or reuses any of them.
Another thing I noticed: your lines may be long, and that is one of the rare cases where PHP's strpos() could be slower than a regex anchored to the expected position of the string. Without the log file I can't say for sure, because preg_match() is a fairly expensive function on simple or short strings (there strpos() is much faster).
If it is a log file, the line most likely starts with that "root#CLA"... string, so try anchoring the match with ^ (beginning of the string) or $ (end of the string) if you know where the string sits.
<?php
$file_handle = fopen("WMLG2_2017_07_11.log", "r");
//you 'll store your handles here
$targetHandles = [];
while (!feof($file_handle))
{
$line = fgets($file_handle);
if (strpos($line, 'root#CLA-0 [WMLG2] >') !== false)
{
$namafileA = explode('> ', $line);
$namafile = str_replace(' ', '_', $namafileA[1]);
$filenameExtension = $namafile . ".txt";
$file = preg_replace('/[^A-Za-z0-9\-_.]/', '', $filenameExtension); // remove special characters except ".", "_" and "-"
}
else
{
//no $file defined, most likely nothing to write yet
if (empty($file))
{
continue;
}
//if its not open, we'll make them open
if (empty($targetHandles[$file]))
{
$targetHandles[$file] = fopen("show_command_file_Tes2/$file", "a");
}
//writing the line to target
fwrite($targetHandles[$file], $line);
}
}
//you should close your handles every time
foreach ($targetHandles as $handle)
{
fclose($handle);
}
fclose($file_handle);
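The anchoring suggestion from this answer can be sketched as follows. It is only an illustration: the sample lines are made up (the real log was never posted), and it assumes the prompt always sits at the very start of a line, which the question does not confirm.

```php
<?php
// Hypothetical sample lines; the real log format was not posted.
$lines = [
    "root#CLA-0 [WMLG2] > show version\n",
    "some output that mentions root#CLA-0 [WMLG2] > in the middle\n",
];

$prompt = 'root#CLA-0 [WMLG2] >';

// Returns true only when the line starts with the prompt.
function startsWithPrompt(string $line, string $prompt): bool
{
    // strncmp() compares just the first strlen($prompt) bytes,
    // so it never scans the rest of a long line.
    return strncmp($line, $prompt, strlen($prompt)) === 0;
}

// Regex equivalent: ^ anchors the match to the start of the string.
$pattern = '/^' . preg_quote($prompt, '/') . '/';

foreach ($lines as $line) {
    $viaStrncmp = startsWithPrompt($line, $prompt);
    $viaRegex   = preg_match($pattern, $line) === 1;
    echo ($viaStrncmp ? 'header' : 'body'), ' / ', ($viaRegex ? 'header' : 'body'), PHP_EOL;
}
```

Both checks stop after the first few bytes of a non-matching line, whereas an unanchored strpos() has to scan the whole line before it can report "not found". Whether that matters for the 35 seconds would need measuring on the real file.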
I want to do the following:
I want to create a .php file (executed via cron jobs) that adds this code, $files[] = 'example.php';,
to another PHP file (paste.php). It has to find the last $files[] line, matching something like the regex $files[] = '(AnythingHere)';, and paste the new line after it. The file can contain any number of pages, so I have no way of knowing the line number in advance.
<?php
if (!isset($php_file)) {
$files[] = 'page1.php';
$files[] = 'page2.php';
$files[] = 'page3.php';
$files[] = 'page4.php';
$file = $files[ rand(0, count($files) - 1) ]; // rand() is inclusive, so the upper bound must be the last index
I hope you guys understand what I want; can anyone help me out with this one?
If paste.php contains ONLY $files[] = '...' lines, you can simply append to the file:
$line = '$files[] = "pageX.php";' . PHP_EOL;
file_put_contents('paste.php', $line, FILE_APPEND);
Or, if you want to insert after the last $files[] entry:
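$yourNewLine = '$files[] = "pageX.php";'; // this is an example; put your own line here
$filename = 'paste.php';
$lines = file($filename);
// search from the end so the first hit is the last $files[] entry
$lines = array_reverse($lines);
$found = false;
$i = 0;
// stop at the end of the array so a file without any match cannot loop forever
while (!$found && $i < count($lines))
{
    // ltrim() tolerates indented lines; the === 0 comparison belongs outside strpos()
    if (strpos(ltrim($lines[$i]), '$files[] = ') === 0)
    {
        $found = true;
        // in the reversed array, inserting at $i puts the new line after the match
        array_splice($lines, $i, 0, $yourNewLine . PHP_EOL);
    }
    $i++;
}
$lines = array_reverse($lines);
file_put_contents($filename, $lines);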
Instead of doing it this way, how about setting your files array in one script and including it at the top of the others? That way you can reference the array directly and still only have to edit the file listing in one place.
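A minimal sketch of that idea, under some assumptions: the list lives in its own file that simply returns an array, and the cron job regenerates that file with var_export() instead of editing source lines. The file name and the temp-directory location are hypothetical and only keep the example self-contained.

```php
<?php
// Hypothetical file name; a temp file is used only to keep the sketch self-contained.
$listFile = sys_get_temp_dir() . '/files_list_example.php';

// The list file just returns an array.
$pages = ['page1.php', 'page2.php', 'page3.php'];
file_put_contents($listFile, '<?php return ' . var_export($pages, true) . ';' . PHP_EOL);

// Any script that needs the list pulls it in with require.
$files = require $listFile;

// Adding a page means regenerating the list file; no regex editing is needed.
$files[] = 'page4.php';
file_put_contents($listFile, '<?php return ' . var_export($files, true) . ';' . PHP_EOL);

// array_rand() always returns a valid key of the array.
$file = $files[array_rand($files)];
echo $file, PHP_EOL;
```

If an opcode cache is active, the regenerated file may not be picked up immediately, so this is something to verify in the real setup.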
Quick and dirty first-fit solution:
1. Open the file
2. Read each line until you find one matching your regex for $files[] = ...
3. Read more lines until you find one that doesn't match the regex
4. Write each line read in 2 and 3 to the output file
5. Insert your new line into the output
6. Write the rest of the input to the output
This may not be the best way to approach the problem, drawbacks being that you have to read each line in and compare it with your regex until you find your insertion point. You'll also probably have a temporary file for output which you'll then rename to the original filename.
You'll have 2 while loops:
while (line does not match): read next line
and then
while (line does match): read next line
Someone who knows PHP better than I do might be able to come up with something a bit cleaner, but if you're just looking for something quick to get the job done, this ought to work.
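The steps above could look roughly like this in PHP. Treat it as a sketch: insertAfterFirstRun() and the regex are made-up names and patterns for illustration, and like the description it inserts after the first contiguous run of matching lines, which is the last $files[] line only if all of them sit together.

```php
<?php
// insertAfterFirstRun() is a hypothetical helper name, not a PHP built-in.
function insertAfterFirstRun(string $path, string $pattern, string $newLine): bool
{
    $in = fopen($path, 'r');
    if (!$in) {
        return false;
    }
    $tmpPath = $path . '.tmp'; // temporary output, renamed over the original at the end
    $out = fopen($tmpPath, 'w');
    if (!$out) {
        fclose($in);
        return false;
    }

    $line = fgets($in);

    // Loop 1: copy lines until one matches.
    while ($line !== false && !preg_match($pattern, $line)) {
        fwrite($out, $line);
        $line = fgets($in);
    }

    // Loop 2: copy the run of matching lines.
    $sawMatch = false;
    $last = '';
    while ($line !== false && preg_match($pattern, $line)) {
        fwrite($out, $line);
        $sawMatch = true;
        $last = $line;
        $line = fgets($in);
    }

    if ($sawMatch) {
        // Guard against a final matching line that has no trailing newline.
        if (substr($last, -1) !== "\n") {
            fwrite($out, PHP_EOL);
        }
        fwrite($out, $newLine . PHP_EOL);
    }

    // Copy whatever is left of the input.
    while ($line !== false) {
        fwrite($out, $line);
        $line = fgets($in);
    }

    fclose($in);
    fclose($out);

    if (!$sawMatch) {
        unlink($tmpPath); // nothing to insert after; leave the original untouched
        return false;
    }
    return rename($tmpPath, $path);
}

// Demo on a throwaway file shaped like the one in the question.
$demoPath = tempnam(sys_get_temp_dir(), 'paste');
file_put_contents($demoPath, "<?php\n\$files[] = 'page1.php';\n\$files[] = 'page2.php';\n\$file = \$files[0];\n");
$pattern = '/^\s*\$files\[\]\s*=\s*\'[^\']*\';/';
insertAfterFirstRun($demoPath, $pattern, "\$files[] = 'example.php';");
echo file_get_contents($demoPath);
unlink($demoPath);
```

Writing to a temporary file and renaming it at the end means a crash halfway through leaves the original file as it was.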
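Given this line in the editable file:
$filesArray = array('page1.php','page2.php','page3.php','page4.php','page5.php',);
read the file with file() and insert the new name just before the closing parenthesis (this relies on the trailing comma shown above):
$path = "path/to/editable_file.php";
$data = file($path);
foreach ($data as $i => $line)
{
    // single-quoted pattern, so \$ reaches the regex engine as a literal dollar sign
    $updated = preg_replace('/(\$filesArray\s*=\s*array\([^)]*)\);/', '${1}\'' . $newfilename . '\',);', $line, 1, $count);
    if ($count > 0)
    {
        $data[$i] = $updated;
        file_put_contents($path, implode('', $data)); // file() keeps each line ending
        break;
    }
}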
I have a file named $dir and a string named $line, I know that this string is a complete line of that file but I don't know its line number and I want to remove it from file, what should I do?
Is it possible to use awk?
$contents = file_get_contents($dir);
$contents = str_replace($line, '', $contents);
file_put_contents($dir, $contents);
Read the lines one by one, and write all but the matching line to another file. Then replace the original file.
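A sketch of that approach, assuming the line should match exactly apart from its line ending. removeLineStreaming() is a hypothetical helper name, and the ".tmp" suffix for the second file is an arbitrary choice.

```php
<?php
// removeLineStreaming() is a hypothetical helper name, not a PHP built-in.
function removeLineStreaming(string $path, string $target): bool
{
    $in = fopen($path, 'r');
    if (!$in) {
        return false;
    }
    $tmpPath = $path . '.tmp'; // the "other file" that receives the kept lines
    $out = fopen($tmpPath, 'w');
    if (!$out) {
        fclose($in);
        return false;
    }

    $removed = false;
    while (($line = fgets($in)) !== false) {
        // Compare without the line ending so "\n" and "\r\n" files both work.
        if (rtrim($line, "\r\n") === $target) {
            $removed = true;
            continue; // skip the matching line
        }
        fwrite($out, $line);
    }

    fclose($in);
    fclose($out);

    if (!$removed) {
        unlink($tmpPath); // no match: keep the original as it is
        return false;
    }
    // Replace the original file with the filtered copy.
    return rename($tmpPath, $path);
}

// Demo on a throwaway file.
$demoPath = tempnam(sys_get_temp_dir(), 'rm');
file_put_contents($demoPath, "first\nsecond\nthird\n");
removeLineStreaming($demoPath, 'second');
echo file_get_contents($demoPath);
unlink($demoPath);
```

Only one line is held in memory at a time, so the file size is not a concern; the cost is one full copy of the file on disk.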
This loops over every line; any line that is not the one you want to delete gets pushed to an array, which is then written back to the file.
$DELETE = "the_line_you_want_to_delete";
$data = file("./foo.txt");
$out = array();
foreach($data as $line) {
if(trim($line) != $DELETE) {
$out[] = $line;
}
}
$fp = fopen("./foo.txt", "w+");
flock($fp, LOCK_EX);
foreach($out as $line) {
fwrite($fp, $line);
}
flock($fp, LOCK_UN);
fclose($fp);
It can be solved without the use of awk:
function remove_line($file, $remove) {
$lines = file($file, FILE_IGNORE_NEW_LINES);
foreach($lines as $key => $line) {
if($line === $remove) unset($lines[$key]);
}
$data = implode(PHP_EOL, $lines);
file_put_contents($file, $data);
}
Another approach is to read the file line by line until you find a match, then truncate the file to that point, and then append the rest of the lines.
This is also good if you're looking for a substring (ID) in a line and want to replace the old line with a new line.
Code:
$contents = file_get_contents($dir);
$new_contents = "";
if (strpos($contents, $id) !== false) { // if file contains ID
$contents_array = explode(PHP_EOL, $contents);
foreach ($contents_array as &$record) { // for each line
if (strpos($record, $id) !== false) { // if we have found the correct line
continue; // we've found the line to delete - so don't add it to the new contents.
} else {
$new_contents .= $record . PHP_EOL; // not the correct line, so we keep it (same separator that explode() split on)
}
}
file_put_contents($dir, $new_contents); // save the records to the file
echo json_encode("Successfully updated record!");
}
else {
echo json_encode("failed - user ID ". $id ." doesn't exist!");
}
Example:
input: "123,student"
Old file:
ID,occupation
123,student
124,brick layer
Running the code will change file to:
New file:
ID,occupation
124,brick layer
All answers here have in common that they load the complete file into memory. Here is an implementation that removes one (or more) line(s) without copying the file's content into a variable.
The idea is to iterate over the file's lines. If a line should be removed, its length is added to $byte_offset. The next line is then moved $byte_offset bytes "upwards". This is done with all following lines. Once all lines are processed, the file's last $byte_offset bytes are removed.
I guess this is faster for bigger files because nothing is copied, and I guess that at some file size the other answers do not work at all while this one should. But I didn't test it.
Usage:
$file = fopen("path/to/file", "r+"); // not "a+": in append mode every write goes to the end of the file, ignoring fseek()
// remove lines 1 and 2 and the line containing only "line"
fremove_line($file, 1, 2, "line");
fclose($file);
The code of the fremove_line() function:
/**
* Remove the `$lines` by either their line number (as an int) or their content
* (without trailing new-lines).
*
* Example:
* ```php
* $file = fopen("path/to/file", "r+"); // must be opened for reading and writing, not in append mode
* // remove lines 1 and 2 and the line containing only "line"
* fremove_line($file, 1, 2, "line");
* fclose($file);
* ```
*
* @param resource $file The file resource opened by `fopen()`
* @param int|string ...$lines The one-based line number(s) or the full line
* string(s) to remove, if the line does not exist, it is ignored
*
* @return boolean True on success, false on failure
*/
function fremove_line($file, ...$lines): bool{
// set the pointer to the start of the file
if(!rewind($file)){
return false;
}
// get the stat for the full size to truncate the file later on
$stat = fstat($file);
if(!$stat){
return false;
}
$current_line = 1; // change to 0 for zero-based $lines
$byte_offset = 0;
while(($line = fgets($file)) !== false){
// the bytes of the lines ("number of ASCII chars")
$line_bytes = strlen($line);
if($byte_offset > 0){
// move lines upwards
// go back the `$byte_offset`
fseek($file, -1 * ($byte_offset + $line_bytes), SEEK_CUR);
// move the line upwards, until the `$byte_offset` is reached
if(!fwrite($file, $line)){
return false;
}
// set the file pointer to the current line again, `fwrite()` added `$line_bytes`
// already
fseek($file, $byte_offset, SEEK_CUR);
}
// remove trailing line endings for comparing
$line_content = preg_replace("~[\n\r]+$~", "", $line);
if(in_array($current_line, $lines, true) || in_array($line_content, $lines, true)){
// the `$current_line` should be removed so save to skip the number of bytes
$byte_offset += $line_bytes;
}
// keep track of the current line
$current_line++;
}
// remove the end of the file
return ftruncate($file, $stat["size"] - $byte_offset);
}
Convert text to array, remove first line and reconvert to text
$line=explode("\r\n",$text);
unset($line[0]);
$text=implode("\r\n",$line);
I think the best way to work with files is to treat them like strings:
/**
* Removes the first found line inside the given file.
*
* @param string $line The line content to be searched.
* @param string $filePath Path of the file to be edited.
* @param bool $removeOnlyFirstMatch Whether to remove only the first match or
* the whole matches.
* @return bool If any matches found (and removed) or not.
*
* @throws \RuntimeException If the file is empty.
* @throws \RuntimeException When the file cannot be updated.
*/
function removeLineFromFile(
string $line,
string $filePath,
bool $removeOnlyFirstMatch = true
): bool {
// You can wrap it inside a try-catch block
$file = new \SplFileObject($filePath, "r");
// Checks whether the file size is not zero
$fileSize = $file->getSize();
if ($fileSize !== 0) {
// Read the whole file
$fileContent = $file->fread($fileSize);
} else {
// File is empty
throw new \RuntimeException("File '$filePath' is empty");
}
// Free file resources
$file = null;
// Divide file content into its lines
$fileLineByLine = explode(PHP_EOL, $fileContent);
$found = false;
foreach ($fileLineByLine as $lineNumber => $thisLine) {
if ($thisLine === $line) {
$found = true;
unset($fileLineByLine[$lineNumber]);
if ($removeOnlyFirstMatch) {
break;
}
}
}
// We don't need to update file either if the line not found
if (!$found) {
return false;
}
// Join lines together
$newFileContent = implode(PHP_EOL, $fileLineByLine);
// Finally, update the file
$file = new \SplFileObject($filePath, "w");
if ($file->fwrite($newFileContent) !== strlen($newFileContent)) {
throw new \RuntimeException("Could not update the file '$filePath'");
}
return true;
}
Here is a brief description of what is being done: Get the whole file content, split the content into its lines (i.e. as an array), find the match(es) and remove them, join all lines together, and save the result back to the file (only if any changes happened).
Let's now use it:
// $dir is your filename, as you mentioned
removeLineFromFile($line, $dir);
Notes:
You can use fopen() family functions instead of SplFileObject, but I do recommend the object form, as it's exception-based, more robust and more efficient (in this case at least).
It's safe to unset() an element of an array being iterated using foreach (There's a comment here showing it can lead unexpected results, but it's totally wrong: As you can see in the example code, $value is copied (i.e. it's not a reference), and removing an array element does not affect it).
$line should not have new line characters like \n, otherwise, you may perform lots of redundant searches.
Don't use
$fileLineByLine[$lineNumber] = "";
// Or even
$fileLineByLine[$lineNumber] = null;
instead of
unset($fileLineByLine[$lineNumber]);
The reason is, the first case doesn't remove the line, it just clears the line (and an unwanted empty line will remain).
Hope it helps.
Like this:
file_put_contents($filename, str_replace($line . "\r\n", "", file_get_contents($filename)));
I have a script which, each time is called, gets the first line of a file. Each line is known to be exactly of the same length (32 alphanumeric chars) and terminates with "\r\n".
After getting the first line, the script removes it.
This is done in this way:
$contents = file_get_contents($file);
$first_line = substr($contents, 0, 32);
file_put_contents($file, substr($contents, 32 + 2)); //+2 because we remove also the \r\n
Obviously it works, but I was wondering whether there is a smarter (or more efficient) way to do this?
In my simple solution I basically read and rewrite the entire file just to take and remove the first line.
I came up with this idea yesterday:
function read_and_delete_first_line($filename) {
$file = file($filename);
$output = $file[0];
unset($file[0]);
file_put_contents($filename, $file);
return $output;
}
There is no more efficient way to do this other than rewriting the file.
No need to create a second temporary file, nor put the whole file in memory:
if ($handle = fopen("file", "c+")) { // open the file in reading and editing mode
if (flock($handle, LOCK_EX)) { // lock the file, so no one can read or edit this file
while (($line = fgets($handle, 4096)) !== FALSE) {
if (!isset($write_position)) { // move the line to previous position, except the first line
$write_position = 0;
} else {
$read_position = ftell($handle); // get actual line
fseek($handle, $write_position); // move to previous position
fputs($handle, $line); // put actual line in previous position
fseek($handle, $read_position); // return to actual position
$write_position += strlen($line); // set write position to the next loop
}
}
fflush($handle); // write any pending change to file
ftruncate($handle, $write_position); // drop the repeated last line
flock($handle, LOCK_UN); // unlock the file
}
fclose($handle);
}
This shifts the first line off a file without loading the entire file into memory, as the file() function does. For small files it may be a bit slower than file() (though I bet it is not), but it can handle very large files without problems.
$firstline = false;
if($handle = fopen($logFile,'c+')){
if(!flock($handle,LOCK_EX)){fclose($handle);}
$offset = 0;
$len = filesize($logFile);
while(($line = fgets($handle,4096)) !== false){
if(!$firstline){$firstline = $line;$offset = strlen($firstline);continue;}
$pos = ftell($handle);
fseek($handle,$pos-strlen($line)-$offset);
fputs($handle,$line);
fseek($handle,$pos);
}
fflush($handle);
ftruncate($handle,($len-$offset));
flock($handle,LOCK_UN);
fclose($handle);
}
You can iterate over the file instead of putting it all in memory:
$handle = fopen("file", "r");
$first = fgets($handle,2048); #get first line.
$outfile="temp";
$o = fopen($outfile,"w");
while (!feof($handle)) {
$buffer = fgets($handle,2048);
fwrite($o,$buffer);
}
fclose($handle);
fclose($o);
rename($outfile, "file"); // replace the file opened above ($file was never defined)
I wouldn't usually recommend opening up a shell for this sort of thing, but if you're doing this infrequently on really large files, there's probably something to be said for:
$lines = (int) `wc -l < myfile` - 1; // reading from stdin keeps the file name out of wc's output
`tail -n $lines myfile > newfile`;
It's simple, and it doesn't involve reading the whole file into memory.
I wouldn't recommend this for small files, or extremely frequent use though. The overhead's too high.
You could store positional info into the file itself. For example, the first 8 bytes of the file could store an integer. This integer is the byte offset of the first real line in the file.
So, you never delete lines anymore. Instead, deleting a line means altering the start position. fseek() to it and then read lines as normal.
The file will grow big eventually. You could periodically clean up the orphaned lines to reduce the file size.
But seriously, just use a database and don't do stuff like this.
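For what it's worth, the offset-header idea could be sketched like this. Everything here is an assumption made for illustration: the function names queueCreate() and queueShift() are made up, the header is stored with pack('J') (an unsigned 64-bit big-endian integer, which needs a 64-bit PHP build), and lines end in "\n".

```php
<?php
const HEADER_BYTES = 8; // size of the offset header, as suggested above

// Creates the file: an 8-byte header followed by the lines.
function queueCreate(string $path, array $lines): void
{
    // The first real line starts right after the header.
    $data = pack('J', HEADER_BYTES);
    foreach ($lines as $line) {
        $data .= $line . "\n";
    }
    file_put_contents($path, $data);
}

// Returns the next line (without its newline) and advances the stored offset,
// or null when no lines are left. Nothing is ever rewritten except the header.
function queueShift(string $path): ?string
{
    $fp = fopen($path, 'r+');
    if (!$fp) {
        return null;
    }
    flock($fp, LOCK_EX);

    $line = false;
    $header = fread($fp, HEADER_BYTES);
    if ($header !== false && strlen($header) === HEADER_BYTES) {
        $offset = unpack('J', $header)[1];
        fseek($fp, $offset);
        $line = fgets($fp);
        if ($line !== false) {
            // "Delete" the line by moving the start offset past it.
            $next = ftell($fp);
            rewind($fp);
            fwrite($fp, pack('J', $next));
        }
    }

    flock($fp, LOCK_UN);
    fclose($fp);
    return $line === false ? null : rtrim($line, "\r\n");
}

// Demo on a throwaway file.
$demoPath = tempnam(sys_get_temp_dir(), 'queue');
queueCreate($demoPath, ['alpha', 'beta']);
var_dump(queueShift($demoPath), queueShift($demoPath), queueShift($demoPath));
unlink($demoPath);
```

Each call touches only the header and one line, regardless of file size. The periodic clean-up of orphaned lines mentioned above is not shown.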
Here's one way:
$contents = file($file, FILE_IGNORE_NEW_LINES);
$first_line = array_shift($contents);
file_put_contents($file, implode("\r\n", $contents));
There's countless other ways to do that also, but all the methods would involve separating the first line somehow and saving the rest. You cannot avoid rewriting the whole file. An alternative take:
list($first_line, $contents) = explode("\r\n", file_get_contents($file), 2);
file_put_contents($file, $contents); // $contents is already a string here, so no implode()
My problem was large files. I just needed to edit, or remove the first line. This was a solution I used. Didn't require to load the complete file in a variable. Currently echos, but you could always save the contents.
$fh = fopen($local_file, 'rb');
echo "add\tfirst\tline\n"; // add your new first line.
fgets($fh); // moves the file pointer to the next line.
echo stream_get_contents($fh); // flushes the remaining file.
fclose($fh);
I think this is best for any file size
$myfile = fopen("yourfile.txt", "r") or die("Unable to open file!");
$ch=1;
while(!feof($myfile)) {
$dataline= fgets($myfile) . "<br>";
if($ch == 2){
echo str_replace(' ', ' ', $dataline)."\n";
}
$ch = 2;
}
fclose($myfile);
The solutions here did not perform well enough for me.
My solution grabs the last line (not the first; in my case it did not matter whether I got the first or the last line) from the file and removes it from the file.
This is very quick even with very large files (>150000000 lines).
function file_pop($file)
{
if ($fp = @fopen($file, "c+")) {
if (!flock($fp, LOCK_EX)) {
fclose($fp);
return false; // could not lock; do not keep using the closed handle
}
$pos = -1;
$found = 0;
while ($found < 2) {
if (fseek($fp, $pos--, SEEK_END) < 0) { // can not seek to position
rewind($fp); // rewind to the beginning of the file
break;
};
if (ord(fgetc($fp)) == 10) { // newline
$found++;
}
}
$lastpos = ftell($fp); // get current position of file
$lastline = fgets($fp); // get current line
ftruncate($fp, $lastpos); // truncate file to last position
flock($fp, LOCK_UN); // unlock
fclose($fp); // close the file
return trim($lastline);
}
}
You could use file() method.
Gets the first line
$content = file('myfile.txt');
echo $content[0];