I have a text file of about 3 GB. I need to delete some strings from it, but I'm not sure that my method is good. So far I have done the following:
- read each string from the file
- find the strings that need to be deleted
- build 2 arrays: strings to keep and strings to delete
What should the next steps be? This task looks easy for small files, but a giant file causes more problems.
if ($fh = fopen("file.txt", "r")) {
    $left = '';
    while (!feof($fh)) { // read the file in chunks
        $temp = fread($fh, 8192); // fread() requires a length; 8192 bytes is an arbitrary chunk size
        $fgetslines = explode("\n", $temp);
        $fgetslines[0] = $left . $fgetslines[0];
        // the last piece may be an incomplete line, so carry it over to the next chunk
        if (!feof($fh)) $left = array_pop($fgetslines);
        foreach ($fgetslines as $k => $line) {
            // This is where you can build your check for the strings you want to remove:
            // an if statement or a switch, whichever makes sense with your current logic.
            // Write the strings you want to keep to a temp file.
            // After the loop, overwrite your original file with the temp file.
        }
    }
    fclose($fh); // only close the handle if it was opened
}
I think this is what you're looking for.
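The steps described in the comments above could be sketched roughly as follows. This is only a minimal sketch, not a tested solution for your data: the function name removeLines, the ".tmp" suffix and the callback parameter are assumptions made for illustration. It reads with fgets(), so only one line is held in memory at a time.

```php
<?php
// Hypothetical helper: streams $path line by line into a temp file,
// skipping every line for which $shouldDelete returns true,
// then replaces the original file with the temp file.
function removeLines(string $path, callable $shouldDelete): int
{
    $tmpPath = $path . '.tmp'; // assumed temp file name
    $in = fopen($path, 'r');
    if ($in === false) {
        throw new RuntimeException("Cannot read $path");
    }
    $out = fopen($tmpPath, 'w');
    if ($out === false) {
        fclose($in);
        throw new RuntimeException("Cannot write $tmpPath");
    }

    $removed = 0;
    while (($line = fgets($in)) !== false) {
        // Pass the line without its line ending to the callback
        if ($shouldDelete(rtrim($line, "\r\n"))) {
            $removed++;
            continue;
        }
        fwrite($out, $line); // keep the line exactly as it was
    }

    fclose($in);
    fclose($out);

    if (!rename($tmpPath, $path)) {
        throw new RuntimeException("Cannot replace $path");
    }
    return $removed; // number of deleted lines
}
```

For example, removeLines('file.txt', function ($line) { return strpos($line, 'DEBUG') === 0; }) would drop every line starting with "DEBUG". You need enough free disk space for a second copy of the file while it runs.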
I have a CSV file in which I want the first 11 lines to be removed. The file looks something like:
"MacroTrends Data Download"
"GOOGL - Historical Price and Volume Data"
"Historical prices are adjusted for both splits and dividends"
"Disclaimer and Terms of Use: Historical stock data is provided 'as is' and solely for informational purposes, not for trading purposes or advice."
"MacroTrends LLC expressly disclaims the accuracy, adequacy, or completeness of any data and shall not be liable for any errors, omissions or other defects in, "
"delays or interruptions in such data, or for any actions taken in reliance thereon. Neither MacroTrends LLC nor any of our information providers will be liable"
"for any damages relating to your use of the data provided."
date,open,high,low,close,volume
2004-08-19,50.1598,52.1911,48.1286,50.3228,44659000
2004-08-20,50.6614,54.7089,50.4056,54.3227,22834300
2004-08-23,55.5515,56.9157,54.6938,54.8694,18256100
2004-08-24,55.7922,55.9728,51.9454,52.5974,15247300
2004-08-25,52.5422,54.1672,52.1008,53.1641,9188600
I want only the stocks data and not anything else. So I wish to remove the first 11 lines. Also, there will be several text files for different tickers. So str_replace doesn't seem to be a viable option. The function I've been using to get CSV file and putting the required contents to a text file is
function getCSVFile($url, $outputFile)
{
$content = file_get_contents($url);
$content = str_replace("date,open,high,low,close,volume", "", $content);
$content = trim($content);
file_put_contents($outputFile, $content);
}
I want a general solution which can remove the first 11 lines from the CSV file and put the remaining contents to a text file. How do I do this?
None of the other examples here will work for large files, because they load the whole file into memory. If you want your code to be efficient, keep its memory footprint low.
Instead, parse the file line by line:
function saveStrippedCsvFile($inputFile, $outputFile, $lineCountToRemove)
{
    $inputHandle = fopen($inputFile, 'r');
    $outputHandle = fopen($outputFile, 'w');
    // make sure you handle errors as well
    // files may be unreadable, unwritable etc…
    $counter = 0;
    // fgets() returns false at the end of the file, which ends the loop
    while (($line = fgets($inputHandle)) !== false) {
        if ($counter < $lineCountToRemove) {
            ++$counter; // skip this line
            continue;
        }
        fwrite($outputHandle, $line); // $line already ends with its own newline
    }
    fclose($inputHandle);
    fclose($outputHandle);
}
I have a CSV file in which I want the first 11 lines to be removed.
I always prefer to use explode to do that.
$string = file_get_contents($file);
$lines = explode("\n", $string); // double quotes, so that \n is a real newline
for ($i = 0; $i < 11; $i++) { // keys 0 to 10 = the first 11 lines
    unset($lines[$i]);
}
This will remove them, and with implode you can create a new 'file' out of the rest:
$new = implode("\n", $lines);
$new will contain the new file contents.
I didn't test it, but I'm pretty sure that this will work.
Be careful! To quote @emix's comment:
This will fail spectacularly if the file content exceeds available PHP memory.
Make sure that the file isn't too huge.
Use file() to read it as an array and simply slice off the first 11 lines:
$content = file($url);
$newContent = array_slice($content, 11); // offset 11 skips elements 0 to 10
file_put_contents($outputFile, implode('', $newContent)); // lines from file() keep their newlines
But answer these questions:
Why is there additional content in this CSV?
How will you know how many lines to cut off? What if there are more than 11 lines to cut?
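If the number of preamble lines can vary, one option is to skip everything up to a known header line instead of counting. The sketch below assumes the header is always exactly "date,open,high,low,close,volume", as in the question; the function name stripPreamble is made up for illustration. It also drops the header line itself, since the question's own code removes it.

```php
<?php
// Hypothetical helper: copies only the lines that follow $headerLine
// from $inputFile to $outputFile. Returns false if the header was never found.
function stripPreamble(string $inputFile, string $outputFile, string $headerLine): bool
{
    $in = fopen($inputFile, 'r');
    if ($in === false) {
        throw new RuntimeException("Cannot read $inputFile");
    }
    $out = fopen($outputFile, 'w');
    if ($out === false) {
        fclose($in);
        throw new RuntimeException("Cannot write $outputFile");
    }

    $found = false;
    while (($line = fgets($in)) !== false) {
        if ($found) {
            fwrite($out, $line); // data row: copy unchanged
        } elseif (rtrim($line, "\r\n") === $headerLine) {
            $found = true;       // header reached: start copying from the next line
        }
    }

    fclose($in);
    fclose($out);
    return $found;
}
```

A false return value means the expected header never appeared, so the output file is empty and the input probably has a different format.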
My text file is sample.txt. I want to exclude the first row of the text file and store the other rows in a MySQL database.
ID Name EMail
1 Siva xyz@gmail.com
2 vinoth xxx@gmail.com
3 ashwin yyy@gmail.com
Now I want to read this data from the text file, except the first row (ID, Name, EMail), and store it in the MySQL database, because I have already created fields in the database with the same names.
I have tried
$handle = @fopen($filename, "r"); //read line one by one
while (!feof($handle)) // Loop till end of file.
{
$buffer = fgets($handle, 4096); // Read a line.
}
print_r($buffer); // It shows all the text.
Please let me know how to do this?
Thanks.
Regards,
Siva R
It's easier if you use file() since it will get all rows in an array instead:
// Get all rows in an array (and tell file() not to include the trailing new lines)
$rows = file($filename, FILE_IGNORE_NEW_LINES);
// Remove the first element (first row) from the array
array_shift($rows);
// Now do what you want with the rest
foreach ($rows as $lineNumber => $row) {
// do something cool with the row data
}
If you want to get it all as a string again, without the first row, just implode it with a new line as glue:
// The rows no longer contain line breaks (because of FILE_IGNORE_NEW_LINES), so glue them with "\n"
$content = implode("\n", $rows);
Note: As @Don'tPanic pointed out in his comment, using file() is simple and easy but not advisable if the original file is large, since it will read the whole thing into memory as an array (and arrays take more memory than strings). He also correctly recommended the FILE_IGNORE_NEW_LINES flag, just so you know :-)
You can just call fgets once before your while loop to get the header row out of the way.
$firstline = fgets($handle, 4096);
while (!feof($handle)) // Loop till end of file.
{ ...
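Put together, that approach might look like the sketch below. It assumes the columns are separated by whitespace, as in the sample; the function name readRowsWithoutHeader is made up, and the database insert is left out because it depends on your table and connection.

```php
<?php
// Hypothetical helper: returns every row after the header as an array of columns.
function readRowsWithoutHeader(string $filename): array
{
    $handle = fopen($filename, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot read $filename");
    }

    fgets($handle); // read and discard the header row

    $rows = [];
    while (($line = fgets($handle)) !== false) {
        $line = trim($line);
        if ($line === '') {
            continue; // ignore blank lines
        }
        // Split on runs of whitespace: "1 Siva xyz@gmail.com" gives 3 columns
        $rows[] = preg_split('/\s+/', $line);
    }

    fclose($handle);
    return $rows;
}
```

Each element of the returned array holds the ID, name and email of one row, ready to be bound to a prepared INSERT statement.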
I was using a script to exclude a list of words from another list of keywords. I would like to change the format of the output. (I found the script on this website and I have made some modification.)
Example:
Phrase from outcome: my word
I would like to add quotes: "my word"
I was thinking that I should write the outcome to new-file.txt and then rewrite it, but I do not understand how to capture the result. Please give me some tips. It's my first script :)
Here is the code:
<?php
$myfile = fopen("newfile1.txt", "w") or die("Unable to open file!");
// Open a file to write the changes - test
$file = file_get_contents("test-action-write-a-doc-small.txt");
// In small.txt there are words that will be excluded from the big list
$searchstrings = file_get_contents("test-action-write-a-doc-full.txt");
// From this list the script is excluding the words that are in small.txt
$breakstrings = explode(',',$searchstrings);
foreach ($breakstrings as $values){
if(!strpos($file, $values)) {
echo $values." = Not found;\n";
}
else {
echo $values." = Found; \n";
}
}
echo "<h1>Outcome:</h1>";
foreach ($breakstrings as $values){
if(!strpos($file, $values)) {
echo $values."\n";
}
}
fwrite($myfile, $values); // write the result in newfile1.txt - test
// a loop is missing?
fclose($myfile); // close newfile1.txt - test
?>
There is also a little mistake in the script. It works fine however before entering the list of words in test-action-write-a-doc-full.txt and in test-action-write-a-doc-small.txt I have to put a break for the first line otherwise it does not find the first word.
Example:
In test-action-write-a-doc-small.txt words:
pick, lol, file, cool,
In test-action-write-a-doc-full.txt words:
pick, bad, computer, lol, break, file.
Outcome:
Pick = Not found -- here is the mistake.
It happens if I do not put a break for the first line in .txt
lol = Found
file = Found
Thanks in advance for any help! :)
You can collect the accepted words in an array, and then glue all those array elements into one text, which you then write to the file. Like this:
echo "<h1>Outcome:</h1>";
// Build an array with accepted words
$keepWords = array();
foreach ($breakstrings as $values){
// remove white space surrounding word
$values = trim($values);
// compare with false, and skip empty strings
if ($values !== "" and false === strpos($file, $values)) {
// Add word to end of array, you can add quotes if you want
$keepWords[] = '"' . $values . '"';
}
}
// Glue all words together with commas
$keepText = implode(",", $keepWords);
// Write that to file
fwrite($myfile, $keepText);
Note that you should not write !strpos(..) but false === strpos(..) as explained in the docs.
Note also that this method of searching in $file will maybe give unexpected results. For instance, if you have "misery" in your $file string then the word "is" (if separated by commas in the original file) will be refused, as it is found in $file. You might want to review this.
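One way to avoid such partial matches is to compare whole words instead of searching inside one long string. The following is a minimal sketch under the assumption that both files are plain comma-separated lists; the function name excludeWords is made up for illustration.

```php
<?php
// Hypothetical helper: returns the words from $fullList that do not appear
// as whole words in $excludeList, each wrapped in double quotes.
function excludeWords(string $fullList, string $excludeList): array
{
    // Split a comma-separated string into trimmed, non-empty words
    $toWords = function (string $csv): array {
        $words = array_map('trim', explode(',', $csv));
        return array_values(array_filter($words, function ($w) {
            return $w !== '';
        }));
    };

    // Flip so that the excluded words become array keys for fast lookup
    $excluded = array_flip($toWords($excludeList));

    $kept = [];
    foreach ($toWords($fullList) as $word) {
        if (!isset($excluded[$word])) {
            $kept[] = '"' . $word . '"';
        }
    }
    return $kept;
}
```

With this comparison, "is" is kept even when "misery" is in the exclusion list, because only exact matches count. The result can be written with fwrite($myfile, implode(",", $kept)) as before.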
Concerning the second problem
The fact that it does not work without first adding a line-break in your file leads me to think it is related to the Byte-Order Mark (BOM) that appears in the beginning of many UTF-8 encoded files. The problem and possible solutions are discussed here and elsewhere.
If indeed it is this problem, there are two solutions I would propose:
Use your text editor to save the file as UTF-8, but without BOM. For instance, notepad++ has this possibility in the encoding menu.
Or, add this to your code:
function removeBOM($str = "") {
if (substr($str, 0,3) == pack("CCC",0xef,0xbb,0xbf)) {
$str = substr($str, 3);
}
return $str;
}
and then wrap all your file_get_contents calls with that function, like this:
$file = removeBOM(file_get_contents("test-action-write-a-doc-small.txt"));
// In small.txt there are words that will be excluded from the big list
$searchstrings = removeBOM(file_get_contents("test-action-write-a-doc-full.txt"));
// From this list the script is excluding the words that are in small.txt
This will strip these funny bytes from the start of the string taken from the file.
Hi, I want to add a row at the beginning of a file using PHP.
Let's say, for example, the file contains the following content:
Hello Stack Overflow, you are really helping me a lot.
And now I want to add a row on top of the previous one, like this:
www.stackoverflow.com
Hello Stack Overflow, you are really helping me a lot.
This is the code that I am having at the moment in a script.
$fp = fopen($file, 'a+') or die("can't open file");
$theOldData = fread($fp, filesize($file));
fclose($fp);
$fp = fopen($file, 'w+') or die("can't open file");
$toBeWriteToFile = $insertNewRow.$theOldData;
fwrite($fp, $toBeWriteToFile);
fclose($fp);
I want some optimal solution for it, as I am using it in a php script. Here are some solutions i found on here:
Need to write at beginning of file with PHP
which says the following to append at the beginning:
<?php
$file_data = "Stuff you want to add\n";
$file_data .= file_get_contents('database.txt');
file_put_contents('database.txt', $file_data);
?>
And other one here:
Using php, how to insert text without overwriting to the beginning of a text file
says the following:
$old_content = file_get_contents($file);
fwrite($file, $new_content."\n".$old_content);
So my final question is: which is the best (I mean optimal) method among all of the above? Is there anything better than the above?
Looking for your thoughts on this!
function file_prepend ($string, $filename) {
$fileContent = file_get_contents ($filename);
file_put_contents ($filename, $string . "\n" . $fileContent);
}
usage :
file_prepend("couldn't connect to the database", 'database.logs');
My personal preference when writing to a file is to use file_put_contents
From the manual:
This function is identical to calling fopen(), fwrite() and fclose()
successively to write data to a file.
Because the function automatically handles those three functions for me I do not have to remember to close the resource after I'm done with it.
There is no really efficient way to write before the first line of a file. Both solutions mentioned in your question create a new file by copying everything from the old one and then writing the new data (and there is not much difference between the two methods).
If you are really after efficiency, i.e. avoiding a whole copy of the existing file, and you need the last inserted line to be the first in the file, it all depends on how you plan to use the file after it is created.
three files
Per your comment, you could create three files (header, content and footer) and output each of them in sequence; that would avoid the copy even if the header is created after the content.
work reverse in one file
This method puts the file in memory (array).
Since you know you create the content before the header, always write lines in reverse order, footer, content, then header:
function write_reverse($lines, $file) { // $lines is an array
for($i=count($lines)-1 ; $i>=0 ; $i--) fwrite($file, $lines[$i]);
}
then you call write_reverse() first with footer, then content and finally header. Each time you want to add something at the beginning of the file, just write at the end...
Then to read the file for output
$lines = array();
while (($line = fgets($file)) !== false) $lines[] = $line;
// then print from last one
for ($i=count($lines)-1 ; $i>=0 ; $i--) echo $lines[$i];
Then there is another consideration: could you avoid using files at all - eg via PHP APC
You mean prepending. I suggest shifting the file contents forward in place, block by block: before each write, read the block that is about to be overwritten and carry it over to the next write, so no data is lost.
<?php
$dataToBeAdded = "www.stackoverflow.com\n";
$file = "database.txt";
$handle = fopen($file, "r+");
$chunkSize = strlen($dataToBeAdded); // fixed block size used for shifting
$pending = $dataToBeAdded;           // bytes waiting to be written
$position = 0;                       // where the next write goes
while ($pending !== '')
{
    fseek($handle, $position);
    // save the block that the next write would overwrite
    $existingData = (string) fread($handle, $chunkSize);
    fseek($handle, $position);
    fwrite($handle, $pending);
    $position += strlen($pending);
    $pending = $existingData; // empty once the end of the file is reached
}
fclose($handle);
?>
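For files too large to hold in memory, another option is to stream the old contents into a temporary file behind the new line and then swap the files. This is only a sketch: the function name prependLine and the ".tmp" suffix are assumptions made for illustration. It still copies the whole file, as the answers above explain is unavoidable, but it never loads it into a PHP string.

```php
<?php
// Hypothetical helper: writes $line plus a newline to a temp file,
// streams the existing contents of $path after it, then replaces $path.
function prependLine(string $path, string $line): void
{
    $tmpPath = $path . '.tmp'; // assumed temp file name
    $in = fopen($path, 'r');
    if ($in === false) {
        throw new RuntimeException("Cannot read $path");
    }
    $out = fopen($tmpPath, 'w');
    if ($out === false) {
        fclose($in);
        throw new RuntimeException("Cannot write $tmpPath");
    }

    fwrite($out, $line . "\n");
    // Copies the rest in internal chunks instead of one big string
    stream_copy_to_stream($in, $out);

    fclose($in);
    fclose($out);

    if (!rename($tmpPath, $path)) {
        throw new RuntimeException("Cannot replace $path");
    }
}
```

Usage would be prependLine('database.txt', 'www.stackoverflow.com');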
I am trying to use a php call through AJAX to replace a single line of a .txt file, in which I store user-specific information. The problem is that if I use fwrite once getting to the correct line, it leaves any previous information which is longer than the replacement information untouched at the end. Is there an easy way to clear a single line in a .txt file with php that I can call first?
Example of what is happening: let's say I'm storing a favorite composer, and a user has "Beethoven" in their .txt file and wants to change it to "Mozart". When I use fwrite to write "Mozart" over "Beethoven", I get "Mozartven" as the new line. I am using "r+" in the fopen call, as I only want to replace a single line at a time.
If this configuration data doesn't need to be made available to non-PHP apps, consider using var_export() instead. It's basically var_dump/print_r, but outputs the variable as parseable PHP code. This'd reduce your code to:
include('config.php');
$CONFIG['musician'] = 'Mozart';
file_put_contents('config.php', '<?php $CONFIG = ' . var_export($CONFIG, true) . ';'); // the trailing ';' keeps config.php valid PHP
This is code I wrote some time ago to delete a line from a file; it has to be modified for your case. It will work correctly if the new line is shorter than the old one; for longer lines, heavy modification will be required.
The key is the second while loop, in which everything in the file after the change is rewritten at the correct position in the file.
<?php
$size = filesize('test.txt');
$file = fopen('test.txt', 'r+');
$lineToDelete = 3;
$counter = 1;
while ($counter < $lineToDelete) {
fgets($file); // skip
$counter++;
}
$position = ftell($file);
$lineToRemove = fgets($file);
$bufferSize = strlen($lineToRemove);
while ($newLine = fread($file, $bufferSize)) {
fseek($file, $position, SEEK_SET);
fwrite($file, $newLine);
$position = ftell($file);
fseek($file, $bufferSize, SEEK_CUR);
}
ftruncate($file, $size - $bufferSize);
echo 'Done';
fclose($file);
?>
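If rewriting in place is not a hard requirement, replacing a line with one of any length is simpler when the file is streamed into a temporary copy. The sketch below is a different technique from the in-place code above; the function name replaceLine and the ".tmp" suffix are assumptions made for illustration.

```php
<?php
// Hypothetical helper: replaces line number $lineNumber (1-based) in $path
// with $newText, whatever the lengths of the old and new lines.
// Returns false if the file has fewer lines than $lineNumber.
function replaceLine(string $path, int $lineNumber, string $newText): bool
{
    $tmpPath = $path . '.tmp'; // assumed temp file name
    $in = fopen($path, 'r');
    if ($in === false) {
        throw new RuntimeException("Cannot read $path");
    }
    $out = fopen($tmpPath, 'w');
    if ($out === false) {
        fclose($in);
        throw new RuntimeException("Cannot write $tmpPath");
    }

    $current = 0;
    $replaced = false;
    while (($line = fgets($in)) !== false) {
        $current++;
        if ($current === $lineNumber) {
            // Keep whatever line ending the old line had ("\n", "\r\n" or none)
            $ending = (string) substr($line, strlen(rtrim($line, "\r\n")));
            fwrite($out, $newText . $ending);
            $replaced = true;
        } else {
            fwrite($out, $line);
        }
    }

    fclose($in);
    fclose($out);

    if (!rename($tmpPath, $path)) {
        throw new RuntimeException("Cannot replace $path");
    }
    return $replaced;
}
```

With this, changing "Beethoven" to "Mozart" leaves no "ven" behind, because the whole line is written fresh rather than overwritten byte for byte.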