I need to process large CSV files with PHP.
I am working with fgetcsv and performance seems pretty good. For even better/faster processing I would like the CSV files to have column names on the first row, which are missing right now.
Now I would like to add a top row with column names to the CSV file with PHP before processing the files with fgetcsv.
Is there an easy way to add a line to the top of the file with PHP? Or would this mean I have to take an approach like the one below?
create a temp file
add the column names to this temp file
read original csv file contents
put original csv contents into the temp file
delete original file
rename the temp file
Any feedback on how to do this in the most effective way is highly appreciated. Thanks for your time :)
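Please try this. It is much simpler and straight to the point, though it reads the whole file into memory, so it only suits files that fit within memory_limit.
$file = 'addline.csv';
// no spaces after the commas, otherwise they become part of the column names
$header = "Name,IP,Host,RAM,Task,NS\r\n";
$data = file_get_contents($file);
file_put_contents($file, $header.$data);
You are done. Hope this helps...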
Read the CSV file using fgetcsv.
Close the file.
Open the file using "w" (write) mode.
Write the headers.
Close the file.
Open the file using "a" (append) mode.
Write the CSV file using fputcsv.
Voila!
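Just do it the way you described. There is no easy way to extend a file at the beginning instead of the end. It's like writing on paper: it's easy to add new lines below your text, but you can't write above your text.
Don't just open the file with fopen($file, "r+") or fopen($file, "c") and write the header: both modes place the pointer at the start of the file without truncating it, so the write overwrites the existing bytes at the beginning instead of inserting before them. Mode "w" is worse still, since it truncates the file and removes all lines.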
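The temp-file steps from the question could be sketched roughly as follows. The function name prependCsvHeader and the column names in the usage line are hypothetical, and the sketch assumes the temp file can be created in the same directory as the original so that rename() stays on one filesystem.

```php
<?php
// Hypothetical helper: prepend a header row by streaming through a temp file.
function prependCsvHeader(string $path, array $columns): bool
{
    $tmp = tempnam(dirname($path), 'csv');
    if ($tmp === false) {
        return false;
    }
    $in  = fopen($path, 'rb');
    $out = fopen($tmp, 'wb');
    if ($in === false || $out === false) {
        return false;
    }
    // Header row first; the CSV control characters are passed explicitly.
    fputcsv($out, $columns, ',', '"', '\\');
    // Then the original content, copied in chunks rather than loaded at once.
    stream_copy_to_stream($in, $out);
    fclose($in);
    fclose($out);
    // Replace the original file with the new one.
    return rename($tmp, $path);
}
```

Usage would look like prependCsvHeader('addline.csv', ['Name', 'IP', 'Host']). Note that tempnam() creates the file with restrictive permissions, so a chmod() afterwards may be needed if other users have to read the result.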
Why not define the header in your code manually, especially if every file has the same header?
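Defining the header in code might look something like this. It is only a sketch: the column names in CSV_COLUMNS and the function name readRowsWithHeader are made up for illustration, and rows whose field count does not match are simply skipped.

```php
<?php
// Hypothetical column names; replace them with the real ones.
const CSV_COLUMNS = ['name', 'ip', 'host'];

// Yields each CSV row as an associative array keyed by the given column names.
function readRowsWithHeader(string $path, array $columns): Generator
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
            // fgetcsv() returns [null] for a blank line; skip those and malformed rows.
            if ($row === [null] || count($row) !== count($columns)) {
                continue;
            }
            yield array_combine($columns, $row);
        }
    } finally {
        fclose($handle);
    }
}
```

Each yielded row can then be accessed as $row['ip'] and so on, without touching the files on disk.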
If you really need to insert a header into the CSV files before you process them, maybe you should consider using sed like this (GNU sed syntax): sed -i 1i'"header 1","header 2"' file.csv
What is the best way to overwrite a specific line in a file? I basically want to search a file for the string '#parsethis' and overwrite the rest of that line with something else.
If the file is really big (log files or something like this) and you are willing to sacrifice speed for memory consumption you could open two files and essentially do the trick Jeremy Ruten proposed by using files instead of system memory.
$source = 'in.txt';
$target = 'out.txt';
// copy operation
$sh = fopen($source, 'r');
$th = fopen($target, 'w');
// fgets() returns false at end of file, so test its result instead of feof()
while (($line = fgets($sh)) !== false) {
    if (strpos($line, '#parsethis') !== false) {
        $line = 'new line to be inserted' . PHP_EOL;
    }
    fwrite($th, $line);
}
fclose($sh);
fclose($th);
// delete old source file
unlink($source);
// rename target file to source file
rename($target, $source);
If the file isn't too big, the best way would probably be to read the file into an array of lines with file(), search through the array of lines for your string and edit that line, then implode() the array back together and fwrite() it back to the file.
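A minimal sketch of that file() approach, assuming the goal is to keep the marker and replace only what follows it on the line. The function name replaceAfterMarker is hypothetical, and the sketch uses file_put_contents() rather than fwrite() for brevity.

```php
<?php
// Hypothetical helper: replace everything after $marker on each matching line.
// Returns the number of lines changed.
function replaceAfterMarker(string $path, string $marker, string $replacement): int
{
    $lines = file($path, FILE_IGNORE_NEW_LINES);
    if ($lines === false) {
        throw new RuntimeException("Cannot read $path");
    }
    $changed = 0;
    foreach ($lines as $i => $line) {
        $pos = strpos($line, $marker);
        if ($pos !== false) {
            // Keep the text up to and including the marker, replace the rest.
            $lines[$i] = substr($line, 0, $pos + strlen($marker)) . $replacement;
            $changed++;
        }
    }
    // Line endings are normalised to PHP_EOL and a trailing newline is added.
    file_put_contents($path, implode(PHP_EOL, $lines) . PHP_EOL);
    return $changed;
}
```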
Your main problem is the fact that the new line may not be the same length as the old line. If you need to change the length of the line, there is no way out of rewriting at least all of the file after the changed line. The easiest way is to create a new, modified file and then move it over the original. This way there is a complete file available at all times for readers. Use locking to make sure that only one script is modifying the file at once, and since you are going to replace the file, do the locking on a different file. Check out flock().
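The locking idea could be sketched like this. The ".lock" suffix and the function name rewriteFileLocked are assumptions made for illustration, and error handling for the inner fopen() calls is left out to keep it short.

```php
<?php
// Hypothetical helper: rewrite $path line by line while holding an exclusive
// lock on a separate lock file, then move the new file over the original.
function rewriteFileLocked(string $path, callable $transformLine): void
{
    $lock = fopen($path . '.lock', 'c'); // lock-file name is an assumption
    if ($lock === false || !flock($lock, LOCK_EX)) {
        throw new RuntimeException('Could not acquire lock');
    }
    try {
        $tmp = tempnam(dirname($path), 'tmp');
        $in  = fopen($path, 'rb');
        $out = fopen($tmp, 'wb');
        while (($line = fgets($in)) !== false) {
            fwrite($out, $transformLine($line));
        }
        fclose($in);
        fclose($out);
        // Readers see either the complete old file or the complete new one.
        rename($tmp, $path);
    } finally {
        flock($lock, LOCK_UN);
        fclose($lock);
    }
}
```

The lock is taken on a separate file because the data file itself is replaced by rename(), so a lock held on the old file would not protect the new one.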
If you are certain that the new line will be the same length as the old line, you can open the file in read/write mode (use r+ as the second argument to fopen()) and call ftell() to save the position the line starts at each time before you call fgets() to read a line. Once you find the line that you want to overwrite, you can use fseek() to go back to the beginning of the line and fwrite() the new data. One way to force the line to always be the same length is to space pad it out to the maximum possible length.
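Here is a rough sketch of the in-place variant. The function name overwriteLineInPlace is hypothetical; it handles only the first matching line and refuses to write when the new text is longer than the old line, padding with spaces otherwise.

```php
<?php
// Hypothetical helper: overwrite the first line containing $marker in place.
// Returns true only if a line was found and the new text fits into it.
function overwriteLineInPlace(string $path, string $marker, string $newLine): bool
{
    $handle = fopen($path, 'r+b');
    if ($handle === false) {
        return false;
    }
    $done = false;
    while (true) {
        $start = ftell($handle);          // remember where this line begins
        $line  = fgets($handle);
        if ($line === false) {
            break;                        // end of file, nothing found
        }
        if (strpos($line, $marker) === false) {
            continue;
        }
        $body = rtrim($line, "\r\n");     // leave the line ending untouched
        if (strlen($newLine) <= strlen($body)) {
            fseek($handle, $start);
            // Pad with spaces so the line keeps exactly its old length.
            fwrite($handle, str_pad($newLine, strlen($body)));
            $done = true;
        }
        break;
    }
    fclose($handle);
    return $done;
}
```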
This is a solution that works for rewriting only one line of a file in place with sed from PHP. My file contains only style vars and is formatted:
$styleVarName: styleVarProperty;\n
For this I first add the ":" to the ends of myStyleVarName, and sed replaces the rest of that line with the new property and adds a semicolon.
Make sure characters are properly escaped in myStyleVarProp.
$command = "pathToShellScript folder1Name folder2Name myStyleVarName myStyleVarProp";
shell_exec($command);
/* shellScript */
#!/bin/bash
file=/var/www/vhosts/mydomain.com/$1/$2/scss/_variables.scss
str=$3"$4"
sed -i "s/^$3.*/$str;/" "$file"
or if your file isn't too big:
$sample = file_get_contents('sample');
$parsed = preg_replace('/#parsethis.*/', 'REPLACE TO END OF LINE', $sample);
The regex delimiter must not appear unescaped inside the pattern, so '#' cannot be used as the delimiter here; '/' works because the pattern does not contain it.
If you want to completely replace the contents of one file with the contents of another file you can use this:
rename("./some_path/data.txt", "./some_path/data_backup.txt");
rename("./some_path/new_data.txt", "./some_path/data.txt");
So the first line backs up the file, and the second line replaces it with the new file.
rename() returns a boolean: true if the rename succeeds and false if it fails. One could, therefore, run the second step only if the first succeeded, so that the file is never overwritten unless a backup has been made. Check out:
https://www.php.net/manual/en/function.rename.php
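Checking the return values might look like this. It is a sketch with a hypothetical function name, swapInNewFile, and it additionally tries to restore the original if the second rename fails.

```php
<?php
// Hypothetical helper: back up $current, then move $new into its place.
function swapInNewFile(string $current, string $new, string $backup): bool
{
    if (!rename($current, $backup)) {
        return false;              // no backup was made, so touch nothing else
    }
    if (!rename($new, $current)) {
        rename($backup, $current); // try to put the original back
        return false;
    }
    return true;
}
```

With the paths from above, the call would be swapInNewFile("./some_path/data.txt", "./some_path/new_data.txt", "./some_path/data_backup.txt").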
Hope that is useful to someone.
Cheers
Adrian
I'd most likely do what Jeremy suggested, but just for an alternate way to do it, here is another solution. This has not been tested or used and is for *nix systems.
// exec() fills $lines with one entry per matching line; system() would only print them
$cmd = "grep '#parsethis' " . escapeshellarg($filename);
exec($cmd, $lines, $status);
// Read the entire file as a string
// Do a str_replace for each item in $lines with ""
I have a big SQL file and I need to change some lines in it. Right now the file contains data like this:
INSERT INTO `LINK_LA_TYP` VALUES
('1','8917181','1','24','2'),
('1','8934610','1','24','1'),
('1','9403766','1','30','1'),
('1','9422299','1','30','2'),
What I have done so far: on line $count, for example, I write one more
INSERT INTO `LINK_LA_TYP` VALUES
but then my file looks like this:
INSERT INTO `LINK_LA_TYP` VALUES
('1','8917181','1','24','2'),
('1','8934610','1','24','1'),
('1','9403766','1','30','1'),
INSERT INTO `LINK_LA_TYP` VALUES
('1','9422299','1','30','2'),
But I need the trailing , on the previous line changed to ; How can I do this in a big file?
So I need to scan the file, write the extra line there (done), and then change , to ; on the previous line, and do this about every 500 lines (depending on a counter) until the end of the file.
This is the simplest workflow:
Create a temporary file. Then:
Read one line at a time from the original file
Do your work on that line
Write the changed line to the temporary file
Repeat.
When you are done, simply rename the temp file to the correct filename, which will remove the old file.
If each line is going to be the exact same size as before, you may instead change the file inline. You can do this by aligning the file pointer correctly with fseek and then writing. This is a little bit trickier to achieve but may save you the space for the temp file (and possibly be a little bit faster as well). This is only possible if the resulting file size will be exactly the same as the original file size (e.g., if you only change certain bytes from one character to another).
If you read in the file stream (for example via fgets; see example there), you should count the lines and replace the comma with a semicolon on line X; and insert a new line at X + 1. No need to get a previous line.
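One way to sketch this counting approach, writing to a second file as in the temp-file workflow. The function name splitInserts and its parameters are hypothetical, and the sketch assumes every value tuple sits on its own line starting with "(". The new INSERT line is written only once another tuple actually follows, so the output never ends with a dangling INSERT.

```php
<?php
// Hypothetical helper: end the statement after every $rowsPerInsert tuples
// and start a new INSERT before the next tuple.
function splitInserts(string $inPath, string $outPath, int $rowsPerInsert, string $insertLine): void
{
    $in  = fopen($inPath, 'rb');
    $out = fopen($outPath, 'wb');
    $rows = 0;
    $pendingInsert = false;
    while (($line = fgets($in)) !== false) {
        $text = rtrim($line, "\r\n");
        if ($text === '' || $text[0] !== '(') {
            // Not a value tuple (e.g. an existing INSERT line): copy as is.
            fwrite($out, $text . "\n");
            $pendingInsert = false;
            continue;
        }
        if ($pendingInsert) {
            fwrite($out, $insertLine . "\n"); // line X + 1
            $pendingInsert = false;
        }
        $rows++;
        if ($rows % $rowsPerInsert === 0) {
            // Line X: swap the trailing comma for a semicolon.
            $text = rtrim($text, ',;') . ';';
            $pendingInsert = true;
        }
        fwrite($out, $text . "\n");
    }
    fclose($in);
    fclose($out);
}
```

For the file in the question this would be called with something like splitInserts('in.sql', 'out.sql', 500, 'INSERT INTO `LINK_LA_TYP` VALUES'), where the file names are placeholders.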
Took some time to mock this together.
sed -i ':a;N;$!ba;s/,\n INSERT INTO/;\n INSERT INTO/g' your_file.sql
You can put this in exec() if you really want to run it from a PHP script.
I'm planning to run a PHP program from the Mac Terminal. I have a folder on my desktop with around 800 .csv files and I need to write a PHP program that reads through each one so that I can run some transformations on the data it stores. I know how to parse a .csv once it's loaded, but I'm wondering if there is a way to load each file without having to name it explicitly. I don't have a list of the 800 file names, but I feel like there has to be a way to just read in all the files in a folder in a loop or something without having the title of each file listed. I don't have much coding experience, so forgive me if there's an obvious answer of which I'm oblivious.
Thank you!
There are a few ways to do this, but globbing is very straightforward:
<?php
foreach (glob("*.csv") as $filename) {
    // do something
}
?>
You can loop through all files in a directory using readdir(): http://php.net/manual/en/function.readdir.php.
Once you get the file name from readdir(), you can parse the file either by breaking its content into an array and looping through the cells with str_getcsv() (requires at least PHP 5.3), or with the old-fashioned fgetcsv(), which reads through the file one line at a time. For each file, create a string variable and, after you read and transform each line, append the modified line to this string followed by an end-of-line character. After reading through the entire file, replace the contents of the original with file_put_contents().
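Put together, the loop could look roughly like this. It is a sketch under a few assumptions: the function name transformCsvFolder and the callback are hypothetical, glob() is used instead of readdir(), and the transformed rows are collected in a php://temp stream rather than a string variable so that fputcsv() can handle the quoting.

```php
<?php
// Hypothetical helper: apply $transformRow to every row of every .csv file
// in $dir and write the result back. Returns the number of files processed.
function transformCsvFolder(string $dir, callable $transformRow): int
{
    $count = 0;
    // glob() can return false on error, so fall back to an empty list.
    foreach (glob(rtrim($dir, '/') . '/*.csv') ?: [] as $filename) {
        $in     = fopen($filename, 'rb');
        $buffer = fopen('php://temp', 'w+b');
        while (($row = fgetcsv($in, 0, ',', '"', '\\')) !== false) {
            if ($row === [null]) {
                continue; // blank line
            }
            fputcsv($buffer, $transformRow($row), ',', '"', '\\');
        }
        fclose($in);
        rewind($buffer);
        // file_put_contents() accepts a stream resource as its data argument.
        file_put_contents($filename, $buffer);
        fclose($buffer);
        $count++;
    }
    return $count;
}
```

Since this overwrites the originals, it is probably worth running it on a copy of the folder first.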
Sorry for my bad English.
I have to compare 2 CSV files: if the strings for one id differ, they must be written to a file.
If a string with an id from the first file is missing from the second file, that must be written to the file too.
It works, but I have trouble with one element (id=47): it is in both files, but the script says it is only in one.
You can download the script from here
http://sil-design.ru/uploads/script.zip
If you do an echo $str1[0].' - '.$str2[0].'<br />'; you will see that the two 47's are never compared. Also I am not sure what the t is in: $f2 = fopen($fileurl, 'rt');.
If you open your backup.csv in notepad and place your cursor after the 47;XL and hold delete to delete anything after it and save. Then try your script again, it should work. It seems that the backup.csv was created in a weird way, I am guessing PHP is getting an EOF before the file has even ended!