I need to parse a CSV file, but it is malformed: some fields contain newlines (\n) that break my parser.
Is there any way to avoid that? Currently I'm parsing it like this:
if ($file) {
    $full_path = DRUPAL_ROOT . '/' . file_stream_wrapper_get_instance_by_uri($file->uri)->getDirectoryPath() . '/' . file_uri_target($file->uri);
    $csvData = file_get_contents($full_path);
    if (!mb_detect_encoding($csvData, 'UTF-8')) $csvData = utf8_encode($csvData);
    //$lines = explode(",", $csvData);
    $lines = array_map("str_getcsv", file($full_path), array_fill(0, count(file($full_path)), ';'));
    dsm($csvData);
    foreach ($lines as $line) {
        dsm($line);
    }
}
But when str_getcsv encounters a \n, it starts a new line, so I can't parse the CSV correctly (I can't modify the CSV; it comes from another person).
Maybe there is a way to remove the \n from fields but not from the end of the line?
Example of a field that contains line breaks in my CSV:
It seems that str_getcsv() cannot be used by itself to do this.
Have a look at the comments of the str_getcsv function. There are examples of how other people have already solved exactly your problem.
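As a minimal sketch of one such approach: if the fields that contain line breaks are enclosed in double quotes, fgetcsv() reading straight from the file handle keeps them together, which str_getcsv() applied to the output of file() cannot do. The helper name read_csv_rows() is made up for this illustration, and the ';' delimiter is taken from the question's code. If the embedded newlines are not quoted, this will not help.

```php
<?php
// Hypothetical helper: read all rows of a delimited file with fgetcsv().
function read_csv_rows(string $path, string $delimiter = ';'): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $rows = [];
    // fgetcsv() keeps reading past a line break while a quoted field is open.
    while (($row = fgetcsv($handle, 0, $delimiter, '"', '\\')) !== false) {
        if ($row === [null]) {
            continue; // a blank line is returned as an array holding a single null
        }
        $rows[] = $row;
    }
    fclose($handle);

    return $rows;
}
```

In the question's code this would replace the array_map("str_getcsv", file(...)) line, for example $lines = read_csv_rows($full_path);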
Related
This is a really odd behavior that I can't explain. I have a CSV file that I'm trying to format. The lines could have trailing ','s that I want to remove.
$lines = explode(PHP_EOL, $csv);
$csv = '';
foreach($lines as $line) {
$csv .= trim($line, ',') . PHP_EOL;
}
The trim is not doing anything and just returns the line as it is. Just to make sure, I copied a line from the CSV and ran trim("a,b,c,d,,", ',');, which works fine. Can anyone tell me why the above code won't work?
If the CSV file was created on a different operating system, it may use different line breaks than PHP_EOL. So trim any line break characters in addition to commas.
foreach($lines as $line) {
$csv .= trim($line, ",\r\n") . PHP_EOL;
}
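If you are not sure which line endings the file uses, another option is to split on any line break instead of PHP_EOL. This is only a sketch: the function name strip_trailing_commas() is hypothetical, and it uses rtrim() so that only trailing commas are removed.

```php
<?php
// Hypothetical helper: remove trailing commas from every line of a CSV string.
function strip_trailing_commas(string $csv, string $eol = PHP_EOL): string
{
    // Split on \r\n, \r or \n so it does not matter which system wrote the file.
    $lines = preg_split('/\r\n|\r|\n/', $csv);

    $clean = array_map(function ($line) {
        return rtrim($line, ',');
    }, $lines);

    return implode($eol, $clean);
}
```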
Don't manually edit the CSV file. Parse it into an array, then edit the array. Then you can write the modified array back to a CSV file.
You can use fputcsv to write the data to a file, or str_putcsv (a custom function).
$newData = [];
$data = array_map('str_getcsv', $lines); // parse each line as a CSV
foreach ($data as $row) {
$row = array_filter($row); // remove blank values
// for some dumb reason, php has `str_getcsv` but not `str_putcsv`
// so let's use `str_putcsv` from: https://gist.github.com/johanmeiring/2894568
$newData[] = str_putcsv($row);
}
$newData = implode(PHP_EOL, $newData);
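If you would rather avoid a custom str_putcsv(), one possible sketch is to point fputcsv() at an in-memory stream and read the result back as a string. The function name rows_to_csv() is an assumption for this example, not a PHP built-in.

```php
<?php
// Hypothetical helper: turn an array of rows into a CSV string via fputcsv().
function rows_to_csv(array $rows): string
{
    $stream = fopen('php://memory', 'w+');
    foreach ($rows as $row) {
        // fputcsv() adds quotes where a value needs them (e.g. embedded commas).
        fputcsv($stream, $row, ',', '"', '\\');
    }
    rewind($stream);
    $csv = stream_get_contents($stream);
    fclose($stream);

    return $csv;
}
```

fputcsv() ends every row with a line break, so the returned string can be written to a file as it is.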
I have a text file with line breaks that I have already imploded into a single line, adding a comma to the end of every line.
I need to remove every 36th comma from the line and add a line break there as well.
This is my code:
$lines = file("filename.txt", FILE_IGNORE_NEW_LINES);
$comma_separated = implode(",", $lines);
$lines2 = preg_replace('/(?:[^,]*,){36}/', '$0\r\n', $comma_separated);
I have always had trouble with coding. My brains just don't go that way. I try my best, and I am 100 % sure I have searched and read the solution, but I have not understood it.
I would rather do something like this.
Instead of a regex that will eventually fail, append a marker string to every 36th line before you implode.
After the implode, a simple str_replace() can find this marker, remove it together with its comma, and add the line break.
$lines = file("filename.txt", FILE_IGNORE_NEW_LINES);
// Mark every 36th line (indexes 35, 71, ...) before imploding.
for ($i = 35; $i < count($lines); $i += 36) {
    $lines[$i] .= "something";
}
$comma_separated = implode(",", $lines);
// String replace something to line break.
$result = str_replace("something,", "<br>", $comma_separated);
Edit: sorry, the string replace of course needs to go after the implode.
Sorry for the confusion.
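$lines = file("filename.txt", FILE_IGNORE_NEW_LINES);
for ($i = 0; $i < count($lines); $i++) {
    // your code
    if (($i + 1) % 36 == 0) {
        // every 36th line: trim() returns the result, so assign it back
        $lines[$i] = trim($lines[$i], ',');
    }
}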
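Another way to look at it, sketched here under the assumption that the goal is one output row per 36 input lines: split the array into chunks of 36 with array_chunk() and join each chunk separately, so no comma has to be removed afterwards. The function name join_in_rows() is hypothetical.

```php
<?php
// Hypothetical helper: join lines with commas, starting a new row every $perRow lines.
function join_in_rows(array $lines, int $perRow = 36, string $eol = "\r\n"): string
{
    $rows = array_map(function (array $chunk) {
        return implode(',', $chunk); // commas only inside a row
    }, array_chunk($lines, $perRow));

    return implode($eol, $rows);     // line breaks only between rows
}

// Usage, with the file name from the question:
// $result = join_in_rows(file("filename.txt", FILE_IGNORE_NEW_LINES));
```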
I have a CSV file with:
Test1,One line
Test2,"Two lines
Hello"
Test3,One line
As you can see, one of the columns has a value which is separated with a new line.
To parse this CSV file into an array, I run:
$csvArray = array();
$csvData = file_get_contents('file.csv');
$lines = explode(PHP_EOL, $csvData);
foreach ($lines as $line) {
$csvArray[] = str_getcsv($line);
}
// print_r($csvArray);
It works except for one problem: it reads the new line inside the value as a new row, which is completely incorrect.
How do I make it so that it properly reads a multi-line value?
Edit: this question focuses on new lines.
$fp = fopen('file.csv', 'r');
$csvArray = array();
// fgetcsv() returns false at the end of the file
while (($row = fgetcsv($fp)) !== false) {
    $csvArray[] = $row;
}
fclose($fp);
Using explode(PHP_EOL, $csvData) will not correctly split the CSV by its row delimiter. The multi-line cell is enclosed in quotation marks, meaning the row continues onto new lines until the quotes are closed.
PHP's built-in fgetcsv function will correctly read a single row from a file, moving the pointer to the next row in the process, while str_getcsv will only read a single row from a string without considering additional rows (because it's a string, not a stream).
Since you have already split the string with explode, your CSV row is broken, so the row will be incomplete.
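If the CSV data is already in a string (for example from file_get_contents()), a possible workaround is to write it to a temporary stream and let fgetcsv() do the row splitting. A minimal sketch; parse_csv_string() is a name invented for this example.

```php
<?php
// Hypothetical helper: parse a whole CSV string, including quoted multi-line values.
function parse_csv_string(string $csvData): array
{
    // php://temp gives fgetcsv() a stream to read from.
    $stream = fopen('php://temp', 'w+');
    fwrite($stream, $csvData);
    rewind($stream);

    $rows = [];
    while (($row = fgetcsv($stream, 0, ',', '"', '\\')) !== false) {
        $rows[] = $row;
    }
    fclose($stream);

    return $rows;
}
```

With the sample file from the question, the second row comes back as a single row whose second value contains the line break.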
I am writing array values to a CSV file using PHP. In the array values, I have included a line break using \n. After the array values are updated, I am using the implode function as below.
$newLine[] = $row[$i].",";
$newLine[] = "\n";
$csv2 [] = implode(" ", $newLine);
However, while writing to the CSV file, an extra space gets appended to the front of the line. This is causing me some problems in display. I want to eliminate the space in front of the line while it is getting written. I tried to do the below.
$line1 = str_replace(' .','.',$line);
However, I am not able to write without the space in beginning to the CSV file.
You don't need to use spaces at all:
$newLine[] = $row[$i].",";
$csv2 [] = implode("\n", $newLine);
$csv2[] = trim ( implode("\n", $newLine) , "\n");
This should work since it removes only line breaks at the beginning and end of the string.
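To see where the stray space came from: implode() puts the glue between every pair of elements, so with " " as glue and "\n" as one of the elements, a space ends up on each side of the line break, and the one after it starts the next line. A sketch of building the text without it follows; build_csv_text() is a hypothetical name, and it does no quoting, so it only suits values without commas or line breaks.

```php
<?php
// Hypothetical helper: build CSV text from rows without stray spaces.
function build_csv_text(array $rows): string
{
    $lines = [];
    foreach ($rows as $row) {
        $lines[] = implode(',', $row); // commas only between values
    }

    return implode("\n", $lines);      // line breaks only between rows
}
```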
In PHP, how can one edit a text file and save it so that everything after the first space is removed?
In other words, so that each line only has its first word?
For example, if the text file looked like this:
Adi NNP
Adia NNP
Adios NNP FW
Adios-Direct NNP
Adios-On NNP
Adios-Rena NNP
Adios-Trustful NNP
Adirondack NNP
Adirondacks NNPS
Adjoining VBG
Adjournment NN
after executing the script, the text file would look like this:
Adi
Adia
Adios
Adios-Direct
Adios-On
Adios-Rena
Adios-Trustful
Adirondack
Adirondacks
Adjoining
Adjournment
How I would approach this would be to open the file, read it in, and take each line and store it in an array. Then replace everything after the first space with nothing. And lastly, save the edited array to a new file.
Is there a better way to do it than that?
All I know how to do in the above method is everything except the last two tasks. I would do it like so:
$file = array();
$lines = file('file.txt');
foreach($lines as $line){
array_push($file, $line);
}
// now travel through $file and replace everything after first space with nothing
// travel though $file again, but write each element as a new line in a .txt file
You can use explode() to separate the line by spaces. Then you can immediately write the string back to a file, no second loop is required:
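// FILE_IGNORE_NEW_LINES strips the line break, so lines without a space stay clean
$lines = file('file.txt', FILE_IGNORE_NEW_LINES);
$new_file = fopen('new.txt', 'w+');
foreach ($lines as $line) {
    // keep only the part before the first space
    $bits = explode(' ', $line);
    fwrite($new_file, $bits[0] . PHP_EOL);
}
fclose($new_file);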
You can do it in the same line: just replace array_push($file, $line) with...
$file[] = strtok($line, ' ');
It can be written even more compact with help of array_map:
$lines = array_map(function($line) {
return strtok($line, ' ');
}, file('file.txt'));
... or you can write it back immediately, as shown in @hek2mgl's answer.
You can bypass arrays entirely and do this with a simple regular expression:
// Read in contents into a variable
$data = file_get_contents('input.txt');
// Drop the space and everything after on each line
$data = preg_replace('/ .*$/m', '', $data);
// Dump contents to file (change this to input.txt if you want to overwrite the file)
file_put_contents('output.txt', $data);