Read in text file line by line in PHP - newline not being detected

I have a php function I wrote that will take a text file and list each line as its own row in a table.
The problem is the classic "works fine on my machine": when I ask somebody else to generate the .txt file I am looking for, it keeps reading in the whole file as one line. When I open it in my text editor, it looks just as I would expect, with a new name on each line, but it's the newline character or something similar throwing it off.
So far I have come to the conclusion it might have something to do with whatever text editor they are using on their Mac system.
Does this make sense? And is there an easy way to detect the character that the text editor recognizes as a new line and replace it with a standard one that PHP will recognize?
UPDATE: Adding the following line solved the issue.
ini_set('auto_detect_line_endings',true);
Function:
function displayTXTList($fileName) {
    if(file_exists($fileName)) {
        $file = fopen($fileName,'r');
        while(!feof($file)) {
            $name = fgets($file);
            echo('<tr><td align="center">'.$name.'</td></tr>');
        }
        fclose($file);
    } else {
        echo('<tr><td align="center">placeholder</td></tr>');
    }
}

This doesn't work for you?
http://us2.php.net/manual/en/filesystem.configuration.php#ini.auto-detect-line-endings

What's wrong with file()?
foreach (file($fileName) as $name) {
    echo('<tr><td align="center">'.$name.'</td></tr>');
}

From the man page of fgets:
Note: If PHP is not properly recognizing the line endings when reading files either on or created by a Macintosh computer, enabling the auto_detect_line_endings run-time configuration option may help resolve the problem.
Also, have you tried the file function? It returns an array; each element in the array corresponds to a line in the file.
Edit: if you don't have access to the php.ini, what web server are you using? In Apache, you can change PHP settings using a .htaccess file. There is also the ini_set function which allows changing settings at runtime.

This is a classic case of the newline problem.
ASCII defines several different "newline" characters. The two specific ones we care about are ASCII 10 (line feed, LF) and 13 (carriage return, CR).
All Unix-based systems, including OS X, Linux, etc. will use LF as a newline. Mac OS Classic used CR just to be different, and Windows uses CR LF (that's right, two characters for a newline - see why no one likes Windows? Just kidding) as a newline.
Hence, text files from someone on a Mac (assuming it's a modern OS) would all have LF as their line ending. If you're trying to read them on Windows, and Windows expects CR LF, it won't find it. Now, it has already been mentioned that PHP has the ability to sort this mess out for you, but if you prefer, here's a memory-hogging solution:
$file = file_get_contents("filename");
$array = preg_split("/\015?\012/", $file); # optional CR, then LF; won't work for Mac Classic
Of course, you can do the same thing with file() (as has already been mentioned).
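As a rough sketch of the same idea that also covers Mac Classic files, the split can use one pattern that accepts CR LF, a lone CR, or a lone LF. The helper name splitLines() is hypothetical, and like the snippet above it assumes the whole file fits in memory.

```php
<?php
// Hypothetical helper: split a string into lines regardless of line-ending style.
function splitLines(string $text): array
{
    // Try CR LF first so a Windows line ending is not counted as two breaks.
    $lines = preg_split('/\r\n|\r|\n/', $text);

    // Drop the empty element produced by a trailing line ending, if any.
    if ($lines !== [] && end($lines) === '') {
        array_pop($lines);
    }
    return $lines;
}

// Example: the same three names with three different line-ending conventions.
print_r(splitLines("Ann\nBob\nCy\n"));
print_r(splitLines("Ann\r\nBob\r\nCy\r\n"));
print_r(splitLines("Ann\rBob\rCy"));
```

Each call should yield the same three-element array, which is what the table-printing loop in the question needs.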

Related

php reading csv data into a single line

Strangely, PHP is reading my Excel-generated CSV file as a single line. The code is:
if ($file) {
    while (($line = fgets($file)) !== false) {
        print '<div>'.$line.'</div>'."<br/>";
    }
    fclose($file);
} else {
    // error opening the file.
}
CSV
Name, City
Jon,Paris
Doe,Madrid
Add this code before reading the file.
ini_set("auto_detect_line_endings", true);
When turned on, PHP will examine the data read by fgets() and file() to see if it is using Unix, MS-Dos or Macintosh line-ending conventions.
This enables PHP to interoperate with Macintosh systems, but defaults to Off, as there is a very small performance penalty when detecting the EOL conventions for the first line, and also because people using carriage-returns as item separators under Unix systems would experience non-backwards-compatible behaviour.
Most likely, PHP is not correctly detecting the line endings in your file. The fgets documentation points this out.
You will probably want to write code like this:
$oldLineEndings = ini_set('auto_detect_line_endings', true);
//your while loop here
ini_set('auto_detect_line_endings', $oldLineEndings);
If you need to actually parse the csv, you may also want to look at fgetcsv.
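Since auto_detect_line_endings is deprecated as of PHP 8.1, here is a minimal sketch of parsing such a file without it: normalise the line endings first, then let fgetcsv() read from an in-memory stream. The helper name readCsvAnyEol() is hypothetical, and the sketch assumes the file fits in memory and that no field is meant to contain a bare carriage return.

```php
<?php
// Hypothetical helper: parse CSV text whose rows may end in CR LF, CR, or LF.
function readCsvAnyEol(string $raw): array
{
    // Normalise every line-ending style to LF (CR LF is handled before lone CR).
    $normalised = str_replace(["\r\n", "\r"], "\n", $raw);

    // Copy the text into an in-memory stream so fgetcsv() can read it row by row.
    $stream = fopen('php://temp', 'r+');
    fwrite($stream, $normalised);
    rewind($stream);

    $rows = [];
    // Arguments: no length limit, comma separator, double-quote enclosure, backslash escape.
    while (($row = fgetcsv($stream, 0, ',', '"', '\\')) !== false) {
        if ($row !== [null]) {   // fgetcsv() returns [null] for a blank line
            $rows[] = $row;
        }
    }
    fclose($stream);
    return $rows;
}

// Example with the CR-only line endings an older Mac-style export might produce.
print_r(readCsvAnyEol("Name,City\rJon,Paris\rDoe,Madrid\r"));
```

In real use the argument would come from file_get_contents() on the uploaded file.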

How can I get PHP to parse control characters?

I have a PHP site and a cron job which runs to update the DB for the site. The cron reads a CSV file which is uploaded by a third party. Recently this cron job stopped working correctly. After some investigation, I've discovered that the problem is in the CSV file. The problem is that the new line character in the CSV has changed from the standard "\n" to the older ASCII "^M" and PHP doesn't seem to recognise this as a new line so instead of seeing the CSV as multiline, it is seeing it as one single line of info. I have only been able to see this difference in the Command Line text apps less and vim. Does anyone know of a way to get PHP to recognise these new line characters?
By way of an example, the incorrect CSV file looks similar to this in vim:
Heading 1,Heading 2,Heading 3,^MInfo 1-1,Info 1-2,Info 1-3,^MInfo 2-1,Info 2-2,Info 2-3,^M^M
Whereas the older (correct) version displays like this in vim:
Heading 1,Heading 2,Heading 3,
Info 1-1,Info 1-2,Info 1-3,
Info 2-1,Info 2-2,Info 2-3,
Set the auto_detect_line_endings option appropriately.
Can you not change your code to look for ^M (which is how vim displays a carriage return, "\r") and replace it with \n?
$input = str_replace("\r", "\n", $input);
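To check from PHP itself which convention a file uses, rather than relying on less or vim, something like the following sketch could count each style. The helper name countLineEndings() is hypothetical.

```php
<?php
// Hypothetical helper: report how many line endings of each style a string contains.
function countLineEndings(string $data): array
{
    $crlf = substr_count($data, "\r\n");
    return [
        'CRLF' => $crlf,
        // Lone CR and lone LF: subtract the ones that belong to a CR LF pair.
        'CR'   => substr_count($data, "\r") - $crlf,
        'LF'   => substr_count($data, "\n") - $crlf,
    ];
}

// Example: a CR-only sample resembling the broken file from the question.
print_r(countLineEndings("Heading 1,Heading 2,Heading 3,\rInfo 1-1,Info 1-2,Info 1-3,\r"));
```

A non-zero 'CR' count with zero 'LF' would confirm that the third party has switched to carriage-return line endings.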

fgetcsv returns too many entries

I have the following code:
while (!feof($file)) {
    $arrayOfIdToBodyPart = fgetcsv($file,0, "\t");
    if (count($arrayOfIdToBodyPart)==2){
The problem is, the contents of the file look like this:
39 ankle
40 tibia
41 Vastus Intermedius
and so on
Sometimes the test in the if will show three entries, with the first being the number, the second being the name, and the third being just... empty.
This causes the if block to fail, and me to be sad. I know I can just make the if block test for >=2, but is there any way I can get it to just recognise the fact that there are two items? I don't like that fgetcsv is finding "mystery" characters at the end of the line.
Is this possibly an error caused by a Unix server reading a Windows-based file? If so, and I'm running an Ubuntu server without dos2unix, where do I get it?
You probably have tabs at the end of a line:
value<tab>value<tab><newline>
If that's the case, dos2unix won't help you. You might have to do something like read each line into a variable, trim() the variable, and then use str_getcsv() to split it.
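A minimal sketch of that suggestion, assuming the extra entry really does come from a trailing tab. The helper name parseTabLine() is hypothetical, and it uses rtrim() with an explicit character list instead of a plain trim() so that only trailing tabs and line endings are removed.

```php
<?php
// Hypothetical helper: parse one tab-separated line, ignoring trailing tabs and line endings.
function parseTabLine(string $line): array
{
    // Strip only trailing tabs, CR and LF; leading and inner whitespace is left alone.
    $clean = rtrim($line, "\t\r\n");
    // Arguments: tab separator, double-quote enclosure, backslash escape.
    return str_getcsv($clean, "\t", '"', '\\');
}

// Example: a line with a stray trailing tab and a Windows line ending.
print_r(parseTabLine("39\tankle\t\r\n"));
print_r(parseTabLine("41\tVastus Intermedius\n"));
```

With the trailing tab removed, the count($arrayOfIdToBodyPart)==2 test in the question should hold for both lines.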
Is it possible that you have a tab at the end of those lines? They are invisible and often hard to spot... you might want to double check.
Also, if you are working with CSV files while running Windows locally and the server is Unix, I found that this line:
ini_set('auto_detect_line_endings', true);
saves a lot of headaches.

How to use line feeds in CSV file?

I have an excel file that I converted to a CSV so it could be parsed in PHP. However, for some reason the cells in excel only have Carriage Returns (\r) and no Line Feeds (\n). I need line feeds in the csv or else the PHP parses everything in one line, which it shouldn't do.
Is there a way to add line feeds to an excel/csv file?
Thanks!
EDIT: It would seem as though I was exporting the file as the wrong csv—I didn't do Windows Comma Separated. Thanks for the answers guys.
Before you read in your CSV file, do:
ini_set('auto_detect_line_endings', true);
Then set it to false right after reading the file.
From the manual:
This enables PHP to interoperate with Macintosh systems, but defaults
to Off, as there is a very small performance penalty when detecting
the EOL conventions for the first line, and also because people using
carriage-returns as item separators under Unix systems would
experience non-backwards-compatible behaviour.
There are a number of ways to handle it. An easy one in PHP would be to just replace \r with \n before processing it:
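// Load the whole data file as a string
$data = file_get_contents("yourcsv.csv");
// Normalise CR LF and bare CR to LF (CR LF first, to avoid doubled newlines)
$data = str_replace(array("\r\n", "\r"), "\n", $data);
// str_getcsv() (PHP 5.3+) parses a single record, so split into lines first
$csv_array = array_map('str_getcsv', explode("\n", trim($data)));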

Why might my PHP log file not entirely be text?

I'm trying to debug a plugin-bloated Wordpress installation; so I've added a very simple homebrew logger that records all the callbacks, which are basically listed in a single, ultimately 250+ row multidimensional array in Wordpress (I can't use print_r() because I need to catch them right before they are called).
My logger line is $logger->log("\t" . $callback . "\n");
The logger produces a dandy text file in normal situations, but at two points during this particular task it is adding something which causes my log file to no longer be encoded properly. Gedit (I'm on Ubuntu) won't open the file, claiming to not understand the encoding. In vim, the culprit corrupt callback (which I could not find in the debugger, looking at the array) is about in the middle and printed as ^#lambda_546 and at the end of file there's this cute guy ^M. The ^M and ^# are blue in my vim, which has no color theme set for .txt files. I don't know what it means.
I tried adding an is_string($callback) condition, but I get the same results.
Any ideas?
^# is a NUL character (\0) and ^M is a CR (\r). No idea why they're being generated though. You'd have to muck through the source and database to find out. geany should be able to open the file easily enough though.
Seems these cute guys are a result of your callback formatting for windows.
Mystery over. One of the callbacks was an anonymous function. Investigating the PHP create_function documentation, I saw that a commenter had noted that the created function has a name like so: chr(0) . lambda_n. Thanks PHP.
As for the \r, well, that is more embarrassing. My logger reused some older code that I had previously written, which ended lines in \r\n.
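For anyone chasing similar stray bytes, one option is to make control characters visible before they reach the log. This is only a sketch: the helper name visibleControls() is hypothetical, and the \xNN notation is an arbitrary choice.

```php
<?php
// Hypothetical helper: replace control characters with a visible \xNN escape.
function visibleControls(string $text): string
{
    return preg_replace_callback(
        '/[\x00-\x08\x0B-\x1F\x7F]/',   // control bytes except tab (\x09) and LF (\x0A)
        function (array $m): string {
            return sprintf('\\x%02X', ord($m[0]));
        },
        $text
    );
}

// chr(0) . 'lambda_546' mimics the callback name reported in the question.
echo visibleControls("\t" . chr(0) . "lambda_546\r\n");
```

Passing each callback name through such a filter keeps the log file plain text, so editors like Gedit should still open it, and the NUL and CR show up as readable escapes.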
