PHP reading CSV data into a single line

Strangely, PHP is reading my Excel-generated CSV file as a single line. The code is:
if ($file) {
    while (($line = fgets($file)) !== false) {
        print '<div>'.$line.'</div>'."<br/>";
    }
    fclose($file);
} else {
    // error opening the file.
}
CSV
Name, City
Jon,Paris
Doe,Madrid

Add this code before reading the file.
ini_set("auto_detect_line_endings", true);
When turned on, PHP will examine the data read by fgets() and file() to see if it is using Unix, MS-Dos or Macintosh line-ending conventions.
This enables PHP to interoperate with Macintosh systems, but defaults to Off, as there is a very small performance penalty when detecting the EOL conventions for the first line, and also because people using carriage-returns as item separators under Unix systems would experience non-backwards-compatible behaviour.

Most likely, PHP is not correctly detecting the line endings in your file. The fgets documentation points this out.
You will probably want to write code like this:
$oldLineEndings = ini_set('auto_detect_line_endings', true);
//your while loop here
ini_set('auto_detect_line_endings', $oldLineEndings);
If you need to actually parse the csv, you may also want to look at fgetcsv.
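A minimal sketch of parsing with fgetcsv while handling the line endings by hand, assuming the file is small enough to hold in memory. The helper name readCsvRows is made up for illustration. It does not rely on auto_detect_line_endings, which is deprecated as of PHP 8.1, and it would also rewrite a bare \r that appears inside a quoted field.

```php
<?php
// Hypothetical helper: parse CSV text regardless of \r\n, \r or \n line endings.
function readCsvRows(string $csvText): array
{
    // Replace \r\n first so it does not turn into two newlines.
    $normalised = str_replace(["\r\n", "\r"], "\n", $csvText);

    // fgetcsv() needs a stream, so copy the text into a temporary one.
    $stream = fopen('php://temp', 'r+');
    fwrite($stream, $normalised);
    rewind($stream);

    $rows = [];
    while (($fields = fgetcsv($stream, 0, ',', '"', '\\')) !== false) {
        if ($fields === [null]) {
            continue; // fgetcsv() reports a blank line as [null]
        }
        $rows[] = $fields;
    }
    fclose($stream);
    return $rows;
}

// Classic Mac style data: records separated by a bare carriage return.
$rows = readCsvRows("Name,City\rJon,Paris\rDoe,Madrid\r");
print_r($rows);
```

For a file on disk, the call would be readCsvRows(file_get_contents($path)).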

Related

CSV file line count not working in PHP

I have a webpage that needs to count the number of lines in a CSV file, but the following code isn't working:
$linecount = count(file("sample.csv"));
var_dump($linecount);
When I run this code, the code returns the number 1, but there are 8 lines in sample.csv. Does anybody know why this is happening and how to fix it?
If sample.csv was created on a Mac (some Mac software, including certain Excel exports, ends lines with a bare \r), you might want to consider turning auto_detect_line_endings on.
From the manual:
auto_detect_line_endings boolean
When turned on, PHP will examine the data read by fgets() and file() to see if it is using
Unix, MS-Dos or Macintosh line-ending conventions.
Another option (if you don't want to use this) is to read the file and split the lines by all new-line options (\r\n|\r|\n):
$linecount = count(preg_split("/\r\n|\r|\n/", file_get_contents("sample.csv")));
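One thing to watch with the split approach: if the file ends with a line terminator, the split yields an extra empty element and the count is one too high. A small sketch that trims the final terminator first; the function name countLines is an assumption for illustration.

```php
<?php
// Hypothetical helper: count lines in a string, whatever line ending it uses.
function countLines(string $text): int
{
    if ($text === '') {
        return 0;
    }
    // Drop the final line terminator so it does not produce an empty trailing "line".
    $trimmed = preg_replace('/(\r\n|\r|\n)\z/', '', $text);
    return count(preg_split('/\r\n|\r|\n/', $trimmed));
}

echo countLines("a,b\r1,2\r3,4\r"), "\n";   // CR only, trailing terminator
echo countLines("a,b\r\n1,2\r\n3,4"), "\n"; // CRLF, no trailing terminator
```

With a real file this would be called as countLines(file_get_contents("sample.csv")).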

How to use line feeds in CSV file?

I have an excel file that I converted to a CSV so it could be parsed in PHP. However, for some reason the cells in excel only have Carriage Returns (\r) and no Line Feeds (\n). I need line feeds in the csv or else the PHP parses everything in one line, which it shouldn't do.
Is there a way to add line feeds to an excel/csv file?
Thanks!
EDIT: It would seem as though I was exporting the file as the wrong csv—I didn't do Windows Comma Separated. Thanks for the answers guys.
Before you read in your CSV file, do:
ini_set('auto_detect_line_endings', true);
Then set it to false right after reading the file.
From the manual:
This enables PHP to interoperate with Macintosh systems, but defaults
to Off, as there is a very small performance penalty when detecting
the EOL conventions for the first line, and also because people using
carriage-returns as item separators under Unix systems would
experience non-backwards-compatible behaviour.
There are a number of ways to handle it. An easy one in PHP would be to normalise the line endings to \n before processing it:
// Load the whole data file as a string
$data = file_get_contents("yourcsv.csv");
// Replace \r\n first so it does not become two newlines, then any bare \r
$data = str_replace(array("\r\n", "\r"), "\n", $data);
// str_getcsv() (PHP 5.3+) parses a single record, so apply it to each line
$csv_array = array_map('str_getcsv', explode("\n", rtrim($data, "\n")));

Verifying a CSV file is really a CSV file

I want to make sure a CSV file uploaded by one of our clients is really a CSV file in PHP. I'm handling the upload itself just fine. I'm not worried about malicious users, but I am worried about the ones that will try to upload Excel workbooks instead. Unless I'm mistaken, an Excel workbook and a CSV can still have the same MIME, so checking that isn't good enough.
Is there one regular expression that can handle verifying a CSV file is really a CSV file? (I don't need parsing... that's what PHP's fgetcsv() is for.) I've seen several, but they are usually followed by comments like "it didn't work for case X."
Is there some other better way of handling this?
(I expect the CSV to hold first/last names, department names... nothing fancy.)
Unlike other file formats, CSV has no tell-tale bytes in the file header. It starts straight away with the actual data.
I don't see any way except to actually parse it, and to count whether there is the expected number of columns in the result.
It may be enough to read as many characters as are needed to determine the first line (= until the first line break).
You can write a RE that will give you a guess if the file is valid CSV or not - but perhaps a better approach would be to try and parse the file as if it was CSV (with your fgetcsv() call), and assume it's NOT a valid one if the call fails?
In other words, the best way to see if the file is a valid CSV file is to try and parse it as such, and assume that if you failed to parse, it wasn't a CSV!
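A rough sketch of that "parse it and see" check, assuming you know how many columns to expect. The function name looksLikeCsv and the three-column layout in the demo are illustrative assumptions, not part of any answer here.

```php
<?php
// Hypothetical check: every record must parse and have the expected field count.
function looksLikeCsv(string $path, int $expectedColumns, string $delimiter = ','): bool
{
    $handle = @fopen($path, 'r');
    if ($handle === false) {
        return false;
    }
    $records = 0;
    while (($fields = fgetcsv($handle, 0, $delimiter, '"', '\\')) !== false) {
        if ($fields === [null]) {
            continue; // skip blank lines
        }
        if (count($fields) !== $expectedColumns) {
            fclose($handle);
            return false;
        }
        $records++;
    }
    fclose($handle);
    return $records > 0; // an empty file is not accepted either
}

// Demo with a throwaway file.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "first,last,department\nJon,Doe,Sales\n");
var_dump(looksLikeCsv($path, 3));
var_dump(looksLikeCsv($path, 5));
unlink($path);
```

An Excel workbook fed to this check will usually fail the column count on its first "line", which is the case the question cares about.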
The easiest way is to parse the CSV using str_getcsv and then attempt to read values from it. If you are able to read and validate at least a couple of values, then the CSV is valid.
EDIT
If you don't have access to str_getcsv, use this, a drop-in replacement for str_getcsv from http://www.electrictoolbox.com/php-str-getcsv-function/:
if (!function_exists('str_getcsv')) {
    function str_getcsv($input, $delimiter = ",", $enclosure = '"', $escape = "\\") {
        $fp = fopen("php://memory", 'r+');
        fputs($fp, $input);
        rewind($fp);
        $data = fgetcsv($fp, null, $delimiter, $enclosure); // $escape only got added in 5.3.0
        fclose($fp);
        return $data;
    }
}
Technically speaking, almost any text file could be a CSV file (barring quotes that don't match, etc.). You can try to guess if it's a binary file, but there isn't a reliable way to do that unless your data only has ASCII or something of the sort. If all you care is that people don't upload Excel files by mistake, check the file extension.
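While CSV itself has no signature, Excel workbooks do, so an accidental workbook upload can be caught by looking at the first few bytes. A sketch under that assumption; the function name is made up, and note that the ZIP signature matches any ZIP archive, not only .xlsx, so this only tells you the file is not plain CSV.

```php
<?php
// Hypothetical helper: spot Excel workbooks by their leading bytes.
function looksLikeExcelWorkbook(string $firstBytes): bool
{
    $zipSignature = "PK\x03\x04";                       // .xlsx is a ZIP container
    $oleSignature = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1"; // legacy .xls (OLE2 compound file)

    return strncmp($firstBytes, $zipSignature, 4) === 0
        || strncmp($firstBytes, $oleSignature, 8) === 0;
}

var_dump(looksLikeExcelWorkbook("PK\x03\x04rest-of-zip"));
var_dump(looksLikeExcelWorkbook("Name,City\nJon,Paris\n"));
```

For an uploaded file, the first eight bytes could be read with file_get_contents($path, false, null, 0, 8) and passed in.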
Any text file is a valid CSV file so it is impossible to come up with a standard way of verifying its correctness because it depends on what you really expect it to be.
Before you even start, you have to know what delimiter is used in that CSV file. After that, the easiest way to verify is to use fgetcsv function. For example:
<?php
$row = 0;
$priceColumn = 4; // assumed position of the price field; adjust to your layout
if (($handle = fopen("test.csv", "r")) !== FALSE) {
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        $row++; // Current row number, useful for error messages.
        $num = count($data); // Number of fields in a row.
        if ($num !== 5)
        {
            // OMG! Column count is not five!
        }
        else if (intval($data[$priceColumn]) == 0)
        {
            // OMG! Customer thinks we sold a car for $0!
        }
    }
    fclose($handle);
}
?>

How to parse file in php and generate insert statements for mysql?

For CSV files, PHP has fgetcsv to parse the file and return the fields, but my file is a .dat file that I need to parse and store in a MySQL database. Is there a built-in PHP function like fgetcsv that works in a similar fashion on a .dat file?
Here is a sample. It has the headers DF_PARTY_ID;DF_PARTY_CODE;DF_CONNECTION_ID, followed by the values shown below.
Sample Data:
DF_PARTY_ID;DF_PARTY_CODE;DF_CONNECTION_ID
87961526;4002524;13575326
87966204;4007202;13564782
What's wrong with fgetcsv()? The extension on the file is irrelevant as long as the format of the data is consistent across all of your files.
Example:
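$fh = fopen('example.dat', 'r');
// Loop on fgetcsv() itself; testing feof() would dump a trailing false at the end
while (($row = fgetcsv($fh, 0, ';')) !== false) {
    var_dump($row);
}
fclose($fh);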
Alternatively, with PHP 5.3 or later you can also do:
$lines = file('example.dat');
foreach ($lines as $line) {
    // The second argument of str_getcsv() is the delimiter
    var_dump(str_getcsv(trim($line), ';'));
}
IMHO .dat files can be of different formats. Blindly following the extension can be error-prone. If however you have a file from some specific application, maybe tell us what this app is. Chances are there are some parsing libraries or routines.
I would imagine it would be easier to write a short function using fopen, fread, and fclose to parse it yourself. Read each line, explode to an array, and store them as you wish.
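Putting the pieces together, here is a sketch that reads the semicolon-delimited file with fgetcsv and feeds the rows to a prepared INSERT. The table name df_party and both function names are hypothetical, and $pdo is assumed to be an already opened PDO connection to MySQL. Only the parsing half is exercised in the demo.

```php
<?php
// Hypothetical helper: read a delimited file whose first line holds the column names.
function readDelimitedFile(string $path, string $delimiter = ';'): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $header = fgetcsv($handle, 0, $delimiter, '"', '\\');
    $rows = [];
    if ($header !== false) {
        while (($fields = fgetcsv($handle, 0, $delimiter, '"', '\\')) !== false) {
            if (count($fields) === count($header)) { // skips blank or malformed lines
                $rows[] = array_combine($header, $fields);
            }
        }
    }
    fclose($handle);
    return $rows;
}

// Hypothetical table name "df_party"; $pdo is assumed to be an open PDO connection.
function insertRows(PDO $pdo, array $rows): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO df_party (DF_PARTY_ID, DF_PARTY_CODE, DF_CONNECTION_ID) VALUES (?, ?, ?)'
    );
    foreach ($rows as $row) {
        $stmt->execute(array_values($row)); // values are bound, not concatenated
    }
}

// Demo of the parsing half, using the sample data from the question.
$path = tempnam(sys_get_temp_dir(), 'dat');
file_put_contents(
    $path,
    "DF_PARTY_ID;DF_PARTY_CODE;DF_CONNECTION_ID\n87961526;4002524;13575326\n87966204;4007202;13564782\n"
);
$rows = readDelimitedFile($path);
unlink($path);
print_r($rows);
```

Using a prepared statement rather than building INSERT strings by hand avoids quoting problems if a field ever contains a quote or semicolon.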

Read in text file line by line php - newline not being detected

I have a php function I wrote that will take a text file and list each line as its own row in a table.
The problem is the classic "works fine on my machine": when I ask somebody else to generate the .txt file I am looking for, it keeps reading in the whole file as one line. When I open it in my text editor, it looks just as I would expect, with a new name on each line, but the newline character or something similar is throwing it off.
So far I have come to the conclusion it might have something to do with whatever text editor they are using on their Mac system.
Does this make sense? And is there an easy way to detect the character that the text editor recognizes as a new line and replace it with a standard one that PHP will recognize?
UPDATE: Adding the following line solved the issue.
ini_set('auto_detect_line_endings',true);
Function:
function displayTXTList($fileName) {
    if (file_exists($fileName)) {
        $file = fopen($fileName, 'r');
        // Loop on fgets() itself so the final false is not echoed as an empty row
        while (($name = fgets($file)) !== false) {
            echo('<tr><td align="center">'.$name.'</td></tr>');
        }
        fclose($file);
    } else {
        echo('<tr><td align="center">placeholder</td></tr>');
    }
}
This doesn't work for you?
http://us2.php.net/manual/en/filesystem.configuration.php#ini.auto-detect-line-endings
What's wrong with file()?
foreach (file($fileName) as $name) {
echo('<tr><td align="center">'.$name.'</td></tr>');
}
From the man page of fgets:
Note: If PHP is not properly recognizing the line endings when reading files either on or created by a Macintosh computer, enabling the auto_detect_line_endings run-time configuration option may help resolve the problem.
Also, have you tried the file function? It returns an array; each element in the array corresponds to a line in the file.
Edit: if you don't have access to the php.ini, what web server are you using? In Apache, you can change PHP settings using a .htaccess file. There is also the ini_set function which allows changing settings at runtime.
This is a classic case of the newline problem.
ASCII defines several different "newline" characters. The two specific ones we care about are ASCII 10 (line feed, LF) and 13 (carriage return, CR).
All Unix-based systems, including OS X, Linux, etc. will use LF as a newline. Mac OS Classic used CR just to be different, and Windows uses CR LF (that's right, two characters for a newline - see why no one likes Windows? Just kidding) as a newline.
Hence, text files from someone on a Mac (assuming it's a modern OS) would all have LF as their line ending. If you're trying to read them on Windows, and Windows expects CR LF, it won't find it. Now, it has already been mentioned that PHP has the ability to sort this mess out for you, but if you prefer, here's a memory-hogging solution:
$file = file_get_contents("filename");
// split() was removed in PHP 7; preg_split() with this pattern handles CR LF, bare CR and bare LF
$array = preg_split('/\015\012|\015|\012/', $file);
Of course, you can do the same thing with file() (as has already been mentioned).
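If you would rather find out which convention a file uses before splitting it, counting the candidates is enough. A small sketch; the function name detectLineEnding is an assumption for illustration.

```php
<?php
// Hypothetical helper: report which line ending dominates in a chunk of text.
function detectLineEnding(string $text): string
{
    $crlf = substr_count($text, "\r\n");
    // Bare CR and bare LF counts exclude the ones that belong to a CR LF pair.
    $cr = substr_count($text, "\r") - $crlf;
    $lf = substr_count($text, "\n") - $crlf;

    if ($crlf > 0 && $crlf >= $cr && $crlf >= $lf) {
        return "\r\n";
    }
    if ($cr > $lf) {
        return "\r";
    }
    return "\n"; // also the fallback when no line ending is present
}

// Names separated by bare carriage returns, as Mac OS Classic would write them.
$names = "Alice\rBob\rCarol\r";
$eol = detectLineEnding($names);
$lines = explode($eol, rtrim($names, "\r\n"));
print_r($lines);
```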
