Some characters in CSV file are not read during PHP fgetcsv()

I am reading a CSV file with PHP. Many of the rows have a "check mark", which is really the square root symbol (√), and the PHP code skips this character every time it is encountered.
Here is my code (printing to the browser window in "CSV style" format so I can check that the lines break at the right place):
$file = fopen($uploadfile, 'r');
while (($line = fgetcsv($file)) !== FALSE) {
    foreach ($line as $key => $value) {
        if ($value) {
            echo $value.",";
        }
    }
    echo "<br />";
}
fclose($file);
As an interim solution, I am just finding and replacing the checkmarks with 1's manually, in Excel. Obviously I'd like a more efficient solution :) Thanks for the help!

fgetcsv() is locale-sensitive and is only dependable out of the box for plain ASCII input, so it is probably behaving as designed when it skips your square root symbols. Rather than replacing the checkmarks manually, you could read the file into a string, run str_replace() on those characters, and then parse the result with fgetcsv(). You can turn a string into a file pointer (for fgetcsv) like this:
$fp = fopen('php://memory', 'w+');
fwrite($fp, (string)$string);
rewind($fp);
while (($line = fgetcsv($fp)) !== FALSE)
...
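A minimal sketch of that approach end to end, assuming the data is UTF-8 and small enough to hold in memory. The sample content in $string is made up; in practice it would come from something like file_get_contents($uploadfile).

```php
<?php
// Hypothetical sample standing in for the uploaded file's contents (assumed UTF-8).
$string = "name,done\nalice,√\nbob,\n";

// Replace the check mark before parsing, so fgetcsv() only sees ASCII.
$string = str_replace('√', '1', $string);

// Wrap the string in an in-memory stream so fgetcsv() can read it.
$fp = fopen('php://memory', 'w+');
fwrite($fp, $string);
rewind($fp);

$rows = [];
while (($line = fgetcsv($fp, 0, ',', '"', '\\')) !== false) {
    $rows[] = $line;
}
fclose($fp);

print_r($rows);
```

If the file was exported in another encoding (for example by Excel), the byte sequence for √ will differ, and the search string passed to str_replace() would need to match it.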

I had a similar problem with accented first characters of strings. I eventually gave up on fgetcsv() and did the following, using fgets() and explode() instead (I'm guessing your CSV is comma separated; note that explode() does not understand quoted fields, so this only suits simple files):
$file = fopen($uploadfile, 'r');
while (($the_line = fgets($file)) !== FALSE) // <-- fgets
{
    $line = explode(',', $the_line); // <-- explode
    foreach ($line as $key => $value)
    {
        if ($value)
        {
            echo $value.",";
        }
    }
    echo "<br />";
}
fclose($file);

You should call setlocale(), as described in the documentation:
Note:
Locale setting is taken into account by this function. If LANG is e.g. en_US.UTF-8, files in one-byte encoding are read wrong by this function.
Before calling fgetcsv(), add setlocale(LC_ALL, 'en_US.UTF-8'). In my case it was 'lt_LT.UTF-8'.
This behaviour has been reported as a PHP bug.
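A short sketch of where the setlocale() call goes, assuming at least one of the listed UTF-8 locales is installed on the machine. The temporary file and its contents are made up so the example is self-contained.

```php
<?php
// Ask for a UTF-8 locale; setlocale() returns false if none of the names is installed.
$locale = setlocale(LC_ALL, 'en_US.UTF-8', 'C.UTF-8');
if ($locale === false) {
    echo "Warning: no UTF-8 locale available; multibyte fields may be misread\n";
}

// Hypothetical sample file, written here only so the sketch can run on its own.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "item,done\nreport,√\n");

$rows = [];
$file = fopen($path, 'r');
while (($line = fgetcsv($file, 0, ',', '"', '\\')) !== false) {
    $rows[] = $line;
}
fclose($file);
unlink($path);

print_r($rows);
```

Checking the return value matters because setlocale() fails silently when the requested locale does not exist on the system.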

Related

utf-16le to UTF-8

I am using PHP in the OS X terminal to open a file generated on Windows.
I confirmed that the file is UTF-16LE encoded:
$ file --mime myfile.ini
myfile.ini: text/plain; charset=utf-16le
Now I convert it to UTF-8 with this script:
while ($line = fgets($handle)) {
    $line = rtrim($line);
    $line = mb_convert_encoding($line,"UTF-8","UTF-16LE");
    var_dump($line);
}
Somehow the output is corrupted, like this:
string(63) "䘀爀漀洀䐀愀琀攀㴀㈀　㄀㄀⸀　㄀⸀　㄀ഀ਀"
How can I get the correct encoding?
When I don't use mb_convert_encoding:
while ($line = fgets($handle)) {
    $line = rtrim($line);
    var_dump($line);
    if (preg_match('/Optimization/',$line)){print "hit";}
}
var_dump shows a strange result (why 28?):
string(28) "Optimization=0"
and preg_match also doesn't hit.

You could try doing this:
while ($line = fgets($handle)) {
    $line = rtrim($line);
    $line = iconv(mb_detect_encoding($line, mb_detect_order(), true), "UTF-8", $line);
    var_dump($line);
}

fgets() cannot reliably detect line endings if the stream isn't in an ASCII-compatible encoding. Similarly, when rtrim() looks for e.g. \n ('LINE FEED (LF)' (U+000A)), it expects a single byte 0x0A, but in UTF-16LE that character is encoded as the two bytes 0x0A 0x00. Bad things can happen.
I suggest you read the file with fread() in chunks that are a multiple of 4 bytes, so you don't split a 16-bit code unit across chunks, and forget about line endings until you've successfully re-encoded the file:
$output = '';
while (!feof($handle)) {
    $chunk = fread($handle, 4 * 4096); // unlike fgets(), fread() does not stop at 0x0A
    $output .= mb_convert_encoding($chunk, "UTF-8", "UTF-16LE");
}
var_dump(bin2hex($output));
Ideally, save the output to a file so you can inspect the result with a text editor or hex editor.
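For a small file, the same "re-encode first, split into lines afterwards" idea can be sketched as follows. The sample bytes in $raw are made up (in practice they would come from file_get_contents() on the real file), and the sketch assumes the mbstring extension is available.

```php
<?php
// Hypothetical sample: UTF-16LE bytes with a BOM, as a Windows tool might write them.
$raw = "\xFF\xFE" . mb_convert_encoding(
    "FromDate=2011.01.01\r\nOptimization=0\r\n",
    'UTF-16LE',
    'UTF-8'
);

// Convert the whole buffer before looking at line endings.
$utf8 = mb_convert_encoding($raw, 'UTF-8', 'UTF-16LE');

// If the BOM survived conversion as U+FEFF (EF BB BF in UTF-8), drop it.
if (strncmp($utf8, "\xEF\xBB\xBF", 3) === 0) {
    $utf8 = substr($utf8, 3);
}

// Only now split into lines; \R matches \r\n, \n and \r.
$lines = preg_split('/\R/', $utf8, -1, PREG_SPLIT_NO_EMPTY);

foreach ($lines as $line) {
    if (preg_match('/Optimization/', $line)) {
        echo "hit: $line\n";
    }
}
```

Once the text is UTF-8, ordinary string functions such as preg_match() and rtrim() behave as expected again.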

In the end I used UTF-16BE instead of UTF-16LE, and it shows the correct strings.
My problem was solved.
$line = mb_convert_encoding($line,"UTF-8","UTF-16BE");
However, I don't know why it works,
since even the file command says this file is utf-16le:
$ file --mime myfile.ini
myfile.ini: text/plain; charset=utf-16le
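One plausible explanation, offered as a guess rather than a confirmed diagnosis: fgets() ends each line right after the byte 0x0A, which is only the first half of the UTF-16LE newline (0x0A 0x00). The leftover 0x00 then becomes the first byte of the next line, shifting every character by one byte so that the data looks like UTF-16BE. The sketch below reproduces this with made-up sample data in an in-memory stream.

```php
<?php
// Made-up two-line sample, encoded as UTF-16LE without a BOM.
$raw = mb_convert_encoding("A=1\nB=2\n", 'UTF-16LE', 'UTF-8');

$fp = fopen('php://memory', 'w+');
fwrite($fp, $raw);
rewind($fp);
$first  = fgets($fp); // stops right after the 0x0A byte
$second = fgets($fp); // starts with the 0x00 left over from the newline
fclose($fp);

echo bin2hex($first), "\n";  // 41003d0031000a
echo bin2hex($second), "\n"; // 0042003d0032000a

// rtrim() strips the trailing 0x0A and 0x00, leaving bytes that happen to be valid UTF-16BE.
$trimmed = rtrim($second);
echo mb_convert_encoding($trimmed, 'UTF-8', 'UTF-16BE'), "\n"; // B=2
```

If this is what happened, the BE workaround would still misread the first line of the file, which is aligned correctly as UTF-16LE.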

Php - Reading file lines into an array

The numbers in my file are 5X5:
13456
23789
14789
09678
45678
I'm trying to put it into this form
array[0]{13456}
array[1]{23789}
array[2]{14789}
array[3]{09678}
array[4]{45678}
My code is:
$fileName = $_FILES['file']['tmp_name'];
// Show an error message if the file could not be opened
$file = fopen($fileName,"r") or exit("Unable to open file!");
while ($line = fgets($file)) {
    $digits .= trim($line);
    $members = explode("\n", str_replace(array("\r\n","\n\r","\r"),"\n",$digits));
    echo $members;
}
The output I'm getting is this:
ArrayArrayArrayArrayArray

fgets() reads one line at a time from the file pointer, and once you trim() it there are no "\r" or "\n" characters left in $line. explode() still works even if the delimiter is not found; you just end up with an array with one item, the entire string. You can't echo an array, though. (That's why you're seeing Array for each line; it's the best PHP can do when you use echo on an array.)
If I were you, I would just use file() instead.
$members = array_map('trim', file($fileName, FILE_IGNORE_NEW_LINES));
With the example file you showed, this should result in
$members = ['13456', '23789', '14789', '09678', '45678'];
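A self-contained sketch of the file() approach, together with two ways of printing the result. The temporary file is made up to stand in for the upload, and Windows line endings are assumed only to show that trim() copes with them.

```php
<?php
// Hypothetical sample file standing in for the uploaded one.
$fileName = tempnam(sys_get_temp_dir(), 'num');
file_put_contents($fileName, "13456\r\n23789\r\n14789\r\n09678\r\n45678\r\n");

// One array element per line; trim() also removes any stray "\r".
$members = array_map('trim', file($fileName, FILE_IGNORE_NEW_LINES));
unlink($fileName);

// print_r() shows the structure; echo on the array would only print "Array".
print_r($members);

// To echo the values themselves, join them into a string first.
echo implode("\n", $members), "\n";
```

The values stay strings, so the leading zero in 09678 is preserved.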

You can simply put the lines into an array and use print_r() instead of echo to print that array:
while ($line = fgets($file)) {
    $members[] = trim($line); // drop the trailing newline that fgets() keeps
}
print_r($members);

It should depend on the file that you are dealing with.
FileType:
text -> fgets($file)
CSV -> fgetcsv($file)

PHP - echo and fgets weird characters

I'm trying to display the content of a text file on my website using PHP's fgets, but when I echo the lines in combination with something else (<br>, \n, ...) I get pretty weird characters.
Here's my code :
<?php
header('Content-Type: text/plain;charset=utf-8');
$handle = @fopen("test.txt", "r");
if ($handle) {
    while (($buffer = fgets($handle, 4096)) !== false) {
        echo $buffer."<br>";
    }
    if (!feof($handle)) {
        echo "Error: unexpected fgets() fail\n";
    }
    fclose($handle);
}
?>
Here is the content of test.txt :
1
2
3
4
5
... (6 - 18)
19
20
And here's what I get :
Result with <br>
If I use \n instead of <br>, I don't even get Chinese characters :
Result with \n
I think the issue comes from fgets(), because when I print only one line (without the loop) I get the same issue, but if replace $buffer by "1" (echo "1"."<br>";) I get the expected result.
EDIT
As suggested I modified the code to add header('Content-Type: text/plain;charset=utf-8'); at the beginning of the php file, and modified the output as well.

I found that the issue must be somewhere in the text file: I created a new one and the issue was gone.
I don't know the original encoding of the file because a friend gave it to me.
I'll update this answer if I find out exactly what was going on.
EDIT
I made a copy via TextEdit and, when saving it, the default encoding was UTF-16. I guess that was the problem.
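If a file of unknown origin might be UTF-16, one option is to look for a byte order mark before printing it. This is a sketch under assumptions: toUtf8() is a hypothetical helper, it only recognises the three BOMs shown, and content without a BOM is assumed to be UTF-8 already.

```php
<?php
/**
 * Hypothetical helper: convert file contents to UTF-8 based on a leading BOM.
 * Content without a recognised BOM is returned unchanged.
 */
function toUtf8(string $raw): string
{
    if (strncmp($raw, "\xFF\xFE", 2) === 0) {
        return mb_convert_encoding(substr($raw, 2), 'UTF-8', 'UTF-16LE');
    }
    if (strncmp($raw, "\xFE\xFF", 2) === 0) {
        return mb_convert_encoding(substr($raw, 2), 'UTF-8', 'UTF-16BE');
    }
    if (strncmp($raw, "\xEF\xBB\xBF", 3) === 0) {
        return substr($raw, 3); // already UTF-8, just drop the BOM
    }
    return $raw;
}

// Made-up sample: "1\n2\n" saved as UTF-16LE with a BOM.
$sample = "\xFF\xFE" . mb_convert_encoding("1\n2\n", 'UTF-16LE', 'UTF-8');

foreach (explode("\n", rtrim(toUtf8($sample))) as $line) {
    echo $line, "\n";
}
```

In practice $sample would be replaced by file_get_contents() on the real file; UTF-16 files saved without a BOM would need a different detection strategy.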

Working DEMO: http://phpfiddle.org/main/code/xrsk-a0uv
Text File:: http://m.uploadedit.com/ba3s/1500405331493.txt
Problem: when the text file was created, UTF-16 was selected as the encoding format. Notepad, Notepad++, Sublime, etc. use UTF-8 by default.
<?php
header('Content-Type: text/plain;charset=utf-8');
$handle = @fopen("http://m.uploadedit.com/ba3s/1500405331493.txt", "r");
if ($handle) {
    while (($buffer = fgets($handle, 4096)) !== false) {
        echo $buffer."<br>";
    }
    if (!feof($handle)) {
        echo "Error: unexpected fgets() fail\n";
    }
    fclose($handle);
}
?>
NOTE: Add the header for charset=utf-8:
header('Content-Type: text/plain;charset=utf-8');
Output using "\n"
Output using "<br>"

fgetcsv skip blank lines in file

I have a script that grabs all the files in my "logs" folder and merges them into one array. My only problem is that the script sometimes breaks if there is a blank or empty line. How can I tell it to skip blank lines automatically and go on to the next one? The blank lines are not necessarily at the top or bottom; they could be in the middle of the CSV file.
<?php
$csv = array();
$files = glob('../logs/*.*');
$out = fopen("newfile.txt", "w");
foreach($files as $file){
    $in = fopen($file, "r");
    while (($result = fgetcsv($in)) !== false)
    {
        $csv[] = $result;
    }
    fclose($in);
}
fclose($out); // close once, after the loop, not on every iteration
print json_encode(array('aaData' => $csv ));
?>

As you can read in the documentation for fgetcsv():
A blank line in a CSV file will be returned as an array comprising a single null field, and will not be treated as an error.
Checking for that before adding it to your data array should be sufficient:
while (($result = fgetcsv($in)) !== false) {
    if (array(null) !== $result) { // ignore blank lines
        $csv[] = $result;
    }
}
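The check above can be tried in isolation with a small sketch. The CSV data here is made up and fed through an in-memory stream instead of the log files.

```php
<?php
// Made-up CSV with blank lines in the middle and at the end.
$data = "a,b\n\nc,d\n\n";

$in = fopen('php://memory', 'w+');
fwrite($in, $data);
rewind($in);

$csv = [];
while (($result = fgetcsv($in, 0, ',', '"', '\\')) !== false) {
    if ($result === [null]) { // a blank line comes back as a single null field
        continue;
    }
    $csv[] = $result;
}
fclose($in);

echo json_encode(['aaData' => $csv]), "\n";
```

The strict comparison against [null] leaves alone rows that merely have an empty first field, which fgetcsv() reports as an empty string rather than null.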

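This is tested and is the simplest way. The explanation is that a blank line makes fgetcsv() return a non-empty array with just a null element inside. Compare strictly, because a loose == NULL check would also skip rows whose first field is an empty string:
if ($result[0] === NULL)
    continue;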

In short
$csv = array_map('str_getcsv', file($file_path, FILE_SKIP_EMPTY_LINES|FILE_IGNORE_NEW_LINES));
Explanation
file() reads the content of the file into an array. The FILE_SKIP_EMPTY_LINES flag skips the empty lines in the file; it only takes effect together with FILE_IGNORE_NEW_LINES, as used here.
array_map() applies str_getcsv() to each element of the array. str_getcsv() parses the string input for fields in CSV format and returns an array containing the fields.

php import csv into sql database

What's wrong with this? When I echo a row from the CSV file and concatenate anything to the end of the row, it doesn't show up. Instead, all the rows are echoed and the concatenated string only shows up once, at the very end. Is this some kind of buffering issue that won't let me concatenate strings with content from my CSV file? It's running on my local WAMP server, and I have tried different line delimiters in my explode() call; I'm sure the file only uses \n at the end of a line.
I'm trying to parse a CSV file row by row so I can check its content before I use it to construct an SQL statement and insert it into my database.
$file = fopen($filename, "r");
$filesize = filesize($filename);
$filecontent = fread($file, $filesize);
fclose($file);
$rows = explode("\n", trim($filecontent));
foreach ($rows as $row)
{
    echo $row . '<br />';
}

If the delimiter is written in single quotes, '\n', you are splitting by the literal two-character string \n (a backslash followed by n). Unless that literal sequence appears somewhere in the file, this does nothing. You probably meant "\n" (double quotes), which is an actual line break.
Your overall process is terribly inefficient though. You should use fgetcsv() and process the file line by line, instead of reading it into memory all at once.
$handle = fopen('test.csv', 'r');
while (($row = fgetcsv($handle)) !== false) {
    foreach ($row as $field) {
        echo $field . '<br />';
    }
}
fclose($handle);
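The difference between the two quoting styles can be seen in a small sketch. The file contents here are made up and assume Unix line endings.

```php
<?php
// Made-up file contents with "\n" line endings.
$filecontent = "id,name\n1,alice\n2,bob\n";

// Single quotes: the delimiter is backslash + n, which never occurs in the data.
$wrong = explode('\n', trim($filecontent));

// Double quotes: the delimiter is a real line feed.
$right = explode("\n", trim($filecontent));

echo count($wrong), "\n"; // 1
echo count($right), "\n"; // 3
```

With the single-quoted delimiter the whole file stays in one array element, which matches the symptom of appended text showing up only once at the very end.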

Use the fgetcsv() function to convert a CSV file to an array. Each call returns one row, so loop until it returns false:
$csvFile = "test.csv";
$csvSeparator = ",";
$csvData = array();
$handle = fopen($csvFile, "r");
while (($row = fgetcsv($handle, 0, $csvSeparator)) !== false) {
    $csvData[] = $row;
}
fclose($handle);
Dump the data to show the structure:
var_dump($csvData);
Now you can convert the data for use in the database.
