PHP: fgetcsv does not support some special characters

I am uploading a CSV file and reading its contents with fgetcsv().
I am already using UTF-8 encoding, yet some characters get converted into ?.
Some of those characters are:
ť č ň
Is there a way to read the CSV file so that special characters from any language are accepted?
Also, how do I add the BOM when reading the CSV?

Try fgets instead of fgetcsv. fgetcsv() tries to be binary-safe, but it's actually not.
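A minimal sketch of that approach, assuming a comma-delimited UTF-8 file named data.csv (both the filename and the delimiter are assumptions): read each line with fgets(), strip a leading UTF-8 BOM from the first line, and parse with str_getcsv(). Note that quoted fields containing line breaks would need extra handling.
<?php
// Read line by line with fgets() and parse each line with str_getcsv().
$handle = fopen('data.csv', 'r');
if ($handle === false) {
    die('Could not open file');
}

$rows = [];
$firstLine = true;
while (($line = fgets($handle)) !== false) {
    if ($firstLine) {
        // Remove a UTF-8 BOM (EF BB BF) so the first field is not corrupted.
        $line = preg_replace('/^\xEF\xBB\xBF/', '', $line);
        $firstLine = false;
    }
    $rows[] = str_getcsv(rtrim($line, "\r\n"));
}
fclose($handle);

print_r($rows);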

Related

Laravel Storage file encoding

I'm trying to save a text file as UTF-8 using Laravel's Storage facade. Unfortunately I couldn't find a way, and it saves as us-ascii. How can I save it as UTF-8?
Currently I'm using the following code to save the file:
Storage::disk('public')->put('files/test.txt', $fileData);
You should be able to prepend "\xEF\xBB\xBF" (the BOM, which marks the content as UTF-8) to your $fileData. So:
Storage::disk('public')->put('files/test.txt', "\xEF\xBB\xBF" . $fileData);
There are other ways to convert your text before writing it to the file, but this is the simplest and easiest to read and execute. As far as I know, there are also no character-encoding methods within Illuminate\Filesystem\Filesystem.
For more information: https://stackoverflow.com/a/9047876/823549 and "What's different between UTF-8 and UTF-8 without BOM?".
ASCII is a subset of UTF-8, so all ASCII files are already UTF-8 encoded. The bytes in the ASCII file and the bytes that would result from "encoding it to UTF-8" would be exactly the same bytes. There's no difference between them, so there's no need to do anything.
It looks like your problem is that the files are not actually ASCII. You need to determine what encoding they are using, and transcode them properly.
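A hedged sketch of that detect-then-transcode step (the candidate encoding list and the files/source.txt path are illustrative assumptions; single-byte encodings such as ISO-8859-1 and Windows-1252 are hard to distinguish reliably, so verify the detection against real data):
<?php
use Illuminate\Support\Facades\Storage;

// Hypothetical source file; replace with wherever $fileData actually comes from.
$fileData = file_get_contents('files/source.txt');

// Try to detect the current encoding from a short list of likely candidates.
$encoding = mb_detect_encoding($fileData, ['UTF-8', 'ISO-8859-1', 'Windows-1252'], true);

if ($encoding !== false && $encoding !== 'UTF-8') {
    // Transcode to UTF-8 before writing the file.
    $fileData = mb_convert_encoding($fileData, 'UTF-8', $encoding);
}

Storage::disk('public')->put('files/test.txt', $fileData);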
I recommend using mb_convert_encoding instead:
$fileData = mb_convert_encoding($fileData, "UTF-8", "auto");
Storage::disk('public')->put('files/test.txt', $fileData);

Convert txt file encoding from DOS737 to UTF8

I have a txt file that contains Greek characters. When I open the file with Notepad, it reports the encoding as ASCII.
But the only way I can read the Greek characters is to change the character set (in OpenOffice Writer or EditPad Lite) to DOS737.
The process I need to implement in PHP is to open the file, split the text, and import it into a database. Everything is OK except that I cannot get the Greek characters through as they are.
I tried iconv but with no result.
I also tried mb_convert_encoding($data[0], "DOS737"); but I get the warning mb_convert_encoding(): Unknown encoding "DOS737".
I also tried utf8_encode, with no luck.
Any suggestions?
Finally found it.
It was easy... For anyone who might have the same issue, use iconv("CP737", "UTF-8", $string);
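A minimal sketch of the whole read-convert-split step, assuming a CP737 file named greek.txt with one semicolon-separated record per line (both the filename and the delimiter are assumptions):
<?php
// Read the raw DOS737 (code page 737) bytes.
$raw = file_get_contents('greek.txt');

// Convert the whole buffer to UTF-8 before splitting it.
$utf8 = iconv('CP737', 'UTF-8', $raw);

foreach (explode("\n", $utf8) as $line) {
    $line = trim($line);
    if ($line === '') {
        continue;
    }
    $fields = explode(';', $line);
    // ...insert $fields into the database here...
}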

Why does fgetcsv add characters between characters?

I use a PHP script with fgetcsv() to import data from CSV files I did not create (no choice).
My problem is that the data are imported into my database with a special character between each character, in every field.
For example, "Mounted Print" is imported as "�M�o�u�n�t�e�d� �P�r�i�n�t�".
I tried to change the encoding with no result: utf8_encode/decode, iconv...
Any idea?
Thanks

PHP script to convert an ANSI file to UTF-8

As part of a project in PHP, I have to process a CSV file to put data into a database.
However, the CSV file is encoded in ANSI, but I would like to treat the data as UTF-8 so it appears correctly in my database. Do you know a way to automate this conversion?
I already read about the function mb_convert_encoding, but it works on $string parameters.
If you know for sure that your current encoding is pure ASCII, then you don't have to do anything, because ASCII is already valid UTF-8.
But if you still want to convert just to be sure, you can use iconv:
$string = iconv('ASCII', 'UTF-8//IGNORE', $string);
The IGNORE will discard any invalid characters, just in case some are not valid ASCII.
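As a hedged example, "ANSI" on Windows usually means Windows-1252; assuming that is the case here, the whole CSV can be converted in one pass before parsing (the import.csv filename and the source encoding are assumptions):
<?php
// Read the whole file and transcode it before any CSV parsing.
$raw  = file_get_contents('import.csv');
$utf8 = str_replace("\r", '', iconv('Windows-1252', 'UTF-8//TRANSLIT', $raw));

// Parse the converted text; each line becomes an array of UTF-8 fields.
$lines = array_filter(explode("\n", $utf8), 'strlen');
$rows  = array_map('str_getcsv', $lines);

// $rows is now ready to be inserted into the database.
print_r($rows);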

PHP: Use (or not) 'utf8_encode' in combination with setting BOM to \xEF\xBB\xBF

When using the following code:
$myString = 'some contents';
$fh = fopen('newfile.txt', 'w');
fwrite($fh, "\xEF\xBB\xBF" . $myString);
Is there any point in first encoding the text ($myString in the example) with PHP functions, e.g. by running utf8_encode($myString); or similar iconv() calls?
Assuming that the BOM \xEF\xBB\xBF is written to the file first, and that UTF-8 can represent practically all characters in the world, I don't see any potential failure scenario in creating a file this way. In other words, I don't see any case where a major text editor wouldn't be able to interpret the newly created file correctly, displaying all characters as intended. This holds even if $myString were a PHP $_POST variable from an HTML form. Am I right?
If your source file is UTF-8 encoded, then the string $myString is also UTF-8 encoded and you don't need to convert it. Otherwise, you need to use iconv() to convert the encoding before writing it to the file.
And note that utf8_encode() is used to encode an ISO-8859-1 string to UTF-8.
Note that utf8_encode will only convert ISO-8859-1 encoded strings.
In general, given that PHP strings are plain byte sequences with no attached character set, you will need to encode any string containing non-ASCII characters to UTF-8 before writing it to a UTF-8 file.
The BOM is optional (most text file readers now will scan the file for its encoding).
From Wikipedia:
"The Unicode Standard permits the BOM in UTF-8, but does not require or recommend for or against its use."
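For completeness, a hedged sketch combining the two points: convert only when the input is known not to be UTF-8 (ISO-8859-1 is assumed here purely for illustration), then prepend the optional BOM:
<?php
// Assume the incoming string is ISO-8859-1 (e.g. from a form on a Latin-1 page);
// skip the conversion entirely if the source is already UTF-8.
$myString = 'some contents';
$utf8 = iconv('ISO-8859-1', 'UTF-8', $myString);

// The BOM is optional; most editors detect UTF-8 without it.
$fh = fopen('newfile.txt', 'w');
fwrite($fh, "\xEF\xBB\xBF" . $utf8);
fclose($fh);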
