Character encoding php and excel - php

First of all, sorry for my English :p
I want to upload an Excel file (.xlsx) containing names through my website. The upload works and the data is saved to my database, but when I show that data on my website, names like João or André are displayed as Jo�o and Andr�.
The collation of that table is utf8_general_ci, and in the database those names are shown as Joã£o and Andrã©.
According to mb_detect_encoding(), the names in the Excel file are UTF-8.
I tried converting the names to UTF-8 with utf8_encode() and mb_convert_encoding(), saving the Excel file as UTF-8, saving it as ISO-8859-15, and pasting the names into Notepad, saving them as UTF-8, and copying them back into Excel. I have tried many things and none has worked for me!
I can't convert the Excel file to .csv, because it has to be an Excel Workbook. I mention this because I read that converting could be a solution.
I have run out of ideas...
UPDATE: Strangely, it doesn't work on localhost, but when I upload it to the server the characters are displayed correctly.

If mb_detect_encoding() reports 'UTF-8', then you only have to declare the character encoding in your HTML meta tag, so that your browser knows how to decode and display the data.
<meta charset="UTF-8">
https://www.w3schools.com/html/html_charset.asp
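A minimal sketch of the same idea on the PHP side, assuming the names read from the workbook really are UTF-8, as mb_detect_encoding() suggests. The function name renderNamePage() is hypothetical. Since the page behaves differently on localhost and on the server, the two setups may be sending different default charsets, so it may also help to state the charset in the Content-Type header rather than rely on defaults:

```php
<?php
// Hypothetical helper: builds a small HTML page that declares UTF-8
// and escapes the name with the same encoding.
function renderNamePage(string $name): string
{
    // Refuse input that is not valid UTF-8 instead of printing broken bytes.
    if (!mb_check_encoding($name, 'UTF-8')) {
        throw new InvalidArgumentException('Name is not valid UTF-8');
    }
    $safe = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');

    return "<!DOCTYPE html>\n<html>\n<head>\n<meta charset=\"UTF-8\">\n</head>\n"
         . "<body>\n<p>" . $safe . "</p>\n</body>\n</html>\n";
}

// Declare the charset in the HTTP header as well, before any output is sent.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

$html = renderNamePage("Jo\u{00E3}o"); // "João" written with a Unicode escape
```

If the header and the meta tag disagree, browsers generally give priority to the header, so it is worth checking both.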

Related

How to decode unicode text data back to excel file with or without php?

Hello, I'm trying to convert encoded text data back into an Excel file.
I have an Excel file (.xlsx); when it is opened in Notepad, it shows encoded text data. That encoded text is what I receive, and I need to convert it back into an Excel file.
When I open the file in Notepad, it looks like this: (excel_file_txt.txt)
PK ! ¤Y€¼ h ¥ xl/worksheets/sheet1.xmlœ½[“ý¸qåû~>…
¢ßµE7Â!{bÔU¬[ĉs}î‘Ú–b$µCÝÏ|ûá®]õWe&2mIÑ’øC,®
XL ¿ý/ÿó/þÕÿøño?
ÿ駿þãwëmùîW?þõ÷?ýáOý—üîÿýŽ_ïßýêç_~øë~øóOýñ¿û_?
þüÝù§ÿã·ÿþÓßþûÏüñÇ_~užà¯?ÿãwüå—ý‡ßüæçßÿñÇ¿üðóí§ýñ¯'ùçŸþö—~9ÿïßþå7?
ÿëß~üáïAùóo¶e‰¿ùËúëwÏ3üÃß®œã§þç?ýþÇ—Ÿ~ÿoùñ¯¿<Oò·ÿüÃ/çåÿüÇ?
ýëÏßýÓoÿð§“=þž_ýíÇþÇïþëú#8÷Ýoþé·ïuÿúñßþò¿õËÿíÿþñÏ?þþ—ÿpÞï~õøÓþÛO?
ý÷,ç¡åú{¼_Öÿù·_ýáÇþáßþüËÿõÓ¿ßüÓ¿üñ—ó$áòûŸþüóû?õ—?ýõýÌùá>køÓ~ùã
£Ø-­Kv)<îñÿúóïð÷ÿöó/?ýåÿÿ(òq
I tried saving this text file with Save As under the name excelfile.xlsx, with the encoding set to "Unicode" (I also tried other encodings such as UTF-8 and ANSI).
But when opened in Excel it shows as: (excel_file_txt_encode_save.xlsx)
However, the original Excel file looks like: (excel_file.xlsx)
Can someone help me get the Excel file back from the encoded text data, or suggest another way to do it?

PHP: characters wrongly encoded in csv when using utf8_encode

I am facing a strange issue when extracting data from a MySql database and inserting it in a CSV file. In the database, the field value is the following:
K Secure Connection 1 año 1 PC
When I echo it before writing it to the CSV file, I get the same as the above in my terminal.
I use the following code to write content to the CSV file:
fwrite($this->fileHandle, utf8_encode($lineContent . PHP_EOL));
Yet, when I open the CSV with LibreOffice Calc (and specify UTF-8 as the encoding format), the following is displayed:
K Secure Connection 1 aÃ±o 1 PC
I have no idea why this happens. Can someone explain how to solve this?
REM:
SELECT @@character_set_database;
returns
latin1
REM 2:
`var_dump($lineContent, bin2hex($lineContent))`
gives
string(39) "Kaspersky Secure Connection 1 año 1 PC"
string(78) "4b6173706572736b792053656375726520436f6e6e656374696f6e20312061c3b16f2031205043"
The var_dump shows that the string is already encoded in UTF-8. Using utf8_encode on it will garble it (the function attempts a conversion from Latin-1 to UTF-8). You're therefore actually writing "aÃ±o" encoded in UTF-8 into your file, which is then "correctly" picked up by LibreOffice.
Simply don't utf8_encode.
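The double encoding can be reproduced with the bytes from the var_dump above. This is a sketch, not the asker's code: mb_convert_encoding() from ISO-8859-1 stands in for utf8_encode() (which performs the same Latin-1 to UTF-8 conversion), and ensureUtf8() is a hypothetical guard that converts only when the input is not already valid UTF-8.

```php
<?php
// "año" as UTF-8 bytes, taken from the bin2hex() output above (61 c3b1 6f).
$utf8 = hex2bin('61c3b16f');

// Treating those bytes as ISO-8859-1 and converting again double-encodes them.
$double = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
$doubleHex = bin2hex($double); // 61c383c2b16f

// Hypothetical guard: convert only if the string is not already valid UTF-8.
// Assumption: any input that is not valid UTF-8 is ISO-8859-1.
function ensureUtf8(string $s): string
{
    return mb_check_encoding($s, 'UTF-8')
        ? $s
        : mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');
}

$fromUtf8   = ensureUtf8($utf8);    // already UTF-8, returned unchanged
$fromLatin1 = ensureUtf8("a\xF1o"); // Latin-1 "año", converted to UTF-8
```

A validity check like this is only a heuristic: a short Latin-1 string can be valid UTF-8 by coincidence, so knowing the source encoding is more reliable than guessing it.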
I would try opening the CSV file with another editor, just to make sure the problem is not with the office suite.
You may be double encoding the content if it is already in UTF-8 format.
I also prefer to always work with UTF-8, so I fetch the data from the database already in UTF-8 and no further conversion is needed. For that I run this query right after opening the SQL connection:
"set names 'utf8'"

move_uploaded_file function in PHP

I'm using the move_uploaded_file() function to upload files, and some of the files have Arabic names. I googled the problem but found no answer. I tried a meta tag, Base64 encoding, and everything else I could think of, but it still doesn't work.
What is the solution?
<?php
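$data_name = $_POST['name'];
// Base64 output may contain "/" and "+", which are unsafe in file names.
$name = base64_encode($_FILES['file']['name']);
// NOTE: $course_name is not defined anywhere in this snippet.
$location = "../Files/" . $course_name . "/";
$tmp_name = $_FILES['file']['tmp_name'];
if (move_uploaded_file($tmp_name, $location . $name)) {
    echo "OK";
}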
?>
One solution can be:
Store the Arabic file name in a database and save the file itself under a custom unique name based on the current time. Save the custom name in the database too; at retrieval time, map it back to the original name and show that to the user.
OR use a name conversion library which converts text from Arabic to English and vice versa.
For this purpose, have a look at these references:
how to convert english into arabic dynamically
convert Persian/Arabic numbers to English numbers
OR convert the string to UTF-8 using PHP. For help:
PHP: Convert any string to UTF-8 without knowing the original character set, or at least try
http://php.net/manual/en/function.utf8-encode.php
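The first suggestion might look like the sketch below. makeStoredName() is a hypothetical helper; the original (possibly Arabic) name would be kept in a database column alongside the generated name, which is not shown here.

```php
<?php
// Hypothetical helper: derive an ASCII-only name for storage on disk.
// Only the extension is taken from the original name, and only if it is
// a short alphanumeric string; otherwise a neutral "bin" is used.
function makeStoredName(string $originalName): string
{
    $dot = strrpos($originalName, '.');
    $ext = $dot === false ? '' : strtolower(substr($originalName, $dot + 1));
    if (!preg_match('/^[a-z0-9]{1,10}$/', $ext)) {
        $ext = 'bin';
    }
    // Timestamp plus random bytes keeps names unique and sortable by time.
    return date('YmdHis') . '_' . bin2hex(random_bytes(8)) . '.' . $ext;
}

$stored = makeStoredName("\u{062A}\u{0642}\u{0631}\u{064A}\u{0631}.PDF");

// In an upload handler this would be used roughly as:
// move_uploaded_file($_FILES['file']['tmp_name'], $location . $stored);
// and both $_FILES['file']['name'] and $stored would be saved to the database.
```

Keeping the on-disk name ASCII-only sidesteps any dependence on the file system's encoding, while the original name is still available for display.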

Accents in uploaded file being replaced with '?'

I am building a data import tool for the admin section of a website I am working on. The data is in both French and English, and contains many accented characters. Whenever I attempt to upload a file, parse the data, and store it in my MySQL database, the accents are replaced with '?'.
I have text files containing data (charset is iso-8859-1) which I upload to my server using CodeIgniter's file upload library. I then read the file in PHP.
My code is similar to this:
$this->upload->do_upload();
$data = array('upload_data' => $this->upload->data());
$fileHandle = fopen($data['upload_data']['full_path'], "r");
while (($line = fgets($fileHandle)) !== false) {
    echo $line;
}
This produces lines with accents replaced with '?'. Everything else is correct.
If I download my uploaded file from my server over FTP, the charset is still iso-8859-1, but a diff reveals that the file has changed. However, if I open the file in TextEdit, it displays properly.
I attempted to use PHP's stream_encoding method to explicitly set my file stream to iso-8859-1, but my build of PHP does not have the method.
After running out of ideas, I tried wrapping my strings in both utf8_encode and utf8_decode. Neither worked.
If anyone has any suggestions about things I could try, I would be extremely grateful.
It's important to see whether the corruption happens before or after the query is issued to MySQL. There are too many possible things happening here to pinpoint it. Are you able to output your MySQL query to check this?
Assuming that your query IS properly formed (no corruption at the stage where the query is output), there are a couple of things that you should check.
What is the character encoding of the database itself? (collation)
What is the Charset of the connection - this may not be set up correctly in your mysql config and can be manually set using the 'SET NAMES' command
In my own application I issue a 'SET NAMES utf8' as my first query after establishing a connection as I am unable to change the MySQL config.
See this.
http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html
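As a sketch of the same idea with PDO, the connection charset can be requested in the DSN instead of with a separate SET NAMES query. The host, database name, and credentials below are placeholders, buildMysqlDsn() is a hypothetical helper, and utf8mb4 is an assumption made here (MySQL's utf8 is limited to 3-byte characters); passing 'utf8' would match the query above.

```php
<?php
// Hypothetical helper: build a PDO MySQL DSN that sets the connection charset.
function buildMysqlDsn(string $host, string $dbName, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbName, $charset);
}

$dsn = buildMysqlDsn('localhost', 'shop');

// Connecting is left commented out because it needs a real server:
// $pdo = new PDO($dsn, 'user', 'secret', [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);
//
// With mysqli, the equivalent is a call on the open connection:
// $mysqli->set_charset('utf8mb4');
```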
Edit: If the issue is not related to mysql I'd check the following
You say the encoding of the file is 'charset is iso-8859-1' - can I ask how you are sure of this?
What happens if you save the file itself as utf8 (Without BOM) and try to reprocess it?
What is the encoding of the php file that is performing the conversion? (What are you using to write your php - it may be 'managing' this for you in an undesired way)
(an aside) Are the files you are processing suitable for processing using fgetcsv instead?
http://php.net/manual/en/function.fgetcsv.php
Files uploaded to your server should be returned the same on download. That means, the encoding of the file (which is just a bunch of binary data) should not be changed. Instead you should take care that you are able to store the binary information of that file unchanged.
To achieve that with your database, create a BLOB field. That's the right column type for it. It's just binary data.
Assuming you're using MySQL, this is the reference: The BLOB and TEXT Types, look out for BLOB.
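The problem is that the file is iso-8859-1 while the rest of the chain expects utf-8. To convert it to the correct charset, you can use the iconv function, like so:
$output_string = iconv("ISO-8859-1", "UTF-8//TRANSLIT", $input_string);
iso-8859-1 does include the French accented letters, but it encodes them as single bytes that are not valid utf-8, which is why they can show up as '?' when the bytes are treated as utf-8.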
It would be so much better if everything were utf-8, as it handles virtually every character known to man.
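A sketch of converting while reading, assuming the uploaded file really is ISO-8859-1 as the question states. readLatin1LinesAsUtf8() is a hypothetical helper, and an in-memory stream stands in for the uploaded file so the example runs on its own:

```php
<?php
// Hypothetical helper: read an ISO-8859-1 text stream line by line
// and return the lines converted to UTF-8 (line endings stripped).
function readLatin1LinesAsUtf8($handle): array
{
    $lines = [];
    while (($line = fgets($handle)) !== false) {
        $lines[] = iconv('ISO-8859-1', 'UTF-8', rtrim($line, "\r\n"));
    }
    return $lines;
}

// Stand-in for the uploaded file: "café" and "noël" encoded as ISO-8859-1.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "caf\xE9\nno\xEBl\n");
rewind($handle);

$lines = readLatin1LinesAsUtf8($handle);
fclose($handle);
```

With a real upload, the handle would come from fopen() on the uploaded file's full path, as in the question's code, and the converted lines could then be stored over a UTF-8 database connection.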

How to differentiate between MacRoman and Windows-1251 encodings in PHP?

I've been pulling my hair out for a few days now. I've googled and searched Stack Overflow a lot without success.
I'm importing some data from a CSV file. This CSV file is generated in Excel on either Windows or Mac, which gives two different encodings, "Windows-1251" and "MacRoman". Both are single-byte encodings, and mb_detect_encoding does not help: it always detects the first encoding I put in the list.
For example :
mb_detect_encoding($buffer, 'macroman, windows-1251, UTF-8');
Will give "macroman".
With the same string, trying :
mb_detect_encoding($buffer, 'windows-1251, macroman, UTF-8');
will give "Windows-1251".
So how can you properly tell them apart? I need to convert my input string (the CSV file content) to UTF-8 to insert it into the DB.
Maybe I'm missing something? How do you usually manage to parse CSV files and save the data properly in the DB (utf8)?
Thanks for any clue!
I think the only way to make sure this is handled properly is to define a process for saving the CSV file in the first place. Then you just have to utf8_encode what's coming in (which assumes the agreed encoding is ISO-8859-1) and it'll go fine...
