I have a Flash application that communicates with PHP to save data to an nvarchar(1200) column. However, when I switch to a different input language (locale) and type into the Flash app, the letters look correct there, but in the DB they are saved as question marks instead of the real letters.
How can I solve this problem?
How to save the real letters in db?
Your database may not be configured to use UTF-8 encoding. SQL Server 7.0 and SQL Server 2000 use a different Unicode encoding (UCS-2) and do not recognize UTF-8 as valid character data.
Other versions of mssql may be similar.
See this for more information: http://support.microsoft.com/kb/232580
If that's not the issue, backtrack to PHP and test the encoding type on the data you are receiving. Make sure it matches what needs to be in your DB, or convert it first.
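One way to test what PHP actually received is to look at the raw bytes. The sketch below is only an illustration: the helper name describeEncoding is hypothetical (it is not from the answer above), and it assumes the mbstring extension is available. It reports whether a string is valid UTF-8 and shows its bytes in hex. If the hex already contains 3f where a letter should be, the character was lost before it reached PHP, not in the database.

```php
<?php
// Hypothetical helper: report whether a string is valid UTF-8 and dump its bytes.
function describeEncoding(string $value): array
{
    return [
        'valid_utf8' => mb_check_encoding($value, 'UTF-8'),
        'hex'        => bin2hex($value),
    ];
}

// "é" as UTF-8 is the two bytes C3 A9; as ISO-8859-1 it is the single byte E9.
$utf8   = describeEncoding("\xC3\xA9");
$latin1 = describeEncoding("\xE9");
```

Logging the hex value next to the incoming request data makes it possible to see at which layer the bytes change.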
Possible Duplicate:
UTF-8 all the way through
Okay, it's stupid that I can't figure this out.
The MySQL database is set to the utf8_general_ci collation. The field I'm having problems with is of type longtext.
Characters added to the database as é or other accented characters are returned as �.
I run the output through stripslashes, and I've tried both with and without html_entity_decode, but I see no change in the output. What am I doing wrong?
Cheers
What character encoding does the string have that you try to insert? If it is in ISO-8859-1 you can use the PHP function utf8_encode() to encode it to UTF-8 before inserting it into the database.
http://php.net/manual/en/function.utf8-encode.php
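Note that utf8_encode() is deprecated as of PHP 8.2; mb_convert_encoding() does the same job when told the source encoding. A minimal sketch, assuming the mbstring extension is available and that the input really is ISO-8859-1 when it is not valid UTF-8. The helper name latin1ToUtf8 is hypothetical, and the "already UTF-8" check is only a heuristic to avoid double-encoding.

```php
<?php
// Hypothetical helper: convert ISO-8859-1 input to UTF-8 before inserting it.
// Assumption: input that is already valid UTF-8 should be left untouched,
// which avoids double-encoding but is only a heuristic.
function latin1ToUtf8(string $value): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
}

$fromLatin1 = latin1ToUtf8("caf\xE9");     // ISO-8859-1 bytes for "café"
$alreadyOk  = latin1ToUtf8("caf\xC3\xA9"); // UTF-8 bytes for "café"
```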
Getting encoding right is really tricky - there are too many layers:
Browser
Page
PHP
MySQL
The SQL command "SET CHARSET utf8" from PHP will ensure that the client side (PHP) will get the data in utf8, no matter how they are stored in the database. Of course, they need to be stored correctly first.
DDL definition vs. real data
The encoding defined for a table or column doesn't necessarily mean that the data are in that encoding. If you happen to have a table defined as utf8 but the data were stored in a different encoding, MySQL will still treat them as utf8 and you're in trouble. This means you have to fix the stored data first.
What to check
You need to check in what encoding the data flow at each layer.
Check the HTTP headers and the page's meta headers.
Check what's really sent in body of the request.
Don't forget that MySQL has encoding almost everywhere:
Database
Tables
Columns
Server as a whole
Client
Make sure that there's the right one everywhere.
Conversion
If you receive data in e.g. windows-1250, and want to store in utf-8, then use this SQL before storing:
SET NAMES 'cp1250';
If you have data in the DB as windows-1250 and want to retrieve them as utf8, use:
SET CHARSET 'utf8';
Last note:
Don't rely on overly "smart" tools to show the data. E.g. phpMyAdmin handles (or did when I was using it) encoding really badly, and because it goes through all the layers, it's hard to find out where things go wrong. Also, Internet Explorer had the really stupid behavior of "guessing" the encoding based on weird rules. Use simple editors where you can switch the encoding. I also recommend MySQL Workbench.
I am now working on a web-based PHP app that works with a MySQL database.
The database uses the latin1 character set, and Persian text does not display properly.
The database is also used by a Windows application, which shows and saves Persian text correctly.
I don't want to change the charset, because the Windows application depends on it.
Question:
How can I convert latin1 to UTF-8 for display and UTF-8 to latin1 for saving in my web-based PHP app, or otherwise use Persian/Arabic text in a latin1 database without problems?
Note:
One of my texts is احمد رحمانی. When saved from my Windows-based software it is stored as ÇÍãÏ ÑÍãÇäí, yet the old Windows-based software still displays it as احمد رحمانی.
[Image: the database, its charsets and collation, and the Windows-based software]
Edit: Your screenshot shows that the diagnosis below is probably correct.
What to do: Try using iconv() in your PHP web application. You would have to guess or find out what collation/codepage your Windows app uses.
Then something like this might work:
$string_decoded = iconv("windows-1256", "utf-8", $string);
You may need to experiment to get this working. Also, I think you need to force your database connection to use latin1 instead of UTF-8!
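To make the idea concrete, here is a minimal sketch of both directions. It rests on two assumptions that would need to be verified: the connection charset is latin1, so MySQL hands PHP the stored bytes unchanged, and the Windows application wrote those bytes as Windows-1256. The function names are hypothetical.

```php
<?php
// Assumption: the connection charset is latin1, so MySQL hands PHP the stored
// bytes unchanged, and the Windows application wrote them as Windows-1256.
function storedBytesToUtf8(string $raw): string
{
    $utf8 = iconv('WINDOWS-1256', 'UTF-8', $raw);
    if ($utf8 === false) {
        throw new RuntimeException('Input could not be read as Windows-1256');
    }
    return $utf8;
}

// Reverse direction, for saving text typed into the web app.
function utf8ToStoredBytes(string $utf8): string
{
    $raw = iconv('UTF-8', 'WINDOWS-1256', $utf8);
    if ($raw === false) {
        throw new RuntimeException('Text cannot be represented in Windows-1256');
    }
    return $raw;
}

// The bytes C7 CD E3 CF look like "ÇÍãÏ" when they are read as latin1.
$name = storedBytesToUtf8("\xC7\xCD\xE3\xCF");
```

Text that Windows-1256 cannot represent makes the reverse conversion fail, so the web app would have to reject or replace such input before saving.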
If you ask me, this is not a good basis for your web app. You would have to convert data into a broken format all the time. You may have to break compatibility with your application, or write an import tool.
The latin1 character set does not cover Persian characters. Proof at collation-charts.org
The only explanation I have for why your Delphi program is able to store Arabic characters in a latin1 database is that it may be misusing the latin1 database to store data that latin1 doesn't cover, e.g. Windows-1256 Arabic. The program would then store the raw bytes of each Arabic character, while in the latin1 character set those same bytes stand for other, Latin characters. As long as only the Delphi program was storing and fetching the data, no one noticed.
If I'm correct (and it's the only way I can see what you describe happening), this works only as long as every application involved handles the data in the same way, which is really the wrong way.
You should be able to confirm whether this is the case by looking at the data from a "neutral" database tool like phpMyAdmin or HeidiSQL. If you see garbage there instead of Arabic / Persian characters, I may be right.
As to what to do to make your PHP web app work with the same database as your Delphi app: to be honest, I'm not really sure. As far as I know, there is no way to force MySQL to use one encoding instead of the other. You would have to manually "re-encode" the data after fetching it in your web app. This is likely to be a painful process.
But first, try to find out what exactly is happening.
Having trouble getting foreign characters and Emoji to display.
Edited to clarify
A user types an Emoji character into a text field which is then sent to the server (php) and saved into the database (mysql). When displaying the text we grab a JSON encoded string from the server, which is parsed and displayed on the client side.
QUESTION: the character for a "trophy" emoji saved in the DB reads as
%uD83C%uDFC6
When that is sent back to the client we don't see the emoji picture, we actually see the raw encoded text.
How would we get the client side to read that text as an emoji character and display the image?
(all on an iphone / mobile safari)
Thanks!
Check the encodings used by your client, your web server, and your database table. Make sure they are all using encodings that can handle the characters you are concerned about.
It looks like the problem is my MySQL encoding. utf8mb4 would allow it, but unfortunately it's unavailable before MySQL v5.5.
the character for a "trophy" emoji saved in the DB reads as %uD83C%uDFC6
Then your data are already mangled. %u escapes are specific to the JavaScript escape() function, which should generally never be used. Make sure your textarea->PHP handling uses standards-compliant encoding, eg encodeURIComponent if you need to get a JS variable into a URL query.
Then, having proper raw UTF-8 strings in your PHP layer, you can worry about getting MySQL to store characters like the emoji that are outside of the Basic Multilingual Plane. Best way is columns with a utf8mb4 collation; if that is not available try binary columns which will allow you to store any byte sequence (treating it as UTF-8 when it comes back out). That way, however, you won't get case-insensitive comparisons.
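For rows that were already saved with %u escapes, a one-off repair in PHP is conceivable. The sketch below is an assumption-laden illustration, not part of the original answer: it assumes the text came from JavaScript escape(), so each %uXXXX is one UTF-16 code unit, and that mbstring is available. The helper name decodePercentU is hypothetical, and the two-digit %XX form that escape() also produces is deliberately left untouched.

```php
<?php
// Hypothetical repair helper for text that was stored after JavaScript escape().
// Each %uXXXX is one UTF-16 code unit, so a run of them is decoded together
// to keep surrogate pairs (such as emoji) intact.
function decodePercentU(string $text): string
{
    return preg_replace_callback(
        '/(?:%u[0-9A-Fa-f]{4})+/',
        function (array $match): string {
            $utf16be = hex2bin(str_replace('%u', '', $match[0]));
            return mb_convert_encoding($utf16be, 'UTF-8', 'UTF-16BE');
        },
        $text
    );
}

$trophy = decodePercentU('%uD83C%uDFC6'); // the trophy emoji from the question
```

The result is a four-byte UTF-8 sequence, which is exactly the kind of character that needs utf8mb4 or a binary column to be stored.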
I have made code that stores utf-8 in a database.
It displays well in the browser but looks distorted in the database. Since the functionality seems to work and I don't appear to have had any problems processing the string input, is there any point in 'fixing what is not broken' and making UTF-8 characters such as Japanese show correctly in the database?
I don't search the database since the strings are serialized anyway.
You have to specify the text encoding of the queries you are sending to MySQL, for instance with
SET NAMES `utf8` COLLATE `utf8_unicode_ci`
If you don't, MySQL may interpret your query using the server's default text encoding, which can differ from UTF-8, e.g. ISO Latin-1 (latin1). You will then have strings in your tables that are UTF-8 encoded but that MySQL has marked as latin1. That won't have much effect on your code, because MySQL just returns your UTF-8 strings to you and you ignore the text encoding. But if you view the data in phpMyAdmin or any other application that sets the connection's character encoding, you will see distorted strings.
Alternatively, you could utf8_decode your query strings and utf8_encode the results returned by MySQL, leaving the connection's text encoding as latin1. But if you then query a different MySQL server that uses UTF-8 as its default text encoding, you will end up with the same problem the other way around. So just set the connection's text encoding once after connecting.
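The distortion described above can be reproduced without a database. A minimal sketch, assuming the server default is MySQL's latin1 (which corresponds to Windows-1252) and that the mbstring extension is available; the variable names are illustrative only.

```php
<?php
// UTF-8 bytes for "é", as sent by the application over a latin1 connection.
$sent = "\xC3\xA9";

// A tool that sets its connection to UTF-8 makes MySQL convert the column,
// which it believes is latin1, to UTF-8: each byte becomes its own character.
$shownInTool = mb_convert_encoding($sent, 'UTF-8', 'Windows-1252');

// Reversing that conversion recovers the original bytes.
$recovered = mb_convert_encoding($shownInTool, 'Windows-1252', 'UTF-8');
```

Here $shownInTool holds the two characters "Ã©", which is the typical look of UTF-8 data that was stored through a latin1 connection.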
What do you use to access the database? If you use a console, just set the console's encoding to UTF-8. If you use GUI software, check its options and set the encoding to UTF-8. You can also try 'set names' to set the client encoding.
I have been using php + mysql (phpmyadmin) to construct websites with Chinese contents (utf-8) for a long time.
When submitting forms and when generating PHP output from the DB, the Chinese words display well. But when I look at the database, sometimes they are normal Chinese characters and sometimes they are not (they become strange strings). That made me notice that the way MySQL handles and stores the data is not always UTF-8.
Some experts on the web mention that MySQL used to record input data as latin1; nevertheless, I note that the charset shown in phpMyAdmin is utf-8...
Is there any reliable way to detect the encoding of the Chinese characters shown in a phpMyAdmin table cell?
Also, apart from declaring the encoding in the page header, is there any way to make sure the data entered into the DB is UTF-8 and nothing else?
Thank you.
The biggest problem that people encounter in this regard is that they don't tell MySQL that they're sending/expecting UTF-8 encoded data when connecting to the database, so MySQL thinks it's supposed to handle latin1 encoded data and converts it accordingly. Issue the command SET NAMES utf8 after connecting to the db or use mysql_set_charset.
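The mysql_set_charset function belongs to the old mysql extension, which was removed in PHP 7. With PDO, the connection charset can instead be declared in the DSN. The sketch below only builds such a DSN; the helper name buildMysqlDsn, the host and the database name are placeholders, and utf8mb4 assumes a MySQL version that supports it (otherwise pass 'utf8').

```php
<?php
// Hypothetical helper: build a PDO MySQL DSN that declares the connection
// charset, so no separate SET NAMES query is needed.
function buildMysqlDsn(string $host, string $database, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $database, $charset);
}

$dsn = buildMysqlDsn('localhost', 'example_db'); // placeholder host and database

// With real credentials the connection would then be opened like this:
// $pdo = new PDO($dsn, $user, $password);
```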
In my case, it was simply because of htmlentities(). The solution was to change echo htmlentities($email_db); to echo htmlentities($email_db, ENT_COMPAT, 'UTF-8');
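The difference is easy to see in isolation. Before PHP 5.4, htmlentities() assumed ISO-8859-1 unless told otherwise, so UTF-8 input was mangled. A small sketch with an invented example address, passing both encodings explicitly:

```php
<?php
$email = "andr\xC3\xA9@example.com"; // UTF-8 bytes for "andré@example.com"

// Treating the UTF-8 bytes as ISO-8859-1 turns each byte into its own entity.
$wrong = htmlentities($email, ENT_COMPAT, 'ISO-8859-1');

// Naming the real encoding yields a single entity for the accented letter.
$right = htmlentities($email, ENT_COMPAT, 'UTF-8');
```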