View Persian/Arabic data in MySQL with Django - PHP

I have a table with Persian data and the utf8_general_ci collation; the data was inserted into the database by a PHP program.
Now I have a new program written in Python with Django and want to view the data, but all of it displays incorrectly, like پست
Why? And what can I do to solve this problem?
PS: when I insert new data with Python, everything is stored and displayed correctly.

If you’re running into the problem where unicode items in your Django / MySQL project are displayed as question marks, here’s the likely problem and solution, found in this django-users thread:
The likely problem is that your MySQL encoding is set to latin1, as opposed to utf8. You can check this via:
mysqld --verbose --help | grep character-set
You’ll probably see:
character-set-server latin1
You want this to be utf8. To modify it, edit your my.cnf file (/etc/mysql/my.cnf on Ubuntu), adding the following lines to the appropriate sections:
[client]
...
default-character-set = utf8
[mysqld]
...
character-set-server=utf8
collation-server=utf8_unicode_ci
init_connect='set collation_connection = utf8_unicode_ci;'
Now restart mysql:
sudo /etc/init.d/mysql restart
And alter your existing tables to use the utf8 encoding:
mysql your_db_name
alter table your_table_name convert to character set utf8;
And that should do it.

Can you please check what the charset of the HTML page is? It should be something like <meta content='text/html; charset=UTF-8' http-equiv='Content-Type'/>
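
The garbled sample in the question is consistent with UTF-8 bytes that were read as a single-byte Western encoding somewhere on the PHP side, which would also explain why rows inserted from Python look fine. A minimal Python sketch of that round trip follows; the choice of cp1252 as the intermediate encoding and the helper name repair_mojibake are assumptions for illustration, not something the posts above confirm.

```python
# Sketch: reproduce and reverse "UTF-8 bytes read as cp1252" mojibake.
# Assumption: the intermediate encoding was cp1252 (MySQL's latin1 is close to it).

def repair_mojibake(garbled: str, wrong_encoding: str = "cp1252") -> str:
    """Undo one round of 'UTF-8 bytes decoded with the wrong single-byte codec'."""
    try:
        return garbled.encode(wrong_encoding).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of mojibake: return the input unchanged.
        return garbled

original = "پست"
# What a latin1-style reader makes of the UTF-8 bytes of the original text.
garbled = original.encode("utf-8").decode("cp1252")
repaired = repair_mojibake(garbled)

print(garbled)
print(repaired == original)
```

Text that was stored correctly passes through unchanged, because it either cannot be encoded as cp1252 or does not decode as UTF-8 afterwards. A few byte values are undefined in Python's cp1252 codec, so this sketch cannot repair every possible string.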

Related

MySQL 5.7 character_set_client stuck at utf8mb4

Long story short: we have a PHP-based self-developed CMS, originally on PHP5.x and MySQL, using a healthy combination of utf8 and iso-8859-1 char-sets (don't judge, I know it's weird but it's working). On our production environment our server provider upgraded to PHP7.2 and (after a few weeks of refactoring) everything works just fine.
Parallel to this production environment I've set up (or at least I tried to) a test environment for our development, VirtualBox Ubuntu 20.04, apache2.4, PHP7.2 and MySQL5.7.
in /etc/php/7.2/apache2/php.ini I have:
default_charset = "iso-8859-1"
in /etc/mysql/my.cnf I have:
[client]
default-character-set = utf8
[mysqld_safe]
default-character-set = utf8
[mysql]
default-character-set = utf8
[mysqld]
init_connect = 'SET NAMES utf8'
character-set-client-handshake = false #force encoding to utf8
character-set-server = utf8
collation-server = utf8_unicode_ci
Now, on our development server the character_set_client=utf8mb4 and character_set_results=utf8mb4 and I can't find a way to change it.
The problem is that when I try to import dumps from our production server on our development server (through our CMS), or when I try to save texts with special characters like ü or ä, it always cuts the word at that character and saves only the part before it, e.g. instead of chüd it saves only ch, and instead of einträge it saves only eintr.
However, I can save ü manually in the DB without a problem (I don't have to use &uuml;)
(we have a second development server, Ubuntu 14.04, apache2.4, PHP5.6, MySQL5.7 and basically the same settings as on PHP7.2 testserver, and everything works fine)
Maybe PHP7.2 is doing the mess here, I am really out of ideas.
Any help will be appreciated. Thank you
See "truncation" in Trouble with UTF-8 characters; what I see is not what I stored
I wonder if having apache not set to UTF-8 messes up <form>s.
init_connect = 'SET NAMES utf8' sets 3 CHARACTER_SET_% values if you are not connecting as "root". So, change it to utf8mb4 and do not connect as "root".
Are you sure about the encoding of the imported data? (I suspect this causes the truncation problem.) Can you get a hex dump of a small portion of the data?
For Western European languages, MySQL's utf8 and utf8mb4 work the same. That is, the init_connect that you have should be adequate if the incoming data is really UTF-8, not iso...
For reference here are hex values:
char latin1 utf8
ä E4 C3A4
ü FC C3BC
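
The hex values above can be reproduced in a few lines of Python, which also shows why latin1 bytes tend to cut a word short when the receiver expects UTF-8. This is only a sketch: the helper names hex_bytes and valid_utf8_prefix are made up for illustration, and a strict UTF-8 decoder stands in for whatever MySQL does internally.

```python
# Sketch: compare latin1 and UTF-8 bytes, and find where a strict UTF-8 decoder stops.

def hex_bytes(text: str, encoding: str) -> str:
    """Hex representation of text in the given encoding, upper case."""
    return text.encode(encoding).hex().upper()

def valid_utf8_prefix(raw: bytes) -> str:
    """Return the part of raw that decodes as UTF-8 before the first invalid byte."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return raw[:exc.start].decode("utf-8")

for ch in "äü":
    print(ch, hex_bytes(ch, "latin-1"), hex_bytes(ch, "utf-8"))

# Hypothetical scenario: latin1 bytes handed to something that expects UTF-8.
print(valid_utf8_prefix("einträge".encode("latin-1")))
print(valid_utf8_prefix("chüd".encode("latin-1")))
```

The lone bytes E4 and FC are not valid UTF-8 on their own, so everything from that position on is rejected, which matches the truncation described in the question.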

database stores strange characters [duplicate]

I have a problem with directly inserting foreign characters like "ó,č,ĕ,ř" into the database. It doesn't work with my PHP frontend, so to be sure there is no transformation or other encoding involved, I'm logged in to psql directly. Here is my setup:
server_encoding
-----------------
UTF8
(1 row)
and
client_encoding
-----------------
UTF8
(1 row)
database is :
Name | Owner | Encoding | Collate | Ctype |
my_db | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
So i guess there should be no problem.
I created this :
CREATE TABLE test (a text);
and now i want to insert some text
INSERT INTO TEST (a) ('ó');
And there is a message :
ERROR: invalid byte sequence for encoding "UTF8": 0xf327293b
Is there anyone who can help me, please? It looks like it is ignoring my input encoding, or I really don't know.
EDIT :
my terminal configuration
LANG=en_US.UTF-8
LANGUAGE=en_US:en
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
EDIT2:
my_db=# \encoding
UTF8
EDIT3: psql from file
file
file -bi test
text/plain; charset=utf-8
execute
ERROR: syntax error at or near "'Ăł'"
LINE 1: INSERT INTO tes (a) ('Ăł');
EDIT4:
set client_encoding='latin1';
This works in psql, but I need it to work with UTF8. I know it's possible; I used this setup every time with MySQL databases and it worked like a charm.
My jdbc driver needs it to be UTF8.
EDIT5:
Here is what I am doing: click me
Before it's stored I can see it, so PHP is working fine, but afterwards, when I read it from the DB, I can't see it. That's why I moved closer to the DB, into psql, to see what's going on. It looks like it may be a server issue. Is it possible that the server can't handle those characters?
EDIT6:
Tomcat config
-Dfile.encoding=UTF8
URI encoding is set to UTF8 too. Where can the problem be? :(
If your shell is in latin1 encoding, as it appears from the comments, this will fix it:
set client_encoding = 'latin1';
If you don't want to change the client's system encoding you can change the default in postgresql.conf
client_encoding = latin1
Or change PHP's default character encoding:
default_charset = "utf-8";
Do it also in the Apache, or whatever http server you are using, config:
AddDefaultCharset UTF-8
Just another debugging test (I still think it's a terminal thing): can you write the insert statement in a UTF-8 encoded file and try to run the command from the file? Eg:
psql my_db -U postgres -f <utf8-encoded-file>
If this works fine then it's back to the terminal somehow ...
According to the comments you're using PuTTY, which defaults to latin-1. You need to configure PuTTY to use UTF-8. Just setting the server locale won't do any good unless your PuTTY encoding matches what the environment claims the encoding is.
Open PuTTY. Under the Window settings heading choose the Translation sub-heading. Set "Remote character set" to "utf-8". In the Fonts sub-tab make sure you are using a font with reasonable Unicode coverage. Then, in the Session menu type a name into the "saved settings" text entry box and click "Save" to save your settings as a profile. You can override the "Default Settings" profile by selecting it and clicking Save, but this will affect all future connections and new profiles, so it may cause confusion if you use other servers that aren't utf-8.
(These instructions are based on my PuTTY on Fedora 18; there may be some differences in UI details in the recent Windows versions. If in doubt, search for how to set PuTTY to use utf-8.)
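
The byte sequence in the PostgreSQL error message can be decoded to check the terminal theory. The short Python sketch below does only that; the helper name is_valid_utf8 is made up, and reading the 'Ăł' output from EDIT3 as ISO 8859-2 is an assumption about the terminal's charset, not something the posts confirm.

```python
# Sketch: decode the byte sequence reported by PostgreSQL in the error message.
reported = bytes.fromhex("f327293b")  # from: invalid byte sequence ... 0xf327293b

def is_valid_utf8(raw: bytes) -> bool:
    """True if raw decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

print(is_valid_utf8(reported))
# Read as a single-byte encoding, the bytes are the tail of the INSERT statement.
print(reported.decode("latin-1"))
# The bytes a terminal that really uses UTF-8 would have sent for the same letter.
print("ó".encode("utf-8").hex())
# Assumption: UTF-8 bytes displayed through a Central European single-byte charset.
print("ó".encode("utf-8").decode("iso8859_2"))
```

A single byte F3 for ó means the client sent a one-byte encoding while claiming UTF8, which is what the PuTTY setting above is meant to fix.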

UTF8 characters from database don't show up properly in the browser - MySQL & PHP CodeIgniter

My database and tables are set to utf8_general_ci collation and utf8 charset. CodeIgniter is set to utf8. I've added meta tag charset=utf8, and I'm still getting something like: квартиры instead of cyrillic letters...
The same code running on the local machine works fine - Mac OSX. It's only breaking in the production machine, which is Ubuntu 11.10 64bit in AWS EC2. Static content from the .php files show up correctly, only the data coming from the database are messed up. Example page: http://dev.uzlist.com/browse/cat/nkv
Any ideas why?
Thanks.
FYI:
When I do error_log() the data coming from the database, it's the same values I'm seeing on the page. Hence, it's not the browser-server issue. It's something between mysql and php, since when I run SELECT * FROM categories, it shows the data in the right format. I'm using PHP CodeIgniter framework for database connection and query and as mentioned here, I have configured it to use utf8 connection and utf8_general_ci collation.
Make sure your my.cnf (likely to be in /etc/) has the following entries :
[mysqld]
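# default-character-set and default-collation are not accepted under [mysqld] in MySQL 5.5 and later; there, rely on the two *-server options below
default-character-set=utf8
default-collation=utf8_general_ci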
character-set-server=utf8
collation-server=utf8_general_ci
init-connect='SET NAMES utf8'
[client]
default-character-set=utf8
You'll need to restart the mysql service once you make your changes.
Adding my comments in here to make this a little clearer.
Make sure the following HTTP header is being set so the browser knows what charset to expect.
Content-type: text/html; charset=UTF-8
Also try adding this tag into the top of your html <head> tag
<meta http-equiv="Content-type" content="text/html; charset=UTF-8" />
For the browser to display the text correctly, you should check three points:
encoding of your script file.
encoding of the connection.
encoding of the database or table schema.
If all of these are compatible, you'll get the page you want.
The original data has been encoded as UTF-8, the result interpreted in Windows-1252 and then UTF-8 encoded again. This is really bad; it isn't about a simple encoding mismatch that a header would fix. Your data is actually broken.
If the data is OK in the database (check with SELECT hex(column) FROM myTable to see whether it was already double-encoded in the database), then it must be your code that is converting it to UTF-8 on output.
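
To read the output of the hex check, it helps to know what single and double encoding look like byte by byte. The Python sketch below classifies a hex string; the function name classify_hex is hypothetical, cp1252 is assumed as the encoding the data was misread in, and the result is a heuristic rather than proof.

```python
# Sketch: classify a value returned by SELECT hex(column).
# Assumption: double encoding went through cp1252, as described in the answer above.

def classify_hex(hex_value: str) -> str:
    raw = bytes.fromhex(hex_value)
    if raw.isascii():
        return "ascii"  # plain ASCII looks the same in every case
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return "not utf-8"  # e.g. a single-byte encoding such as cp1251
    try:
        # If the decoded text maps back to bytes that are again valid UTF-8,
        # the value was most likely encoded twice.
        text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return "utf-8"
    return "double-encoded"

single = "к".encode("utf-8").hex().upper()
double = "к".encode("utf-8").decode("cp1252").encode("utf-8").hex().upper()

print(single, classify_hex(single))
print(double, classify_hex(double))
```

In short, a Cyrillic letter stored correctly takes two bytes, while the double-encoded form takes four or more.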
Search your project for uses of function utf8_encode, convert_to_utf8, or just iconv or mb_convert_encoding. Running
$ grep -rn "\(utf8_\(en\|de\)code\|convert_to_utf8\|iconv\|mb_convert_encoding\)" .
in your application's /application folder should be enough to find something.
Also see config values for these:
<?php
var_dump(
ini_get( "mbstring.http_output" ),
ini_get( "mbstring.encoding_translation" )
);
Well, if you are absolutely sure that your MySQL client encoding is set to utf8, there are two possible cases. One, double encoding, is described by Esailija.
But there is another one: your data is actually encoded in 1251, not in UTF-8.
In this case you have to either recode your data or set the proper encoding on the tables, though it is not a one-button task.
Here is a manual (in Russian) exactly for that case: http://phpfaq.ru/charset#repair
In short, you have to dump your table using the same encoding that is set on the table (to avoid recoding), back up that dump in a safe place, then change the table definitions to reflect the actual encoding, and then load it back.
Potentially this may also be caused by the mbstring extension not being installed (which would explain a difference between your dev and production environments)
Check out this post, might give you a few more answers.
Try mysql_set_charset('utf8') after the MySQL connect. Then it should work.
After 2 days of fighting this bug, I finally figured out the issue. Thanks to #yourcommonsense, #robsquires, and a friend of mine from work for good resources that helped to debug the issue.
The issue was that at the time of the sql file dump to the database (import), charset for server, database, client, and connection was set to latin1 (status command helped to figure that out). So the command line was set to latin1 as well, which is why it was showing the right characters, but the connection with the PHP code was UTF8 and it was trying to encode it again. Ended up with double encoding.
Solution:
mysqldump the tables and the data (while in latin1)
drop the database
set the default charsets to UTF8 in /etc/my.cnf as Rob Squires mentioned
restart the mysql
create the database again with the right charset and collation
load the dump file back into it
And it works fine.
Thanks all for contribution!

PHP MySQL using Latin1(iso-8859-1) despite UTF-8 settings

Once again I have a weird and tricky problem.
I've been working with converting my MySQL databases (and everything else on my server for that matter) to UTF-8 to avoid having to convert text when getting and putting text into the different databases.
I think I've partially succeeded because SHOW VARIABLES LIKE 'character_set%' returns:
character_set_client utf8
character_set_connection utf8
character_set_database utf8
character_set_filesystem binary
character_set_results utf8
character_set_server utf8
character_set_system utf8
But still mysql_client_encoding($con); returns latin1 and in the output every special character is replaced with �. My conclusion is that the client or the connection between PHP and the MySQL database is using latin1 even though I've been specifying utf-8 in the document header and in my.ini with the following code:
character-set-server = utf8
character-set-client = utf8
default-character-set = utf8
Edit: I've added the settings above under all of [client], [mysqld] and [mysql].
When I use mysql_query('SET NAMES utf8;'); or mysql_set_charset("utf8"); the text shows properly, but for me that's not a solution, just a temporary fix.
Does anyone know how to force PHP (or whatever it is reverting to latin1) to use utf-8?
I should mention that I'm using Windows 2003 Server and Apache 2.
I've been working with converting my MySQL databases (and everything
else on my server for that matter) to UTF-8 to avoid having to convert
text when getting and putting text into the different databases.
You would still have to make sure the target database (or system) is set up to use UTF-8 by default, and that the library you are using to connect to it is using the correct character set. Just because data is in UTF-8 doesn't make it universally compatible.
When I use mysql_query('SET NAMES utf8;'); or
mysql_set_charset("utf8"); the text shows properly but for me that's
not a solution, just a temporary fix.
This is the proper solution, since your connection's encoding determines how the data is received by your client (in this case, PHP).
To set it as the default, you need to configure MySQL appropriately and then restart the server:
[client]
default-character-set=UTF8
Note that this affects all clients even the command line mysql client.
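
Why a latin1 connection produces � on a page declared as UTF-8 can be shown with a small Python sketch. The function name as_browser_would_show is made up, and a lenient UTF-8 decoder is used here as a stand-in for the browser.

```python
# Sketch: latin1 bytes displayed by a page that declares UTF-8.

def as_browser_would_show(raw: bytes, declared: str = "utf-8") -> str:
    """Decode raw the way a lenient reader would, replacing invalid bytes."""
    return raw.decode(declared, errors="replace")

text = "ü"
latin1_bytes = text.encode("latin-1")  # what a latin1 connection hands to PHP
utf8_bytes = text.encode("utf-8")      # what a utf8 connection hands to PHP

print(as_browser_would_show(latin1_bytes))
print(as_browser_would_show(utf8_bytes))
```

Each invalid byte becomes the replacement character U+FFFD, which is why setting the connection charset fixes the output even though the stored data never changes.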

diacritics problem in project made with Zend Framework

I found an interesting problem while testing our web application.
I have the application on localhost (Windows) and on an online testing server (Linux). Both are connected to the same DB (on the Linux server). When I tried to edit one text field through a form in the application on the Linux server, it stripped the diacritics from the result and saved it to the DB without them. But when I tried the same action, with the same code, on localhost (Windows), it saved the whole text with diacritics exactly as I wrote it.
I've tried to check the PHP configuration, but I have exactly the same configuration on both machines.
Does anybody have an idea where I should look to find what could cause that?
Sounds like one or more of the character settings on the MySql instance on your Windows machine is not set to UTF8, try executing this query:
show variables like '%character%'
Your output will be the character-set-related server variables; executing that on my database outputs:
character_set_client utf8
character_set_connection utf8
character_set_database utf8
character_set_filesystem binary
character_set_results utf8
character_set_server utf8
character_set_system utf8
character_sets_dir /usr/share/mysql/charsets/
My best guess is that one or more of those is set to latin1
Also, you might want to check the collation, i.e. execute this
show variables like '%collation%'
And you will get something like:
collation_connection utf8_general_ci
collation_database utf8_general_ci
collation_server utf8_general_ci
Do both servers serve the page in the same encoding?
In firefox, visit the page, right click and 'View page info'. There'll be an entry for Encoding. For these to transfer correctly you'll need UTF-8 or similar encoding.
After that, check your database setup. You'll need to set the MySQL connection to use UTF8, and your tables and columns should be set to utf8 as well.
You can set the mysql connection character set by issuing the following query after connecting:
SET NAMES 'utf8'
Found the problem: one filter used on that field acts differently on different systems.
