My database is UTF-8 (PostgreSQL). I saved 'TESTµTEST' into the database and it's OK. But when I selected this value from the database I saw 'TESTÂµTEST'.
Moreover, when I made a request select * from tbl where f='TESTµTEST', I got this error:
ERROR: invalid byte sequence for encoding "UTF8": 0xb5.
Would you please give me any solutions?
That error shows that you are trying to decode latin-1 text as if it were utf-8. Most likely your client_encoding setting in PHP doesn't match the encoding of the data you're actually sending.
The string "TESTÂµTEST" is produced by encoding data from Unicode to a utf-8 byte sequence, then decoding it as latin-1. You can see this in psql:
regress=# select convert_from(convert_to('TESTµTEST','utf-8'),'latin-1');
convert_from
--------------
TESTÂµTEST
Because the PostgreSQL database is utf-8, it would convert latin-1 input to utf-8 as long as client_encoding were correctly set to latin-1. If client_encoding is incorrectly set to utf-8 and you send latin-1 encoded data, PostgreSQL will refuse to accept it with the message:
invalid byte sequence for encoding "UTF8": 0xb5
... which is what happens when you run that SELECT you showed. So I'd say your client is set to client_encoding = 'utf-8' but your PHP scripts are actually sending latin-1 data. I expect that's because, as @dezso says, you're editing your PHP scripts with a text editor that's using the latin-1 encoding.
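Both symptoms can be reproduced outside the database. The following is a minimal sketch in Python (used here only for illustration; the variable names are made up and nothing in it talks to PostgreSQL) that shows the mojibake on one side and the rejected 0xb5 byte on the other:

```python
# Illustration only: reproduces the two symptoms from the question.
original = "TEST\u00b5TEST"  # 'TESTµTEST'

# Symptom 1: UTF-8 bytes read back as latin-1 turn one character into two.
utf8_bytes = original.encode("utf-8")      # b'TEST\xc2\xb5TEST'
mojibake = utf8_bytes.decode("latin-1")    # 'TESTÂµTEST'

# Symptom 2: latin-1 bytes handed to a UTF-8 decoder are rejected.
latin1_bytes = original.encode("latin-1")  # b'TEST\xb5TEST'
bad_byte = None
try:
    latin1_bytes.decode("utf-8")
except UnicodeDecodeError as exc:
    # exc.object holds the input bytes, exc.start the offending position.
    bad_byte = exc.object[exc.start]

print(mojibake)
print(hex(bad_byte))
```

The byte reported in the second case is the same 0xb5 that PostgreSQL complains about, which is why the error points at latin-1 data arriving on a connection declared as utf-8.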
To find out which encoding PHP is using, use a PHP database connection to run SHOW client_encoding;.
To show the database encoding, run:
SELECT d.datname, pg_catalog.pg_encoding_to_char(d.encoding) as "Encoding"
FROM pg_database d WHERE datname = 'my_db_name_here';
Oh, another possibility is that Apache (or whatever) expects your PHP scripts to be utf-8 encoded, but they're actually latin-1 encoded files.
I ran into the same error when copying tables that contained the same symbol into PostgreSQL 9.1 (tables from the standard nutrient database v26). I recreated the database with the new encoding, but I also had to specify the appropriate locale and template.
CREATE DATABASE testdb
WITH OWNER = postgres
ENCODING = 'LATIN1'
LC_COLLATE = 'eng_canada.28591'
LC_CTYPE = 'eng_canada.28591'
TEMPLATE = template0;
How do I create a database with UTF-8 encoding and pt-BR.UTF-8 collation?
I'm using PostgreSQL 9.2, and it creates databases with UTF-8 encoding, but with the Portuguese.Brazil.1252 collation by default.
I've tried to create one with the following statement:
CREATE DATABASE "example_db"
WITH OWNER "postgres"
ENCODING 'UTF8'
LC_COLLATE = 'pt-br.UTF-8'
LC_CTYPE = 'pt-br.UTF-8'
TEMPLATE template0;
but it returns the error:
Error: ERROR: invalid locale name: "pt-br.UTF-8"
SQLState: 42809
ErrorCode: 0
I want to set that location to resolve the error on Laravel:
Malformed UTF-8 characters, possibly incorrectly encoded
To get a list of all collations available, query the pg_collation system catalog. You can only use the collations you find there.
The collation you choose also has to be compatible with your encoding, but it looks like you are trying to do that anyway. Also, as an exception to the rule, you can use any collation with UTF8 on Windows.
The error message you show is unrelated to collations, however. It must be an encoding problem in Laravel; at any rate, it does not come from the database.
I'm in a situation where I need to update some rows in a table named "matrículas". The query looks something like this:
UPDATE `matrículas` SET...
When I run this query in my SQL program (HeidiSQL) directly, it executes without problems. When I do it in PHP via a PDO object, I get the following error:
SQLSTATE[HY000]: General error: 1300 Invalid utf8 character string: 'matr\xEDculas'
My PDO object is set up like this:
$db= new PDO(
'mysql:host='.$credentials['host'].';dbname='.$credentials['dbname'].';charset=utf8',
$credentials['user'],
$credentials['password'],
array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8")
);
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
The actual update is done by taking the above query and doing this:
$query = $this->db->prepare($sql);
$query->execute($params);
Both the table and the database were created using the utf8_general_ci collation.
Any ideas what I'm doing wrong? btw, I'm currently testing in Windows in case that has anything to do with it...
ERROR 1300 (HY000): Invalid utf8 character string: 'matr\xEDculas'
The \xNN notation gives the hex encoded value for the invalid byte(s) in the character string.
Unicode code point 237 (í), when encoded in utf-8, is a 2-byte character that is encoded as 0xC3 0xAD... but the error shows 0xED, which happens to be the ISO/IEC-8859-1 (Latin1) encoding for the character í.
Since the error is related to a table name being passed from the script rather than to external data, it pointed to what turned out to be the issue: the PHP script itself had the table name encoded incorrectly, because the script had been saved as ISO-8859-1 rather than UTF-8.
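To see how the same identifier looks on the wire in each encoding, here is a small Python sketch. The helper escape_non_ascii is hypothetical and only imitates the \xNN notation used in the MySQL error message; it is not part of any MySQL or PHP API.

```python
def escape_non_ascii(data: bytes) -> str:
    # Keep ASCII bytes readable and show every other byte as \xNN,
    # similar to how the error message prints the invalid string.
    return "".join(chr(b) if b < 0x80 else "\\x%02X" % b for b in data)

name = "matr\u00edculas"  # 'matrículas'

as_latin1 = escape_non_ascii(name.encode("latin-1"))
as_utf8 = escape_non_ascii(name.encode("utf-8"))

print(as_latin1)
print(as_utf8)
```

The latin-1 form contains the single byte 0xED seen in the error, while the UTF-8 form contains the pair 0xC3 0xAD that a utf8 connection expects.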
`matrículas`
This name is being sent in a single-byte encoding rather than UTF-8.
Please save it as UTF-8, or use an unaccented name such as matriculas.
The client and the server are using different encodings.
If you must use accented letters in table names, then they must be encoded in UTF-8 in the client.
That is, it is not a PDO problem, but an encoding problem in your source editor/language/whatever.
I am using PHP 5.3.3 and MySQL 5.1.61. The column in question is using UTF-8 encoding and the PHP file is encoded in UTF-8 without BOM.
When doing a MySQLi query with a ² character in SQLyog on Windows, the query executes properly and the correct search result displays.
If I do this same exact query in PHP, it will execute but will show 0 affected_rows.
Here's what I tried:
Using both LIKE and =
Changing the encoding of the PHP file to ANSI, UTF-8 without BOM, and UTF-8
Doing 'SET NAMES utf-8' and 'latin1' before running the query
Did header('Content-Type: text/html; charset=UTF-8'); in PHP
Escaping using MySQLi::real_escape_string
Doing a filter_var($String, FILTER_SANITIZE_STRING)
Tried a MySQLi stmt bind
The only way I could get it to work properly is if I swapped the ² for a % and changed = to LIKE in PHP.
How can I get it to query properly in PHP when using the ²?
You should be able to get the query to work by ensuring the following:
Prepping PHP for UTF-8
You first need to make sure the PHP pages that will be issuing these queries are served as UTF-8 encoded pages. This will ensure that any UTF-8 output coming from the database is displayed properly. In Firefox, you can check to see if this is the case by visiting the page you're interested in and using the View Page Info menu item. When you do so, you should see UTF-8 as the value for the page's Encoding. If the page isn't being served as UTF-8, you can do so one of two ways. Either you can set the encoding in a call to header(), like this:
header('Content-Type: text/html; charset=UTF-8');
Or, you can use a meta tag in your page's head block:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
Prepping MySQL for UTF-8
Next up, you need to make sure the database is set up to use the UTF-8 encoding. This can be set at the server, database, table, or column levels. If you're on a shared host, you probably can only control the table and column levels of your hierarchy. If you have control of the server or database, you can check to see what character encoding they are using by issuing these two commands:
SHOW VARIABLES LIKE 'character_set_server';
SHOW VARIABLES LIKE 'character_set_database';
Changing the database level encoding can be done using a command like this:
(CREATE | ALTER) DATABASE ... DEFAULT CHARACTER SET utf8;
To see what character encoding a table uses, simply do:
SHOW CREATE TABLE myTable;
Similarly, here's how to change a table-level encoding:
(CREATE | ALTER) TABLE ... DEFAULT CHARACTER SET utf8;
I recommend setting the encoding as high as you possibly can in the hierarchy. This way, you don't have to remember to manually set it for new tables. Now, if your character encoding for a table is not already set to UTF-8, you can attempt to convert it using an alter statement like this:
ALTER TABLE ... CONVERT TO CHARACTER SET utf8;
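Be very careful about using this statement! It transcodes the stored bytes according to the character set the column is currently declared with. If a column is declared as latin1 but already contains UTF-8 bytes, the conversion will encode them a second time and corrupt them. A common way around this is to convert the column to a binary type first and then to utf8, so the bytes are relabeled rather than transcoded.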
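The corruption risk can be modeled in a few lines of Python. This is only a sketch of the idea, not of MySQL itself: convert_column and repair_double_encoded are hypothetical helpers, and Python's latin-1 codec stands in for MySQL's latin1 character set (the two differ for some bytes, though not for the ones used here).

```python
def convert_column(stored: bytes, declared: str) -> bytes:
    # Model of a charset conversion: interpret the stored bytes using the
    # declared charset, then re-encode the result as UTF-8.
    return stored.decode(declared).encode("utf-8")

def repair_double_encoded(data: bytes) -> str:
    # Undo one round of latin-1 -> UTF-8 double encoding.
    return data.decode("utf-8").encode("latin-1").decode("utf-8")

word = "Ara\u00f1a"  # 'Araña'

# UTF-8 bytes that were stored in a column declared as latin-1.
utf8_in_latin1_column = word.encode("utf-8")

# Converting that column transcodes bytes that were already UTF-8.
converted = convert_column(utf8_in_latin1_column, "latin-1")

print(converted.decode("utf-8"))         # the double-encoded text
print(repair_double_encoded(converted))  # the original word again
```

The repair step works only when every value went through exactly one extra round of encoding, so it is worth testing on a copy of the data first.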
Forcing MySQLi to Use UTF-8
Finally, before you connect to your database, make sure you issue the appropriate call to say that you are using the UTF-8 encoding. Here's how:
$db = new mysqli(DB_HOST, DB_USERNAME, DB_PASSWORD, DB_NAME);
// Change the character set to UTF-8 (have to do it early)
if (!$db->set_charset("utf8"))
{
    printf("Error loading character set utf8: %s\n", $db->error);
}
Once you do that, everything should hopefully work as expected. The only characters you need to worry about encoding are the big 5 for HTML: <, >, ', ", and &. You can handle that using the htmlspecialchars() function.
If you want to read more (and get links to additional resources), feel free to check out the articles I wrote about this process. There are two parts: Unicode and the Web: Part 1, and Unicode and the Web: Part 2. Good luck!
I'm facing a really frustrating problem here. I have everything in UTF-8, and all my databases and tables are utf8_general_ci, but when I insert or update from a single PHP script all I see are symbols. If I edit in phpMyAdmin, the words are shown correctly. I found that if I run utf8_decode() on my strings in PHP I can make it work, but I don't want to do that because it's messy and it should work without it.
Here is a basic code im using to test this:
<?php
$conn=mysql_connect("localhost","root","root")
or die("Error");
mysql_select_db("mydb",$conn) or
die("Error");
mysql_query("UPDATE `mydb`.`Clients` SET `name` = '".utf8_decode("Araña")."' WHERE `Clients`.`id` =25;",
$conn) or die(mysql_error());
mysql_close($conn);
echo "Success.";
?>
This is what I get if I don't decode with PHP's utf8_decode function:
instead of Araña, I get: AraÃ±a
I've run into the same issue many times. Sometimes it's because the type of database link I'm selecting from isn't the same type that I'm using for inserting and other times, it's from file data into a database.
For the latter instance, mysql_set_charset('utf8',$link); is the magic answer.
Place the call to mysql_set_charset just after you select your database via mysql_select_db.
Reference: http://php.net/manual/en/function.mysql-set-charset.php
"AraÃ±a" IS UTF-8. The characters "Ã±" represent the two bytes into which the Spanish ñ is encoded in UTF-8. Whatever you're reading it back with is not handling the UTF-8 and is displaying it as (it appears) ISO-8859-1.
That DDL you mentioned has to do with the collation, not the character set. The correct statement would be:
ALTER TABLE Clients CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
You still need to make sure the client library (libmysql or whatever driver PHP is using) is not transcoding the data back to ISO-8859. mysql_set_charset('utf8') will explicitly set the client encoding to UTF-8. Alternatively, you can send a SET NAMES UTF8; right after you connect to the database. To do that implicitly, you can change the my.cnf [client] block to have utf-8 as the client character encoding (and /etc/init.d/mysql reload to apply). Either way, make sure the client doesn't mangle the results it's pulling.
[client]
default-character-set=utf8
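You do not need to use utf8_decode if you're using the mbstring extension. On older PHP versions, the following php.ini configuration should ensure UTF-8 support on the PHP side (mbstring.func_overload was deprecated in PHP 7.2 and removed in PHP 8.0, so this does not apply to current releases):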
mbstring.internal_encoding = utf-8
mbstring.http_output = utf-8
mbstring.func_overload = 6
Finally, when you display the results in HTML, verify that the page's encoding is explicitly UTF-8.
I've a MySQL table that has a UTF-8 charset and upon attempting to insert to it via a PHP form, the database gives the following error:
PDOStatement::execute():
SQLSTATE[HY000]: General error: 1366
Incorrect string value: '\xE8' for
column ...
The character in question is 'è', yet I don't see why this should be a problem considering the database and table are set to UTF-8.
Edit
I've tried directly from the mysql terminal and have the same problem.
Your database might be set to UTF-8, but the database connection also needs to be set to UTF-8. You should do that with a SET NAMES utf8 statement. You can use the driver_options in PDO to have it execute that as soon as you connect:
$handle = new PDO("mysql:host=localhost;dbname=dbname",
'username', 'password',
array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"));
Have a look at the following two links for more detailed information about making sure your entire site uses UTF-8 appropriately:
UTF-8 all the way through…
UTF8, PHP and MySQL
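0xE8 is greater than 0x7F, the largest value that can stand alone as a one-byte UTF-8 character. In UTF-8, 0xE8 is only valid as the first byte of a three-byte sequence, so on its own it is rejected: http://en.wikipedia.org/wiki/UTF-8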
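To make the byte ranges concrete, here is a small Python sketch that classifies a single byte by the role it can play in UTF-8. The function name classify_utf8_byte is made up for this illustration; the ranges follow the UTF-8 definition.

```python
def classify_utf8_byte(b: int) -> str:
    # Classify one byte value (0-255) by its possible role in UTF-8.
    if b <= 0x7F:
        return "single-byte (ASCII)"
    if b <= 0xBF:
        return "continuation byte"
    if b in (0xC0, 0xC1):
        return "invalid"  # would only start overlong encodings
    if b <= 0xDF:
        return "lead byte of a 2-byte sequence"
    if b <= 0xEF:
        return "lead byte of a 3-byte sequence"
    if b <= 0xF4:
        return "lead byte of a 4-byte sequence"
    return "invalid"

for value in (0x41, 0xB5, 0xE8, 0xED):
    print(hex(value), classify_utf8_byte(value))
```

A lone 0xE8 followed by an ASCII byte therefore cannot be valid UTF-8, because a three-byte lead must be followed by two continuation bytes.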
It seems your connection is not set to UTF8 but to some other 8-bit encoding such as ISO Latin. Setting the database to UTF8 only changes the character set the database uses internally; connections may use a different default (latin1 for older MySQL versions), so you should try sending an initial SET CHARACTER SET utf8 after connecting to the database. If you have access to my.cnf you can also set the correct default value there, but keep in mind that changing the default may break other sites/apps running on the same host.
Before passing the value to Mysql you can use the following code:
$val = mb_check_encoding($val, 'UTF-8') ? $val : utf8_encode($val);
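This converts the string to UTF-8 if it is not already valid UTF-8, which is handy if only one field is affected. Note that utf8_encode() assumes the input is ISO-8859-1.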
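The same check-then-convert idea can be sketched in Python for comparison. ensure_utf8 is a hypothetical helper, and the latin-1 fallback is an assumption that mirrors what utf8_encode() does; it is a heuristic, since some latin-1 byte sequences also happen to be valid UTF-8 and would be left untouched.

```python
def ensure_utf8(raw: bytes, fallback: str = "latin-1") -> bytes:
    # Return raw unchanged if it is already valid UTF-8; otherwise treat it
    # as text in the fallback encoding and re-encode it as UTF-8.
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(fallback).encode("utf-8")

from_latin1 = ensure_utf8(b"\xe8")                         # latin-1 'è'
already_utf8 = ensure_utf8("\u00e8".encode("utf-8"))       # UTF-8 'è'

print(from_latin1)
print(already_utf8)
```

Both calls end up with the same two bytes, so the value can be passed to a connection that is set to UTF-8 either way.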