How do i have a SQL query with UTF-8 chars in? - php

<?php
$path = "mysql:host=localhost;dbname=kanjisearch;charset=utf-8";
$pdo = new PDO($path, "xxx", "xxx");
$sth = $pdo->prepare("SELECT * FROM `kanji` WHERE `kanjiName` = '$kanji'");
$sth->execute();
$results = $sth->fetchAll();
?>
I am building a PHP search that uses Chinese characters in the actual SQL query, I have -> exec("SET CHARACTER SET utf8"); on the pdo. And it renders just fine, the text I bring back renders the charects great, if I put numbers ins the sql but I need to search based on the Kanji, so when I query with WHERE kanji = 一, the 一 doesn't seem to go through ok, it renders on the screen on the debug, but just wont query, results return empty but when I copy the query directly into PHPMyAdmin it returns the result.

Were your tables and or columns created with the UTF-8 character set as the default encoding? By default MySQL creates tables with a latin based charset and collate options. UTF-8, especially none latiin characters, gets mangled when it is saved in a latin encoded column.
Try changing the table's charset to use UTF8 using the ALTER table command and reloading/inserting your data.

Try add encoding (accept-charset="UTF-8" and charset=utf-8") to your form and html head:
<form method="post" action="yourAction" accept-charset="UTF-8">
<meta http-equiv="Content-type" content="text/html; charset=utf-8"/>
EDIT: also have a look at this similar question here. (by shark555)
$pdo = new PDO(
'mysql:host=hostname;dbname=defaultDbName',
'username',
'password',
array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8")
);

$kanji needs to be handled as a particular/specific character set.
so, what's typed in is likely not seen as a UTF-8 encoded string. you need to handle this.
note: http://www.fileformat.info/info/unicode/char/4e00/index.htm
basically you want to convert to UTF-8. but try this:
http://us1.php.net/mb_convert_encoding
like
mb_convert_encoding($text, 'UTF-8');
or if you know the encoding of the language stick as the 3rd param.
or try auto: mb_convert_encoding($text, 'UTF-8', 'auto');

Related

why does my html not display special characters taken from my database

I included this at the top of my php file:
<?php
header('Content-Type: text/html; charset=UTF-8');
?>
I did this because my file.php was not displaying "á, é, í, ó, ú or ¿" in the html file or from data queried from my database.
After I placed the 'header('Content-Type: text/html; charset=UTF-8');' line of code my html page started to understand the special characters in the html file but, data received from my database now has a black rhombus with a question mark.
The collation my database has is "utf8_spanish_ci"
at the html tag i tried to put lang=es but this never worked I also tried to put the meta tag inside the head tag
<!DOCTYPE html>
<html lang=es>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<head>
I also tried:
<meta charset="utf-8">
and:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
I don't know what the problem is. When I insert data directly into the data base the special characters are there but when I insert them from my file.php they appear with random characters.
Does anyone know why this is happening?
There are a couple of reasons this could be happening. It is however important that your entire line of code uses the same set of charset, and that all functions that can be set to a specific charset, is set to the same. The most widely used one is UTF-8, which is the one I'm suggesting you use.
Connection
You also need to specify the charset in the connection itself.
PDO (specified in the object itself):
$handler = new PDO('mysql:host=localhost;dbname=database;charset=utf8', 'username', 'password', array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET CHARACTER SET UTF8"));
MySQLi: (placed directly after creating the connection)
* For OOP: $mysqli->set_charset("utf8");
* For procedural: mysqli_set_charset($mysqli, "utf8");
(where $mysqli is the MySQLi connection)
MySQL (depricated, you should convert to PDO or MySQLi): (placed directly after creating the connection)
mysql_set_charset("utf8");
Database
Your database and all its tables has to be set to UTF-8. Note that charset is not the same as collation.
You can do that by running the queries below once for each database and tables (for example in phpMyAdmin)
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
File-encoding
It's also important that the .php file itself is UTF-8 encoded. If you're using Notepad++ to write your code, this can be done in the "Format" drop-down on the taskbar (Convert to UFT-8 w/o BOM). You should use UTF-8 w/o BOM.
Should you follow all of the pointers above, chances are your problem will be solved. If not, you can take a look at this StackOverflow post: UTF-8 all the way through.
Are you sure? And are you sure that you are retrieving your data from the data base? Having said that, most databases require you to save data in a way that is NOT exactly like your question. There is really valid security reasons for this.
You should use utf8_general_ci as a database encoding also before insert query you should run this query
Mysql_query(" SET NAMES 'utf8'");

Ñ characters showing as ñ

Firstly, I know this question has been asked a lot of times but all the answers I got, they say that I have to put all my encoding into UTF-8, but actually, they're all with UTF-8 and still not working!! Here is my problem:
I have a registration in my webpage, and when a user with a ñ in his name registers, it stores it in the database as ñ, so if I register with the name "ñaña", it goes to the db as "ñaña".
I have set my db table and database to utf8_general_ci, and I also have this code in my header:
<meta charset="utf-8">
<meta http-equiv="Content-type" content="text/html;charset=utf-8" />
And this code in my PDO connection:
$connection = new PDO("mysql:charset=utf8;host=$host; dbname=$dbname", $user, $password);
But it's still stored in the db as a different character... Also, all the accents like "à, é, ê, ö..." are working fine.
What am I missing?
Thanks.
You may have created your tables with a character encoding that doesn't support special. characters; that would explain what you're seeing. You can try these SQL commands to discover the charset of your database/table/column.
If your tables are encoded with something other than utf8, you can use these other SQL commands to convert them to utf8. But I'm not sure what will happen with the current data.
You can create a function to replace the values
function sanitize($string)
{
$string = str_replace('ñ', 'ñ', $string);
return $string;
}
Hope it works for you

UTF-8 and German characters?

I have problem with German characters on my web site,
in html/php part of website i have this code to set utf-8:
<meta charset="utf-8">
in mysql, i have this code to set utf-8
SET CHARSET 'utf8';
Here is some word on German: Gemäß
Here is how that word looks in mysql table:
Gemäß
Here is how that word is shown on the site: Gemäß
What is a problem? Thanks.
I was using this code to get title:
$title = mysql_real_escape_string(htmlentities($_POST['title']));
I just override that to
$title = $_POST['title'];
At first, make sure, that you have UTF-8 characters in your database.
After that, try using SET NAMES 'UTF8' after connecting to MySQL:
$con=mysqli_connect("host", "user", "pw", "db");
if (!$con)
{
die('Failed to connect to mySQL: ' .mysqli_connect_errno());
}
mysqli_query($con, "SET NAMES 'UTF8'") or die("ERROR: ". mysqli_error($con));
As the manual says:
SET NAMES indicates what character set the client will use to send SQL
statements to the server... It also specifies the character set that the server should
use for sending results back to the client.
Try SET NAMES 'utf8' or SET NAMES 'utf-8'. Some of these works fine for portuguese, probably for german too. I just can't remember which one is correct, but if it is not, an error will be produced.
you should make sure that the CONNECTION is also utf-8.
with mysqli this is done with something like this:
$connection = mysqli_connect($host, $user, $pass, $db_name);
$connection->set_charset("utf8");
Now if somehow you ended up with wrong characters in the database there is a way to make it right:
in a PHP script, retrieve the information as you do now, i.e without setting the connection. This way the mistake will be inverted and corrected and in your php file you will have the characters in the correct utf-8 format.
in a PHP script, write back the information with setting the connection to utf-8
at this point you should see the character correct in your database
now change all your read/write functions of your site to use the utf-8 from now on
in HTML5 use
<meta charset="utf-8">
in HTML 4.0.1 use
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
the results are html entity encoded as if they were processed by htmlentities(), I wonder if your variables are ibserted as received from the form or are being processed by say a wysiwg editor for instance?
Anyway, these should print fine on an html template but an html_entity_decode() should do it to.
Hope this helps
Set the data type in your database to use UTF-8 as well, this should solve the problem.
I had the same problem. which I solved by using:
if you have already created your table, you need the modify the character set as:
alter table <table name> convert to character set utf8 collate utf8_general_ci.
your tables character set is set to latin_swedish by default by MySQL.
also, you might face some problems while retrieving the data and displaying it to you page.For that include: mysql_set_charset('utf8') just below the line where you have connected your database.
eg:
mysql_connect('localhost','root','');
mysql_select_db('my db');
mysql_set_charset('utf8');
You will need to do this for php 5.x
$yourNiceLookingString =
htmlspecialchars ($YourStringFromDB, ENT_COMPAT | ENT_HTML401, 'ISO-8859-1');
and for php 4.x
$yourNiceLookingString = htmlspecialchars($YourStringFromDB);

Converting html entities to utf-8 and inserting them into a mysql database

I am trying to convert a string from HTML-ENTITIES to UTF-8 and then save the encoded string in my database. The html entities are greek letters and look for example like this: νω
Now I tried thousands of different ways, starting from just using utf8_encode or html_entity_decode until now I came across the function mb_convert_encoding().
Now the really weird thing is that when converting my string and then outputting it, it is correctly encoded to utf-8, but when inserting this string into my database I end up getting something like: ξÏνω.
This is the code for the encoding:
header('Content-Type: text/html; charset=utf-8');
mb_internal_encoding('utf-8');
......
while($arr = $select->fetch_array(MYSQLI_ASSOC))
{
$text = $arr["greek"];
$result = mb_convert_encoding($text, 'UTF-8', 'HTML-ENTITIES');
$mysqli->query("UPDATE some SET greek = '".$result."'");
}
When outputting my query and then manually doing a sql query in phpmyadmin it works fine, so it doesnt seem to be a problem of my db. There must be some problem when transferring the encoded string to my database...
As you see in your script, you are instructing the browser to use UTF8. That is the first step.
However your database needs the same thing and also the encoding/collation on the tables need to be UTF8 too.
You can either recreate your tables using utf8_general_ci or utf8_unicode_ci as the collation, or convert the existing tables (see here)
You need to also make sure that your database connection i.e. php code to mysql is using UTF8. If you are using PDO there are plenty of articles that show how to do that. The simplest way is to do:
$mysqli->query('SET NAMES utf8');
NOTE The change you will make now is final. If you change the connection encoding to your database, you could affect existing data.
EDIT You can do the following to set the connection
$mysqli = new mysqli($host, $user, $pass, $db);
if (!$mysqli->set_charset("utf8")) {
die("Error loading character set utf8: %s\n", $mysqli->error);
}
$mysqli->close();
Links of interest:
Whether to use "SET NAMES"
Execute the SET NAMES 'utf8' query prior to any others.

Save Accents in MySQL Database

I'm trying to save French accents in my database, but they aren't saved like they should in the DB.For example, a "é" is saved as "é".I've tried to set my files to "Unicode (utf-8)", the fields in the DB are "utf8_general_ci" as well as the DB itself.When I look at my data posted through AJAX with Firebug, I see the accent passed as "é", so it's correct.Thanks and let me know you need more info!
Personally I solved the same issue by adding after the MySQL connection code:
mysql_set_charset("utf8");
or for mysqli:
mysqli_set_charset($conn, "utf8");
or the mysqli OOP equivalent:
$conn->set_charset("utf8");
And sometimes you'll have to define the main php charset by adding this code:
mb_internal_encoding('UTF-8');
On the client HTML side you have to add the following header data :
<meta http-equiv="Content-type" content="text/html;charset=utf-8" />
In order to use JSON AJAX results (e.g. by using jQuery), you should define the header by adding :
header("Content-type: application/json;charset=utf8");
json_encode(
some_data
);
This should do the trick
The best bet is that your database connection is not UTF-8 encoded - it is usually ISO-8859-1 by default.
Try sending a query
SET NAMES utf8;
after making the connection.
mysqli_set_charset($conn, "utf8");
if you use PDO, you must instanciate like that :
new \PDO("mysql:host=$host;dbname=$schema", $username, $password, array(\PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES utf8') );
Use UTF8:
Set a meta in your
<meta http-equiv="Content-type" content="text/html;charset=utf-8" />
When you connect to your mySQL DB, force encoding so you DONT have to play with your mysql settings
$conn = mysql_connect('server', 'user', 'password') or die('Could not connect to mysql server.');
mysql_select_db('mydb') or die('Could not select database.');
mysql_set_charset('utf8',$conn); //THIS IS THE IMPORTANT PART
If you use AJAX, set you encoding like this:
header('Content-type: text/html; charset=utf-8');
Have you reviewed http://dev.mysql.com/doc/refman/5.0/en/charset-unicode.html:
Client applications that need to
communicate with the server using
Unicode should set the client
character set accordingly; for
example, by issuing a SET NAMES 'utf8'
statement. ucs2 cannot be used as a
client character set, which means that
it does not work for SET NAMES or SET
CHARACTER SET. (See Section 9.1.4,
“Connection Character Sets and
Collations”.)
Further to that:
if you get data via php from your
mysql-db (everything utf-8) but still
get '?' for some special characters in
your browser (), try this:
after mysql_connect() , and
mysql_select_db() add this lines:
mysql_query("SET NAMES utf8");
worked for me. i tried first with the
utf8_encode, but this only worked for
äüöéè... and so on, but not for
kyrillic and other chars.
You need to a) make sure your tables are using a character encoding that can encode such characters (UTF-8 tends to be the go-to encoding these days) and b) make sure that your form submissions are being sent to the database in the same character encoding. You do this by saving your HTML/PHP/whatever files as UTF-8, and by including a meta tag in the head that tells the browser to use UTF-8 encoding.
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
Oh, and don't forget C, when connecting to the database, make sure you're actually using the correct character set by executing a SET NAMES charset=utf8 (might not be the correct syntax, I'll have to look up what it should be, but it will be along those lines)
PHP(.net) advises against setting charsets after connecting using a query like SET NAMES utf8 because your functionality for escaping data inside MySQL statements might not work as intended.
Do not use SET NAMES utf8 but use the appropriate ..._set_charset() function (or method) instead, in case you are using PHP.
Ok I have found a working solution for me :
Run this mysql command
show variables like 'char%';
Here you have many variables : "character_set_server", "character_set_system" etc.
In my case I have "é" for "é" in database and I want to show "é" on my website.
To work I have to change "character_set_server" value from "utf8mb4" to "latin1".
All my correct value are :
And other values are :
With theses values the wrong database accent are corrected and well displayed by the server.
But each case can be different.

Categories