php - sqlsrv encoding - php

My project is built on PHP and connects to MS SQL Server using the sqlsrv library. The column type in MS SQL is nvarchar. When I define the connection parameters I also set the character set to UTF-8:
global $cnf;
$cnf = array();
$cnf["mssql_user"] = "xxx";
$cnf["mssql_host"] = "xxx";
$cnf["mssql_pw"] = "xxx";
$cnf["mssql_db"] = "xxx";
$cnf["CharacterSet"] = "UTF-8";
When inserting records into the database, for Vietnamese and Chinese content I use:
$city = iconv('UTF-8', 'utf-16le', $post['city']);
$params = array(array($city, null, SQLSRV_PHPTYPE_STRING(SQLSRV_ENC_BINARY)));
$sql = "INSERT INTO tblCityGarden (city) VALUES (?)";
$stmt = sqlsrv_query( $this->dbhandle, $sql, $params);
This inserts the data correctly for Vietnamese and Chinese (the values stored in the database are correct).
However, when I load the records back into the web page, strange characters appear (?, �).
I have tried iconv, mb_detect_encoding, and mb_convert_encoding, and searched through many results on the internet, but nothing works. How can I display the data correctly?
Could someone with experience of this issue please help?

I had this same problem (�), but with single quotes, double quotes, and "Rights Reserved" characters. Here is what I've found:
The CharacterSet specified seems to "rule all", so what you set this to will determine the encoding for the connection (as it should). I did not have ANY CharacterSet configured on my connection(s). Simply setting this resolved my issue, along with making sure that the values that were inserted into my DB were not double encoded via htmlspecialchars().
header() calls must be made before ANY output (this is really important)
headers cannot be set to something different later in the document
Sometimes trailing spaces before and/or after the closing ?> in your PHP file can cause issues (I don't use this closing tag, but I saw this mentioned a lot while searching)
I am not familiar with iconv(), and I am most certainly not experienced with encoding in general, but I solved my issue just by taking the time to check my headers and ensure they meet the above standards...
Your query parameters also look strange:
$params = array(array($city, null, SQLSRV_PHPTYPE_STRING(SQLSRV_ENC_BINARY)));
I have not seen a multidimensional array passed into that argument... (just a note)
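For reference, here is a minimal sketch of how the CharacterSet option is usually handed to sqlsrv_connect(). The helper names buildSqlsrvOptions() and connectSqlsrv() are hypothetical, and the mapping from the question's $cnf keys is an assumption; UID, PWD, Database and CharacterSet are the option names the sqlsrv driver documents. With CharacterSet set to UTF-8 the driver is expected to convert between UTF-8 and nvarchar on its own, so the manual iconv() to UTF-16LE should not be needed.

```php
<?php
// Hypothetical helper: maps the question's $cnf keys to sqlsrv option names.
function buildSqlsrvOptions(array $cnf): array
{
    return [
        'UID'          => $cnf['mssql_user'],
        'PWD'          => $cnf['mssql_pw'],
        'Database'     => $cnf['mssql_db'],
        // Fall back to UTF-8 when the config does not name a character set
        'CharacterSet' => $cnf['CharacterSet'] ?? 'UTF-8',
    ];
}

// Hypothetical helper: not called here, because it needs a reachable server.
function connectSqlsrv(array $cnf)
{
    // sqlsrv_connect() returns a connection resource, or false on failure
    return sqlsrv_connect($cnf['mssql_host'], buildSqlsrvOptions($cnf));
}

$cnf = [
    'mssql_user'   => 'xxx',
    'mssql_host'   => 'xxx',
    'mssql_pw'     => 'xxx',
    'mssql_db'     => 'xxx',
    'CharacterSet' => 'UTF-8',
];
$options = buildSqlsrvOptions($cnf);
```

The point of the sketch is that CharacterSet only takes effect when it ends up in the options array passed to sqlsrv_connect(); storing it in a config array alone does nothing.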

Related

Query with email headers from special Latin characters rejected by PHP mysqli_query and MariaDB command line, works in HeidiSQL

I have encountered a scenario where a query built from an email sent by someone in Europe keeps failing to execute. After minimizing the query I've determined that once all special characters like å and é are removed, the query works fine in PHP / mysqli_query. The queries also don't work in MariaDB's command line, though they do work in HeidiSQL; I imagine HeidiSQL internally adjusts the strings used in its Query tabs.
Let's get the following out of the way:
Database Character Set: utf8mb4.
Database Collation: utf8mb4_unicode_520_ci.
Database column collation: utf8mb4_unicode_520_ci.
The statement SET CHARACTER SET 'utf8mb4' is being executed correctly for the request.
Here is the query:
INSERT INTO example_table (example_column) VALUES ('Håko');
I should note that I tried the following (which also failed) even though I firmly believe that this issue occurs from and should be resolved via PHP:
INSERT INTO example_table (example_column) VALUES (CONVERT('Håko' USING utf8));
Here is the MariaDB error:
Incorrect string value: '\xE9rard ...'
Like I said this string is originating from an email message so I'm pretty sure that the issue is with PHP, not MariaDB. So let's go backwards to that code that seems to otherwise work. Please keep in mind that this has taken at least two days to put together in the correct order to even get the strings to appear correctly in the MariaDB query log without being incorrectly converted to UTF-8 and corrupting the special Latin characters:
<?php
$s1 = '=?iso-8859-1?Q?=22G=E9rd_Tabt=22?= <berbs@example.com>'; // "Gérd Tabt" <berbs@example.com>
if (strlen($s1) > 0)
{
    if (substr_count($s1, '=?') && substr_count($s1, '?= '))
    {
        $p = explode('?= ', $s1);
        $p[0] = $p[0].'?=';
        $s2 = imap_mime_header_decode($p[0])[0]->text.' '.$p[1];
    }
    else {$s2 = imap_mime_header_decode($s1)[0]->text;}
    if (strpos($s1, '=96') !== false) {$s2 = mb_convert_encoding($s2, 'UTF-8', 'CP1252');}
    else if (mb_convert_encoding($s2, 'UTF-8') == substr_count($s1, '?')) {$s2 = mb_convert_encoding($s2, 'UTF-8');}
}
else {$s2 = $s1;}
?>
There isn't any other relevant code handling this header string.
What is causing what I presume to be UTF-8 encoded strings to break PHP's mysqli_query and the MariaDB command line from working with this query?
Where did the hex E9 come from? That is encoded latin1. Yet your configuration seems to claim that your client is encoded utf8mb4. You must have the connection charset match what the encoding is in the client. The database and table and client can have a different encoding; MariaDB is happy to convert on the fly when INSERTing or SELECTing.
For more analysis, see Trouble with UTF-8 characters; what I see is not what I stored
if (mb_convert_encoding($s2, 'UTF-8') == substr_count($s1, '?'))
This makes no sense: it compares a string (the text converted to UTF-8) against an integer (the number of matches). With the type-unsafe == operator the two are equal only when the text is a numeric string matching the count, for example '0' == 0 (before PHP 8, any non-numeric text also equals 0).
So your text is never converted to UTF-8 and remains whatever it was (in this case ISO-8859-1).
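A small sketch of that comparison, using the header from the question. The variable names are illustrative only, and the byte string stands in for what the header decode returns:

```php
<?php
$s1    = '=?iso-8859-1?Q?=22G=E9rd_Tabt=22?= <berbs@example.com>';
$text  = "\"G\xE9rd Tabt\"";        // decoded text, still ISO-8859-1 bytes
$count = substr_count($s1, '?');    // number of '?' in the raw header

// Loose comparison of the text against the count: not equal,
// so the mb_convert_encoding() branch in the question is skipped.
$looseMatch = ($text == $count);

// The comparison does hold for a numeric string that equals the integer.
$zeroCase = ('0' == 0);
```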
mb_convert_encoding($s2, 'UTF-8')
Are you sure you want to convert to UTF-8 without specifying the source encoding? ISO-8859-1, as given in this email header, isn't the only one to expect. Why not extract that information and pass it to the function?
MariaDB is right: you're handing over ISO-8859-1 encoded text in that case, while the DBMS expects the UTF-8 encoding.
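One way to act on that advice, as a rough sketch: read the charset label from the encoded word and pass it to mb_convert_encoding() as the source encoding. The helper names headerCharset() and decodedTextToUtf8() are hypothetical, and the regular expression only handles the simple single-encoded-word case from the question. (imap_mime_header_decode() also exposes a charset property on each element it returns, which could be used instead of the regex.)

```php
<?php
// Hypothetical helper: pull the charset label out of an RFC 2047 encoded word.
function headerCharset(string $header): ?string
{
    return preg_match('/=\?([^?]+)\?[QqBb]\?/', $header, $m) ? $m[1] : null;
}

// Hypothetical helper: convert decoded header text to UTF-8 using the declared charset.
function decodedTextToUtf8(string $text, ?string $charset): string
{
    if ($charset === null || strcasecmp($charset, 'UTF-8') === 0) {
        return $text; // nothing declared, or already UTF-8
    }
    return mb_convert_encoding($text, 'UTF-8', $charset);
}

$s1      = '=?iso-8859-1?Q?=22G=E9rd_Tabt=22?= <berbs@example.com>';
$decoded = "\"G\xE9rd Tabt\"";   // decoded text, still ISO-8859-1 bytes
$utf8    = decodedTextToUtf8($decoded, headerCharset($s1));
```

After this step the é is the two-byte sequence C3 A9, which is what a utf8mb4 connection expects.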

Can not insert french string in database mysql php

I have a form with a text input. When I submit the text
Un sac à main de femme recèlerait une quantité importante de bactéries
only Un sac is stored in the database.
I have tried addslashes, mysql_real_escape_string, htmlspecialchars, etc., and also UTF-8 encoding, but the whole string still is not inserted.
You should use utf8_unicode_ci as your column's collation in order for French strings to be stored in it.
In order to store non-US strings in the database, you must ensure that each of the following 3 steps is correctly implemented:
Your database table must be set to a charset compatible with French. To be future proof, I recommend creating tables with UTF-8. For more information see the MySQL documentation.
Your database connection must be set to a proper character set both when storing and when querying. To do this, use mysqli_set_charset() (or whatever your MySQL connector offers).
Your input form AND your view page must be served with the exact character set as your data. To do that, you will need to set the following header: header('Content-Type: text/html; charset=UTF-8'); (If you are using a different charset, change it accordingly.)
You can of course use a different character set for storage and representation but why would you want to do that?
Also, when working with databases and HTML, you should consider:
ALWAYS escape your data as it goes into the database. Use mysqli_real_escape_string() or whatever escape method your database connector offers. Also, do NOT set the connection charset by using SET NAMES UTF8, otherwise your connector library will not know what charset to use for escaping. For more information google "sql injection".
ALWAYS escape your data as it goes into HTML with htmlspecialchars(). Also pay attention to ALWAYS provide the correct character set. For more information google "xss".
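A rough sketch of the three steps with mysqli. Note that it swaps in a prepared statement in place of mysqli_real_escape_string(); that is a substitution, not what the answer describes. The function names are hypothetical, tbl and col are placeholder names, and utf8mb4 is assumed as the connection charset.

```php
<?php
// Hypothetical helper: open a connection and set its charset right away.
function openUtf8Connection(string $host, string $user, string $pass, string $name): mysqli
{
    $db = new mysqli($host, $user, $pass, $name);
    $db->set_charset('utf8mb4'); // assumed charset; must match the bytes PHP sends
    return $db;
}

// Hypothetical helper: insert text through a prepared statement (no manual escaping).
function insertText(mysqli $db, string $text): bool
{
    $stmt = $db->prepare('INSERT INTO tbl (col) VALUES (?)');
    if ($stmt === false) {
        return false;
    }
    $stmt->bind_param('s', $text);
    return $stmt->execute();
}

// Hypothetical helper: escape for HTML output, naming the charset explicitly.
function toHtml(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// The page itself would be served with:
// header('Content-Type: text/html; charset=UTF-8');
$safe = toHtml("Un sac \u{E0} main <de femme>");
```

Only toHtml() is exercised here; the two database helpers need a reachable MySQL server.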
After breaking my head for 2 days straight and reading all the possible answers here's what solved the problem and allows me to insert additional weird characters like em dash etc. and retrieve data without seeing weird characters.
Here's the complete step-by-step setup.
The collation of the db column needs to be: utf8_general_ci
The type is: varchar(250)
In the PHP header set the default client character set to UTF8
mysql_set_charset("UTF8", $link);
Set the character set result so we can show french characters
$sql = "SET character_set_results=utf8";
$result = mysql_query($sql);
In the html header specify, so you can view the french characters:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
When inserting the data do NOT use utf8_decode; just the below will work fine:
$query = 'insert into tbl (col) VALUES ("'.mysql_real_escape_string($variable).'")';
Use normal queries to retrieve data, for example:
$query = "select * from table;";
Finally got this fixed, hope this is helpful to others.
In the php:
header ('Content-type: text/html; charset=utf-8');
After connection:
mysql_set_charset("utf8");
Just to follow up on this: I was using dbForge Studio and pasting in French text, and I had all the collations and encodings set properly. The one thing I hadn't set was the encoding for the connection to the db. I set it to UTF8 and all was well again. See point 2 in @Janoszen's answer.
Had the same problem. The input text came from an ANSI file, so it wasn't actually UTF-8 despite all my utf8 settings. utf8_encode(input_text) solved it.
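utf8_encode() assumes ISO-8859-1 input and is deprecated as of PHP 8.2, so here is a hedged alternative sketch using mbstring. The helper name ensureUtf8() is hypothetical, and Windows-1252 as the fallback is an assumption about what "ANSI" means for the file in question.

```php
<?php
// Hypothetical helper: leave valid UTF-8 alone, otherwise convert from a fallback encoding.
function ensureUtf8(string $input, string $fallback = 'Windows-1252'): string
{
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input; // already valid UTF-8, do not double encode
    }
    return mb_convert_encoding($input, 'UTF-8', $fallback);
}

$fromAnsiFile = "rec\xE8lerait";       // "recèlerait" as Windows-1252 bytes
$alreadyUtf8  = "rec\xC3\xA8lerait";   // the same word as UTF-8 bytes

$converted = ensureUtf8($fromAnsiFile);
$untouched = ensureUtf8($alreadyUtf8);
```

The validity check matters because converting text that is already UTF-8 a second time produces the familiar "Ã¨" style garbage.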
I have tried htmlentities(); it saves the string as it is in the database.
You should try this to insert special character in mysql :
$con = mysql_connect($server,$uname,$pass);
$res = mysql_select_db($database, $con);
mysql_set_charset("latin1", $con);

Superscript character in PHP causing a MySQLi select query to find 0 rows

I am using PHP 5.3.3 and MySQL 5.1.61. The column in question is using UTF-8 encoding and the PHP file is encoded in UTF-8 without BOM.
When doing a MySQLi query with a ² character in SQLyog on Windows, the query executes properly and the correct search result displays.
If I do this same exact query in PHP, it will execute but will show 0 affected_rows.
Here's what I tried:
Using LIKE instead of =
Changing the encoding of the PHP file to ANSI, UTF-8 without BOM, and UTF-8
Doing 'SET NAMES utf-8' and 'latin1' before running the query
Did header('Content-Type: text/html; charset=UTF-8'); in PHP
Escaping using MySQLi::real_escape_string
Doing a filter_var($String, FILTER_SANITIZE_STRING)
Tried a MySQLi stmt bind
The only way I could get it to work properly is if I swapped the ² for a % and changed = to LIKE in PHP.
How can I get it to query properly in PHP when using the ²?
You should be able to get the query to work by ensuring the following:
Prepping PHP for UTF-8
You first need to make sure the PHP pages that will be issuing these queries are served as UTF-8 encoded pages. This will ensure that any UTF-8 output coming from the database is displayed properly. In Firefox, you can check to see if this is the case by visiting the page you're interested in and using the View Page Info menu item. When you do so, you should see UTF-8 as the value for the page's Encoding. If the page isn't being served as UTF-8, you can do so one of two ways. Either you can set the encoding in a call to header(), like this:
header('Content-Type: text/html; charset=UTF-8');
Or, you can use a meta tag in your page's head block:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
Prepping MySQL for UTF-8
Next up, you need to make sure the database is set up to use the UTF-8 encoding. This can be set at the server, database, table, or column levels. If you're on a shared host, you probably can only control the table and column levels of your hierarchy. If you have control of the server or database, you can check to see what character encoding they are using by issuing these two commands:
SHOW VARIABLES LIKE 'character_set_server';
SHOW VARIABLES LIKE 'character_set_database';
Changing the database level encoding can be done using a command like this:
(CREATE | ALTER) DATABASE ... DEFAULT CHARACTER SET utf8;
To see what character encoding a table uses, simply do:
SHOW CREATE TABLE myTable;
Similarly, here's how to change a table-level encoding:
(CREATE | ALTER) TABLE ... DEFAULT CHARACTER SET utf8;
I recommend setting the encoding as high as you possibly can in the hierarchy. This way, you don't have to remember to manually set it for new tables. Now, if your character encoding for a table is not already set to UTF-8, you can attempt to convert it using an alter statement like this:
ALTER TABLE ... CONVERT TO CHARACTER SET utf8;
Be very careful about using this statement! If you already have UTF-8 values in your tables, they may become corrupted when you attempt to convert. There are some ways to get around this, however.
Forcing MySQLi to Use UTF-8
Finally, before you connect to your database, make sure you issue the appropriate call to say that you are using the UTF-8 encoding. Here's how:
$db = new mysqli(DB_HOST, DB_USERNAME, DB_PASSWORD, DB_NAME);
// Change the character set to UTF-8 (have to do it early)
if (! $db->set_charset("utf8"))
{
    printf("Error loading character set utf8: %s\n", $db->error);
}
Once you do that, everything should hopefully work as expected. The only characters you need to worry about encoding are the big 5 for HTML: <, >, ', ", and &. You can handle that using the htmlspecialchars() function.
If you want to read more (and get links to additional resources), feel free to check out the articles I wrote about this process. There are two parts: Unicode and the Web: Part 1, and Unicode and the Web: Part 2. Good luck!
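To see why the connection charset matters for a character like ², this short sketch prints its bytes in UTF-8 and in latin1. If PHP sends the UTF-8 bytes while the connection is treated as latin1 (or the other way round), the server is likely comparing different byte sequences, which would explain an = match finding nothing.

```php
<?php
$utf8   = "\u{00B2}";                                          // "²" encoded as UTF-8
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');   // "²" encoded as latin1

echo bin2hex($utf8), "\n";    // c2b2
echo bin2hex($latin1), "\n";  // b2
```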

Which collation should I use to store these country names in MySQL?

I am trying to store a list of countries in a mySQL database.
I am having problems storing (non English) names like these:
São Tomé and Príncipe
República de El Salvador
They are stored with strange characters in the db, (and therefore output strangely in my HTML pages).
I have tried using different combinations of collations for the database and the MySQL connection collation:
The "obvious" setting was to use utf8_unicode_ci for both the database and the connection information. To my utter surprise, that did not solve the problem.
Does anyone know how to resolve this issue?
[Edit]
It turns out the problem is not to do with collation, but rather encoding, as pointed out by the col. I notice that at the command line, I can type two separate commands:
SET NAMES utf8
followed by
[QUERY]
where [QUERY] is my SQL statement to retrieve the names, and that works (names are no longer mangled). However, when I do the same thing programmatically (i.e. through code), I still get the mangled names. I tried combining the two statements like this:
SET NAMES utf8; [QUERY]
at the command line, again, this returned the correct strings. Once again, when I tried the same statements through code, I got wrong values.
This is a snippet of what my code looks like:
$mysqli = self::get_db_connection();
$mysqli->query('SET NAMES utf8');
$sql = 'SELECT id, name FROM country';
$results = self::fetch($sql);
the fetch method is:
private static function fetch($query)
{
    $rows = array();
    if (!empty($query))
    {
        $mysqli = self::get_db_connection();
        if ($mysqli->connect_errno)
        {
            self::logError($mysqli->connect_error);
        }
        else
        {
            if ($result = $mysqli->query($query))
            {
                if (is_object($result)) {
                    while ($row = $result->fetch_array(MYSQLI_ASSOC))
                        $rows[] = $row;
                    $result->close();
                }
            }
        }
    }
    return $rows;
}
Can anyone spot what I may be doing wrong?
Just to clarify, the HTTP headers in the page are set correctly:
'Content-type': 'text/html; charset=utf-8'
so that's not the issue here.
As a matter of fact, collation affects nothing of the kind. It is used for ordering and comparison, not recoding.
It is the encoding that is responsible for the characters themselves.
So your problem comes not from the table collation but from the connection encoding. A
SET NAMES utf8
query should solve the problem, at least for newly inserted data.
If you use utf8 everywhere*, it will work; it seems you forgot something.
*Everywhere means: for your database collation and connection, for your (PHP?) script files, and for the pages sent to the browser (by setting a meta tag or, better, a UTF-8 header).
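One possibility, which the snippet in the question can neither confirm nor rule out, is that get_db_connection() hands back a new connection on each call, so SET NAMES runs on a different connection than the SELECT. A minimal sketch of caching the connection and setting the charset once at creation follows. ConnectionCache and the factory callable are hypothetical, and a stand-in object replaces mysqli so the sketch runs without a server.

```php
<?php
// Hypothetical cache: creates the connection once and sets its charset once.
final class ConnectionCache
{
    private static $conn = null;

    public static function get(callable $factory)
    {
        if (self::$conn === null) {
            self::$conn = $factory();           // e.g. a closure returning new mysqli(...)
            self::$conn->set_charset('utf8');   // same charset the question uses
        }
        return self::$conn;
    }
}

// Stand-in for mysqli: records set_charset() calls instead of talking to a server.
$fake = new class {
    public $charsets = [];

    public function set_charset(string $name): bool
    {
        $this->charsets[] = $name;
        return true;
    }
};

$factory = function () use ($fake) {
    return $fake;
};

$first  = ConnectionCache::get($factory);
$second = ConnectionCache::get($factory);
```

Both calls return the same object and the charset is set exactly once, so every later query runs on a connection that already uses it.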

Encoding problems in PHP / MySQL

EDIT: After feedback on my original post, I've changed the text to clarify my problem.
I have the following query (pseudo code):
$conn = mysql_connect('localhost', 'mysql_user', 'mysql_password');
mysql_query("SET NAMES 'utf8'; COLLATE='utf8_danish_ci';");
mysql_query("SELECT id FROM myTable WHERE name = 'Fióre`s måløye'", $conn);
This returns 0 rows.
In my logfile, I see this:
255 Connect root@localhost on
255 Query SET NAMES 'utf8'; COLLATE='utf8_danish_ci'
255 Init DB norwegianfashion
255 Query SELECT id FROM myTable WHERE name = 'Fióre`s måløye'
255 Quit
If I run the query directly in phpMyAdmin, I get the result.
Table encoding: UTF-8
HTML page encoding: UTF-8
I can add records (from form input) where names uses accents (e.g. "Fióre`s Häßelberg")
I can read records with accents when using -> "name LIKE '$labelName%'"
The information in the DB looks fine
I have no clue why I can't select any rows whose name has accented characters.
I really hope someone can help me.
UPDATE 1:
I've come to a compromise. I'll be converting accents with htmlentities when storing data, and html_entity_decode when retrieving data from the DB. That seems to work.
The only drawback I see so far is that I can't read the names in cleartext using phpMyAdmin.
I think you should rather return $result than $this->query.
Additionally you should be aware of SQL injection and consider using mysql_real_escape_string or Prepared Statements to protect you against such attacks. addslashes is not a proper protection.
As other answers indicate, this very much seems like an encoding problem. I suggest turning on query logging ( http://dev.mysql.com/doc/refman/5.1/en/query-log.html ) as it can show you what the database really receives.
UPDATE:
I finally found a page explaining the dirty details of PHP and UTF-8 (http://www.phpwact.org/php/i18n/charsets). Also, make sure you read this (http://niwo.mnsys.org/saved/~flavell/charset/form-i18n.html) to understand how to get proper data returned from form posts.
Try this query. If you get results, then it's an issue with your backtick character in the query
SELECT * FROM sl_label WHERE name Like 'Church%'
Maybe try checking for error messages after calling the query (if you aren't already doing this outside that function). It could be telling you exactly what's wrong.
As Artem commented, printing out the actual query is a good idea - sometimes things aren't exactly as you expect them to be.
This might be an encoding issue, the ' in Church's might be a fancy character. PHPMyAdmin could be UTF-8, and your own PHP website could be iso-latin1.
I'm looking at this line
mysql_query("SET NAMES 'utf8'; COLLATE='utf8_danish_ci';");
and I think it might be an error. With the ';' you are sending two queries to the server, but COLLATE is a clause, not a legal statement on its own. Try:
mysql_query("SET NAMES 'utf8' COLLATE 'utf8_danish_ci'");
If the COLLATE clause is not being accepted by the server, your label column may have a danish_ci collation while the incoming statements use the default (probably utf8_general_ci). There would be no match for the accented characters, but the wildcard works because the representation of the basic ASCII characters is the same.
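The point about ASCII can be illustrated at the byte level. This sketch looks at the encoding side only and says nothing about collation rules: the same name encoded as UTF-8 and as latin1 shares its ASCII prefix but differs at the first accented character, which is consistent with a prefix LIKE matching while a full = comparison does not.

```php
<?php
$utf8   = "Fi\u{00F3}re";                                      // "Fióre" as UTF-8
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');   // "Fióre" as latin1

$samePrefix = strncmp($utf8, $latin1, 2) === 0;  // "Fi" is byte-identical in both
$sameWhole  = $utf8 === $latin1;                 // the ó differs: c3b3 versus f3

echo bin2hex($utf8), "\n";
echo bin2hex($latin1), "\n";
```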
