UTF-8 special characters are inserting into table as weird characters [duplicate] - php

This question already has answers here:
UTF-8 all the way through
(13 answers)
Closed 9 years ago.
I searched this site and found lots of answers, but none of them work for me. I am trying to parse the XML below using PHP's simplexml_load_string.
My XML is:
<?xml version="1.0" encoding="utf-8"?>
<address>
<name>Peter</name>
<country>Großbritannien</country>
</address>
When I print it using print_r, the result is:
SimpleXMLElement Object
(
[name] => Peter
[country] => GroÃŸbritannien
)
and when I use ini_set('default_charset', 'UTF-8') or header('Content-Type: text/html; charset=utf-8') the result is:
SimpleXMLElement Object
(
[name] => Peter
[country] => Großbritannien
)
But when I insert the country value into the database (MySQL) from PHP (with the header set to UTF-8), it is stored as garbled characters instead of Großbritannien. My table's character set is UTF8 and the collation of the column is utf8_unicode_ci. However, when I insert the country value directly into the table (using phpMyAdmin or the terminal), it is stored correctly.
I would like to know why the country value is not inserted properly from my PHP parsing page.
My PHP sample code is below:
$notificationXml = simplexml_load_string($xml);
$con = $notificationXml->country;
mysql_query("INSERT INTO test_1 (con) values ('$con')");
Please help me out. Thanking you all in advance.

a) The extension providing the mysql_* functions is marked as deprecated. It is better to start with pdo_mysql or mysqli.
b) You have to take care of the connection charset, i.e. the charset the MySQL server expects the client to use when sending and receiving data. Avoid SET NAMES ..., because it leaves the client-side MySQL library in the dark about the new charset, so some character handling (such as escaping) may not work as intended, which can even lead to security issues. Use the dedicated client-side mechanism for changing the charset instead: mysql_set_charset() for mysql_*, mysqli::set_charset() for mysqli_*, and for PDO put the charset into the DSN, like
$pdo = new PDO('mysql:host=localhost;dbname=testdb;charset=utf8');
(and use a php version >= 5.3.6)
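A minimal sketch of the PDO route, applied to the insert from the question. The helper names `buildMysqlDsn` and `insertCountry` and the credentials in the usage comment are assumptions for illustration; `test_1`, `con` and `testdb` come from the posts above. The prepared statement is an addition that also avoids interpolating the value into the SQL string.

```php
<?php
// Hypothetical helper: builds a DSN that carries the connection charset.
function buildMysqlDsn(string $host, string $dbname, string $charset = 'utf8'): string
{
    return "mysql:host={$host};dbname={$dbname};charset={$charset}";
}

// Hypothetical helper: inserts the country value through a prepared statement,
// so the value is never pasted into the SQL text.
function insertCountry(PDO $pdo, string $country): void
{
    $stmt = $pdo->prepare('INSERT INTO test_1 (con) VALUES (:con)');
    $stmt->execute([':con' => $country]);
}

$dsn = buildMysqlDsn('localhost', 'testdb');

// Usage, assuming a reachable server and placeholder credentials:
// $pdo = new PDO($dsn, 'user', 'password', [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);
// $notificationXml = simplexml_load_string($xml);
// insertCountry($pdo, (string) $notificationXml->country);
```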

Insert this above your insert query
mysql_query('SET NAMES utf8');
As @deceze pointed out, it is better to use
mysql_set_charset('utf8');

Related

Issue MySQL, PHP and CSV

I have a problem for days with MySQL and PHP.
I load data from a CSV and write it to a database. Unfortunately, umlauts are not displayed correctly, so a ü is written to the database as u00fc, for example.
I use the mysql extension because there is no alternative in this project. After establishing the MySQL connection, I execute the following commands:
mysql_set_charset('UTF8', $newsystem);
mysql_query("SET CHARACTER SET 'UTF8'", $newsystem);
The collation is set to utf8_general_ci.
To enter the data, the data is converted using json_encode.
The array is created as follows:
$jsonOrderDaten = array('id' => $daten[11], 'bezeichnung' => $daten[12], 'stueckzahl' =>$daten[14], 'preis' => $daten[15], 'mwst_satz' => $daten[16]);
And the MySQL insert is done like this:
$insertquery = "INSERT INTO `e_paket` (`paket_id`, `ebay_verkaufsnr`, `adress_daten`, `bestell_daten`, `ebay_order`, `status`) VALUES ('$packid', '$ebayorderid', '$jsonKD', '$jsonOD', 1, 0)";
mysql_query($insertquery ,$newsystem);
Can someone help me with my problem?
Best regards,
Pascal
The problem here is the JSON encoding. By default, json_encode escapes all non-ASCII characters (e.g. German umlauts) as \uXXXX sequences.
To fix this, you can pass the JSON_UNESCAPED_UNICODE flag to json_encode like this:
$jsonOD = json_encode($jsonOrderDaten, JSON_UNESCAPED_UNICODE);
Please note that you should watch out for SQL-Injection!
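A small self-contained sketch of the difference the flag makes. The array key mirrors the question; the value "Tür" is made up for illustration. When the escaped form is pasted into a quoted SQL string, MySQL's default backslash handling would be expected to drop the backslash, which matches the u00fc seen in the question.

```php
<?php
// Sample row; the key mirrors the question's array, the value is invented.
$jsonOrderDaten = ['bezeichnung' => "T\u{00FC}r"]; // "Tür"

// Default: non-ASCII characters are written as \uXXXX escapes.
$escaped = json_encode($jsonOrderDaten);

// With the flag: the UTF-8 bytes are kept as they are.
$unescaped = json_encode($jsonOrderDaten, JSON_UNESCAPED_UNICODE);

echo $escaped, PHP_EOL;   // {"bezeichnung":"T\u00fcr"}
echo $unescaped, PHP_EOL; // {"bezeichnung":"Tür"}
```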
I think using "LOAD DATA LOCAL INFILE" will solve your problem. It is also faster than normal insertion. Please see https://dev.mysql.com/doc/refman/5.7/en/load-data.html for more details.
You could also do mysql_query ("SET NAMES UTF8");
Also, not only the connection has character encoding, but also the database, the table and each column have their own default encoding. Check them if they are set to something different.
As for mysqli, SQL injection, and prepared statements: the others have already commented on them.

how to retrieve utf-8 data [duplicate]

Closed 7 years ago.
I am storing Tamil text (UTF-8) in a MySQL database whose collation is latin1_swedish_ci by default, but the data shows up like ????? ??? ???????????.????? when I retrieve it. I have read a lot about the problem online, but none of the solutions helped, and it is driving me mad. Can anybody help me? I am using the query below. Please give me a simple solution.
<?php
$con=mysql_connect('localhost','root','');
$db=mysql_select_db('nikah', $con);
$sql="select * from matrimony";
$result=mysql_query($sql) or die(mysql_error());
while ($row = mysql_fetch_array($result)) {
echo $row['name'];
}
?>
Try using mysql_set_charset() as explained here.
$link = mysql_connect('localhost', 'user', 'password');
mysql_set_charset('utf8',$link);
EDIT:
As stated by Jay, keep in mind that the mysql extension has been deprecated since PHP 5.5.0. From the PHP documentation:
Warning
This extension was deprecated in PHP 5.5.0, and it was removed in PHP 7.0.0. Instead, the MySQLi or PDO_MySQL extension should be used. See also MySQL: choosing an API guide and related FAQ for more information. Alternatives to this function include:
mysqli_set_charset()
PDO: Add charset to the connection string, such as charset=utf8
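A hedged sketch of the question's loop rewritten with mysqli. The helper name `fetchNames` is hypothetical; the table, column, database name and credentials are taken from the question. Setting the connection charset only helps if the stored data is intact; text that was already replaced by question marks when it was written cannot be recovered this way.

```php
<?php
// Hypothetical helper: returns all values of the `name` column.
function fetchNames(mysqli $db): array
{
    // Tell both the client library and the server which charset the connection uses.
    if (!$db->set_charset('utf8')) {
        throw new RuntimeException('Could not set charset: ' . $db->error);
    }

    $result = $db->query('SELECT name FROM matrimony');
    if ($result === false) {
        throw new RuntimeException('Query failed: ' . $db->error);
    }

    $names = [];
    while ($row = $result->fetch_assoc()) {
        $names[] = $row['name'];
    }
    return $names;
}

// Usage, assuming the credentials from the question:
// $db = new mysqli('localhost', 'root', '', 'nikah');
// foreach (fetchNames($db) as $name) {
//     echo htmlspecialchars($name), PHP_EOL;
// }
```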
In addition to setting mysql_set_charset("utf8"); as mentioned in the other answer, you might need to adjust for a few more settings in order to fully guard yourself against broken characters.
Connection
The connection needs to know what charset to expect. Just after creating the connection, specify the charset like this
$con = mysql_connect('localhost','root','');
mysql_set_charset("utf8");
Headers
Set the charset to UTF-8 in both the HTTP header sent by PHP and the HTML document.
PHP: header('Content-Type: text/html; charset=utf-8');
(PHP headers have to be sent before any kind of output (echo, whitespace, HTML))
HTML: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
(HTML headers are placed within the <head> / </head> tags)
Database and tables
Your database and all its tables have to be set to UTF-8. Note that charset is not exactly the same as collation (see this post).
You can do that by running the queries below once for each database and table (for example in phpMyAdmin)
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
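To illustrate why the column charset matters, here is a sketch using PHP's mbstring as a stand-in for the conversion a latin1 column would require. It is an analogy, not a reproduction of MySQL's internals: a single-byte Western charset has no code points for Tamil, so each character is replaced.

```php
<?php
// Requires the mbstring extension.
mb_substitute_character(0x3F); // use "?" for characters the target charset lacks

$tamil = "\u{0BA4}\u{0BAE}\u{0BBF}\u{0BB4}\u{0BCD}"; // "தமிழ்", five code points

// Converting to a single-byte Western charset loses every Tamil character.
$asLatin1 = mb_convert_encoding($tamil, 'ISO-8859-1', 'UTF-8');

echo $asLatin1, PHP_EOL; // ?????
```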
Should you follow all of the pointers above, chances are your problem will be solved. If not, you can take a look at this StackOverflow post: UTF-8 all the way through.
mysql_* functions have been deprecated since PHP 5.5 (and were removed entirely in PHP 7), and you should stop using them if you can.
You should choose another API, like mysqli_* or PDO instead - see choosing an API.

Unable to output cyrillic ASCII encoding [duplicate]

Closed 9 years ago.
I have MySQL field which contains plain text Cyrillic characters (ex. Широка поляна). Collation is utf8_general_ci.
When I pull out this content with MySQL query and try to output it with php I always get ???? symbols. HTML encoding is utf8, document encoding is utf8, mb_detect_encoding() shows ASCII for the string but none of the PHP / MySQL convert functions turns it into something readable.
For the outdated mysql driver it has to be
mysql_set_charset('utf8');
for mysqli
$mysqli->set_charset('utf8');
for PDO you have to set encoding in DSN:
$dsn = "mysql:host=localhost;dbname=test;charset=utf8";
You are most likely making the very common mistake of not setting the connection encoding. It is usually set in the my.ini config file, but IMHO the better way is to always enforce it in your code. To do so, just execute this query:
SET NAMES encoding;
i.e. for utf8 it would be:
SET NAMES utf8;
You do this just after you connect to database and do it once per connection.
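The ASCII result from mb_detect_encoding() mentioned in the question is consistent with the characters having been replaced before PHP ever received them. A small sketch, with both sample strings invented for illustration:

```php
<?php
// Requires the mbstring extension.
$fromDatabase = '?????? ??????'; // the kind of value the question describes receiving
$expected = "\u{0428}\u{0438}\u{0440}\u{043E}\u{043A}\u{0430}"; // "Широка" as UTF-8

$candidates = ['ASCII', 'UTF-8'];

// Literal question marks are plain ASCII: no conversion function can bring the text back.
echo mb_detect_encoding($fromDatabase, $candidates, true), PHP_EOL; // ASCII

// Intact Cyrillic text is not valid ASCII, so it is reported as UTF-8.
echo mb_detect_encoding($expected, $candidates, true), PHP_EOL; // UTF-8
```

If the first case is what PHP sees, the fix belongs on the connection, not in a later conversion step.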
You have to set the charset for your database connection.
For this you can use the mysql_set_charset function.
mysql_set_charset('utf8',$link1);
More information: http://php.net/mysql_set_charset
Put this on the first line of the PHP file:
header('Content-Type: text/html; charset=utf-8');

Superscript character in PHP causing a MySQLi select query to find 0 rows

I am using PHP 5.3.3 and MySQL 5.1.61. The column in question is using UTF-8 encoding and the PHP file is encoded in UTF-8 without BOM.
When doing a MySQLi query with a ² character in SQLyog on Windows, the query executes properly and the correct search result displays.
If I do this same exact query in PHP, it will execute but will show 0 affected_rows.
Here's what I tried:
Using LIKE instead of =
Changing the encoding of the PHP file to ANSI, UTF-8 without BOM, and UTF-8
Running 'SET NAMES utf-8' and 'SET NAMES latin1' before running the query
Calling header('Content-Type: text/html; charset=UTF-8'); in PHP
Escaping using MySQLi::real_escape_string
Doing a filter_var($String, FILTER_SANITIZE_STRING)
Using a MySQLi prepared statement with bound parameters
The only way I could get it to work properly is if I swapped the ² for a % and changed = to LIKE in PHP.
How can I get it to query properly in PHP when using the ²?
You should be able to get the query to work by ensuring the following:
Prepping PHP for UTF-8
You first need to make sure the PHP pages that will be issuing these queries are served as UTF-8 encoded pages. This will ensure that any UTF-8 output coming from the database is displayed properly. In Firefox, you can check to see if this is the case by visiting the page you're interested in and using the View Page Info menu item. When you do so, you should see UTF-8 as the value for the page's Encoding. If the page isn't being served as UTF-8, you can fix that in one of two ways. Either you can set the encoding in a call to header(), like this:
header('Content-Type: text/html; charset=UTF-8');
Or, you can use a meta tag in your page's head block:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
Prepping MySQL for UTF-8
Next up, you need to make sure the database is set up to use the UTF-8 encoding. This can be set at the server, database, table, or column levels. If you're on a shared host, you probably can only control the table and column levels of your hierarchy. If you have control of the server or database, you can check to see what character encoding they are using by issuing these two commands:
SHOW VARIABLES LIKE 'character_set_server';
SHOW VARIABLES LIKE 'character_set_database';
Changing the database level encoding can be done using a command like this:
(CREATE | ALTER) DATABASE ... DEFAULT CHARACTER SET utf8;
To see what character encoding a table uses, simply do:
SHOW CREATE TABLE myTable;
Similarly, here's how to change a table-level encoding:
(CREATE | ALTER) TABLE ... DEFAULT CHARACTER SET utf8;
I recommend setting the encoding as high as you possibly can in the hierarchy. This way, you don't have to remember to manually set it for new tables. Now, if your character encoding for a table is not already set to UTF-8, you can attempt to convert it using an alter statement like this:
ALTER TABLE ... CONVERT TO CHARACTER SET utf8;
Be very careful about using this statement! If you already have UTF-8 values in your tables, they may become corrupted when you attempt to convert. There are some ways to get around this, however.
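The corruption warned about here is usually double encoding: bytes that are already UTF-8 get treated as a single-byte charset and encoded to UTF-8 a second time. The sketch below reproduces and reverses that in PHP, using ISO-8859-1 as a stand-in for the single-byte charset (an assumption; MySQL's latin1 differs from it in a few code points). It illustrates the mechanism and is not a migration procedure.

```php
<?php
// Requires the mbstring extension.
$original = "m\u{00B2}"; // "m²" as valid UTF-8 (3 bytes)

// Simulate the corruption: read the UTF-8 bytes as ISO-8859-1 and encode them again.
$doubleEncoded = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

// Reverse it: converting back to ISO-8859-1 yields the original UTF-8 bytes.
$repaired = mb_convert_encoding($doubleEncoded, 'ISO-8859-1', 'UTF-8');

var_dump(strlen($original));      // byte length before
var_dump(strlen($doubleEncoded)); // byte length after double encoding
var_dump($repaired === $original);
```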
Forcing MySQLi to Use UTF-8
Finally, right after you connect to your database and before you run any query, make sure you issue the appropriate call to say that you are using the UTF-8 encoding. Here's how:
$db = new mysqli(DB_HOST, DB_USERNAME, DB_PASSWORD, DB_NAME);
// Change the character set to UTF-8 (have to do it early)
if(! $db->set_charset("utf8"))
{
printf("Error loading character set utf8: %s\n", $db->error);
}
Once you do that, everything should hopefully work as expected. The only characters you need to worry about encoding are the big 5 for HTML: <, >, ', ", and &. You can handle that using the htmlspecialchars() function.
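One plausible reason the ² query matches in SQLyog but not from PHP is that the same character has different bytes in different encodings. If the connection charset does not match the bytes PHP sends, the server compares something other than ². A self-contained sketch of that mismatch, again with ISO-8859-1 standing in for latin1:

```php
<?php
// Requires the mbstring extension.
$superscriptTwo = "\u{00B2}"; // "²" as UTF-8

// The same character has different byte sequences in the two encodings.
$utf8Hex   = bin2hex($superscriptTwo);
$latin1Hex = bin2hex(mb_convert_encoding($superscriptTwo, 'ISO-8859-1', 'UTF-8'));

// What a latin1 connection would make of the two UTF-8 bytes: two characters, not one.
$misread = mb_convert_encoding($superscriptTwo, 'UTF-8', 'ISO-8859-1');

echo $utf8Hex, PHP_EOL;                     // c2b2
echo $latin1Hex, PHP_EOL;                   // b2
echo mb_strlen($misread, 'UTF-8'), PHP_EOL; // 2
```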
If you want to read more (and get links to additional resources), feel free to check out the articles I wrote about this process. There are two parts: Unicode and the Web: Part 1, and Unicode and the Web: Part 2. Good luck!

PHP + SQL Server - How to set charset for connection?

I'm trying to store some data in a SQL Server database through php.
Problem is that special chars aren't converted properly. My app's charset is iso-8859-1
and the one used by the server is windows-1252.
Converting the data manually before inserting doesn't help, there seems to be some
conversion going on.
Running the SQL query 'set char_convert off' doesn't help either.
Anyone have any idea how I can get this to work?
EDIT: I have tried ini_set('mssql.charset', 'windows-1252'); as well, but no result with that one either.
Client charset is necessary but not sufficient:
ini_set('mssql.charset', 'UTF-8');
I searched for two days how to insert UTF-8 data (from web forms) into MSSQL 2008 through PHP. I read everywhere that you can't, you need to convert to UCS2 first (like cypher's solution recommends).
On Windows, SQLSRV is said to be a good solution, but I couldn't try it, since I am developing on Mac OS X.
However, the FreeTDS manual (FreeTDS is what PHP's mssql extension uses on OS X) says to add the letter "N" before the opening quote:
mssql_query("INSERT INTO table (nvarcharField) VALUES (N'űáúőűá球最大的采购批发平台')", $conn);
According to this discussion, the N prefix tells the server to treat the string literal as Unicode:
https://softwareengineering.stackexchange.com/questions/155859/why-do-we-need-to-put-n-before-strings-in-microsoft-sql-server
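A minimal sketch of building such a literal by hand. The helper name `toNationalLiteral` and the `users`/`name` identifiers are hypothetical. Doubling single quotes is how T-SQL escapes them inside a string literal, but where the driver supports it, a parameterized query is the safer choice.

```php
<?php
// Hypothetical helper: wraps a value in an N'...' literal and doubles embedded
// single quotes so the literal cannot be terminated early.
function toNationalLiteral(string $value): string
{
    return "N'" . str_replace("'", "''", $value) . "'";
}

// Hypothetical table and column names.
$sql = 'INSERT INTO users (name) VALUES (' . toNationalLiteral("O'Brien \u{0171}") . ')';

echo $sql, PHP_EOL;
```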
I had the same problem and ini_set('mssql.charset', 'utf-8') did not work for me.
However, it worked in uppercase:
ini_set('mssql.charset', 'UTF-8');
I suggest looking at the following points:
Ensure that the columns that you're storing the information in are nchar or nvarchar, as char and varchar don't support UCS-2 (SQL Server doesn't store in UTF-8 format, btw)
If you're connecting with the mssql library/extension for PHP, run: ini_set('mssql.charset', 'utf-8'); as there's no function with a charset argument (connect, query etc)
Ensure that your browser's charset is also set to UTF-8
If ini_set('mssql.charset', 'UTF-8'); doesn't help AND you don't have root access to modify the system wide freetds.conf file, here's what you can do:
1. Set up /your/local/freetds.conf file:
[sqlservername]
host=192.168.0.56
port=1433
tds version=7.0
client charset=UTF-8
2. Make sure your connection DSN is using the servername, not the IP:
'dsn' => 'dblib:host=sqlservername;dbname=yourdb',
3. Make FreeTDS use your local freetds.conf file as an unprivileged user from the PHP script via an environment variable:
putenv('FREETDSCONF=/your/local/freetds.conf');
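Put together, the three steps might look like the sketch below. The helper name `connectViaFreeTds` and the credentials are assumptions; the path, server name and database name are the placeholders used in the steps. The environment variable is set before the first connection is opened, on the assumption that FreeTDS reads its configuration at that point.

```php
<?php
// Placeholder path from the steps above; adjust to wherever the file actually lives.
putenv('FREETDSCONF=/your/local/freetds.conf');

// Hypothetical helper: connects through the [sqlservername] entry of that file.
function connectViaFreeTds(string $user, string $password): PDO
{
    return new PDO('dblib:host=sqlservername;dbname=yourdb', $user, $password);
}

// Usage, assuming the pdo_dblib driver is installed and the server is reachable:
// $pdo = connectViaFreeTds('user', 'password');
```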
If you are using TDS protocol version 7 or above, ALL communications over the wire are converted to UCS2. The server will convert from UCS2 into whatever the table or column collation is set to, unless the column is nvarchar or ntext. You can store UTF-8 into regular varchar or text, you just have to use a TDS protocol version lower than 7, like 6.0 or 4.2. The only drawback with this method is that you cannot query any nvarchar, ntext, or sys.* tables (I think you also can't do any CAST()ing) - as the server refuses to send anything that might possibly be converted to UTF-8 to any client using protocol version lower than 7.
It is not possible to avoid converting character sets when using TDS protocol version 7 or higher (roughly equivalent to MSSQL 2005 or newer).
In my case, it worked after I added the "CharacterSet" parameter to the connection options passed to sqlsrv_connect().
$connectionInfo = array(
"Database"=>$DBNAME,
"ConnectionPooling"=>0,
"CharacterSet"=>"UTF-8"
);
$LAST_CONNECTION = sqlsrv_connect($DBSERVER, $connectionInfo);
See documentation here :
https://learn.microsoft.com/en-us/sql/connect/php/connection-options?view=sql-server-2017
I've had luck in a similar situation (using a PDO ODBC connection) using the following code to convert the encoding before printing output:
$data = mb_convert_encoding($data, 'ISO-8859-1', 'windows-1252');
I had to manually set the source encoding, because it was erroneously being reported as 'ISO-8859-1' by mb_detect_encoding().
My data was also being stored in the database by another application, so I might be in a unique situation, although I hope it helps!
For me editing this file:
/etc/freetds/freetds.conf
...and changing/setting 'tds version' parameter to '7.0' helped. Edit your freetds.conf and try to change this parameter for your server configuration (or global).
It will work even without apache restart.
I did not see anyone mention another way of converting results from an MSSQL database: the good old iconv() function:
iconv (string $in_charset, string $out_charset, string $str): string;
In my case everything else failed to provide meaningful conversion, except this one when getting the results. Of course, this is done inside the loop of parsing the results of the query - from CP1251 to UTF-8:
foreach ($records as $row=>$col) {
$array[$row]['StatusName'] = iconv ('CP1251', 'UTF-8' , $records[$row]['StatusName']);
}
Ugly, but it works.
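A slightly more general sketch of the same idea, converting every string field of every row in one pass. The helper name `rowsToUtf8` and the sample row are made up for illustration; the CP1251 source charset and the `StatusName` key follow the loop above.

```php
<?php
// Requires the iconv extension.
// Hypothetical helper: converts every string field of every row to UTF-8.
function rowsToUtf8(array $rows, string $sourceCharset): array
{
    return array_map(
        function (array $row) use ($sourceCharset): array {
            return array_map(
                function ($value) use ($sourceCharset) {
                    // Leave integers, nulls and other non-string values untouched.
                    return is_string($value) ? iconv($sourceCharset, 'UTF-8', $value) : $value;
                },
                $row
            );
        },
        $rows
    );
}

// "Широка" as CP1251 bytes, standing in for a fetched result set.
$records = [['Id' => 1, 'StatusName' => "\xD8\xE8\xF0\xEE\xEA\xE0"]];
$converted = rowsToUtf8($records, 'CP1251');

echo $converted[0]['StatusName'], PHP_EOL; // Широка
```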
Can't you just convert your tables to your application encoding? Or use utf-8 in both?
I don't know whether MSSQL supports table-level encodings, though.
Also, try the MB (multibyte) string functions, if the above fails.
You should set the charset with ini_set('mssql.charset', 'windows-1252') before the connection. If you use it after the mssql_connect it has no effect.
Just adding ini_set('mssql.charset', 'UTF-8'); didn't help me in my case. I had to specify the UTF-8 character set on the column:
$age = 30;
$name = utf8_encode("Joe");
$select = sqlsrv_query($conn, "SELECT * FROM Users WHERE Age = ? AND Name = ?",
array(array($age), array($name, SQLSRV_PARAM_IN, SQLSRV_PHPTYPE_STRING('UTF-8'))));
You can use the mysql_set_charset function:
http://it2.php.net/manual/en/function.mysql-set-charset.php
