How to convert HTML code of Unicode to Real Unicode Character - php

Is there any PHP function that can convert
&#1575;&#1576;&#1576;
to its equivalent Unicode characters
ت ب ا
I have searched a lot, but I don't think PHP has a built-in function for this purpose. What I actually want is to store user-submitted comments (which are in Unicode characters) in a MySQL database, but they end up stored in the format &#1575;&#1576;&#1576;. I am also running SET NAMES 'utf8' in my MySQL query. The output shows real Unicode characters, which is fine, but the value inserted into MySQL is in the &#1575;&#1576;&#1576; format, which I don't want.
Is there any solution?

I searched for this and found a very interesting solution.
I have also tried it and it seems to work:
<?php
$trans_tbl = get_html_translation_table(HTML_ENTITIES);
foreach($trans_tbl as $k => $v)
{
$ttr[$v] = utf8_encode($k);
}
$text = '&#1575;&#1576;&#1576;';
$text = strtr($text, $ttr);
echo $text;
?>
For MySQL, you can set the connection character set like this:
$mysqli = new mysqli($host, $user, $pass, $db);
if (!$mysqli->set_charset("utf8")) {
die("error");
}
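Note that the translation table above covers named entities (such as &eacute;) but not arbitrary numeric ones like &#1575;. A minimal sketch of an alternative, assuming the input really is HTML-entity text and the wanted output is UTF-8, uses the built-in html_entity_decode(). The helper name decode_entities is hypothetical and only there for illustration.

```php
<?php
// Hypothetical helper name; html_entity_decode() itself is a PHP built-in.
function decode_entities(string $text): string
{
    // ENT_QUOTES also decodes quote entities; ENT_HTML5 selects the HTML5 entity table.
    return html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Numeric entities for the Arabic letters alef, beh, beh.
echo decode_entities('&#1575;&#1576;&#1576;'), "\n";
```

The decoded string can then be written to MySQL over a connection whose charset has been set as shown above.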

If you
1 - set your HTML page's encoding to utf-8, which also covers your forms
2 - only use those forms to enter input into the related MySQL tables
3 - set all collations to utf8_unicode_ci in MySQL (table and column collations)
4 - if you have permission, also set your MySQL database collation to utf8_unicode_ci
then you won't see entities in your MySQL records.
This is the solution I use, and I have no trouble with my mother language, which also has lots of non-ASCII characters.
Below is my db connection PHP code. I also recommend using prepared statements; please check http://php.net/manual/en/mysqli.quickstart.prepared-statements.php
// MySQL connection
global $db_baglanti;
$db_baglanti = new mysqli(vt_host, vt_user, vt_password, vt_name);
if ($db_baglanti->connect_errno)
{
echo "Could not connect to MySQL: (" . $db_baglanti->connect_errno . ") " . $db_baglanti->connect_error;
}
// set_charset() returns false on failure, so a single call both sets and checks the charset
if (!$db_baglanti->set_charset("utf8"))
{
printf("Problem loading the utf8 character set: %s\n", $db_baglanti->error);
}
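The prepared statement recommended above could look roughly like this. It is only a sketch: the function name save_comment, the table comments and the column body are assumptions for illustration, and the mysqli object is expected to already have its charset set as in the connection code.

```php
<?php
// Hypothetical helper; table `comments` and column `body` are assumed names.
function save_comment(mysqli $db, string $comment): bool
{
    // The placeholder keeps the UTF-8 text out of the SQL string itself.
    $stmt = $db->prepare('INSERT INTO comments (body) VALUES (?)');
    if ($stmt === false) {
        return false;
    }
    // 's' binds the value as a string; no manual escaping is needed.
    $stmt->bind_param('s', $comment);
    $ok = $stmt->execute();
    $stmt->close();
    return $ok;
}
```

It would be called as save_comment($db_baglanti, $comment) once the connection has been made.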

I think you should be able to use mysql_set_charset() http://php.net/manual/en/function.mysql-set-charset.php to set the mysql connection encoding so when you retrieve these from the DB and display them they should be fine.
According to php.net "This extension is deprecated as of PHP 5.5.0, and will be removed in the future. Instead, the MySQLi or PDO_MySQL extension should be used."

Related

Urdu / Arabic font data from MySQL is displaying as ????? in JSON

I am developing an Android app with Urdu/Arabic data stored in a MySQL database on my web server, and I use json_encode() to generate the JSON string. The JSON string is then used in the Android app to perform various functions (populating a RecyclerView and other view objects with data). I am able to store Urdu/Arabic data in the MySQL database, but when I use a PHP script to generate the JSON, all the fields containing Urdu characters come out as ??????
I was using utf8mb4_unicode_ci because I read that it is well suited to storing non-English data, but after this encoding problem I changed it to utf8_general_ci for all the tables and fields in the MySQL database. Below is the PHP script I am using to generate the JSON string from MySQL:
<?php
require "conn.php";
mysqli_query("SET NAMES 'utf8'");
mysqli_query('SET CHARACTER SET utf8');
$sql_qry = "SELECT * FROM countrybasic;";
$result = mysqli_query($conn, $sql_qry);
$response = array();
while($row = mysqli_fetch_array($result)){
array_push($response, array("id"=>$row[0],"name"=>$row[1],"capital"=>$row[2],"continent"=>$row[3],"population"=>$row[4],"gdp"=>$row[5]));
}
echo json_encode(array("server_response"=>$response));
mysqli_close($conn);
?>
The Name and Capital fields are the ones I store my Urdu data in.
Please help me out to resolve this issue.
Thanks.
Create your table [countrybasic] with collation utf8mb4_unicode_ci and give the name column the same collation.
Now insert some sample data in different languages.
Get the data using a MySQLi query result.
Note: data that was saved under a different collation will not be fetched correctly after you change the collation.
I hope this will work.
You just have to change the charset to UTF8, and you can use these lines in PHP to do it:
$sSQL = 'SET CHARACTER SET utf8';
mysqli_query($your_db, $sSQL)
or die ('charset in DB didn\'t change');
I hope this helps :)
$CONNECTION = mysqli_connect($host,$user,$password,$database);
// Check connection
if (mysqli_connect_errno())
{
echo "Failed to connect to MySQL: " . mysqli_connect_error();
}
mysqli_query ($CONNECTION ,"set character_set_results='utf8'");
$queryutf8 = "select * from yourtable";
$res_utf8 = mysqli_query($CONNECTION ,$queryutf8 );
You need to change the default character set to utf8 in the WAMP server configuration.
Check the link below for more detail.
Arabic characters doesn't show in phpMyAdmin
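As a quick check that json_encode() itself is not what produces the ?????, here is a small sketch. By default json_encode() turns valid UTF-8 into \uXXXX escapes, which decode back to the original text, so question marks in the JSON suggest the characters were already lost between MySQL and PHP (which points at the connection charset). The sample row is an assumption standing in for a fetched record.

```php
<?php
// Sample value is an assumption standing in for a row fetched from the database.
$row = ['name' => 'پاکستان', 'capital' => 'اسلام آباد'];

// Default json_encode() output is pure ASCII: non-ASCII characters become \uXXXX escapes.
$json = json_encode(['server_response' => [$row]]);

// Decoding restores the original text, so the escapes lose nothing.
$decoded = json_decode($json, true);

var_dump($decoded['server_response'][0] === $row); // round trip is lossless
var_dump(strpos($json, '?') === false);            // no question marks were introduced
```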

MySQL and PHP: UTF-8 with Cyrillic characters [duplicate]

I'm trying to insert a Cyrillic value in the MySQL table, but there is a problem with encoding.
Php:
<?php
$servername = "localhost";
$username = "a";
$password = "b";
$dbname = "c";
$conn = new mysqli($servername, $username, $password, $dbname);
mysql_query("SET NAMES 'utf8';");
mysql_query("SET CHARACTER SET 'utf8';");
mysql_query("SET SESSION collation_connection = 'utf8_general_ci';");
if ($conn->connect_error) {
die("Connection failed: " . $conn->connect_error);
}
$sql = "UPDATE `c`.`mainp` SET `search` = 'test тест' WHERE `mainp`.`id` =1;";
if ($conn->query($sql) === TRUE) {
}
$conn->close();
?>
MySQL:
| id | search |
| 1 | test ав |
Note: PHP file is utf-8, database collation utf8_general_ci
You are mixing APIs here: mysql_* and mysqli_* don't mix. You should stick with mysqli_* (as it seems you mostly are anyway), since the mysql_* functions are deprecated and were removed entirely in PHP 7.
Your actual issue is a charset problem somewhere. Here are a few pointers that can help you get the right charset for your application. They cover most of the general problems one can face when developing a PHP/MySQL application.
ALL attributes throughout your application must be set to UTF-8
Save the document as UTF-8 w/o BOM (If you're using Notepad++, it's Format -> Convert to UTF-8 w/o BOM)
The header in both PHP and HTML should be set to UTF-8
HTML (inside <head></head> tags):
<meta charset="UTF-8">
PHP (at the top of your file, before any output):
header('Content-Type: text/html; charset=utf-8');
Upon connecting to the database, set the charset to UTF-8 for your connection-object, like this (directly after connecting)
mysqli_set_charset($conn, "utf8"); /* Procedural approach */
$conn->set_charset("utf8"); /* Object-oriented approach */
This is for mysqli_*, there are similar ones for mysql_* and PDO (see bottom of this answer).
Also make sure your database and tables are set to UTF-8, you can do that like this:
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
(Any data already stored won't be converted to the proper charset, so you'll need to do this with a clean database, or update the data after doing this if there are broken characters).
If you're using json_encode(), you might need to apply the JSON_UNESCAPED_UNICODE flag, otherwise it will convert non-ASCII characters to \uXXXX escape sequences.
Remember that EVERYTHING in your entire pipeline of code needs to be set to UTF-8, otherwise you might experience broken characters in your application.
In addition to this list, there may be functions that have a specific parameter for specifying a charset. The manual will tell you about this (an example is htmlspecialchars()).
There are also special functions for multibyte characters. For example, strtolower() won't lowercase multibyte characters; for that you'll have to use mb_strtolower().
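A few of the pointers above, sketched in code. The sample strings are made up for illustration; all functions and flags are PHP built-ins (mb_strtolower() requires the mbstring extension).

```php
<?php
$word = 'ТЕСТ';

// mb_strtolower() is multibyte-aware and takes the encoding explicitly.
echo mb_strtolower($word, 'UTF-8'), "\n";

// By default json_encode() escapes non-ASCII characters as \uXXXX.
echo json_encode(['search' => 'тест']), "\n";

// With JSON_UNESCAPED_UNICODE the characters are emitted as they are.
echo json_encode(['search' => 'тест'], JSON_UNESCAPED_UNICODE), "\n";

// htmlspecialchars() is one of the functions with an explicit charset argument.
echo htmlspecialchars('<b>тест</b>', ENT_QUOTES, 'UTF-8'), "\n";
```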
Note 1: Notice that in some places it's written utf-8 (with a dash) and in others utf8 (without it). It's important that you know when to use which, as they usually aren't interchangeable. For example, HTML and PHP want utf-8, but MySQL wants utf8.
Note 2: In MySQL, "charset" and "collation" are not the same thing, see Difference between Encoding and collation?. Both should be set to a UTF-8 variant though; generally the collation should be either utf8_general_ci or utf8_unicode_ci, see UTF-8: General? Bin? Unicode?.
Note 3: If you're using emojis, MySQL needs to be set to the utf8mb4 charset instead of the standard utf8, both in the database and the connection. HTML and PHP will just use UTF-8.
Setting UTF-8 with mysql_ and PDO
PDO: This is done in the DSN of your object. Note the charset attribute,
$pdo = new PDO("mysql:host=localhost;dbname=database;charset=utf8", "user", "pass");
mysql_: This is done very similar to mysqli_*, but it doesn't take the connection-object as the first argument.
mysql_set_charset('utf8');
Solution:
Replace mysql_query("SET NAMES 'utf8';"); with $mysqli->set_charset('utf8'); called on your mysqli connection object.

can't get UTF-8 names into a mysql database

I'm having a problem getting UTF-8 names written into a MySQL database... Here's what I have.
PHP page head has....
<meta charset="utf-8">
the MySQL column is: Char (80) with utf8_unicode_ci (these were originally latin1... I've changed them to UTF-8, truncated the database, then rerun the code)
The variable echoes to screen: Germán Mera
but it is written to the database as GermÃ¡n Mera
I tried putting utf8_encode(); around the variable, but then it is written to the database as GermÃƒÂ¡n Mera and shown on screen as GermÃ¡n Mera (I know that function only works on ISO-8859-1 input; I think the JSON page is already UTF-8)
Here is an excerpt of the code I am using to get the name (for sake of simplicity, I'm only showing relevant code - I know what's shown below is not secure)
$str = file_get_contents('http://fantasy.mlssoccer.com/web/api/elements/647/');
$jsonarray = json_decode($str, true);
$name = $jsonarray['web_name'];
mysqli_query ($con, "INSERT INTO mlsprices (name) VALUES ('$name')");
Any idea how I can get this to write to the database properly? When I search, I only get quite complicated answers (eg, this) and there's surely an easier way.
Try using SET NAMES 'UTF8' after connecting to MySQL:
$con=mysqli_connect("host", "user", "pw", "db");
if (!$con)
{
die('Failed to connect to mySQL: ' .mysqli_connect_errno());
}
/* change character set to utf8 */
if (!$con->set_charset("utf8")) {
printf("Error loading character set utf8: %s\n", $con->error);
}
As the manual says:
SET NAMES indicates what character set the client will use to send SQL
statements to the server... It also specifies the character set that the server should
use for sending results back to the client.
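To illustrate why the name gets garbled, here is a minimal sketch (requires the mbstring extension). It reproduces the effect by treating the UTF-8 bytes as ISO-8859-1, which is what utf8_encode() does and roughly what happens when a connection assumes latin1. The sample name is taken from the question.

```php
<?php
$name = 'Germán Mera'; // UTF-8 bytes, as delivered by the JSON API

// Misreading the UTF-8 bytes as ISO-8859-1 and re-encoding them as UTF-8
// turns the two bytes of "á" into two separate characters.
$garbled = mb_convert_encoding($name, 'UTF-8', 'ISO-8859-1');
echo $garbled, "\n"; // GermÃ¡n Mera

// Reversing the step recovers the original, which helps when diagnosing stored data.
$repaired = mb_convert_encoding($garbled, 'ISO-8859-1', 'UTF-8');
var_dump($repaired === $name);
```

With the connection charset set to utf8 as shown above, no such misreading takes place and the name is stored as sent.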

Converting html entities to utf-8 and inserting them into a mysql database

I am trying to convert a string from HTML-ENTITIES to UTF-8 and then save the encoded string in my database. The html entities are greek letters and look for example like this: νω
Now I tried thousands of different ways, starting from just using utf8_encode or html_entity_decode until now I came across the function mb_convert_encoding().
Now the really weird thing is that when converting my string and then outputting it, it is correctly encoded to utf-8, but when inserting this string into my database I end up getting something like: ξÏνω.
This is the code for the encoding:
header('Content-Type: text/html; charset=utf-8');
mb_internal_encoding('utf-8');
......
while($arr = $select->fetch_array(MYSQLI_ASSOC))
{
$text = $arr["greek"];
$result = mb_convert_encoding($text, 'UTF-8', 'HTML-ENTITIES');
$mysqli->query("UPDATE some SET greek = '".$result."'");
}
When I output my query and then run it manually in phpMyAdmin it works fine, so it doesn't seem to be a problem with my db. There must be some problem when transferring the encoded string to my database...
As you see in your script, you are instructing the browser to use UTF8. That is the first step.
However your database needs the same thing and also the encoding/collation on the tables need to be UTF8 too.
You can either recreate your tables using utf8_general_ci or utf8_unicode_ci as the collation, or convert the existing tables (see here)
You need to also make sure that your database connection i.e. php code to mysql is using UTF8. If you are using PDO there are plenty of articles that show how to do that. The simplest way is to do:
$mysqli->query('SET NAMES utf8');
NOTE The change you will make now is final. If you change the connection encoding to your database, you could affect existing data.
EDIT You can do the following to set the connection
$mysqli = new mysqli($host, $user, $pass, $db);
if (!$mysqli->set_charset("utf8")) {
die(sprintf("Error loading character set utf8: %s\n", $mysqli->error));
}
$mysqli->close();
Links of interest:
Whether to use "SET NAMES"
Execute the SET NAMES 'utf8' query prior to any others.

MySQL stores the common special characters, but not the rare ones

When I try to insert some rare special characters (∨ ∧ → ↔ ∴), they get stored as question marks. But when I try to insert some more common special characters (© ® ¬ á) everything goes fine.
I've set every variable, database, table and connection I could find to UTF-8, but no luck yet. What am I missing? Thanks in advance!
Here is a minimal example:
<?php
header('content-type:text/html; charset=utf-8;');
$connection = mysql_connect('localhost', 'root', '');
mysql_select_db('test', $connection);
mysql_set_charset('utf8', $connection);
mysql_query('SET NAMES UTF8');
$special_character = '∴';
echo 'Encoding of the special character before insert: '.base64_encode($special_character).'<br />';
mysql_query('INSERT INTO table (column) VALUES ("'.$special_character.'")');
$query = mysql_query('SELECT column FROM table');
while ($ROW = mysql_fetch_assoc($query))
{
echo 'Encoding of the special character after retrieval: '.base64_encode($ROW['column']).'<br />';
echo 'Output: '.$ROW['column'].'<br />';
}
mysql_close($connection);
?>
The output of this script is:
Encoding of the special character before insert: 4oi0
Encoding of the special character after retrieval: Pw==
Output: ?
Could it be because I have the standard MySQL installation for Mac OSX, which doesn't have a my.cnf file? Maybe the defaults that come with this installation are not UTF-8? Does anyone know?
UPDATE: I have determined that the problem is in my local installation of MySQL, because it does not appear when I run the code in my web host. I still want to solve it though.
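The base64 values in the script output can be decoded to see exactly which bytes went in and which came back. A small sketch, using only the two values printed above:

```php
<?php
// Values copied from the script output in the question.
$before = base64_decode('4oi0');
$after  = base64_decode('Pw==');

echo bin2hex($before), "\n"; // e288b4, the UTF-8 encoding of U+2234
echo bin2hex($after), "\n";  // 3f, an ASCII question mark

var_dump($before === '∴'); // the string was valid UTF-8 before the insert
var_dump($after === '?');  // a literal '?' came back, so the character was substituted
                           // somewhere between PHP and MySQL, not merely displayed wrongly
```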
I had a similar issue in the eclipse java console in the past. The information in the database in my case was stored properly. What happened is that the console is not meant to display those characters.
To see them I had my application generate a file with the special characters in them and opened them in notepad++ with which you can change the character set to apply to the file. I had this issue with Russian characters.
Whatever you store in the database must be contained in the character set that you use,
for example for UTF-8:
http://www.utf8-chartable.de/
You missed one. Your connection to mysql also has to be set to UTF8. After initially connecting to mysql, your first "query" should be "SET NAMES UTF8". If you are using a terminal program, run that "query" also.
It seems like the problem was in my local installation of MySQL. I was using an outdated version that came bundled with my Mac OSX 10.5. I downloaded and installed the more recent version of MySQL, and problem solved.
And that's why, boys and girls, it is always good to have the most recent version of the software you use.
