I am using JSON output for my application and store all data in my native language in a MySQL server with the utf8_general_ci collation.
When I fetch that data and pass it through json_encode I get the JSON array, but the text in my language comes out as question marks. How can I solve this?
The code I use to create the JSON data:
<?php
header('Content-type: text/html; charset=utf-8; pageEncoding="ISO-8859-1"');
include('include/config.php');
mysql_query("SET NAMES 'utf-8' 'ISO-8859-1'");
//mysql_query("SET CHARACTER SET utf8 ISO-8859-1");
$sth = mysql_query("select v.verse,b.book_name,v.chapter,v.verse_number from tbl_verses_mal v inner join tbl_books_mal b on v.book_id=b.book_id");
$rows = array();
while($r = mysql_fetch_assoc($sth)) {
$rows[] = $r;
}
print json_encode($rows);
?>
The output I get looks like this:
[{"verse":"???????? ?????? ???????? ?????? ??? ????????? . ?? ????????? ??? ??????? ????????????????? .","book_name":"Genesis","chapter":"1","verse_number":"1"},{"verse":"???? ??????? ?????????? ??????? : ???????????? ???? ????????????? .????????????? ??????? ??????","book_name":"Genesis","chapter":"1","verse_number":"2"},{"verse":"???????? ?????????? ????? ???? ?????????: ???????? ??????? ","book_name":"Genesis","chapter":"1","verse_number":"3"}]
The ????? marks stand in for the text in my language that is stored in the database.
The expected result looks like this:
[{"verse":"അയൽകാരന് ആവശ്യം വരുമ്പോൾ നിങ്ങൾ കടം കൊടുക്കു. .","book_name":"Genesis","chapter":"1","verse_number":"1"},{"verse":"ഭൂമി പാഴായും ശൂന്യമായും ഇരുന്നു : ആഴത്തിന്മീതെ","book_name":"Genesis","chapter":"1","verse_number":"2"},]
How can I solve this issue?
mysql_query("SET NAMES 'utf-8' 'ISO-8859-1'");
This makes no sense. Set the charset properly:
mysql_set_charset('utf8');
You need to set the character set properly. Try:
mysql_set_charset('utf8');
Also, the mysql_* functions are deprecated (and were removed in PHP 7); use PDO or MySQLi instead.
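As a rough sketch of the PDO route suggested above: the connection charset can be fixed in the DSN instead of with a separate SET NAMES query. The helper names (buildMysqlDsn, rowsToJson, fetchVersesJson) are made up for illustration, utf8mb4 is an assumption (the answers above use utf8), and JSON_THROW_ON_ERROR needs PHP 7.3 or later. The database function is defined but not called, since it needs a real server and credentials.

```php
<?php
// Hypothetical helper: build a PDO DSN that sets the connection charset up front.
function buildMysqlDsn(string $host, string $dbName, string $charset = 'utf8mb4'): string
{
    return "mysql:host={$host};dbname={$dbName};charset={$charset}";
}

// Encode rows as JSON, keeping non-ASCII text readable instead of \uXXXX escapes.
function rowsToJson(array $rows): string
{
    return json_encode($rows, JSON_UNESCAPED_UNICODE | JSON_THROW_ON_ERROR);
}

// Not called here: needs a reachable MySQL server and real credentials.
function fetchVersesJson(string $host, string $dbName, string $user, string $pass): string
{
    $pdo = new PDO(buildMysqlDsn($host, $dbName), $user, $pass, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
    $stmt = $pdo->query(
        'SELECT v.verse, b.book_name, v.chapter, v.verse_number
           FROM tbl_verses_mal v
           INNER JOIN tbl_books_mal b ON v.book_id = b.book_id'
    );
    return rowsToJson($stmt->fetchAll(PDO::FETCH_ASSOC));
}

// Offline check with one row shaped like the expected output in the question.
echo rowsToJson([['verse' => 'ഭൂമി', 'book_name' => 'Genesis']]), "\n";
```

When serving this as an API response, a header such as Content-Type: application/json; charset=utf-8 would be the natural companion to the JSON body.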
Related
I am using a MySQL database in which I store the data with the collation utf16le_general_ci. I did this to support my regional language (Gujarati) in the database, and I am able to store the data in that language successfully.
But when I use PHP (version 7+) to read the data and echo it in JSON format, I get this kind of result:
[
{
"content_id":"1",
"cat_id_FK":"1",
"content_data":"??????????? ? ????? ??????? ??."
},
{
"content_id":"2",
"cat_id_FK":"1",
"content_data":"???? ?? ??? ?? 80% ????? ???? ??? ? ??? ??, ???? ??? ?????? ?? ??????,?????????? ??? ?????\r\n??????? ?? ??? ??????? ???? ??? ??? ??."
},
{
"content_id":"3",
"cat_id_FK":"1",
"content_data":"?????????? ??????? ????????? ???? ???? ???? ? ???????????."
},
{
"content_id":"4",
"cat_id_FK":"1",
"content_data":"??????????? ??? ???? ?????? ?????? ???? ???? ????? ?? ???????? ?? ??? ??."
},
{
"content_id":"5",
"cat_id_FK":"1",
"content_data":"??? ????? ????? ???????? ?? ????? ????? ??? ??????? ? ??????????? ?? ????? ???? ????."
},
{
"content_id":"6",
"cat_id_FK":"1",
"content_data":"??????????? ???? ?? ??? ??? ??? ? ????? ???."
}
]
My Code :-
header('Content-Type:application/json;charset=utf-8');
//header('Content-Type:application/json;charset=utf16le_general_ci');
include 'init.php';
global $connect;
$query = "SELECT * FROM gsrahasyacontent WHERE cat_id_FK=1";
$queryResult = mysqli_query($connect, $query);
while ($row = mysqli_fetch_assoc($queryResult)) {
$array[] = $row;
}
echo json_encode($array, JSON_UNESCAPED_UNICODE);
//print json_encode($array, JSON_UNESCAPED_UNICODE);
-> I tried changing the charset in the header from utf-8 to utf16le_general_ci, but that didn't work.
-> I also used print instead of echo but still got the same result.
This is what my table looks like.
Try this right after connecting: mysqli_set_charset($connect, "utf8");
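To see why the connection charset matters, here is a small simulation in plain PHP. It is only an illustration: mbstring stands in for the conversion the server would do when a connection is left on a single-byte charset such as latin1 (often the default on older setups), and the sample string is invented. Characters the target charset cannot hold are replaced by "?", which is the pattern in the JSON above. The file is assumed to be saved as UTF-8.

```php
<?php
// Simulation only: mbstring stands in for the server-side conversion.
mb_substitute_character(0x3F); // use "?" for characters the target cannot hold

$stored = 'ગુજરાતી text'; // UTF-8 text as it might sit in the table (made-up sample)

// What a latin1 connection can carry: every Gujarati code point is lost.
$overLatin1 = mb_convert_encoding($stored, 'ISO-8859-1', 'UTF-8');

// Lossy version, then the intact version for comparison.
echo json_encode(['content_data' => $overLatin1]), "\n";
echo json_encode(['content_data' => $stored], JSON_UNESCAPED_UNICODE), "\n";
```

Once the text has been replaced by "?" on the way out, nothing on the PHP side (headers, json_encode flags) can bring it back, which is why the fix belongs on the connection.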
I have gone through many posts on the same topic but still not able to figure out what is the solution. If you think that an available solution will work for me then you can direct me there too.
Now, I want to display text in my regional language (Hindi) inside my app. This text is stored in a table on my website server MySQL database. I am retrieving this data along with many more columns through an SQL query run through a php script in the form of a JSON.
When I am testing this JSON on JSONLint and also by taking a log of the json string received by the Async task inside the app, all the fields having Varchar type and 'utf8_unicode_ci' collation are showing as below with question marks. Other fields types are coming correctly.
"?????? ??? ???? ?? ??????? ?? ????? ???? ?? 10 ?????"
Server details
phpMyAdmin Version 4.7.3
PHP version: 5.6.30
Database
Server type: MySQL
Server version: 5.6.38 - MySQL Community Server (GPL)
Protocol version: 10
Server charset: UTF-8 Unicode (utf8)
Below is my table in the database. The difference in utf8 in the image is because I was testing whether utf8mb4 works.
The database is MyISAM type and thus there is no 'nvarchar' field type.
Below is my php script. I have two tables, Table 1 contains non-localized fields, Table 2 has the localized translation fields. 'hi' is my language code. The above table image is of the TableTranslation.
<?php
$host='localhost';
$hostname='*******';
$password='*******';
$db='*******';
$sql="SELECT TablePost.PostID,TableTranslation.PostCategory,TablePost.PostImage,TableTranslation.PostTitle,TableTranslation.PostBrief,
TableTranslation.PostLink,TablePost.IsShort,TablePost.IsSeenBy
FROM TablePost LEFT JOIN TableTranslation ON TablePost.PostID = TableTranslation.PostID
WHERE TableTranslation.Language LIKE 'hi';";
$con=mysqli_connect($host,$hostname,$password,$db);
$result = mysqli_query($con,$sql);
$response = array();
while($row=mysqli_fetch_array($result)){
array_push($response, array(
"PostID" => $row[0],
"PostCategory" => $row[1],
"PostImage" => $row[2],
"PostTitle" => $row[3],
"PostBrief" => $row[4],
"PostLink" => $row[5],
"IsShort" => $row[6],
"IsSeenBy" => $row[7]));
}
echo json_encode(array("server_response"=>$response));
mysqli_close($con);
?>
Below is the JSON response I am getting.
{
"server_response": [{
"PostID": "1",
"PostCategory": "??????????",
"PostImage": "http:\/\/websitename.com\/image.png",
"PostTitle": "?????? ??? ???? ?? ??????? ?? ????? ???? ?? 10 ?????",
"PostBrief": "???? ?? ????????? ?? ???-??? ?????? ?? ??? ???????? ???? ???? ?? ?? ??? ????? ??? ??? ?????? ????? ???? ???? ??? ?? ?? ???? ?? ??????? ?? ????? ???? ?? ??? ???????? ????? ???? ????",
"PostLink": "http:\/\/websitename.com\/post_link\/",
"IsShort": "0",
"IsSeenBy": "98"
}]
}
How do I correct it? Will really appreciate any help here.
Add the line mysqli_set_charset($con, 'utf8'); after mysqli_connect(),
as in the code below:
<?php
$host='localhost';
$hostname='*******';
$password='*******';
$db='*******';
$sql="SELECT TablePost.PostID,TableTranslation.PostCategory,TablePost.PostImage,TableTranslation.PostTitle,TableTranslation.PostBrief,
TableTranslation.PostLink,TablePost.IsShort,TablePost.IsSeenBy
FROM TablePost LEFT JOIN TableTranslation ON TablePost.PostID = TableTranslation.PostID
WHERE TableTranslation.Language LIKE 'hi';";
$con=mysqli_connect($host,$hostname,$password,$db);
mysqli_set_charset( $con, 'utf8'); //change here.
$result = mysqli_query($con,$sql);
$response = array();
while($row=mysqli_fetch_array($result)){
array_push($response, array(
"PostID" => $row[0],
"PostCategory" => $row[1],
"PostImage" => $row[2],
"PostTitle" => $row[3],
"PostBrief" => $row[4],
"PostLink" => $row[5],
"IsShort" => $row[6],
"IsSeenBy" => $row[7]));
}
echo json_encode(array("server_response"=>$response));
mysqli_close($con);
?>
Also try setting the collation of PostTitle and PostBrief to utf8_general_ci.
I had a similar issue with Arabic and Hindi and solved it as described below. You can solve this issue on the MySQL side.
In the table structure, change the collation to utf8_bin:
"PostID" - `utf8_bin` instead of `latin`,
"PostCategory" - `utf8_bin` and so on.
Below is my code. I tried this solution, but it did not work. I also set the database collation to utf8_unicode_ci for Hindi and utf8_bin for Gujarati.
Please help me fetch data in Hindi and Gujarati.
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
<?php
session_start();
require_once('config.php');
$JSONoutput = array();
$q=mysqli_query($con,"SELECT * FROM tbl_Hindi");
header('Content-Type: text/html; charset=UTF-8');
while($rs=mysqli_fetch_assoc($q))
{
$JSONoutput['SMS'][] = array("ID"=>$rs['ID'],"Message"=>$rs['Message']);
}
print(json_encode($JSONoutput));
?>
Output:
{"SMS":[{"ID":"1","Message":"?? ????? ?? ??? ???? ??, ???? ?? ???? ??? ?? ????? ?? ?? ???? ?? ??????"},{"ID":"2","Message":"???? ????? : ??? ???? ?? ? ????? ????? : shopping ???? ?? ??? ???? ????? : : ???? ?? ???? ? ????? ????? : ???? ??? ??? ?? Gf ?? ? ???? ????? : ?? ?? ??? ??? ?? ? ????? ????? : ?? ???? "}]}
Change the character set and collation of the column in your SQL table. utf16_general_ci accepts all languages.
Try this query:
ALTER TABLE tbl_Hindi CHANGE Message Message VARCHAR( 50 ) CHARACTER SET utf16 COLLATE utf16_general_ci NOT NULL ;
See https://stackoverflow.com/a/38363567/1766831 -- In particular the discussion of "Question marks". It says
The bytes to be stored are not encoded as utf8/utf8mb4. Fix this.
The column in the database is CHARACTER SET utf8 (or utf8mb4). Fix this.
Also, check that the connection during reading is UTF-8.
The question marks are already in the database table; the original data is lost.
Do not use ucs2 or utf16; use only utf8 or utf8mb4.
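For the first point in that list (bytes that are not really UTF-8), one possible guard on the PHP side is to validate strings before they are sent to the database. This is only a sketch: assertUtf8 is a hypothetical helper, not part of any library, and the sample strings are invented.

```php
<?php
// Hypothetical guard: refuse bytes that are not valid UTF-8 before storing them.
function assertUtf8(string $value): string
{
    if (!mb_check_encoding($value, 'UTF-8')) {
        throw new InvalidArgumentException('Value is not valid UTF-8: ' . bin2hex($value));
    }
    return $value;
}

$good = 'नमस्ते';   // UTF-8 source text (file assumed to be saved as UTF-8)
$bad  = "caf\xE9"; // "café" encoded as latin1: a lone 0xE9 byte

var_dump(mb_check_encoding($good, 'UTF-8')); // bool(true)
var_dump(mb_check_encoding($bad, 'UTF-8'));  // bool(false)
```

A check like this only catches invalid byte sequences; it cannot tell whether question marks were already stored in place of the original text.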
So after a whole day of googling and debugging I end up here.
MySQL
set to the following encoding:
db: utf8_general_ci
table: utf8_general_ci
column: utf8_general_ci, TEXT
I put in some euro symbols and some other weird characters
acentuação €€€€€
PHP (codeigniter)
config
$config['charset'] = 'UTF-8';
dsn
char_set=utf8,dbcollat=utf8_general_ci
I made some queries to compare
model
$query = $this->db->query("SET NAMES latin1");
$query = $this->db->query("SELECT shortdesc,HEX(shortdesc) FROM `contracttypes` WHERE id = 4");
$ret['latin1'] = $query->row();
$query = $this->db->query("SET NAMES utf8");
$query = $this->db->query("SELECT shortdesc,HEX(shortdesc) FROM `contracttypes` WHERE id = 4");
$ret['utf8'] = $query->row();
return $ret;
controller
public function utfhell() {
var_dump($this->campagne_model->utfhell());
}
This outputs
array (size=2)
'latin1' =>
object(stdClass)[34]
public 'shortdesc' => string 'acentua��o �����' (length=16)
public 'HEX(shortdesc)' => string '6163656E747561C3A7C3A36F20E282ACE282ACE282ACE282ACE282AC' (length=56)
'utf8' =>
object(stdClass)[33]
public 'shortdesc' => string 'acentuação €€€€€' (length=28)
public 'HEX(shortdesc)' => string '6163656E747561C3A7C3A36F20E282ACE282ACE282ACE282ACE282AC' (length=56)
So far so good, on to a
view
<?php header('Content-Type: text/html; charset="utf-8"', true); ?>
<!doctype html>
<html>
<head>
<title>UTFhell</title>
<link rel="stylesheet" href="../assets/css/style.css"/>
<meta charset="utf-8">
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
...
<?php
echo 'Original : ', $campagne_info->contractName->shortdesc."<br />";
echo 'UTF8 Encode : ', utf8_encode($campagne_info->contractName->shortdesc)."<br />";
echo 'UTF8 Decode : ', utf8_decode($campagne_info->contractName->shortdesc)."<br />";
echo 'TRANSLIT : ', iconv("ISO-8859-1", "UTF-8//TRANSLIT", $campagne_info->contractName->shortdesc)."<br />";
echo 'IGNORE TRANSLIT : ', iconv("ISO-8859-1", "UTF-8//IGNORE//TRANSLIT", $campagne_info->contractName->shortdesc)."<br />";
echo 'IGNORE : ', iconv("ISO-8859-1", "UTF-8//IGNORE", $campagne_info->contractName->shortdesc)."<br />";
echo 'Plain: ', iconv("ISO-8859-1", "UTF-8", $campagne_info->contractName->shortdesc)."<br />";
echo '€€€€€€€€€€<br>';
?>
None of these now show me a normal euro symbol except the final echo statement; they all give me question-mark diamonds for the euro symbols.
The HEX is the utf8 encoding for that string. So the data is in the table 'correctly'.
The black diamond (�) is the browser's way of saying wtf. It comes from having latin1 characters, but telling the browser to display utf8 characters.
You could tell the browser to display "Western", that is avoiding the underlying problems.
Remember, the goal is to really use utf8.
Sometimes this occurs together with Question Marks, in which case you must start over.
The cause (probably):
1. The bytes you had were encoded latin1. You acquired them from somewhere: a file dump, online input, etc.
2. The connection parameters said latin1.
3. The column/table is declared to be CHARACTER SET utf8, so during INSERT, they were correctly converted.
4. When SELECTing, the setting in step 2 was again latin1, so they were converted back to latin1.
5. When displaying text in a web page, the page's header said that the bytes were utf8.
Solution, Plan A: (Sloppy, but probably workable)
Change #5 to say the appropriate equivalent of latin1.
Solution, Plan B:
1. Fix the source to be utf8-encoded.
2. query("SET NAMES utf8") (unless there is a way to set it at connect time).
3. Leave the table/column at CHARACTER SET utf8.
4. Step 2 covers this.
5. Leave <meta ... UTF-*>.
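The round trip in the probable cause can be reproduced in plain PHP without a database, as a rough illustration. MySQL's latin1 is close to Windows-1252, so mbstring's Windows-1252 is used here as a stand-in for the connection-side conversion; that substitution is an assumption of this sketch. The sample string and the byte lengths (28 and 16) are the ones from the question's var_dump output, and the file is assumed to be saved as UTF-8.

```php
<?php
$utf8 = 'acentuação €€€€€';

// Same bytes as the HEX(shortdesc) column in the question's output.
echo strtoupper(bin2hex($utf8)), "\n";

// Steps 2 and 4: a latin1 connection hands back single-byte text.
// Windows-1252 is a stand-in for MySQL's latin1 here.
$asLatin1 = mb_convert_encoding($utf8, 'Windows-1252', 'UTF-8');

// Byte lengths: 28 for the UTF-8 string, 16 for the single-byte version.
echo strlen($utf8), ' ', strlen($asLatin1), "\n";

// Step 5: the page claims UTF-8, but these bytes are not valid UTF-8,
// so the browser falls back to the black diamond.
var_dump(mb_check_encoding($asLatin1, 'UTF-8')); // bool(false)
```

This matches the question's observation that the HEX output is identical under both SET NAMES settings: the stored bytes never change, only what the connection delivers.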
I am trying to get an Access DB converted into MySQL. Everything works perfectly, except for one big monkey wrench... If the Access DB has any non-standard characters, it won't work. My query will tell me:
Incorrect string value: '\xE9d'
If I directly echo out the row's text that has the 'invalid' character, I get a question mark in a black square in my browser (so é would turn into that invalid symbol on echo).
NOTE: That same form will accept, save and display the "é" fine in a textbox that is used to title this db upload. Also, if I 'save as' the page and re-open it, the 'é' is displayed correctly....
Here is how I connect:
$conn = new PDO("odbc:Driver={Microsoft Access Driver (*.mdb)};Dbq=$fileLocation;SystemDB=$securefilePath;Uid=developer;Pwd=pass;charset=utf;");
I have tried numerous things, including:
$conn -> exec("set names utf8");
When I try a 'CurrentDb.CollatingOrder' in access it tells me 1033 apparently that is dbSortGeneral for "English, German, French, and Portuguese collating order".
What is wrong? It is almost like the PDO is sending me a collation my browser and PHP does not fully understand.
The Problem
When using native PHP ODBC features (PDO_ODBC or the older odbc_ functions) and the Access ODBC driver, text is not UTF-8 encoded, even though it is stored in the Access database as Unicode characters. So, for a sample table named "Teams"
Team
-----------------------
Boston Bruins
Canadiens de Montréal
Федерация хоккея России
the code
<?php
header('Content-Type: text/html; charset=utf-8');
?>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Access character test</title>
</head>
<body>
<?php
$connStr =
'odbc:' .
'Driver={Microsoft Access Driver (*.mdb)};' .
'Dbq=C:\\Users\\Public\\__SO\\28311687.mdb;' .
'Uid=Admin;';
$db = new PDO($connStr);
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$sql = "SELECT Team FROM Teams";
foreach ($db->query($sql) as $row) {
$s = $row["Team"];
echo $s . "<br/>\n";
}
?>
</body>
</html>
displays this in the browser
Boston Bruins
Canadiens de Montr�al
????????? ?????? ??????
The Easy but Incomplete Fixes
The text returned by Access ODBC actually matches the Windows-1252 character encoding for the characters in that character set, so simply changing the line
$s = $row["Team"];
to
$s = utf8_encode($row["Team"]);
will allow the second entry to be displayed correctly
Boston Bruins
Canadiens de Montréal
????????? ?????? ??????
but the utf8_encode() function converts from ISO-8859-1, not Windows-1252, so some characters (notably the Euro symbol '€') will disappear. A better solution would be to use
$s = mb_convert_encoding($row["Team"], "UTF-8", "Windows-1252");
but that still wouldn't solve the problem with the third entry in our sample table.
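The difference between the two conversions can be checked on the Euro byte alone. In this sketch mb_convert_encoding with 'ISO-8859-1' as the source stands in for utf8_encode(), which performs the same ISO-8859-1 to UTF-8 mapping but is deprecated as of PHP 8.2.

```php
<?php
$euroByte = "\x80"; // the Euro sign as a single Windows-1252 byte

// Read as ISO-8859-1 (what utf8_encode() does), 0x80 becomes U+0080,
// an invisible control character, so the Euro sign seems to vanish.
$viaLatin1 = mb_convert_encoding($euroByte, 'UTF-8', 'ISO-8859-1');

// Read as Windows-1252, the same byte becomes the real Euro sign U+20AC.
$viaCp1252 = mb_convert_encoding($euroByte, 'UTF-8', 'Windows-1252');

echo bin2hex($viaLatin1), "\n"; // c280
echo bin2hex($viaCp1252), "\n"; // e282ac

// "é" (0xE9) sits at the same position in both charsets, so it survives either way.
echo bin2hex(mb_convert_encoding("\xE9", 'UTF-8', 'ISO-8859-1')), "\n"; // c3a9
```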
The Complete Fix
For full UTF-8 support we need to use COM with ADODB Connection and Recordset objects like so
<?php
header('Content-Type: text/html; charset=utf-8');
?>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Access character test</title>
</head>
<body>
<?php
$connStr =
'Driver={Microsoft Access Driver (*.mdb)};' .
'Dbq=C:\\Users\\Public\\__SO\\28311687.mdb';
$con = new COM("ADODB.Connection", NULL, CP_UTF8); // specify UTF-8 code page
$con->Open($connStr);
$rst = new COM("ADODB.Recordset");
$sql = "SELECT Team FROM Teams";
$rst->Open($sql, $con, 3, 3); // adOpenStatic, adLockOptimistic
while (!$rst->EOF) {
$s = $rst->Fields("Team");
echo $s . "<br/>\n";
$rst->MoveNext();
}
$rst->Close();
$con->Close();
?>
</body>
</html>
Here is a variant that makes the data a bit easier to manipulate: it collects the rows into a two-dimensional array.
function consulta($sql) {
$db_path = $_SERVER["DOCUMENT_ROOT"] . '/database/Registros.accdb';
$conn = new COM('ADODB.Connection', NULL, CP_UTF8) or exit('Falha ao iniciar o ADO (objeto COM).');
$conn->Open("Persist Security Info=False;Provider=Microsoft.ACE.OLEDB.12.0;Jet OLEDB:Database Password=ifpb#10510211298;Data Source=$db_path");
$rs = $conn->Execute($sql);
$numRegistos = $rs->Fields->Count;
$resultados = array(); // make sure an array is returned even when the query yields no rows
$index = 0;
while (!$rs->EOF){
for ($n = 0; $n < $numRegistos; $n++) {
if(is_null($rs->Fields[$n]->Value)) continue;
$resultados[$index][$rs->Fields[$n]->Name] = $rs->Fields[$n]->Value;
echo '.';
}
echo '<br>';
$index = $index + 1;
$rs->MoveNext();
}
$conn->Close();
return $resultados;
}
$dados = consulta("select * from campus");
var_dump($dados);
I found the following solution. Admittedly, I have not had the opportunity to test it with PHP, but I expect it to work.
For the native PHP ODBC features (PDO_ODBC or the older odbc_ functions) and the Access ODBC driver to correctly read text that is stored in the Access database as Unicode characters, you need to enable "Beta: Use Unicode UTF-8 for worldwide language support" in the Region settings of the Windows operating system.
After I did this on my machine, many programs that use the standard MS Access ODBC driver began to display Unicode text correctly.
All Settings -> Time & Language -> Language -> "Administrative Language Settings"