Wrong UTF-8 encoding when exporting MySQL database - PHP

I have always seen special characters in phpMyAdmin encoded like this:
میثم ابراهیمیجØعشق ØŒ سر ب راه ØŒ نبض ØŒ شبهای ØŒ غنچه ها ØŒ شکوه ØŒ همین امروز ØŒ عادت ت'
I always thought that was just a phpMyAdmin problem, since my application displayed all of them correctly.
Now I'm exporting this database, and in the MySQL dump I see exactly the characters above, so it seems they are stored this way in the database.
Is there an easy way to dump the UTF-8 characters correctly?
I already tried to follow the suggestions in: I need help fixing Broken UTF8 encoding
but the only thing that makes the characters display properly is printing them on a web page and adding this at the top:
<html>
<head>
<meta charset="UTF-8">
</head>
<body>
The collation of the fields is utf8_unicode_ci.
The connection uses:
$options = array(\PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES utf8',);
Edit: @Álvaro González - attempt to retrieve hexadecimal values
I entered the symbol € in the database, then from phpMyAdmin I ran: SELECT name, HEX(name) AS hex_name FROM `items`
that's the result:
name €
hex_name C3A2E2809AC2AC
I will provide any further information on request
Thanks

You wanted €? But the hex is C3A2E2809AC2AC. That is a case of "double encoding" that happened when you stored the data.
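As a quick check (a sketch; it assumes the PHP file itself is saved as UTF-8), re-encoding the euro sign's UTF-8 bytes as if they were Windows-1252 text reproduces exactly the hex you got:
<?php
// "€" is E2 82 AC in UTF-8; read those bytes as Windows-1252 and encode
// them to UTF-8 again and you get the stored byte sequence.
echo bin2hex(mb_convert_encoding("€", 'UTF-8', 'Windows-1252')); // c3a2e2809ac2ac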
See Trouble with utf8 characters; what I see is not what I stored, especially the discussion about "double encoding".
The data is broken. You may be able to fix the data with a messy UPDATE. See http://mysql.rjweb.org/doc.php/charcoll#example_of_double_encoding .
Edit
Your original stuff looks like mojibake; converting it back:
CONVERT(BINARY(CONVERT('میثم اب' USING latin1)) USING utf8mb4)  -- yields میثم اب (mojibake back to utf8)
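A hedged sketch of what such a repair UPDATE could look like for the items.name column from the question (test on a copy first; the right conversion depends on exactly how the data was broken, and a WHERE clause should limit it to the affected rows):
-- Reinterpret the stored mojibake bytes as latin1, then as real utf8mb4
UPDATE `items`
SET name = CONVERT(BINARY(CONVERT(name USING latin1)) USING utf8mb4);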

Related

Insert Russian characters into MySQL

I'm using a MySQL database and PHP to insert Russian characters into a table.
I'm using:
$conn->set_charset('utf-8');
in my .php page to set the charset to utf-8, but when I try to print the DB charset with:
echo "set name:".$conn->character_set_name();
it shows
set name:latin1
I've set my Table to:
utf8mb4_unicode_ci
but nothing changes.
When I print the text passed from the AJAX request, I can see it is written correctly.
What should I do?
I guess you aren't checking the return value of mysqli::set_charset(). It must be returning false because utf-8 is not a valid encoding name in MySQL; the correct name is utf8 (no dash). Or, even better, utf8mb4.
You can get a list of supported encodings with:
SHOW COLLATION;
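A minimal sketch of the corrected call (connection credentials are placeholders):
<?php
// Use the MySQL name "utf8mb4" and check the return value.
$conn = new mysqli('localhost', 'user', 'pass', 'db');
if (!$conn->set_charset('utf8mb4')) {
    die('set_charset failed: ' . $conn->error);
}
echo "set name:" . $conn->character_set_name(); // utf8mb4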

JSON created from PHP giving wrong data?

I have a PHP form that I use to enter data into the database (phpMyAdmin), and I use a SELECT query to display all the values from the database in that PHP form.
I also have another PHP file which I use to create JSON from the same DB table.
When I enter foreign-language text like "Experiența personală:", the value saved in the DB is "ExperienÈ›a personală: ", but when I use the SELECT query to display it in the same PHP form it comes out correctly as "Experiența personală:". So the DB seems correct, and I'm using the following PHP code to create the JSON:
<?php
$servername = "localhost";
$username = "root";
$password = "root";
$dbname = "aaps";
// Create connection
$con=mysqli_connect($servername,$username,$password,$dbname);
// Check connection
mysqli_set_charset($con, 'utf8');
//echo "connected";
$rslt=mysqli_query($con,"SELECT * FROM offers");
while ($row = mysqli_fetch_assoc($rslt)) {
    $taxi[] = array('code' => $row["code"], 'name' => $row["name"], 'contact' => $row["contact"], 'url' => $row["url"], 'details' => $row["details"]);
}
header("Content-type: application/json; charset=utf-8");
echo json_encode($taxi);
?>
and JSON looks like
[{"code":"CT1","name":"Experien\u00c8\u203aa personal\u00c4\u0192: ","contact":"4535623643","url":"images\/offers\/event-logo-8.jpg","details":"Experien\u00c8\u203aa personal\u00c4\u0192: jerhbehwgrh 234234 hjfhjerg#$%$#%#4"},{"code":"ewrw","name":"Experien\u00c8\u203aa personal\u00c4\u0192: ","contact":"ewfew","url":"","details":"eExperien\u00c8\u203aa personal\u00c4\u0192: Experien\u00c8\u203aa personal\u00c4\u0192: Experien\u00c8\u203aa personal\u00c4\u0192: "},{"code":"Experien\u00c8\u203aa personal\u00c4\u0192: ","name":"Experien\u00c8\u203aa personal\u00c4\u0192: ","contact":"","url":"","details":"Experien\u00c8\u203aa personal\u00c4\u0192: "}]
Here "\u00c8\u203a" is wrong; it is supposed to be "\u021b" (ț).
So the PHP used to create the JSON is causing this issue.
But I am unable to find out exactly why it comes out like this. Please help.
To avoid the \u Unicode escapes, note the extra argument:
json_encode($s, JSON_UNESCAPED_UNICODE)
Don't use utf8_encode/decode.
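A minimal sketch of the difference (the sample value is only an illustration):
<?php
// Without the flag, non-ASCII characters are \u-escaped; with it they pass through as UTF-8.
echo json_encode(['name' => 'Experiența personală']);
// {"name":"Experien\u021ba personal\u0103"}
echo json_encode(['name' => 'Experiența personală'], JSON_UNESCAPED_UNICODE);
// {"name":"Experiența personală"}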
ă turning into Äƒ is Mojibake. It probably means that:
The bytes you have in the client are correctly encoded in utf8 (good).
You connected with SET NAMES latin1 (or set_charset('latin1') or ...), probably by default. (It should have been utf8.)
The column in the tables may or may not have been CHARACTER SET utf8, but it should have been that.
If you need to fix the data, it takes a "2-step ALTER", something like:
ALTER TABLE Tbl MODIFY COLUMN col VARBINARY(...) ...;
ALTER TABLE Tbl MODIFY COLUMN col VARCHAR(...) ... CHARACTER SET utf8 ...;
Before making any changes, do
SELECT col, HEX(col) FROM tbl WHERE ...
With that, ă should show hex of C483. If you see C384C692, you have "double-encoding", which is messier to fix.
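For illustration, with the offers table from your code and a hypothetical VARCHAR(255) name column (the length and collation here are assumptions), the two steps could look like:
ALTER TABLE offers MODIFY COLUMN name VARBINARY(255);
ALTER TABLE offers MODIFY COLUMN name VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;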
Depending on the MySQL version, the database may not be using the full UTF-8 character set, as stated in the documentation:
The ucs2 and utf8 character sets do not support supplementary characters that lie outside the BMP. Characters outside the BMP compare as REPLACEMENT CHARACTER and convert to '?' when converted to a Unicode character set.
This, however, is not likely to be related to your problem. I would try a couple of different things and see if it solves your problem.
use SET NAMES utf-8
You can read more about that here
use utf8_encode() when inserting data to the database, and utf8_decode() when extracting. That way, you don't have to worry about MySql manipulating the unicode characters. Documentation

XOR encode a multibyte string and save to MySQL field without loss

I'm currently using this function to obfuscate the field values in MySQL a bit and protect them from direct dumping. It all works well and values are stored correctly, but what happens when I try to store a multibyte string?
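(The xorencode() helper itself isn't shown here; a typical minimal version, given as an assumption rather than the original code, XORs each byte against a repeating key, so applying it twice restores the input:)
<?php
// Assumed sketch of xorencode(): per-byte XOR with a repeating key.
// $key is a placeholder; the real function presumably has its own secret.
function xorencode($str, $key = 'secret')
{
    $out = '';
    for ($i = 0, $len = strlen($str); $i < $len; $i++) {
        $out .= $str[$i] ^ $key[$i % strlen($key)];
    }
    return $out;
}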
Here's an example, let's try to encode the string álex:
<?php
$v = xorencode('álex');
// step 1 - encode
echo $v."\n";
// step 2 - decode
echo xorencode($v);
?>
This works well: I see an obfuscated string the first time, and then I see álex again. But if I save it in a VARCHAR field in a MySQL table and then select it, I no longer get a UTF-8 string; instead it comes back as gllex.
Note: the MySQL table and field collations are utf8_general_ci, the files are UTF-8, and I SET NAMES utf8 after connecting. Any workaround for this?
Thanks

Oracle connection to retrieve or insert Arabic values from database

I have this code in Drupal 6 to retrieve Arabic values from an Oracle database:
<?php
session_start();
$conn=oci_connect('localhost','pass','IP....');
$stid=oci_parse($conn,"select arabic_name from arabic_names_table");
oci_execute($stid);
if ($row = oci_fetch_array($stid, OCI_ASSOC + OCI_RETURN_NULLS))
{
    $name_ar = $row['arabic_name'];
}
?>
When values are retrieved from or inserted into the DB, they appear like this: ???
Please note:
My Oracle database handles normal Arabic characters; from PL/SQL I can insert Arabic values.
I have installed the mbstring extension.
I have the utf-8 encoding enabled.
How can I solve this problem?
When you fetch data from the Oracle database, the character encoding you get is normally the encoding of the Oracle client installed on the system (the machine where PHP is installed). That encoding is the charset configured in the Windows registry for the Oracle client (see HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\KEY_OraClient11g_home1), under the key NLS_LANG. If you look up the value of that key, you will get something like ARABIC_UNITED ARAB EMIRATES.AR8MSWIN1256. Note that the encoding type is AR8MSWIN1256; in the character map array this is mapped to windows-1256 (windows-1256 => AR8MSWIN1256).
See this link http://websvn.projects.ez.no/wsvn/ezoracle/?op=comp&compare[]=%2Fstable#385&compare[]=%2Fstable#386.
That is, after you fetch the data from the database, its character encoding will be windows-1256. If your web page uses the utf-8 charset, you need to convert the string to utf-8; for this you can use iconv():
$win1256 = iconv('windows-1256', 'utf-8', $my_string); // $my_string is in windows-1256
echo $win1256; // outputs the utf-8 version
If you are still facing problems, check the charset of the page; it must be utf-8:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
I think this will solve your problem.
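Putting the fetch and the conversion together, a sketch could look like this (connection details are placeholders; it assumes the client's NLS_LANG yields windows-1256 and that oci_fetch_array() returns column keys in upper case):
<?php
$conn = oci_connect('user', 'pass', 'localhost/XE'); // placeholder credentials
$stid = oci_parse($conn, "select arabic_name from arabic_names_table");
oci_execute($stid);
while ($row = oci_fetch_array($stid, OCI_ASSOC + OCI_RETURN_NULLS)) {
    // Convert from the client charset (windows-1256) to utf-8 before output
    echo iconv('windows-1256', 'utf-8', $row['ARABIC_NAME']);
}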

PDO utf8_encoding my text string twice in INSERT?

Relevant code:
$status = $db->run(
"INSERT INTO user_wall (accountID, fromID, text, datetime) VALUES (:toID, :fromID, :text, '" . time() . "')",
array(":toID" => $toID, ":fromID" => %accountID, ":text" => $text)
);
I take input text from JavaScript and send it in an AJAX call, which ends up calling a function that includes these lines of code.
The text string in question is: "Türkçe Türkçe Türkçe!"
Upon investigating the database, the following value is saved "Türkçe Türkçe Türkçe!", which is double utf8_encode'd.
When viewing the text by SELECTing it from the database, I get "Türkçe Türkçe Türkçe!", which is how they should be saved in the database in the first place.
As far as I know, I am not encoding the data as it is being prepared by PDO...
Encoding is a b*tch. You need to make sure it is as you expect it to be in several places:
The HTML page with the Javascript. Set it to utf-8 with a meta tag like:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
The connection to your database. Execute the query set names 'utf8' after connecting to the database (and before any other queries).
Your database field. In MySQL it's called a collation; set it to utf8_general_ci (ci stands for case-insensitive).
If you have these three, your data should always be, and stay, UTF-8 (unless you're doing encoding yourself).
For good measure, make sure your source code files are utf-8 as well. Windows typically defaults to iso.
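As an additional sketch (not part of the answer above, and the DSN values are placeholders): with PDO the connection charset can also be set directly in the DSN, which avoids forgetting the set names query:
<?php
$db = new PDO(
    'mysql:host=localhost;dbname=test;charset=utf8mb4', // charset in the DSN
    'user',
    'pass',
    array(PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION)
);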
