Convert latin1 to UTF8

Convert latin1 to UTF8 - php

I have a DB - with the table articles.
I want to convert the title, and content field to utf8
now - all data looks like this: ×¤×•×¨×˜×œ ×¨×¢×œ × ×¤×ª×— ×¨×©×ž×™×ª!
I want it to become normal hebrew characters.
Thanks

The following MySQL function will return the correct utf8 string after double-encoding:
CONVERT(CAST(CONVERT(field USING latin1) AS BINARY) USING utf8)
It can be used with an UPDATE statement to correct the fields:
UPDATE tablename SET field = CONVERT(CAST(CONVERT(field USING latin1) AS BINARY) USING utf8);

if you need to convert the whole database , you can back it as databaseback.sql file then form your command line
iconv -f latain -t utf-8 < databaseback.sql > databaseback.utf8.sql
you can use the http://www.php.net/manual/en/function.iconv.php
to convert each row in php in case you don't have command line access
and lastly don't forget to convert the collation of each field in phpmyadmin , then you can resotre the utf8 back easily
update
if you got iconv is not recognized , it means that you don't have iconv installed
much more easier solution is :
Migrating MySQL Data to Unicode
http://daveyshafik.com/archives/166-migrating-mysql-data-to-unicode.html

You can make mysqldump from this database. Then download something like Notepad++, open dump file, convert it to UTF8, then replace through the file all encodings to utf-8 including the first SET NAMES operator.
If you make dump to file via phpMyAdmin (with default settings) use output file encoding ISO-8859-1 instead of UTF-8 as you can see by default.

You can write a little php script which does the conversion. See http://www.php.net/manual/en/function.mb-detect-encoding.php and http://php.net/manual/en/function.mb-convert-encoding.php This is how I did this.
And remember to use strict mode! http://www.php.net/manual/en/function.mb-detect-encoding.php#102510
In pseudocode it would be sth. like this:
str = getDataAsString()
if(!isUTF8(str)) {
str = convert2UTF8(str)
}
saveStr2DB()

try
ALTER TABLE `tablename` CHANGE `field_name` `field_name` VARCHAR( 200 ) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL

Related

how to convert raw utf8mb4_unicode_ci like X'313735313136' to a regular string (utf8) in php

This question is about converting a raw utf8mb4_unicode_ci from a .sql to a normal utf-8
My .sql was from a database that had COLLATE utf8mb4_unicode_ci
like this :
(5558,'2019-11-18 18:28:36',X'313735313136','am',''),
in PHP how would I convert this X'313735313136' to a normal utf8 string ?
I tried utf8_decode("\xE18885E18C8D"); but the result is ?8885E18C8D ...

Regardless of the collation of the database, the exported file you are looking at has decoded that and has provided a binhex encoding. I'm not sure what a reasonable value is for that column, but
php -r "print hex2bin('313735313136');"
returns 175116

Insert Russian characters mysql

I'm using mysql database and php to insert russian characters into a table.
I'm using:
$conn->set_charset('utf-8');
into my .php page to set charset to utf-8 but, when I try to print the DB charset with:
echo "set name:".$conn->character_set_name();
it shows
set name:latin1
I've set my Table to:
utf8mb4_unicode_ci
but nothing change.
Printing the passed text from the ajax request, I can see the text written correctly.
What should I do?

I guess you aren't checking the return value of mysqli::set_charset(). It must be returning false because utf-8 is not a valid encoding name in MySQL; the correct name is utf8 (no dash). Or, even better, utf8mb4.
You can get a list of supported encodings with:
SHOW COLLATION;

charset issue with accents retrieving DB2 values with php over odbc

I'm trying to do a select from a DB2 through PHP and odbc and then save those values on a file. The OS where the code is being executed is Debian. What I do is the following:
$query = "SELECT NAME FROM DATABASE_EXAMPLE.TABLE_EXAMPLE";
$result = odbc_prepare($server, $query);
$success = odbc_execute($result);
$linias = "";
if ($success) {
while ($myRow = odbc_fetch_array($result)) {
$linias .=format_word($myRow['NAME'], 30) . "\r\n";
}
generate_file($linias);
function format_word($paraula, $longitut) {
return str_pad(utf8_encode($paraula), $longitut, " ", STR_PAD_LEFT);
}
function generate_file($linias) {
$nom_fitxer = date('YmdGis');
file_put_contents($nom_fitxer . ".tmp", $linias);
rename($nom_fitxer . '.tmp', $nom_fitxer . '.itf');
}
The problem is that some of the retrieved values contains spanish letters and accents. To make and example, one of the values is "ÁNGULO". If I var_dump the code on my browser I get the word fine, but when it's write into the file it apends weird characters on it (that's why I think there is a problem with the charset). I have tried different workarounds but it just make it worst. The file opened with Notepad++ (with UTF8 encoding enabled) looks like:
Is there a function in PHP that translate between charsets?
Edit
Following erg instructions I do further research:
The DB2 database use IBM284 charset, as I found executing the next command:
select table_schema, table_name, column_name, character_set_name from SYSIBM.COLUMNS
Firefox says the page is encoded as Unicode.
If i do:
var_dump(mb_detect_encoding($paraula));
I get bool(false) as a result.
I have changed my function for formating the word hoping that iconv resolve the conflict:
function format_word($paraula, $longitut) {
$paraula : mb_convert_encoding($paraula, 'UTF-8');
$paraula= iconv("IBM284", "UTF-8", $paraula);
return $paraula;
}
But it doesn't. Seems like the ODBC it's doing some codification bad and that is what mess the data. How can I modify the odbc to codificate to the right charset? I have seen some on Linux changing the locale, but if I execute the command locale on the PC I get:
LC_NAME="es_ES.UTF-8"
LC_ADDRESS="es_ES.UTF-8"
...

I will try to summarize from the comments into an answer:
First note that PHPs utf8_encode will convert from ISO-8859-1 to utf-8. If your database / ODBC-Driver does not return ISO-8859-1 encoded strings, PHPs utf8_encode will fail or return garbage.
The easiest solution should be to let the database / driver convert the values to correct encoding, using its CAST function: https://www.ibm.com/support/knowledgecenter/SSEPEK_11.0.0/sqlref/src/tpc/db2z_castspecification.html
Try to alter your query to let DB2 convert everything to UTF-8 directly and omit the utf8_encode call. This can be done by altering you query to something like:
SELECT CAST(NAME AS VARCHAR(255) CCSID 1208) FROM DATABASE_EXAMPLE.TABLE_EXAMPLE
Thanks to Sergei for the note about CCSID 1208 on IBM PUA. I changed CCSID UNICODE to CCSID 1208.
I do not have DB2 at hand here, so the above query is untested. I'm not sure if this will return utf-8 or utf-16..

mysql & UTF8 Issue with arabic

this might look like a similar issues for utf8 and Arabic language with MySQL database but i searched for result and found none..
my database endocing is set to utf8_general_ci ,
i had my php paging to be encoded as ansi by default
the arabic language in database shows as : ãÌÑÈ
but i changed it to utf8 ,
if i add new input to database , the arabic language in database shows as : Ø²ÙŠÙ†
i dont care how it show indatabase as long as it shows normally in php page ,
after changing the php page to utf8 , when adding input than retriving it , if show result as it should .
but the old data which was added before converting the page encoding to uft8 show like this : �����
i tried a lot of methods for fixis this like using iconv in ssh and php , utf8_decode() utf8_encode() .. and more but none worked .
so i was hoping that you have a solution for me here ?
update :: Main goal was solved by retrieving data from php page in old encoding ' windows-1256' than update it from ssh .
but one issue left ::
i have some text that was inserted as 'windows-1256' and other that was inserted as 'utf-8' so now the windows encoding was converted to utf-8 and works fine , but the original utf-8 was converted as well to something unreadable , using iconv in php, with old page encoding ..
so is there a way to check what encoding is original in order to convert or not ?

Try run query set name utf8 after create a DB connection, before run any other query.
Such as :
$dbh = new PDO('mysql:dbname='.DB_NAME.';host='.DB_HOST, DB_USER, DB_PASSWORD);
$dbh->exec('set names utf8');

How can I insert Hebrew text into columns? [SQL 2000]

I tried to insert Hebrew a text value into a column,
But it changes the value to Gibberish.
An example of that:
mssql_query ("UPDATE TABLE SET COLUMON = N'בדיקה'");
As you can assume, It changes the value of the column, But the value changed to ????? and if I try to do it from Query Analyser it works fine.
My column's collation is HEBREW_CI_AS. How can I fix this?

You need to specify collation preperty for the string in the INSERT statement you are using. Also the string you are inserting should be of UNICODE datatype - use N prefix for that.
INSERT INTO MEMB_INFO (User, Pass, Name) VALUES ('Joni', '123456', N'גוני דף' COLLATE HEBREW_CI_AS)
Check that PHP variable can handle unicode characters. Otherwise it will be PHP that turns your string into question marks.
You may check out SQL Server drivers for PHP.
And Unicode Character Properties from PHP doicumentation.
Some resources on PHP and unicode:
http://www.sitepoint.com/bringing-unicode-to-php-with-portable-utf8/
http://php.net/manual/en/function.utf8-encode.php
http://allseeing-i.com/How-to-setup-your-PHP-site-to-use-UTF8
http://www.yiiframework.com/wiki/16/how-to-set-up-unicode/
http://pageconfig.com/post/portable-utf8

I solve this problem if someone else has this problem here is my way to fix that:
Create a new database for this specific table or else tables for your web.
Set Hebrew_CI_AS as collation (everyone to what he created).
In your PHP code use mb_convert_encoding() function for SELECT and INSERT.

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.

Convert latin1 to UTF8 - php

I have a DB - with the table articles. I want to convert the title, and content field to utf8 now - all data looks like this: ×¤×•×¨×˜×œ ×¨×¢×œ × ×¤×ª×— ×¨×©×ž×™×ª! I want it to become normal hebrew characters. Thanks

try ALTER TABLE `tablename` CHANGE `field_name` `field_name` VARCHAR( 200 ) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL

Related

how to convert raw utf8mb4_unicode_ci like X'313735313136' to a regular string (utf8) in php

Insert Russian characters mysql

charset issue with accents retrieving DB2 values with php over odbc

mysql & UTF8 Issue with arabic

How can I insert Hebrew text into columns? [SQL 2000]

Categories

Resources