Display special characters from table in php page - php

I have a table which contains the names of teachers. Some names, like René Visser, have special characters. When I run the SQL query to display the name, the special characters are replaced by � symbols.
I have tried CAST() but it's not working properly. My query looks like this:
$qry = mssql_query("SELECT CAST(FirstName_1 AS NVARCHAR(250)) AS Name FROM tbl_teachers");
The FirstName_1 column is of type nvarchar. I have also tried casting FirstName_1 to VARBINARY(8000) and then casting the result to IMAGE, as follows:
CAST(CAST(FirstName_1 AS VARBINARY(8000)) AS IMAGE) AS Name

Make sure the data coming from SQL Server is UTF-8 encoded.
Then make sure you also send the encoding header from PHP, using:
header('Content-Type: text/html; charset=utf-8');

You will need to specify the charset or, if you already have, set it to Windows-1252. It's likely your page is reading the data as UTF-8, which explains the � symbols.
<head>
<meta charset="Windows-1252">
</head>

You most likely have a charset issue. Your query has little to do with this, you don't need to cast it.
You'll need to set the charset of the connection, the PHP and HTML header and the document itself as the same charset. UTF-8 will most likely cover all of the special characters you'll ever need.
Below are some things you could do.
ini_set('mssql.charset', 'UTF-8'); (run this when connecting to your database)
Set both the PHP and HTML headers to UTF-8:
PHP: header('Content-Type: text/html; charset=utf-8'); (has to be called before any output)
HTML: <meta charset="utf-8"> (has to be inside the <head> tag)
Save the document in UTF-8 encoding. If you're doing it in Notepad++, it's Format -> Convert to UTF-8 (you may also choose UTF-8 w/o BOM).
The database itself, and its tables, may need to be set to UTF-8. On MySQL this can be done with the queries below (they only need to be run once). They are MySQL syntax and will not run on SQL Server, where an nvarchar column such as FirstName_1 already stores Unicode:
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
Keep in mind that all parts of your application have to use the same charset, or you'll keep running into this kind of problem.
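To see where the � comes from, here is a small sketch. It assumes, hypothetically, that the driver hands PHP the name as Windows-1252 bytes while the page is declared as UTF-8. The sample value and variable names are illustrative and not taken from the question, and the mbstring extension is required.

```php
<?php
// Assumption: the driver returned "René Visser" as Windows-1252 bytes (é = 0xE9).
$fromDb = "Ren\xE9 Visser";

// 0xE9 followed by a space is not a valid UTF-8 sequence.
$isValidUtf8 = mb_check_encoding($fromDb, 'UTF-8');

// ENT_SUBSTITUTE swaps invalid sequences for U+FFFD, the same
// replacement character a browser draws on a UTF-8 page.
$asShown = htmlspecialchars($fromDb, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// Transcoding once, at the boundary, yields valid UTF-8.
$fixed = mb_convert_encoding($fromDb, 'UTF-8', 'Windows-1252');

echo $asShown, "\n", $fixed, "\n";
```

If the driver can be configured to return UTF-8 directly, that is usually the cleaner fix; the manual conversion is only a fallback.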

Related

Problems with ÆØÅ (Norwegian letters) when inserting into database (MySQLi)

I'm having problems when inserting Norwegian letters (æøå) into a database. The charset of my document (using Notepad++) is set to UTF-8. The variable that is being inserted has the proper characters, but once it is inserted, a word that should be shown as "spørsmål" is shown in the database as "spørsmÃ¥l".
I'm using the following code to insert:
$newContent = htmlspecialchars($_POST['newContent']);
$stmt = $mysqli->prepare("UPDATE info SET frontpageText=?");
$stmt->bind_param('s', $newContent);
$stmt->execute();
$stmt->close();
And when connecting to the database, I've tried using
$mysqli->set_charset("utf8");
$mysqli->query("SET NAMES 'utf8'");
I've also done the following:
The database collation is set to utf8_danish_ci
Using header('Content-type: text/html; charset=utf-8'); (PHP header)
Using meta charset="utf-8" (in the HTML-head)
The document itself is encoded in utf-8
Tried running SET NAMES utf8;
The worst part about this is that I actually had it working earlier, but I've apparently broken something (I don't know what).
Does anyone have an idea what could be done to fix this issue?
EDIT: Problem has been solved. Apparently the table wasn't properly set to UTF-8. Ran this code in phpMyAdmin
ALTER TABLE table_name CHARSET = 'utf8';
The database and each table have their own default character set, so one can be latin1 while the other is utf8.
A character set is a set of symbols and encodings. A collation is a set of rules for comparing characters in a character set.
To diagnose the encoding of a table this query can be run, where ##TABLENAME## should be the actual table name.
SHOW CREATE TABLE ##TABLENAME##
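If the encoding is not utf8, the table default can be changed with
ALTER TABLE table_name CHARSET = 'utf8';
This only changes the default used for new columns. Existing columns keep their character set unless they are converted with ALTER TABLE table_name CONVERT TO CHARACTER SET utf8.
Here's a thread on collation vs. character set: "What does character set and collation mean exactly?"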
You can convert all characters to html chars and save into database that way.
mb_convert_encoding($string, 'HTML-ENTITIES', 'utf-8');
All browsers will display it in the right way.
Make sure the default table charset or field charset is set to utf-8.
I know you've said that database charset is set correctly, but tables may interfere with that if they specify a different charset.
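The "Ã¥" symptom in the question is what UTF-8 bytes look like after they are read as latin1 and then encoded as UTF-8 a second time. Below is a minimal sketch of that round trip using mbstring. ISO-8859-1 stands in for MySQL's latin1 here, which is an assumption; MySQL's latin1 is closer to Windows-1252, but the two agree on the bytes involved.

```php
<?php
// "spørsmål" as UTF-8, written with escapes so the file encoding does not matter.
$word = "sp\u{F8}rsm\u{E5}l";

// Misread the UTF-8 bytes as ISO-8859-1 and encode them as UTF-8 again.
$garbled = mb_convert_encoding($word, 'UTF-8', 'ISO-8859-1');

// Undo exactly one layer of the wrong conversion.
$repaired = mb_convert_encoding($garbled, 'ISO-8859-1', 'UTF-8');

echo $garbled, "\n", $repaired, "\n";
```

Repairing stored rows this way is only safe when every row was garbled by the same single misconversion, so it is worth checking a few values by hand first.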

UTF8 mysql encoded database, but not showing special characters in PHP

I have a MySQL database with utf8_general_ci collation. When I insert data in Navicat or phpMyAdmin and run queries there, the special characters are returned fine.
But in a PHP variable they show up like this:
My database and collations:
In my HTML document, ordinary text shows the characters correctly; it only fails when the text comes from a PHP variable.
Please help.
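Try:
$variable = utf8_encode($valueCommingFromDB);
Note that utf8_encode() assumes ISO-8859-1 input and is deprecated as of PHP 8.2; mb_convert_encoding($valueCommingFromDB, 'UTF-8', 'ISO-8859-1') does the same conversion.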
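It's important to note that utf8 in MySQL is fairly limited: it stores at most three bytes per character, so it cannot hold characters outside the Basic Multilingual Plane, such as emoji. utf8_general_ci is also the less accurate collation. You might want to switch to utf8_unicode_ci or utf8mb4.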
What's the difference between utf8_general_ci and utf8_unicode_ci
http://mathiasbynens.be/notes/mysql-utf8mb4
Hi,
First, make sure you have this meta tag in the <head> of your HTML:
<meta charset="UTF-8">
You can also add this line in PHP, above your MySQL query:
header('Content-type: text/html; charset=utf-8');
If you store special characters in the database as HTML entities, such as &#039; for ', try this PHP function when you read them back:
htmlspecialchars_decode($your_text_from_db);

Strings containing non-ASCII characters are truncated by PHP/MySQL

I have a page with a translation function. My problem is that when I translate the page into French, the words are cut off because the page doesn't interpret them correctly. I checked posts related to my problem, but none of the suggestions work.
In my page, I have the following:
header ('Content-Type:text/html; charset=WINDOWS-1252'); -> This is just to force the encoding on start-up. I think it is optional, but I still use it.
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
The translations are fetched from a database table named labels. The labels table is InnoDB with utf8 -- UTF-8 Unicode as its default character set.
Characters after é are being cut off. Is there anything that I need to do to display the characters correctly? Thanks!
I don't see any point in using Unicode on the backend and a code page in the frontend of a multilingual application. You either use the same encoding throughout your project, or you manually convert back and forth between UTF-8 and windows-1252.
I don't think you have a problem with reading. The labels come truncated from the DB, otherwise your browser would display garbage characters. So this is not an issue with PHP/HTML, but with MySQL. In the case of èéàòì and the like, MySQL is certainly able to convert from UTF-8 to CP1252 (latin1). However, if this were not the case (as if we try to convert the same string from UTF-8 to CP1251), MySQL would show a question mark ?.
In your case I think it's an input problem, i.e. the labels are truncated in the DB. How is this possible? You may have UTF-8 in PHP and MySQL, but your browser sends windows-1252 strings when it submits a form from a page loaded with that charset. In your PHP script you should transcode this string to UTF-8 before inserting it into the DB, or connect to MySQL with SET NAMES 'CP1252'. Since you don't do so, you end up trying to insert a bunch of invalid UTF-8 bytes, so MySQL truncates the string at the first invalid byte and your labels come out cut off or empty. Attached is a test case. Here is the test table:
CREATE TABLE `test` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(128) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8;
Here is the PHP part. Note that this script is UTF-8 encoded, so every literal string appearing in it has the same encoding.
// This is a UTF-8 file, so my editor uses UTF-8 and thus each literal
// string is a UTF-8 string, since PHP only has binary strings.
$label = "Référence";
// Now let's translate this string as if it came from a browser submitting
// a form loaded from a cp1252 encoded page
$src = mb_convert_encoding($label, "CP1252", "UTF-8");
// But connect as if I were UTF-8
$db = new PDO('mysql:host=localhost;dbname=test;charset=utf8', 'test', 'test');
// Insert the string
$stmt = $db->prepare('INSERT INTO test (name) VALUES ( ? )');
$stmt->bindValue(1, $src);
$stmt->execute();
// Read it
header("content-type: text/plain; charset=windows-1252");
foreach ($db->query('SELECT * FROM test') as $row) {
    echo $row['name'] . "\n";
}
How do you recover? Either you connect to MySQL with the cp1252 charset and let MySQL translate for you, or you transcode the string in your script.
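The second option, transcoding in the script, might look like the sketch below. The helper name toUtf8 is hypothetical, and so is the assumption that any input which is not valid UTF-8 was submitted as Windows-1252. The mbstring extension is required.

```php
<?php
// Hypothetical helper: returns valid UTF-8, assuming anything that is
// not already valid UTF-8 was submitted as Windows-1252.
function toUtf8(string $input): string
{
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input; // already UTF-8; converting again would double-encode
    }
    return mb_convert_encoding($input, 'UTF-8', 'Windows-1252');
}

// "Référence" as a CP1252 form would send it (é = 0xE9).
$src = "R\xE9f\xE9rence";

// Everything before the first non-ASCII byte. In the truncation scenario
// described above, this prefix is all that would survive.
preg_match('/\A[\x00-\x7F]*/', $src, $m);
$asciiPrefix = $m[0];

$clean = toUtf8($src);
echo $asciiPrefix, "\n", $clean, "\n";
```

The validity check is a heuristic: a Windows-1252 string can, in rare cases, also happen to be valid UTF-8, so knowing the page's declared charset is more reliable than guessing from the bytes.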
After correctly getting the data in, you'll have to extract it and put it on an HTML page. This time you'll have the same problem, but reversed: showing a UTF-8 string in a CP1252 document. The bytes in the DB are unsuitable, because UTF-8 is a variable-length encoding, whilst in CP1252 a char is exactly 1 byte long. If you put these bytes directly into the page, the browser will show gibberish for the extra bytes. So, again, you either connect to the DB specifying the CP1252 charset, so that MySQL takes care of the conversion and gives you the right bytes, or you transcode the bytes yourself on the PHP side.
Or, better, do yourself a favor: use the same encoding everywhere. I suggest UTF-8 because today it is the right thing to do, but you can successfully opt for CP1252 because it can represent English and French chars (and saves some storage, but I don't consider this an issue).
My suggestion is to use the same encoding through the whole process. Use UTF-8 as the charset both in header and the meta tag.
It seems to me that your data is not stored correctly in the database. If you are working with mysqli, you can try setting the charset of the connection object before reading from or writing to the database.
// tells the mysqli connection to deliver UTF-8 encoded strings.
$db = new mysqli($dbHost, $dbUser, $dbPassword, $dbName);
$db->set_charset('utf8');
For other databases see UTF-8 for PHP and MySQL. It may be necessary to insert the French texts again (with this setting), because the existing texts could be invalid now.
Your linked example page is correctly encoded with UTF-8 (the file format), though your meta tag is a bit incorrect:
<!meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
The <! does not comment the tag out; you would have to write <!-- ... --> instead. The best option would be to declare the charset only once, as UTF-8, and remove the other meta tags.

Reading Unicode characters from MySQL with PHP

I've inherited a MySQL database which contains a field named Description of type text and collation of latin1_swedish_ci.
The problem with this field is it contains utf-8 data with some Unicode characters, e.g. character 733, etc. Sometimes this character also exists in the field represented as HTML encoded "&#733" as well.
I'm trying to read the table and export the data to a CSV file and I need to represent this character as a double quote.
Reading the HTML encoded character is easy enough. However, it appears that the actual Unicode character is converted to utf-8 before I can do anything with it resulting in a "?".
How do I read in the Unicode character 733 (U+02DD), recognize it and convert it?
Here's a simplified (not tested) version of the code.
<?php
$testconn = odbc_connect("TESTLIB", "......", "......");
$query = "SELECT Description FROM TestTable";
$rsWeb = mysql_query($query);
$WebRow = mysql_fetch_row($rsWeb);
$Desc = $WebRow[0];
$Desc = str_replace('"', '""', $Desc);
fwrite($output, "\"" . $Desc . "\",\r\n");
?>
Also set the charset to utf-8 when connecting to the MySQL server:
http://php.net/manual/en/mysqli.set-charset.php
$mysqli->set_charset("utf8");
I think your connection charset is not utf8, that's why chars are being converted to '?'.
Read this: http://dev.mysql.com/doc/refman/5.1/en/charset-connection.html
Post the result of this query:
show variables like 'char%';
You really should store only the non-entity (Unicode) version in the database, and entity-decode the rest. However, when you want to use UTF-8 with MySQL, there are a few things to remember:
Your table column's collation should be utf8_bin or similar.
Your table's collation and database collation should also be utf8_bin just in case.
Your connection charset should be UTF8. Do this by executing the "SET NAMES utf8" query.
Also, if you're outputting a HTML page, that should have the UTF8 charset as well. If everything is correct, the UTF8 characters should come out fine.
Good luck!
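Once the connection delivers UTF-8, recognizing character 733 becomes a plain string replacement. The sketch below is one possible approach, not the asker's code: the helper name descriptionToCsvField is hypothetical, it assumes the value reaches PHP as UTF-8, and it treats the entity form as "&#733;" with a trailing semicolon.

```php
<?php
// Hypothetical helper: turns one Description value into a quoted CSV field.
// Assumes $desc reaches PHP as UTF-8.
function descriptionToCsvField(string $desc): string
{
    // U+02DD (decimal 733) as a raw character and as a numeric entity.
    $desc = str_replace(["\u{02DD}", '&#733;'], '"', $desc);

    // CSV convention: double any quote inside a quoted field.
    return '"' . str_replace('"', '""', $desc) . '"';
}

echo descriptionToCsvField("Pipe 5\u{02DD} and 7&#733;"), "\r\n";
```

PHP's built-in fputcsv() applies the same quote-doubling rule, so it could replace the manual wrapping once the characters have been normalized.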

Persian words in the database and it is issue?

Why is the value of date stored in the database like this: شهريور ?
How can I search for this word (شهريور) in the database?
In the database: structure => date => varchar(255) => utf8_general_ci = "شهريور".
Your website uses an encoding in which these characters do not exist, so the browser sends HTML entities instead.
(Try this here: http://codepad.viper-7.com/dfFMvW ; This page is in ISO-8859-1, if you send non-ISO-8859-1 characters in the input, they are sent as HTML entities.)
To avoid this you have to use a different encoding, like UTF-8.
Add this header in your <head> tag:
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8" />
Or do this in your PHP before printing anything:
header('Content-Type: text/html; charset=UTF-8');
And make sure your database uses UTF-8 too.
You can convert your database to UTF-8 by doing this:
ALTER DATABASE your_database CHARACTER SET utf8;
-- for each table:
ALTER TABLE some_table CONVERT TO CHARACTER SET utf8;
And after you connect to the database, send this query:
SET NAMES UTF8;
You don't need to HTML-escape those characters as long as you have a UTF-8 table, and you do.
Simply make sure that the table is utf8, that the connection is utf8, and that the browser reads the text as UTF-8.
For MySQL, see SET CHARACTER SET, SET NAMES, SET COLLATION_CONNECTION.
For HTML, use <meta content="text/html;charset=UTF-8" http-equiv="Content-Type" /> and the corresponding HTTP headers, if needed.
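If rows were already saved as numeric HTML entities, as the first answer describes, they will not match a search for the real word until they are decoded. Below is a hedged sketch using html_entity_decode. The entity string is an illustrative reconstruction of what a browser might send for a six-letter Persian word; it is not data from the question.

```php
<?php
// Assumed stored value: decimal entities for six Arabic-script letters.
$stored = '&#1588;&#1607;&#1585;&#1610;&#1608;&#1585;';

// Decode the entities into real UTF-8 characters.
$decoded = html_entity_decode($stored, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $decoded, "\n";
```

After a one-time cleanup like this, and with the page and connection both on UTF-8, new rows should arrive as real characters and an ordinary WHERE or LIKE comparison can find them.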
