In districts table
I have a row as
district_id district_name country_id
15 Šahty 16
While selecting from php and displaying in browser,it shows like this :�ahty
I am using mssql 2005 with collation SQL_Latin1_General_CP1_CI_AS.
The problem is something like this
removing accent and special characters
but i need the solution in php.
UPDATE(?):
There is no support for UTF-8 in sqlserver.
https://dba.stackexchange.com/questions/7346/mssql-2005-2008-utf-8-collation-charset
Hi you need to consider correct HTML content type header
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
Data may be selected correctly, but browser may be can not displayed them as you expected.
You can play with this in firefox by Menu View -> Character encoding -> until you find correct one
In order to make special characters work, a general rule is that all the components must be on the same encoding. This means that database, database connection (very often forgotten 'SET NAMES {charset}' call after connecting to database) and web page Content-type have to be all in the same character set.
If you ask data from latin1 database and have database connection also has latin1, make sure the page you display values at is also latin1.
It's recommended though to use UTF-8 instead of latin1 everywhere, so if possible I recommending changing charsets and data in your database all to UTF-8, as it's more compatible all-around and easier to handle.
Just remove the UTF8 charset and let the browser select the charset it will set to ISO-8859-1 that will work with accents in sql server
Related
There are so many threads dedicated to this topic, that I feel silly having to ask this.
But, I'm at a total loss as to what the problem could be.
I am trying to insert special characters (cyrillic, scandinavian, etc) into a MySQL database, via PHP (html) form.
Characters like : Ä,Ö,Å, as well as russian alphabets, etc.
Based on previous threads in this forum, I have tried all the following (inserted right after the MySQL database-connection string) :
mysqli->set_charset("utf8");
This didn't work, so I tried the following :
mysqli_query("set names 'utf8'");
mysqli_query("set charset 'utf8'");
These are not recommended by PHP. But, I tried them anyway, but still no luck.
(All my databases, tables, and columns are collated as : UTF8_general_ci)
In addition, all my html forms have the following :
<meta charset="utf-8">
So, I'm at a complete loss as to what I'm doing wrong. Once the data is sent to the database, it shows up (in the database itself) as rubbish characters (question marks, and other hieroglyphics).
However, the funny thing is :
(a) When I view the data on my website, it displays correctly;
(b) When the data is sent within the body of an email, it also displays correctly
So..........why is it not displaying correctly within the database itself ??
When dealing with specific charset (like UTF-8), it's important that the entire line of code is set to the same charset. Below are a few pointers how to follow this.
ALL attributes must be set to ut8 (collation is NOT the same as charset in the database)
You should save the document itself as UTF-8 (If you're using Notepad++, it's Format -> Convert to UFT-8 (or UTF-8 w/o BOM), there's a difference - both or either may work for you)
The header in both PHP and HTML should be set to UTF-8:
HTML: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
PHP: header('Content-Type: text/html; charset=utf-8');
Upon connecting to the databse, set the charset ti UTF-8, like this:
$connection->set_charset("utf8"); (directly after connecting)
Also make sure your database and tables are set to UTF-8, you can do that by this query (in the database, need only be done once):
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
Remember that EVERYTHING needs to be set to UFT-8 charcode. If something can be set to UFT-8 (or another charset, check the PHP-docs (php.net)), it should be set to the same charset as everything else.
(a) When I view the data on my website, it displays correctly;
(b) When the data is sent within the body of an email, it also displays correctly
This means data is correctly stored in the db, when you get the output is the same like the input, logically correct?
The other question is: How are you looking into the database, which kind of client are you using?
PHPMyAdmin, SomeDesktop Client.. The problem will be there.. because the data is stored right.. seems so ;)
I know there are hundreds of questions about UTF-8 woes but I tried all the approaches I could find, none of them helped.
The facts:
I'm trying to read a string that contains a é from my MySQL database and display it on a PHP page. Actually, it does display as é (but the font does not recognize it as such and thus another default font is used). The troubles arose when I wanted to convert this string to a filename using PHP functions for string replacement. PHP does not recognize this as the é character at all.
Here's a quick rundown of what I'm doing:
1) The String is stored in a MySQL database. The MySQL server settings are:
MySQL connection collation utf8_unicode_ci
MySQL charset: UTF-8 Unicode (utf8)
The database itself is set to collation utf8_unicode_ci (MyISAM storage engine, not changeable due to shared server)
The actual table is set to collcation utf8_unicode_ci (InnoDB storage engine)
The é shows up correctly in phpMyAdmin. The data is inserted into the DB via a Java program but I have also tried this with manually entered data (entered in phpMyAdmin).
2) The PHP default_charset is not set (NO VALUE), I'm on a shared server and placing a manual override php.ini did not seem to work. Using ini_set("default_charset", 'utf-8'); works but has no effect on the problem I have.
3) Before I run the actual select query I query SET NAMES 'utf8'. The query itself is irrelevant but for testing I chose a simple SELECT title FROM items WHERE item_id = 1
4) The PHP file itself is encoded UTF-8. I have set the correct charset for the html with <meta http-equiv="content-type" content="text/html; charset=utf-8" />
5) To test the problem I used htmlentities on the returned string (Astérix), checking the source code it is converted to Astérix which is not correct of course. Accordingly, the string shows up as Astérix in the browser.
What possible reason could there be for this? To me it seems like I set everything that can be set to UTF-8.
http://php.net/manual/en/ref.mbstring.php - look at multibyte string functions.
I have a form in my page for users to leave a comment.
I'm currently using this charset:
meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"
but retrieveving the comment from DB accents are not displaying correct ( Ex. è =>è ).
Which parameters should i care about for a correct handling of accents?
SOLVED
changed meta tag to charset='utf-8'
changed character-set Mysql (ALTER TABLE comments CONVERT TO CHARACTER SET utf-8)
changed connection character-set both when inserting records and retrieving ($conn->query('SET NAMES utf8'))
Now accents are displaying correct
thanks
Luca
Character sets can be complicated and pain to debug when it comes to LAMP web applications. At each of the stages that one piece of software talks to another there's scope for incorrect charset translation or incorrect storage of data.
The places you need to look out for are:
- Between the browser and the web server (which you've listed already)
- Between PHP and the MySQL server
The character you've listed look like normal a European character that will be included in the ISO-8859-1 charset.
Things to check for:
even though you're specifying the character set in a meta header have a look in your browser to be sure which character set the browser is actually using. If you've specified it the browser should use that charset to render/view the page but in cases I've seen it attempting to auto-detect the correct charset and failing. Most browsers will have an "encoding" menu (perhaps under "view") that allows you to choose the charset. Ensure that it says ISO-8859-1 (Western European).
MySQL can happily support character set conversion if required to but in most cases you want to have your tables and client connection set to use the same encoding. When configured this way MySQL won't attempt to do any encoding conversion and will just write the data you input byte for byte into the table. When read it'll come out the same way byte for byte.
You've not said if you're reading data from the database back out with the same web-app or with some other client. I'd suggest you try to read it out with the same web application and using the same meta charset header (again, check the browser is really setting it) and see what is displayed in the browser.
To debug these issues requires you to be really sure about whether the client/console you're using is doing any conversion too, the safest way is sometimes to get the data into a hex editor where you can be sure that nothing else is messing around with any translation.
If it doesn't look like it's a browser-side problem please can you include the output of the following commands against your database:
Run from a connection that your web-app makes (not from some other MySQL client):
SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';
Run from any MySQL client:
SHOW CREATE TABLE myTable;
(where myTable is the table you're reading/writing data from/to)
The ISO-8859-1 character set is for Latin characters only. Try UTF-8, and make sure that the database these characters are coming from are also UTF-8 columns.
Whenever I put ñ in a mysql row... it $_POST's it as a strange triangular ? symbol...
Anyone know what the problem is?
First thing is to check if your mysql column is has the correct encoding (utf8 probably).
Then you may need to enable utf8 when connecting to mysql, at least that's what I must do in Perl.
This link might help http://dreweyscorner.blogspot.com/2008/01/enable-utf-8-on-php-mysql-and-apache.html
It's a mismatch between the encoding used to store the accented character in the database and the charset used by the page that displays it. The text was probably stored as ISO-8859-1 (Western European), but is being displayed as Unicode (UTF-8).
Make sure that the insert form and the display page both use the same encoding. These days, both should have the following tag in the of the page:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
Unless you're using an obsolete version of MySQL (such as 3.23), it should support UTF-8 encoding by default.
I've made a test program that is basically just a textarea that I can enter characters into and when I click submit the characters are written to a MySQL test table (using PHP).
The test table is collation is UTF-8.
The script works fine if I want to write a é or ú to the database it writes fine. But then if I add the following meta statement to the <head> area of my page:
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
...the characters start becoming scrambled.
My theory is that the server is imposing some encoding that works well, but when I add the UTF-8 directive it overrides this server encoding and that this UTF-* encoding doesn't include the characters such as é and ú.
But I thought that UTF-8 encoded all (bar Klingon etc) characters.
Basically my program works but I want to know why when I add the directive it doesn't.
I think I'm missing something.
Any help/teaching most appreciated.
Thanks in advance.
Firstly, PHP generally doesn't handle the Unicode character set or UTF-8 character encoding. With the exception of (careful use of) mb_... functions, it just treats strings as binary data.
Secondly, you need to tell the MySQL client library what character set / encoding you're working with. The 'SET NAMES' SQL command does the job, and different MySQL clients (mysql, mysqli etc..) provide access to it in different ways, e.g. http://www.php.net/manual/en/mysqli.set-charset.php
Your browser, and MySQL client, are probably both defaulting to latin1, and coincidentally matching. MySQL then knows to convert the latin1 binary data into UTF-8. When you set the browser charset/encoding to UTF-8, the MySQL client is interpreting that UTF-8 data as latin1, and incorrectly transcoding it.
So the solution is to set the MySQL client to a charset matching the input to PHP from the browser.
Note also that table collation isn't the same as table character set - collation refers to how strings are compared and sorted. Confusing stuff, hope this helps!