I am trying to display kannada words in mozilla browser on Ubuntu 12.04 lts connecting through the MySql.
I have used collation utf-8 general ci and used header('Content-type:text/html; charset=utf-32'); php code in php.
When I tried to retrieve the words from database and display it on the firefox browser it is displaying as question marks...
Please help.
The character encoding that you declare in a header must be the same as the actual encoding. It seems that these differ radically (UTF-32 vs. possibly UTF-8). Find out the actual encoding and declare it.
Don’t use UTF-32 on web pages. Firefox was the last major browser that supported it, and the support was removed in 2011.
Related
I am developing a php project which is in HTML5. Following is the meta used for all pages in my website.
<meta charset="utf-8">
I am coding in windows machine using NetBeans. I was not really aware of encoding of the files. Since the code was working fine, i was not giving importance for this.
However, based on some of the questions in stackoverflow, I could understand more about encoding. I noticed that many php/js/css files of my project are saved in UTF-8 encoding whereas some php/js/css files are saved in ANSI encoding. (to understand this, i opened the file in notepad, clicked on save as and checked the default encoding shown).
It seems the files in which I pasted some of the unicode characters were autosaved in UTF-8 and all other files were saved in ANSI encoding (I guess it might be Windows-1252). All this happened even though I set project preference as UTF-8 in netbeans.
Is it required to save those files (files which does not use unicode) also to UTF-8 as my html meta says UTF-8? (Note that there are no issues when I tested my website, but my testing was from a windows machine)
I am also curious to know, how the browser render the web page correctly though some of the php files are saved in ANSI but served with meta UTF-8.
(to understand this, i opened the file in notepad, clicked on save as and checked the default encoding shown).
This isn't an accurate way of checking the encoding of a file.
Files which contain only ASCII characters -- like most CSS and Javascript source files! -- are valid in most text encodings. Notepad will call them "ANSI" because that's its default, but they're also perfectly valid as UTF-8. No conversion is necessary.
I have a wordpress site hosted on Heroku using nginx, hhvm and mysql.
I was on WP version 4.1 and then upgraded to 4.4.2. Now I see that some of my posts have weird characters (black diamonds with questions marks that you can see below).
Also, when i log in to the wp-admin and try to edit a post, the post editor is completely blank (both visual and text modes), and the wordcount shows 0. However the actual post content is still displayed on the live page and I see in the DB that the posts still exist.
*I've checked the console and this is not because of a javascript error or any sort of jQuery mismatch
I believe this is some sort of character encoding issue, because when I change the encoding settings in chrome to Western ISO-8859-15 the black diamonds go away. The post editor is still broken though.
I've tried changing the character encoding on the database and on the individual tables, but that did not fix the problem. I wonder if this is related to some internal HHVM setting for character encoding. Have not been able to find anything there to change though.
I am building an Arabic website using PHP. It is probably an encoding error, but spaces in the Arabic language turns into capital "i"'s for some reason. I have included UTF-8 enconding in the website's main CSS, but the error still exists.
Note: This only happens when using Chrome on Windows OS.
After thorough research, it appeared that it wasn't a Charset issue. I changed the fonts around in some CSS tags and the "i" appeared in some fonts and disappeared in others. So, I chose one that worked. Thanks for everyone's help.
For the last years I used Notepad++ on Win XP SP2.
As I just have seen, the setting in Notepad++ is to encode new files in "ANSI" in "Windows Format". Basically all files on my harddisk should be ANSI files then, but I'm not sure.
Most .html-files have a charset-tag as "text/html; charset=iso-8859-1", but some have none.
Other files, especially text-files (for example keyword-lists) I stored with Firefox XPCOM-system, I don't know how they are currently encoded.
On Server-side I have Apache with PHP and MySql.
For Upload I used Filezilla.
Now the problem is: I want to use Japanes signs (or arabic, etc.). This only works partly.
I can get my selfmade Firefox-Application to constantly write or read UTF-8. But I can't check everytime which of the old files is which encoding.
Having just read Joel Spolsky's old article about UTF-8 strengthens my view that I simply have to get my whole system changed as much as possible to UTF-8.
As long as I have it running that way locally on my Hard-Disk I could just re-upload everything to the server.
So: How do I get all my files locally transfered to UTF-8?
And: Is it possible at all to have Win XP SP2 using constantly UTF-8 everywhere? Or do I have to check it with every program, or even worse with every file, that the right encoding is to be used.
How about files I get for example in E-Mails or via an USB-stick, or that I download in zip-files? (Or a thousand possibilities more.)
Update:
1.-4. went OK so far. I tried first with BOM, but without seems to be better.
So to 5.) Something I have to change there too. I changed as in 3.) the charset in the html-template-file, and the text coming from the template is displayed correctly. But the text coming from MySql/Php shows the UnknownChar-sign at some places currently, i.e. where there should be Umlaute äöü.
I have changed all collations for text fields in the MySql-Database via phpmyadmin to "utf8_unicode_ci", but that didn't do the trick.
Is it a php-issue, or do I only have to convert somehow the data in the MySql-Database once?
The beauty of UTF-8 is that it's a superset to ASCII, so if your html and php files only contain Latin alphabets (i.e. English and programing/HTML syntax), you don't need to convert the file at all. You can leave most of your file unchanged.
Should you find few exceptions that you want to convert it manually, you may open them up in Notepad++, and do 'Encoding' - 'Convert to UTF-8 (No BOM)'.
Yes, you do need to change/add <meta> charset tag to all the HTML files to make sure the browser render your files in UTF-8.
In Notepad++ you could set the new file to always open with 'UTF-8 (No BOM), Unix'. Also, check the tick on "Apply to ANSI files" so old file can be correctly saved to the new encoding. I suggest the format is because even though you are working on a Windows machine, the web servers usually runs Linux/BSD so the format is the native form (keeping files in native form is important especially when you are using a version control system).
Migrate a live site with database is a different issue. Data in MySQL comes with their own encoding, and from your question I cannot tell if you need to do it and how to do it. Need more specifics on that (if you need to).
I am running a fairly standard LAMP stack.
The problem is an intermittent rendering of UTF-8 characters correctly. About 50% of the time the non-ASCII UTF-8 characters render correctly (e.g. with appropriate diacritical marks), but about 50% of the time I get the '?' rendition instead. If I reload the page, sometimes it corrects the problem and sometimes it does not. It happens with all browsers on all platforms, which suggests a MYSQL or Apache problem but I have not been able to figure it out.
The data base itself is in UTF-8 format and I have never seen the problem while browsing the database in phpMyAdmin.
I issue a SET NAMES utf-8 command upon opening the data base (and have tried changing that to a SET CHARSET utf-8 command) with no luck.
What's confusing me is that it is intermittent, happening in streaks, e.g. it will happen on 30 pages in a row (even if they are just reloads), and then clear up for 10 pages, and then happen again for a few pages, etc.
You can try to see the problem by hitting the 'list' button here: http://latin-words.com/list_vocab.php though it may take many reloads to either make it happen or make it go away
Server Configuration:
Ubuntu: 9.10
Mysql: 5.1.37
PHP 5.2.10
Apache 2.2.12
Any hints would be greatly appreciated?
edit:
For searchers sake, from the comments, the problem was actually an issue doing a SET NAMES utf-8; (incorrect) instead of an SET NAMES utf8; (correct) That doesn't mean my much more obscure reason posted below cannot also be the reason ;)
Sounds like a problem with locales & iconv, try to determine what locale is used in the webserver process the moment all is well, and the moment it doesn't work anymore (try $currentlocale = setlocale(LC_ALL,NULL); or $currentlocale = setlocale(LC_CTYPE ,NULL); to get the used locale).