MySQL table sets empty spaces to Â despite collation of utf8_general_ci - php

Â is being inserted from textareas with an empty space at the beginning into a MySQL table, despite the database and table being set to the utf8_general_ci collation.
I have <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> in the head of the page doing the inserts. I haven't experienced this problem with other databases/tables on the same MySQL installation, and these are set to the default collation of latin1_swedish_ci.
I can resolve the issue with mysql_query("SET NAMES 'utf8'"); after the MySQL connection is established, however I'd like to know why this happens despite setting the table to utf-8 and having the charset to utf-8 in the page.

You need to properly initialize the connection with utf8 unless you can set a default somehow, which means SET NAMES is mandatory.
If you're using some 1990s style mysql_query based application that opens and closes connections at random points in your code base, without any proper framework, you might have a hard time tracking these all down.

Your PHP/HTML files should be saved with UTF-8 encoding, and your MySQL database and tables should use a UTF-8 character set and collation as well. You can also run test inserts directly from your PHP file, without submitted HTML form data, to see whether it's a PHP problem or an HTML/form/browser problem.

I didn't notice it before, but the textareas contained &nbsp;. Removing that fixes the issue, and I don't need to use mysql_query("SET NAMES 'utf8'");
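Assuming the stray character came from a non-breaking space (U+00A0), which is a guess based on the fix described above, this minimal sketch shows how such a space can turn into "Â" plus a space when its UTF-8 bytes are read as Latin-1. It relies on PHP's mbstring extension, and the variable names are illustrative only.

```php
<?php
// A non-breaking space (U+00A0) is two bytes in UTF-8: 0xC2 0xA0.
$nbsp = "\u{00A0}";
echo bin2hex($nbsp), PHP_EOL;

// Reading those two bytes as Latin-1 (what a latin1 connection would assume)
// yields two characters: "Â" (0xC2) followed by a non-breaking space (0xA0).
// Re-encoding them to UTF-8 makes the result printable on a UTF-8 terminal.
$misread = mb_convert_encoding($nbsp, 'UTF-8', 'ISO-8859-1');
echo $misread, PHP_EOL;
```

This would also be consistent with SET NAMES 'utf8' making the symptom disappear, since the connection then stops treating the bytes as Latin-1.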

Related

Characters getting encoded to �

I am using PHP + MySQL to make a dynamic page. My DB has “Make, which is rendered as �Make in the web page. I thought it was an encoding issue, so I tried using <html lang='en' dir='ltr'> & <meta charset="utf-8" />, but that didn't help either.
When dealing with any charset, it's important that you set everything to the same one. You mentioned having set both PHP and HTML headers to UTF-8, which often does the trick, but it's also important that the database connection, the actual database and its tables are encoded with UTF-8 as well.
Connection
You also need to specify the charset in the connection itself.
PDO (specified in the object itself):
$handler = new PDO('mysql:host=localhost;dbname=database;charset=utf8', 'username', 'password', array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET CHARACTER SET UTF8"));
MySQLi: (placed directly after creating the connection)
For OOP: $mysqli->set_charset("utf8");
For procedural: mysqli_set_charset($mysqli, "utf8");
(where $mysqli is the MySQLi connection)
MySQL (deprecated; the mysql_* functions were removed in PHP 7, so you should convert to PDO or MySQLi): (placed directly after creating the connection)
mysql_set_charset("utf8");
Database and tables
Your database and all its tables have to be set to UTF-8. Note that charset is not exactly the same as collation (see this post).
You can do that by running the queries below once for each database and table (for example in phpMyAdmin):
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
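When there are many tables, the two statements above can be generated rather than typed by hand. The following is only a sketch: the function name buildUtf8AlterStatements and the table names are hypothetical, and the output is meant to be reviewed before it is pasted into phpMyAdmin or another client.

```php
<?php
/**
 * Build the ALTER statements for one database and a list of tables.
 * Identifiers are wrapped in backticks; embedded backticks are doubled.
 */
function buildUtf8AlterStatements(string $database, array $tables): array
{
    $quote = static function (string $name): string {
        return '`' . str_replace('`', '``', $name) . '`';
    };

    $statements = [
        'ALTER DATABASE ' . $quote($database)
            . ' CHARACTER SET utf8 COLLATE utf8_unicode_ci;',
    ];
    foreach ($tables as $table) {
        $statements[] = 'ALTER TABLE ' . $quote($table)
            . ' CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;';
    }
    return $statements;
}

// Hypothetical names, for illustration only.
$statements = buildUtf8AlterStatements('databasename', ['users', 'posts']);
echo implode(PHP_EOL, $statements), PHP_EOL;
```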
File-encoding
It might also be needed for the file itself to be UTF-8 encoded. If you're using Notepad++ to write your code, this can be done in the "Format" drop-down on the taskbar (you should use Convert to..., as this won't mess your current file up) - but any decent IDE would have a similar option. You should use UTF-8 w/o BOM (see this StackOverflow question).
Other
It may be that you already have values in your database that are not encoded with UTF-8. Updating them manually could be a pain and could consume a lot of time. Should this be the case, you could use something like ForceUTF8 and loop through your databases, updating the fields with that function.
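As a rough illustration of that kind of clean-up, here is a minimal sketch that uses PHP's mbstring extension instead of the ForceUTF8 library. The helper name ensureUtf8 is hypothetical, and the code assumes that any value which is not valid UTF-8 is Latin-1, which may not be true for your data.

```php
<?php
/**
 * Return $value as UTF-8. Valid UTF-8 passes through unchanged; anything
 * else is ASSUMED to be Latin-1 (ISO-8859-1) and is converted.
 */
function ensureUtf8(string $value): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
}

$legacy  = "\xE4";      // "ä" as a single Latin-1 byte
$already = "\u{00E4}";  // "ä" already encoded as UTF-8 (two bytes)

echo bin2hex(ensureUtf8($legacy)), PHP_EOL;
echo bin2hex(ensureUtf8($already)), PHP_EOL;
```

A check like this cannot detect values that were double-encoded, because those are valid UTF-8 already, so it is worth testing on a copy of the data first.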
Should you follow all of the pointers above, chances are your problem will be solved. If not, you can take a look at this StackOverflow post: UTF-8 all the way through.
If the � is in your database column itself, change the original character to the following:
http://www.w3schools.com/charsets/ref_html_ansi.asp

PHP <==> MySQL; storing Cyrillic / Scandinavian characters in the database

There are so many threads dedicated to this topic, that I feel silly having to ask this.
But, I'm at a total loss as to what the problem could be.
I am trying to insert special characters (cyrillic, scandinavian, etc) into a MySQL database, via PHP (html) form.
Characters like : Ä,Ö,Å, as well as russian alphabets, etc.
Based on previous threads in this forum, I have tried all the following (inserted right after the MySQL database-connection string) :
mysqli->set_charset("utf8");
This didn't work, so I tried the following :
mysqli_query("set names 'utf8'");
mysqli_query("set charset 'utf8'");
These are not recommended by PHP. But, I tried them anyway, but still no luck.
(All my databases, tables, and columns are collated as : UTF8_general_ci)
In addition, all my html forms have the following :
<meta charset="utf-8">
So, I'm at a complete loss as to what I'm doing wrong. Once the data is sent to the database, it shows up (in the database itself) as rubbish characters (question marks, and other hieroglyphics).
However, the funny thing is :
(a) When I view the data on my website, it displays correctly;
(b) When the data is sent within the body of an email, it also displays correctly
So..........why is it not displaying correctly within the database itself ??
When dealing with a specific charset (like UTF-8), it's important that every link in the chain is set to the same charset. Below are a few pointers on how to do this.
ALL attributes must be set to utf8 (collation is NOT the same as charset in the database)
You should save the document itself as UTF-8 (if you're using Notepad++, it's Format -> Convert to UTF-8 (or UTF-8 w/o BOM); there's a difference, and both or either may work for you)
The header in both PHP and HTML should be set to UTF-8:
HTML: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
PHP: header('Content-Type: text/html; charset=utf-8');
Upon connecting to the database, set the charset to UTF-8, like this:
$connection->set_charset("utf8"); (directly after connecting)
Also make sure your database and tables are set to UTF-8. You can do that with this query (in the database; it need only be done once):
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
Remember that EVERYTHING needs to be set to the UTF-8 charset. If something can be set to UTF-8 (or another charset; check the PHP docs at php.net), it should be set to the same charset as everything else.
(a) When I view the data on my website, it displays correctly;
(b) When the data is sent within the body of an email, it also displays correctly
If the output you get back is the same as the input, that suggests the data is stored correctly in the DB.
The other question is: how are you looking into the database, and which kind of client are you using?
phpMyAdmin, some desktop client? The problem is likely there, because the data seems to be stored correctly ;)
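A matching round trip is also what you would see if the connection charset was never actually applied. This sketch simulates that case without a database, as one possible explanation only: it assumes the connection stayed on Latin-1 while the column is utf8, and it uses mbstring's ISO-8859-1 as a stand-in for MySQL's latin1.

```php
<?php
$typed = "\u{00C4}";  // "Ä" as the form sends it: UTF-8 bytes c3 84

// Write path: the server treats each incoming byte as one Latin-1 character
// and converts it to UTF-8 for the utf8 column, so two bytes become four.
$stored = mb_convert_encoding($typed, 'UTF-8', 'ISO-8859-1');

// Read path: the server converts the column back to Latin-1 for the same
// connection, which restores the original two bytes.
$readBack = mb_convert_encoding($stored, 'ISO-8859-1', 'UTF-8');

echo 'typed:     ', bin2hex($typed), PHP_EOL;
echo 'stored:    ', bin2hex($stored), PHP_EOL;
echo 'read back: ', bin2hex($readBack), PHP_EOL;
```

In this scenario the website shows the right character because the two conversions cancel out, while a client that connects with utf8 shows the four stored bytes as two garbage characters.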

utf8 characters wrongly displayed as question marks after conversion to mysqli_query()

My database is latin1_swedish_ci but all the tables which contain foreign characters (german, turkish...) are utf8_general_ci.
Before the upgrade to php 5.6, I used
mysql_query("SET CHARACTER SET utf8;");
mysql_query("SET NAMES utf8");
before mysql_query() and everything was displayed correctly in my page (<meta http-equiv="content-type" content="text/html;charset=UTF-8" /> in page header).
After the conversion of all mysql_query(...) to mysqli_query(id,...) and running under php 5.6, all the foreign languages are now scrambled with ? and �. Switching back to php 5.4 does not help. phpMyAdmin displays the mysql database (which has not changed) correctly.
I have looked around for a solution but nothing works... am I missing something?
What do I need to change in my code to work properly?
After searching again and again...
MAMP MySQL not recognizing my.cnf values in OSX
Does MySQL included with MAMP not include a config file?
http://www.toptal.com/php/a-utf-8-primer-for-php-and-mysql
Here is the solution.
In my php scripts, I had to add a charset query after connecting to database.
$con=mysqli_connect($host,$user,$password,$db);
mysqli_set_charset($con,"utf8");
The previous charset & names I used with mysql_query() up to php 5.4 were not enough anymore.
mysqli_query($con,"SET CHARACTER SET utf8;");
mysqli_query($con,"SET NAMES utf8");
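The connect-then-set-charset pattern can be wrapped in one function so it cannot be forgotten. This is a sketch only: the function name connectUtf8 is hypothetical, the credentials in the usage comment are placeholders, and it needs the mysqli extension and a reachable server to do anything useful.

```php
<?php
/**
 * Open a mysqli connection and switch it to utf8 before any query runs.
 * All four parameters are placeholders supplied by the caller.
 */
function connectUtf8(string $host, string $user, string $password, string $db): mysqli
{
    // Make mysqli throw exceptions instead of failing silently.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

    $con = new mysqli($host, $user, $password, $db);

    // Sets the connection charset on both the client and the server side.
    $con->set_charset('utf8');

    return $con;
}

// Usage (commented out because it needs real credentials):
// $con = connectUtf8('localhost', 'username', 'password', 'database');
// echo $con->character_set_name(), PHP_EOL; // charset the connection reports
```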
On my local server, I also had to recompile mysql after adding a my.cnf file containing the following lines :
[client]
default-character-set=utf8
[mysql]
default-character-set=utf8
[mysqld]
default-storage-engine = InnoDB
character-set-client-handshake
collation-server=utf8_general_ci
character-set-server=utf8
[mysqld_safe]
default-character-set=utf8
I also had to add the utf8 charset to MAMP by editing MAMP/Library/share/charsets/index.xml and adding the following lines:
<charset name="utf8">
<family>Unicode</family>
<description>UTF8 Unicode</description>
<alias>utf8</alias>
<collation name="utf8_general_ci" id="33">
<flag>primary</flag>
<flag>compiled</flag>
</collation>
<collation name="utf8_bin" id="83">
<flag>binary</flag>
<flag>compiled</flag>
</collation>
</charset>
On my web server, the 2 steps above were not necessary.
These gibberish characters are the result of a browser or other user-visible software package rendering utf-8 characters as if they were ASCII or Latin-1.
EDIT In the Chrome browser, you can view the encoding with which your browser is rendering your page. Click the chrome menu (three little horizontal lines) in the upper right corner. Click More Tools> Click Encoding> Then you will see a choice of character sets. Try choosing a different one.
Try putting this line into your HTML documents' <HEAD> sections.
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
That may force the browser to use the correct character set.
END edit
After your switch to mysqli, did you keep the SET CHARACTER SET and SET NAMES queries as you opened up your mysql session? If not, put them back and see if it helps.
It's possible your database is working correctly but php is telling browsers you're using Latin-1. In fact that's very likely because phpmyadmin is working properly.
Try doing the things suggested here: Setting PHP default encoding to utf-8?
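On the PHP side, a minimal sketch of telling browsers the page is UTF-8 could look like this. Whether it fixes the symptom depends on what the server is currently sending, so treat it as something to try rather than a confirmed cure.

```php
<?php
// Make UTF-8 the charset PHP advertises by default in the Content-Type header.
ini_set('default_charset', 'UTF-8');

// Send the header explicitly as well; this must happen before any output.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

echo ini_get('default_charset'), PHP_EOL;
```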
There's a lot of conceptual confusion around character sets and collations. When you say
My database is latin1_swedish_ci
you mean that the default character set for a newly defined table is latin1 and the default collation is case-insensitive Swedish. These are only defaults. What counts for data storage is the character set used for each column. You say that's utf8. The collation (utf8_general_ci) only counts for searching and matching.
I had the same problem.
Not sure if this is helpful, but as a noob, there is no way I was able to follow or implement the complexities of the answer to this issue.
First, I added this to my html page:
<meta charset="utf-8"/>.
Then, I went into PHPMyAdmin, clicked on the column in which I saw Diamond-circled Question Marks in my browser, and simply reset the encoding to UTF-8 at the database level.
I hope it's helpful. Certainly was easy for me.

How to code mySQL connection with utf-8

Well I have created a page in Dreamweaver where I use phpMyAdmin as a database.
My page can read Swedish letters which are written in Dreamweaver without any problems but when it retrieves data from the database (phpMyAdmin), it gets wrong. Instead of the Swedish letters it says �.
I have encoded the page with UTF-8:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
Also the file is saved as utf-8 and I'm using Google Chrome which uses utf 8.
I've heard that I have to encode the connection with the database with utf-8 also, but I don't know how to do that?
As far as I know the way I connect is by:
<?php require_once('Connections/Audiologiska.php'); ?>
Grateful for answers!
Don't forget to set the charset after connecting to the DB (php mysql charset). Also check whether the columns have the right charset.
In the connection file/class I normally put:
SET NAMES utf8;
And for the database collation I choose utf8_general_ci
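If the connection file can be rewritten to use PDO, the charset can go into the DSN itself, which PHP has honored since 5.3.6. The contents of Connections/Audiologiska.php are not shown in the question, so this is a sketch under assumptions: buildMysqlDsn is a hypothetical helper, and the host, database name and credentials are placeholders.

```php
<?php
/**
 * Build a PDO MySQL DSN that carries the connection charset.
 */
function buildMysqlDsn(string $host, string $database, string $charset = 'utf8'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $database, $charset);
}

// Placeholder host and database name.
$dsn = buildMysqlDsn('localhost', 'database');
echo $dsn, PHP_EOL;

// With real credentials the connection would then be opened like this:
// $pdo = new PDO($dsn, 'username', 'password', [
//     PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
// ]);
```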

accent characters not showing properly in browser

In districts table
I have a row as
district_id district_name country_id
15 Šahty 16
While selecting from php and displaying in browser,it shows like this :�ahty
I am using mssql 2005 with collation SQL_Latin1_General_CP1_CI_AS.
The problem is something like this
removing accent and special characters
but i need the solution in php.
UPDATE(?):
There is no support for UTF-8 in sqlserver.
https://dba.stackexchange.com/questions/7346/mssql-2005-2008-utf-8-collation-charset
Hi, you need to send the correct HTML content type header:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
The data may be selected correctly, but the browser may not display it as you expect.
You can experiment with this in Firefox via Menu View -> Character encoding until you find the correct one.
In order to make special characters work, a general rule is that all the components must be on the same encoding. This means that database, database connection (very often forgotten 'SET NAMES {charset}' call after connecting to database) and web page Content-type have to be all in the same character set.
If you request data from a latin1 database over a connection that is also latin1, make sure the page that displays the values is latin1 as well.
It's recommended, though, to use UTF-8 instead of latin1 everywhere, so if possible I recommend changing the charsets and the data in your database to UTF-8, as it's more compatible all-around and easier to handle.
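If the database has to stay on a Latin-1 style collation, another way to get everything onto one encoding is to convert each value to UTF-8 in PHP before printing it. The sketch below assumes the driver hands PHP raw Windows-1252 bytes, which is the code page behind SQL_Latin1_General_CP1_CI_AS; the helper name cp1252ToUtf8 is hypothetical and the sample bytes are hard-coded instead of coming from a query.

```php
<?php
/**
 * Convert a Windows-1252 (CP1252) byte string to UTF-8 using mbstring.
 */
function cp1252ToUtf8(string $value): string
{
    return mb_convert_encoding($value, 'UTF-8', 'Windows-1252');
}

// "Šahty" as CP1252 bytes: "Š" is the single byte 0x8A in that code page.
$fromDb = "\x8Aahty";

$name = cp1252ToUtf8($fromDb);
echo $name, PHP_EOL;
```

The page can then keep its UTF-8 Content-Type header, because the string being echoed is UTF-8 as well.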
Just remove the UTF8 charset and let the browser select the charset; it will pick ISO-8859-1, which will work with accents in SQL Server.
