display character encoding in php - php

I retrieve data from MySQL in PHP and send it to the client.
If I use select * from users LIMIT 8;
there is no problem, but with select * from users LIMIT 9; the last row retrieved breaks the page. When I debug in PHP, this data also looks fine:
1 = "CN=User1,OU=ARGE,OU=Personel,OU=Kullanicilar,OU=CompanyName,DC=company,DC=intra
2 ="CN=User2,OU=ARGE,OU=Personel,OU=Kullanicilar,OU=CompanyName,DC=company,DC=intra
3 ="CN=User3,OU=ARGE,OU=Personel,OU=Kullanicilar,OU=CompanyName,DC=company,DC=intra
4 ="CN=User4,OU=ARGE,OU=Personel,OU=Kullanicilar,OU=CompanyName,DC=company,DC=intra
5 ="CN=Öney,OU=ARGE,OU=Personel,OU=Kullanicilar,OU=CompanyName,DC=company,DC=intra
But no data is returned from PHP. It's obvious that the 'Ö' character causes this, but I don't understand why, since it even looks correct in the debugger. My character encoding is:
My PageÖÖ
and this is also at the top of my page:
header('Content-Type: text/html; charset=ISO-8859-1');
When I type the title '4 Tasks to completeÖÖŞŞŞ' in the HTML of mypage.php, it is displayed as:
'4 Tasks to complete��???'
On Stack Overflow it displays correctly, and I want the same for my page. I can't figure out whether the problem is in the HTML, in the PHP, or in both.
EDITED: since I can see the 'Öney' character correctly in the script variable, I think the problem is in the PHP page.
My connection settings:
$this->dbh = new PDO('mysql:host=localhost;dbname=webfilter;port=3306;connect_timeout=15', 'root', 'company');
$this->dbh->exec("set names utf8");

You're specifying the charset as UTF-8 in meta:
<meta charset="utf-8"/>
But you're specifying ISO-8859-1 in PHP:
header('Content-Type: text/html; charset=ISO-8859-1');
You need to be consistent: change the PHP header call to set the charset to UTF-8 too:
header('Content-Type: text/html; charset=utf-8');
Other steps that might help:
Setting the file encoding to UTF-8.
Setting the table columns to UTF-8 (the utf8_general_ci collation usually works).
Running the query SET NAMES utf8 after connecting to the database.
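To see why mixing the two charsets breaks the page, here is a minimal sketch. It assumes the mbstring extension is loaded, and the helper name is_valid_utf8 is made up for illustration. It compares the bytes of 'Ö' in UTF-8, which is what MySQL sends after SET NAMES utf8, with the single byte that ISO-8859-1 uses for the same character:

```php
<?php
// 'Ö' as the database returns it after SET NAMES utf8: two bytes, 0xC3 0x96.
$utf8 = "\xC3\x96";

// The same character in ISO-8859-1 is the single byte 0xD6.
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

// Hypothetical helper: true when the bytes form valid UTF-8.
function is_valid_utf8(string $bytes): bool
{
    return mb_check_encoding($bytes, 'UTF-8');
}

echo bin2hex($utf8), "\n";        // c396
echo bin2hex($latin1), "\n";      // d6
var_dump(is_valid_utf8($utf8));   // bool(true)
var_dump(is_valid_utf8($latin1)); // bool(false)
```

A browser told the page is ISO-8859-1 reads the two UTF-8 bytes as two separate characters, which is why the header and the data have to agree.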

Change
header('Content-Type: text/html; charset=ISO-8859-1');
to
header('Content-Type: text/html; charset=utf-8');
In the HTML the charset is utf-8, but in PHP it is different. Make sure they are the same.

Related

php utf-8 encoding for chinese text

I am doing a migration that generates SQL to move data from one DB to another.
I am trying to get the output
But when I did a mb_convert_encoding("Mr.Wang (王老板)", 'UTF-8', 'Windows-1252')
I have the output as
I have those two extra "box". Any idea what am I doing wrong?
phpMyAdmin is able to export my old database containing Chinese text in the correct format. How can I do that in a script?
Have you tried setting the header in the script to UTF8? What I normally use is the following:
header('Content-Type: text/html; charset=utf-8');
That has worked for me so far for German characters & some Arabic & Japanese etc.
I found that I actually need to
mysql_query("SET NAMES 'utf8'");
before my select statement, and I do not need to run mb_convert_encoding("Mr.Wang (王老板)", 'UTF-8', 'Windows-1252') at all.
Now when I write my insert SQL, I get the correct text I wanted.
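A small sketch of why the mb_convert_encoding call made things worse, assuming the text was already UTF-8 once the connection charset was set and that the mbstring extension is available. The helper name is hypothetical. Converting "from Windows-1252" treats every byte of a multi-byte character as a character of its own:

```php
<?php
// Hypothetical helper reproducing the mistaken call from the question:
// it treats every input byte as one Windows-1252 character.
function misconvert_as_cp1252(string $alreadyUtf8): string
{
    return mb_convert_encoding($alreadyUtf8, 'UTF-8', 'Windows-1252');
}

$wang = "\xE7\x8E\x8B";                 // the character 王 encoded as UTF-8 (3 bytes)
$broken = misconvert_as_cp1252($wang);

var_dump($broken === $wang);            // bool(false): the text was altered
var_dump(mb_strlen($wang, 'UTF-8'));    // int(1)
var_dump(mb_strlen($broken, 'UTF-8'));  // int(3): one character became three
```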

set charset when saving files with php

I wrote some simple code for uploading pictures to a folder with PHP.
On the server side I have
<?php
header('Content-Type: text/plain; charset=utf-8');
//check if file is actually an image etc.
//if is an image, send it to "upload" folder
move_uploaded_file($_FILES["file"]["tmp_name"],"upload/" . $_FILES["file"]["name"]);
//save to the database a string like "upload/myImage.jpg", so I can render it on the site later
$stu = $dbh->prepare("UPDATE multi SET m_place=:name WHERE m_id = :id");
$stu->bindParam(':name', $n, PDO::PARAM_STR);
$n= "upload/".$_FILES["file"]["name"];
$stu->execute();
The problem?
If the name of the image is in English, in the folder I see "myImage01.jpg" and in the database "upload/myImage01.jpg". But if the name of the image is in Greek, in the folder I see "χωΟΞ―Ο‚ τίτλο.jpg" and in the db "upload/χωΟΞ―Ο‚ τίτλο.jpg", which is wrong. Instead of χωΟΞ―Ο‚ τίτλο I should get "χωρις τιτλο" (that's Greek for "no title", btw). So I guess it's a charset problem?
How do I fix this?
Thanks in advance
It sounds like your database doesn't have the correct collation. Make sure the tables/columns are using utf8_general_ci for their collation.
Also extremely important when handling UTF8 is to use the following two MySQL lines for GET requests...
SET time_zone = '+00:00'
SET CHARACTER SET 'utf8'
...and when you have a POST request use the following two...
SET time_zone = '+00:00'
SET NAMES 'utf8'
These will help ensure that UTF8 characters are maintained correctly.
Finally, I figured out that "PHP filesystem functions can only handle characters that are in system codepage". Thanks to this I solved my problem.
I used the iconv function,
so I changed the move_uploaded_file line like so:
move_uploaded_file($_FILES["file"]["tmp_name"],"upload/" . iconv('UTF-8', 'Windows1253',$_FILES["file"]["name"]));
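The same idea as a small reusable sketch, assuming the iconv extension is available. The function name is hypothetical, and 'CP1253' (the Greek Windows code page) is an assumption based on the answer above; a server with a different system code page would need a different target. It fails loudly instead of silently writing a mangled name:

```php
<?php
// Hypothetical helper: convert a UTF-8 file name to the code page the
// server's filesystem functions expect. 'CP1253' is an assumption.
function to_filesystem_name(string $utf8Name, string $codepage = 'CP1253'): string
{
    $converted = iconv('UTF-8', $codepage, $utf8Name);
    if ($converted === false) {
        throw new RuntimeException("Cannot represent file name in $codepage");
    }
    return $converted;
}

$name = "\xCF\x87\xCF\x89\xCF\x81\xCE\xB9\xCF\x82.jpg"; // "χωρις.jpg" in UTF-8
$onDisk = to_filesystem_name($name);

var_dump(strlen($name));   // int(14): each Greek letter takes two bytes in UTF-8
var_dump(strlen($onDisk)); // int(9): one byte per letter in CP1253
```

The string saved to the database can stay in UTF-8; only the name handed to move_uploaded_file would go through the helper.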

MYSQL database charset issue

I have a database issue that I'm unable to understand. I'm from Denmark and have made a sign-up system in PHP and MySQL. I have made two tables separately.
One of the tables (let's call it table1) displays my beloved Danish letters (æøå) just fine when I query them from the database through PHP. But when I go to phpMyAdmin, the letters are displayed weirdly. For instance, it looks like this in phpMyAdmin:
BjÃ¸rn (which is Bjørn)
But then again, when I get them from the database with a mysql_query('SELECT * FROM $tablename'), it is displayed as 'Bjørn' (as it should be).
Now to the problem...
In the other table (let's call it table2), phpMyAdmin displays 'Bjørn' as 'Bjørn' (which seems correct). But when I pull it into PHP with mysql_query('SELECT * FROM $tablename'), it is displayed as 'Bj?rn'. All of the letters 'æøå' are displayed as '?'.
I tried doing a SHOW TABLE STATUS, and it shows that the Collation is the same.
In table1 the columns are VARCHAR(255), while in table2 the columns are TEXT.
Both tables are created like this:
CREATE TABLE >>tablename<< ( bla bla bla ) CHARSET=UTF8
You should connect to MySQL like this:
$link = mysql_connect('localhost', 'user', 'password');
mysql_set_charset('utf8',$link);
Then try executing the query; it should fetch the data properly.
The problem here is that you should also specify the charset when connecting to the DB.
Your inserts also get garbled when storing if you don't set the connection charset to utf8, so verify whether you are setting it like this when connecting to the DB.
Hope this helps.
Last but not least, when sending the data from the DB to the browser you should also set the HTTP header, like below:
<?php
header('Content-type: text/html; charset=utf-8');
?>
I don't know what's wrong with phpmyadmin, but in my own programming I do the following things:
PHP (before anything else):
header("Content-Type: text/html; charset=utf-8");
HTML:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
DB query:
SET CHARACTER SET utf8;
Or if using PDO my dsn looks like:
mysql:dbname=mydb;host=localhost;charset=utf8
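As a rough sketch of the PDO variant: the helper name below is made up, and the host, database name and credentials are placeholders. As far as I know, the charset part of the DSN is only honoured from PHP 5.3.6 onwards; on older versions a separate SET NAMES query is still needed.

```php
<?php
// Hypothetical helper that builds a PDO MySQL DSN with an explicit charset.
function mysql_dsn(string $host, string $dbname, string $charset = 'utf8'): string
{
    return "mysql:host={$host};dbname={$dbname};charset={$charset}";
}

$dsn = mysql_dsn('localhost', 'mydb');
// $dsn is now "mysql:host=localhost;dbname=mydb;charset=utf8"

// Connecting needs real credentials, so it is left commented out here:
// $dbh = new PDO($dsn, 'user', 'password');
```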

Codeigniter and charsets

I haven't been using CodeIgniter for long, but I have some charset problems. I've asked on the CI forum, but I want to go further, as there is still no global solution: http://codeigniter.com/forums/viewthread/204409/
The problem was a database error 1064. I've got a solution: use iconv! It works fine, but I don't think it should be necessary. I've searched the internet a lot about charsets, but I'm using CI now, so how do charsets work with CI?
So I have a lot of questions about it; I hope someone can make it clear for me:
What’s the best way to set the charset global? And what to set?
In the head
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
In config/config.php
$config['charset'] = 'UTF-8';
In config/database.php
$db['default']['char_set'] = 'utf8';
$db['default']['dbcollat'] = 'utf8_general_ci';
In .htaccess, my rewrite rules and
php_value magic_quotes_gpc Off
AddDefaultCharset UTF-8
Do I also need to send a header? Where should I place it? Something like this?
header('Content-Type: text/html; charset=UTF-8');
In my editor (Notepad++) save files as UTF-8? Or UTF-8 (without BOM)? Or is ANSI good (this is what I’m using now)?
Use utf8_unicode_ci or utf8_general_ci for the MySQL database? And why?
How about reading RSS feeds, how to handle multiple charsets? Where I’m working on I’ve two feeds, one with UTF-8 encoding and the other with ISO-8859-1. This will be stored in the database and will be compared sometimes to see if there are new items. It fails on special chars.
I'm working with:
- CI 2.0.3
- PHP 5.2.17
- MySQL 5.1.58
More information added:
Model:
function update_favorite($data)
{
$this->db->where('id', $data['id']);
$this->db->where('user_id', $data['user_id']);
$this->db->update('favorites', $data);
return;
}
Controller:
$this->favorites_model->update_favorite(array(
'id' => $id,
'rss_last' => $rss_last,
'user_id' => $this->session->userdata('user_id')
));
When $rss_last is a “normal” value like: “test” (without quotes) it works fine.
When it’s a value with more length like (in Dutch): F-Secure vindt malware met certificaat van Maleisische overheid
I get this error:
Error Number: 1064
You have an error in your SQL syntax; check the manual that
corresponds to your MySQL server version for the right syntax to use
near 'vindt malware met certificaat van Maleisische overheid,
user_id = '1' WHERE `i' at line 1
UPDATE favorites SET id = '15', rss_last = F-Secure vindt
malware met certificaat van Maleisische overheid, user_id = '1'
WHERE id = '15' AND user_id = '1'
Filename:
/home/.../domains/....nl/public_html/new/models/favorites_model.php
Line Number: 35
Someone at the CI forum told me to use this:
'rss_last' => iconv("UTF-8", "UTF-8//TRANSLIT", $rss_last)
This works fine, but I don't think it should be necessary.
The value $rss_last comes from an RSS feed which, as mentioned before, is sometimes UTF-8 and sometimes ISO-8859-1 encoded:
$rss = file_get_contents('http://www.website.com/rss.xml');
$feed = new SimpleXmlElement($rss);
$rss_last = $feed->channel->item[0]->title;
It looks like this last part is the problem; when $rss_last is set to a literal value it works fine:
$rss_last = 'F-Secure vindt malware met certificaat van Maleisische overheid';
When the value comes from the RSS feed it gives problems...
Some more questions..
Just found this: Detect encoding and make everything UTF-8
Best solution? But.. is iconv not more simple, do something like this:
$encoding = some_function_to_get_encoding_from_feed($feed);
$rss_last = iconv($encoding, "UTF-8//TRANSLIT", $feed->channel->item[0]->title);
But what to use for "some_function_to_get_encoding_from_feed"? mb_detect_encoding?
And mb_convert_encoding vs iconv?
1) There is no global solution.
2)
AddDefaultCharset UTF-8
This is needed so that Apache responds to the client with the right encoding. Set it.
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Not strictly necessary, but recommended by the W3C.
$config['charset'] = 'UTF-8';
This is desirable.
$db['default']['char_set'] = 'utf8';
$db['default']['dbcollat'] = 'utf8_general_ci';
The encoding for CI's connection to the database. If your database encoding is UTF-8, this is mandatory.
header('Content-Type: text/html; charset=UTF-8');
Do not do this unless necessary. Charset already indicated in HTML code and .htaccess.
Use utf8_unicode_ci or utf8_general_ci for the MySQL database? And why?
For my own language (Russian), I use utf8_general_ci.
In my editor (Notepad++) save files as UTF-8?
Absolutely! All code that Apache serves as UTF-8 should be saved as UTF-8.
How about reading RSS feeds, how to handle multiple charsets?
If you store each RSS feed in its own table, you can specify the charset for each table and set the right encoding with each SQL query.
Yes, Cyrillic symbols, for example, will fail on non-UTF-8 encodings.
UTF-8 (without BOM) should give you the best results based on your configuration and there's no need to send separate headers since the encoding is already selected in the head part. Utf8_general_ci should do fine for the MySQL database.
Perhaps the entries in the database are not valid?
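One more thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: in the error above the rss_last value is not quoted at all, and $feed->channel->item[0]->title is a SimpleXMLElement object, not a string. If the query builder only quotes real strings, that would explain both the 1064 error and why iconv (which returns a plain string) makes it go away. A minimal sketch with a made-up inline feed and a hypothetical helper name:

```php
<?php
// Made-up feed content standing in for the real RSS document.
$xml = '<rss><channel><item><title>F-Secure vindt malware</title></item></channel></rss>';
$feed = new SimpleXMLElement($xml);

// The node itself is an object, not a string.
$raw = $feed->channel->item[0]->title;

// Hypothetical helper: cast the title node to a plain PHP string.
function feed_title(SimpleXMLElement $feed): string
{
    return (string) $feed->channel->item[0]->title;
}

var_dump(is_string($raw));               // bool(false)
var_dump(is_string(feed_title($feed)));  // bool(true)
```

If that is the cause, passing (string) $feed->channel->item[0]->title to the model should be enough, without iconv.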

php 5.2 + mysql 5.1 character encoding issue

Background:
There is a table, events; this table is formatted latin1. Individual columns in this table are set to utf8. The column we will cherry pick to discuss is 'title' which is one of the utf8 columns. The website is set for utf8 both via apache and the meta tag.
As a test, if I save décor or © into the title field and perform
select title, LENGTH(title) as len, CHAR_LENGTH(title) as chlen
from events where length(title) != char_length(title)
I will get décor or ©, 12, 10 back as a result; which is expected showing that the data has indeed been properly saved into my utf8 column.
However, upon echoing the title out to a page, it's mangled into d�cor or � which makes no sense to me since, as mentioned before, the character encoding is set to utf-8 on the page.
Not sure if this final detail makes a difference but if I edit the page and resubmit the mangled text it turns into d%uFFFDcor or %uFFFD both in the database and when displayed to the page. Further submits cause no change.
Actual Question:
Does anyone have an idea as to what I may be doing wrong? :-P
Well, there's likely one of three problems.
1. Mysql's connection is not using UTF-8
This means that it's converted to another charset (likely Latin-1) before it hits PHP. I've found the best solution is to run the following queries:
SET CHARACTER SET 'utf8';
SET character_set_database = "utf8";
SET character_set_connection = "utf8";
SET character_set_server = "utf8";
2. The page rendered is not really set to UTF-8
Set both the Content-type header and the <meta> tag content types to UTF-8. Some browsers don't respect one or the other...
header ('Content-Type: text/html; charset=UTF-8');
echo '<meta http-equiv="content-type" content="text/html; charset=utf-8" />';
As noted in the comments, that's not the problem...
3. You're doing something to the string before echoing it
Many of PHP's string functions work on bytes and will not do well with UTF-8. If you're calling a normal function that doesn't accept a $charset parameter (such as strlen or substr), the chances are that it won't handle multi-byte UTF-8 strings correctly. If it does have a $charset parameter (like htmlspecialchars), make sure that you set it.
echo htmlspecialchars($content, ENT_COMPAT, 'UTF-8');
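To illustrate point 3, here is a short sketch using the test string from the question, assuming the mbstring extension is loaded. The byte and character counts mirror the LENGTH() and CHAR_LENGTH() results reported above:

```php
<?php
$title = "d\xC3\xA9cor or \xC2\xA9"; // "décor or ©" encoded as UTF-8

var_dump(strlen($title));             // int(12): bytes, like LENGTH()
var_dump(mb_strlen($title, 'UTF-8')); // int(10): characters, like CHAR_LENGTH()

// substr() cuts by byte and can split the two-byte "é" in half.
$cutBytes = substr($title, 0, 2);
$cutChars = mb_substr($title, 0, 2, 'UTF-8');

var_dump(mb_check_encoding($cutBytes, 'UTF-8')); // bool(false)
var_dump($cutChars === "d\xC3\xA9");             // bool(true)
```

A string cut in the middle of a character is one way to end up with the � replacement character on an otherwise correct UTF-8 page.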
