cyrillized words read from database with php - php

Why is it that Cyrillic text read from the database displays fine, but when I put the same text into a select/option menu I get strange symbols?
http://prikachi.com/images/813/6589813g.jpg
http://prikachi.com/images/811/6589811I.jpg
I think I have set UTF-8 everywhere, but I'm not sure.
In my HTML I use:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

Most likely you're not using the same charset (UTF-8) everywhere, so your data gets mangled at some point. Depending on what exactly you're doing, you'll have to change or add one or more of the following (maybe it's the SET NAMES / mysql_set_charset call you forgot):
Tell MySQL to use UTF-8. To do this, add this to your my.cnf:
collation_server = utf8_unicode_ci
character_set_server = utf8
Before interacting with MySQL, send this query:
SET NAMES 'utf8';
-- in the mysql command-line client you can use the client command `charset utf8` instead; it is not an SQL statement
Or, alternatively, let PHP do this after opening the connection:
mysql_set_charset('utf8', $conn); // when using the mysql_ functions (removed in PHP 7)
$mysqli->set_charset('utf8'); // when using mysqli
Set UTF-8 as the default charset for your database:
CREATE DATABASE `my_db` DEFAULT CHARACTER SET 'utf8';
Do the same for tables:
CREATE TABLE `my_table` (
-- ...
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Assuming the client is a browser, serve your content as UTF-8 with the correct header:
header('Content-type: text/html; charset=utf-8');
To be really sure the browser understands, add a meta tag:
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
And, last but not least, tell the browser to submit forms using UTF-8:
<form accept-charset="utf-8" ...>
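As a rough illustration of the output-side points above, applied to a select menu like the one in the question: the sketch below sends the charset header and builds the option list with htmlspecialchars() told explicitly that the strings are UTF-8. The function name render_select, the field name city, and the sample value are made up for this example; in a real page the values would come from the database over a connection already switched to utf8, and this file itself has to be saved as UTF-8.

```php
<?php
// Hypothetical helper: builds a <select> element from UTF-8 strings.
function render_select(string $name, array $values): string
{
    $html = '<select name="' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . '">';
    foreach ($values as $value) {
        // Naming the charset explicitly avoids depending on the default_charset setting.
        $safe = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
        $html .= '<option value="' . $safe . '">' . $safe . '</option>';
    }
    return $html . '</select>';
}

// Send the charset in the HTTP header before any output.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// Sample value standing in for a row fetched from the database.
echo render_select('city', ['София']), "\n";
```

If the menu still shows strange symbols with code like this, the strings were probably already damaged before they reached PHP, which points back at the connection charset.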

Related

HTTP post doesn't send special characters correctly

I have two .php files, a form with a text field where you put the names you want to search in the database, and a results file that processes the post...
The names in the database will be searched using this query:
SELECT * FROM acw_papers_web web
INNER JOIN acw_papers_web_autores aut
ON web.id_paper_web = aut.id_paper_web
WHERE aut.nombre_autor_pw LIKE '%autorname%'
ORDER BY web.probabilidad DESC
The problem is that when I send the POST, instead of sending lópez, it sends lÃ³pez...
How can I fix it? Both .php files are UTF-8 encoded.
Since you did not provide much information, I cannot exactly pinpoint the problem. But here are two possible solutions:
Set the correct charset in the <head> of both files:
<meta charset="UTF-8" /> for HTML5 or <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> for everything else
Set the character set of the database tables to UTF-8. In MySQL:
ALTER TABLE acw_papers_web CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE acw_papers_web_autores CHARACTER SET utf8 COLLATE utf8_general_ci;
That's all I could think of, for now.
You need to use these functions to send special characters in the POST request:
urlencode() - URL-encodes a string
urldecode() - decodes a URL-encoded string
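A common symptom of this kind of mismatch is "ó" showing up as "Ã³": the two UTF-8 bytes of the character are being read as two separate ISO-8859-1 characters somewhere along the way. The following is only a sketch that reproduces the effect in isolation, assuming the mbstring extension is available; it does not show where in the application the wrong assumption is made.

```php
<?php
// "ó" is two bytes in UTF-8: 0xC3 0xB3.
$utf8 = "l\xC3\xB3pez";

// Simulate a component that wrongly assumes the bytes are ISO-8859-1
// and converts them to UTF-8 a second time.
$mojibake = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), "\n";     // 6cc3b370657a
echo $mojibake, "\n";          // lÃ³pez (when viewed as UTF-8)

// Undoing the extra conversion recovers the original bytes.
$repaired = mb_convert_encoding($mojibake, 'ISO-8859-1', 'UTF-8');
var_dump($repaired === $utf8); // bool(true)
```

Repairing strings after the fact like this is a diagnostic aid at best; the durable fix is to make the form page, the database connection, and the tables agree on UTF-8.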

MySQL Character encoding (öä) in PHP application

Hello, I have a character encoding problem in my application and thought I'd ask for help, because I couldn't solve it even though I was given some guidance. So here goes:
My Ä and Ö characters are shown in the browser as: �
Here is everything I have tried so far:
1) Database: I have tried changing the collation of my tables. Here is what SHOW TABLE STATUS gives for one of them:
Name = test_groups
Engine = InnoDB
Version = 10
Row_format = Compact
Collation = utf8_swedish_ci
The database character set variables are:
character_set_client = utf8
character_set_connection = utf8
character_set_database = latin1 (I wonder, is this the cause?)
character_set_filesystem = binary
character_set_results = utf8
character_set_server = utf8
character_set_system = utf8
2) In apache httpd.conf I have:
AddDefaultCharset UTF-8
3) In my Zend-application application.ini:
resources.view.encoding = "UTF-8"
4) In my firefox 14.0.1 browser
edit->preferences->content->advanced->Default character encoding =
Unicode (UTF-8)
5) In my php code meta-tag:
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
Here are a few other interesting things. When I look at my page and switch Firefox to
View->Character encoding->Western (ISO-8859-1)
the �-characters that came from the MySQL database turn into proper öä-characters, but the öä-characters that come from my PHP code turn into ät-characters.
Another thing when I check the encoding of the data coming from my MySQL-database with
mb_detect_encoding($DATA_FROM_MYSQL_DATABASE)
it outputs UTF-8!! Then lastly if I do in the code:
utf8_encode($DATA_FROM_MYSQL_DATABASE)
and output the result, the problem disappears: the �-characters become öä-characters. So what's going on here? All help appreciated.
Are you sending SET NAMES utf8 from your PHP as the first query to MySQL? If not, that could be the cause.
SET NAMES indicates what character set the client will use to send SQL
statements to the server. Thus, SET NAMES 'cp1251' tells the server,
“future incoming messages from this client are in character set
cp1251.” It also specifies the character set that the server should
use for sending results back to the client. (For example, it indicates
what character set to use for column values if you use a SELECT
statement.)
SET NAMES utf8 in MySQL? has more detail about how and why.
Troubleshoot:
Check your database (with phpMyAdmin, for instance). Are the characters stored correctly, or do they look like gibberish?
If the characters in the database are OK, then the problem happens when retrieving. If they are stored incorrectly (as I would guess they are), then the problem is in the storing.
Check your source code files and verify that they are encoded in UTF-8.
Force the MySQL connection to use UTF-8: $mysqli->set_charset('utf8'), mysql_set_charset('utf8'), or, for PDO, add charset=utf8 to the DSN.
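The symptom in the question (utf8_encode() makes the problem disappear) suggests that the bytes arriving from MySQL are ISO-8859-1, whatever mb_detect_encoding() reports. A minimal sketch of a stricter check, assuming the mbstring extension is available; the sample bytes are made up to stand in for a value fetched over a latin1 connection:

```php
<?php
// "äö" as it would arrive over a latin1 connection: one byte per character.
$fromDb = "\xE4\xF6";

// mb_check_encoding() validates the byte sequence instead of guessing.
var_dump(mb_check_encoding($fromDb, 'UTF-8'));    // bool(false)

// Converting from ISO-8859-1 yields valid UTF-8.
$converted = mb_convert_encoding($fromDb, 'UTF-8', 'ISO-8859-1');
var_dump(mb_check_encoding($converted, 'UTF-8')); // bool(true)
echo bin2hex($converted), "\n";                   // c3a4c3b6
```

If the check fails on real data, converting in PHP only hides the problem; setting the connection charset as described above removes the need for the conversion.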

could not retrieve utf8 encoding value from mysql

I have the following problem. First, take a look at the MySQL table:
CREATE TABLE IF NOT EXISTS users(
id int(4) NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(40) CHARSET utf8 COLLATE 'utf8_unicode_ci' NOT NULL,
surname VARCHAR(40) CHARSET utf8 COLLATE 'utf8_unicode_ci' NOT NULL
);
It stores data successfully, but when I try to retrieve it in my .php file (which is also encoded as UTF-8) it still shows me question marks (?????). Why? How can I solve it?
UPDATE
I must be doing something wrong. I have two PHP files: classA.php, which defines a class that retrieves data from the database, and default.php, which includes classA.php and is where I want to display the data.
I have exactly the same table as above, and I'm calling
header('Content-Type: text/html; charset=UTF-8');
on the first line of default.php, but it still doesn't work. Thanks for any advice :))
SECOND UPDATE
The script is in classA.php, and its file encoding is the default, the same as default.php. I just added this as the first line of default.php:
header("Content-Type: text/html; charset=UTF-8");
but it still doesn't work.
THIRD UPDATE
SQL:
create table ok(
id int(2) not null auto_increment primary key,
name varchar(20) charset utf8 not null);
and the PHP file:
<?php
header('Content-Type: text/html; charset=UTF-8');
?>
<!DOCTYPE html>
<html>
<head>
<title>hello</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<?php
$con = mysqli_connect("host","user","pass","db");
if (mysqli_connect_errno()==0){
    if ($r = mysqli_query($con,"SELECT * FROM ok")){
        mysqli_set_charset($con,"utf8");
        while ($d = mysqli_fetch_assoc($r)){
            echo $d['name'] . "<br>";
        }
    }
}
if (isset($con)){
    mysqli_close($con);
}
?>
</body>
</html>
I inserted this into the ok table:
insert into ok(name) values("one"),("ერთი"),("two"),("ორი");
PS. The special characters are Georgian :)
The result shows the English words fine and the Georgian ones as question marks :(
It doesn't work either way :(
The fact that the PHP file is in UTF-8 doesn't necessarily mean that the data coming from/ going to the database is in UTF-8 too.
You didn't mention which extension you're using, but:
For mysql use mysql_set_charset('utf8', $link);
For mysqli use $mysqli->set_charset('utf8') or the procedural mysqli_set_charset($link, 'utf8')
For PDO, include charset=utf8 in the DSN string when you connect.
Declare the UTF-8 encoding in the HTML:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
And/or send the encoding in the headers:
header('Content-Type: text/html; charset=UTF-8');
Try with
htmlentities($row['name'], ENT_QUOTES, 'UTF-8');
See htmlentities.
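In the third update of the question, mysqli_set_charset() is called after mysqli_query(), so the result set has most likely already been converted to the connection's default charset by the time the charset is changed (often latin1, which has no Georgian letters, hence the question marks). A sketch of the reordered logic, assuming the ok table from the question; the function name fetch_names is hypothetical and the connection parameters in the usage comment are placeholders:

```php
<?php
// Hypothetical helper: returns the name column of the `ok` table as UTF-8 strings.
function fetch_names(mysqli $con): array
{
    // Switch the connection to UTF-8 BEFORE running any query.
    if (!$con->set_charset('utf8')) {
        throw new RuntimeException('Could not set charset: ' . $con->error);
    }

    $result = $con->query('SELECT name FROM ok');
    if ($result === false) {
        throw new RuntimeException('Query failed: ' . $con->error);
    }

    $names = [];
    while ($row = $result->fetch_assoc()) {
        $names[] = $row['name'];
    }
    $result->free();

    return $names;
}

// Usage (placeholders, not real credentials):
// $con = new mysqli('host', 'user', 'pass', 'db');
// foreach (fetch_names($con) as $name) {
//     echo htmlspecialchars($name, ENT_QUOTES, 'UTF-8'), "<br>";
// }
```

Only the order of the calls differs from the code in the question; the header and meta tag there can stay as they are.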

Codeigniter and charsets

I haven't been using CodeIgniter for long, but I have some charset problems. I've asked on the CI forum, but there is still no global solution and I want to go further: http://codeigniter.com/forums/viewthread/204409/
The problem was database error 1064. I got a workaround: use iconv! It works fine, but I don't think it should be necessary. I've searched a lot on the internet about charsets, but now that I'm using CI, how do charsets work with CI?
So I have a lot of questions about it; I hope someone can clear them up for me:
What's the best way to set the charset globally? And what should I set?
In the head
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
In config/config.php
$config['charset'] = 'UTF-8';
In config/database.php
$db['default']['char_set'] = 'utf8';
$db['default']['dbcollat'] = 'utf8_general_ci';
In .htaccess, my rewrite rules and
php_value magic_quotes_gpc Off
AddDefaultCharset UTF-8
Do I also need to send a header? Where should I place it? Something like this?
header('Content-Type: text/html; charset=UTF-8');
In my editor (Notepad++) save files as UTF-8? Or UTF-8 (without BOM)? Or is ANSI good (this is what I’m using now)?
Use utf8_unicode_ci or utf8_general_ci for the MySQL database? And why?
How about reading RSS feeds: how do I handle multiple charsets? In the project I'm working on I have two feeds, one encoded as UTF-8 and the other as ISO-8859-1. The items are stored in the database and compared from time to time to see whether there are new ones. The comparison fails on special characters.
I'm working with:
- CI 2.0.3
- PHP 5.2.17
- MySQL 5.1.58
More information added:
Model:
function update_favorite($data)
{
    $this->db->where('id', $data['id']);
    $this->db->where('user_id', $data['user_id']);
    $this->db->update('favorites', $data);
    return;
}
Controller:
$this->favorites_model->update_favorite(array(
    'id' => $id,
    'rss_last' => $rss_last,
    'user_id' => $this->session->userdata('user_id')
));
When $rss_last is a "normal" value like "test" (without the quotes) it works fine.
When it's a longer value such as (in Dutch): F-Secure vindt malware met certificaat van Maleisische overheid
I get this error:
Error Number: 1064
You have an error in your SQL syntax; check the manual that
corresponds to your MySQL server version for the right syntax to use
near ‘vindt malware met certificaat van Maleisische overheid,
user_id = ‘1’ WHERE `i’ at line 1
UPDATE favorites SET id = ‘15’, rss_last = F-Secure vindt
malware met certificaat van Maleisische overheid, user_id = ‘1’
WHERE id = ‘15’ AND user_id = ‘1’
Filename:
/home/.../domains/....nl/public_html/new/models/favorites_model.php
Line Number: 35
Someone at the CI forum told me to use this:
'rss_last' => iconv("UTF-8", "UTF-8//TRANSLIT", $rss_last)
This works fine, but I don't think it should be necessary.
The value of $rss_last comes from an RSS feed which, as mentioned before, is sometimes encoded as UTF-8 and sometimes as ISO-8859-1:
$rss = file_get_contents('http://www.website.com/rss.xml');
$feed = new SimpleXmlElement($rss);
$rss_last = $feed->channel->item[0]->title;
It looks like this last part is the problem: when $rss_last is set to a literal value, it works fine:
$rss_last = 'F-Secure vindt malware met certificaat van Maleisische overheid';
When the value comes from the RSS feed, it causes problems.
Some more questions:
I just found this: Detect encoding and make everything UTF-8
Is that the best solution? Or isn't iconv simpler, doing something like this:
$encoding = some_function_to_get_encoding_from_feed($feed);
$rss_last = iconv($encoding, "UTF-8//TRANSLIT", $feed->channel->item[0]->title);
But what should I use for "some_function_to_get_encoding_from_feed"? mb_detect_encoding?
And what about mb_convert_encoding vs iconv?
1) There is no global solution.
2)
AddDefaultCharset UTF-8
This is needed so that Apache responds to the client with the right encoding. Set it.
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Not strictly necessary, but recommended by the W3C.
$config['charset'] = 'UTF-8';
Desirable.
$db['default']['char_set'] = 'utf8';
$db['default']['dbcollat'] = 'utf8_general_ci';
This is the encoding for CI's connection to the database. If your database is encoded as UTF-8, this is mandatory.
header('Content-Type: text/html; charset=UTF-8');
Do not do this unless necessary. The charset is already indicated in the HTML code and in .htaccess.
Use utf8_unicode_ci or utf8_general_ci for the MySQL database? And why?
For my own language (Russian), I use utf8_general_ci.
In my editor (Notepad++) save files as UTF-8?
Absolutely! All code that Apache serves as UTF-8 should be saved as UTF-8.
How about reading RSS feeds, how to handle multiple charsets?
If you store each RSS feed in its own table, you can specify the charset per table and set the right encoding with each SQL query.
Yes, Cyrillic symbols, for example, will fail on non-UTF-8 encodings.
UTF-8 (without BOM) should give you the best results based on your configuration, and there's no need to send separate headers since the encoding is already declared in the head. utf8_general_ci should do fine for the MySQL database.
Perhaps the entries in the database are not valid?
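Looking at the failing statement in the question, the value of rss_last is not quoted at all, which points at the type of $rss_last rather than its encoding: $feed->channel->item[0]->title is a SimpleXMLElement object, not a string. It is an assumption here (not verified against CI 2.0.3) that the query builder only quotes values that are PHP strings; that would also explain why iconv() helps, since it returns a plain string. SimpleXML hands text back as UTF-8 whatever encoding the feed declares, so a cast may be all that is needed. A self-contained sketch with a made-up ISO-8859-1 feed:

```php
<?php
// Made-up feed declared as ISO-8859-1; "é" is the single byte 0xE9 here.
$rss = '<?xml version="1.0" encoding="ISO-8859-1"?>'
     . '<rss><channel><item><title>Caf' . "\xE9" . '</title></item></channel></rss>';

$feed = new SimpleXMLElement($rss);
$node = $feed->channel->item[0]->title;

var_dump(is_string($node));     // bool(false): it is a SimpleXMLElement
$rss_last = (string) $node;     // explicit cast to a plain PHP string
var_dump(is_string($rss_last)); // bool(true)

// The text comes back as UTF-8: "é" is now the two bytes c3 a9.
echo bin2hex($rss_last), "\n";  // 436166c3a9
```

With the cast in place, both feeds should end up as UTF-8 strings, so the stored and freshly fetched titles can be compared directly.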

php 5.2 + mysql 5.1 character encoding issue

Background:
There is a table, events, whose default character set is latin1. Individual columns in this table are set to utf8. The column we will cherry-pick to discuss is 'title', which is one of the utf8 columns. The website is set to UTF-8 both via Apache and the meta tag.
As a test, if I save décor or © into the title field and run
select title, LENGTH(title) as len, CHAR_LENGTH(title) as chlen
from events where length(title) != char_length(title)
I get décor or ©, 12, 10 back as the result, which is what I expect and shows that the data has indeed been saved properly into my utf8 column.
However, when I echo the title out to a page, it's mangled into d�cor or �, which makes no sense to me since, as mentioned before, the character encoding of the page is set to UTF-8.
Not sure if this final detail makes a difference, but if I edit the page and resubmit the mangled text it turns into d%uFFFDcor or %uFFFD, both in the database and when displayed on the page. Further submits cause no change.
Actual Question:
Does anyone have an idea as to what I may be doing wrong? :-P
Well, there's likely one of three problems.
1. Mysql's connection is not using UTF-8
This means that it's converted to another charset (likely Latin-1) before it hits PHP. I've found the best solution is to run the following queries:
SET CHARACTER SET "utf8";
SET character_set_database = "utf8";
SET character_set_connection = "utf8";
SET character_set_server = "utf8";
2. The page rendered is not really set to UTF-8
Set both the Content-type header and the <meta> tag content types to UTF-8. Some browsers don't respect one or the other...
header ('Content-Type: text/html; charset=UTF-8');
echo '<meta http-equiv="content-type" content="text/html; charset=utf-8" />';
As noted in the comments, that's not the problem...
3. You're doing something to the string before echoing it
Most of PHP's byte-oriented string functions will not do well with UTF-8. If you're calling a function that counts, cuts, or changes the case of characters and doesn't accept a $charset parameter (such as strlen, substr, or strtoupper), chances are it won't work correctly with UTF-8 strings. If it does have a $charset parameter (like htmlspecialchars), make sure that you set it.
echo htmlspecialchars($content, ENT_COMPAT, 'UTF-8');
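To make the third point concrete, here is a small sketch using the test string from the question, assuming the mbstring extension is available. It shows the byte-oriented functions disagreeing with their mb_ counterparts in the same way LENGTH() and CHAR_LENGTH() do in MySQL:

```php
<?php
// "décor or ©" as UTF-8 bytes: é and © take two bytes each.
$title = "d\xC3\xA9cor or \xC2\xA9";

echo strlen($title), "\n";             // 12 (bytes, like LENGTH())
echo mb_strlen($title, 'UTF-8'), "\n"; // 10 (characters, like CHAR_LENGTH())

// Cutting by bytes can split a character in half and produce invalid UTF-8.
$cutBytes = substr($title, 0, 2);             // "d" plus the first byte of "é"
$cutChars = mb_substr($title, 0, 2, 'UTF-8'); // "dé"

var_dump(mb_check_encoding($cutBytes, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($cutChars, 'UTF-8')); // bool(true)

echo htmlspecialchars($cutChars, ENT_COMPAT, 'UTF-8'), "\n";
```

A string truncated in the middle of a character is one way a single � can appear on an otherwise correctly encoded page.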
