I have created a small contact form creator in PHP and everything works fine for me, but my friend in Denmark says the forms don't display Danish characters properly and won't submit when Danish characters are used. I have never worked with internationalization, and the PHP manual doesn't help me at all.
How can I make my PHP applications compatible with characters from other languages? I work with PHP 5.3.x, my files are saved as UTF-8 without a BOM, and everything in my pages and database is UTF-8. Is there an article you can recommend that a beginner can understand?
Thank you!
1. Add a meta charset tag to your page.
2. Serve Content-Type: text/html; charset=utf-8 in the HTTP header.
3. Use SET NAMES for MySQL; see "SET NAMES utf8 in MySQL?"
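The first two steps can be sketched in PHP roughly as follows. This is only a minimal sketch: render_utf8_page() is a hypothetical helper name, not something from the question, and the database step is left out.

```php
<?php
// Hypothetical helper: builds a complete HTML page that declares UTF-8 twice,
// once in the HTTP header and once in the markup.
function render_utf8_page($title, $bodyHtml)
{
    // Step 2: declare the charset in the HTTP header (skipped if output already started).
    if (!headers_sent()) {
        header('Content-Type: text/html; charset=utf-8');
    }
    // Step 1: repeat the declaration in the markup.
    $safeTitle = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');
    return "<!DOCTYPE html>\n"
        . "<html>\n<head>\n"
        . "<meta charset=\"utf-8\">\n"
        . "<title>" . $safeTitle . "</title>\n"
        . "</head>\n<body>\n" . $bodyHtml . "\n</body>\n</html>\n";
}

echo render_utf8_page('Kontaktformular', '<p>Blåbærgrød</p>');
```

The file containing this code has to be saved as UTF-8 itself, otherwise the literal Danish text is already wrong before PHP sends it.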
It seems you have everything you need in place, but there is clearly some small thing wrong, probably one of the things you assume is already set up correctly.
As a last resort, you can use PHP's htmlentities() function, with 'UTF-8' passed as the charset argument, to encode every character that has a named HTML entity. (htmlspecialchars() only escapes &, <, > and quotes.)
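A minimal sketch of that last resort, using htmlentities() (the function that converts accented letters as well as markup characters). The wrapper name to_html_entities() is made up for illustration.

```php
<?php
// Encode every character that has a named HTML entity.
// The third argument matters: before PHP 5.4 the default charset was ISO-8859-1,
// which would misread multibyte UTF-8 input.
function to_html_entities($text)
{
    return htmlentities($text, ENT_QUOTES, 'UTF-8');
}

echo to_html_entities('Rødgrød med fløde'), "\n"; // R&oslash;dgr&oslash;d med fl&oslash;de
```

The output is plain ASCII, so it survives a wrong charset declaration, at the cost of larger pages and text that is harder to read in the source.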
Related
I saw a website like this one: http://www.a3malcom.com/index.php. I want to build the same kind of website in Arabic. Do the entries in the database tables also need to be in Arabic, or in English?
What if I need the website in two languages, i.e. English and Arabic? In which language should the data be entered in the DB?
Take a look at this comprehensive article (thanks to @deceze for the great article):
Handling Unicode Front To Back In A Web App
It also includes an Arabic example alongside other languages.
Yes, you should insert the data into the DB table in Arabic. That way you can read it directly in the web page with no conversion needed.
And use UTF-8 encoding when displaying the page.
Usually Unicode (UTF-8) is all that is needed.
Your source file should be encoded as UTF-8 if you want to write Arabic in the PHP source. Note that some text editors don't support Arabic properly.
For the database, just create your database with UTF-8 encoding.
For HTML output, add this to your HEAD section:
<meta http-equiv="Content-type" content="text/html;charset=UTF-8" />
Alternatively, you can send it as an HTTP header (which takes precedence over the meta tag when both are present)... but why bother.
Anyway, I have encountered a problem generating Arabic sentences as images in GD using a custom font. It turns out GD doesn't render Arabic properly (at least in my case).
That was solved using the following library:
http://ar-php.org/
(yes the website is pretty ugly, but the library works, is well packaged, contains documentation...)
All I had to do, in my case, to fix the problem, is:
$Arabic = new I18N_Arabic('Glyphs');
$text = $Arabic->utf8Glyphs($_GET['txt']);
And then feed $text to GD.
I'm encountering a minor problem with text direction, though, and I'm looking for a solution. But at least I have valid Arabic now.
In most cases you won't need that library; you just need to make sure that you're using UTF-8 throughout your development process.
Hope it helps.
I'm working on an application using the CakePHP framework, and in the past I ran into a few encoding problems.
To avoid these issues in my application, I started doing some research. But I'm still a little confused about the how and why.
My application will need to support all languages, yes, even languages such as Chinese. Most of the data will be stored in a MySQL database, and that's where the confusion starts. What should I use as the collation?
Based on what I've read over the past few days, I have come to the conclusion that the best choice of collation would be utf8_unicode_ci. Is this correct?
Now onto the PHP, what would I set as encoding? UTF-8? I need to completely be sure not a single character shows up the way it shouldn't. Content will be submitted through forms, so the output has to be the same as the input.
I hope someone can answer my questions and help clarify this for me. Thanks in advance.
You need UTF-8 encoding to store your data. The collation, however, is used to sort and compare strings. Unfortunately, there is no universal collation, and there cannot be one, because the sorting rules of different languages contradict each other.
For example, in Czech 'ch' sorts after 'h', unlike in most other languages written in the Latin script.
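To make the Czech example concrete, here is a toy sketch. It is not a real collation: czech_like_key() is a hypothetical helper that only handles lowercase ASCII and the 'ch' digraph, whereas a real collation also deals with case, accents and much more.

```php
<?php
// Toy sort key: treats the digraph "ch" as a letter that sorts right after "h"
// by rewriting it to "h" followed by the highest possible byte.
function czech_like_key($word)
{
    return str_replace('ch', "h\xFF", strtolower($word));
}

// Returns a copy of $words sorted by the toy key.
function sort_czech_like(array $words)
{
    usort($words, function ($a, $b) {
        return strcmp(czech_like_key($a), czech_like_key($b));
    });
    return $words;
}

$words = array('ibis', 'chata', 'hrad', 'cihla');

$plain = $words;
sort($plain, SORT_STRING);
print_r($plain);                   // byte order:  chata, cihla, hrad, ibis
print_r(sort_czech_like($words));  // Czech-like:  cihla, hrad, chata, ibis
```

The same four words come out in two different orders, which is why a single collation cannot be correct for every language at once.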
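Yes, utf8_unicode_ci is a sane choice when you don't know the language in advance. Note that MySQL's utf8 character set stores at most three bytes per character; on MySQL 5.5.3 and later, utf8mb4 (with utf8mb4_unicode_ci) covers the full Unicode range. As for PHP, I'll just link to some answers I wrote in the past: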
How to best configure PHP to handle a UTF-8 website
Croatian diacritic signs in MySQL db (utf-8)
Am I correctly supporting UTF-8 in my PHP apps?
One additional piece of advice: make sure your text editor saves all files as UTF-8 (without a BOM, if you have this option). In short, keep everything UTF-8 from the very beginning and you should be safe.
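Two small checks can help verify this advice in practice. The sketch below is an assumption about how one might test it, with hypothetical helper names: one function checks whether a byte string is well-formed UTF-8, the other removes a stray BOM.

```php
<?php
// True if the byte string is well-formed UTF-8.
// The /u modifier makes PCRE validate the subject; on invalid input preg_match() returns false.
function is_valid_utf8($bytes)
{
    return preg_match('//u', $bytes) === 1;
}

// Removes a leading UTF-8 byte order mark (bytes EF BB BF), if present.
function strip_utf8_bom($bytes)
{
    if (substr($bytes, 0, 3) === "\xEF\xBB\xBF") {
        return (string) substr($bytes, 3);
    }
    return $bytes;
}

var_dump(is_valid_utf8("Blåbær"));         // bool(true) when this file is saved as UTF-8
var_dump(is_valid_utf8("Bl\xE5b\xE6r"));   // bool(false): Latin-1 bytes
```

A BOM at the start of a PHP file is sent to the browser as output, which can break header() calls, so it is worth checking files that misbehave.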
I have two tables here. One is in UTF-8 and holds Arabic text that can be read as such. The other has a different encoding; its content is also Arabic, but in the database it is displayed as
ÈöÓúãö Çááøåö ÇáÑøóÍúãóäö ÇáÑøóÍöíãö
I have to show data from both tables on the same page. The page is UTF-8 encoded, but I'm not sure whether this can be done. What do I do? My database is MySQL and I'm using PHP.
Is it possible to convert the contents of the other table to UTF-8, by the way?
You have to use mb_convert_encoding() first, on everything, to make sure it's all in UTF-8 to begin with. http://us3.php.net/manual/en/function.mb-convert-encoding.php Then it should display, assuming your HTML's charset is UTF-8 and the users have the appropriate fonts installed.
Also, virtually all consoles and a great many free online SQL tools (like phpMyAdmin) are not UTF-8 aware and print out gibberish. I have not yet found a free SSH client that supports UTF-8; if it's a big deal, invest in SecureCRT.
EDIT:
Excuse me. I don't read Arabic at all, but I did get Arabic back. Please tell me if this is the correct text, and if so, accept this answer ;_)
ب?س?ك? افف?م? افر??ح?ك?ل? افر??ح?ٍك?
The code I used to get this was:
header('Content-Type: text/html;charset=utf-8');
echo mb_convert_encoding('ÈöÓúãö Çááøåö ÇáÑøóÍúãóäö ÇáÑøóÍöíãö', 'utf-8', 'iso-8859-6');
I found the Arabic encoding via this page: http://a4esl.org/c/charset.html
Cheers!
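The question marks in the output above suggest that ISO-8859-6 may not be the right source encoding. The sample looks like Windows-1256 bytes displayed as Latin-1, but that is an assumption based on the sample alone. A minimal sketch under that assumption, using iconv() and a hypothetical helper name; whether your iconv build knows WINDOWS-1256 also needs checking:

```php
<?php
// Assumption: $bytes are raw Windows-1256 bytes, exactly as stored in the legacy table.
function windows1256_to_utf8($bytes)
{
    return iconv('WINDOWS-1256', 'UTF-8', $bytes);
}

// First word of the sample, "ÈöÓúãö", written as the raw bytes it stands for.
$legacy = "\xC8\xF6\xD3\xFA\xE3\xF6";

header('Content-Type: text/html; charset=utf-8');
echo windows1256_to_utf8($legacy), "\n"; // بِسْمِ
```

The input has to be the original bytes from the table. A string literal pasted into a UTF-8 source file holds different bytes and would need to be converted back to single bytes first.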
I have content stored in a Postgres DB. Every time I fetch the content and display it using PHP, I get funny squares in IE and funny square-shaped question marks in Firefox.
Example below
* - March � May 2009
How do I remove this?
I do not have access to the server so can't adjust the encoding there, only have postgres DB details and FTP access to upload my files
I would also recommend "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)" by Joel Spolsky. I read it only recently myself, and it will definitely help you sort out your problems.
You need to make sure that Postgres, PHP, and your browser all agree on the content encoding, and that you have an appropriate font selected in your browser. The simplest way to do that is to choose UTF8 for everything.
I don't know about PHP, but I do know about databases and browsers. First, find out whether the database is UTF8. (From psql, I would run "\l" and look at the encoding.) Then find out whether PHP supports UTF8 (I have no idea how you do that). Then check how those characters are being stored in the database by the PHP app. Then check whether the web server is correctly reporting the content encoding. (On Linux/Unix, I'd use the program "HEAD" (not "head") to see the headers it's returning.) Finally, check whether your browser is using a font that covers the characters you need.
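For the header check, a small PHP sketch can do what the HEAD program does by hand. The function below is a hypothetical helper that only parses a Content-Type value; the value itself could come from PHP's get_headers(), which is not called here.

```php
<?php
// Returns the lowercase charset named in a Content-Type header value, or null if none is given.
function charset_from_content_type($headerValue)
{
    if (preg_match('/;\s*charset\s*=\s*"?([A-Za-z0-9._:-]+)"?/i', $headerValue, $m)) {
        return strtolower($m[1]);
    }
    return null;
}

var_dump(charset_from_content_type('text/html; charset=UTF-8')); // string(5) "utf-8"
var_dump(charset_from_content_type('text/html'));                // NULL
```

A null result means the server leaves the charset to the browser's guesswork or to the meta tag, which is a common source of the symptoms described here.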
Or, you could just make sure you only store ASCII and forget the rest of the world exists. Not recommended.
There is a wrong charset somewhere. The characters could already be stored wrongly in the database, or you have the wrong charset in the meta tags on the page (try changing the charset manually in the browser), or the wrong encoding could be used when the page communicates with the database.
Check this page for more information: http://www.postgresql.org/docs/8.2/static/multibyte.html
Try to use the same encoding everywhere, preferably UTF-8.
You have encoding issues. Make sure the encoding is set correctly in the database and in the HTML markup, and make sure the files themselves are saved in the proper encoding.
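If the server settings cannot be changed, one workaround is to normalize each value in PHP before printing it. This is a sketch only: ensure_utf8() is a hypothetical helper, and Windows-1252 as the legacy encoding is a guess, not something stated in the question.

```php
<?php
// Returns $bytes unchanged if they are already valid UTF-8; otherwise converts
// them from $assumedEncoding (Windows-1252 here is a guess at the legacy encoding).
function ensure_utf8($bytes, $assumedEncoding = 'WINDOWS-1252')
{
    if (preg_match('//u', $bytes) === 1) {
        return $bytes;
    }
    return iconv($assumedEncoding, 'UTF-8', $bytes);
}

// 0x96 is an en dash in Windows-1252 and an invalid byte on its own in UTF-8.
echo ensure_utf8("March \x96 May 2009"), "\n"; // March – May 2009
```

Guessing the encoding is fragile; asking Postgres for UTF-8 on the connection (for example with pg_set_client_encoding()) is the cleaner fix when the stored data is labelled correctly.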
I'm still learning the ropes with PHP & MySQL and I know I'm doing something wrong here with how character sets are set up, but can't quite figure out from reading here and on the web what I should do.
I have a standard LAMP installation with PHP 5 and MySQL 5. I set everything up with the defaults. When some of my users submit comments to our database, some characters show up incorrectly, mostly apostrophes and em dashes at the moment. In MySQL, apostrophes show up as ’. They display on the page this way too (I'm using htmlentities to output user comments).
In phpMyAdmin it says my MySQL Charset is UTF8-Unicode.
In my database, my tables are all set up with the default latin1_swedish_ci.
My web pages all have meta http-equiv="Content-Type" content="text/html; charset=utf-8"
When I look at the site's http headers I see: Content-Type: text/html
Like a newbie, I hadn't considered character sets at all until things started looking odd on some of my pages. So does it make most sense for me to convert everything to utf-8 and will this affect my PHP code? Or should I try to get it all into Latin? And do I have to go into the database and replace these odd codes, or will they magically display once I set up the charsets properly? All the fiddling I've done so far hasn't helped (I set the http headers to utf-8, and also tried latin).
If you really want to understand these issues, I would start by reading this article at mysql.com. Basically, you want every piece of the puzzle to expect UTF-8. On the PHP side, you want to do something like:
<?php header("Content-type: text/html; charset=utf-8");?>
<html>
<head>
<meta http-equiv="Content-type" content="text/html; charset=utf-8">
And when you run your insert queries, make sure both the table's character encoding and the encoding of the connection you run the queries on are UTF-8. You can accomplish the latter by running the query SET NAMES utf8 once, right after connecting and before any insert query.
http://www.phpwact.org/php/i18n/charsets
That site gave me a lot of good advice on how to make everything play nice in UTF-8.
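I also recommend switching from htmlentities to htmlspecialchars, as it is more UTF-8 friendly: it only escapes &, <, > and quotes, so multibyte characters pass through untouched, whereas htmlentities without an explicit 'UTF-8' charset argument assumes ISO-8859-1 on PHP versions before 5.4.
The main point is to make sure everything is talking the same language. Your database, your database connection, your PHP and your page should all be in UTF-8 (the page should have a meta tag and a header saying so).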
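The difference can be seen in a short sketch. The comment text is made up; the 'ISO-8859-1' argument is passed explicitly to reproduce what htmlentities() assumed by default before PHP 5.4.

```php
<?php
$comment = "It’s <b>bold</b>"; // UTF-8 source file; ’ is U+2019, a three-byte character

// htmlentities told to read the bytes as ISO-8859-1: each byte of ’ is treated
// as a separate character, so the first one turns into &acirc;.
$mangled = htmlentities($comment, ENT_QUOTES, 'ISO-8859-1');

// htmlspecialchars with the real charset only touches & < > " and '.
$safe = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

echo $safe, "\n"; // It’s &lt;b&gt;bold&lt;/b&gt;
```

Passing 'UTF-8' to htmlentities() as well would avoid the mangling, so the explicit charset argument matters more than the choice between the two functions.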
Sorry for not understanding all of your question. But when part of the question is "UTF-8 or not?", the answer is: "UTF-8, of course!"
You definitely want to sort things out now rather than later. One of the most important programming rules is not to keep going with a bad idea - don't dig yourself in any deeper!
MySQL can transcode latin1 data to utf8, so you can convert your tables (for example with ALTER TABLE ... CONVERT TO CHARACTER SET utf8) without manipulating the data by hand. One caveat: if UTF-8 bytes were already written into latin1 columns, converting will double-encode them, so check a few rows on a copy first.
It's then important to check that everything is speaking UTF-8. Set the HTTP headers in Apache or use a meta tag; this tells the browser that the HTML output is UTF-8.
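A sketch of how the conversion statement could be generated per table. The helper name and the table name 'comments' are hypothetical, and utf8_unicode_ci is just one reasonable collation choice.

```php
<?php
// Builds the conversion statement for one table. The table name is checked
// against a conservative pattern because identifiers cannot be bound as parameters.
function build_convert_to_utf8_sql($table)
{
    if (!preg_match('/^[A-Za-z0-9_]+$/', $table)) {
        throw new InvalidArgumentException('Unexpected table name: ' . $table);
    }
    return 'ALTER TABLE `' . $table . '` CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci';
}

echo build_convert_to_utf8_sql('comments'), "\n";
// ALTER TABLE `comments` CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci
```

Take a backup before running the statement, and try it on a copy of one table first to confirm that existing rows still read correctly afterwards.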
With this in mind, you need to make sure all of the data you send really is utf-8! Configure your IDE to save php/html files as utf-8. Finally make sure that PHP is using a utf-8 connection to MySQL - issue this query after connecting:
SET NAMES 'utf8';
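Note that MySQL spells the character set utf8, without a hyphen. From PHP, a minimal sketch with the mysqli extension might look like this; the function name and all connection details are placeholders.

```php
<?php
// Opens a connection and switches it to UTF-8. All four arguments are placeholders.
function open_utf8_connection($host, $user, $password, $database)
{
    $db = new mysqli($host, $user, $password, $database);
    if ($db->connect_error) {
        throw new RuntimeException('Connect failed: ' . $db->connect_error);
    }
    // Has the same effect on the server as SET NAMES utf8, and also tells the
    // client library which charset is in use (real_escape_string() relies on that).
    if (!$db->set_charset('utf8')) {
        throw new RuntimeException('Could not set charset: ' . $db->error);
    }
    return $db;
}

// Usage (hypothetical credentials):
// $db = open_utf8_connection('localhost', 'app_user', 'secret', 'app_db');
```

The charset only needs to be set once per connection, not before every query.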