Build a website in Arabic language - php

I saw a website like this one: http://www.a3malcom.com/index.php. I want to build the same kind of website in Arabic. Do the entries in the database tables also need to be in Arabic, or in English?
What if I need the website in two languages, i.e. English and Arabic? In what language should the data be entered in the DB?

Take a look at this comprehensive article (thanks to #Deceze for the great article):
Handling Unicode Front To Back In A Web App
It also has an Arabic example, alongside other languages.

Yes, you should insert the data in Arabic in the DB table. That way you can read it straight into the web page with no conversion needed.
And use UTF-8 encoding when displaying the page.

Usually unicode is all that is needed (UTF-8).
Your source file should be encoded UTF-8 if you want to write arabic in the PHP source. Note that some text editors don't support arabic properly.
For the database, just create your database with UTF-8 encoding.
For HTML output, add this to your HEAD section:
<meta http-equiv="Content-type" content="text/html;charset=UTF-8" />
Alternatively, you can send it as HTTP header... but why bother.
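For what it's worth, the header route is a one-liner in PHP. Below is a minimal sketch, assuming PHP 7 or later; renderArabicPage() is a hypothetical helper name, and the lang and dir attributes are my own addition for right-to-left text rather than something the advice above requires:

```php
<?php
// Hypothetical helper: wraps text that is already UTF-8 in a minimal HTML page.
function renderArabicPage(string $title, string $body): string
{
    // Tell htmlspecialchars() the input is UTF-8 so multi-byte Arabic
    // characters pass through untouched while <, >, & and quotes are escaped.
    $safeTitle = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');
    $safeBody  = htmlspecialchars($body, ENT_QUOTES, 'UTF-8');

    return "<!DOCTYPE html>\n"
        . "<html lang=\"ar\" dir=\"rtl\">\n"
        . "<head>\n"
        . "<meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" />\n"
        . "<title>{$safeTitle}</title>\n"
        . "</head>\n"
        . "<body><p>{$safeBody}</p></body>\n"
        . "</html>\n";
}

// Declare the charset in the HTTP header as well; this only works
// before any output has been sent.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}
```

A page would then call something like echo renderArabicPage('مرحبا', 'أهلا وسهلا'); with the PHP file itself saved as UTF-8.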
Anyway, I have encountered a problem generating Arabic sentences as images in GD using a custom font. It turns out GD doesn't render Arabic properly (at least in my case).
That was solved using the following library:
http://ar-php.org/
(yes the website is pretty ugly, but the library works, is well packaged, contains documentation...)
All I had to do, in my case, to fix the problem, is:
$Arabic = new I18N_Arabic('Glyphs');
$text = $Arabic->utf8Glyphs($_GET['txt']);
And then feed $text to GD.
I'm still encountering a minor problem with text direction, though, and am looking for a solution. But at least I have valid Arabic now.
In most cases you won't need that library; you just need to make sure that you're using UTF-8 throughout your development process.
Hope it helps.

Related

write arabic language in text box

I want to create an application that transliterates Arabic to English. (By convert I mean writing Roman Arabic as Roman English.)
For that I need the user's input in Arabic in a text area. If the user types in that particular text area, it should automatically write in Arabic only.
See this example: https://translate.google.com/#ar/en/%D9%84%D9%8A%D9%85%D8%A7%D9%86
How can this be done?
My ultimate goal is performing Arabic romanization using the Beirut System.
Arabic is relatively easy to transliterate to Latin script. There are a few widely used standards for this, which are mostly one-to-one mappings from an Arabic character to a Latin character. They are mentioned here: http://en.wikipedia.org/wiki/Romanization_of_Arabic
The PECL intl package has transliteration support. If you can't use that, you can have a look at the excellent Drupal Transliteration module for a PHP implementation (I suggest you download 7.x-3.2, scroll down the page to find it).
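As a rough illustration of the intl route, here is a minimal sketch. It assumes the intl extension is loaded; romanizeArabic() is a hypothetical helper name. The romanization scheme is whatever ICU ships for its Arabic-Latin transform, which is not guaranteed to match the Beirut System, so treat it as a starting point only:

```php
<?php
// Hypothetical helper: romanizes Arabic text using ICU's built-in transform.
function romanizeArabic(string $text): string
{
    // 'Arabic-Latin' is an ICU transform ID; 'Latin-ASCII' then folds the
    // accented Latin letters of the romanization down to plain ASCII.
    $result = transliterator_transliterate('Arabic-Latin; Latin-ASCII', $text);

    if ($result === false) {
        throw new RuntimeException(
            'Transliteration failed: ' . intl_get_error_message()
        );
    }

    return $result;
}
```

Calling romanizeArabic('ليمان') returns a Latin-script string; the exact spelling depends on the ICU version installed, so check it against the mapping table you actually need.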
You will need code that translates the Arabic to English. In PHP this can be done using language processing, which is a really big topic in its own right and not a small, well-bounded problem. Or you can use an API such as Google Translate to get the translation back.
You will need one text area and one container that will hold the translation.
You will need AJAX code that sends the current content of the text area to the server and gets back the translation.
The trigger should be onKeyUp.

Is it possible to show data of different encodings in the same page?

I have two tables here. One is in UTF-8 and holds Arabic text that can be read as is. The other has a different encoding; its content is also Arabic, but in the database it is displayed as
ÈöÓúãö Çááøåö ÇáÑøóÍúãóäö ÇáÑøóÍöíãö
I have to show data from both tables on the same page. The page is UTF-8 encoded, but I'm not sure whether this can be done. What do I do? My database is MySQL and I'm using PHP.
Also, is it possible to convert the encoding of the contents of the other table into UTF-8?
You have to use mb_convert_encoding() first, on everything, to make sure it's all in UTF-8 to begin with. http://us3.php.net/manual/en/function.mb-convert-encoding.php Then it should display, assuming your HTML's charset is UTF-8 and the users have the appropriate fonts installed.
Also, virtually all consoles and a great many free online SQL commanders (like phpMyAdmin) are not UTF-8 aware and print out gibberish. I have not yet found a free SSH client that supports UTF-8; if it's a big deal, invest in SecureCRT.
EDIT:
Excuse me. I don't read Arabic at all, but I did get Arabic back. Please tell me if this is the correct text, and if so, accept this answer ;_)
ب?س?ك? افف?م? افر??ح?ك?ل? افر??ح?ٍك?
The code I used to get this was:
header('Content-Type: text/html;charset=utf-8');
echo mb_convert_encoding('ÈöÓúãö Çááøåö ÇáÑøóÍúãóäö ÇáÑøóÍöíãö', 'utf-8', 'iso-8859-6');
I found the Arabic encoding via this page: http://a4esl.org/c/charset.html
Cheers!
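The question marks in that output suggest ISO-8859-6 is not quite the right source encoding. The bytes in the question decode cleanly as Windows-1256, the usual Windows code page for Arabic: È is byte 0xC8 (ب) and ö is byte 0xF6 (the kasra vowel mark). Below is a sketch of that conversion, assuming iconv is available (it knows this code page) and using hypothetical helper names. The second function covers the case where the text has already been mis-read as Western European and stored as UTF-8:

```php
<?php
// Hypothetical helper: raw Windows-1256 bytes -> UTF-8 text.
function cp1256ToUtf8(string $bytes): string
{
    $utf8 = iconv('WINDOWS-1256', 'UTF-8', $bytes);
    if ($utf8 === false) {
        throw new RuntimeException('Input is not valid Windows-1256.');
    }
    return $utf8;
}

// Hypothetical helper: repairs mojibake such as "ÈöÓúãö" held as UTF-8 text.
function fixArabicMojibake(string $garbledUtf8): string
{
    // Step 1: map each displayed Western character back to its single byte.
    $bytes = iconv('UTF-8', 'WINDOWS-1252', $garbledUtf8);
    if ($bytes === false) {
        throw new RuntimeException('Input contains characters outside Windows-1252.');
    }
    // Step 2: reinterpret those bytes as Arabic.
    return cp1256ToUtf8($bytes);
}
```

If the column simply holds raw Windows-1256 bytes, cp1256ToUtf8() alone is enough; which of the two cases applies depends on how the table and the connection were declared, so test on a copy of the data first.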

PHP search engine problem

I'm using Sphider as a search engine for my website. It's really easy to work with, but I'm having some major issues with localized characters.
All of my HTML/PHP pages have the charset defined as UTF-8, while the search and result pages from Sphider had charset=ISO-8859-1. When I first used the Sphider "spider" to crawl my website, it turned all of my localized characters into some encoding I don't recognize:
"ç" became "Ã§", and so on with "ã", "á", etc.
When I created the DB in MySQL I gave it the utf8_general_ci collation. My settings for the DB are:
MySQL charset: UTF-8 Unicode (utf8)
MySQL connection collation: utf8_unicode_ci
This is a real problem because the search won't work properly. If I search for "diferença", for instance, the URL shows "?query=diferença&search=1", which is correct, but the search produces no results, and in the "suggested search" it appears as "diferen�a". In case it's not visible: the "ç" has become a black square with a white question mark on it.
I believe the spider might be working in a different charset, but I can't work out where that would be set, if that is the case. Also, since it was developed primarily for English, it's not surprising that it has some hiccups along the way.
Does anyone have any experience with it, or know what I should try to solve this?
What's really bugging me is not understanding why I get strange symbols in the DB.
Quickly browsing through some Sphider source code files revealed that the application works only with Latin1 charset. You should switch to some other search engine, like Lucene. You'll need to do a bit more search-related coding though. If you don't feel like doing it, and your site is public, just integrate Google search.
You should have EVERYTHING in utf-8.
The forms that edit any given page
The physical files
The outputted html files
The headers
The connection to the database
The table definition
Miss one and you will have problems (I'm talking from personal experience)
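To see what "miss one" looks like in practice, here is a small sketch (assuming the mbstring extension) that reproduces the classic symptom: UTF-8 bytes read by a component that believes they are ISO-8859-1, then stored again as UTF-8.

```php
<?php
// "ç" as the two bytes a UTF-8 page actually sends.
$utf8 = "\xC3\xA7";

// A component that assumes ISO-8859-1 treats each byte as one character;
// when its output is saved as UTF-8, both of those characters get re-encoded.
$misread = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

// Reversing that single step recovers the original, provided nothing
// was truncated or replaced along the way.
$repaired = mb_convert_encoding($misread, 'ISO-8859-1', 'UTF-8');

// A cheap sanity check worth running on data before it is stored.
$isValidUtf8 = mb_check_encoding($utf8, 'UTF-8');
```

Here $misread is the two-character string "Ã§", which is exactly the kind of value that ends up in the index when the crawler and the pages disagree about the charset.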
Modify line 4 of the file "header.html" in the appropriate template directory to <meta http-equiv="content-type" content="text/html; charset=UTF-8">
Convert the appropriate PHP file in the "languages" directory to UTF-8.
If the above doesn't suffice, follow the answer by The Disintegrator as well.

Multilingual support with unicode characters. A little confusion

I am creating a web application framework, in which I am providing support for multilingual content.
I mean that a piece of content, say a paragraph, can have two sentences in English and another two in Hindi (an Indian language). Now I have several doubts about that.
1) User or admin will add that content to the website. They will be presented a textarea (where they can paste their content). Then they submit the post and I will save the content in a database. I also want to provide them a web based typewriter interface where they can type content in a given language, copy it from there, and then put it back in my main textarea.
Doubt:
1a) Will I need to do something to the textarea so that it accepts Unicode characters?
1b) Where can I find a typewriter interface for a language of my choice? Does TinyMCE support that?
1c) I should set the encoding of the database to 'UTF 8', right?
2) Then I need to get the content from the database, put it in a web page and show it. This content is UTF-8 encoded, as it can contain many languages. What do I need to do? I am guessing that just setting the encoding of the web page to UTF-8 will do. What will happen if the font required by a language is not installed on the client's PC?
I am using the PhpEd editor. Must my PHP files be encoded as UTF-8, or will just specifying UTF-8 in the HTML encoding tag be enough?
I am a bit stumped. Please help.
1a) The text area will accept text in any language, as long as the web page that contains it is encoded in UTF-8. If it doesn't work, double check both the HTTP Content-type header and the HTML META http-equiv tag for Content-type. If they are both present, they should agree; one of them would be sufficient.
1c) what to do with your database depends on the specific DBMS you use. If supported, make sure that
1. the table encoding
2. the connection/the client encoding
are both set to UTF-8.
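As one concrete case, here is a sketch assuming MySQL accessed through PDO; other DBMSs have their own equivalents. The helper name, table name and column names are placeholders. utf8mb4 is MySQL's name for full UTF-8, available from MySQL 5.5.3:

```php
<?php
// Hypothetical helper: builds a PDO MySQL DSN that sets the
// connection/client encoding to utf8mb4.
function utf8Dsn(string $host, string $dbName): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8mb4', $host, $dbName);
}

// Table encoding: declared in the DDL, per table or per database.
$createPostsSql = <<<SQL
CREATE TABLE posts (
    id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    body TEXT NOT NULL
) DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci
SQL;
```

The DSN would be passed as the first argument to new PDO(...) together with your own credentials, and the statement run once through $pdo->exec($createPostsSql).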
2) Again, set the page encoding to UTF-8 (see 1a). If the client system lacks suitable fonts, you lose. But if that's the case, the end user likely wouldn't have been able to read the text anyway (most users do have fonts for text in their native languages).
The encoding of the PHP files is only relevant if they contain non-ASCII text (which you should avoid).

Questions about iPhone emoji and web pages


Okay, so emoji basically shows the above on a computer. Is that another programming language? So how do I put those little boxes into a php file? When I put it into a php file, it turns into question marks and what not. Also, how can I store these in a MySQL without it turning into question marks and other weird things?
how do I put those little boxes into a php file?
Same way as any other Unicode character. Just paste them and make sure you're saving the PHP file and serving the PHP page as UTF-8.
When I put it into a php file, it turns into question marks and what not
Then you have an encoding problem. Work it out with Unicode characters you can actually see properly first, for example ąαд™日本, before worrying about the emoji.
Your PHP file should be saved as UTF-8; the page it produces should be served as Content-Type: text/html;charset=UTF-8 (or with a similar meta tag); the MySQL database should be using a UTF-8 collation to store data and PHP should be talking to MySQL using UTF-8.
However. Even handling everything correctly like this, PCs will still not show the emoji. That's because:
they don't have fonts that include shapes for those characters, and
emoji are still completely unstandardised. Those characters you posted are in the Unicode Private Use Area, which means they don't have any official meaning at all.
Each network in Japan uses different character codes for their emoji, mapped to different areas in the PUA. So even on another mobile phone, it probably won't display the correct character, unless you spend ages manually converting emoji codes for different networks. I'm guessing the ones you posted above are from SoftBank (iPhone?).
There is an ongoing proposal led by Google and Apple to collate the different networks' emoji and give them a proper standardised place in Unicode. Until then, getting emoji to display consistently across networks is an exercise in unhappiness. See the character overview from the standardisation work to see how much converting you would have to do.
God, I hate emoji. All that pain for such a load of useless twee rubbish.
This has nothing to do with programming languages, just with encoding and fonts. As a very brief overview: Every character is stored by its character code (e.g.: 0x41 = A, 0x42 = B, etc), which is rendered as a meaningful character on your screen using a font (which says "the character with the code 0x41 should look like this ...").
These emoji occupy the "private use area" of the Unicode table, which is a range of codes that are undefined and free for anyone to use. That makes them perfectly valid character codes, it's just that no standard font has an appropriate character to display for them, since they are undefined. Only the iPhone and other handhelds, mostly in Japan, have appropriate icons for these codes. This is done to save bandwidth; instead of transmitting relatively large image files back and forth, emoji can be transmitted using a single character code.
As for how to store them: They should be storable as is, as long as you don't try to convert them to another encoding, in which case they may get lost. Just be aware that they only make sense on the iPhone and other SoftBank phones in Japan.
Character Viewer http://img.skitch.com/20091110-e7nkuqbjrisabrdipk96p4yt59.png
If you're on OS X you can copy and paste the character into the Character Viewer to find out what it is. I think there's a similar Character Map on Windows (albeit inferior ;-P). You could put it through PHP's ord(), but that only looks at a single byte, so it only gives a meaningful answer for ASCII characters. See the discussion on the ord page for UTF-8 functions.
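On newer PHP versions there is a direct route. Here is a minimal sketch, assuming PHP 7.2 or later with the mbstring extension (which provides mb_ord()); the two helper names are hypothetical. It reports a character's code point and checks whether it falls in the Private Use Area of the Basic Multilingual Plane (U+E000 to U+F8FF):

```php
<?php
// Hypothetical helper: formats the first character's code point as U+XXXX.
function codePointLabel(string $char): string
{
    return sprintf('U+%04X', mb_ord($char, 'UTF-8'));
}

// Hypothetical helper: true if the first character lies in the BMP
// Private Use Area, i.e. it has no standard meaning or glyph.
function isPrivateUse(string $char): bool
{
    $codePoint = mb_ord($char, 'UTF-8');
    return $codePoint !== false
        && $codePoint >= 0xE000
        && $codePoint <= 0xF8FF;
}
```

For example, codePointLabel('A') gives U+0041, while isPrivateUse("\u{E415}") is true for that arbitrarily chosen Private Use code point.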
BTW, just for the fun of it, these characters display fine on the iPhone as is, because the iPhone has a font which has icons for them:
iPhone http://img.skitch.com/20091110-bjt3tutjxad1kw4p9uhem5jhnk.png
I'm using FF3.5 and WinXP. I see little boxes in my browser, too.
This tells me the string requires a character set not installed on my computer.
When you put the string into a PHP file, the question marks tell you the same thing: your computer doesn't know how to display the characters.
You could store these emoji characters in MySQL if you encoded them differently, probably using UTF-8.
Do a web search for character encoding, as it relates to MySQL.
