I am looking for a database of translations so that I can have commonly used phrases and words translated by a machine rather than by an expensive translator. Is there such a thing as a translation database of words and frequently used phrases?
If you don't know of any, would you use such a service?
edit: the database should be monitored only by people and not by some automatic translator, since those tend to be VERY bad
I don't think this is enough. If you're going to translate single words, you need to have some idea of the context in which the word will be used.
For instance, consider the English word "row".
Does this mean:
1. A line of things
2. An argument
3. To move a boat with oars
4. An uproar
5. Several things in succession ("they won four years in a row")
These are likely to have very different translations.
So instead, it might well be worth keeping a multi-language glossary, where you record the definition of a term and its translation in all the languages you care about, but I think you'll need a professional translator to get the translations right, and the "lookup" will always need to be manual.
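A minimal sketch of what such a glossary could look like, in Python purely for illustration. Entries are keyed by the term together with its definition, so "row" can carry a different translation per meaning. The `senses` and `translate` helpers are hypothetical names, and the German and French entries are only illustrative; they would still need checking by a professional translator.

```python
# Glossary keyed by (term, sense): the sense records the context the word is used in.
GLOSSARY = {
    ("row", "a line of things"): {"de": "Reihe", "fr": "rangée"},
    ("row", "an argument"): {"de": "Streit", "fr": "dispute"},
    ("row", "to move a boat with oars"): {"de": "rudern", "fr": "ramer"},
}


def senses(term):
    """Return every recorded sense of a term, so a person can pick the right one."""
    return sorted(sense for (t, sense) in GLOSSARY if t == term)


def translate(term, sense, language):
    """Look up a translation; return None when the glossary has no entry."""
    return GLOSSARY.get((term, sense), {}).get(language)
```

The lookup stays manual in the sense that a person still has to choose which sense applies; the glossary only stores the result of that decision.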
Check: open-tran.eu. It is a database of translations taken from various open source projects.
http://www.google.com/language_tools
So what you want is a database phrase book? What do you want that for? You can't use a phrase book to translate books, software, etc. You can't rely on machine translation either, even though it can be a useful tool to start with. You have to use human translators who know the source and target languages well, preferably bilingual people.
The only thing a phrase book is good for is asking for directions, and then not understanding the answer... ;)
Related
I am the project manager on a website that needs to be converted into multiple languages. I am trying to figure out what the best option to go with is. I don't have a problem paying for something, but I just want to make sure it will work properly.
One option I have thought of is to (somehow) integrate Google Translate so that when the user clicks on the language they want to read the page in, Google translates the page into that language. I did work with Google Translate a little, but I found it a little clumsy. Maybe I am not using it properly.
Another alternative, definitely not the best idea but a backup if need be, is to put the content in a database and pull it depending on the user's language. The only problem is that changing one word in the English version would mean changing it in every other language as well.
I am open to any other idea. I can clarify the project more, if need be.
As someone who speaks several languages, I can assure you that Google Translate often misses the mark. In many cases their translations are embarrassing, especially when you try to translate individual words or phrases without a sufficient context. Some language pairs are better than others, but overall this is not an option at this point.
Compiled languages have the advantage of static i18n, where a separate version of the code is compiled for each UI language.
Database-driven dynamic i18n is a bad option, and almost all programming frameworks try to avoid it. I would recommend, therefore, that you look for an i18n solution that uses properties (text) files to look up translated strings. In PHP this is gettext or intl.
Note also that i18n involves not only translation of text, but it also requires appropriate localization of dates, numbers, currencies, etc.
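To make the file-based lookup concrete, here is a rough sketch in Python (not PHP's gettext or intl, just the general idea). The `key=value` format, the `parse_properties` and `lookup` helpers, and the file name in the comment are all assumptions for illustration. It falls back to the default language, and then to the key itself, when a string has not been translated yet.

```python
def parse_properties(text):
    """Parse 'key=value' lines, skipping blank lines and '#' comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        entries[key.strip()] = value.strip()
    return entries


# In a real project each catalog would be read from its own text file
# (a hypothetical name would be messages.de.properties); inlined here for brevity.
CATALOGS = {
    "en": parse_properties("greeting=Welcome\nlogout=Sign out"),
    "de": parse_properties("# German\ngreeting=Willkommen"),
}


def lookup(key, language, default_language="en"):
    """Return the string for `language`, falling back to the default, then the key."""
    for lang in (language, default_language):
        if key in CATALOGS.get(lang, {}):
            return CATALOGS[lang][key]
    return key
```

For example, `lookup("logout", "de")` returns the English text because the German catalog has no entry for it yet.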
"I don't have a problem paying for something, but I just want to make sure it will work properly."
Based on that statement, I would suggest that hiring a firm that specializes in translation is your best bet; then just add links that lead to each language version of your website.
Problems that you might encounter:
Adjusting content: some translations might be too short, some too long.
Using Google Translate can ruin your site, because it sometimes fails, especially for some languages.
Which is better:
gettext
custom MySQL+cache based functionality
Gettext is a sort of built-in feature, so I assume it's tuned for performance. Using Poedit is a pain, and it's impossible to show to any client.
Custom functionality allows for a simple translation interface, but it might be heavy on PHP/DB usage.
I suppose the question is: which one would you use when?
Localization is difficult. It is really difficult. It's not just "pairs of words" => "Wortpaare", it's a lot more complex than that. What most people forget when they look at gettext and go "Ugh, ugly" is that the localization process is a lot more important than the technical details of the implementation. That's because the actual translators are typically not programmers and are probably not even in-house. This causes a lot more headaches than you may think. gettext is really old, is battle tested and has a huge toolchain behind it that is tuned to support this process. If you want to do i18n and l10n properly, you need a powerful system. gettext is that and has support from a wide range of tools. Your Homebrewed Translation System™ does not.
First of all, you need a robust system to extract translatable strings. Without being able to automatically and reproducibly extract translatable strings from source code, you have a mountain of work for each new string you want to translate. In gettext, xgettext does that.
Next, you need a tool to synchronize the extracted strings with already existing translations in a way that no translations are lost and that only slightly changed translations are kept if possible. In gettext, msgmerge does that.
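To illustrate why this synchronization step matters, here is a simplified sketch of the idea in Python. It is not how msgmerge is implemented and it does not read PO files; the `merge_catalog` function, the dict layout and the 0.8 similarity cutoff are assumptions chosen for the example. Exact matches keep their translation, slightly changed strings reuse the old translation but are flagged for review, and strings that disappeared from the code are kept aside rather than thrown away.

```python
import difflib


def merge_catalog(extracted, existing):
    """Merge freshly extracted source strings with an existing translation catalog.

    Returns (merged, obsolete). `merged` maps each extracted string to a dict with
    a `translation` and a `fuzzy` flag; `obsolete` keeps translations whose source
    string no longer appears in the code.
    """
    merged = {}
    for source in extracted:
        if source in existing:
            merged[source] = {"translation": existing[source], "fuzzy": False}
            continue
        # Reuse the translation of a near-identical old string, flagged for review.
        close = difflib.get_close_matches(source, list(existing), n=1, cutoff=0.8)
        if close:
            merged[source] = {"translation": existing[close[0]], "fuzzy": True}
        else:
            merged[source] = {"translation": "", "fuzzy": False}
    obsolete = {s: t for s, t in existing.items() if s not in extracted}
    return merged, obsolete
```

With an old catalog containing "Save file" and new source code that says "Save files", the old translation is carried over with `fuzzy` set, so the translator only has to review it instead of starting from scratch.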
Next, you want a way to add extra information to strings. You want to be able to group them by category, "domain" and context, you may want to add comments for the translator to the source code and you may want translators to be able to add comments to the translations. gettext supports all that.
Next, you want a file format that has good support from a variety of tools, since you may be sending your files to China to get them translated there. The reason you may be sending them away to external translators is also the reason you need a good synching tool to merge changes, since this can be a very asynchronous process. PO files are very well supported, because gettext is so old. There are many open source and commercial tools that support the localization process at many levels, depending on your specific needs.
Do not underestimate the task of localization, choose a tool that is well suited for the process and learn it. gettext is a great tool, if admittedly not the most beginner friendly.
For what it's worth, here's my gettext extension for Twig, which makes gettext for PHP even better.
Maybe you should look into Memcached, which you can use in combination with MySQL.
It's very useful for fetching data which doesn't change too often, like translations.
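A rough sketch of the cache-in-front-of-the-database pattern, in Python so that it runs without any servers. Note the stand-ins: an in-memory `sqlite3` database plays the role of MySQL and a plain dict plays the role of Memcached. The table layout and the `get_translation` helper are assumptions for illustration.

```python
import sqlite3

# sqlite3 stands in for MySQL and a plain dict stands in for Memcached.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE translations (msg_key TEXT, lang TEXT, translation TEXT)")
db.execute("INSERT INTO translations VALUES ('greeting', 'de', 'Willkommen')")

cache = {}
stats = {"db_queries": 0}  # counts how often the database is actually queried


def get_translation(msg_key, lang):
    """Cache-aside lookup: try the cache first, query the database only on a miss."""
    cache_key = f"{lang}:{msg_key}"
    if cache_key in cache:
        return cache[cache_key]
    stats["db_queries"] += 1
    row = db.execute(
        "SELECT translation FROM translations WHERE msg_key = ? AND lang = ?",
        (msg_key, lang),
    ).fetchone()
    text = row[0] if row else msg_key  # fall back to the key when untranslated
    cache[cache_key] = text
    return text
```

Calling `get_translation("greeting", "de")` twice queries the database only once. With a real cache you would also need to invalidate or expire entries when a translation is edited.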
Gettext is a very old format. It uses files to store translations. It's clumsy, especially when you have translations by the thousands, let's say 20,000. Managing a PO file with 20,000 translation strings is a nightmare; across 50 languages it is impossible. Then you have to actually compile it into an MO file. No thanks. It might have made sense back in the 1990s, but not now.
Databases instead are powerful. Like really powerful. Name what you need and you can get it. In a second they can tell you exactly:
Which of the translation strings are not translated in which language
When was the translation first created and by whom
When was the translation last updated and by whom
Full history of every translation with person who has done the change
You can have all texts pre-translated in materialized views and get them with one select statement
Order translation strings in alphabetic order, in pages for page by page view and edit
Set which user can update exactly which translations
With some simple HTML web forms, anyone anywhere in the world can translate your application in real time, within seconds, and with full history, comment on every translation pair, receive and read replies, flag translations as to-do, etc.
Have analytics in seconds of who has made how many translations, in the last day, week, month, year - so you can give incentives out
Still want PO files? You can get these created by your database on a schedule that you need them
Missing translations? Your database can send automatic emails, SMS to the responsible translator for this language.
A translation has been updated by the translator? Now the database can send an email to the responsible reviewer to approve
Need the translation fast? Your database can call an API to have it translated right now, then send email to the responsible person to review
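A couple of the points in the list above can be sketched with plain SQL. The example below uses Python's built-in `sqlite3` module so that it runs as-is; the schema, table names and sample rows are hypothetical, and a production database would look different.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE strings (id INTEGER PRIMARY KEY, source TEXT NOT NULL);
    CREATE TABLE languages (code TEXT PRIMARY KEY);
    CREATE TABLE translations (
        string_id INTEGER REFERENCES strings(id),
        lang TEXT REFERENCES languages(code),
        translation TEXT NOT NULL,
        updated_by TEXT,
        updated_at TEXT DEFAULT CURRENT_TIMESTAMP,
        PRIMARY KEY (string_id, lang)
    );
    INSERT INTO strings (id, source) VALUES (1, 'Save'), (2, 'Cancel');
    INSERT INTO languages (code) VALUES ('de'), ('fr');
    INSERT INTO translations (string_id, lang, translation, updated_by)
        VALUES (1, 'de', 'Speichern', 'anna'), (2, 'de', 'Abbrechen', 'anna'),
               (1, 'fr', 'Enregistrer', 'luc');
""")


def missing_translations():
    """List (language, source string) pairs that have no translation yet."""
    return db.execute("""
        SELECT l.code, s.source
        FROM strings AS s
        CROSS JOIN languages AS l
        LEFT JOIN translations AS t ON t.string_id = s.id AND t.lang = l.code
        WHERE t.string_id IS NULL
        ORDER BY l.code, s.source
    """).fetchall()


def translations_per_user():
    """Count translations per translator, as a basis for simple analytics."""
    return db.execute("""
        SELECT updated_by, COUNT(*) FROM translations
        GROUP BY updated_by ORDER BY updated_by
    """).fetchall()
```

Here `missing_translations()` reports that 'Cancel' still lacks a French translation. A full change history would need an additional audit table, which this sketch leaves out.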
I am trying to figure out how autocorrect algorithms can be implemented in either PHP or C#.
In short, I have a user inputted word that should be able to have minor misspelling be tolerated. I also have an SQL database of correctly spelled words. I want to be able to grab the closest (correctly) spelled word from the database to that which the user entered.
I realize there are a zillion autocorrect packages out there, but I would like to be able to customize it, so I am looking for any information on implementing this functionality in either PHP or C#.
Many thanks,
Brett
I am assuming you mean Peter Norvig's spell corrector, only written in C# or PHP (1, 2) as linked from his site.
This is essentially the method Google uses for spelling corrections.
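For reference, the core idea of that kind of corrector fits in a few lines. The sketch below is a simplified version written in Python, not a copy of the C# or PHP ports; the tiny `WORD_COUNTS` table is a made-up stand-in for word frequencies that would normally come from a large text corpus or from your own word database.

```python
import string

# Hypothetical word-frequency table; in practice it would come from the word database.
WORD_COUNTS = {"spelling": 50, "spilling": 5, "correct": 40, "corrected": 10}


def edits1(word):
    """All strings one edit away: a deletion, transposition, replacement or insertion."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in letters]
    inserts = [left + c + right for left, right in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def known(words):
    """Keep only the candidates that are real words."""
    return {w for w in words if w in WORD_COUNTS}


def correct(word):
    """Prefer the word itself, then known words one edit away, then two edits away."""
    candidates = (
        known([word])
        or known(edits1(word))
        or known(e2 for e1 in edits1(word) for e2 in edits1(e1))
        or {word}
    )
    return max(candidates, key=lambda w: WORD_COUNTS.get(w, 0))
```

Among candidates at the same edit distance, the most frequent word wins, which is what makes a frequency table worth having.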
A dictionary file and Levenshtein distance functions are going to be your best bet.
http://us.php.net/manual/en/function.levenshtein.php
Check out the comments on that function, it has a few sample implementations.
To take it to the next level, you could also throw soundex or metaphone functions in there, and it will catch phonetic errors too.
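Here is a small sketch of the dictionary plus Levenshtein approach, written in Python for illustration (PHP has `levenshtein()` built in, so there you would not need to write the distance function yourself). The `closest_word` helper and its `max_distance` threshold are assumptions; the word list would come from your database and must not be empty.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def closest_word(user_input, dictionary, max_distance=2):
    """Return the dictionary word nearest to the input, or None if nothing is close."""
    best = min(dictionary, key=lambda word: levenshtein(user_input, word))
    return best if levenshtein(user_input, best) <= max_distance else None
```

Comparing the input against every word is fine for a few thousand words; for a large table you would want to narrow the candidates first, for example by length or first letter.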
Web or windows? Assume web, since you mention PHP.
Budget or no budget? There are various web editors out there. Telerik makes a nice AJAX control, for example, that allows using AJAX to spell check. It is fully customizable. I am sure some of the other vendors (Infragistics, Syncfusion, ComponentOne, etc.) have similar editors.
If you need to go open source, there are editors out there. I'm not sure which ones support customization of word lists, however. As the third-party controls are relatively inexpensive (a few hundred dollars or less) and easy to customize (Telerik's is), I find them a better option than coding it yourself or ending up with an open source implementation that is hard to customize. It is worth looking at open source, however.
Does somebody know of a simple PHP language switcher? I'm not really PHP savvy and I would like your help.
Thanks in advance.
The answer has already been posted, but let me give a brief explanation here.
Computers aren't smart. They don't understand higher level concepts like language. The fact is: computers can't look at a sentence and know what it means. Using advanced math and algorithms we can dissect the sentence and try to recognize key words, but something as simple as a misspelling could throw the whole algorithm for a loop.
Web services which perform automatic translation are not only buggy, but also tend to require LOTS of power and resources. That's why they're often only owned and operated by companies like Yahoo! (Babelfish) or Google (Google Translate).
Whenever a website has a simple feature for changing language (phpBB has a feature like this built in) the simple fact is that they typed everything several times. Once in English, once in Spanish, once in German... Then by clicking a button it determines whether to send you the English text, Spanish text, or German text. The same is true of Wikipedia. When you view an article in two different languages they are not by ANY means the same article. Many times I'll read the Spanish wiki and the information will differ drastically. Two different people wrote two different articles, and by selecting a language you're just telling Wikipedia which article to send you.
Your best bet if you really need your website translated at the click of a button is to add Google's Translate Tools. http://translate.google.com/translate_tools
There's no free ride here. You'll have to provide translated strings for every message displayed by your program. This article will get you started: Internationalization in PHP 5.3
Yeah, it requires some modifications:
1. separate your business logic from the presentation layer via templating
2. in your presentation layer, remove hard-coded text and replace it with PHP variables
3. create your language files
4. depending on how you solved the above, put your app together (hand the data from the language files over to the presentation layer)
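The four steps can be sketched in a few lines. This is Python rather than PHP, purely to show the shape of the solution; the `LANGUAGE_FILES` dict, the template and the `render_page` function are hypothetical, and in a real site each language would live in its own file.

```python
from string import Template

# Step 3: one "language file" per language (inlined here as dicts for brevity).
LANGUAGE_FILES = {
    "en": {"title": "Welcome", "body": "Hello, $name!"},
    "es": {"title": "Bienvenido", "body": "¡Hola, $name!"},
}

# Steps 1 and 2: the template holds only structure and placeholders, no hard-coded text.
PAGE_TEMPLATE = Template("<h1>$title</h1><p>$body</p>")


def render_page(language, name):
    """Step 4: hand the strings of the chosen language to the presentation layer."""
    strings = LANGUAGE_FILES.get(language, LANGUAGE_FILES["en"])
    body = Template(strings["body"]).substitute(name=name)
    return PAGE_TEMPLATE.substitute(title=strings["title"], body=body)
```

The switcher itself is then just a link or form that sets `language` (for example in the session or the URL); an unknown language falls back to English here.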
I am currently working on a project / website and I will need to make it available in several languages. The site was built with PHP / MySQL and a lot of JavaScript (jQuery). I have no idea where to start and I was hoping somebody could give me some hints. I would like to hear opinions about what the best approach is, whether there are good tools for such a PHP site, and what to do with the existing scripts, or rather with the text inside the scripts, which needs to be translated as well. Has anybody done something like this before who could guide me along the right path :) ??
thanks
There are a number of ways of tackling this. None of them is "the best way", and all of them have problems in the short term or the long term. The very first thing to say is that multilingual sites are not easy: translators are lovely people but hard to work with, and most programmers see the problem as a technical one only. There is also another dimension, outside the scope of this answer, as to whether you are translating or localising. Localising involves looking at the target audience's cultural mores and then tailoring language, style, layout, colour, typeface, etc., to that culture. Finally, do not use MT (machine translation) for anything serious or anything that needs to be accurate, and when hiring translators ensure that they are translating from a foreign language into their native language, which means that they understand all the nuances of the target language.
Right. Solutions. On the basis that you do not want to rewrite the site then simply clone the site you have and translate the copies to the target language. Assuming the code base is stable you can use a VCS to manage any code changes. You can tweak individual parts of the site to fit the target language, for example French text is on average 30% larger than the equivalent English text so using one site to deliver this means you may (will) have formatting problems and need to swap a different css file in and out depending on the language. It might seem a clunky way to do it but then how long are the sites going to exist? The management overhead of doing it this way may well be less than other options.
Second way without rebuilding: replace all content in the current site with tags and then put the text for the different languages in files or DB tables, sniff the user's desired language (do you have registered users who can set a preference, do you want to read the browser's language tag, or is it going to be the URL, dot-com, dot-fr, dot-de, that makes the choice?) and then replace the tags with text in the target language. Then you need to address the sizing issues and the image issues separately. This solution is in effect what frameworks like Symfony and Zend do to implement l10n.
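If you go with the browser language tag, the sniffing part could look roughly like the sketch below. It is written in Python for illustration and handles only the common shape of the `Accept-Language` header (a comma-separated list with optional `q=` weights); the `pick_language` function and its fallback to the primary subtag are assumptions, not a complete implementation of the HTTP rules.

```python
def pick_language(accept_language, supported, default="en"):
    """Choose the best supported language from an Accept-Language header value."""
    preferences = []
    for part in accept_language.split(","):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        preferences.append((quality, tag.strip().lower()))
    # Highest quality first; sorted() is stable, so header order breaks ties.
    for quality, tag in sorted(preferences, key=lambda p: -p[0]):
        primary = tag.split("-")[0]  # "fr-CA" falls back to "fr"
        if quality > 0 and primary in supported:
            return primary
    return default
```

A stored user preference or the domain (dot-fr, dot-de) should normally take priority over this header, with the header used only as a first guess.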
Then you could rebuild with a framework or with gettext and possibly have a cleaner solution, but remember that frameworks were designed to solve other problems, not translation, and the translation component has come into the framework as a partial solution, not the full one.
The big problem with all the solutions is ongoing maintenance, because not only do you have a code base but also multiple language bases to maintain. Unless your all-in-one solution is really clever and effective, the ongoing task will be difficult.