I'm using a database that contains contacts (fields like name, address, ...). If i'm using in my database a city that contains special chars (like ü) or html codes (like ü), then how can i convert them to u, so when i search for a city that contains that a special char should be shown in the result...
the database is MyISAM and Collation is latin1_swedish_ci (by default)
Your database shouldn't contain any HTML codes but symbols itself.
With proper collation set, you will find both ü and u with your query
Edit: using collations example
you probably want to look at this article for other ideas and tools for how to solve your search problem. Normalizing your database search to latin-1 characters isn't the right solution.
http://www.alistapart.com/articles/accent-folding-for-auto-complete/
Related
In my database I have DONT word..
In database it is stored like this:DÕNT Due to that Õ the data is not coming..!
How can i retrive that..?
This my code of retrive:
htmlspecialchars($res_get_option['answer']);
Collation: Null and Type: MyISAM
You will need to set right collation for database and update wrong characters, with right ones, using a script for this.
I have a database(Mysql) in which I store more then 100 000 keywords with keyword in different languages. So an example if I have three colums [id] [turkish (utf8_turkish_ci)] [german(utf8)]
The users could enter a german or a turkish word in the search box. If the user enters a german word all is fine so it prints out the turkish word but how to solve it with the turkish one. I ask because each language has its own additional characters like ä ü ö ş etc.
So should I use
mb_convert_encoding
to convert the string but then how to check if it is a german or turkish string I think that would be to complex. Or is the encoding of the tables wrong?
Stuck now so how to implement it so the user could enter keyword of both languages words
You have several issues to solve to make this work correctly.
First, you've chosen the utf8 character set to hold all your text. That is a good choice. If this is a new-in-2016 application, you might choose the utf8mb4 character set instead. Once you have chosen a character set your users should be able to read your text.
Second, for the sake of searching and sorting (WHERE and ORDER BY) you need to choose an appropriate collation for each language. For modern German, utf8_general_ci will work tolerably well. utf8_unicode_ci works a little better if you need standard lexical ordering. Read this. http://dev.mysql.com/doc/refman/5.7/en/charset-unicode-sets.html
For modern Spanish, you should use utf8_spanish_ci. That's because in Spanish the N and Ñ characters are not considered the same. I don't know whether the general collation works for Turkish.
Notice that you seem to have confused the notions of character set and collation in your question. You've mentioned a collation with your Turkish column and a character set with your German column.
You can explicitly specify character set and collation in queries. For example, you can write
WHERE _utf8 'München' COLLATE utf8_unicode_ci = table.name;
In this expression, _utf8 'München' is a character constant, and
constant COLLATE utf8_unicode_ci = table.name
is a query specifier which includes an explicit collation name. Read this.http://dev.mysql.com/doc/refman/5.7/en/charset-collate.html
Third, you may want to assign a default collation to each language specific column. Default collations are baked into indexes, so they'll help accelerate searching.
Fourth, your users will need to use an appropriate input method (keyboard mapping, etc) to present data to your application. Turkish-language users hopefully know how to type Turkish words.
I have a word list stored in mysql, and the size is around 10k words. The column is marked as unique. However, I cannot insert full-width and half-width character of punctuation mark.
Here are some examples:
(half-width, full-width)
('?', '?')
('/', '/')
The purpose is that, I have many articles containing both full-width and half-width characters and want to find out if the articles contain these words. I use php to do the comparison and it can know that '?' is different than '?'. Is there any idea how to do it in mysql too? Or is there some ways so that php can make it equal?
I use utf8_unicode_ci for the database encoding, and the column is also used utf8_unicode_ci for the encoding. When I made these queries, both return the same record, '?測試'
SELECT word FROM word_list WHERE word='?測試'
SELECT word FROM word_list WHERE word='?測試'
Most likely explanation is a characterset translation issue; for example, the column you are storing the value to is defined as latin1 characterset.
But it's not necessarily the characterset of the column that's causing the issue. It's a characterset conversion happening somewhere.
If you aren't aware of characterset encodings, I recommend consulting the source of all knowledge: google.
I highly recommend the two top hits for this search:
what every programmer needs to know about character encoding
http://www.joelonsoftware.com/articles/Unicode.html
http://kunststube.net/encoding/
It had been written many times already that Opencart's basic search isn't good enough .. Well, I have came across this issue:
When customer searches product in my country (Slovakia (UTF8)) he probably won't use diacritics. So he/she writes down "cucoriedka" and found nothing.
But, there is product named "čučoriedka" in database and I want it to display too, since that's what he was looking for.
Do you have an idea how to get this work? The simple the better!
I'm ignorant of Slovak, I am sorry. But the Slovak collation utf8_slovak_ci treats the Slovak letter č as distinct from c. (Do the surnames starting with Č all come after those starting with C in your telephone directories? They probably do. The creators of MySQL certainly think they do.)
The collation utf8_general_ci treats č and c the same. Here's a sql fiddle demonstrating all this. http://sqlfiddle.com/#!9/46c0e/1/0
If you change the collation of the column containing your product name to utf8_general_ci, you will get a more search-friendly table. Suppose your table is called product and the column with the name in it is called product_name. Then this SQL data-definition statement will convert the column as you require. You should look up the actual datatype of the column instead of using varchar(nnn) as I have done in this example.
alter table product modify product_name varchar(nnn) collate utf8_general_ci
If you can't alter the table, then you can change your WHERE clause to work like this, specifying the collation explicitly.
WHERE 'userInput' COLLATE utf8_general_ci = product_name
But this will be slower to search than changing the column collation.
You can use SOUNDEX() or SOUNDS LIKE function of MySQL.
These functions compare phonetics.
Accuracy of soundex is doubtful for other than English. But, it can be improved if we use it like
select soundex('ball')=soundex('boll') from dual
SOUNDS LIKE can also be used.
Using combination of both SOUNDEX() and SOUNDS LIKE will improve accuracy.
Kindly refer MySQL documentation for details OR mysql-sounds-like-and-soundex
I would like to put the following question in front of you.
The database consists of various names. These names can contain 'special' characters like é, ê & alike.
My plan was to have a SEO friendly URL containing the names from the database. The creation of the link is not a big issue, just replace all the special characters with their normal characters using the normalize function or a str_replace function.
But then when I want to query the database using the SEO friend URL (after the necessary validations and checks). The DB ofcourse doesn't return any results due to the special characters.
Now comes the question, is there a way to query the database without the special characters or would it be better to have a separate column in the database where I store the name without the special characters (less need to clean them during the creation of the link and easy to query on). This will have the need to maintain two columns, but will allow me to easily query the database.
Any idea's?
Depending on the collation hôtel will be equal to hotel:
mysql> select 'hôtel' = 'hotel';
+--------------------+
| 'hôtel' = 'hotel' |
+--------------------+
| 1 |
+--------------------+
The problem you have is that you are storing html entities (hôtel) which I think is very wrong because you are mixing a presentation issue with a persistence issue. Get rid of it and choose an adequate collation to your charset then make it the database default or set it for each query.
An adequate charset for web use would be utf8. To set the database up do:
create database mydatabase character set utf8 collate utf8_unicode_ci;
The easy way is to put an ID in the URL like URL's from Stackflow.