I can't seem to get these Chinese punctuation marks to work with my database (UTF-8).
When I echo the query, the marks look like this:
���
In PHP I have already done:
$text=mysql_real_escape_string(htmlentities($text));
As a result, they are not saved into the database correctly. What can I do to fix this?
Thanks
Executing mysql_query('SET NAMES utf8'); before any operation involving Unicode text will do the trick. MySQL spells the character set name utf8, without a hyphen.
Try using the utf8_encode() function while inserting into the DB and utf8_decode() while printing the same data.
Add the character 'N' before your string value.
E.g. select * from test_table where temp=N'unicode string'
Besides, if you want to use htmlentities, you have to set it to UTF-8 encoding like this:
htmlentities($string,ENT_COMPAT,"UTF-8");
Don't put HTML-encoded data in the database. It should be raw text until the time you spit it onto the page (at which point you should use htmlspecialchars()).
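A minimal sketch of escaping at output time instead of before storage. The sample value is made up, and the Chinese punctuation is written with \u{...} escapes (PHP 7+) so the example does not depend on how the file itself is saved.

```php
<?php
// Hypothetical raw value as it would sit in the database: no HTML encoding applied.
$stored = "\u{300C}Tom & Jerry\u{300D} <b>\u{3002}";

// Escape only when writing into the page, and say explicitly that the input is UTF-8.
$html = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');

// The punctuation passes through untouched; only &, < and > are escaped here.
echo $html, PHP_EOL;
```

Only the characters that are special in HTML get replaced, so the stored text stays searchable and comparable as plain text.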
You need to make sure that both your database and your page are using UTF-8:
ensure your tables are CREATEd with a UTF-8 collation;
use mysql_set_charset after connecting to ensure the connection between MySQL and PHP is UTF-8;
set the Content-Type of the page to text/html;charset=utf-8 by header or meta tag.
You can get away with using a different encoding such as the default latin-1 on the database end and the connection if you treat it as bytes, but case-insensitive comparisons won't work if you do, so it's best to stick to UTF-8.
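The three steps above could look roughly like this with mysqli, the successor of the old mysql_* functions. This is a sketch under assumptions: the table name notes, the helper name open_utf8_connection and the choice of utf8mb4 (MySQL's full UTF-8 character set, available since MySQL 5.5.3) are mine, not part of the answer.

```php
<?php
// Step 1 (assumed schema): create the table with a UTF-8 character set and collation.
const CREATE_TABLE_SQL =
    'CREATE TABLE notes (id INT AUTO_INCREMENT PRIMARY KEY, body TEXT) '
    . 'DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci';

// Step 2: open the connection and switch it to UTF-8 right away.
// Host, credentials and database name are placeholders supplied by the caller.
function open_utf8_connection(string $host, string $user, string $pass, string $db): mysqli
{
    $conn = new mysqli($host, $user, $pass, $db);

    // Counterpart of mysql_set_charset(): sets the connection charset and
    // lets real_escape_string() know about it, unlike a bare SET NAMES query.
    if (!$conn->set_charset('utf8mb4')) {
        throw new RuntimeException('Could not set connection charset: ' . $conn->error);
    }
    return $conn;
}

// Step 3: tell the browser the page is UTF-8, before any output is sent.
const PAGE_CONTENT_TYPE = 'Content-Type: text/html; charset=utf-8';
if (!headers_sent()) {
    header(PAGE_CONTENT_TYPE);
}
```

The function is only defined here, not called, so nothing connects until real credentials are passed in.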
Related
I am working on an application where I need to store and show data containing many special characters. I have set the database character set to utf8, and the table's character set to utf8 with collation utf8_unicode_ci. It stores special characters like é and â. But whenever a character ,€ comes it isn't stored as it is. Like whenever there is a word “attributed†it becomes âattributedâ. I am currently using Laravel 5.2 (PHP).
What I have tried so far
I have set following in my code
iconv_set_encoding('internal_encoding', 'UTF-8');
mb_internal_encoding('UTF-8');
I have also tried
$value = array_map("utf8_encode", $array);
But this special character isn't getting stored as it is. Can anyone let me know what I should do to get this special character saved as it is?
Try setting your collation to "utf8_general_ci" in MySQL.
Normally it's no problem to store the € sign in a database field. Check that all your scripts are saved in the correct encoding.
Set your table and all data in that table to utf8_general_ci, then try changing your PHP file to UTF-8.
header('Content-Type: text/html; charset=utf-8');
And if you have incorrect data, then use
utf8_encode("test");
to encode your string as correct UTF-8. If that doesn't work, I think the data or string you are trying to convert is not correct.
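One caveat worth illustrating: utf8_encode() converts from ISO-8859-1 to UTF-8 unconditionally, so running it on text that is already UTF-8 double-encodes it. The sketch below uses mb_convert_encoding() for the same conversion (utf8_encode() is deprecated as of PHP 8.2) and guards it with a validity check. The helper name to_utf8_once and the assumption that non-UTF-8 input is ISO-8859-1 are mine.

```php
<?php
// Hypothetical helper: convert to UTF-8 only when the input is not already valid UTF-8.
function to_utf8_once(string $value): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value; // already UTF-8, leave untouched
    }
    // Assumption: anything that is not valid UTF-8 is ISO-8859-1.
    return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
}

$euro   = "\u{20AC}";  // the euro sign as UTF-8: bytes E2 82 AC
$latin1 = "caf\xE9";   // "café" in ISO-8859-1

// Blind conversion treats each UTF-8 byte as a Latin-1 character and double-encodes.
$doubleEncoded = mb_convert_encoding($euro, 'UTF-8', 'ISO-8859-1');

echo bin2hex($euro), PHP_EOL;                // e282ac
echo bin2hex($doubleEncoded), PHP_EOL;       // c3a2c282c2ac
echo bin2hex(to_utf8_once($euro)), PHP_EOL;  // e282ac
echo to_utf8_once($latin1) === "caf\u{00E9}" ? 'converted' : 'unchanged', PHP_EOL;
```

The check is a heuristic: a short ISO-8859-1 string can happen to be valid UTF-8 as well, so knowing the real source encoding is always better than guessing.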
To be able to use UTF-8 characters in your queries, you must first run a query that sets the connection character set, like this:
$db->query("SET NAMES UTF8");
But I've seen that you said you're using Laravel. I haven't worked with it, but I guess this is set automatically by the charset parameter in the config, as in the screenshot from this question:
Laravel UTF-8 To Database
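For reference, a sketch of what the relevant part of a MySQL connection entry in Laravel's config/database.php typically looks like. The variable name and all values are placeholders of mine; the point is only the charset and collation keys, which Laravel applies when it opens the connection.

```php
<?php
// Sketch of one connection entry; in a real project this array lives under
// 'connections' => ['mysql' => [...]] in config/database.php.
$mysqlConnection = [
    'driver'    => 'mysql',
    'host'      => 'localhost',   // placeholder
    'database'  => 'app',         // placeholder
    'username'  => 'app_user',    // placeholder
    'password'  => 'secret',      // placeholder
    'charset'   => 'utf8',
    'collation' => 'utf8_unicode_ci',
];
```

If these two keys disagree with the character set of the tables, the symptoms are the same as with a missing SET NAMES.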
So I have programmed a crawler to scrape information and data from a website with charset UTF-8. But when I tried to store the contents in MySQL, some special characters (such as Spanish letters) did not show correctly in MySQL.
Here is what I have done:
Put header("Content-Type: text/html; charset=utf-8") in PHP
Set all collations in MySQL to utf8_unicode_ci
Have $conn->query("SET NAMES 'utf8'") run upon connection
Double checked that the html I parsed was encoded in utf-8
So what are some potential problems here?
Maybe you coded your crawler using functions which are not supposed to manage multi-byte characters.
For example strlen instead of mb_strlen.
Try putting:
mb_internal_encoding("UTF-8");
as the first line of your PHP code, and then check whether you need to switch some functions to their mb_ versions.
Have a look at the multibyte string reference.
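To make the difference concrete, here is a small sketch comparing the byte-oriented functions with their mb_ counterparts. It assumes the mbstring extension is loaded; the sample word is mine.

```php
<?php
mb_internal_encoding('UTF-8');

$word = "se\u{00F1}or"; // "señor": five characters, six bytes in UTF-8

echo strlen($word), PHP_EOL;     // 6, counts bytes
echo mb_strlen($word), PHP_EOL;  // 5, counts characters

$byteCut = substr($word, 0, 3);    // cuts the two-byte ñ in half
$charCut = mb_substr($word, 0, 3); // keeps "señ" intact

var_dump(mb_check_encoding($byteCut, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($charCut, 'UTF-8')); // bool(true)
```

A string that was cut in the middle of a character is no longer valid UTF-8, which is one way a crawler can end up storing broken text.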
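As a last resort, you may try the iconv function just before inserting the string into MySQL.
Something like:
$utf8_string = iconv(mb_detect_encoding($string, 'UTF-8, ISO-8859-1', true), "UTF-8", $string);
should do the trick, with mb_detect_encoding() guessing the source encoding from the listed candidates.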
Start by checking if the data is stored wrong in the database, in which case the problem is with your crawler. Otherwise the problem is in your presentation.
To test this, I would suggest that you use a dedicated mysql client (Such as the command line client) to inspect data.
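When inspecting the data, looking at raw bytes is more reliable than looking at rendered characters, because the client's own display encoding can mislead you. In MySQL that can be done with SELECT HEX(column); the PHP sketch below does the equivalent on a string. The helper name describe_bytes is hypothetical.

```php
<?php
// Hypothetical helper: report the raw bytes of a value and whether they form valid UTF-8.
function describe_bytes(string $value): string
{
    $status = mb_check_encoding($value, 'UTF-8') ? 'valid UTF-8' : 'not UTF-8';
    return bin2hex($value) . ' (' . $status . ')';
}

echo describe_bytes("\u{00F1}"), PHP_EOL;          // c3b1 (valid UTF-8): ñ stored correctly
echo describe_bytes("\xF1"), PHP_EOL;              // f1 (not UTF-8): ñ as ISO-8859-1
echo describe_bytes("\u{00C3}\u{00B1}"), PHP_EOL;  // c383c2b1 (valid UTF-8): ñ double-encoded
```

The third case is the tricky one: the bytes are valid UTF-8, but they spell two characters where one was intended.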
I remember pulling my hair out in dealing with UTF8 issues until I started adding this to my header:
setlocale(LC_ALL, 'en_US.UTF-8');
I am trying to replace Â£ with £ and it did not work.
I've tried:
echo str_replace("Â£", "£", "Â£3 Discount Discount");
I have also tried html_entity_decode, which also did not work.
This is an issue with trying to display UTF-8–encoded data as non–UTF-8. You need to make sure that all character encodings are consistent, and if not then you're converting between them appropriately. The easiest way is to ensure that absolutely everything is in UTF-8. This includes:
The data that's saved in the database (MySQL's character set / collation)
The client connection to the database (using SET NAMES utf8)
The output to the browser (header('Content-Type: text/html; charset=utf-8');)
The PHP script containing the code (yes, this sometimes has an impact)
I would first suggest checking that there isn't any mojibake in your database (e.g. using phpMyAdmin or a command-line client) before checking the character sets above. If you find that the database actually contains Â£, then I would suggest applying the same logic to any input mechanism feeding the database (including the character encoding of HTML forms).
(Note: I've assumed MySQL throughout this answer.)
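As an illustration of the mechanism, the sketch below reproduces the fault on a pound sign and then reverses it. It assumes the mbstring extension and that the misreading happened exactly once, with ISO-8859-1 as the wrong encoding.

```php
<?php
$pound = "\u{00A3}"; // "£" as UTF-8: bytes C2 A3

// Simulate the fault: the two UTF-8 bytes are read as two ISO-8859-1 characters
// and then written out again as UTF-8.
$mojibake = mb_convert_encoding($pound, 'UTF-8', 'ISO-8859-1');

echo bin2hex($pound), PHP_EOL;     // c2a3
echo bin2hex($mojibake), PHP_EOL;  // c382c2a3, displayed as "Â£"

// Reversing the step recovers the original, provided it went wrong only once.
$repaired = mb_convert_encoding($mojibake, 'ISO-8859-1', 'UTF-8');
var_dump($repaired === $pound);    // bool(true)
```

Repairing stored data this way treats the symptom; the mismatch that produced it still has to be fixed at the source.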
If you're able to, try using &pound; instead of the £ character and save yourself the trouble.
You can try cleaning it up in the DB instead. Adapt this query to suit your needs.
UPDATE YOUR_TABLE_NAME SET THE_ROW = REPLACE(THE_ROW, 'Â£', '£');
Try utf8_decode() instead.
I have a table named "cust_details" which has a column "categories", where I have to store some categories like : blockadenlösung, affirmation, beziehungsprobleme lösen
But when I am trying to save this data into the database it is stored like :
blockadenlÃ¶sung, affirmation, beziehungsprobleme lÃ¶sen
That is, when umlauts appear in the string, it is not saved in its original form. I tried several charsets for storing these characters, but I am still facing the problem.
What may be the possible reasons...?
Thanks In Advance.....
The data you stored is encoded in UTF-8 (Ã¶ for an "ö" is typical for UTF-8), but it is displayed not as UTF-8 but as ISO-8859-1 or the like.
Make sure that you use the same encoding everywhere:
Deliver your pages with the header Content-Type: text/html; charset=utf-8
Use mysql_query("SET NAMES 'utf8'"); to set the encoding to utf-8
Make sure that the encoding of the database is UTF-8 (use HeidiSQL etc. to check)
Use this when you are inserting the characters:
N'characters here'
The N before the string declaration should enable you to enter it into the DB.
What is the type of the field?
You could specify database/table/field level character-sets. The default latin-1 works in most scenarios.
Otherwise, you would have to use plain text and store unicode strings like &#<4-digit-unicode-value>; into it. Then when you print it out, just dump the unicode into HTML and it will show up as such.
Here is a sample string in Pashto ترافيکي پيښو کې درې تنه مړه او څوارلس نور ټپيان شول. which we store directly into the table. The charset used is latin_charset_ci
Good Luck!
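If you do go the numeric-entity route described above, PHP's mbstring extension can produce the &#...; form for you. This is a sketch, assuming mbstring is loaded; the conversion map covers every code point from 0x80 upward, and the two sample letters are mine.

```php
<?php
// Conversion map: [first code point, last code point, offset, mask].
// Here: everything outside ASCII, no offset.
$map = [0x80, 0x10FFFF, 0, 0x1FFFFF];

$text = "\u{062A}\u{0631}"; // two Arabic-script letters, written as escapes

// Each non-ASCII character becomes a decimal numeric character reference.
$entities = mb_encode_numericentity($text, $map, 'UTF-8');

echo $entities, PHP_EOL; // &#1578;&#1585;
```

The result is pure ASCII, so it survives any column charset, but the database can no longer sort or search the text as real characters.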
I'm trying to compare some text to the text in a database. In the database, any text with an accent is encoded as in HTML (i.e. &eacute;). When I compare the database text to my string, it doesn't match because my string just contains é. When I use the PHP function htmlentities to encode the string first, the é turns into &Atilde;&copy;, which is weird. Using htmlspecialchars doesn't encode the é at all.
How would you suggest I compare é to &eacute;, as well as all the other accented characters?
You need to pass the correct charset to htmlentities. It looks like you're using UTF-8, but before PHP 5.4 the default was ISO-8859-1. Change it like this:
$encoded = htmlentities($text, ENT_COMPAT, 'UTF-8');
Another solution is to convert the text to ISO-8859-1 before encoding, but that may destroy information (ISO-8859-1 does not contain nearly as many characters as UTF-8). If you want to try that instead, do like this:
$encoded = htmlentities(utf8_decode($text));
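A small sketch showing what the charset argument changes. The sample word is mine, written with an escape so the file's own encoding does not matter.

```php
<?php
$text = "caf\u{00E9}"; // "café" as UTF-8

// Correct: tell htmlentities() the input is UTF-8.
echo htmlentities($text, ENT_COMPAT, 'UTF-8'), PHP_EOL;       // caf&eacute;

// Wrong charset: each of é's two bytes (C3 A9) is treated as its own character.
echo htmlentities($text, ENT_COMPAT, 'ISO-8859-1'), PHP_EOL;  // caf&Atilde;&copy;
```

The second line is exactly the odd output you get when UTF-8 input meets an ISO-8859-1 default.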
I'm working on a French site, and I had the same problem. This is the function that I use.
function convert_accent($string)
{
// utf8_decode() returns ISO-8859-1, so htmlentities() is told to expect that charset
return htmlspecialchars_decode(htmlentities(utf8_decode($string), ENT_COMPAT, 'ISO-8859-1'));
}
It first converts your string from UTF-8 to ISO-8859-1, then converts everything to HTML entities, even the tags. Since we want the tags left as they were, htmlspecialchars_decode then converts them back. So in the end you get a string with converted accents and untouched tags.
You can pass your email content through this function before sending it to the recipient.
Another issue you might face is that, sometimes with this function the content from database converts to ? . In this case you should do this before running your query:
mysql_query("SET NAMES `utf8`");
But you might not need to do it; it depends on the encoding of your table. I hope it helps.
The comparison depends on the charset and the collation you selected when you created the database or the tables. If you are saving strings with a lot of accents, as in Spanish, I suggest you use the utf8 charset and the collation that best matches the language (English, French or whatever) you're using.
The best thing about using the correct charset in the database is that you can save the string in a natural way. For example, I can store my name as is, "Mario Juárez", with no need for weird conversions.
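If the stored data cannot be changed and still contains entities, one option is to go the other way for the comparison: decode the stored value to plain UTF-8 and compare that. A minimal sketch; the helper name same_text is hypothetical.

```php
<?php
// Hypothetical helper: decode stored entities to UTF-8 text, then compare as plain strings.
function same_text(string $fromDatabase, string $plain): bool
{
    $decoded = html_entity_decode($fromDatabase, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    return $decoded === $plain;
}

var_dump(same_text('caf&eacute;', "caf\u{00E9}"));  // bool(true)
var_dump(same_text('caf&eacute;', 'cafe'));         // bool(false)
```

This is an exact, case-sensitive comparison done in PHP; accent-insensitive matching would still need a suitable collation on the database side.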
Ran into similar issues recently. Followed Emil's answer and it worked fine locally but not on our dev/stage environments. I ended up using this and it worked all around:
$title = html_entity_decode(utf8_decode($item));
Thanks for leading me in the right direction!