HTML Special Characters Post as ASCII - php

Please forgive me, I am new here. I hope I have formatted this correctly. I have converted a database from ASCII to UTF-8, changing all of the special characters. Example: â changed to &acirc;
Working example:
Domaine Comte Georges de Vogüé
changed to
Domaine Comte Georges de Vog&uuml;&eacute;
In my HTML page I have a form with the line below as one of the options.
<option>Domaine Comte Georges de Vog&uuml;&eacute;</option>
When the form is posted to the PHP page the value is changed to
Domaine Comte Georges de Vogüé
So when it is searched for in the database of course it is not found.
The options for the dropdown field are generated dynamically using code I found at So You Need To Fill a Dropdown Dynamically (https://css-tricks.com/dynamic-dropdowns/).
How do I keep the option value from changing when posted to the PHP script?

Did you try setting a value attribute on the option?
<option value="Domaine Comte Georges de Vogüé">Domaine Comte Georges de Vogüé</option>

You need to convert the HTML entities as used on the page (&uuml;, &eacute;, etc.) to the characters they represent before performing the database search. This can be done with the PHP function html_entity_decode() -- for example:
<?php
print html_entity_decode('Domaine Comte Georges de Vog&uuml;&eacute;');
?>
Will result in an output of --
Domaine Comte Georges de Vogüé
More information, options, and examples of usage can be found within the PHP manual -- http://php.net/manual/en/function.html-entity-decode.php
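A slightly more explicit variant of the call above, as a minimal sketch. It assumes the form value arrives as HTML entities and that the database stores real UTF-8 characters; the flags and the charset argument are spelled out so the result does not depend on the PHP version's defaults.

```php
<?php
// Assumption: the value arrives as HTML entities and the database holds UTF-8 text.
$encoded = 'Domaine Comte Georges de Vog&uuml;&eacute;';

// ENT_QUOTES also decodes quote entities; ENT_HTML5 selects the HTML5 entity table.
// The third argument makes the output encoding explicit.
$decoded = html_entity_decode($encoded, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $decoded, PHP_EOL; // prints: Domaine Comte Georges de Vogüé
```

The decoded string is what you would then pass to the database search.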

Related

How to encode and decode foreign charset when scraping data using php DOMDocument Class

Here is an example string that I am trying to scrape
Les Meilleurs Hôtels de Normandie en 2015
When I scrape it, I get a doubly encoded UTF-8 string, as follows
Les Meilleurs HÃ´tels de Normandie en 2015
To convert it back, I use the following function
mb_convert_encoding($utf8_string, 'Windows-1252', 'UTF-8');
The above function works, but I was wondering if it is possible to skip this conversion step and scrape the string as is.
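One commonly used way to avoid the round trip, sketched here under assumptions: the page bytes are UTF-8 and the page carries no charset declaration that DOMDocument picks up on its own. The HTML fragment and variable names are made up for illustration.

```php
<?php
// Hypothetical page fragment; the bytes are UTF-8 (escapes used so the file encoding does not matter).
$html = "<html><body><h1>Les Meilleurs H\u{00F4}tels de Normandie en 2015</h1></body></html>";

libxml_use_internal_errors(true); // keep parser warnings out of the output

$doc = new DOMDocument();
// Without a charset hint, loadHTML() treats the input as ISO-8859-1.
// Prefixing an XML declaration tells libxml that the input is UTF-8.
$doc->loadHTML('<?xml encoding="UTF-8">' . $html);

$title = $doc->getElementsByTagName('h1')->item(0)->textContent;
echo $title, PHP_EOL;
```

If the hint works for your page, the text nodes come back as plain UTF-8 and the mb_convert_encoding() step should no longer be needed.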

Yet another UTF-8, PHP and Ajax post

I've tried to solve an issue with character encoding for many days now without finding any solution.
Here's what's happening:
I have a form in a page.
When I copy paste a text from Adobe Reader to this form, everything goes fine.
When I copy paste a text from Preview (mac os image viewer), it turns into strange characters.
When the form is submitted, the sentence:
salade mêlée, tomates, mozzarella, basilic melon en saison et jambon cru
Goes through an ajax function and I can see in firebug:
salade%20me%CC%82le%CC%81e%2C%20tomates%2C%20mozzarella%2C%20basilic%20melon%20en%20saison%20et%20jambon%20cru
Now when I get this value in my Zend controller, in order to save it to my database, I run into the following cases:
If I iconv it to cp1252, the text is cut to "salade me" and that's it.
If I utf8_encode it, it turns into: salade meÌleÌe, tomates, mozzarella, basilic melon en saison et jambon cru
If I utf8_decode it, it becomes: salade me?le?e, tomates, mozzarella, basilic melon en saison et jambon cru
If I do no transformation, it works... but in phpMyAdmin I see: salade mêlée, tomates, mozzarella, basilic melon en saison et jambon cru
Any ideas? I'm going crazy!!
Thanks!
Make sure that phpMyAdmin is configured to use UTF-8, and that the database is also using UTF-8, as well as the connection between PHP and the database. If all of them are using UTF-8, then you should have no issues passing UTF-8 back and forth.
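As a quick check that the posted value is already UTF-8 (so no iconv/utf8_encode/utf8_decode step is needed on the PHP side), here is a small sketch using a shortened copy of the percent-encoded string from the question. It assumes the mbstring extension is available.

```php
<?php
// Shortened copy of the percent-encoded value seen in Firebug.
$posted = 'salade%20me%CC%82le%CC%81e%2C%20tomates';

$raw = rawurldecode($posted);

// true when the bytes are well-formed UTF-8
$isUtf8 = mb_check_encoding($raw, 'UTF-8');

// "e" followed by U+0302 (combining circumflex) is the decomposed form of "ê"
// that shows up in the posted value as e%CC%82.
$hasCombiningMark = strpos($raw, "e\u{0302}") !== false;

var_dump($isUtf8, $hasCombiningMark); // bool(true) bool(true)
```

Since the bytes are valid UTF-8, storing them untransformed is the right call; what remains is making the database connection and phpMyAdmin agree on UTF-8 as well.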

PHP working with text encoding

I am working on the Facebook Public Search API.
As you may understand, the results I get come from many different parts of the world.
What I have to do is give all the texts I get the same text encoding before putting them inside my MongoDB. I need to use UTF-8 as a general, working encoding.
This is an example of what I may get from Facebook:
10 ผู้นำที่โลà¸à¹„ม่ปรารถนา หาà¸à¹„ม่มีผู้นำประเภทนี้à¹à¸¥à¹‰à¸§à¹‚ลà¸à¹€à¸£à¸²à¸à¹‡à¸ˆà¸°à¸”ีขึ้นเยอะ โดยไทยติดอันดับ 1 อ่านต่อได้ที่นี่
or
Now he says he’d side with Pakistan if there were a conflict with the U.S. Better than the Taliban for sure, but not by much. The poor people of Afghanistan… Ayman al-Zawahiri: Al-Qaeda’
or
€™esercito tedesco, il primo modello di A400M è in fase di collaudo, e ci resterà per tre anni
Is there a function in PHP that can quickly convert the text to UTF-8?
Did you try this function?
http://php.net/manual/en/function.utf8-encode.php
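Note that utf8_encode() always treats its input as ISO-8859-1, so it double-encodes text that is already UTF-8, and it is deprecated as of PHP 8.2. A rough sketch of a more defensive approach follows. The helper name toUtf8() is made up, and the fallback to Windows-1252 is only an assumption about what the non-UTF-8 input might be; it needs the mbstring extension.

```php
<?php
/**
 * Hypothetical helper: return $text as UTF-8.
 * Assumption: anything that is not already valid UTF-8 is Windows-1252.
 */
function toUtf8(string $text): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already UTF-8; converting again would double-encode it
    }
    return mb_convert_encoding($text, 'UTF-8', 'Windows-1252');
}

echo toUtf8("caf\xE9"), PHP_EOL;     // Windows-1252 bytes in, UTF-8 out
echo toUtf8("caf\u{00E9}"), PHP_EOL; // UTF-8 bytes in, returned unchanged
```

This does not repair text that was already double-encoded before it reached you; it only avoids making things worse.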

Search and replace in MySQL database?

I have an unusual problem (this is linked to Browser displays � instead of ´)
I had mismatched character encoding settings on my server (UTF-8) and application (ISO-8859-1), so a third person tasked with entering Spanish translations, entered the words properly at his end, but they weren't saved correctly in the database.
I have subsequently fixed the problem and the server is now ISO-8859-1 as well. [I set
default_charset = "iso-8859-1"
in php.ini]
I do see a pattern in what is in the system. For example, the following appears:
Nombre de la organizaciÃ³n*
This needs to be:
Nombre de la organización*
i.e., I need to search and replace 'Ã³' with 'ó'.
How can I do so for an entire table (all fields)? (there will be other such corrections as well)
Use the replace function. Simple example:
SELECT REPLACE('www.mysql.com', 'w', 'Ww');
Result: 'WwWwWw.mysql.com'
Now, if you have a table called Foo and you want to replace those characters in a field called bar, you can do the following:
UPDATE Foo SET bar = REPLACE(bar, 'Ã³', 'ó');
Do this for all the affected fields and the problem is solved.
Best regards,
Lajos Arpad.
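Since the question mentions all fields and several corrections, the same REPLACE idea could be scripted. The sketch below only builds the statements; the helper name buildReplaceStatements() and the list of columns are hypothetical, and Foo/bar are the placeholder names from the answer above.

```php
<?php
/**
 * Hypothetical helper: build one parameterised UPDATE per column and replacement pair.
 * Identifiers cannot be bound as parameters, so they are checked against a simple pattern.
 */
function buildReplaceStatements(string $table, array $columns, array $replacements): array
{
    foreach (array_merge([$table], $columns) as $name) {
        if (!preg_match('/^[A-Za-z_][A-Za-z0-9_]*$/', $name)) {
            throw new InvalidArgumentException("Unsafe identifier: $name");
        }
    }
    $statements = [];
    foreach ($columns as $column) {
        foreach ($replacements as $wrong => $right) {
            $statements[] = [
                'sql'    => "UPDATE `$table` SET `$column` = REPLACE(`$column`, ?, ?)",
                'params' => [(string) $wrong, $right],
            ];
        }
    }
    return $statements;
}

// "\u{00C3}\u{00B3}" is what the UTF-8 bytes of "ó" look like when read as ISO-8859-1.
$fixes = ["\u{00C3}\u{00B3}" => "\u{00F3}"];
$statements = buildReplaceStatements('Foo', ['bar'], $fixes);
print_r($statements);
```

Each entry can then be run through a prepared statement on your own connection, ideally after backing up the table.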

Accented letters and other graphic symbols PHP->JS

I have to read a text file with PHP. The file contains normal text and may also contain symbols like these:
€ é ò à ° % etc
I read the content in PHP with file_get_contents and transform it for inserting into an SQL database.
$contFile = file_get_contents($pathFile);
$testoCommento = htmlspecialchars($contFile,ENT_QUOTES);
$testoCommento = addslashes($testoCommento);
Now, if I have this text, for example:
"l'attesa �é cruciale fino a quando il topo non viene morso dall'�€"
in the database I have this:
l'attesa è cruciale fino a quando il topo non veniene morso dall'€
When I get the data from the database, I use the PHP function for decoding HTML entities
$descrizione = htmlspecialchars_decode($risultato['descrizione'],ENT_QUOTES);
$descrizione = addslashes($descrizione);
Now I use JavaScript and AJAX to get the table content and display it on an HTML page.
In the browser, instead of the correct text (€, è), I get square symbols.
I think there is some mix-up with charset encoding/decoding, but I never figured it out.
The SQL table uses the "utf8_unicode_ci" collation and the column uses "utf8_general_ci".
The content-type of the page is
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Thanks for helping me!
addslashes() is not Unicode-compatible. You should use a different way to escape the quotes in the strings or, which would be much better, switch to prepared statements instead of constructing the SQL query as a string.
You can find more details at: http://eleves.ec-lille.fr/~couprieg/post/Bypass-addslashes-with-UTF-8-characters
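A minimal sketch of the prepared-statement approach with PDO. To keep it runnable without a server it uses an in-memory SQLite database, which is purely an assumption for the demo (it needs the pdo_sqlite extension); the table name commenti and the sample text are made up.

```php
<?php
// Assumption: in-memory SQLite stands in for MySQL so the sketch runs without a server.
// With MySQL the DSN would look like "mysql:host=...;dbname=...;charset=utf8mb4".
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE commenti (id INTEGER PRIMARY KEY, descrizione TEXT)'); // hypothetical table

// Sample text with a quote, "è" and "€" (escapes used so the file encoding does not matter).
$testo = "l'attesa \u{00E8} cruciale, costo 5 \u{20AC}";

// The placeholder keeps the value out of the SQL text,
// so neither addslashes() nor entity encoding is needed before storing.
$insert = $pdo->prepare('INSERT INTO commenti (descrizione) VALUES (?)');
$insert->execute([$testo]);

$letto = $pdo->query('SELECT descrizione FROM commenti')->fetchColumn();
echo $letto, PHP_EOL;
```

With this approach the text is stored as-is, and escaping for HTML can be left to the point where the value is actually written into a page.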
