form outputs ascii while "accept-charset" is set to utf-8 - php

Folks, I need your help, please.
I have a form with some inputs that expect special characters, which is why I want to use UTF-8 encoding. I set it in the HTML as a meta tag, in PHP as a header, and directly on the form with "accept-charset". Yet I get the following:
var_dump($_POST['name']) => "dagã¶bert" (original input: "dägobert")
var_dump(mb_detect_encoding($_POST['vorname'])); => "ascii"
I have no idea what else to try to get this working. I appreciate any hint.

To make sure that your web server's output (from PHP) is interpreted as UTF-8, you can set the encoding explicitly by calling:
header('Content-Type: text/html; charset=utf-8');
at the beginning of your PHP script. It is important that this is called before the script produces any other output; otherwise PHP raises a warning that the headers have already been sent, and the header is not set.
The <meta charset="utf-8"> tag alone is not sufficient. You should still include it, though, so that the encoding is declared even when a user saves the page locally and views it later (when no Content-Type header is provided any more, because the page no longer comes from the web server).
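If it is unclear what actually arrives in $_POST, looking at the raw bytes usually settles it. Below is a minimal sketch; describe_bytes is a hypothetical helper (not part of PHP), and the sample strings use byte escapes in place of real form input:

```php
<?php
// Hypothetical helper: report the raw bytes of a string and whether
// they form valid UTF-8. PHP strings are byte sequences, so strlen()
// counts bytes, not characters.
function describe_bytes(string $value): array
{
    return [
        'hex'        => bin2hex($value),
        'bytes'      => strlen($value),
        // The /u modifier makes PCRE reject subjects that are not valid UTF-8.
        'valid_utf8' => preg_match('//u', $value) === 1,
    ];
}

// "dägobert" written with explicit byte escapes, so the result does not
// depend on the encoding this file is saved in.
$asUtf8   = "d\xC3\xA4gobert"; // ä as two UTF-8 bytes
$asLatin1 = "d\xE4gobert";     // ä as one ISO-8859-1 byte

$utf8Info   = describe_bytes($asUtf8);
$latin1Info = describe_bytes($asLatin1);
```

In a real script you would pass $_POST['name'] instead of the sample strings and compare the hex dump with the bytes you expect.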

Related

Download Excel File from a Website - CSV/MySql

First, sorry, my English isn't very good.
I have a problem: when I download an Excel file from a website (direct download), it works on Windows but not on a Mac.
I get the names, first names, etc. from a MySQL database.
The German "ä - ü - ö" are not converted properly on the Mac.
How can I convert them? Do you know what I mean?
I work with Notepad++.
Programming Language is PHP
Excel version : 2010.
From what you describe, I suppose you have a PHP script that generates a CSV file with data from your database.
So this sounds like a typical encoding/charset problem to me. You have to decide in which encoding you want to store your text in the database. These days that is most commonly UTF-8. For German text (I assume that's the language because of the umlauts) you could also use ISO-8859-15.
It's just a guess, but in your case I think you may not have specified how the browser should interpret the received CSV file.
You normally tell the browser about it in an HTTP header:
Content-Type: text/plain; charset="ISO-8859-15"
or whatever charset you are using (maybe "UTF-8" instead).
The documentation for the PHP header() function may help you with setting the HTTP header.
It's also possible to define the charset in the HTML page, but in your case I think the PHP script sends the CSV file and not HTML. For the record, this is how to set the charset in HTML:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
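As a rough sketch of the header idea for the CSV case: the helpers content_type_header and convert_line are hypothetical names, the sample row is made up, and I assume the data comes out of the database as UTF-8 and that the iconv extension is available:

```php
<?php
// Hypothetical helper: build the Content-Type header line for a charset.
function content_type_header(string $charset): string
{
    return 'Content-Type: text/plain; charset="' . $charset . '"';
}

// Hypothetical helper: convert one CSV line from UTF-8 to the target charset.
function convert_line(string $utf8Line, string $charset): string
{
    $converted = iconv('UTF-8', $charset, $utf8Line);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert line to $charset");
    }
    return $converted;
}

// Made-up sample row "Müller;Jörg;Käthe", written as UTF-8 byte escapes.
$line = "M\xC3\xBCller;J\xC3\xB6rg;K\xC3\xA4the";

$header = content_type_header('ISO-8859-15');
$latin9 = convert_line($line, 'ISO-8859-15');
// In the download script you would call header($header) before echoing $latin9.
```

The point is only that the charset named in the header and the bytes actually sent have to match; whether ISO-8859-15 or UTF-8 is the better choice for your Excel version is something you would have to test.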

encoding issues in drupal when importing from wordpress

I am currently moving blog posts from WordPress to Drupal. However, after the move,
some of the text is not displayed correctly.
WordPress displays:
When it hasn’t (HTML code is <h2>When it hasn’t</h2>)
Drupal displays:
When it hasn’t (HTML code is <h2>When it hasn’t</h2>)
In both the WordPress and the Drupal database the value is correct. The source is the same.
<h2>When it hasn’t</h2>
I did a search and found many options. None of them helped.
Below are the ones I have done and checked.
1) I double-checked that UTF-8 is the character encoding in Drupal and WordPress.
I also made a simple test.php file to check nothing else was coming in the way
and it still did not display correctly.
2) I made sure when we take a mysqldump and upload to drupal utf-8
is used.
3) I also made sure the .php file is in utf-8 when saved.
4) I changed the encoding type in chrome for every option available and nothing
displayed it correctly.
5) I also used php functions to recode it but they did not work.
$value2="<h2>When it hasn’t</h2>";
$out = recode_string('..utf-8', $value2);
//output - When it hasnt
$out2= mb_convert_encoding($value2,'UTF-8', "UTF-8");
// output - When it hasn’t
$out3= iconv('UTF-8', 'utf-8', $value2);
// output - When it hasn’t
I have run out of options now and I am stuck. Please help.
You say the text in both databases is correct, but this doesn't actually mean much: to view the content of a record you must use some client, and quite a few transformations may happen depending on how the text is rendered so that you can read it.
So only two things matter:
the encoding of the column
the encoding of the HTML page returned by Drupal
Since your page outputs ’ (the CP1252 rendering of the bytes 0xE2 0x80 0x99) for ’ (Unicode U+2019, whose UTF-8 encoding is 0xE2 0x80 0x99), I guess the column is indeed UTF-8; however, something between the database and the browser thinks the text is CP1252. This is what you have to check:
If using MySQL, the connection encoding must be UTF-8 so that what you have in your PHP script is UTF-8 text. You can use SET NAMES 'utf8' (MySQL calls the charset utf8 or utf8mb4, not UTF-8). Note that if you don't need the full Unicode set, you can even use CP1252: the only important thing is that you know the encoding, since PHP strings are just byte arrays.
Explicitly define the response encoding in the HTTP Content-Type header. I mean, configure Drupal to call header('Content-Type: text/html; charset=utf-8');
If the HTTP response encoding is different from the one used for the text retrieved from the db, transcode the query result accordingly.
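To see how this kind of garbling comes about, here is a small sketch that reproduces it on purpose. It assumes the iconv extension is available; the variable names are just for illustration:

```php
<?php
// U+2019 (right single quotation mark) as UTF-8 bytes.
$utf8Quote = "\xE2\x80\x99";

// Simulate the bug: treat those bytes as CP1252 and convert them to UTF-8,
// which is what happens when one layer assumes the wrong encoding.
$mojibake = iconv('CP1252', 'UTF-8', $utf8Quote);

// Reverse the mistake to recover the original bytes.
$repaired = iconv('UTF-8', 'CP1252', $mojibake);
```

Here $mojibake holds the three characters â, € and ™ encoded as UTF-8, and $repaired is the original quotation mark again. Reversing the conversion like this is only a diagnostic aid; the real fix is to make the connection and the response agree on one encoding.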

Sending UTF8 in GET parameter

When navigating to a URL like this:
http://example.com/user?u=ヴィックサ
I notice that Chrome encodes the characters as:
http://example.com/user?u=%E3%83%B4%E3%82%A3%E3%83%83%E3%82%AF%E3%82%B5
And everything works server-side.
However, in IE I get this error from my code:
The user you are trying to find (?????) does not exist.
Note the five question marks. For some reason the PHP never gets to see the parameter.
What could be causing this, and is there any way to fix it?
Sadly, it seems that what you want to do is not going to work in the current generation of IE.
The accepted answer to the question "UTF-8 Encoding issue in IE query parameters" says that you need to encode the characters yourself rather than rely on the browser, as support varies from browser to browser, and maybe even from device to device:
<a href='/path/to/page/?u=<?=urlencode('ヴィックサ')?>'>View User</a>
Also, I presume you are setting UTF-8 headers from the web server? You didn't say. If not, you can do it in PHP:
header('Content-Type: text/html; charset=utf-8');
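For what it's worth, here is a small sketch of what urlencode() does with that user name. The name is written as UTF-8 byte escapes so the example does not depend on how the file is saved; the variable names are made up:

```php
<?php
// "ヴィックサ" as UTF-8 bytes (three bytes per character).
$name = "\xE3\x83\xB4\xE3\x82\xA3\xE3\x83\x83\xE3\x82\xAF\xE3\x82\xB5";

// Percent-encode every byte so the URL itself is plain ASCII.
$encoded = urlencode($name);
$url     = 'http://example.com/user?u=' . $encoded;

// PHP decodes $_GET automatically; urldecode() shows the same round trip.
$decoded = urldecode($encoded);
```

The resulting URL is the same one Chrome produces in the question, so a link generated this way should not depend on how the browser chooses to encode the query string.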

How to set the charset to UTF-8 for a received http variable in PHP?

I have an HTML form using the POST method with one input field. But when I submit the form and echo the contents of the input field via $_POST['input_name'], I get this: KrkiÄ, although I entered (and need) this: Krkič
So how can I fix this?
I figured it out now. :)
If I want to add the contents to MySQL, then I need to add this:
if(!$mysqli->set_charset("utf8")){
printf("Error loading character set utf8: %s\n",$mysqli->error);
}
If I just need to echo the contents, then adding this meta tag
<meta charset="utf-8">
into the HTML head is enough.
There is no global default charset in PHP -- lots of things are encoding-aware, and each needs to be configured independently.
mb_internal_encoding applies only to the multibyte string family of functions, so it has an effect only if you are already using them (you need to do so most of the time that you operate on multibyte text from PHP code).
Other places where an incorrectly set encoding will give you problems include:
The source file itself (saved on the disk using which encoding?)
The HTTP headers sent to the browser (display the content received as which encoding?)
Your database connection (which encoding should be used to interpret your queries? which encoding for the results sent back to you?)
Each of these needs to be addressed independently, and most of the time they also need to agree among themselves.
Therefore, it is not enough to say "I want to display some characters". You also need to show how you are displaying them, where they are coming from and what the advertised encoding is for your HTML.
You can use:
<meta charset="UTF-8" />
and place this at the top of your PHP file:
header('Content-Type: text/html; charset="UTF-8"');

UTF-8 data received by php isn't decoded

I'm having some trouble with my $_POST/$_REQUEST data: it still appears to be UTF-8 encoded.
I am sending conventional Ajax POST requests, under these conditions:
oXhr.setRequestHeader("Content-type", "application/x-www-form-urlencoded; charset=utf-8");
JS file saved as UTF-8 without BOM
meta tags set up in the HTML <head> tag
PHP files saved as UTF-8 without BOM as well
encodeURIComponent is used, but I tried without it and it gives the same result
OK, so everything is fine: the database is also in UTF-8 and receives the data this way, and pages display well.
But when I receive the character "º", for example (through $_REQUEST or $_POST), its binary representation is 11000010 10111010, while the binary representation of "º" hardcoded in PHP (UTF-8...) is 10111010 only.
I just don't know whether this is a good thing or not. For instance, if I use "#º#" as the delimiter for PHP's explode function, it is not detected, and this is actually the problem that led me here.
Any help will, as usual, be greatly appreciated. Thank you so much for your time.
Best rgds.
EDIT1: checking against mb_check_encoding
if (mb_check_encoding($_REQUEST[$i], 'UTF-8')) {
    raise('$_REQUEST is encoded properly in UTF-8 at index ' . $i);
} else {
    raise(false);
}
The encoding was confirmed: the message was raised as expected.
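Single-byte UTF-8 characters do not have bit 7 (the eighth bit) set, so the lone byte 10111010 is not valid UTF-8. The "º" hardcoded in your PHP file is probably stored as ISO-8859-1, which means that file is not actually saved as UTF-8.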
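A short sketch of why explode() misses the delimiter, and one possible way around it. to_bits is a hypothetical helper, the sample input is made up, and I assume the iconv extension is available:

```php
<?php
// Hypothetical helper: render each byte of a string as 8 binary digits.
function to_bits(string $value): string
{
    $bits = [];
    foreach (str_split($value) as $byte) {
        $bits[] = sprintf('%08b', ord($byte));
    }
    return implode(' ', $bits);
}

$requestValue    = "a#\xC2\xBA#b"; // UTF-8 input, as received from the browser
$latin1Delimiter = "#\xBA#";       // delimiter from a file saved as ISO-8859-1

// Convert the delimiter to UTF-8 so both strings use the same encoding.
$utf8Delimiter = iconv('ISO-8859-1', 'UTF-8', $latin1Delimiter);

$withLatin1 = explode($latin1Delimiter, $requestValue); // delimiter not found
$withUtf8   = explode($utf8Delimiter, $requestValue);   // splits into "a" and "b"
```

Here to_bits("\xC2\xBA") gives the two-byte pattern from the question and to_bits("\xBA") the single byte. Re-saving the PHP file as real UTF-8 is the cleaner fix; converting the delimiter at runtime is just a way to confirm the diagnosis.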
