Diamond question mark (�) in PHP application - php

I have exactly the same application hosted on two different shared servers (one Windows, the other Linux), both connecting to the same SQL Server database using PDO.
On one of the servers (the Linux one), all special characters that come from SQL Server (like é, ç, õ) are replaced by �.
The DB collation is Latin1_General_CI_AS and I'm sending the following charset declaration to the browser:
<meta charset="UTF-8">
Here is my PDO connection:
$DBH = new PDO('dblib:host=111.111.111.11;dbname=mydbname;charset=utf8','myuser','mypass');
My question is: how can I find out what is happening here, and why does the same app run fine on one server but not on the other? Maybe it is some configuration in php.ini or in the PDO object?
I've downloaded a plugin for Chrome that lets you change the page's character encoding. When I change it to 'Windows-1252', the data coming from the database is displayed correctly, BUT any other element on the page that is generated dynamically by PHP gets its special characters replaced by symbols like Ã§ and Ã£.
Please do not mark this question as a duplicate, because it's not; I can't change the DB charset to UTF-8. And because the application runs fine on the Windows server, I think there is something I can do to fix the issue on the Linux server, besides messing with the database configuration (which I don't have access to at the moment).
Any ideas?
Thanks!

Besides client encoding, which you did already by setting <meta charset="utf-8"/>, you will have to set your connection encoding as well (for both applications).
For PHP, the accepted answer in this post suggests you do the following:
ini_set('mssql.charset', 'UTF-8');
Edit:
For PDO, you can pass it in the connection string. Example:
$DBH = new PDO('mssql:dbname=spr_bank;host=hostname;charset=utf8', $userDB, $passwordDB);
Note that this won't change your actual table to UTF-8. It only converts character sets on the connection so that the right characters are read and written, even if the table or database is Latin1_General_CI_AS.
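To illustrate what that connection-level conversion amounts to, here is a minimal sketch. It assumes the server delivers text in Windows-1252 (the code page behind Latin1_General_CI_AS) while the page is served as UTF-8. The helper name to_utf8_from_cp1252 is made up for this example; when charset=utf8 takes effect on the connection, the driver performs this step for you.

```php
<?php
// Hypothetical helper: converts a Windows-1252 byte string to UTF-8.
function to_utf8_from_cp1252(string $bytes): string
{
    return mb_convert_encoding($bytes, 'UTF-8', 'Windows-1252');
}

// "é" as a single Windows-1252 byte, the way an unconverted connection delivers it.
$raw = "\xE9";

// The lone byte is not valid UTF-8, so a UTF-8 page shows the replacement character.
var_dump(mb_check_encoding($raw, 'UTF-8'));

// After conversion the same character is a two-byte UTF-8 sequence.
echo bin2hex(to_utf8_from_cp1252($raw)), "\n";
```

Dumping bin2hex() of a value fetched on each server is a quick way to see whether the bytes arriving from the database already differ between the two machines.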

Here is what fixed my issue:
Set up /your/local/freetds.conf file:
[sqlservername]
host=111.111.111.11
port=1433
tds version=7.0
client charset=UTF-8
Make sure your connection DSN uses the server name from freetds.conf, not the IP:
'dsn' => 'dblib:host=sqlservername;dbname=yourdb',
Make FreeTDS use your local freetds.conf file as an unprivileged user from the PHP script via an environment variable:
putenv('FREETDSCONF=/your/local/freetds.conf');
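Put together, the steps above might look like the sketch below. The function names build_dblib_dsn and connect_via_freetds are hypothetical, and the credentials are the placeholders from the question; the important part is that putenv() runs before the first connection is opened, so FreeTDS picks up the client charset from the local file.

```php
<?php
// Point FreeTDS at the per-user config before any connection is opened.
putenv('FREETDSCONF=/your/local/freetds.conf');

// Hypothetical helper: builds a dblib DSN from a freetds.conf section name.
function build_dblib_dsn(string $serverName, string $database): string
{
    return sprintf('dblib:host=%s;dbname=%s', $serverName, $database);
}

// Hypothetical helper: opens the connection and makes PDO throw on errors.
function connect_via_freetds(string $dsn, string $user, string $pass): PDO
{
    $dbh = new PDO($dsn, $user, $pass);
    $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    return $dbh;
}

// The host part is the [sqlservername] section, not the IP address.
$dsn = build_dblib_dsn('sqlservername', 'yourdb');
echo $dsn, "\n";
```

The connection itself would then be opened with connect_via_freetds($dsn, 'myuser', 'mypass'), which is left out of the runnable part because it needs the pdo_dblib driver and a reachable server.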

Related

Charset issue! Webpage transfer to another server

1) I migrated a web page to another server, from MS Server 2012/XAMPP to CentOS 7/httpd. On CentOS 7 the web page contains question marks where there should be special characters such as "āīūņļš". The web page is built with PHP.
The old server is running an old XAMPP installation:
PHP Version 5.4.27
5.5.36 - MySQL
CentOS is running:
Apache/httpd
PHP 7.0.*
5.5.52-MariaDB
2) Both servers contain files with the same encoding and DB tables with the same charsets and collations, and both servers report "Server charset is UTF-8 Unicode (utf8)".
The only difference is "Server connection collation". The old server has "Collation" and CentOS has "utf8_general_ci".
3) I tried:
encode files to utf8
define utf8 in meta tags
header('Content-Type: text/html; charset=ISO-8859-1'); and
header('Content-Type: text/html; charset=utf8');
mysqli_set_charset($con,"utf8");
AddDefaultCharset UTF-8 in httpd.conf
I just don't understand why everything is OK on one server while on the other the text contains "?" when the files, PHP code, and DB are the same. Is there a chance that httpd doesn't have some module enabled?
There is one more problem: some PHP files and DB tables have different encodings and collations. I tried changing the file encoding to UTF-8 and it solved the problem for static text in PHP files. Some text in the DB contains strange characters, and the DB contains a lot of information. In some cases mysqli_set_charset($con,"utf8"); works, but at times text randomly disappears when mysqli_set_charset is used.

UTF8 characters from database don't show up properly in the browser - MySQL & PHP CodeIgniter

My database and tables are set to utf8_general_ci collation and utf8 charset. CodeIgniter is set to utf8. I've added the meta tag charset=utf8, and I'm still getting something like ÐºÐ²Ð°Ñ€Ñ‚Ð¸Ñ€Ñ‹ instead of Cyrillic letters (квартиры).
The same code running on the local machine works fine - Mac OSX. It's only breaking in the production machine, which is Ubuntu 11.10 64bit in AWS EC2. Static content from the .php files show up correctly, only the data coming from the database are messed up. Example page: http://dev.uzlist.com/browse/cat/nkv
Any ideas why?
Thanks.
FYI:
When I do error_log() the data coming from the database, it's the same values I'm seeing on the page. Hence, it's not the browser-server issue. It's something between mysql and php, since when I run SELECT * FROM categories, it shows the data in the right format. I'm using PHP CodeIgniter framework for database connection and query and as mentioned here, I have configured it to use utf8 connection and utf8_general_ci collation.
Make sure your my.cnf (likely to be in /etc/) has the following entries :
[mysqld]
default-character-set=utf8
default-collation=utf8_general_ci
character-set-server=utf8
collation-server=utf8_general_ci
init-connect='SET NAMES utf8'
[client]
default-character-set=utf8
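You'll need to restart the mysql service once you make your changes. Note that MySQL 5.5 and later no longer accept default-character-set and default-collation under [mysqld]; on those versions keep only character-set-server and collation-server in that section.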
Adding my comments in here to make this a little clearer.
Make sure the following HTTP header is being set so the browser knows what charset to expect.
Content-type: text/html; charset=UTF-8
Also try adding this tag at the top of your HTML <head>:
<meta http-equiv="Content-type" content="text/html; charset=UTF-8" />
For the page to display correctly, check three points:
encoding of your script file.
encoding of the connection.
encoding of the database or table schema.
If all of these are compatible, you'll get the page you want.
The original data has been encoded as UTF-8, the result interpreted in Windows-1252 and then UTF-8 encoded again. This is really bad; it isn't about a simple encoding mismatch that a header would fix. Your data is actually broken.
If the data is OK in the database (check with SELECT hex(column) FROM myTable to see whether it was already double encoded there), then something in your code must be converting it to UTF-8 on output.
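To make the double-encoding case concrete, here is a small sketch that reproduces it and applies a rough check. The function looks_double_encoded is a hypothetical helper, not something from the asker's project, and the check is only a heuristic: it undoes one UTF-8 to Windows-1252 round and tests whether the result is still valid, non-ASCII UTF-8. The script file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper: heuristic check for text that was UTF-8 encoded twice.
function looks_double_encoded(string $s): bool
{
    // Plain ASCII or invalid UTF-8 cannot be the double-encoded case.
    if (!mb_check_encoding($s, 'UTF-8') || !preg_match('/[^\x00-\x7F]/', $s)) {
        return false;
    }
    // Undo one round: map each character back to a single Windows-1252 byte.
    $undone = mb_convert_encoding($s, 'Windows-1252', 'UTF-8');
    // If those bytes are themselves valid, non-ASCII UTF-8, the text was likely encoded twice.
    return mb_check_encoding($undone, 'UTF-8')
        && preg_match('/[^\x00-\x7F]/', $undone) === 1;
}

$good = "квартиры"; // proper UTF-8

// Reproduce the bug: the UTF-8 bytes are misread as Windows-1252 and encoded again.
$double = mb_convert_encoding($good, 'UTF-8', 'Windows-1252');

// Compare the raw bytes, much like SELECT hex(column) does on the database side.
echo bin2hex($good), "\n";
echo bin2hex($double), "\n";

var_dump(looks_double_encoded($good));
var_dump(looks_double_encoded($double));
```

A legitimate string that happens to contain sequences such as "Ã©" would also be flagged, so treat the result as a hint for where to look rather than as proof.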
Search your project for uses of the functions utf8_encode, convert_to_utf8, iconv, or mb_convert_encoding. Running
$ grep -rn "\(utf8_\(en\|de\)code\|convert_to_utf8\|iconv\|mb_convert_encoding\)" .
in your application's /application folder should be enough to find something.
Also check these config values:
<?php
var_dump(
ini_get( "mbstring.http_output" ),
ini_get( "mbstring.encoding_translation" )
);
Well, if you are absolutely positive that your MySQL client encoding is set to utf8, there are two possible cases. One, double encoding, is described by Esailija.
But there is another one: your data is actually encoded in 1251, not in UTF-8.
In this case you have to either recode your data or set the proper encoding on the tables, though it is not a one-button task.
Here is a manual (in Russian) exactly for that case: http://phpfaq.ru/charset#repair
In short, you have to dump your table using the same encoding that is set on the table (to avoid recoding), back up that dump in a safe place, then change the table definitions to reflect the actual encoding, and then load it back.
Potentially this may also be caused by the mbstring extension not being installed (which would explain a difference between your dev and production environments).
Check out this post, it might give you a few more answers.
Try mysql_set_charset('utf8') after the MySQL connect. Then it should work.
After 2 days of fighting this bug, I finally figured out the issue. Thanks to @yourcommonsense, @robsquires, and a friend of mine from work for good resources that helped to debug the issue.
The issue was that at the time the SQL dump file was imported into the database, the charset for server, database, client, and connection was set to latin1 (the status command helped to figure that out). The command line was set to latin1 as well, which is why it showed the right characters, but the connection from the PHP code was UTF-8 and the data was being encoded again. I ended up with double encoding.
Solution:
mysqldump the tables and the data (while in latin1)
drop the database
set the default charsets to UTF8 in /etc/my.cnf as Rob Squires mentioned
restart mysql
create the database again with the right charset and collation
import the dump file back into it
And it works fine.
Thanks all for contribution!

UTF-8 problems with PHP DOM on Debian server

I have a problem with UTF-8 strings in PHP on my Debian server.
Update in details
I've done a little more testing and the situation is now more specific. I updated the title and details to fit the situation better. Thanks for the responses, and sorry that the problem wasn't described clearly. The following script works fine on my local Windows machine but not on my Debian server:
<?php
header("Content-Type: text/html; charset=UTF-8");
$string = '<html><head></head><body>UTF-8: ÄÖÜ<br /></body></html>';
$document = new DOMDocument();
@$document->loadHTML($string);
echo $document->saveHTML();
echo $string;
As expected on my local machine the output is:
UTF-8: ÄÖÜ
UTF-8: ÄÖÜ
On my server the output is:
UTF-8: Ã„Ã–Ãœ
UTF-8: ÄÖÜ
I wrote the script in Notepad++ in UTF-8 without BOM and transferred it over SSH. As guido noted, the string itself is properly UTF-8 encoded. There seems to be a problem with PHP DOM or maybe libxml, and the reason must be some setting, since it is machine dependent.
Original question
I work locally with XAMPP on Windows and everything is fine. But when I deploy my project on the server UTF-8 strings get all messed up. In fact when I upload this test script
echo utf8_encode('UTF-8 test: ÄÖÜ');
I get "ÃÃÃ". Also when I connect with putty to the server I cannot write umlauts (ÄÖÜ) correctly in the shell. I have no idea if this issue is even PHP related.
Check your Apache AddDefaultCharset setting.
On standard Debian Apache distributions, the setting can be modified in /etc/apache2/conf.d/charset.
Please verify that your file is byte-for-byte the same as on your local machine. An FTP transfer in text mode could have messed it up. You may want to try a binary transfer.
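One way to do that byte-for-byte check, as a rough sketch: run the same script on both machines and compare what it prints. The helper name describe_file_bytes is invented for this example, and inspecting __FILE__ is just a stand-in for whichever script you actually deployed.

```php
<?php
// Hypothetical helper: summarises the raw bytes of a file so two copies can be compared.
function describe_file_bytes(string $path): array
{
    $bytes = file_get_contents($path);
    if ($bytes === false) {
        throw new RuntimeException("Cannot read $path");
    }
    return [
        'size'       => strlen($bytes),                          // length in bytes
        'sha1'       => sha1($bytes),                            // differs if any byte differs
        'has_bom'    => strncmp($bytes, "\xEF\xBB\xBF", 3) === 0, // UTF-8 byte order mark
        'valid_utf8' => mb_check_encoding($bytes, 'UTF-8'),
    ];
}

// Run on both machines and compare the output; here it inspects this script itself.
print_r(describe_file_bytes(__FILE__));
```

If the size or sha1 differs between the two machines, the transfer changed the file; if they match, the difference lies in the server configuration instead.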
EDIT: answer for updated question:
<?php
header("Content-Type: text/html; charset=UTF-8");
$string = '<html><head>'
.'<meta http-equiv="content-type" content="text/html; charset=utf-8">'
.'</head><body>UTF-8: ÄÖÜ<br /></body></html>';
$document = new DOMDocument();
@$document->loadHTML($string);
echo $document->saveHTML();
echo $string;
?>
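I suspect your input string may already be UTF-8. Try:
setlocale(LC_CTYPE, 'de_DE.UTF-8');
$s = "UTF-8 test: ÄÖÜ";
// Strict mode makes mb_detect_encoding reject byte sequences that are not valid UTF-8.
if (mb_detect_encoding($s, "UTF-8", true) === "UTF-8") {
    echo "No need to encode";
} else {
    // utf8_encode assumes the input is ISO-8859-1.
    $s = utf8_encode($s);
    echo "Encoded string $s";
}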
Are you explicitly sending a content-type header? If you omit it, it's likely that Apache is sending one for you. If the file is served with a Latin-1 encoding (by Apache) and the browser reads it as such, then your UTF-8 characters will be malformed.
Try this:
<?php
echo "Drop some UTF-8 characters here.";
Then this:
<?php
header("Content-Type: text/html; charset=UTF-8");
echo "Drop some UTF-8 characters here.";
The second should work, if the first doesn't. You may also want to save the file as a UTF-8-encoded file, if it's not already.
If your database characters are messed up, try setting the (My)SQL connection encoding.
Try changing the default charset on the server in your php.ini file:
default_charset = "UTF-8"
Also, make sure you are sending out the proper content-type headers as UTF-8.
In my experience with UTF-8, if you properly configure the PHP mbstring module and use the mbstring functions, and also make sure your database connection is using UTF-8, then you won't have any problems.
The DB part can be done for MySQL with the query "SET NAMES 'utf8'".
I usually start an output buffer and use mbstring to handle the buffer. This is what I use in production websites and it is a very solid approach. Then send the buffer when you have finished rendering your content.
Let me know if you would like the sample code for that.
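The answer does not include that sample code, so the following is only a guess at what such a setup could look like, using mbstring's mb_output_handler as the buffer callback. The wrapper name render_with_mb_buffer is hypothetical, and UTF-8 is assumed for both the internal and the output encoding.

```php
<?php
// Assumed setup: the application works in UTF-8 internally and serves UTF-8.
mb_internal_encoding('UTF-8');
mb_http_output('UTF-8');

// Hypothetical helper: runs $render inside an output buffer handled by mbstring.
function render_with_mb_buffer(callable $render): void
{
    ob_start('mb_output_handler'); // mbstring processes the buffer when it is flushed
    $render();                     // everything echoed here is collected in the buffer
    ob_end_flush();                // hand the buffered content on to the client
}

render_with_mb_buffer(function (): void {
    echo "UTF-8 test: ÄÖÜ\n";
});
```

With both encodings set to UTF-8 the handler has nothing to convert; it starts to matter when the internal encoding differs from what the page should be served as.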
Another easy trick to just see if it is the wrong headers being sent out by php or the webserver is to use the view->encoding menu on your browser and see if it is utf-8. If it's not and you switch it to utf-8 and everything looks ok then it is a problem with your headers or content type. If it is already utf-8 and the text is screwed up then it is something going wrong in your code or db connection. If you are using mysql make sure the tables and columns involved are also utf-8
The cause of the problem was an old version of libxml (2.6.32) on the server. On the development machine it was 2.7.3. I upgraded libxml to an unstable package, resulting in version 2.7.8. The problems are now gone.

How to remove  character

I have a very strange problem. I have one PHP website running on two servers. One is on Apache (Linux) and the second is on IIS (Windows). I run the Linux server just for the demo; IIS is the actual hosting I need to use. Even with the same code and database, on the Linux server there's no  character, but on IIS the  characters are everywhere. I checked all the meta tags; they're utf-8. The database collation is utf-8 as well. In the MySQL database I do have those  characters, but somehow on Linux, when we fetch the content from the database, those  don't show. It only happens on IIS. Can anyone point out how I can resolve this? Thank you.
I had a similar issue a while ago; there are some useful comments and information here. It's PHP, but I believe the theory would be the same: Question 386378
You also need to specify UTF-8 in the HTTP headers. With PHP:
<?php
header('Content-Type: text/plain; charset=utf-8');
?>
With Apache:
AddDefaultCharset UTF-8
The Apache setting can be placed in an .htaccess file.
"I checked all the meta tag, it's utf-8."
The browser treats the meta tag only as a fallback for when the HTTP headers declare no charset. Right-click and select "View Page Info" to see which encoding the browser actually uses for the page.
"In database collation is utf-8 also. In mysql database"
Collation is irrelevant for the display of characters. The charset matters, however, and so does the connection charset.
Try inspecting the html responses directly by using something like Fiddler or Firebug. Check to see if the responses from IIS/Apache (which should be returning exactly the same text) have:
Different data
Different headers
Pay particular attention to the Content-Type header, which should say what character encoding (utf-8, ISO/IEC 8859-1, Latin-1, etc.) the returned text is in.
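If you would rather script that comparison than click through a proxy tool, a sketch along these lines could work. Both function names are hypothetical, and declared_charset needs network access to the servers being compared, so it is defined here but not called.

```php
<?php
// Hypothetical helper: extracts the charset from a Content-Type header value,
// or returns null when the header declares none.
function charset_from_content_type(string $headerValue): ?string
{
    if (preg_match('/;\s*charset\s*=\s*"?([^";\s]+)"?/i', $headerValue, $m)) {
        return strtolower($m[1]);
    }
    return null;
}

// Hypothetical helper: fetches a URL's response headers and reports the declared
// charset. Returns null both when the request fails and when no charset is sent.
function declared_charset(string $url): ?string
{
    $headers = @get_headers($url, true); // associative array keyed by header name
    if ($headers === false) {
        return null;
    }
    $headers = array_change_key_case($headers, CASE_LOWER);
    $value = $headers['content-type'] ?? '';
    if (is_array($value)) {
        $value = end($value); // after redirects, keep the final response's header
    }
    return charset_from_content_type((string) $value);
}

var_dump(charset_from_content_type('text/html; charset=UTF-8'));
var_dump(charset_from_content_type('text/html'));
```

Calling declared_charset() once with the IIS URL and once with the Apache URL shows whether the two servers announce different encodings for the same page.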

International Fonts Display Issue with UTF-8

We have developed a PHP-MySQL application in two languages - English and Gujarati. The Gujarati language contains symbols that need unicode UTF-8 encoding for proper display.
The application runs perfectly on my windows based localhost and on my Linux based testing server on the web.
But when I transfer the application to the client's webserver (linux redhat based), the Gujarati characters are not displayed properly.
I have checked the raw data from both databases (on my webserver and on the client's webserver) and it is exactly the same. However, the display is different. On my server the fonts are displayed perfectly, but when I try to access the client's copy of the app, the display of the Gujarati font is all wrong.
Since I am testing both of these installations from the same machine and the same browser, the issue is not browser incompatibility or the code. I believe that there is some server setting that I am missing.
Can you help please.
Thanks
Vinayak
UPDATE
Thanks. We have tried the apache and php settings suggestions given by the SO community members. But the problem remains.
To breakdown the issue we have looked at the different stages that the data is passing through.
The two databases (at the client's end and at our end) are identical. There is no difference between them.
The next step in this app is that a query is run which recovers the data, creates an xml file and saves it.
This XML file is then displayed using a PHP page.
We have identified that the problem is that there is a difference in the XML file being created. The code used for creating the XML file is as below:
function fnCreateTestXML($testid)
{
$objQuery = new clsQuery();
$objTest = new clsTest();
$setnames = $objQuery->fnSelectRecords("tbl_testsets", "setnumber", "");
$queryresultstests = $objQuery->fnSelectRecords("tbl_tests", "", "");
if($queryresultstests)
{
foreach($queryresultstests as $queryresulttest)
{
foreach($setnames as $setname)
{
//Creating Root node test and set its attribute
$doc = new DomDocument('1.0','utf-8');
$root = $doc->createElement('test');
$root = $doc->appendChild($root);
//and so on
//Saving XML on the disk
$xml_create = $doc->saveXML();
$testname = "testsxml.xml";
$xml_string = $doc->save($testname);
Any ideas??
Thanks
Vinayak
The answer almost certainly lies in the headers being sent with the web pages. To diagnose issues like this, I've found it useful to install the firefox addon "Live HTTP Headers".
Install that addon, then turn it on and reload a page from the client's webserver, and from your own.
What you'll probably see is that the page served by your webserver has the header:
Content-Type: text/html; charset=UTF-8
Whereas when served by the client webserver it says:
Content-Type: text/html
The way I would recommend fixing this is for you to ensure that you explicitly set the header to specify utf-8 in every page of your application. This then insulates your application from future configuration changes on the client's end.
To do this, call
header('Content-type: text/html; charset=utf-8');
on each page before sending any data.
Since you've stated that it is working in your development environments and not on your client's, you might want to check the client's Apache AddDefaultCharset setting and set it to UTF-8 if it isn't already. (Assuming that they're using Apache.)
I tend to make sure the following steps are checked
PHP Header is sent in UTF-8
Meta tag is set to UTF-8 (Content-Type)
Storage is set to UTF-8
Server output is set to UTF-8
Make sure your PHP code files are encoded in UTF-8 without a BOM (Byte Order Mark); a BOM at the start of a PHP file is sent as output before your header() calls.
Make sure that the response headers are correct: the Content-Type should have UTF-8 in it.
Check the character set settings on the DB instance on the client's machine:
SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';
Before executing any text fetching queries, try executing:
SET NAMES utf8;
SET CHARACTER SET utf8;
If that works, you might want to add the following to the my.cnf on the client's machine:
[mysqld]
collation_server=utf8_unicode_ci
character_set_server=utf8
default-character-set=utf8
default-collation=utf8_general_ci
collation-server=utf8_general_ci
Please use this meta tag: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
Make sure to use this code in PHP: mysql_query("set character_set_results='utf8'");
