Find source of BOM in Zend Framework 2 - php

I realized that every response returned by my Zend Framework 2 application contains weird characters at the beginning. For example, when I copy the source code of any page returned by ZF2 and paste it into Notepad++, I see these characters at the start of the file: . They seem to be six consecutive Byte Order Mark characters.
I checked the encoding of my files, and every file I opened in Notepad++ was reported as UTF-8 without BOM.
Also, I checked pages from other sites hosted on the same server, and they have no such problem.
Could you please help me understand why this appears at the beginning of every page of my site, even in the JSON data returned by my web services? What would be the quickest way to spot where these characters are printed from, and how can I get rid of them?
Thank you for your help.

I eventually found my answer here:
Elegant way to search for UTF-8 files with BOM?
I tried both approaches described in that thread:
grep -rl $'\xEF\xBB\xBF' .
or using Total Commander.
Either approach helped me find the files where the BOM appears, and I was then able to convert those files to UTF-8 without BOM.
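For anyone who prefers to do the whole thing in PHP, here is a minimal sketch of the same idea (my own helper, not part of the linked answer): it walks a directory, reports every file that starts with the UTF-8 BOM (EF BB BF) and rewrites it without those three bytes. Back up or commit your files before running it.

<?php
// bom_strip.php - hypothetical helper: find files beginning with the UTF-8 BOM
// and rewrite them without it. Usage: php bom_strip.php /path/to/project
$bom = "\xEF\xBB\xBF";
$dir = isset($argv[1]) ? $argv[1] : '.';

$files = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
);

foreach ($files as $file) {
    if (!$file->isFile()) {
        continue;
    }
    $path     = $file->getPathname();
    $contents = file_get_contents($path);
    if (strncmp($contents, $bom, 3) === 0) {
        echo "BOM found: $path\n";
        file_put_contents($path, substr($contents, 3)); // drop the first 3 bytes
    }
}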

Related

Encoding of Files for PHP project

I am developing a PHP project whose pages are HTML5. The following meta tag is used on all pages of my website:
<meta charset="utf-8">
I am coding on a Windows machine using NetBeans. I was not really aware of the encoding of the files; since the code was working fine, I did not give it much importance.
However, based on some of the questions on Stack Overflow, I have come to understand more about encoding. I noticed that many PHP/JS/CSS files of my project are saved in UTF-8 encoding, whereas some PHP/JS/CSS files are saved in ANSI encoding. (To check this, I opened the file in Notepad, clicked Save As and looked at the default encoding shown.)
It seems the files into which I pasted some Unicode characters were automatically saved as UTF-8, and all other files were saved in ANSI encoding (I guess that means Windows-1252). All this happened even though I set the project encoding to UTF-8 in NetBeans.
Is it required to save those files (the ones that do not use Unicode) as UTF-8 as well, given that my HTML meta says UTF-8? (Note that there were no issues when I tested my website, but my testing was done from a Windows machine.)
I am also curious to know how the browser renders the web page correctly even though some of the PHP files are saved in ANSI but served with a UTF-8 meta charset.
(To check this, I opened the file in Notepad, clicked Save As and looked at the default encoding shown.)
This isn't an accurate way of checking the encoding of a file.
Files which contain only ASCII characters -- like most CSS and JavaScript source files! -- are valid in most text encodings. Notepad will call them "ANSI" because that's its default, but they're also perfectly valid as UTF-8. No conversion is necessary.
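To illustrate the point, here is a small sketch (my own, nothing NetBeans- or project-specific) that classifies a file along the lines the answer describes: a BOM is detectable from the first three bytes, and a file with no bytes above 0x7F is plain ASCII and therefore already valid UTF-8. The UTF-8 validity check assumes the mbstring extension is available.

<?php
// encoding_report.php - rough classification of a file's encoding.
function describeFile($path)
{
    $contents = file_get_contents($path);

    if (strncmp($contents, "\xEF\xBB\xBF", 3) === 0) {
        return 'UTF-8 with BOM';
    }
    if (!preg_match('/[\x80-\xFF]/', $contents)) {
        return 'pure ASCII - identical whether you call it ANSI or UTF-8';
    }
    return mb_check_encoding($contents, 'UTF-8')
        ? 'UTF-8 without BOM (contains non-ASCII characters)'
        : 'not valid UTF-8 - probably a legacy single-byte encoding such as Windows-1252';
}

echo describeFile(isset($argv[1]) ? $argv[1] : __FILE__), "\n";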

debug strange characters returned by symfony

I'm not sure how to debug this, or even how best to describe the problem, but all Symfony requests are returning strange characters at the beginning of every page. Example:
§{"id":"c8184631","version":0.1}
This should be a JSON response. Those stray characters appear at the beginning of every response, no matter the bundle or controller, but it only happens with Symfony; regular PHP on the same server is fine.
This doesn't happen locally. I'm unsure how to start debugging this, or even which questions to ask.
Maybe there are some files with a different encoding (UTF-8 or ISO-8859-13); that happened to me before, although I was not using Symfony2, just plain PHP.
What I did was open every file and change its encoding to UTF-8.
You can check and convert the encoding of each file in Notepad++, for example:
Encoding -> Convert to UTF-8.
It worked for me.
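One way to confirm that the stray prefix really is a UTF-8 BOM (an assumption here; the question never shows the raw bytes) is to hex-dump the start of the response with a throwaway PHP snippet like the sketch below. The URL is just a placeholder, and file_get_contents over HTTP assumes allow_url_fopen is enabled.

<?php
// response_probe.php - print the first bytes of a response in hex so the
// stray prefix can be identified; "ef bb bf" at the start is the UTF-8 BOM.
$url  = isset($argv[1]) ? $argv[1] : 'http://example.com/app_dev.php/some/route';
$body = file_get_contents($url);
echo chunk_split(bin2hex(substr($body, 0, 12)), 2, ' '), "\n";
// e.g. "ef bb bf 7b 22 69 64 22 ..." would mean BOM + '{"id"' of the JSON body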

Troubles Displaying Arabic Text

I have this website, websiteaddress.com, where I'm having trouble displaying Arabic text. The Arabic text shows only question marks (?????).
I had two chat sessions with support, but no result so far.
These options have been tested and didn't solve the issue:
No database connection is actually established for the Arabic text. All of it is static, with no database entries, so database encoding isn't an issue.
The site is based on WordPress; the encoding in Settings is set to "UTF-8". I also tried ISO-8859-1, and both gave the same result. (The Arabic text doesn't come from the WordPress database, it's hardcoded in the theme files.)
Added a default charset in .htaccess, both UTF-8 and ISO-8859-1.
Resent the headers using PHP with UTF-8 encoding, and also tried ISO-8859-1 (see the sketch just below).
Changed the php.ini in my hosting root directory and also under this specific account's root directory, switching the encoding from ISO to UTF-8.
So, none of the above solved the issue.
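For reference, a minimal sketch of the "resend the headers with UTF-8" attempt mentioned in the list above (my own example, with placeholder Arabic text): the header() call has to come before any output, and it only helps if the .php file itself is really saved as UTF-8 with no stray bytes before the opening tag.

<?php
// Hypothetical test page: declare UTF-8 explicitly before any output is sent.
header('Content-Type: text/html; charset=UTF-8');
?>
<!DOCTYPE html>
<html lang="ar" dir="rtl">
<head><meta charset="utf-8"><title>اختبار</title></head>
<body><p>نص عربي للتجربة</p></body>
</html>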
Also, I have created two test pages with exactly the same Arabic text inside: websiteaddress.com/test.html and websiteaddress.com/test.php. Both are the same page; I only changed the file extension. The HTML one works fine and displays the Arabic; the PHP one doesn't and displays question marks.
This is basically the issue.
If anyone has any other option for me to try, or knows how to go about this, please let me know!
I have searched a lot on Stack Overflow and found a lot of solutions; most of the options above were mentioned in questions there, but none of them answered my question, hence the post here.
Thanks and have a great day everyone!
It was solved by my hosting provider; just in case anyone needs this, here is their reply:
I checked the php.ini and the changes were not made in there, so I changed them for you. I also edited the following in your php.ini: exif.encode_unicode, which I set to UTF-8 and uncommented. Now the Arabic is rendering at websiteaddress.com/test.php and websiteaddress.com/test.html. You can add the following to your .htaccess to get a different PHP version rendered; the following line is for 5.3: AddType application/x-httpd-php53 .php

MediaWiki Special Characters in file name issue

When I upload a file to MediaWiki with special characters in the name, it mangles the filename into something that looks like this:
Waterford_(Dáil_Éireann_constituency).png
when it actually writes it to disk, whereas it should look like this:
Waterford_(Dáil_Éireann_constituency).png
This means that when another page links to that file, it comes up as a broken image link, because it's looking for
http://mysite.com/wiki/images/Waterford_(Dáil_Éireann_constituency).png
I don't want to prevent people from using special characters, as they often copy the files from Wikipedia, which supports them; I think it's something to do with the way my host handles files.
So it would be preferable if there were some way to intercept how MediaWiki creates the files, so that the file path is free of special characters while all the references still work.
MediaWiki is 100% UTF-8 safe, but something somewhere in your Apache/PHP configuration is mangling UTF-8 into ISO-8859-1 (Latin-1). Start by making sure your PHP install has mbstring enabled, as specified here:
http://www.mediawiki.org/wiki/PHP_configuration
If you're stuck with a messed-up host, then this Talk page has some clues for stabbing your Wiki with a knife to make it dumb down filenames:
http://www.mediawiki.org/wiki/Manual_talk:Configuring_file_uploads#Image_File_Names
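A quick way to verify that first suggestion on the host (a generic check of my own, not MediaWiki code):

<?php
// check_mbstring.php - confirm the extension MediaWiki relies on for UTF-8 handling.
if (extension_loaded('mbstring')) {
    echo "mbstring is loaded; internal encoding: ", mb_internal_encoding(), "\n";
} else {
    echo "mbstring is NOT loaded - enable it as described on the configuration page above\n";
}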
I can't speak to how and where to hook this in within MediaWiki, but a good way to do it would be to urlencode() the file name. Percent-encoded file names should work on all platforms, and it's easy to restore them to their proper form when necessary.
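A small illustration of that idea (my own snippet, using rawurlencode() rather than urlencode() so spaces become %20 instead of +; how to hook it into MediaWiki's upload path is left open). The script itself is assumed to be saved as UTF-8.

<?php
$name    = 'Waterford_(Dáil_Éireann_constituency).png';
$stored  = rawurlencode($name);    // Waterford_%28D%C3%A1il_%C3%89ireann_constituency%29.png
$decoded = rawurldecode($stored);  // back to the original UTF-8 name
var_dump($stored, $decoded === $name); // the round trip is lossless, and $stored is pure ASCII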

the dreaded UTF-8 BOM

I read your answer about reformatting PHP files used by includes, and it did remove the problem. My question is: we are working on a web site that needs to display different languages; will this be a problem?
Thanks, Conrad
The BOM is not necessary in UTF-8. It can be safely removed from your PHP scripts without losing UTF-8 support.
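To put it another way: the BOM carries no language information at all, so stripping it does not affect multilingual text. A tiny sketch of my own, with arbitrary sample strings and assuming mbstring is available:

<?php
// The same multilingual UTF-8 string, with and without a leading BOM.
$text    = "français / 日本語 / العربية";
$withBom = "\xEF\xBB\xBF" . $text;

var_dump(mb_check_encoding($text, 'UTF-8')); // true: valid UTF-8 without any BOM
var_dump(substr($withBom, 3) === $text);     // true: dropping the BOM loses nothing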
