I'm currently working on a project which locally works just fine but once I upload it to my hoster's server it just throws errors. My hoster says that it's the fault of BOMs inside the file.
How do I get rid of them inside VS Code or Brackets?
In VS Code, the encoding information can usually be found at the bottom right of the editor window, along with end-of-line sequence (LF, CRLF) and language mode (PHP, JavaScript, etc.).
If the text file includes a BOM (Byte Order Mark) at its very beginning, "UTF-8 with BOM" should be displayed there in a clickable area. Clicking it opens an action menu at the top of the window with two items:
Reopen with Encoding
Save with Encoding
Select the Save with Encoding action, then UTF-8 in the next Encoding pop-up menu. This will get rid of the embedded BOM for the current file.
This is referenced in the VS Code online documentation: Basic Editing > File Encoding Support
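If many files are affected, the same re-save can be scripted. What follows is a minimal Python sketch, not something taken from the VS Code documentation: it assumes the file is valid UTF-8, and the function name `resave_without_bom` is made up for illustration. It relies on Python's `utf-8-sig` codec, which drops one leading BOM when decoding.

```python
from pathlib import Path

UTF8_BOM = b"\xef\xbb\xbf"


def resave_without_bom(path):
    """Rewrite a UTF-8 file without its leading BOM; return True if one was removed."""
    p = Path(path)
    raw = p.read_bytes()
    if not raw.startswith(UTF8_BOM):
        return False  # no BOM: the file is left untouched
    # Decoding with utf-8-sig drops the BOM; encoding with utf-8 never adds one.
    text = raw.decode("utf-8-sig")
    p.write_bytes(text.encode("utf-8"))
    return True
```

Working on bytes rather than in text mode keeps the existing line endings (LF or CRLF) exactly as they were.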
I have a bizarre problem: Somewhere in my HTML/PHP code there's a hidden, invisible character that I can't seem to get rid of. By copying it from Firebug and converting it I identified it as U+FEFF, or 'Zero width no-break space'. It shows up as a non-empty text node in my website and is causing a serious layout problem.
The problem is, I can't get rid of it. I can't see it in my files even when turning Invisibles on (duh). I can't seem to find it, no search tool seems to pick up on it. I rewrote my code around where it could be, but it seems to be somewhere deeper in one of the framework files.
How can I find characters by charcode across files or something like that? I'm open to different tools, but they have to work on Mac OS X.
You don't see the character in the editor because most text editors don't display it. FE FF and FF FE are so-called byte-order marks (the character U+FEFF in its two possible byte orders). They sit at the start of a Unicode file, and are written by many Microsoft tools, to tell the reader in which order multi-byte characters are stored.
To get rid of it, tell your editor to save the file either as ANSI/ISO-8859 or as Unicode without BOM. If your editor can't do so, you'll either have to switch editors (sadly) or use some kind of truncation tool like, e.g., a hex editor that allows you to see how the file really looks.
A quick search suggests that TextWrangler has a "UTF-8, no BOM" mode. Otherwise, if you're comfortable with the terminal, you can use Vim:
:set nobomb
and save the file. Presto!
The characters are always the very first in a text file. Editors with support for the BOM will not, as I mentioned, show it to you at all.
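To make the byte patterns concrete, here is a small Python sketch (Python is used only as a convenient calculator here; it is not part of the original answer). It encodes the single character U+FEFF in three encodings, showing that FE FF and FF FE are the two UTF-16 forms, while a UTF-8 file starts with EF BB BF.

```python
import codecs

bom = "\ufeff"  # the byte-order mark is the single code point U+FEFF

# The same character serialized in different encodings.
utf16_be = bom.encode("utf-16-be")  # big-endian byte order
utf16_le = bom.encode("utf-16-le")  # little-endian byte order
utf8 = bom.encode("utf-8")          # UTF-8 has only one byte order

for name, data in [("UTF-16 BE", utf16_be), ("UTF-16 LE", utf16_le), ("UTF-8", utf8)]:
    print(name, " ".join(f"{byte:02X}" for byte in data))

# The standard library exposes the same byte sequences as constants.
assert utf8 == codecs.BOM_UTF8
```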
If you are using Textmate and the problem is in a UTF-8 file:
Open the file
File > Re-open with encoding > ISO-8859-1 (Latin1)
You should be able to see and remove the first character in file
File > Save
File > Re-open with encoding > UTF8
File > Save
It works for me every time.
It's a byte-order mark. Under Mac OS X: open terminal window, go to your sources and type:
grep -rn $'\xEF\xBB\xBF' .
It will show you the filenames and line numbers containing a UTF-8 BOM (the bytes EF BB BF).
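If grep is not available, or its handling of raw bytes differs on your system, a similar search can be sketched in Python. This is an illustrative alternative, not the answerer's method: the function name `find_bom_files` is hypothetical, and unlike grep it only checks the first three bytes of each file, which is where a BOM normally lives.

```python
import os

UTF8_BOM = b"\xef\xbb\xbf"


def find_bom_files(root):
    """Return sorted paths under root whose first three bytes are the UTF-8 BOM."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    if fh.read(3) == UTF8_BOM:
                        hits.append(path)
            except OSError:
                continue  # unreadable file: skip it
    return sorted(hits)
```

Calling `find_bom_files(".")` from the root of your sources returns the list of affected files.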
In Notepad++, there is an option to show all characters. From the top menu:
View -> Show Symbol -> Show All Characters
I'm not a Mac user, but my general advice would be: when all else fails, use a hex editor. Very useful in such cases.
See "Comparison of hex editors" on Wikipedia.
I know it is a little late to answer this question, but I am adding how to change the encoding in Visual Studio. I hope it will be helpful for someone reading this later:
Go to File -> Save (your filename) as...
And in File Explorer window, select small arrow next to the Save button -> click Save with Encoding...
Click Yes (on Do you want to replace existing file dialog)
And finally select e.g. Unicode (UTF-8 without signature) - that removes BOM
I am currently developing a website in PHP in Persian (Farsi). The problem is that when I submit a form in Firefox, all the text gets garbled, as in the pictures below:
I have checked the code (including the meta tags and everything else) thousands of times, and what makes it even weirder is that this happens only in Firefox, and in no other browser, after submission. Is there a bug in Firefox, or am I supposed to change some attribute of the form?
I am quite desperate. Please help me if anyone has a clue.
One detail of your screenshot caught my attention:
This looks a bit like an LTR variant of the UTF-8 BOM.
To quote from Wikipedia Byte Order Mark:
A text editor or web browser interpreting the text as ISO-8859-1 or CP1252 will display the characters "ï»¿" for this.
I would therefore assume that you inject invalid text fragments with such a BOM into an existing HTML document (AJAX?). Your Firefox browser detects that the document can no longer be valid Unicode and therefore falls back to ISO-8859-1, which was once the default character encoding for all text documents on the internet.
As the CSS rules still apply, the LTR display was preserved, just the text-encoding meta-information was changed.
Please take care: Having the correct headers is one way to signal the correct encoding, but it does not relieve you of actually providing correctly encoded text data.
I must admit, those BOMs can be pretty tricky, so they are easy to overlook.
Solution: Do not inject any BOM here. If you provide back HTML from a PHP file, check the PHP file that it doesn't use any BOMs.
I figured this issue out.
I used <meta content="text/html; charset=utf-8" http-equiv="Content-Type" />.
I replaced it with a UTF-8 header in PHP:
header('Content-Type: text/html; charset=utf-8');
and the problem was solved.
This change also solved another problem I had: my website pages were loading twice in Firefox, and that is fixed now. It seems that Firefox doesn't like that meta tag at all ;)
I am anticipating my question is about to be closed down as exact duplicate but nevertheless.
I need to display cyrillic text in HTML
регистрация
However, on the web I see 'squares' if UTF-8 encoding is chosen. If I change the encoding to Windows Cyrillic, the link text is OK, but all the WordPress Cyrillic content is displayed incorrectly.
So I've got WordPress content and HTML/PHP in different encodings. Do I have to save the PHP in Windows Cyrillic, or is there a better solution?
I am using Notepad++.
You should save your file in UTF-8 instead of cyrillic, then the output will be rendered correctly.
Check your editor that it can save UTF-8 files and set the encoding correctly in it.
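The mismatch can be reproduced outside the browser. The following Python sketch is only an illustration (the variable names are invented here): it encodes the word from the question both ways and shows what happens when Windows-1251 bytes are read as UTF-8.

```python
word = "регистрация"

utf8_bytes = word.encode("utf-8")     # two bytes per Cyrillic letter
cp1251_bytes = word.encode("cp1251")  # one byte per letter (Windows Cyrillic)

print(len(utf8_bytes), len(cp1251_bytes))

# Reading Windows-1251 bytes as UTF-8 yields replacement characters,
# which browsers often draw as squares or question marks.
garbled = cp1251_bytes.decode("utf-8", errors="replace")
print(garbled)

# Reading the UTF-8 bytes as UTF-8 gives the word back unchanged.
assert utf8_bytes.decode("utf-8") == word
```

This is why the file on disk and the encoding the page declares have to agree: once both are UTF-8, the PHP text and the WordPress content decode the same way.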
I'm writing my first little AJAX-enabled Joomla component. I'm using mootools. I got a xmlhttprequest to contact my Joomla component, and the component returns a response - just plain text echoed by php, like
echo 'Hello World!';
It's all working fine, except wireshark tells me that the response is prepended with \357\273\277\357\273\277 when it gets read by the javascript on the client side. This shows up as a little square before the response in an alert box that the script shows.
I don't explicitly set the encoding on the xmlhttprequest; mootools docs say that it defaults to UTF8.
What's the right way to handle this? Should I be setting the encoding on the request? Mime type? Should the javascript get rid of it? I'm not planning to have any characters requiring UTF8 in the response, so using plain old ascii would be ok for me too.
Thanks
A UTF-8 BOM is generally not recommended. Byte-order cannot be reversed in UTF-8 so it serves little purpose other than to just inform the consuming source that the following content is, indeed, UTF-8 encoded.
I'd strip it either on the Joomla end (preferred) or with javascript.
Also, for whatever reason, it looks like you have a double BOM there.
This related question might help as well.
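As a rough illustration of the stripping idea, here is a Python sketch (the answer suggests doing this in PHP or JavaScript; Python is used here only to show the byte logic, and `strip_leading_boms` is a made-up name). It starts from the octal escapes Wireshark reported and removes every BOM at the front, so a doubled BOM is handled too.

```python
# The bytes Wireshark reported, written with the same octal escapes.
raw = b"\357\273\277\357\273\277Hello World!"

BOM = b"\xef\xbb\xbf"


def strip_leading_boms(data):
    """Remove every UTF-8 BOM at the start of a byte string."""
    while data.startswith(BOM):
        data = data[len(BOM):]
    return data


print(raw.count(BOM))  # how many BOMs the response contains
print(strip_leading_boms(raw).decode("utf-8"))
```

Stripping on the client only hides the symptom; removing the BOM from the PHP source files fixes the cause.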
I'm using Microsoft Expression Web 3, and even though it was set to not add a BOM for php files, there was indeed a BOM at the beginning of php files. I used a hex editor to remove the BOM, and now Expression doesn't add a BOM anymore while saving.
I don't know why there were two BOMs in the xmlhttprequest response, but now they're both gone.
When I use the PHP include function, the include executes successfully, but an extra character is output before the output of the include. The character has hex value 3F and I have no idea where it is coming from, although it seems to happen with every include.
At first I thought it was file encoding, but this doesn't seem to be the problem. I have created a test case to demonstrate it: (link no longer working) http://driveefficiently.com/testinclude.php this file consists of only:
<? include("include.inc"); ?>
and include.inc consists of only:
<? echo ("hello, world"); ?>
and yet, the output is: "?hello, world" where the ? is a char with a random value. It is this value that I do not know the origins of and it is sometimes screwing up my sites a bit.
Any ideas where this could be coming from? At first I thought it might have something to do with file encoding, but I don't think that's the problem.
What you are seeing is a UTF-8 Byte Order Mark:
The UTF-8 representation of the BOM is the byte sequence EF BB BF, which appears as the ISO-8859-1 characters ï»¿ in most text editors and web browsers not prepared to handle UTF-8.
Byte Order Mark on Wikipedia
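The quoted behaviour is easy to check. This short Python sketch (an illustration added here, not part of the Wikipedia text) decodes the three BOM bytes with single-byte encodings and then with UTF-8.

```python
bom_bytes = "\ufeff".encode("utf-8")  # the three bytes EF BB BF

# Decoding with a single-byte encoding produces three visible characters.
as_latin1 = bom_bytes.decode("iso-8859-1")
as_cp1252 = bom_bytes.decode("cp1252")
print(as_latin1, as_cp1252)

# Decoding as UTF-8 gives one invisible character instead.
as_utf8 = bom_bytes.decode("utf-8")
print(len(as_utf8), hex(ord(as_utf8)))
```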
PHP does not understand that these characters should be "hidden" and sends these to the browser as if they were normal characters. To get rid of them you will need to open the file using a "proper" text editor that will allow you to save the file as UTF-8 without the leading BOM.
You can read more about this problem here
Your web server (or your text editor) apparently includes a BOM into the document. I don't see the rogue character in my browser except when I set the site's encoding explicitly to Latin-1. Then, I see two (!) UTF-8 BOMs.
/EDIT: From the fact that there are two BOMs I conclude that the BOM is actually included by your editor at the beginning of the file. What editor do you use? If you use Visual Studio, you've got to say “Save As …” in the File menu and then choose the button “Save with encoding …”. There, choose “UTF-8 without BOM” or something similar.
It doesn't show up on the rendered page in Firefox or IE but you can see the funny character when you View Source in IE
Is this on a Linux machine? Could you do find & replace with vim or sed to see if you can get rid of the 3F that way?
If it's on Windows, try opening include.inc with Notepad to see if the funny char is visible & can be deleted.
I'd also be curious to see what happens if you copy the code out of the include and just run it by itself.
Character 3F actually is the question mark, it isn't just displaying as one.
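A quick check of that claim, plus one possible way a BOM could end up as a literal 0x3F. The second half is an assumption for illustration only: it supposes that some layer converts the text to a single-byte encoding and substitutes "?" for characters it cannot represent. Nothing in this thread confirms that this is what happened here.

```python
# 0x3F is the ASCII code of the question mark itself.
assert chr(0x3F) == "?"

# Assumption for illustration: a lossy conversion to a single-byte encoding
# replaces the unrepresentable U+FEFF with a question mark.
bom = "\ufeff"
lossy = bom.encode("iso-8859-1", errors="replace")
print(lossy, hex(lossy[0]))
```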
I get the same results as Thomas, no question mark showing up.
In theory it could be some problem with a web proxy but I am inclined to suspect a stray question mark in your PHP markup...which perhaps you have fixed by now so we don't see the problem.
I see hello, world on the page you linked to. No problems that I can see...
I'm using Firefox 3.0.1 and Windows XP. What browser/OS are you running? Perhaps that might be the problem.
"I'd also be curious to see what happens if you copy the code out of the include and just run it by itself."
Mark: this is on a shared hosting solution, so I cannot get shell access to the file. However, as you can see here, there are no characters that shouldn't be there, and running the same file as a script does not produce this char. (The shared hosting company has been of no help, continually telling me it is a browser issue.)