Files
index.php :
<?php
include_once 'index_a.php';
?>
index_a.php :
<html>
<head>
<title>test</title>
</head>
<body>
casa
</body>
</html>
Results
The first result is from the index.php and the second index_a.php.
Why do I get those quotes?
If I convert index_a.php to UTF-8 without BOM, the quotation marks do not appear, but I want the file to be encoded in UTF-8.
Your question doesn't make sense: a UTF-8 encoded file may (but shouldn't, as the byte ordering for UTF-8 is fixed) have a BOM. In both cases your file will be UTF-8 encoded, so you're done already. What happened here is that you've asked an XY question.
So, what you really want to know is: why do those quotes show up for a UTF-8 encoded file with a BOM, but not when there is no BOM? The answer is that you're giving the browser HTML code that could be any version of HTML, and expecting it to know which version you want rendered.
Without any knowledge of the document type, the browser may, or may not, treat any whitespace between tags as a single whitespace, or no whitespace, depending on the rendering mode it guessed you wanted. So if you really don't want that " " then you shouldn't rely on the file encoding; you should make it explicit to the browser that what you're giving it to render is proper HTML. Add
<!doctype html>
at the top so that all browsers know this is a modern HTML5 content file and should be parsed accordingly, rather than falling back into an unpredictable quirks mode.
edit
http://jsbin.com/helikafuni/1/ shows proper HTML5 doctype and element use (you're using ancient HTML 4.01 syntax; it's time to read up on how HTML5 changed a lot of the rules, and to use those new rules instead).
If you want to change the encoding of your files, I would suggest using Notepad++.
After you installed it you can open your files in it and change the encoding like this:
(See point "Convert to UTF-8")
UPDATE:
This should work for you:
index.php:
<?php
include_once 'index_a.php';
?>
index_a.php:
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>test</title>
</head>
<body>
casa
</body>
</html>
Related
I have a problem with special characters on my website, which is coded in PHP (the text comes both from a database and from HTML written by hand).
Code :
code in Sublime text
Result :
result in web
I have in my header :
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
It's important that your entire codebase uses the same charset, to avoid issues where characters are displayed incorrectly.
There are quite a few settings that need to be properly defined, and I'd strongly recommend UTF-8, as it covers most letters you would need (Scandinavian, Greek, Arabic).
Here's a short list of things that have to be set to a specific charset.
Headers
Setting the charset in both HTML and PHP headers to UTF-8
PHP: header('Content-Type: text/html; charset=utf-8');
(PHP headers have to be sent before any kind of output (echo, whitespace, HTML))
HTML: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
(HTML-headers are placed within the <head> / </head> tag)
File-encoding
It might also be necessary for the file itself to be UTF-8 encoded. If you're using Notepad++ to write your code, this can be done in the "Format" drop-down on the menu bar. You should use UTF-8 w/o BOM (see this SO question).
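As an illustration of what "UTF-8 w/o BOM" means at the byte level, here is a minimal Python sketch (Python is used only as a neutral way to look at bytes; the function name strip_utf8_bom and the sample markup are made up for this example). It removes a leading BOM from a byte string and leaves everything else alone:

```python
import codecs

def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM; all other bytes are untouched."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

# Hypothetical file contents: the same markup, saved with a BOM.
with_bom = codecs.BOM_UTF8 + "<html>casa</html>".encode("utf-8")
without_bom = strip_utf8_bom(with_bom)

print(len(with_bom) - len(without_bom))  # the BOM occupies 3 bytes
```

Both byte strings decode to the same visible text; the only difference is the three invisible bytes at the start, which is what an editor's "without BOM" option leaves out.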
Other
Some functions take a charset as a parameter, and if you are using such functions, the charset should be specified there as well. Read the documentation for each function.
Should you follow all of the pointers above, chances are your problem will be solved. If not, you can take a look at this StackOverflow post: UTF-8 all the way through.
I want to pass validation for this site:
http://www.mundo-satelital.com.ar/
but I can't seem to fix the strange character at the start of the file. The W3 validation service automatically detects my page as iso-8859-1 although I can see from the console on Firefox that the header being passed is Content-Type text/html; charset=utf-8 and my <head> contains
<?php header('Content-Type: text/html; charset=utf-8'); ?>
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
It seems what I'm looking for is a stray BOM character; apparently even one in the PHP includes can cause this. However, I've been using grep -rl $'\xEF\xBB\xBF' *.php and variations of it to search for any .php, .html, .js or .css file that might be the culprit, and after eliminating all those that turned up positive the problem is still present. Does anyone have any ideas?
Try to save them as UTF-8 non-BOM. (or without BOM, whatever it is called in your editor). Certainly this is your problem.
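If grep is not turning up the culprit, one alternative is to check the first three bytes of every file directly. The following is only a sketch in Python, not a replacement for fixing the editor settings; the function name, the suffix list, and the demo file names are assumptions made for this example. Unlike the grep command in the question, it reports a file only when the BOM is at the very start:

```python
import tempfile
from pathlib import Path

UTF8_BOM = b"\xef\xbb\xbf"  # the three bytes a UTF-8 BOM occupies

def files_starting_with_bom(root, suffixes=(".php", ".html", ".js", ".css")):
    """Return paths under root whose first three bytes are a UTF-8 BOM."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            with path.open("rb") as handle:
                if handle.read(3) == UTF8_BOM:
                    hits.append(path)
    return hits

# Small demonstration in a throwaway directory (file names are made up).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "header.php").write_bytes(UTF8_BOM + b"<?php // include ?>")
    Path(tmp, "index.php").write_bytes(b"<?php include 'header.php'; ?>")
    found = [p.name for p in files_starting_with_bom(tmp)]
    print(found)
```

Pointing files_starting_with_bom at the real document root instead of the temporary directory would list the files to re-save without a BOM.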
The header specified, the meta tag and the actual format your file is saved in are three totally different things. Make sure they are all the same. Also, right now you have a new line char (maybe chars if on windows) before your doctype. <?php header ... ?>\n<!DOCTYPE...
Actually, I don't see the problem. When you cut and paste your code into the validator, it goes in without headers. If you set the HTML5 doctype and UTF-8 yourself, nothing will change: you will still have 44 errors.
OK so I have a PHP file with several strings of text in various languages. For most languages like French or Spanish I just simply type in the characters.
The problem I have is with Russian language characters. The PHP file is encoded in UTF-8, how can I make sure that the Russian characters are both saved correctly and displayed correctly on the output web page... Is it just a case of pasting the text into the PHP file, or is there a way to guarantee the characters will be saved into the file correctly - perhaps converting it into HTML-like notation for example?
Obviously I am assuming the end user will have the correct encoding set in their web browser, I just want to make sure I got it all covered from my end.
I am using Notepad++ on Windows to edit my PHP file.
Thanks!
If you want to tell browsers your encoding, place it inside your <head> tag:
<meta http-equiv='Content-Type' content='text/html; charset=utf-8'>
Or the short version:
<meta charset='utf-8'>
That should be enough for Russian characters to be displayed correctly on a webpage.
If your doctype is HTML, declare <meta http-equiv='Content-Type' content='text/html; charset=UTF-8'>; if your doctype is XHTML, declare <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />. Never assume in your designs that the end user will act correctly. If you already have a document, edit its meta tag for the charset declaration, use Notepad++'s Encoding > Convert to UTF-8 without BOM, save the document, and carry on with your multilingual structure from there. The php tag is irrelevant to your question, since you don't mention any database charset setting.
There is no difference between Latin and Cyrillic characters in UTF-8. Both are just byte sequences. Configure your server or PHP script to send Content-Type: text/html;charset=utf-8, and you are rather safe.
Your editor might have problems when the font you are using does not contain Russian characters. Choose another font then.
And please ignore the <meta> element recommendations. You don't need that: it is useless when your HTTP headers are correct, and maybe harmful if they aren’t.
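The "just byte sequences" point can be seen by encoding a Latin and a Cyrillic word and looking at the bytes. This is a small Python sketch; the sample words are arbitrary:

```python
latin = "casa"
cyrillic = "дом"

latin_bytes = latin.encode("utf-8")
cyrillic_bytes = cyrillic.encode("utf-8")

print(latin_bytes.hex(" "))     # 63 61 73 61
print(cyrillic_bytes.hex(" "))  # d0 b4 d0 be d0 bc
```

The Cyrillic letters take two bytes each instead of one, but nothing else is special about them: as long as the file is saved as UTF-8 and the response is labelled as UTF-8, the browser decodes both the same way.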
Well, you have to check two things.
To ensure that *.php is a UTF-8 file I use PSPad. If the file is not in UTF-8, I save it like this: http://stepolabs.com/upload/utf-8.png
Then your website must declare the UTF-8 encoding in a <meta> tag:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
... more about metatagging.
Finally if everything is done well - (format and meta declaration) all should be displayed properly!
I'm just trying to understand character encoding a bit better, so I'm doing a few tests.
I have a PHP file that is saved as UTF-8 and looks like this:
<?php
declare(encoding='UTF-8');
header( 'Content-type: text/html; charset=utf-8' );
?><!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8" />
<title>Test</title>
</head>
<body>
<?php echo "\xBD"; # Does not work ?>
<?php echo htmlentities( "\xBD" ) ; # Works ?>
</body>
</html>
The page itself shows this:
The gist of the problem is that my web application has a bunch of character encoding problems, where people are copying and pasting from Outlook or Word and the characters get transformed into the diamond question marks (Do those have a real name?)
I'm trying to learn how to make sure all my input is transformed into UTF-8 when the page loads (Basically $_GET, $_POST, and $_REQUEST), and all output is done using proper UTF-8 handling methods.
My question is: Why is my page showing the question mark for the first echo, and does anyone have any other information about making a UTF-8 safe web app in PHP?
0xBD is not valid UTF-8. If you want to encode "½" in UTF-8 then you need to use 0xC2 0xBD instead.
>>> print(b'\xc2\xbd'.decode('utf-8'))
½
If you want to use text from another charset (Latin-1 in this case) then you need to transcode it to UTF-8 first using the various iconv or mb functions.
Also:
$ charinfo �
U+FFFD REPLACEMENT CHARACTER
\xBD is not valid as UTF-8; what you want is \xC2\xBD. The question mark thing is what applications replace invalid byte sequences with, so if you see it in your UTF-8 text, the text is either not UTF-8 or corrupted.
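The three behaviours described in these answers (strict decoding fails, lenient decoding produces the replacement character, transcoding from Latin-1 produces the two-byte form) can be checked in a few lines. This is a Python sketch for illustration; in PHP the equivalent transcoding step would go through the iconv or mb functions mentioned above:

```python
raw = b"\xbd"  # the single byte echoed by the first line in the question

# 1. Strict UTF-8 decoding rejects the lone byte.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# 2. Lenient decoding substitutes U+FFFD, the "diamond question mark".
replaced = raw.decode("utf-8", errors="replace")

# 3. Treating the byte as Latin-1 and re-encoding yields valid UTF-8.
transcoded = raw.decode("latin-1").encode("utf-8")

print(strict_ok, hex(ord(replaced)), transcoded)
```

Step 3 only gives the right answer if the input really was Latin-1; text pasted from Outlook or Word may be in another encoding (that is an assumption to verify per source, not something this sketch detects).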
I am working in PHP. In my index.php page I have included right.php. right.php contains Greek text, and index.php has the HTML headers. The Greek text is not showing correctly. When I open the right.php file in Dreamweaver and save the page, it gives a warning about the text. What can I do to solve this? right.php has common content which is used in many pages.
This is all to do with the content type of your pages. Most likely you are trying to save / display the text in latin1 format which doesn't support the characters you are trying to display.
The most sensible thing to do is convert everything to UTF-8. If you're manually editing the text then ensure your text editor (e.g. Dreamweaver) is set to save the files as UTF-8 and then ensure you have the following meta tag on your page
<meta http-equiv="content-type" content="text/html;charset=utf-8" />
Make sure you are saving your files as UTF-8 encoding (check preferences in DreamWeaver to find file encoding). Also make sure your HTML <head> tags include charset similar to this: <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
You can use a different character set if you prefer, but UTF-8 supports the entire Unicode character space, so it's pretty safe.
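A quick way to see why a Latin-1 file cannot hold Greek text while UTF-8 can is to try both encodings on a short sample. This is a Python sketch with an arbitrary sample string, and it is probably the same limitation the Dreamweaver warning is reporting:

```python
greek = "αβγδ"

# Latin-1 has no code positions for Greek letters, so encoding fails.
try:
    greek.encode("latin-1")
    fits_latin1 = True
except UnicodeEncodeError:
    fits_latin1 = False

# UTF-8 can represent every Unicode character; these take 2 bytes each.
utf8_bytes = greek.encode("utf-8")

print(fits_latin1, len(greek), len(utf8_bytes))  # False 4 8
```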
You have to set file encoding to utf-8 and set it also in <meta> charset tag in <head> HTML.