Charset Issue: PHP outputs question marks on page


This is how it looks on Firefox and Chrome.
� What is a Premier Listing?
This is how it looks in the NetBeans IDE.
ï¿½ <a>What is a Premier Listing?</a><br>
This is how it looks in the Atom editor.
� <a>What is a Premier Listing?</a><br>
I tried changing the charset in the HTML file and also on the PHP side, but it had no effect.
The symbols ï¿½ and � are in the HTML file itself. My question is: is there a way to correct this so that the browser shows the correct symbols?

This is happening because someone copied and pasted those actual characters into the code. They are not an encoded form of some other character, or anything like it. They are the literal characters. Most likely someone copied the page in its rendered state when the characters were not being rendered properly.
The only fix for this now is to do a search and replace in your code to put the correct characters in place.
I've seen this happen many times when someone pastes text into code directly from Word or something similar and ends up with SmartQuotes. They look fine in the code editor. Then the page renders, and you get gobbledygook. If you copy and paste that gobbledygook into your code, you get hardcoded gobbledygook and nothing more. That is precisely what happened here.
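The two forms of garbage in the question are related: � is U+FFFD, the Unicode replacement character, and ï¿½ is what its three UTF-8 bytes (EF BF BD) look like when they are read as Latin-1. The Python sketch below shows that relationship and the kind of search and replace described above. The helper name `repair` and the bullet used as the "correct" character are assumptions for illustration; the character that was originally there cannot be recovered from the file.

```python
# U+FFFD is the Unicode replacement character, which browsers show as a
# question mark in a diamond.
REPLACEMENT = "\ufffd"

# Its UTF-8 bytes (EF BF BD), misread as Latin-1, give the three-character form.
MISREAD = REPLACEMENT.encode("utf-8").decode("latin-1")


def repair(text: str, intended: str = "\u2022") -> str:
    """Replace both forms of the hardcoded debris with the intended character.

    `intended` defaults to a bullet, which is only a guess for this example.
    """
    return text.replace(MISREAD, intended).replace(REPLACEMENT, intended)


line = MISREAD + " <a>What is a Premier Listing?</a><br>"
fixed = repair(line)
print(fixed)
```

Because the original character is gone, the replacement has to be chosen by a person who knows what the page was meant to show.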

Related

encoding, php, tinymce &check; check

I have a question on encoding. I have a TinyMCE editor, and when you put "&check;" into the source code view, it converts it to a check symbol when it saves to a utf8_general_ci database. When I pull it out in PHP, how can I convert that check mark back into code that the browser will understand on a UTF-8 page? I tried htmlspecialchars_decode, htmlentities, and html_entity_decode but couldn't get any of them to work with the check mark; maybe I am using them the wrong way. Thanks for the help.
TinyMCE: 4.5.3 (using defaults)
One Solution
I found out my code files were not encoded as UTF-8, which meant I could not copy and paste the actual check symbol to find and replace it. I changed the PHP file to UTF-8 and was then able to str_replace the check marks with the HTML entity version. I feel like there's a better way, but this works for now.
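The same replacement can be sketched in Python to show what is going on at the character level. `&check;` stands for U+2713, so the conversion is a plain string replacement; a more general alternative is to turn every non-ASCII character into a numeric character reference. The function names here are made up for the example.

```python
import html

CHECK = "\u2713"  # the check mark character that &check; stands for


def to_named_entity(text: str) -> str:
    """Replace the literal check mark with its named HTML entity."""
    return text.replace(CHECK, "&check;")


def to_numeric_entities(text: str) -> str:
    """Replace every non-ASCII character with a numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


stored = "Done " + CHECK
print(to_named_entity(stored))      # Done &check;
print(to_numeric_entities(stored))  # Done &#10003;
print(html.unescape("&check;") == CHECK)  # True
```

On a page that really is served and saved as UTF-8, the literal character should also display without any conversion, so the entity form is mainly useful when the encoding of some step in the chain is in doubt.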

Utf8 in html correct and php html output messed up

This issue is mind-boggling to me. I am facing the following situation. I wrote a website in HTML using the UTF-8 charset. Special characters are displayed as expected. Now I want to output some PHP/MySQL results, so the easiest way is to create a PHP file, include the HTML code, and then output the results. However, the HTML output via the PHP file does not display the special characters correctly; it is not UTF-8.
here is the html version: HTML
and here the exact copy in a php file: HTML VIA PHP

To close this question myself (because I feel rather stupid right now): the one who actually solved this is Marc B, as his comments made me understand the process of text encoding.
After setting the header (Content-Type and charset) as well as the meta tag in the HTML, I discovered, just as Marc suspected, that my IDE had saved the PHP file in an encoding other than UTF-8. Saving the file as UTF-8 and replacing the messed-up special characters fixed my issue.
Please excuse this, I wasn't fully aware of what I was doing.
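A quick way to check for the situation described above (a file that is declared as UTF-8 but was saved in another encoding) is to test whether its bytes decode cleanly as UTF-8. This is a minimal Python sketch; the function name and the sample word are assumptions for illustration, and the byte strings stand in for file contents read in binary mode.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


saved_as_latin1 = "café".encode("latin-1")  # what a non-UTF-8 editor would write
saved_as_utf8 = "café".encode("utf-8")

print(is_valid_utf8(saved_as_latin1))  # False
print(is_valid_utf8(saved_as_utf8))    # True

# A reader that was told "this is UTF-8" substitutes U+FFFD for the bad byte.
shown = saved_as_latin1.decode("utf-8", errors="replace")
print(shown)
```

A file that contains only ASCII passes this check in either encoding, so the test is only informative once the file contains accented or other special characters.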

What would cause an &nbsp; to turn into a unicode character?

I've got some documents on my website which users can edit via a rich text editor, save (to the DB), and print. Some users are experiencing an issue (only happening on the live site) where some of the characters are getting screwed up. I've checked the DB, and the funny characters are in the DB, so it's not a display issue. Either it happens when they save the document (submit the form on the site), or they've put something weird in there, or their browser changed some of the characters.
The character that keeps appearing everywhere is Â . It's an accented A followed by a space. Looking at the source HTML, it appears that the affected documents had all their &nbsp;'s converted. But whenever I try it, they come out fine.
What would cause an &nbsp; to turn into a unicode character, but only in limited cases?

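Misinterpreting the UTF-8 encoding as Latin-1 will cause this. A non-breaking space (U+00A0, the character that &nbsp; stands for) is encoded in UTF-8 as the two bytes C2 A0, and Latin-1 reads the byte C2 as Â. In a Python 2 session: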
>>> u'\xa0'.encode('utf-8').decode('latin-1')
u'\xc2\xa0'
>>> print u'\xa0*'.encode('utf-8').decode('latin-1')
Â *
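If the damage is already stored, the same round trip can be run in reverse. This is a minimal Python 3 sketch, assuming the text went through exactly one UTF-8-read-as-Latin-1 mix-up; the function name is made up for the example.

```python
def undo_latin1_misread(text: str) -> str:
    """Reverse a single 'UTF-8 bytes read as Latin-1' mix-up.

    Raises UnicodeEncodeError or UnicodeDecodeError if the text was not
    damaged in exactly this way, so failures are visible rather than silent.
    """
    return text.encode("latin-1").decode("utf-8")


damaged = "\u00c2\u00a0"  # Â followed by a non-breaking space
repaired = undo_latin1_misread(damaged)
print(repr(repaired))
```

Text that was saved correctly should not be passed through this function, so it is worth applying only to rows that are known to be affected.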

japanese encoding issue, keeps reverting to turkish (windows)

I have some PHP files whose text shows up (in Dreamweaver) as something like:
“–“ú‚Í‚±‚¿‚ç‚Ìƒ[ƒ‹‚ğˆóü‚µ‚Äó•t‚É‚Ä‚²’ñ¦‚‚¾‚³‚¢B
It is supposed to be Japanese text. If I change the encoding in Dreamweaver to Shift-JIS, the text looks like:
ƒxƒ‹ƒRƒ~ƒ…ƒjƒP[ƒVƒ‡ƒ“ƒYƒŠƒ~ƒeƒbƒh
So I open the file in Notepad++, set the character set to Japanese (Shift-JIS), paste the text back into Dreamweaver, and then change the encoding, and everything looks great. I save, reopen, and boom: back to Turkish and the weird encoding. Any ideas?
On the actual website it looks OK; it's just that I want to be able to edit it in Dreamweaver and come back to it at a later date and still be able to read it.

If anyone is still interested: I found this answer somewhere on the web and it worked for me.
To tell Dreamweaver explicitly which encoding to use, manually place this line at the beginning of your page:
where xxxxxxx should be replaced with correct name for your encoding (see modify-page props.-document encoding for that).
Hope that helps.
Peter-ta

You will need to explicitly tell Dreamweaver to open the file as Shift-JIS.
Here is a tutorial that seems to explain how. Maybe there also is a select box in the "Open" Dialog.
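The garbled text in the question is consistent with Shift-JIS bytes being displayed in a single-byte Windows code page. A short Python sketch of that effect follows, assuming the editor fell back to Windows-1254 (Turkish); that choice is a guess based on the ğ in the first sample. The two-character string is the katakana that the start of the second sample corresponds to.

```python
original = "ベル"

raw = original.encode("shift_jis")  # the bytes 83 78 83 8B
garbled = raw.decode("cp1254")      # what a Turkish code-page view displays
print(garbled)                      # ƒxƒ‹

# As long as the bytes on disk are untouched, reading them with the right
# encoding restores the text.
restored = garbled.encode("cp1254").decode("shift_jis")
print(restored == original)         # True
```

This is why the live site still looks fine: the file's bytes are unchanged, and only the editor's guess about how to read them is wrong.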

PHP Include function outputting unknown char

When using the PHP include function, the include is successfully executed, but a character is also output before the output of the include. The character has hex value 3F and I have no idea where it is coming from, although it seems to happen with every include.
At first I thought it was file encoding, but this doesn't seem to be the problem. I have created a test case to demonstrate it: (link no longer working) http://driveefficiently.com/testinclude.php this file consists of only:
<? include("include.inc"); ?>
and include.inc consists of only:
<? echo ("hello, world"); ?>
and yet, the output is: "?hello, world" where the ? is a char with a random value. It is this value whose origin I do not know, and it is sometimes screwing up my sites a bit.
Any ideas of where this could be coming from? At first I thought it might be something to do with file encoding, but I don't think that is the problem.

What you are seeing is a UTF-8 Byte Order Mark:
The UTF-8 representation of the BOM is the byte sequence EF BB BF, which appears as the ISO-8859-1 characters ï»¿ in most text editors and web browsers not prepared to handle UTF-8.
Byte Order Mark on Wikipedia
PHP does not understand that these characters should be "hidden" and sends these to the browser as if they were normal characters. To get rid of them you will need to open the file using a "proper" text editor that will allow you to save the file as UTF-8 without the leading BOM.
You can read more about this problem here
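When no editor with a "UTF-8 without BOM" option is at hand, the three bytes can also be removed programmatically. This is a minimal Python sketch that works on bytes in memory; the function name is made up for the example, and the sample content mirrors the include.inc from the question.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM (EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


with_bom = codecs.BOM_UTF8 + b'<? echo ("hello, world"); ?>'
cleaned = strip_utf8_bom(with_bom)
print(cleaned)

# The same three bytes, read as ISO-8859-1, are the characters quoted above.
print(codecs.BOM_UTF8.decode("latin-1"))  # ï»¿
```

To apply it to a real file, read the file in binary mode, pass the contents through the function, and write the result back in binary mode.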

Your web server (or your text editor) apparently includes a BOM into the document. I don't see the rogue character in my browser except when I set the site's encoding explicitly to Latin-1. Then, I see two (!) UTF-8 BOMs.
/EDIT: From the fact that there are two BOMs I conclude that the BOM is actually included by your editor at the beginning of the file. What editor do you use? If you use Visual Studio, you've got to say “Save As …” in the File menu and then choose the button “Save with encoding …”. There, choose “UTF-8 without BOM” or something similar.

It doesn't show up on the rendered page in Firefox or IE but you can see the funny character when you View Source in IE
Is this on a Linux machine? Could you do find & replace with vim or sed to see if you can get rid of the 3F that way?
If it's on Windows, try opening include.inc with Notepad to see if the funny char is visible & can be deleted.
I'd also be curious to see what happens if you copy the code out of the include and just run it by itself.

Character 3F actually is the question mark, it isn't just displaying as one.
I get the same results as Thomas, no question mark showing up.
In theory it could be some problem with a web proxy but I am inclined to suspect a stray question mark in your PHP markup...which perhaps you have fixed by now so we don't see the problem.

I see hello, world on the page you linked to. No problems that I can see...
I'm using Firefox 3.0.1 and Windows XP. What browser/OS are you running? Perhaps that might be the problem.

"I'd also be curious to see what happens if you copy the code out of the include and just run it by itself."
Mark: this is on a shared hosting solution, so I cannot get shell access to the file. However, as you can see here, there are no characters that shouldn't be there, and running the same file as a script does not produce this char. (The shared hosting company has been of no help, continually telling me it is a browser issue.)

We Keep Coding
