I am coding a small template editor and the problem I am having is that code keeps getting converted into other characters, such as:
<?php
$hello = "hello";
?>
and it writes exactly that to the file, I want to write the actual code, php and html.
How can I accomplish this?
In this specific case you should run the contents of the file through the html_entity_decode function.
Description from the documentation -
Convert special characters to HTML entities
$str = '<?php';
echo html_entity_decode($str);
Outputs - <?php
Your issue is that your PHP is calling htmlspecialchars(). This converts characters that could be an issue (such as <>) into their HTML-safe version. You can resolve this by removing the htmlspecialchars() function (not recommended, as it's probably there for a reason) or calling html_entity_decode() on the code you want to save to a file.
I wouldn't necessarily recommend using html_entity_decode. It's usually a bad idea to fix incorrectly-encoded text by just reversing the encoding. Instead, figure out why it's encoded incorrectly in the first place.
Related
I'm working on php with a book now. The book said I should be careful using superglobal variables, so it's better to use htmlentities like this.
$came_from = htmlentities($_SERVER['HTTP_REFERER']);
So, I wrote a code like this;
<?php
$came_from=htmlentities($_SERVER['HTTP_REFERER']);
echo $came_from;
?>
However, the display of the code above was the same without htmlentities(); It didn't change anything at all. I thought that it would change \ into something else. Did I use it wrong?
So, by default, htmlentities() encodes characters using ENT_COMPAT (converts double-quotes and leave single-quotes alone) and ENT_HTML401. Seeing as the backslash isn't part of the HTML 4.01 entity spec (as far as I can see anyway), it won't be converted.
If you specify the ENT_HTML5 flag, you get a different result
php > echo htmlentities('abc\123');
abc\123
php > echo htmlentities('abc\123', ENT_HTML5);
abc\123
This is because backslash is part of the HTML5 spec. See http://dev.w3.org/html5/html-author/charref
Sorry. My previous answer was absolutely wrong. I was confused with something else. My apologise. Let me refrain my answer:
htmlentities will convert special characters into their HTML entity. "<" for example will be converted to "<". Your browser will automaticly recognise this HTML entity and decode it back to "<". So you won't notice any difference.
The reason for this is to prevent problems when saving your document in something different then UTF-8 encoding. Any characters not encoded might become screwed up for this reason.
I'm getting troubles with data from a database containing german umlauts. Basically, whenever I receive a data containing umlauts, it is a black square with an interrogation mark. I solved this by putting
mysql_query ('SET NAMES utf8')
before the query.
The problem is, as soon as I use json_encode(...) on a result of a query, the value containing an umlaut gets null. I can see this by calling the php-file directly in the browser. Are there other solution than replacing this characters before encoding to JSON and decoding it in JS?
Check out this pretty elegant solution mentioned here:
json_encode( $json_full, JSON_UNESCAPED_UNICODE );
If the problem isn't anywhere else in your code this should fix it.
Edit: Umlaut problems can be caused by a variety of sources like the charset of your HTML document, the database format or some previous php functions your strings run through (You should definitely look into multibyte functions when having problems with umlauts).
These problems tend to be the pretty annoying because they are hard to track in most cases (altough this isn't as bad as it was a few years ago). The function above fixes – as asked – umlaut problems with json_encode … but there is a good chance that the problem is caused by a different part of your application and not this specific function.
I know this might be old but here a better solution:
Define the document type with utf-8 charset:
<?php header('Content-Type: application/json; charset=utf-8'); ?>
Make sure that all content is utf_encoded. JSON works only with utf-8!
function encode_items(&$item, $key)
{
$item = utf8_encode($item);
}
array_walk_recursive($rows, 'encode_items');
Hope this helps someone.
You probably just want to show the texts somehow in the browser, so one option would be to change the umlauts to HTML entities by using htmlentities().
The following test worked for me:
<?php
$test = array( 'bla' => 'äöü' );
$test['bla'] = htmlentities( $test['bla'] );
echo json_encode( $test );
?>
The only important point here is that json_encode() only supports UTF-8 encoding.
http://www.php.net/manual/en/function.json-encode.php
All string data must be UTF-8 encoded.
So when you have any special characters in a non utf-8 string json_encode will return a null Value.
So either you switch the whole project to utf-8 or you make sure you utf8_encode() any string before using json_encode().
make sure the translation file itself was explicitely stored as UTF-8
After that reload cache blocks and translations
I am making a website where one fills out a form and it creates a PDF. The user will be able to put in diacritic and special characters. The way I am sending the characters to the PHP, those characters will come into the PHP as HTML coded characters i.e. à. I need to change this to whatever it is PHP will read so when I put it through the PDF maker we have it has the diacritic character and not the HTML code for it.
I wrote a test to try this out but I haven't been able to figure it out. If I have to I will end up writing an array for every possible character they can use and translate the incoming string but I am trying to find an easier solution.
Here is the code of my test:
$title = "Test of Title for use With This Project and it should also wrap because it is sò long! Acutally it is even longer than previously expected!";
$ti = htmlspecialchars_decode($title);
I have been attempting to use the htmlspecialchars_decode() to convert it but it still comes out as ò and not ò. Is there an easy way to do this?
See the documentation which tells you it won't touch most of the characters you care about and to use html_entity_decode instead.
Use the html_entity_decode function instead of htmlspecialchars_decode (which only decodes entities such as &, ", < and > = special HTML chars, not all entities).
I'm working with PHP, getting html from websites, converting them to plain text and saving them to the database.
They need to be saved to the database in utf-8.
My first problem is that I don't know the original encoding, what's the best way to encode to utf-8 from an unknown encoding?
the 2nd issue is the html to plain text conversion. I tried using html2text but it messed up all the foreign utf characters.
What is the best approach?
Edit: It seems the part about plain text is not clear enough. What i need not to just strip the html tags. I want to strip the tags while maintaining a kind of document structure. <p>, <li> tags would convert to line breaks etc and tags like <script> would be completely removed with their content.
Use mb_detect_encoding() for encoding detection.
Use strip_tags() to get rid of HTML tags.
Rest of the subjects like formatting the output depends on your needs.
Edit: I don't know if a complete solution exists but this link is really helpful to improve existing html to text PHP scripts on your own.
http://www.phpwact.org/php/i18n/utf-8
This function may be useful to you:
<?php
function FixEncoding($x){
if(mb_detect_encoding($x)=='UTF-8'){
return $x;
}else{
return utf8_encode($x);
}
}
?>
I send json_encoded data from my PHP server to iPhone app. Strings containing html entities, like '&' are escaped by json_encode and sent as &.
I am looking to do one of two things:
make json_encode not escape html entities. Doc says 'normal' mode shouldn't escape it but it doesn't work for me. Any ideas?
make the iPhone app un-escape html entities cheaply. The only way I can think of doing it now involves spinning up a XML/HTML parser which is very expensive. Any cheaper suggestions?
Thanks!
Neither PHP 5.3 nor PHP 5.2 touch the HTML entities.
You can test this with the following code:
<?php
header("Content-type: text/plain"); //makes sure entities are not interpreted
$s = 'A string with & ۸ entities';
echo json_encode($s);
You'll see the only thing PHP does is to add double quotes around the string.
json_encode does not do that. You have another component that is doing the HTML encoding.
If you use the JSON_HEX_ options you can avoid that any < or & characters appear in the output (they'd get converted to \u003C or similar JS string literal escapes), thus possibly avoiding the problem:
json_encode($s, JSON_HEX_TAG|JSON_HEX_AMP|JSON_HEX_QUOT)
though this would depend on knowing exactly which characters are being HTML-encoded further downstream. Maybe non-ASCII characters too?
Based on the manual it appears that json_encode shouldn't be escaping your entities, unless you explicitly tell it to, in PHP 5.3. Are you perhaps running an older version of PHP?
Going off of Artefacto's answer, I would recommend using this header, it's specifically designed for JSON data instead of just using plain text.
<?php
header('Content-Type: application/json'); //Also makes sure entities are not interpreted
$s = 'A string with & ۸ entities';
echo json_encode($s);
Make sure you check out this post for more specific reasons why to use this content type, What is the correct JSON content type?