Can't decode my HTML entities - php

I am storing my raw html like this...
nl2br(htmlentities($this->input->post('raw_html')))
In my database the data looks like this...
<ul> <li>Improve our understanding of this issue</li> <li>Strengthen your listening and writing skills</li> </ul>
When I try and display the markup from my database I use this:
echo html_entity_decode($html_from_db, ENT_COMPAT, 'UTF-8');
But I get this output being shown in the browser:
<ul> <li>Improve our understanding of this issue</li> <li>Strengthen your listening and writing skills</li> </ul>
Lesson name
And html entities are shown in my source code... so no entities are being decoded.
Why is this not working?

The HTML was probably encoded twice when you called htmlentities. Look at the function's parameters:
function htmlentities ($string, $quote_style = null, $charset = null, $double_encode = true) {}
Passing false for $double_encode leaves existing entities alone, so you can try this:
nl2br(htmlentities($this->input->post('raw_html'), ENT_QUOTES, 'UTF-8', false))
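
A minimal sketch of the double-encoding effect described above, assuming a made-up input string (the variable names are illustrative and not from the question). It shows that a second htmlentities call re-encodes the ampersands, that a single html_entity_decode pass then only removes one layer, and that passing false for $double_encode avoids the second layer:

```php
<?php
// Hypothetical sample input; any markup with special characters behaves the same way.
$raw = '<li>Fish & chips</li>';

// First pass: the form that should be stored.
$once = htmlentities($raw, ENT_QUOTES, 'UTF-8');

// Second pass with the default $double_encode = true: "&lt;" becomes "&amp;lt;".
$twice = htmlentities($once, ENT_QUOTES, 'UTF-8');

// Second pass with $double_encode = false: existing entities are left alone.
$guarded = htmlentities($once, ENT_QUOTES, 'UTF-8', false);

// A single decode of double-encoded data still leaves entities behind.
$decodedOnce = html_entity_decode($twice, ENT_QUOTES, 'UTF-8');

echo $once, "\n";
echo $twice, "\n";
var_dump($guarded === $once);     // true: nothing was re-encoded
var_dump($decodedOnce === $once); // true: one encoding layer remains
```

If the stored data turns out to be double-encoded already, decoding it twice on output would be a workaround, but fixing the encoding step is the cleaner option.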

Related

redirects using htmlspecialchars/htmlentities

I have this kind of redirect now:
Redirect::to(htmlspecialchars('home.php'));
But when I append this to the URL of my home.php: /%22%3E%3Cscript%3Ealert('hacked')%3C/script%3E
the result looks like this:
But why? I was told the input would be converted so that the exploit attempt fails, so why does that not happen in my case?
htmlspecialchars converts the special characters in the string passed to it into their HTML entities.
Your code
Redirect::to(htmlspecialchars('home.php'));
only encodes the string home.php and passes it to the Redirect::to function; it does not apply htmlspecialchars to the output of the whole page.
To solve this, you have to apply it to every value that home.php outputs, like this:
<?php
$new = htmlspecialchars("<a href='test'>Test</a>", ENT_QUOTES);
echo $new; // &lt;a href=&#039;test&#039;&gt;Test&lt;/a&gt;
?>
(Example from: http://php.net/htmlspecialchars)
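
As a sketch of what escaping on output could look like for the URL from the question: the helper name escape_html and the way the payload is obtained here are assumptions for illustration. In a real page the value would come from wherever home.php echoes request data back to the browser.

```php
<?php
// Hypothetical helper: escape a value right before it is written into HTML.
function escape_html(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// The path from the question, URL-decoded the way the server would see it.
$payload = urldecode("/%22%3E%3Cscript%3Ealert('hacked')%3C/script%3E");

$safe = escape_html($payload);

echo $safe, "\n";
// /&quot;&gt;&lt;script&gt;alert(&#039;hacked&#039;)&lt;/script&gt;
```

The browser then renders the payload as visible text instead of executing it as a script.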

PHP How to convert strings from DomCrawler to UTF-8

I have some data that I collect with DomCrawler and store in an array, but it seems to fail on special characters like è, à, ï, etc.
As an example, I get Ã¨ instead of è when I echo the result.
When I store my results in a .json file I get this: \u00c3\u00a8
My goal is to save the special character itself in the .json file.
I've tried converting the encoding, but it doesn't give the result I want.
$html = file_get_contents($url);
$crawler = new Crawler($html);
$h1 = $crawler->filter('h1');
$title = $h1->text();
$title = mb_convert_encoding($title, "HTML-ENTITIES", "UTF-8");
Is there any way I can have my special characters shown?
Thanks a lot!
When you pass the HTML to the constructor, the crawler assumes that it is in ISO-8859-1. You have to tell it explicitly that your DOM is in UTF-8, using the addHTMLContent method:
$html = file_get_contents($url);
$crawler = new Crawler;
$crawler->addHTMLContent($html, 'UTF-8');
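
To see why the output looked the way it did, the following sketch reproduces the symptom without DomCrawler, assuming the iconv extension is available. When the UTF-8 bytes of è are treated as ISO-8859-1 and converted to UTF-8 again, the result is exactly the \u00c3\u00a8 from the .json file. The sketch also shows json_encode's JSON_UNESCAPED_UNICODE flag, which writes the character itself rather than a \u escape.

```php
<?php
$utf8 = "\xC3\xA8"; // "è" encoded as UTF-8 (two bytes)

// Simulate the wrong assumption: read the UTF-8 bytes as ISO-8859-1.
$misread = iconv('ISO-8859-1', 'UTF-8', $utf8);

$symptom = json_encode($misread);                      // "\u00c3\u00a8"
$escaped = json_encode($utf8);                         // "\u00e8"
$literal = json_encode($utf8, JSON_UNESCAPED_UNICODE); // "è"

echo $symptom, "\n", $escaped, "\n", $literal, "\n";

// Already-mangled text can be repaired by reversing the conversion.
var_dump(iconv('UTF-8', 'ISO-8859-1', $misread) === $utf8); // true
```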

PHP file_get_contents and domxpath UTF-8 encoding issue

I'm reading an external file which contains this :
<td>ÖZGÜR </td>
And I read it like this :
$html = file_get_contents("");
$html = str_replace("charset=iso8859-9" , "charset=utf-8" , $html);
$rows = $x->query('//tr[contains(@class,"tablerow")]');
foreach($rows as $node)
{
echo $node->childNodes->item(12)->nodeValue;
}
It does not echo ÖZGÜR; it echoes �ZGÜR instead.
Which encoding function should I call here?
Thanks for any help!
You should call
mb_internal_encoding("UTF-8");
to change the encoding, instead of
$html = str_replace("charset=iso8859-9" , "charset=utf-8" , $html);
If the data is stored in a database, then you also need to set the connection encoding when you fetch it:
mysql_set_charset('utf8',$constring) (or mysqli_set_charset with the mysqli extension, since the mysql_* functions were removed in PHP 7). Then you will be able to retrieve the data in UTF-8.
Try converting $html to UTF-8 after you load it with file_get_contents, something like:
$html = iconv('ISO-8859-9', 'UTF-8', $html);
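
A small sketch of the iconv approach, using hand-written ISO-8859-9 bytes in place of the downloaded page (the sample bytes are an assumption, since the real file content is not shown in the question):

```php
<?php
// "ÖZGÜR" as ISO-8859-9 bytes: Ö is 0xD6 and Ü is 0xDC.
$latin5 = "\xD6ZG\xDCR";

// Convert to UTF-8 before handing the markup to the DOM functions.
$utf8 = iconv('ISO-8859-9', 'UTF-8', $latin5);

// The raw bytes are not valid UTF-8; the converted string is.
var_dump(preg_match('//u', $latin5) === 1); // false
var_dump(preg_match('//u', $utf8) === 1);   // true

echo bin2hex($utf8), "\n"; // c3965a47c39c52
```

Replacing only the charset label in the markup, as in the question, changes what the document claims to be without changing its bytes, which is why the conversion step is needed.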

Twitter typeahead displays '&amp' instead of '&'

I'm trying to display search results that contain the & sign. But when I render from PHP to JSON, & is converted to &amp;.
Is there any way I can prevent that, or convert it back to & before I print the names in the search bar?
Thanks in advance!
EDIT:
HTML JS:
{
name: 'industries',
prefetch: 'industries_json.json',
header: '<h1><strong>Industries</strong></h1>',
template: '<p>{{value}}</p>',
engine: Hogan
},
industries_json.json (Created with json_encode)
[{"id":42535,"value":"AUTOMOBILES &AMP; COMPONENTS","type":"industries"}]
PHP script which outputs the JSON:
public function renderJSON($data){
header('Content-type: application/json');
echo json_encode($data);
}
Use the html_entity_decode function, like this:
Code Before:
public function renderJSON($data){
header('Content-type: application/json');
echo json_encode($data);
}
Output: the string still contains HTML entities, e.g. appthemes &amp; Vantage search Suggest
Code After:
public function renderJSON($data){
header('Content-type: application/json');
$data = json_encode($data);
echo html_entity_decode( $data );
}
Output: the entities are decoded, e.g. appthemes & Vantage search Suggest
The issue you're facing is HTML encoding. What you want is to decode the text before sending it to the client. It is important that you only decode the text properties of the JSON object (rather than the entire JSON object itself).
Here is a reference on HTML decoding in PHP: http://php.net/manual/en/function.html-entity-decode.php
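
A minimal sketch of decoding only the text properties before encoding, assuming the rows look like the entry in industries_json.json (the helper name decodeTextFields and the lowercase &amp; in the sample are assumptions for illustration):

```php
<?php
// Hypothetical helper: decode HTML entities in the selected string fields only.
function decodeTextFields(array $rows, array $fields): array
{
    foreach ($rows as &$row) {
        foreach ($fields as $field) {
            if (isset($row[$field]) && is_string($row[$field])) {
                $row[$field] = html_entity_decode($row[$field], ENT_QUOTES, 'UTF-8');
            }
        }
    }
    unset($row); // drop the reference left over from the loop

    return $rows;
}

$rows = [['id' => 42535, 'value' => 'AUTOMOBILES &amp; COMPONENTS', 'type' => 'industries']];
$json = json_encode(decodeTextFields($rows, ['value']));
echo $json, "\n";

// Decoding the whole JSON string instead can corrupt it: a &quot; inside a
// value turns into a bare double quote and the JSON no longer parses.
$broken = html_entity_decode(json_encode([['value' => 'say &quot;hi&quot;']]), ENT_QUOTES, 'UTF-8');
var_dump(json_decode($broken)); // NULL
```

Decoding before json_encode lets the encoder escape any quotes that the decoding step produces, which is why the order matters.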
I would do it in JavaScript. Once you have the JSON, use JavaScript's replace function on it like this:
First convert it to a string:
var jsonString=JSON.stringify(jsonData);
Then replace every &amp; like this:
jsonString=jsonString.replace(/&amp;/g,"&");
Then convert it back to a JSON object:
jsonObj=JSON.parse(jsonString);
Now your JSON will have & instead of &amp;.
What's next?
Do whatever you need to do with jsonObj.

Character entered by user is breaking xml decoding done by xml_parse_into_struct

thanks for answering!
This is about PHP/MySQL
The user enters some text that is then processed through htmlentities():
$new_userinput = htmlentities($userinput, ENT_QUOTES);
This entry is stored in an XML:
...
<entrylist>
<list>$new_userinput</list>
<info>$someinfo</info>
</entrylist>
...
The xml file is stored in a database in UTF-8 format. The HTML for the site is also set with UTF-8.
What we observed is that, with a specific input, the XML passed through:
$p = xml_parser_create();
xml_parse_into_struct($p, $xmlentry, $values, $index);
xml_parser_free($p);
is not parsed properly by xml_parse_into_struct().
What we see in the database is the following:
...
<note>Creatives share shots—small screenshots.</note>
...
You need to specify the charset in htmlentities(), e.g.
$new_userinput = htmlentities($userinput, ENT_QUOTES, 'UTF-8');
To illustrate:
echo htmlentities("€", ENT_QUOTES); // &acirc;?&not; (input treated as ISO-8859-1, the default before PHP 5.4)
echo htmlentities("€", ENT_QUOTES, "UTF-8"); // &euro;
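
On current PHP versions the default charset is usually UTF-8, so the two calls above may print the same thing. To reproduce the difference, the sketch below passes each charset explicitly. This is an assumption about how to trigger the old behaviour and is not part of the original answer.

```php
<?php
$euro = "\xE2\x82\xAC"; // "€" encoded as UTF-8 (three bytes)

// Each byte is treated as a separate ISO-8859-1 character.
$asLatin1 = htmlentities($euro, ENT_QUOTES, 'ISO-8859-1');

// The three bytes are recognised as one character.
$asUtf8 = htmlentities($euro, ENT_QUOTES, 'UTF-8');

echo $asUtf8, "\n";                           // &euro;
var_dump(strpos($asLatin1, '&acirc;') === 0); // true: 0xE2 was read as "â"
var_dump($asLatin1 === $asUtf8);              // false
```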
