HTML entities for UTF-8 characters in PHP - php

I am using htmlentities() with UTF-8 in PHP:
$input = "<div> 'Testing' </div>";
echo htmlentities($input,ENT_NOQUOTES,"UTF-8");
This works for normal input, but if I pass the input below through the same call, I get blank output:
$input = "<div>Other 'user' is working on this line. Please contribute the next line.</div>";
echo htmlentities($input,ENT_NOQUOTES,"UTF-8");
I don't know why this gives blank output.
If I print $input, I get the following value:
<div>Other user working on this line.�Please contribute the next line.</div>
Is anything missing in my htmlentities() call? Please share your suggestions.
Thanks,
-Pravin.

Try passing $input to utf8_encode first, and then passing the data to htmlentities with only the ENT_NOQUOTES option set:
<?php
$input = "<div>Other 'user' is working on this line. Please contribute the next line.</div>";
echo htmlentities(utf8_encode($input),ENT_NOQUOTES);
?>
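The blank output is consistent with $input containing a byte that is not valid UTF-8 (the � in the printed value). Here is a minimal sketch of the same fix, assuming the stray byte is an ISO-8859-1 non-breaking space (0xA0) and that the mbstring extension is available. Note that utf8_encode() is deprecated as of PHP 8.2; mb_convert_encoding() from ISO-8859-1 does the same conversion.

```php
<?php
// Assumption: the text is ISO-8859-1 and contains a non-breaking space (byte 0xA0).
$input = "<div>Other 'user' is working on this line.\xA0Please contribute the next line.</div>";

// The input is not valid UTF-8 and no ENT_SUBSTITUTE/ENT_IGNORE flag is set,
// so htmlentities() returns an empty string.
$blank = htmlentities($input, ENT_NOQUOTES, 'UTF-8');

// Convert the ISO-8859-1 bytes to UTF-8 first, then encode.
$utf8 = mb_convert_encoding($input, 'UTF-8', 'ISO-8859-1');
$encoded = htmlentities($utf8, ENT_NOQUOTES, 'UTF-8');

echo $encoded, PHP_EOL;
```

If the real source encoding is something other than ISO-8859-1, the third argument to mb_convert_encoding() has to change accordingly.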

Related

PHP: getting "SSA&#039;s" instead of "SSA's"

I'm having a problem displaying certain data from the database with PHP.
How it's currently showing: "SSA&#039;s"
How it should show: "SSA's"
HTML Meta Tag
<meta charset="UTF-8">
PHP Code
$article_title = html_entity_decode(mb_convert_encoding(stripslashes($r->ArticleTitle), "HTML-ENTITIES", 'UTF-8'));
You can decode it with either html_entity_decode() or htmlspecialchars_decode(). Pass ENT_QUOTES so that single-quote entities are decoded as well.
Basic Example:
$string = html_entity_decode("SSA&#039;s", ENT_QUOTES);
echo $string; // result SSA's
$string = htmlspecialchars_decode("SSA&#039;s", ENT_QUOTES);
echo $string; // result SSA's
Remove the html_entity_decode function, as you are double-encoding with HTML-ENTITIES.
And as @ChrisBanks pointed out, you also don't need stripslashes.
You need to call html_entity_decode a second time because the data is stored double-encoded, and you should remove the stripslashes call.
$article_title = html_entity_decode(html_entity_decode(mb_convert_encoding($r->ArticleTitle, "HTML-ENTITIES", 'UTF-8')));
You might want to investigate how the data ended up double-encoded in the database in the first place. Perhaps htmlentities is being called twice somewhere.
To add on to the comment:
You shouldn't store data HTML-encoded unless you really and truly need to (there might be cases where it is required). Use htmlentities only on output, when rendering on a web page.
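To make the double-encoding point concrete, here is a small sketch. The stored value is a hypothetical example of what a double-encoded title might look like; it is not taken from the asker's database.

```php
<?php
// Hypothetical stored value: the apostrophe was entity-encoded twice.
$stored = 'SSA&amp;#039;s';

// First pass undoes the outer encoding (&amp; becomes &).
$once = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');

// Second pass decodes &#039; into an apostrophe.
// ENT_QUOTES is passed explicitly so single-quote entities are decoded.
$twice = html_entity_decode($once, ENT_QUOTES, 'UTF-8');

// Store the raw text and encode exactly once, at output time.
$safe = htmlspecialchars($twice, ENT_QUOTES, 'UTF-8');

echo $once, PHP_EOL, $twice, PHP_EOL, $safe, PHP_EOL;
```

Decoding twice is only a workaround; fixing the code path that encodes on save is the cleaner long-term option.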

line breaks showing up as \r\n in textarea

I am trying to display data in a textarea. The data is fetched from tables and was submitted via another form. The issue comes up when a new line is entered.
The data displayed in the textarea is
lin1\r\nlin2
but it should be
lin1
lin2
I have tried nl2br but it does not work as expected.
How can I make this work? Thanks
This problem can be solved using stripcslashes() when outputting your data.
Please note that the method above is different from stripslashes() which doesn't work in this case.
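A minimal sketch of the stripcslashes() approach, assuming the stored value contains literal backslash sequences exactly as shown in the question:

```php
<?php
// Single quotes: the string holds the literal characters \ r \ n, as in the question.
$stored = 'lin1\r\nlin2';

// stripcslashes() interprets C-style escapes such as \r and \n.
$text = stripcslashes($stored);

// Escape for HTML before placing the text inside the textarea.
echo '<textarea>' . htmlspecialchars($text, ENT_QUOTES, 'UTF-8') . '</textarea>', PHP_EOL;
```

Be aware that stripcslashes() also interprets other escapes (for example \t or octal and hex sequences), so it is only appropriate when every backslash in the data comes from over-escaping.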
I tried using nl2br but it wasn't sufficient either.
I hope str_replace saves you.
<?php
$str='lin1\r\nlin2';
$str=str_replace('\r\n','<br>',$str);
echo $str;
OUTPUT:
lin1
lin2
This is a common question, and the most common answers are nl2br or str_replace.
However, this just creates unnecessary code.
In reality the problem is almost always that the data was run through a MySQL escape function before being displayed, probably while it was being saved. Instead, escape the data for saving but display an unescaped version.
<?php echo str_replace('\r\n', "\r\n", $text_with_line_breaks); ?>
Note the single quotes versus the double quotes; that is the trick.
A perfect solution for newbies.
You are over-escaping in your insert/update statement.
In your case, you can solve the problem as follows:
<?php
$str = 'lin1\r\nlin2';
$solved_str = str_replace(array("\\r","\\n"), array("\r","\n"), $str);
var_dump($str,$solved_str);
But you should check your insert/update statement for over-escaping.
I would recommend using double quotes for \r\n such as "\r\n". I've never had it work properly with single quotes.
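The difference between the two quoting styles can be checked directly. A short sketch, using the sample string from the question:

```php
<?php
$single = '\r\n';   // four characters: backslash, r, backslash, n
$double = "\r\n";   // two characters: carriage return, line feed

// Replace the literal four-character sequence with a real line break.
$stored = 'lin1\r\nlin2';
$fixed = str_replace($single, $double, $stored);

var_dump(strlen($single), strlen($double), $fixed);
```

In single-quoted strings PHP only treats \\ and \' as escapes, which is why '\r\n' matches the literal text while "\r\n" is an actual line break.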
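For non-textarea output, use this function:
function escapeNonTextarea($string){
    // Replace '\r\n' first so it is not split into two separate line breaks.
    $string = str_replace(array('\r\n','\n','\r'), array("<br>","<br>","<br>"), $string);
    return $string;
}
For textarea output, use this function:
function escapeTextarea($string){
    // Turn the literal backslash sequences into real line-break characters.
    $string = str_replace(array('\r\n','\n','\r'), array("\r\n","\n","\r"), $string);
    return $string;
}
Call the appropriate function and pass the string as the argument.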

PHP htmlentities not working even with parameters

Of course this has been asked before, and I have searched for solutions, none of which have worked so far. I want to replace the TM symbol and the ampersand with their HTML equivalents by using htmlentities or htmlspecialchars:
$TEST = "Kold Locker™ & other stuff";
echo "ORGINIAL: " . $TEST . "<BR/>";
echo "HTML: " . htmlentities($TEST, ENT_COMPAT, 'UTF-8');
This displays:
ORGINIAL: Kold Locker™ & other stuff
HTML:
I have also tried it with htmlspecialchars and the second parameter changed with the same result.
What am I missing that others have claimed worked in other solutions?
UPDATE: I tried just displaying utf8_encode($TEST) and it displayed HTML: Kold Locker™ & other stuff
I don't know why, but this worked for me (I had to call htmlentities twice):
$html = "<html> <head><head>something like this </html>";
$entities_correction = htmlentities($html, ENT_COMPAT, 'UTF-8');
echo htmlentities($entities_correction, ENT_COMPAT, 'UTF-8');
Output:
&lt;html&gt; &lt;head&gt;&lt;head&gt;something like this &lt;/html&gt;
I thought I had the same problem as Pjack (msg of Jul 14 at 8:54):
$str = "A 'quote' is <b>bold</b>";
echo htmlentities($str);
shows the original string $str in the browser (Firefox in my case), without any translation, while
echo htmlentities(htmlentities($str));
gives:
A 'quote' is &lt;b&gt;bold&lt;/b&gt;
(I use PHP/5.4.16 obtained from windows-7 XAMPP).
However, after some more thought it occurred to me that the browser renders the strings &lt; and &gt; as < and >.
(See the page source in the browser.) The second call of htmlentities translates & into &amp;, and only then does the browser show what you expected in the first place.
Your code works for me :-?
In the manual page for htmlentities() we can read:
Return Values
Returns the encoded string.
If the input string contains an invalid code unit sequence within the
given encoding an empty string will be returned, unless either the
ENT_IGNORE or ENT_SUBSTITUTE flags are set.
My guess is that the input data is not properly encoded as UTF-8 and the function is returning an empty string. (Assuming that the script is not crashing, i.e., code after that part still runs.)
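That guess can be checked with a short sketch. It assumes the script file was saved as Windows-1252, so the trademark sign is the single byte 0x99, and that the mbstring extension is available; neither is confirmed by the question.

```php
<?php
// Assumption: the source file is Windows-1252, so the trademark sign is byte 0x99.
$test = "Kold Locker\x99 & other stuff";

// The byte 0x99 on its own is not valid UTF-8.
$isUtf8 = mb_check_encoding($test, 'UTF-8');

// Without ENT_SUBSTITUTE or ENT_IGNORE, invalid input yields an empty string.
$strict = htmlentities($test, ENT_COMPAT, 'UTF-8');

// With ENT_SUBSTITUTE the invalid byte is replaced and the rest is still encoded.
$lenient = htmlentities($test, ENT_COMPAT | ENT_SUBSTITUTE, 'UTF-8');

// Converting to real UTF-8 first gives the intended entities.
$converted = mb_convert_encoding($test, 'UTF-8', 'Windows-1252');
$html = htmlentities($converted, ENT_COMPAT, 'UTF-8');

echo $html, PHP_EOL;
```

Saving the script itself as UTF-8 avoids the conversion step entirely.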
I had almost the same problem (it somehow showed the same text every time), and I worked it out by trying a combination of different echo statements. Browsers such as Firefox appear to show the same text every time because, when you echo the htmlentities output, the entities are converted back into normal text when the page is rendered. When I echoed a script that passes the text to console.log, it showed the htmlentities text (almost) correctly. Instead of replacing every special character with HTML entities, it replaced them with some other encoding I had seen before (I can't remember the name). After running htmlentities on it again, the page showed the same text as before (remember that everything is converted back), but the console.log version gave me the expected result. So, to summarize:
1. Execute htmlentities two times!
2. Don't (at least with Firefox) echo the htmlentities output directly into the web page. If you want to check whether the value is actually correct, echo a script that logs it to the console.
I hope this could help other guys with the same problem,
VicStudio
EDIT: 3. If you are using a $_POST form, don't forget to add accept-charset="UTF-8" (or some other charset) to the <form> tag.
EVEN MORE EDIT: Only call htmlentities twice if you want to echo the result directly into the page. If you are sending it straight to, for example, a database, call it only once! What I said before is partially wrong. :(
This is an old post, but for anyone still looking for a solution, here is what I use with success:
echo html_entity_decode($htmlString);

HTML textarea newline character

I have a textarea in HTML where the user can enter text, but when the form is submitted and passed to a PHP script which echoes it, there aren't any newlines. Knowing that HTML does that, I tried doing a preg_replace() before echoing it...
echo preg_replace("/\n/", "<br />", $_GET["text"]);
but still everything is on one line.
So my best guess is that HTML textareas use a different newline character... can anybody shed some light on the subject?
EDIT
Ok, so I've figured out the problem: The Javascript is stripping the newlines. view code here
EDIT 2
Ok, so thanks to Jason for solving this problem. I needed to do:
escape(document.getElementById('text'));
Instead of just:
document.getElementById('text');
and the newlines are preserved, problem solved!
echo nl2br($_GET['text']);
Though, your preg_replace worked for me!
Usually when testing for newlines in any string, I use /[\n\r]/, just to cover my bases. My guess is that this would match the newlines.
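A small sketch of the nl2br() route. The sample string is a hypothetical stand-in for $_GET["text"], using the CRLF line endings that browsers send for textarea content:

```php
<?php
// Hypothetical stand-in for $_GET['text'], with a CRLF line ending.
$text = "first line\r\nsecond <b>line</b>";

// Escape the user input first, then insert <br /> before each line break.
// nl2br() handles \r\n, \n and \r, so no separate regex is needed.
$html = nl2br(htmlspecialchars($text, ENT_QUOTES, 'UTF-8'));

echo $html, PHP_EOL;
```

Escaping before nl2br() matters: doing it the other way round would turn the inserted <br /> tags into visible text.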

PHP Ampersand in String

I'm having a bit of a problem. I am trying to create an IRC bot, which has an ampersand in its password. However, I'm having trouble putting the ampersand in a string. For example...
<?php
$var = "g&abc123";
echo $var;
?>
I believe this should print g&abc123. However it's printing g.
I have tried this as well:
<?php
$arr = array("key" => "g&abc123");
print_r($arr);
?>
This prints it correctly with the g&abc123, however when I say echo $arr['key']; it prints g again. Any help would be appreciated. I'm running PHP5.3.1.
EDIT: Also, I just noticed that if I use g&abc123&abc123 it prints g&abc123. Any suggestions?
I don't have that issue in a console:
php > $d="g&abc123";
php > echo $d;
g&abc123
What environment are you printing the output to? It sounds like you are viewing it in a web browser, and the & is being interpreted as a malformed HTML entity. Try replacing the & symbol with the entity-encoded version, &amp;.
Look at the source code, it will be printing the correct code.
If you want it to print correctly in HTML, run htmlentities on it or write the & as &amp;.
View the web page source to make sure your variable contains the correct value.
You're probably sending your output to a Web browser.
The correct way of doing it is
In HTML, XHTML and XML, the ampersand has a special meaning. It is used for character entities. You can think of it as an escape sequence of sorts.
For instance, in PHP, this would be illegal:
$variable = 'It's Friday';
This is because the apostrophe is interpreted by PHP as the end of your string, and the rest of your content looks like garbage.
Instead, you have to say:
$variable = 'It\'s Friday';
Similarly, in HTML and XHTML, you can't say
<h1>Inequalities</h1>
<p> x<yz+3 </p>
This is because the < would be interpreted as the start of an element.
Instead, you'd have to say:
<h1>Inequalities</h1>
<p> x&lt;yz+3 </p>
Now, as you can see, the ampersand itself has a special meaning and, therefore, needs to be escaped as &amp;. htmlspecialchars() will do it for you.
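For the password from the question, a minimal sketch of that escaping looks like this (only the HTML output needs it; the raw value in $var is unchanged):

```php
<?php
$var = "g&abc123";

// Escape only when writing into an HTML page.
$escaped = htmlspecialchars($var, ENT_QUOTES, 'UTF-8');
echo $escaped, PHP_EOL;

// The same call handles the inequality example.
$inequality = htmlspecialchars('x<yz+3', ENT_QUOTES, 'UTF-8');
echo $inequality, PHP_EOL;
```

The value sent to the IRC server should be the unescaped $var, since entity encoding is specific to HTML output.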
