I have an application where I store strings as-is, but when displaying them I want special characters to be converted to their HTML entity names, e.g. < becomes &lt;. To achieve this, I am using PHP's built-in function htmlspecialchars.
The text is output with the following code:
$reviewTxt = htmlspecialchars($reviewTxt);
echo $reviewTxt;
Now, when reviewTxt is 'I loved you <3', the code should produce I loved you &lt;3 in the HTML source, but the browser should display the original text. In my case, it displays the encoded data I loved you &lt;3. I also tried pasting I loved you &lt;3 directly into the page instead of the PHP code above, just to see whether I would get the original text, and yes, it shows 'I loved you <3'.
I am not sure what I am missing.
It looks like you are encoding twice with htmlspecialchars() / htmlentities().
That causes the & symbol of the first result to be encoded again in the second result, giving you a string like I loved you &amp;lt;3.
So the browser will show the decoded & followed by the literal string lt;.
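As a minimal sketch of that diagnosis (the variable names $once, $twice and $guarded are just for illustration, and the ENT_QUOTES / 'UTF-8' arguments are my assumption, not something from the question), this shows what a second pass does and how the optional fourth argument of htmlspecialchars() avoids it:

```php
<?php
$reviewTxt = 'I loved you <3';

// One pass: this is what should end up in the HTML source.
$once = htmlspecialchars($reviewTxt, ENT_QUOTES, 'UTF-8');

// A second pass re-encodes the & that the first pass produced.
$twice = htmlspecialchars($once, ENT_QUOTES, 'UTF-8');

// With double_encode = false, existing entities are left alone.
$guarded = htmlspecialchars($once, ENT_QUOTES, 'UTF-8', false);

echo $once, PHP_EOL;    // I loved you &lt;3
echo $twice, PHP_EOL;   // I loved you &amp;lt;3
echo $guarded, PHP_EOL; // I loved you &lt;3
```

If the page shows &lt;3 as visible text, searching the code path for a second encoding call (in a template helper, for example) is usually the place to start.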
Related
I have some text that I will be saving to my DB. The text may look something like this: Welcome & This is a test paragraph. When I save this text to my DB after processing it using htmlspecialchars() and htmlentities() in PHP, the sentence will look like this: Welcome &amp; This is a test paragraph.
When I retrieve and display the same text, I want it to be in the original format. How can I do that?
This is the code that I use:
$text= htmlspecialchars(htmlentities($_POST['text']));
$text= mysqli_real_escape_string($conn,$text);
There are two problems.
First, you are double-encoding HTML characters by using both htmlentities and htmlspecialchars. Both of those functions do the same thing, but htmlspecialchars only does it with a subset of characters that have HTML character entity equivalents (the special ones.) So with your example, the ampersand would be encoded twice (since it is a special character), so what you would actually get would be:
$example = 'Welcome & This is a test paragraph';
$example = htmlentities($example);
var_dump($example); // 'Welcome &amp; This is a test paragraph'
$example = htmlspecialchars($example);
var_dump($example); // 'Welcome &amp;amp; This is a test paragraph'
Decide which one of those functions you need to use (probably htmlspecialchars will be sufficient) and use only one of them.
Second, you are using these functions at the wrong time. htmlentities and htmlspecialchars will not do anything to "sanitize" your data for input into your database. (Not saying that's what you're intending, as you haven't mentioned this, but many people do seem to try to do this.) If you want to protect yourself from SQL injection, bind your values to prepared statements. Escaping it as you are currently doing with mysqli_real_escape_string is good, but it isn't really sufficient.
htmlspecialchars and htmlentities have specific purposes: to convert characters in strings that you are going to output into an HTML document. Just wait to use them until you are ready to do that.
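A rough sketch of that order of operations: store the raw text through a prepared statement, and encode only when printing. To keep it runnable without a MySQL server, this uses PDO with an in-memory SQLite database, which is a substitution on my part (the question uses mysqli); the posts table and its columns are hypothetical. The same pattern applies to mysqli's prepare() and bind_param().

```php
<?php
// Sketch only: in-memory SQLite via PDO stands in for the real MySQL database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)');

// Stand-in for $_POST['text'] from the question.
$text = 'Welcome & This is a test paragraph';

// Store the raw text; the bound parameter takes care of SQL safety.
$insert = $pdo->prepare('INSERT INTO posts (body) VALUES (:body)');
$insert->execute([':body' => $text]);

// Read it back: it is unchanged.
$stored = $pdo->query('SELECT body FROM posts WHERE id = 1')->fetchColumn();

// Encode exactly once, at the moment of output into HTML.
$html = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');
echo $html, PHP_EOL; // Welcome &amp; This is a test paragraph
```

Because the database holds the original text, nothing needs to be decoded later, and the same row can be reused for non-HTML output (JSON, email, CSV).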
I'm using a 3rd party API that seems to return its data with the entity codes already in there, such as The Lion&#8217;s Pride.
If I print the string as-is from the API, it renders just fine in the browser (in the example above it would put in an apostrophe). However, I can't trust that the API will always use the entities in the future, so I want to use something like htmlentities or htmlspecialchars myself before I print it. The problem is that this will encode the ampersand in the entity code again, and the end result will be The Lion&amp;#8217;s Pride in the HTML source, which doesn't render as anything user friendly.
How can I use htmlentities or htmlspecialchars only if it hasn't already been used on the string? Is there a built-in way to detect if entities are already present in the string?
No one seems to be answering your actual question, so I will:
How can I use htmlentities or htmlspecialchars only if it hasn't already been used on the string? Is there a built-in way to detect if entities are already present in the string?
It's impossible. What if I'm making an educational post about HTML entities and I want to actually print this on the screen:
The Lion&#8217;s Pride
... it would need to be encoded as...
The Lion&amp;#8217;s Pride
But what if that was the actual string we wanted to print on the screen? ... and so on.
Bottom line is, you have to know what you've been given and work from there – which is where the advice from the other answers comes in – which is still just a workaround.
What if they give you double-encoded strings? What if they start wrapping the HTML-encoded strings in XML? And then wrap that in JSON? ... And then the JSON is converted to binary strings? The possibilities are endless.
It's not impossible for the API you depend on to suddenly switch the output type, but it's also a pretty big violation of the original contract with your users. To some extent, you have to put some trust in the API to do what it says it's going to do. Unit/Integration tests make up the rest of the trust.
And because you could never write a program that works for any possible change they could make, it's senseless to try to anticipate any change at all.
Decode the string with html_entity_decode(), then re-encode the entities:
$string = htmlspecialchars(html_entity_decode($string));
https://eval.in/662095
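To make the trade-off concrete, here is a small sketch of the decode-then-encode approach. The helper name normalize_for_html() and the explicit flags are my own choices for illustration. It gives the same result whether or not the input was already encoded, but it cannot tell an entity apart from text that is meant to show an entity literally:

```php
<?php
// Hypothetical helper: decode any existing entities, then encode exactly once.
function normalize_for_html(string $s): string
{
    $flags = ENT_QUOTES | ENT_HTML5;
    return htmlspecialchars(html_entity_decode($s, $flags, 'UTF-8'), $flags, 'UTF-8');
}

// Raw and already-encoded input end up identical.
$fromRaw     = normalize_for_html('Tom & Jerry');
$fromEncoded = normalize_for_html('Tom &amp; Jerry');

// The numeric entity becomes a real right single quote (U+2019).
$lion = normalize_for_html('The Lion&#8217;s Pride');

// Caveat: raw text that should display "&amp;" literally is treated as an entity.
// The lossless encoding would have been 'Type &amp;amp; for an ampersand'.
$lossy = normalize_for_html('Type &amp; for an ampersand');

echo $fromRaw, PHP_EOL, $fromEncoded, PHP_EOL, $lion, PHP_EOL, $lossy, PHP_EOL;
```

So this works as a workaround when you know the API only ever sends display text, and it is exactly the guess the other answers warn about when you do not.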
There is NO WAY to do what you ask for!
You must know what kind of data is the service giving back.
Anything else would be guessing.
Example:
What if the service is giving back &amp; but is not escaping?
You would guess that it IS escaping, so you would wrongly interpret the value as & while the correct value is &amp;.
I think the best solution is to first decode all HTML entities/special chars in the original string, and then HTML-encode the string again.
That way you will end up with a correctly encoded string, no matter if the original string was encoded or not.
You also have the option of using htmlspecialchars_decode();
$string = htmlspecialchars_decode($string);
It's already in htmlentities:
php > echo htmlentities('Hi&amp;mom', ENT_HTML5, ini_get('default_charset'), false);
Hi&amp;mom
php > echo htmlentities('Hi&amp;mom', ENT_HTML5, ini_get('default_charset'), true);
Hi&amp;amp;mom
Just set the optional 4th argument (double_encode) to false to NOT double-encode.
I have a method that scrapes data from a URL and returns it as a string variable. Currently the method works if I put in my own URL, but when I insert a generated URL it doesn't work.
e.g.
The following string is working if I insert it into a variable, and pass it:
http://www.rijkswaterstaat.nl/apps/geoservices/rwsnl/awd.php?mode=html&projecttype=windsnelheden_en_windstoten&category=1&loc=ZBWI&net=LMW
But this string is being generated by another source. The result of my attempt to fetch it is (var_dump()):
string(154) "http://www.rijkswaterstaat.nl/apps/geoservices/rwsnl/awd.php?mode=html&amp;projecttype=windsnelheden_en_windstoten&amp;category=1&amp;loc=ZBWI&amp;net=LMW"
The string is only 138 characters, yet var_dump() reports string(154). I think this has something to do with the fact that it is not working, but I'm not even sure...
Does anyone have any idea how to clean this up? I have found other questions asking why var_dump() shows a different length than the visible string, and that had something to do with invisible characters, but no real solution is given anywhere.
Thx
154 - 138 = 16
You have 4 & characters in the string.
& HTML-encoded is &amp;, which is 4 characters longer, and 4 x 4 = 16.
So your string seems to be HTML encoded - in the browser you don't see the encoding unless you "View Source".
You can use html_entity_decode() to decode the string or, if possible, make sure that you get a string that is not encoded for HTML output in the first place.
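The arithmetic can be checked directly. In this sketch the encoded string is produced with htmlspecialchars() to imitate what the other source hands over (an assumption about how it was encoded), and html_entity_decode() restores the usable URL:

```php
<?php
$clean = 'http://www.rijkswaterstaat.nl/apps/geoservices/rwsnl/awd.php'
       . '?mode=html&projecttype=windsnelheden_en_windstoten&category=1&loc=ZBWI&net=LMW';

// Imitate the generated string: every & becomes &amp; (4 extra bytes each).
$encoded = htmlspecialchars($clean, ENT_QUOTES, 'UTF-8');

// Undo the HTML encoding before using the string as a URL.
$decoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');

var_dump(strlen($clean));             // int(138)
var_dump(strlen($encoded));           // int(154)
var_dump(substr_count($clean, '&'));  // int(4)
var_dump($decoded === $clean);        // bool(true)
```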
Is there a PHP function that can take a string and convert any special characters to Unicode, similar to htmlspecialchars() or UTF8_encode()?
For example in the string: "I think Bob's going too".
I would need the apostrophe or single right quote unicode in place of the apostrophe in "Bob's". So then after conversion the string should read: "I think Bob\u2019s going too".
I need this for use in a PHP script that prints into a javascript function.
Using \ to escape or ' does not work; it stops the script from running. I am trying to use Flowplayer's Playlist plugin. The only way it seems I can have a string with special characters is if they are in Unicode.
Here is a JSFIDDLE to play around with and see what I mean when I say it doesn't work. Just replace \u2019 with ' or something similar and click to have the song play. The media player just goes black and doesn't play anything, whereas if you leave it with \u2019 then it plays fine.
Any help is appreciated.
I think json_encode() is the function you are looking for here.
The following code:
$string = "I think Bob’s going too";
print_r(json_encode($string));
will output:
"I think Bob\u2019s going too"
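For printing into a JavaScript function, a sketch along these lines may help. The JS variable name title and the sample strings are made up for illustration. The JSON_HEX_* flags are real json_encode() options that additionally escape straight quotes, ampersands and angle brackets, which matters when the script sits inside an HTML page:

```php
<?php
// The curly apostrophe (U+2019) is escaped by json_encode() by default.
$title = "I think Bob\u{2019}s going too";
$js = json_encode($title);
echo "var title = {$js};", PHP_EOL; // var title = "I think Bob\u2019s going too";

// A straight apostrophe and markup are only escaped when the HEX flags are set.
$flags = JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_AMP | JSON_HEX_QUOT;
$tricky = json_encode("Bob's <b>song</b>", $flags);
echo "var label = {$tricky};", PHP_EOL;
```

Note that json_encode() already adds the surrounding double quotes, so the result should be printed without extra quotes around it.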
I'm having a bit of a problem. I am trying to create an IRC bot, which has an ampersand in its password. However, I'm having trouble putting the ampersand in a string. For example...
<?php
$var = "g&abc123";
echo $var;
?>
I believe this should print g&abc123. However it's printing g.
I have tried this as well:
<?php
$arr = array("key" => "g&abc123");
print_r($arr);
?>
This prints it correctly with the g&abc123, however when I say echo $arr['key']; it prints g again. Any help would be appreciated. I'm running PHP5.3.1.
EDIT: Also, I just noticed that if I use g&abc123&abc123 it prints g&abc123. Any suggestions?
I don't have that issue in a console:
php > $d="g&abc123";
php > echo $d;
g&abc123
What environment are you printing the output to? It sounds like you are viewing it in a web browser, and the & is being interpreted as a malformed HTML entity. Try replacing the & symbol with the entity-encoded version, &amp;.
Look at the source code, it will be printing the correct code.
If you want it to print out correctly in HTML, then run htmlentities on it or write the & as &amp;.
View the web page source to make sure your variable contains the correct value.
You're probably sending your output to a Web browser.
The correct way of doing it is
In HTML, XHTML and XML, the ampersand has a special meaning. It is used for character entities. You can think of it as an escape sequence of sorts.
For instance, in PHP, this would be illegal:
$variable = 'It's Friday';
This is because the apostrophe is interpreted by PHP as the end of your string, and the rest of your content looks like garbage.
Instead, you have to say:
$variable = 'It\'s Friday';
Similarly, in HTML and XHTML, you can't say
<h1>Inequalities</h1>
<p> x<yz+3 </p>
This is because the < would be interpreted as the start of an element.
Instead, you'd have to say:
<h1>Inequalities</h1>
<p> x&lt;yz+3 </p>
Now, as you can see, the ampersand itself has a special meaning and, therefore, needs to be escaped as &amp;. htmlspecialchars() will do it for you.
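A short sketch of that last point, using the password from the question and the inequality from this answer (the variable names and the ENT_QUOTES / 'UTF-8' arguments are my own choices):

```php
<?php
$password = 'g&abc123';

// Escape for HTML output; the variable itself still holds the real value.
$safe = htmlspecialchars($password, ENT_QUOTES, 'UTF-8');
$math = htmlspecialchars('x<yz+3', ENT_QUOTES, 'UTF-8');

echo "<p>{$safe}</p>", PHP_EOL; // <p>g&amp;abc123</p>
echo "<p>{$math}</p>", PHP_EOL; // <p>x&lt;yz+3</p>
```

The escaped form is only for the browser. When the bot sends the password to the IRC server, it should send $password unchanged.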