php strings with <>s in them seem to need escaping - php

I've got this down to a two line php file:
$s = "<option value=\"%id%\">%desc%</option>";
die($s);
This will output:
%desc%
If I change the <>s to & l t ; & g t ;
then it works but the string I am trying to output is to be interpreted as HTML.
I don't see anything in the PHP string docs to indicate that <>s are special characters that need escaping.
Funny thing is, I have the same problem happening when trying to quote the problem in this forum!
What's the deal?

You are getting the correct output to your browser, that is
<option value="%id%">%desc%</option>
However, your browser is then parsing this as HTML. You can confirm this by viewing the raw HTML source.
If you do not want your browser to parse it as HTML, use &lt; and &gt;.
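A minimal sketch of the difference, assuming the goal is to display the markup literally instead of rendering it. The variable names are illustrative and not from the question; htmlspecialchars() is the standard PHP function for this kind of escaping.

```php
<?php
// The string from the question; single quotes avoid the need for \" escapes.
$s = '<option value="%id%">%desc%</option>';

// Escaped for HTML output, the browser shows the tag text instead of parsing it.
$asText = htmlspecialchars($s, ENT_QUOTES, 'UTF-8');

echo $asText, PHP_EOL;
// prints: &lt;option value=&quot;%id%&quot;&gt;%desc%&lt;/option&gt;
```

If the string is meant to be rendered as an actual option element, it should be sent unescaped, and viewing the page source will confirm that PHP emitted it intact.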

Related

use of <> in php and in wamp

This must be terribly well known, but I can't find anything relevant. Strings in php can't have the characters < and > in them in the wamp environment. It all works fine in the live server. e.g. under wamp
$teststring = 'aaa<bbb';
echo $teststring;
produces aaa.
I want to edit html files using str_replace() and preg_replace().
I guess I have to modify the php set up but I don't know how.
I assume it doesn't have anything to do with WAMP, and I'm not even sure I understand the situation fully, because you say "Strings in php cant have the characters <> in them in the wamp environment" (I interpret it as: It doesn't work with WAMP) and then directly afterwards "It all works fine in the live server. e g under wamp" (I interpret it as: It does work with WAMP).
But I believe the problem is just that you are adding stray unencoded <'s into the HTML output. Think about what would happen normally:
<?php echo "I can write <em>emphasized</em> text!"; ?>
...would result in:
I can write emphasized text!
...and not:
I can write <em>emphasized</em> text!
Because you can output HTML from PHP, and the browser will read it as it would read any static HTML page. Now if you just include a random <, it will be interpreted as HTML as said in the comments and will not be valid.
So, in order to have a literal < shown in the browser, it has to be encoded as an HTML entity, in this case &lt;, e.g. 3 &lt; 4 instead of 3 < 4. This can be done automatically using the function htmlentities. For example:
<?php echo htmlentities("This is a string with < and > and & and other stuff like this which has to be encoded."); ?>

How to get &curren to display literally, not as an HTML entity

I'm using php to look at an XML file that has a URL in it. The URLs look something like this:
https://site.com/bacon_report?Id=1&report=1&currentDimension=2&param=1
When I echo out the URLs, the "&curren" shows up as "¤" (AKA #164, A4 or currency symbol) and the links don't work. This happens even though there isn't a closing semicolon for it. What is the cleanest way to make "&curren" display literally?
Funny enough I ran into the same problem just now and I found this answer. However, I found another solution which might even be better!
Simply put the variable at the beginning of your query string, and you will avoid the &curren completely.
Do:
https://site.com/bacon_report?currentDimension=2&Id=1&report=1&param=1
instead of:
https://site.com/bacon_report?Id=1&report=1&currentDimension=2&param=1
Use the php function urlencode:
urlencode("https://site.com/bacon_report?Id=1&report=1&currentDimension=2&param=1")
will output
https%3A%2F%2Fsite.com%2Fbacon_report%3FId%3D1%26report%3D1%26currentDimension%3D2%26param%3D1
The problem here is escaping - you need to escape the "&" characters. In XML all special characters like <, >, ', " and & should be escaped.
Escape it properly as
https://example.com/bacon_report?Id=1&amp;report=1&amp;currentDimension=2&amp;param=1
..just like in HTML:
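<a href="https://example.com/bacon_report?Id=1&report=1&currentDimension=2&param=1">WRONG - no escaping</a>
<a href="https://example.com/bacon_report?Id=1&amp;report=1&amp;currentDimension=2&amp;param=1">CORRECT - correct escape sequence</a>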
So - the cleanest way to show "&curren" in HTML/XML is to properly escape the ampersand, and render it as "&amp;curren".
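As a rough sketch of that advice in PHP, assuming the URL is being echoed into an HTML page: htmlspecialchars() turns each & into &amp;, so the browser no longer reads "&curren" as the start of the currency entity. The names $url, $escaped and $link are illustrative only.

```php
<?php
// URL taken from the question.
$url = 'https://site.com/bacon_report?Id=1&report=1&currentDimension=2&param=1';

// Escape once, at the point where the value is written into HTML.
$escaped = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');

// The same escaped value is safe both inside the href attribute and as link text.
$link = '<a href="' . $escaped . '">' . $escaped . '</a>';

echo $escaped, PHP_EOL;
// prints: https://site.com/bacon_report?Id=1&amp;report=1&amp;currentDimension=2&amp;param=1
```

The browser decodes &amp; back to & when it follows the link, so the server still receives the original query string.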
I think that in this case it is best to use htmlentities because with urlencode you get
https%3A%2F%2Fexample.com%2Fbacon_report%3FId%3D1%26report%3D1%26currentDimension%3D2%26param%3D1
and when applying urldecode, you will still have the &curren symbol
whereas with htmlentities the URL comes out clean.
https://example.com/bacon_report?Id=1&report=1&currentDimension=2&param=1
I came across this issue while working on technical documentation (in Markdown which gets converted to HTML).
To solve the issue I used a zero-width space character which I copied and pasted from between these brackets (​). That way it appears that there is no space and can include the below without any issues:
/search?query=1&currentLonLat=-74.600291,40.360869

PHP htmlentities not working even with parameters

Of course this has been asked before, and I have searched for solutions, none of which have worked so far. I want to change the TM symbol and the ampersand to their HTML equivalents by using htmlentities or htmlspecialchars:
$TEST = "Kold Locker™ & other stuff";
echo "ORGINIAL: " . $TEST . "<BR/>";
echo "HTML: " . htmlentities($TEST, ENT_COMPAT, 'UTF-8');
This displays:
ORGINIAL: Kold Locker™ & other stuff
HTML:
I have also tried it with htmlspecialchars and the second parameter changed with the same result.
What am I missing that others have claimed worked in other solutions?
UPDATE: I tried just displaying utf8_encode($TEST) and it displayed HTML: Kold Locker™ & other stuff
I don't know why, but this worked for me (I had to call htmlentities twice):
$html="<html> <head><head>something like this </html>";
$entities_correction= htmlentities( $html, ENT_COMPAT, 'UTF-8');
echo htmlentities( $entities_correction, ENT_COMPAT, 'UTF-8');
output :
<html> <head><head>something like this </html>
I thought I had the same problem as Pjack (msg of Jul 14 at 8:54):
$str = "A 'quote' is <b>bold</b>";
echo htmlentities($str);
gives in the Browser (Firefox in my case) the original string $str (without any translation), while
echo htmlentities(htmlentities($str));
gives:
A 'quote' is &lt;b&gt;bold&lt;/b&gt;
(I use PHP/5.4.16 obtained from windows-7 XAMPP).
However, after some more thought it occurred to me that the browser shows the strings &gt; and &lt; as > and <.
(See the source code in the browser.) The second call of htmlentities translates & into &amp;, and only then does the browser show what you expected in the first place.
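The effect described above can be sketched as follows. ENT_COMPAT and UTF-8 are passed explicitly here because the default flags differ between PHP versions; the variable names $once and $twice are illustrative.

```php
<?php
$str = "A 'quote' is <b>bold</b>";

// First pass: this is what ends up in the page source.
$once = htmlentities($str, ENT_COMPAT, 'UTF-8');

// Second pass: the & of each entity is itself escaped.
$twice = htmlentities($once, ENT_COMPAT, 'UTF-8');

echo $once, PHP_EOL;   // source: A 'quote' is &lt;b&gt;bold&lt;/b&gt;
echo $twice, PHP_EOL;  // source: A 'quote' is &amp;lt;b&amp;gt;bold&amp;lt;/b&amp;gt;
```

A browser decodes one level of entities when rendering, so the first line is displayed as the original string and the second is displayed as the entity text. Calling the function once is normally what is wanted; the second call only helps when the goal is to show the entity names themselves on the page.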
Your code works for me :-?
In the manual page for htmlentities() we can read:
Return Values
Returns the encoded string.
If the input string contains an invalid code unit sequence within the
given encoding an empty string will be returned, unless either the
ENT_IGNORE or ENT_SUBSTITUTE flags are set.
My guess is that the input data is not properly encoded as UTF-8 and the function is returning an empty string. (Assuming that the script is not crashing, i.e., code after that part still runs.)
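A small sketch of that guess, under the assumption that the ™ character reached PHP as the single Windows-1252 byte 0x99 and not as UTF-8. The sample string is hypothetical, constructed only to reproduce the empty result.

```php
<?php
// Hypothetical input: "™" as the lone byte 0x99, which is not valid UTF-8.
$raw = "Kold Locker\x99 & other stuff";

// preg_match() with the /u modifier fails on subjects that are not valid UTF-8.
$isValidUtf8 = preg_match('//u', $raw) === 1;

// Without ENT_IGNORE or ENT_SUBSTITUTE, invalid input yields an empty string.
$strict = htmlentities($raw, ENT_COMPAT, 'UTF-8');

// ENT_SUBSTITUTE replaces the invalid byte with U+FFFD and keeps the rest.
$lenient = htmlentities($raw, ENT_COMPAT | ENT_SUBSTITUTE, 'UTF-8');

var_dump($isValidUtf8);     // bool(false)
var_dump($strict === '');   // bool(true)
echo $lenient, PHP_EOL;
```

ENT_SUBSTITUTE only hides the symptom. If the check fails, the more durable fix is probably to save the source file as UTF-8 or to convert the data to UTF-8 where it enters the script.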
I had almost the same problem (somehow the same text was shown every time), and with a combination of different echoes I figured it out. Web browsers like Firefox seem to show the same text every time because, when you echo the htmlentities text, it is converted back into normal text while being rendered. When I echo a script that passes the variable/text to console.log, it actually shows the htmlentities text (almost) correctly. Instead of replacing every special character with HTML entities, it replaces them with some other encoding I have seen before (I can't remember the name). Calling htmlentities on it again, I get the same text echoed again (remember it converts everything), but echoing it in the console.log version gives me the expected result. So, to summarize:
1. Execute htmlentities two times!
2. Don't (at least with Firefox) echo the htmlentities result as normal text into the web page. If you'd like to check whether the value is actually correct, echo a script that logs it to the console.
I hope this could help other guys with the same problem,
VicStudio
EDIT: 3. If you are using a $_POST form, don't forget to add accept-charset="UTF-8" (or some other charset) to the <form> tag.
EVEN MORE EDIT: Only call htmlentities twice if you want to echo the result as normal text into the page. If you want to send it directly to, for example, a database, only call it once! So what I said before is partially wrong. :(
This is an old post, but for anyone still looking for a solution, here is what I use with success:
echo html_entity_decode($htmlString);

PHP string cut short

Why does this code
$string = "!##$%^&*(<a#g.com";
echo $string;
only output:
!##$%^&*(
Is this a PHP bug?
Because < is a reserved character in HTML :)
Use &lt; and &gt;
Read this for more information
http://www.w3schools.com/HTML/html_entities.asp
You can use the function htmlspecialchars to convert such special chars
http://php.net/manual/en/function.htmlspecialchars.php
I'm not seeing that:
http://ideone.com/zhycx
Perhaps you've got some weird characters in your file? Make sure you're using a "normal" encoding on your source code, as well.
You need to do:
echo htmlentities($string);
to display the string as it is in a browser. This is because the < in the string is interpreted by the browser as the start of an HTML tag.
So it's not PHP but the browser that is causing this behavior. If you do the exact same display on a command line, you'll see all the characters.
If you are viewing the output in a web browser, then the < begins a tag and is usually not displayed but interpreted in the HTML document structure parser. Also, a $ inside of a double-quoted string is interpolated as the variable name that follows it; try using single quotes where this won't happen.
Try this:
$string = '!##$%^&*(<a#g.com';
echo htmlentities($string);

PHP Ampersand in String

I'm having a bit of a problem. I am trying to create an IRC bot, which has an ampersand in its password. However, I'm having trouble putting the ampersand in a string. For example...
<?php
$var = "g&abc123";
echo $var;
?>
I believe this should print g&abc123. However it's printing g.
I have tried this as well:
<?php
$arr = array("key" => "g&abc123");
print_r($arr);
?>
This prints it correctly with the g&abc123, however when I say echo $arr['key']; it prints g again. Any help would be appreciated. I'm running PHP5.3.1.
EDIT: Also, I just noticed that if I use g&abc123&abc123 it prints g&abc123. Any suggestions?
I don't have that issue in a console:
php > $d="g&abc123";
php > echo $d;
g&abc123
What environment are you printing the output to? It sounds like you are viewing it in a web browser, and the & is being interpreted as a malformed HTML entity. Try replacing the & symbol with the entity encoded version &amp;.
Look at the source code, it will be printing the correct code.
If you want it to print out correctly in HTML, then run htmlentities on it or write the & as &amp;
View the web page source to make sure your variable contains the correct value.
You're probably sending your output to a Web browser.
The correct way of doing it is
In HTML, XHTML and XML, the ampersand has a special meaning. It is used for character entities. You can think of it as an escape sequence of sorts.
For instance, in PHP, this would be illegal:
$variable = 'It's Friday';
This is because the apostrophe is interpreted by PHP as the end of your string, and the rest of your content looks like garbage.
Instead, you have to say:
$variable = 'It\'s Friday';
Similarly, in HTML and XHTML, you can't say
<h1>Inequalities</h1>
<p> x<yz+3 </p>
This is because it would be interpreted as an element.
Instead, you'd have to say:
<h1>Inequalities</h1>
<p> x&lt;yz+3 </p>
Now, as you can see, the ampersand itself has a special meaning and, therefore, needs to be escaped as &amp;. htmlspecialchars() will do it for you.
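A short sketch of htmlspecialchars() applied to the two values discussed in this thread. The variable names are illustrative, and the flags and charset are passed explicitly so the result does not depend on the PHP version's defaults.

```php
<?php
$password   = 'g&abc123';  // value from the question
$inequality = 'x<yz+3';    // text from the HTML example above

// Escape only at the point where the value is written into an HTML page.
$safePassword   = htmlspecialchars($password, ENT_QUOTES, 'UTF-8');
$safeInequality = htmlspecialchars($inequality, ENT_QUOTES, 'UTF-8');

echo $safePassword, PHP_EOL;                    // g&amp;abc123
echo '<p> ', $safeInequality, ' </p>', PHP_EOL; // <p> x&lt;yz+3 </p>
```

The escaped form is only for HTML output. The value sent to the IRC server should stay as the original, unescaped $password.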
