HTML string from API not being converted to HTML rendered entities - php

I'm receiving a chunk of HTML via API call and trying to put that HTML into my template for it to render. But, instead of rendering, it's being printed out as if it were a string.
Example of HTML string from API:
\u0026lt;p\u0026gt;\u0026lt;strong\u0026gt;Hello World\u0026lt;/strong\u0026gt;\u0026lt;/p\u0026gt;
Then in the controller I convert the string to HTML entities
$content = htmlspecialchars_decode($response['content']);
The issue I'm having is that in my view, the HTML is printed (tags and all) instead of being rendered as HTML:
In view code:
<?= $content ?>
End result:
<p><strong>Hello World</strong></p>
How can I get this HTML chunk to render in my view?

Your data looks double-encoded. Try:
$content = htmlspecialchars_decode(htmlspecialchars_decode($response['content']));

From the looks of your string, you don't need htmlspecialchars_decode(), since you don't have HTML special characters directly in it.
I suspect you are getting your data in JSON format. A JSON-encoded value can have special characters converted to Unicode escape sequences such as \u0026 (for example, the PHP json_encode() function does that when used with the JSON_HEX_* options).
Try this:
$content = json_decode('"' . $response['content'] . '"');
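A minimal sketch of how the two suggestions might combine, assuming the controller still holds the raw \u0026 escapes exactly as shown in the question. The variable names are illustrative only. The first step resolves the JSON escapes, the second turns the remaining entities into tags:

```php
<?php
// Assumed raw value, copied from the question's example string.
$raw = '\u0026lt;p\u0026gt;\u0026lt;strong\u0026gt;Hello World\u0026lt;/strong\u0026gt;\u0026lt;/p\u0026gt;';

// Step 1: wrap the value in double quotes so json_decode() treats it as a
// JSON string and resolves each \u0026 escape to "&".
$entities = json_decode('"' . $raw . '"');

// Step 2: the result still holds &lt; and &gt; entities; decode them into tags.
$content = htmlspecialchars_decode($entities);

echo $content, "\n";
```

Wrapping the value in quotes only works while the value itself contains no unescaped double quotes; if the whole API response is JSON, decoding the full response once is likely the safer route.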

Related

How to use php echo without quotes to echo html code?

I'm trying to print out some HTML code which isn't stored as a simple string, so I need to decode it before echoing it. My problem is that when I echo the decoded value I keep getting these quotes, and they are ruining the output. This is how it looks:
<div>
"<h1 class="ql-align-center">TEST</h1>"
</div>
Because of these quotes, the h1 fails to form and is printed as text instead of HTML.
So I'm wondering: is it possible to print it as HTML code, without the quotes?
This is the PHP code that generates it:
<?php echo html_entity_decode($singleEmail['camp_desc'], ENT_NOQUOTES, 'UTF-8'); ?>
Also, this is the database value of 'camp_desc' that has to be decoded before being printed:
&amp;lt;h1 class=&amp;quot;ql-align-center&amp;quot;&amp;gt;TEST&amp;lt;/h1&amp;gt;
and the output of the PHP code above after decoding is:
&lt;h1 class=&quot;ql-align-center&quot;&gt;TEST&lt;/h1&gt;
but since I'm using echo to print, PHP seems to wrap it in quotes and the <h1> tag becomes plain text instead of an HTML element.
I don't know where the quotes are coming from - the code you have in your question doesn't add extra quotes so they are coming from somewhere else.
However if you want the HTML string to be rendered as HTML instead of displaying the tags as text, you can do the following:
Starting with this value in your variable:
&amp;lt;h1 class=&amp;quot;ql-align-center&amp;quot;&amp;gt;TEST&amp;lt;/h1&amp;gt;
displayed as: &lt;h1 class=&quot;ql-align-center&quot;&gt;TEST&lt;/h1&gt;
...you can use html_entity_decode to decode it, which gives the following output, i.e. it converts it into a string that will display as plain-text HTML when you echo it:
&lt;h1 class=&quot;ql-align-center&quot;&gt;TEST&lt;/h1&gt;
displayed as: <h1 class="ql-align-center">TEST</h1>
...now we need to decode this to turn it into the HTML elements that will be displayed as a H1 tag in the page:
<h1 class="ql-align-center">TEST</h1>
displayed as: TEST
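Code: To do this, you need to call html_entity_decode twice before it will display the string as HTML elements:
<?php
// First pass: removes one level of entity encoding.
$htmlstr = html_entity_decode($singleEmail['camp_desc'], ENT_QUOTES, 'UTF-8');
// Second pass: produces the tags. ENT_QUOTES (not ENT_NOQUOTES) is used so that
// &quot; in attribute values is decoded as well.
echo html_entity_decode($htmlstr, ENT_QUOTES, 'UTF-8');
?>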
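If the number of encoding passes is not known in advance, one option is to decode until the string stops changing. The sketch below assumes the stored value was entity-encoded twice; decodeFully() is a hypothetical helper written for this example, not a PHP built-in:

```php
<?php
// Assumed stored value: the markup was entity-encoded twice.
$stored = '&amp;lt;h1 class=&amp;quot;ql-align-center&amp;quot;&amp;gt;TEST&amp;lt;/h1&amp;gt;';

// Hypothetical helper: decode repeatedly until nothing changes, with a small
// cap on the number of passes so unusual input cannot loop for long.
function decodeFully(string $value, int $maxPasses = 5): string
{
    for ($i = 0; $i < $maxPasses; $i++) {
        $decoded = html_entity_decode($value, ENT_QUOTES, 'UTF-8');
        if ($decoded === $value) {
            break; // already fully decoded
        }
        $value = $decoded;
    }
    return $value;
}

$html = decodeFully($stored);
echo $html, "\n";
```

Decoding to a fixed point also decodes entities the author meant to display literally, so this is only suitable when the stored value is known to be over-encoded markup.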
What if you strip those quotes before echoing?
You could use a regular expression, or a pair of replacements such as:
$yourDecodedHtml = str_replace(['"<', '>"'], ['<', '>'], $yourDecodedHtml);

TinyMce not storing html, just raw text (laravel)

So I added TinyMCE with this method:
<script src="https://cdn.tiny.cloud/1/myapihere/tinymce/5/tinymce.min.js"></script>
<script>tinymce.init({selector:'textarea'});</script>
and added a textarea further down the page. But for some reason the result is not what I expect.
This is what I want it to show me when the post is updated: (screenshot not included)
This is what it shows me: (screenshot not included)
If I understand correctly, your problem is that when you output the string from TinyMCE, you get the raw HTML without any formatting.
I think the problem is how you output the string. When outputting HTML in a Blade template, don't use {{ $content }}; it automatically escapes HTML special characters.
To output HTML, you have to use {!! $content !!}. This outputs your string as is, without escaping.
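The difference between the two Blade forms can be approximated in plain PHP. This is only a rough sketch: it assumes {{ }} behaves like a call to htmlspecialchars(), and the variable names are made up for the example:

```php
<?php
$content = '<p><strong>Hello</strong></p>';

// Roughly what {{ $content }} emits: the markup is escaped, so the browser
// shows the tags as text.
$escaped = htmlspecialchars($content, ENT_QUOTES, 'UTF-8');

// Roughly what {!! $content !!} emits: the string is echoed unchanged, so the
// browser renders the tags.
$raw = $content;

echo $escaped, "\n";
echo $raw, "\n";
```

Because the unescaped form passes markup straight to the browser, it is best reserved for content that has already been sanitized.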

Replace a string with HTML

I'm using str_replace as follows:
<html>
<head>
</head>
<body>
<script type="text/php">
$new_str = str_replace( '[[str_to_replace]]' , $GLOBALS['html'] , $original_str );
</script>
<div class="wrapper">[[str_to_replace]]</div>
<?php
// multiple includes
// lots and lots of code
//PHP code to calculate HTML code
//value of $html depends on data calculated after div.wrapper is drawn
$GLOBALS['html'] = '<input type="text" />';
?>
</body>
</html>
I'm forced to wrap the PHP code in a script tag because the document is getting passed to a library as an HTML document. The library has the ability to execute PHP code inside script tags but in this case is working oddly.
What I'm expecting:
[[str_to_replace]] should become an HTML input field.
What I'm getting:
[[str_to_replace]] becomes the literal string <input type="text" />.
How do I get the first result (a rendered input field)?
You're likely misinterpreting what you can do with inline script. In dompdf, HTML is parsed separately from inline script. Any HTML you insert into the document using inline script will be treated as plain text. What you should be doing is processing your document first, then passing the result to dompdf.
FYI, It's hard to see from your sample exactly what you're doing in the code. Plus we can't see what's going on with dompdf. I'm having a hard time seeing how everything ties together.
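The "process first, then pass to dompdf" suggestion could look roughly like the sketch below. The template string and variable names are assumptions that mirror the question, and the final hand-off is only indicated in a comment because it depends on the dompdf version in use:

```php
<?php
// Assumed template containing the placeholder from the question.
$template = '<html><body><div class="wrapper">[[str_to_replace]]</div></body></html>';

// Compute the HTML fragment first (this stands in for the "lots and lots of code").
$fragment = '<input type="text" />';

// Substitute in ordinary PHP, before the document reaches the PDF library,
// so the fragment is part of the markup that gets parsed.
$document = str_replace('[[str_to_replace]]', $fragment, $template);

echo $document, "\n";
// $document would then be handed to dompdf (for example via its loadHtml() method).
```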
It sounds like what you're trying to do is replace the placeholder with decoded HTML entities. You'll probably want to do:
$htmlEntityString = '&amp;'; // String containing HTML entities that you want to decode.
$new_str = str_replace( '[[str_to_replace]]' , html_entity_decode($htmlEntityString) , $original_str );
In this case, whatever HTML you have in entity-encoded form will be decoded and will replace the placeholder.
Read more about it for all the options:
http://php.net/manual/en/function.html-entity-decode.php

php output xml produces parse error "&rsquo;"

Is there any function that I can use to process any string to ensure it won't cause XML parsing problems? I have a PHP script outputting an XML file with content obtained from forms.
The thing is, apart from the usual string checks from a PHP form, some of the user text causes XML parsing errors. I'm facing this "&rsquo;" in particular. This is the error I'm getting: Entity 'rsquo' not defined
Does anyone have any experience in encoding text for xml output?
Thank you!
Some clarification:
I'm outputting content from forms in an XML file, which is subsequently parsed by JavaScript.
I process all form inputs with: htmlentities(trim($_POST['content']), ENT_QUOTES, 'UTF-8');
When I want to output this content into an XML file, how should I encode it so that it won't throw XML parsing errors?
So far the following 2 solutions work:
1) echo '<content><![CDATA['.$content.']]></content>';
2) echo '<content>'.htmlspecialchars(html_entity_decode($content, ENT_QUOTES, 'UTF-8'),ENT_QUOTES, 'UTF-8').'</content>'."\n";
Are the above 2 solutions safe? Which is better?
Thanks, sorry for not providing this information earlier.
You're approaching this the wrong way: don't look for a parser that doesn't give you errors. Instead, produce well-formed XML.
How did you get &rsquo; from the user? If they literally typed it in, you are not processing the input correctly; for example, you should escape & to &amp;. If it is you who put the entity there (perhaps in place of some apostrophe), either define it in the DTD (<!ENTITY rsquo "&#x2019;">) or write it using numeric notation (&#8217;), because almost all of the named entities are part of HTML. XML defines only a few basic ones, as Gumbo pointed out.
EDIT based on additions to the question:
In #1, the CDATA wrapper breaks if the user types something containing ]]> (for example ]]> <°)))><), because that sequence closes the CDATA section early.
In #2, you decode and then re-encode, which undoes the earlier htmlentities() call and escapes only the XML special characters. The decoding should not be necessary if you stop pre-encoding the input (and don't expect users to post values like &amp; that should be interpreted as &).
If you use htmlspecialchars() with ENT_QUOTES, it should be ok, but see how Drupal does it.
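Both options from the question can be made robust along these lines. The sketch assumes $content holds the raw user text (not already passed through htmlentities()), and the sample value is invented for the example:

```php
<?php
// Assumed raw user input containing a right single quote (U+2019), a CDATA
// terminator, an ampersand and angle brackets.
$content = "It\u{2019}s ]]> tricky & <odd>";

// Option 1: CDATA. Any "]]>" in the text is split across two CDATA sections
// so it cannot close the section early.
$cdata = '<![CDATA[' . str_replace(']]>', ']]]]><![CDATA[>', $content) . ']]>';

// Option 2: escape only the characters that are special in XML.
$escaped = htmlspecialchars($content, ENT_XML1 | ENT_QUOTES, 'UTF-8');

echo '<content>' . $cdata . '</content>', "\n";
echo '<content>' . $escaped . '</content>', "\n";
```

Neither option produces named HTML entities such as &rsquo;, so the XML parser only ever sees the entities that XML itself defines.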
html_entity_decode($string, ENT_QUOTES, 'UTF-8')
Enclose the value within CDATA tags:
<message><![CDATA[&rsquo;]]></message>
From the w3schools site:
Characters like "<" and "&" are illegal in XML elements.
"<" will generate an error because the parser interprets it as the start of a new element.
"&" will generate an error because the parser interprets it as the start of an character entity.
Some text, like JavaScript code, contains a lot of "<" or "&" characters. To avoid errors script code can be defined as CDATA.
Everything inside a CDATA section is ignored by the parser.
The problem is that your htmlentities function is doing what it should: generating HTML entities from characters. You're then inserting these into an XML document which doesn't have the HTML entities defined (things like &rsquo; are HTML-specific).
The easiest way to handle this is to keep all input raw (i.e. don't run it through htmlentities), then generate your XML using PHP's XML functions.
This will ensure that all text is properly encoded, and your XML is well-formed.
Example:
$user_input = "...<>&'";
$doc = new DOMDocument('1.0','utf-8');
$element = $doc->createElement("content");
$element->appendChild($doc->createTextNode($user_input));
$doc->appendChild($element);
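A slightly fuller version of the same idea, with the serialization step and a round-trip check added. The sample input is an assumption made up for this sketch:

```php
<?php
// Assumed raw user input with characters that are awkward in XML.
$user_input = "It\u{2019}s <b>5 > 3</b> & that's \"fine\"";

$doc = new DOMDocument('1.0', 'utf-8');
$element = $doc->createElement('content');
// createTextNode() escapes <, > and & when the document is serialized.
$element->appendChild($doc->createTextNode($user_input));
$doc->appendChild($element);

$xml = $doc->saveXML();
echo $xml;

// Round trip: parse the generated XML and read the text back.
$check = new DOMDocument();
$check->loadXML($xml);
$roundTrip = $check->documentElement->textContent;
```

If the parsed text equals the original input, the document is well-formed and no information was lost in escaping.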
I had a similar problem: the data I needed to add to the XML was already being returned by my code encoded with htmlentities() (it is not stored like this in the database).
I used:
$doc = new DOMDocument('1.0','utf-8');
$element = $doc->createElement("content");
$element->appendChild($doc->createElement('string', htmlspecialchars(html_entity_decode($string, ENT_QUOTES, 'UTF-8'), ENT_XML1, 'UTF-8')));
$doc->appendChild($element);
Or, if it was not already encoded with htmlentities(),
just the below should work:
$doc = new DOMDocument('1.0','utf-8');
$element = $doc->createElement("content");
$element->appendChild($doc->createElement('string', htmlspecialchars($string, ENT_XML1, 'UTF-8')));
$doc->appendChild($element);
Basically, using htmlspecialchars with ENT_XML1 should turn user-supplied data into XML-safe data (and works fine for me):
htmlspecialchars($string, ENT_XML1, 'UTF-8');
Using htmlspecialchars() will solve your problem. See the post below:
PHP - Is htmlentities() sufficient for creating xml-safe values?
This worked for me. Anyone facing the same issue can try this:
htmlentities($string, ENT_XML1)
With special characters conversion.
htmlspecialchars(htmlentities($string, ENT_XML1))
htmlspecialchars(trim($_POST['content']), ENT_XML1, 'UTF-8');
Should do it.

Using htmlentities with BBCode

What I am trying to achieve is a sound method for using BBCode but where all other data is parsed through htmlentities(). I think that this should be possible, I was thinking along the lines of exploding around [] symbols, but I thought there may be a better way.
Any ideas?
htmlentities() does not parse. Rather, it encodes data so it can be safely displayed in an HTML document.
Your code will look like this:
1. Parse BB-code (by some mechanism); don't do escaping yet, just parse the input text into tags!
2. The output of your parser step will be some tree structure, consisting of nodes that represent block tags and nodes that represent plain text (the text between the tags).
3. Render the tree to your output format (HTML). At this point, you escape plain text in your data structure using htmlentities.
Your rendering function will be recursive. Some pseudo-functions that specify the relationship:
render( x : plain text ) = htmlentities(x)
render( x : bold tag ) = "<b>" . render( get_contents_of ( x )) . "</b>"
render( x : quote tag ) = "<blockquote>" .
                          render( get_contents_of( x )) .
                          "</blockquote>"
...
render( x : anything else) = "<b>Invalid tag!</b>"
So you see, the htmlentities only comes into play when you're rendering your output to HTML, so the browser does not get confused if your plain-text is supposed to contain special characters such as < and >. If you were rendering to plain text, you wouldn't use the function call at all, for example.
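The pseudo-functions above might translate into PHP roughly as follows. The array-based node shape and the renderAll() helper are assumptions made for this sketch; a real BBCode parser would produce its own tree format:

```php
<?php
// Hypothetical node shapes (not from the original answer):
//   ['type' => 'text', 'value' => string]
//   ['type' => 'b' or 'quote', 'children' => array of nodes]
function render(array $node): string
{
    switch ($node['type']) {
        case 'text':
            // Escaping happens only here, at the plain-text leaves.
            return htmlentities($node['value'], ENT_QUOTES, 'UTF-8');
        case 'b':
            return '<b>' . renderAll($node['children']) . '</b>';
        case 'quote':
            return '<blockquote>' . renderAll($node['children']) . '</blockquote>';
        default:
            return '<b>Invalid tag!</b>';
    }
}

// Render a list of sibling nodes and concatenate the results.
function renderAll(array $nodes): string
{
    return implode('', array_map('render', $nodes));
}

// Tree for the input: [quote]1 < 2 and [b]<script>[/b][/quote]
$tree = [
    'type' => 'quote',
    'children' => [
        ['type' => 'text', 'value' => '1 < 2 and '],
        ['type' => 'b', 'children' => [['type' => 'text', 'value' => '<script>']]],
    ],
];

$html = render($tree);
echo $html, "\n";
```

Because the tags come from the renderer and never from user text, anything the user typed between the BBCode tags ends up escaped.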
