PHP: HTML markup breaking when taking first 200 chars - php

In my PHP web application I have a data-entry form where users enter data using a rich text editor (I'm using FCKEditor), and I save the markup from the editor to a DB table. On another page I have to display the first 200 chars of the content (with a "View more" link to view the entire thing). When I take the first 200 chars, the HTML markup breaks because I may miss the closing tags of some HTML tags that were already opened. How can I get rid of this? I know I can use strip_tags to remove all HTML markup, but I want to keep the markup as it is. Is there anything I can do to solve this?

Run it through HTMLTidy, as that might help. For example, when a link tag (a) is opened but not closed, Tidy can close it and stop the link from "bleeding" into the next element. You will still have issues if your script cuts the string in the middle of a tag, a la "<di". It's not a foolproof solution and I wouldn't rely on it.
The best practice, IMHO, is to treat the "short" version of the text separately: just let the user enter it in a separate text editor.
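If you do want to cut the stored markup yourself, one option is to count only the visible characters and remember which tags are open, so they can be closed at the cut. Here is a rough sketch of that idea. truncate_html() is a made-up name, not a built-in, and it assumes the full stored markup is well-formed UTF-8 as produced by the editor. It is not a real HTML parser, so treat it as a starting point only.

```php
<?php
// Hypothetical helper (not from the original answer): keep at most $limit
// visible characters, never cut inside a tag, and close tags left open.
function truncate_html(string $html, int $limit): string
{
    // Void elements never get a closing tag, so they are not tracked.
    $void = ['br', 'hr', 'img', 'input', 'meta', 'link'];
    $open = [];   // stack of tag names that are currently open
    $out = '';
    $count = 0;   // visible characters emitted so far

    // Split into complete tags and text runs, keeping the tags.
    $parts = preg_split('/(<[^>]+>)/', $html, -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    foreach ($parts as $part) {
        if ($count >= $limit) {
            break;
        }
        if ($part[0] === '<' && substr($part, -1) === '>') {
            if (preg_match('#^</([a-z][a-z0-9]*)#i', $part, $m)) {
                // Closing tag: emit only if it matches the innermost open tag.
                if (end($open) === strtolower($m[1])) {
                    array_pop($open);
                    $out .= $part;
                }
            } elseif (preg_match('#^<([a-z][a-z0-9]*)#i', $part, $m)) {
                $name = strtolower($m[1]);
                if (!in_array($name, $void, true) && substr($part, -2) !== '/>') {
                    $open[] = $name;
                }
                $out .= $part;
            }
            // Comments and other non-element markup are dropped.
            continue;
        }
        // Text run: count an entity such as &amp; as one character.
        preg_match_all('/&#?[a-z0-9]+;|./isu', $part, $m);
        foreach ($m[0] as $char) {
            if ($count >= $limit) {
                break 2;
            }
            $out .= $char;
            $count++;
        }
    }

    // Close whatever is still open, innermost first.
    while ($open) {
        $out .= '</' . array_pop($open) . '>';
    }
    return $out;
}
```

You would call it on the full content from the DB, e.g. truncate_html($content, 200), rather than on a string that was already cut with substr().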

Related

How to remove html formatting effect in CSS styles

I am working on a simple php-MySql website and presenting the data for the following fields for each entry in the database (through a loop):
Title
Organisation
DetailedInfo
The 'DetailedInfo' field in the database can hold up to 5000 characters. While displaying on the webpage I am only using the first 250 characters.
The problem is as follows. If an entry has a formatting tag (italic/bold) starting at, say, character 240, and the tag is not closed by the 250th character, the trouble starts: for all subsequent entries the Title, Organisation and DetailedInfo are displayed inside that tag (so all the subsequent text is either italic or bold).
I am using CSS style for Title, Organisation and DetailedInfo but it seems that the CSS is not able to get rid of the formatting tag from the data.
Any help will be appreciated.
Cheers,
Tim
If you're only displaying a small portion of the DetailedInfo field, I'd guess formatting it isn't that important. Use strip_tags() to get rid of the formatting tags before you display it.
CSS cannot fix broken HTML. You'll need to strip it back to plain text and re-code (or just leave it out).
I wouldn’t fix that with CSS (and I don’t think it’s possible). You’re outputting invalid HTML, which is going to cause problems, especially if anyone ever looks at the page in IE 8 or earlier.
It could also be worse than an unclosed tag. What if the excerpt ends with </i?
I’d either implement some crazy logic to close any unclosed HTML tags in the 250-character excerpt, or strip all HTML tags from the excerpt. I’m guessing the latter would be easier.
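For the "strip all HTML tags" route, something along these lines might do. plain_excerpt() is a hypothetical name, and the sketch assumes the field holds UTF-8 text. It strips the tags first, cuts afterwards, and re-escapes the result for output, so a half-open tag never reaches the page.

```php
<?php
// Hypothetical helper: turn stored HTML into a short plain-text excerpt.
function plain_excerpt(string $html, int $limit = 250): string
{
    // Remove the tags, then turn entities back into characters so that
    // "&amp;" counts as one character when cutting.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');

    // Collapse runs of whitespace left behind by removed block tags.
    $text = trim(preg_replace('/\s+/', ' ', $text));

    // Cut on character boundaries (the /u flag), not on bytes.
    if (preg_match('/^(.{' . $limit . '}).+/su', $text, $m)) {
        $text = rtrim($m[1]) . '...';
    }

    // Escape again because the result is printed into an HTML page.
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}
```

In the display loop this would be used as echo plain_excerpt($row['DetailedInfo']); where $row is whatever your query returns.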

htmlentities with ajax editable textarea

Here is an example of the workflow a user can have on my website :
Create a task, with content: I use htmlentities to encode the content and store it in my database (yes, I've decided to store the encoded content);
The user comes back later and clicks to view the task. The thing is, the preview of the content is done in a disabled textarea.
I tried to use html_entity_decode when printing the content in the textarea (XSS problem if the user entered bad things);
I just print the encoded text and everything is fine.
The user clicks on EDIT, this will make the textarea editable
The user clicks on SAVE.
Here is my main issue, as I didn't decode the text before I printed it, it is still encoded and when the user saves it, it is re-encoded. So, the previous content is double encoded.
So, if the first time the user enters something like:
blablabla </textarea/> yeah!
Then, it's encoded and the result is:
blablabla &lt;/textarea/&gt; yeah!
Then, when I display it, it displays as the user previously entered it but if he saves it, the result is:
blablabla &amp;lt;/textarea/&amp;gt; yeah!
And, so, if he displays it again, it is not well displayed (and it also takes more and more space in my database as the user keeps editing his task).
Well, I am sure this is a problem a lot of people have experienced but I can't find any good solution.
By the way, I am using htmlentities with ENT_QUOTES.
ahah, here is my main issue, as I didn't decode the text before I
printed it, it is still encoded and when the user save it, it is
reencoded. So, the previous content is double-encoded.
This is actually correct, you shouldn't decode the text before you print it. In fact, it must be HTML encoded when output in the HTML page. It is not still encoded when the user submits it because the browser will have already interpreted the HTML entities.
Unless... you are creating a TEXT_NODE in the DOM and assigning the encoded data to this (in the textarea)? In which case the browser will not interpret the HTML entities and you will end up resubmitting already encoded data. Assign to the innerHTML property instead, if this is the case. However, the HTML entities would be clearly visible in the form to the end user (on the first edit), before the data is submitted, which does not appear to be the case?
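The double encoding from the question is easy to reproduce in isolation. This is only an illustration with made-up variable names, not the asker's actual code. It also shows the fourth argument of htmlentities() (double_encode), which leaves existing entities alone, and html_entity_decode(), which reverses one level of encoding.

```php
<?php
// What the user typed into the textarea.
$submitted = 'blablabla </textarea/> yeah!';

// Encoding once gives text that is safe to print into HTML.
$once = htmlentities($submitted, ENT_QUOTES, 'UTF-8');

// Encoding the already encoded text again escapes the ampersands too.
$twice = htmlentities($once, ENT_QUOTES, 'UTF-8');

// With double_encode set to false, existing entities are not re-encoded.
$guarded = htmlentities($once, ENT_QUOTES, 'UTF-8', false);

// html_entity_decode() undoes exactly one level of encoding.
$decoded = html_entity_decode($once, ENT_QUOTES, 'UTF-8');

echo $once, "\n", $twice, "\n";
```

If the stored value ever looks like $twice, the text was encoded a second time somewhere between the form and the database.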
Hum,
I fixed my problem.
I hadn't noticed, but for the first entry I was using htmlentities(), and when editing I was using the Zend escape() function.
Using only htmlentities() fixed the problem. I don't know how the escape() function of ZF works, but I won't use it in the future :p
Thank you for the answers :)
Anyway, I am wondering: in which situations should the html_entity_decode() function be used? Since I call htmlentities() when I get the form and print the result as is, I never use html_entity_decode(). Is that normal? What is this function used for?
Thanks again!

Displaying user HTML and HTML spill-over

I am displaying user-submitted HTML as part of a page (the user's resume). I noticed that if the user's HTML gets cut off in the DB due to a character limit, then when the HTML is displayed, any tags that didn't get closed spill over to the rest of the page.
Am I able to contain the HTML within a div? Another solution? I'm using PHP, by the way.
Thanks.
If you have one or more unclosed tags in your HTML, your DOM becomes invalid and the consequences are unpredictable. Make sure that the submitted HTML doesn't get cut off by using a TEXT column in your DB table.

Text area to modify HTML - rendering line breaks

I'm loading a portion of an HTML page into a text area so that I can make small changes. All the HTML tags are shown along with the text, which is what I need to happen; I don't want a WYSIWYG editor or anything fancy.
The one thing I want is for line breaks to be shown in the text area in addition to the <p></p> and <h1></h1> tags otherwise it's a giant wall of text and it's really hard to proof read. I don't want the line breaks to be doubled after I save the modifications though as the next step will be to convert everything in the text area to a PDF file.
ETA: nl2br() doesn't work because there are no line breaks to begin with. The content is assembled from paragraphs in a MySQL database using a loop. The tags are inserted during the loop too.
What's the best way to do this? I'm using PHP.
Oh, PS - I'm aware of the security concerns of not stripping the tags. This page is for the admin (me) only and will be password protected.
Maybe you can filter those first, before showing to textarea? Something like this, maybe (add newline after the closing tag):
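// Append a newline after each closing tag instead of replacing the tag itself.
$rawhtml = str_replace(array("</p>", "</h1>"), array("</p>\n", "</h1>\n"), $rawhtml);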
Before sending the data to your textarea, pass your variable through nl2br() and then send the result to the textarea.
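Since the question also asks that the line breaks not double up after saving, a possible variation on the str_replace idea is to add the newlines when loading the textarea and remove them again when saving. Both function names below are invented for this sketch, and only </p> and </h1> are handled, as in the question.

```php
<?php
// Hypothetical helper: put exactly one newline after each closing </p>
// or </h1>. Newlines already there are replaced, so repeated edits do
// not stack up extra blank lines.
function add_breaks_for_editing(string $html): string
{
    return preg_replace('#(</(?:p|h1)>)\R*#i', '$1' . "\n", $html);
}

// Hypothetical helper: remove those newlines again before the markup
// is saved or handed to the PDF step.
function remove_editing_breaks(string $html): string
{
    return preg_replace('#(</(?:p|h1)>)\R+#i', '$1', $html);
}
```

When printing into the textarea you would still escape the markup, e.g. echo htmlspecialchars(add_breaks_for_editing($rawhtml)); so the tags show up as text.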

HTML displaying articles

Hey guys, I'm building a web app where users can log in, post and read articles, comment, and so on.
I'm giving them a form to post an article where they provide its title, description and text.
Leaving validation and SQL injection aside (already done that), I need help with displaying the article stored in a MySQL database as TEXT.
I'm taking the article text from a textarea and displaying it in a p tag, but then it obviously skips the newline characters entered by the users, while the pre tag makes it ugly by giving a wide scrollable display. I want to know which tag is appropriate for this purpose. Or is taking an article through a textarea even correct?
I'm a learner and am building such a web app with articles and comments sections for the first time, so any suggestions are most welcome. Thank you in advance.
My recommendation would be of two choices:
1. Use Plain Text:
If you don't want users to put any HTML in the contents, show them a simple HTML textarea input. When the user enters a new line (Enter key), it is stored as \n in your database. When you want to print the article, just use nl2br($article_contents); and it will turn the newlines (\n) into HTML line breaks.
2. Rich Text:
If you want users to put HTML in the article, it is easiest to use a text editor like TinyMCE. TinyMCE makes it easy for your users to do simple HTML formatting such as headings, bold, italic, paragraph alignment, colors and images. Then, on the PHP side, use the strip_tags function to allow only certain tags, so the user cannot insert tags that carry malicious code (XSS) into the HTML contents. For example:
strip_tags($article_contents, "<u><b><i><font><span><p>");
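To see what that call actually keeps, here is a small test with made-up input. Note that strip_tags() only removes tags that are not in the allowed list. It leaves the text inside removed tags in place, and it keeps every attribute on the allowed tags.

```php
<?php
// Made-up input: an allowed <p> with an event attribute, plus a <script>.
$article_contents = '<p onclick="x">Hi <script>bad</script><b>there</b></p>';

$allowed = '<u><b><i><font><span><p>';
$clean = strip_tags($article_contents, $allowed);

// The <script> tags are gone, but their text and the onclick attribute remain.
echo $clean, "\n";
```

Because attributes such as onclick survive, the allowed-tags list on its own may not be enough against XSS. A dedicated sanitizer library (HTML Purifier is one that exists for PHP) is probably the safer choice for rich text.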
Proposed Answer:
Use <span></span>
Tags like <p></p><div></div> take up as much space as they can, while <span></span> takes up as little as it can to hold whatever is inside it, so it might be more suitable for you.
Let me know if that worked for you.
In PHP you can use the nl2br function, which inserts a BR HTML tag before each newline character. http://php.net/nl2br
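A minimal sketch of that, assuming the article is stored as plain text. render_article() is just an illustrative name. The text is escaped first so that anything the user typed shows up literally, and the line breaks are added afterwards.

```php
<?php
// Hypothetical helper: escape plain text, then keep its line breaks.
function render_article(string $text): string
{
    return nl2br(htmlspecialchars($text, ENT_QUOTES, 'UTF-8'));
}

echo '<p>' . render_article("Line 1\nLine <2>") . '</p>';
```

The order matters: calling htmlspecialchars() after nl2br() would escape the br tags as well.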
