CKEditor output rules

Is there a way to add rules for the changes CKEditor makes to HTML?
For example, I would like it to:
output <br /> instead of <p>&nbsp;</p>
not wrap <style></style> in <p> tags
leave the white space alone and keep all the carriage returns as they were entered
Most of all I'm looking for some way to allow PHP to be added. The CMS I am using it on needs PHP on some pages. I write all the code, but the client is able to go in and edit the text. She doesn't know HTML, hence CKEditor, and she sometimes switches pages that contain PHP over to CKEditor, which completely garbles the code.
Is there any way to do any of this?

CKEditor offers a powerful and flexible output formatting system. It
gives developers full control over what the HTML code produced by the
editor will look like.
http://docs.cksource.com/CKEditor_3.x/Developers_Guide/Output_Formatting
Most of all I'm looking for some way to allow php to be added
PHP can be added: you just need to open the file in a plain textarea for editing and make sure it is handled properly when saving. If the content is held in a database you could run it through eval(), but that is not recommended.
http://php.net/manual/en/function.eval.php
If your client does not understand basic HTML, then opening the page up to even more syntax errors will only cause you greater headaches.
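As a rough sketch of the plain textarea approach, the page source can be escaped before it is placed in the form so the browser shows the PHP tags as literal text. The function name, field name and textarea size below are made up for illustration, and saving the submitted text back to the file is left out.

```php
<?php
// Hypothetical helper: wrap raw page source in a plain <textarea> for editing.
function render_source_textarea(string $source, string $fieldName = 'source'): string
{
    // Escape so "<?php", "</textarea>" and "&" in the source survive as literal text.
    $escaped = htmlspecialchars($source, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
    $name = htmlspecialchars($fieldName, ENT_QUOTES, 'UTF-8');

    // Browsers drop one newline directly after <textarea>, so emit one on purpose.
    return '<textarea name="' . $name . '" rows="30" cols="100">' . "\n"
        . $escaped . '</textarea>';
}

// Example: a page that mixes HTML and PHP.
$page = "<p>Copyright <?php echo date('Y'); ?></p>\n";
$form = render_source_textarea($page);
```

The browser decodes the entities again when the form is submitted, so the posted value should be the original source, PHP tags included.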

Related

Formatting HTML correctly from cURL requests

I am working on an applet that allows the user to input a URL to a news article or other webpage (in Japanese) and view the contents of that page within an iFrame in my page. The idea is that once the content is loaded into the page, the user can highlight words using their cursor, which stores the selected text in an array (for translating/adding to a personal dictionary of terms) and surrounds the text in a red box (div) according to a stylesheet defined on my domain. To do this, I use cURL to retrieve the HTML of the external page and dump it into the source of the iFrame.
However, I keep running into major formatting problems with the retrieved HTML. The big problem is preserving style sheets, and to fix this, I've used DOMDocument to add tags to the section of the retrieved HTML. This works for some pages/URLs, but there are still lots of style problems with the output HTML for many others. For example, div layers crash into each other, alignments are off, and backgrounds are missing. This is made a bit more problematic because I need to embed the output HTML in a new <div> to make the onClick JavaScript function for passing text selections in the embedded content work, which means the resulting source ends up looking like this:
<div onclick="parent.selectionFunction()" id ="studyContentn">

</div>
It seems like for the most part a lot of the formatting issues I keep running into are largely arbitrary. I've tried using php Tidy to clean output from HTML, but that also only works for some pages but not many others. I've got a slight suspicion it may have to do with CDATA declarations that get parsed oddly when working with DOMDocument, but I am not certain.
Is there a way I can guarantee that HTML output from cURL will be rendered correctly and faithfully in all instances? Or is there perhaps a better way of going about doing this? I've tried a bunch of different ways of approaching this issue, and each gets closer to a solution but brings its own new problems as well.
Thanks -- let me know if I can clarify anything.

If I understand correctly, you are trying to pull the HTML of a complete web page and display it under your domain, inside your own HTML. This is always going to be tricky: a lot of JavaScript will break, relative URLs will be wrong and, as you mentioned, so will styles. You're probably also changing the dimensions that the page is displayed in. These can all be worked around, but you're going to be fighting an uphill battle with each new site, or whenever a current site changes its design.
I'd probably take a different approach to the problem. You might want to write a browser plugin as the interface to the external web site instead. Then your applet can sit on top of the functional and tested (hopefully) site. Then you can focus on what you need to do for your applet rather than a never ending list of fiddly html issues.

I am trying to do a similar thing. It is very difficult to preserve the formatting, and the JS scripts in the web page complicate things. I finally gave up the idea of displaying the original format completely and use a workaround instead:
Select only the headers, links, lists and paragraphs you are interested in.
Add the domain path of your own site to links.
Wrap the headers, links and other items in your own class.
Display it.
In your case you also want to select text and store it, which is another topic. What I did was parse the HTML in two levels, after which the selection is easy. Keep in mind that IE and Firefox/Chrome need to be dealt with separately.
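The workaround above could be sketched roughly like this with PHP's DOM extension. The function name, the list of kept tags, the "study" class and the /view.php?url= prefix are all assumptions for illustration. It reads the second step as routing links back through a viewer page on your own site, and it does not resolve relative URLs against the source page, which a real version would need to do.

```php
<?php
// Hypothetical helper: keep only headers, links, lists and paragraphs from a fetched page.
function extract_readable(string $html, string $linkPrefix, string $cssClass): string
{
    if (trim($html) === '') {
        return '';
    }
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // fetched HTML is rarely valid
    $doc->loadHTML($html);                        // charset handling is left out here
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $wanted = 'self::h1 or self::h2 or self::h3 or self::h4 or self::h5 or self::h6'
            . ' or self::p or self::ul or self::ol or self::a';
    // Outermost matches only, so a <p> inside a kept <ul> is not emitted twice.
    $nodes = $xpath->query("//*[$wanted][not(ancestor::*[$wanted])]");

    $out = '';
    foreach ($nodes as $node) {
        // Wrap the kept items in our own class.
        $node->setAttribute('class', $cssClass);
        // Send every link in the kept item through our own viewer page.
        foreach ($xpath->query('descendant-or-self::a[@href]', $node) as $link) {
            $target = $link->getAttribute('href');
            $link->setAttribute('href', $linkPrefix . rawurlencode($target));
        }
        $out .= $doc->saveHTML($node) . "\n";
    }
    return $out;
}
```

Scripts, styles and layout containers are simply never selected, so they drop out without any extra cleanup step.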

Cleaning up HTML formatted content for display within Flash?

I want to display HTML formatted content from various sources inside a Flash Flex application. Flash supports HTML formatting in its text fields, however it is very limited compared to a web browser. Are there any scripts out there that will convert common HTML formatted text into a format that Flash can handle? My particular use cases are:
Displaying HTML formatted emails inside Flash
Displaying RTF files inside Flash (after running an RTF2HTML conversion on the server)
Displaying random HTML content copied and pasted from other sources into Flash
I'm open to code that runs either on the client or the server, but server is probably preferable.

The subset of html tags supported is quite poor and has not changed in forever:
<a>, <b>, <br>, <font>, <img>, <i>, <li>, <p>, <textformat>, <u>
This means that regardless of conversion quality, HTML cannot be rendered fully as intended; you could also be giving up a significant portion of CSS styling if you replace unsupported tags with more basic ones.
That being said, http://simplehtmldom.sourceforge.net/ (PHP) would work with some tweaks, and it's competent enough to cope with invalid markup as well (since you're processing content from various sources, I'd say this feature alone would save a lot of pain in the long run). Then replace
<h1>,...,<h6> => <b>
<strong> => <b>
<em> => <i>
and turn the rest into plain-text paragraphs. You'd be surprised at how readable it would still be. You could be a bit fancy too, like so:
<h1> => <b class="header1">
and add some css as appropriate (although flash css support is pretty limited too)
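A minimal sketch of that replacement step, using only built-in PHP functions rather than simplehtmldom (a swap made here so the example has no dependencies). The function and constant names are made up, and the regular expressions assume reasonably ordinary markup; a real parser would cope better with broken input.

```php
<?php
// Tags the list above gives as supported by Flash text fields.
const FLASH_TAGS = '<a><b><br><font><img><i><li><p><textformat><u>';

// Hypothetical helper: reduce arbitrary HTML to the subset Flash can display.
function html_to_flash(string $html): string
{
    // Headings and <strong> become <b>, <em> becomes <i>; attributes are dropped.
    $map = [
        '~<(/?)(?:h[1-6]|strong)\b[^>]*>~i' => '<${1}b>',
        '~<(/?)em\b[^>]*>~i'                => '<${1}i>',
    ];
    $html = preg_replace(array_keys($map), array_values($map), $html);

    // Remove script and style blocks together with their contents.
    $html = preg_replace('~<(script|style)\b[^>]*>.*?</\1>~is', '', $html);

    // Strip every remaining tag that is not in the supported list, keeping its text.
    return strip_tags($html, FLASH_TAGS);
}
```

Headings lose their line break when they turn into <b>, so appending a <br> after each converted heading may be worth adding.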
I've been saving this one for dessert. You'll either love it or hate it, but it would do the trick. Assuming your app is deployed in-browser (if not and I misread you, save me the embarrassment and stop reading right here), you could use an iframe to display your HTML, seriously.
JS<->AS communication is fairly straightforward, and you could have the iframe positioned over a predetermined area of your app, giving the illusion that it's part of it. Just remember to set the window mode (wmode) on the Flash object/embed correctly so it does not render on top of other page elements, then increase the iframe's z-index.
I would not be surprised if this is seen as an "ugly" approach, but it's beautiful on the inside - you'll end up with verbatim html and real css support. As for user interactions, you could even intercept link clicks etc. in the iframe and request an action from the movieclip.

You can use HTMLPurifier and specify a whitelist of tags that you want to support.

AS3 HTML Parser Library is not quite what I'm looking for, since it does not convert the HTML but instead renders it within Flash, meaning that it won't be editable. But it may be useful in cases where I only want to display text and not edit it.
Another option is to look at several samples of HTML that I'd like to be able to display, and then write regexes to convert them to the format Flash/TLF expects. But I feel that may be a huge endeavor, given the wide range of HTML out there.

How do I manipulate the output of a PHP script when it is already manipulated by another (plug-in) script?

I take it you are confused. So am I, but I'll try to formulate this as well as I can.
The content management system I use has a third-party plug-in installed that manipulates the output of the pages produced by the CMS. That's what it's supposed to do, and that's why I installed it, but there's one small part of those manipulations that I need to get rid of.
The plug-in looks for the </title> tag in the HTML output and then adds an unwanted tag right after it (by replacing </title> with </title><unwanted tag>).
You might think, why not just dig into the plug-in source and comment out that particular function? Well, that's the kicker: the plug-in is encoded with Zend Guard, so I can't make heads nor tails of its source, and unfortunately the developer is not willing to assist.
One other manipulation that I was able to get rid of by myself, was the extra (and again unwanted) HTTP header it set for every page.
The Zend-encoded PHP file is loaded by a regular PHP file, and I was able to unset the above mentioned header by adding the following bit of code to the very bottom of this 'load file', before ?>:
header_remove("X-Enhanced-By");
It works splendidly, but that was about as far as my experience and research could take me.
The last thing, then, that I need to undo, is the manipulation of the title tag. I temporarily worked around it by changing all my </title> tags to </title >, but that seems hardly a proper workaround.
If I can unset the header by placing header_remove("X-Enhanced-By"); right before ?>, does that mean I can also use that same area to undo the addition of the unwanted tag after </title>?
Let's assume the plug-in replaces </title> with </title><base href="http://www.example.com/" /> on every page that is put out by the CMS.
How would I go about undoing that?

It depends on the CMS framework. Basically, you should be able to create a hook/plugin which captures the output at a higher level than your plugin does, then use a regex to strip the tag out.
I gather that the installed plugin already does the same kind of thing, so it should be "doable".
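One way to sketch this without a CMS-specific hook is PHP's output buffering, with a callback that removes the tag from the example in the question. The function name is made up, and this only works if the buffer is started before the plugin produces or rewrites its output, which is an assumption that cannot be checked against the encoded file.

```php
<?php
// Hypothetical output filter: remove a <base> tag injected right after </title>.
function strip_injected_base(string $html): string
{
    // Keep </title>, drop the first <base ...> that directly follows it.
    $cleaned = preg_replace('~(</title>)\s*<base\b[^>]*>~i', '$1', $html, 1);

    // preg_replace returns null on failure; fall back to the untouched output.
    return $cleaned ?? $html;
}

// Start the buffer before the CMS (and the encoded plug-in) prints anything,
// for example near the top of the load file:
// ob_start('strip_injected_base');
```

Because output buffers nest, a buffer opened earlier wraps any buffer the plug-in opens later, so the callback sees the page after the plug-in has changed it.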

FCKeditor is leaving behind a lot of open tags - how to remove them

So I am using FCKeditor and the problem I am having is that when the user writes the document sometimes info is copied from Word, other times it is written directly in the editor and other times it could be done both ways. What this leaves me with in the DB is a lot of tags that are open and never closed. This is throwing my layout off dramatically and I am trying to find a solution.
I changed the config file to paste as plain text, which I assumed would stop Word formatting from transferring over, and it is still doing it.
So now I am trying to figure out whether it is possible to search for the unclosed opening tags and delete them before the info is sent to the DB. Or is there some FCKeditor function/config option I am missing that would help?
Any suggestions on how I should proceed?
Thanks
Levi

Just as a security precaution that will address both security-related problems (like <script> tags being inserted by users, for instance -- which you probably don't want) and presentation-related problems (such as not-closed tags), you could use a tool like HTMLPurifier on your server, on what you are receiving from the browser.
Of course, this will not solve the first problem, the fact that users can input whatever they want in FCKeditor; but it will ensure your HTML is both valid and secure.
Actually, even if FCKeditor weren't giving you invalid HTML, you could still use HTMLPurifier, just for security.
The idea is that you provide a list of:
allowed tags
allowed attributes to those tags
And, in return, HTMLPurifier gives you clean and valid HTML.

Edit: Sounds like you're running into a bug in the editor. You might try a different one, and/or use a server-side script that goes through and strips unmatched div tags.
HTML allows many tags to be left open. If the editor is leaving tags open that should be closed, you could use a whitelist or blacklist to search for them and strip them out server-side. Otherwise, you're pretty much stuck with accepting that HTML is not XML, that FCKeditor generates HTML, and that HTML won't validate as XML. If it's throwing your printed output off, try wrapping the FCKeditor output in a div.
Otherwise, please include concrete examples of input and output that is messing up your page layout.
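A server-side cleanup along these lines could be approximated with PHP's DOM extension, which parses the fragment leniently and writes it back out with every opened tag closed. The function name is made up, and this is only a sketch of the tag-balancing part: it does nothing about scripts or other unsafe markup.

```php
<?php
// Hypothetical helper: re-serialize a fragment so every opened tag is closed.
function balance_html(string $fragment): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings about bad markup
    $doc->loadHTML(
        '<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        . '</head><body>' . $fragment . '</body></html>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Write out only what ended up inside <body>, i.e. the repaired fragment.
    $body = $doc->getElementsByTagName('body')->item(0);
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Example: a div and a bold tag that were never closed.
$fixed = balance_html('<div><p><b>bold text</p>');
```

The parser decides where the missing closing tags go, so the result is well-formed but not necessarily what the author intended; it is worth running before the content is stored.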

RichTextEditor that is PHP/code friendly to snippets of php

I can't seem to find a JS RTE that will play nicely with snippets of PHP intertwined in the content. I want a mini CMS for the backend of a number of sites. The views have some snippets of PHP here and there.
Are there any RTEs that will leave the PHP alone, and even show it mixed in with the nice formatting?
TinyMCE kills the tags even when they are entered in HTML mode and you switch back and forth. FCKeditor seems to keep the code intact if it is pasted into source mode, but it isn't shown on the editing side, so if someone deletes an element with some PHP in it, bop, it's gone.
And none of the editors like creating nicely indented code, that would be a nice plus as well, but probably over the top to ask, heh.

The Javascript rich text editors make use of the browsers' in-built DesignMode or ContentEditable features in order to implement in-line HTML editing, and these do not support embedded PHP tags.
The solution would have to convert these to some other form, which is not going to get mushed by the browser's HTML editor, then convert them back to PHP tags upon submission.
It could be done. I don't know of any that do, however.
As for creating nicely indented code, it is a similar issue. The browsers munge it in their in-line HTML editors.
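The convert-and-restore idea could look roughly like this on the server side. The [[php:...]] marker format and both function names are invented for illustration, and whether a given editor leaves such a text marker alone is an assumption that would need testing. Markers appear as plain text in the editor, so they are visible but can still be deleted or damaged by the user.

```php
<?php
// Hypothetical: swap PHP blocks for text markers before the content goes to the editor.
function protect_php(string $source): string
{
    $result = preg_replace_callback(
        '~<\?(?:php\b|=).*?\?>~s',
        function (array $m): string {
            // Base64 keeps quotes and angle brackets away from the HTML editor.
            return '[[php:' . base64_encode($m[0]) . ']]';
        },
        $source
    );
    return $result ?? $source;
}

// Hypothetical: turn the markers back into PHP blocks when the form is submitted.
function restore_php(string $edited): string
{
    $result = preg_replace_callback(
        '~\[\[php:([A-Za-z0-9+/=]+)\]\]~',
        function (array $m): string {
            $code = base64_decode($m[1], true);
            // Leave a damaged marker in place rather than losing it silently.
            return $code === false ? $m[0] : $code;
        },
        $edited
    );
    return $result ?? $edited;
}
```

A PHP block that runs to the end of the file without a closing tag is not matched by this pattern and would need separate handling.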

I had a similar question a few weeks back:
Textarea that can do syntax highlighting on the fly?
This may be the right thing for you:
http://marijn.haverbeke.nl/codemirror/
They even have mixed PHP and HTML highlighting.
