Converting HTML to ENML - php

I am trying to write an extension for Gmail that lets you save mail as a note in Evernote, but Evernote's ENML is pretty strict: for example, it doesn't allow external styles.
So what I am looking to do is something like this:
- convert external styles to inline styles,
- validate/balance the tags,
- purify the tags that Evernote considers offensive.
So before I jump into writing a parser for the above, does anyone know of a PHP library that already does the heavy lifting?
If not, what is the best way to approach these requirements?

If the only interesting problem is converting external styles to inline styles, you can use https://github.com/tijsverkoyen/CssToInlineStyles. It is also available as a Composer package on Packagist for easy deployment.
I used it like this:
<?php
// ...
use \TijsVerkoyen\CssToInlineStyles\CssToInlineStyles;
// ...
$content = file_get_contents('./content.html');
// create instance
$cssToInlineStyles = new CssToInlineStyles();
$css = file_get_contents('./styles.css');
$cssToInlineStyles->setHTML($content);
$cssToInlineStyles->setCSS($css);
$mail_content = $cssToInlineStyles->convert();
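For the other two steps listed in the question (balancing tags and stripping markup that Evernote rejects), PHP's bundled DOM extension can do much of the work. The following is only a minimal sketch: sanitizeForEnml is a hypothetical helper, and the tag and attribute lists are a small illustrative subset chosen as an assumption, not Evernote's full ENML rules, so check them against the ENML DTD before relying on it.

```php
<?php
// Hypothetical helper: clean an HTML fragment toward ENML-like rules.
// ASSUMPTION: the lists below are an illustrative subset, not the complete ENML DTD.
function sanitizeForEnml(string $html): string
{
    $dropTags  = ['script', 'style', 'iframe', 'form', 'object', 'embed'];
    $dropAttrs = ['id', 'class', 'tabindex', 'accesskey'];

    $doc = new DOMDocument();
    // libxml repairs unbalanced tags while parsing; silence its warnings.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The first div in document order is the wrapper added above.
    $root  = $doc->getElementsByTagName('div')->item(0);
    $xpath = new DOMXPath($doc);

    // Remove disallowed elements together with their content.
    foreach ($dropTags as $tag) {
        foreach (iterator_to_array($xpath->query('.//' . $tag, $root), false) as $node) {
            $node->parentNode->removeChild($node);
        }
    }

    // Remove disallowed attributes and every on* event handler.
    foreach ($xpath->query('.//*', $root) as $element) {
        foreach (iterator_to_array($element->attributes, false) as $attr) {
            $name = strtolower($attr->nodeName);
            if (in_array($name, $dropAttrs, true) || strpos($name, 'on') === 0) {
                $element->removeAttributeNode($attr);
            }
        }
    }

    // Serialize the wrapper's children as XML so the result is well-formed.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveXML($child);
    }
    return $out;
}
```

Running the inlining step first and this cleanup second keeps the style attributes that CssToInlineStyles generates while dropping the class and id attributes they were derived from.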

Related

How to work around PHP advanced html dom's conversion of entities?

How can I work around advanced_html_dom.php str_get_html's conversion of HTML entities, short of applying htmlentities() to every element's content?
The project page (http://archive.is/YWKYp#selection-971.0-979.95) claims:
"The goal of this project is to be a DOM-based drop-in replacement for PHP's simple html dom library. ... If you use file/str_get_html then you don't need to change anything."
Despite that, with simple_html_dom I find:
include 'simple_html_dom.php';
$set = str_get_html('<html><title>&nbsp;</title></html>');
echo ($set->find('title',0)->innertext)."\n"; // Expected: &nbsp; Observed: &nbsp;
changing to advanced HTML DOM gives an incompatible result:
include 'advanced_html_dom.php';
$set = str_get_html('<html><title>&nbsp;</title></html>');
echo ($set->find('title',0)->innertext)."\n"; // Expected: &nbsp; Observed: -á
This issue is not confined to spaces.
$set = str_get_html('<html><body>&bull;</body></html>');
echo $set->find('body',0)->innertext; // Expected: &bull; Observed: ÔÇó
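The observed characters look like the UTF-8 bytes of the decoded entity printed in a console that uses a different code page. A minimal sketch of that effect with plain DOMDocument follows; it assumes, based only on the "DOM-based" description quoted above, that advanced_html_dom decodes entities the same way PHP's DOM extension does.

```php
<?php
// ASSUMPTION: this uses plain DOMDocument to illustrate entity decoding;
// it does not call advanced_html_dom itself.
$doc = new DOMDocument();
$doc->loadHTML('<html><body>&bull;</body></html>');

// The DOM stores text as UTF-8, so the entity arrives already decoded.
$text = $doc->getElementsByTagName('body')->item(0)->textContent;

// Show the raw bytes of the decoded bullet character.
echo bin2hex($text), "\n";

// Re-encoding turns the character back into a named entity.
$encoded = htmlentities($text, ENT_QUOTES, 'UTF-8');
echo $encoded, "\n";
```

If that is what is happening here, the data itself is intact and only the display encoding (or the expectation of an undecoded entity) differs.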
You can check out my own package, PHPHTMLQuery. It lets you use PHP to select HTML elements with most CSS3 selectors.
The package works with external links and with internal HTML files too.
Installation
Open your terminal and browse into your project root folder and run
composer require "abdelilahlbardi/phphtmlquery":"#dev"
Documentation
For more information, please visit the package link: PHPHTMLQuery

Compressing CSS with no spaces via PHP?

I'm wondering how to produce a CSS file like this one from css-tricks.com:
http://cdn.css-tricks.com/wp-content/themes/CSS-Tricks-9/style.css?v=9.5
I'm not sure whether he is using PHP to accomplish this. I've been reading countless articles with no luck.
Also, is the version number after the .css something automated? I've been seeing it around and wondered how to achieve a clean CSS file.
Any help is appreciated! Thanks.
It's simple enough to use an editor with Search/Replace and strip out all the unnecessary spaces. For instance, when I write CSS I only use spaces to separate keywords - I use newlines and tabs to format it legibly. So I could just replace all tabs and newlines with the empty string and the result is "minified" CSS like the one above.
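The same search/replace can be scripted. This is a minimal sketch, assuming CSS written in the style described above (spaces only between keywords, tabs and newlines only for layout); stripCssWhitespace is a hypothetical name, and the function is not a full minifier.

```php
<?php
// Hypothetical helper: remove the characters used purely for layout.
// ASSUMPTION: tabs and newlines never carry meaning in the input CSS.
function stripCssWhitespace(string $css): string
{
    return str_replace(["\r", "\n", "\t"], '', $css);
}

$minified = stripCssWhitespace("a {\n\tcolor: red;\n}\n");
echo $minified, "\n";
```

CSS that relies on newlines inside strings or uses spaces for indentation would need a real minifier instead.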
The version number is a fairly common cache trick. It doesn't affect anything server-side, but the browser sees it as a new file, and caches it as such. This makes it easy to purge the cache of all users when an update is made. Personally, though, I use a PHP function to append "?t=".filemtime($file) (in other words, the timestamp that the file was modified) automatically, which saves me the trouble of manually updating version numbers.
Here is the exact code I use to automatically append modification time to JS and CSS files:
<?php
ob_start(function($data) {
    chdir($_SERVER['DOCUMENT_ROOT']);
    return preg_replace_callback(
        "(/(?:js|css)/.*?\.(?:js|css))",
        // all the relevant files are in /js and /css folders of the root
        function($m) {
            if( file_exists(substr($m[0],1)))
                return $m[0]."?t=".filemtime(substr($m[0],1));
            else return $m[0];
        },
        $data
    );
});
?>
I would avoid doing it manually because you may corrupt your CSS.
There are good tools available that solve such problems for you without any tricks.
An excellent solution is Assetic, an asset manager that lets you filter (minify, compress) assets using various tools (yuicompressor, google closure, etc.).
It is currently bundled by default with Symfony2 but can be used standalone in any PHP project.
I've successfully implemented it in a Zend Framework project.

Is there a way to stop HTMLPurifier/CSStidy from forcing input CSS into all lowercase?

Using PHP/Codeigniter/HTMLPurifier/CSStidy like so:
require_once 'extra/htmlpurifier-4_4_0/library/HTMLPurifier.auto.php';
require_once 'extra/csstidy-1_3/class.csstidy.php';
$input_css = $this->input->post('input_css');
$config = HTMLPurifier_Config::createDefault();
$config->set('Filter.ExtractStyleBlocks', TRUE);
// Create a new purifier instance
$purifier = new HTMLPurifier($config);
// Wrap our CSS in style tags and pass to purifier.
// we're not actually interested in the html response though
$html = $purifier->purify('<style>'.$input_css.'</style>');
// The "style" blocks are stored separately
$output_css = $purifier->context->get('StyleBlocks');
// Get the first style block
$unClean_css = $input_css;
$clean_css = $output_css[0];
...is there a way to turn OFF whatever it is inside HTMLPurifier or CSStidy that is causing $clean_css to be forced into lowercase? I need the output to be left alone, in terms of case. I would just not use CSStidy but it is working great for security and compression, etc.. just is also hosing some (e.g. background-image) file paths on account of case-sensitivity. I am not able to fix the issue from the ground up (i.e. I cannot just force all lowercase)... so I ask the question the way I did. A config setting perhaps? (I hope) ...I have been looking but do not see one to do the trick yet.
Update: as nickb points out, there is a config option in CSStidy that might control this, so then the question becomes: How to set that option to FALSE in the context of HTMLpurifier?
Or is that lowercase_s CSSTidy configuration option even the culprit? I don't know because I am so far unable to stop the lowercasing one way or another. I want for example that input like so:
.zzzzzz {
background:transparent url("images/tmpl_Btn1_CSSdemo.jpg") no-repeat right top;
}
...would NOT become (as it is doing now) :
.zzzzzz {
background:transparent url("images/tmpl_btn1_cssdemo.jpg") no-repeat right top;
}
Actually, this is an HTML Purifier bug, not a CSS Tidy bug. I'm working on a fix.
Please apply this patch: http://repo.or.cz/w/htmlpurifier.git/commit/f38fca32a9bad0952df221f8664ee2ab13978504 (only the patches to file in library/ are really necessary)
You need to set the lowercase_s CSSTidy configuration option to false. Otherwise, it will lowercase all of your selectors for inclusion in XHTML.
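To pass that CSSTidy option through HTML Purifier, one possibility is to hand the purifier a pre-configured csstidy instance through its Filter.ExtractStyleBlocks.TidyImpl directive. A minimal sketch, reusing the include paths from the question and a made-up sample value for $input_css; given the HTML Purifier bug mentioned above, treat it as an experiment rather than a guaranteed fix for the lowercased URLs.

```php
<?php
require_once 'extra/htmlpurifier-4_4_0/library/HTMLPurifier.auto.php';
require_once 'extra/csstidy-1_3/class.csstidy.php';

// Sample input for illustration only; in the question this comes from the POST data.
$input_css = '.Zzzzzz { background: url("images/Btn1.jpg"); }';

// Configure CSSTidy ourselves instead of letting HTML Purifier create a default instance.
$tidy = new csstidy();
$tidy->set_cfg('lowercase_s', false); // ask CSSTidy not to lowercase selectors

$config = HTMLPurifier_Config::createDefault();
$config->set('Filter.ExtractStyleBlocks', true);
$config->set('Filter.ExtractStyleBlocks.TidyImpl', $tidy);

$purifier = new HTMLPurifier($config);
$purifier->purify('<style>' . $input_css . '</style>');

// The extracted style blocks are stored in the purifier's context.
$style_blocks = $purifier->context->get('StyleBlocks');
$clean_css = $style_blocks[0];
```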

Creating a simple Markdown class

I'm currently working on a system that has a comment system integrated. The system runs on Codeigniter, so I'm looking to create a Markdown library, but with really minimal features.
The features I'm looking to have are:
Autolinking
Bold *bold*
Italic _italic_
And that's practically it. The post data will be run through Codeigniter's XSS class before it goes to the Markdown class.
So my question is: what's the best way to do this? Should I use an existing library and disable certain features, or should I build this from scratch? If so, how should I build the class, and what should I take into account?
I was in a similar situation recently, where I wanted to support some kind of markup (BB, Markdown, etc). It turns out nothing has been done with BBCode for about 100 years, and it's dead easy to write a regex parser for it (for well-formed markup at least), so I wrote a really bare-bones function to do just that.
My version also includes images, codes, and color support as well as nested tags ([b][i]bold and italic[/i][/b]).
function parseBBCode($string){
    // One pattern per supported tag; the order matches $replace below.
    $search = array(
        '/\[b\](.*?)\[\/b\]/',
        '/\[i\](.*?)\[\/i\]/',
        '/\[u\](.*?)\[\/u\]/',
        '/\[img\](.*?)\[\/img\]/',
        '/\[url\=(.*?)\](.*?)\[\/url\]/',
        '/\[code\](.*?)\[\/code\]/',
        '/\[color\=(.*?)\](.*?)\[\/color\]/'
    );
    $replace = array(
        '<strong>\\1</strong>',
        '<em>\\1</em>',
        '<u>\\1</u>',
        '<img src="\\1">',
        '<a href="\\1">\\2</a>', // [url=target]text[/url]
        '<code>\\1</code>',
        '<span style="color:\\1;">\\2</span>'
    );
    $new = preg_replace($search, $replace, $string);
    // Convert remaining line breaks to <br> tags.
    return nl2br($new);
}
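The same regex approach can be adapted to the three features listed in the question. A minimal sketch follows: miniMarkdown is a hypothetical name, and the function assumes it receives raw, not yet HTML-escaped text, which may differ from what Codeigniter's XSS class hands over. URLs are split out first so that underscores or asterisks inside a link are not treated as emphasis.

```php
<?php
// Hypothetical helper covering autolinks, *bold* and _italic_ only.
// ASSUMPTION: $text is raw user input that has not been HTML-escaped yet.
function miniMarkdown(string $text): string
{
    $text = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // Split on URLs; with one capture group, odd indexes hold the URLs.
    $parts = preg_split('~(https?://[^\s<]+)~i', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            // Autolink the URL and leave its characters untouched.
            $parts[$i] = '<a href="' . $part . '">' . $part . '</a>';
            continue;
        }
        // *bold*
        $part = preg_replace('/\*([^*\n]+)\*/', '<strong>$1</strong>', $part);
        // _italic_, but not underscores inside words such as snake_case_names
        $parts[$i] = preg_replace(
            '/(?<![A-Za-z0-9])_([^_\n]+)_(?![A-Za-z0-9])/',
            '<em>$1</em>',
            $part
        );
    }

    return nl2br(implode('', $parts));
}

echo miniMarkdown('*bold* and _italic_ at http://example.com/a_b_c'), "\n";
```

Keeping the rules this small makes the behavior easy to predict, at the cost of not handling nesting or escaped markers.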
You could start with the PHP Markdown class, or the one for CI.
If I may suggest it, you could also try MarkItUp as a front end.
For me, the easiest way to integrate Markdown is simply to:
put markdown.php from Michel Fortin into my Application/helpers/ folder,
rename it to markdown_helper.php,
load it with $this->load->helper('markdown');
...just in case someone, like me, stumbles upon this old thread again :)

How can I split a HAML template into different partials/includes in PHP?

I am a PHP dev trying to start using HAML, using this implementation:
http://phphaml.sourceforge.net/
HAML looks awesome, but I don't understand if/how it supports partials (or includes, as they are called in the PHP world).
I would like to have a master template HAML file that then goes and loads up a bunch of partials for all the little pieces. (Then I can reuse those pieces in other templates too.)
In PHP or Ruby this would be really easy, is there any way to do this with HAML? thanks!
dylan
You could create a global render_haml_partial method by analogy with phpHaml's existing display_haml method that might look something like:
function render_haml_partial($sFilename, $aVariables = array(), $sTmp = true, $bGPSSC = false)
{
    $sPath = realpath($sFilename);
    $haml = new HamlParser(dirname($sPath), $sTmp);
    $haml->append($GLOBALS);
    if ($bGPSSC)
    {
        $haml->append($_GET);
        $haml->append($_POST);
        $haml->append($_SESSION);
        $haml->append($_SERVER);
        $haml->append($_COOKIE);
    }
    $haml->append($aVariables);
    return $haml->fetch($sFilename);
}
This method could be placed in phpHaml's HamlParser.class.php file so it is available to all your templates.
The only difference between this and display_haml is that it invokes fetch instead of display at the end and returns the result so you can then insert it in-place into the invoking template.
You would then use it in your PHP/HAML templates as follows:
= render_haml_partial("path to partial")
This would then be very similar to the Rails/HAML syntax:
= render :partial => 'path to partial'
Note that using display_haml directly does not have quite the same effect since it renders the template directly to the output instead of returning the result to the caller. Thus you could do the following:
- display_haml("path to partial")
But this doesn't capture the result of the render.
I'm guessing that somebody who cares enough about phpHaml might add such a render_haml_partial or something similar eventually - I might suggest it to the author some time.
Quite an old question, but I've updated the source code of phpHaml to include this new functionality!
Check out the commit on GitHub:
https://github.com/endorama/phphaml/commit/8d95d5ebff06275db8b14438e566c6e41ec91b7f
