Is there an embedded visual (PHP) parser for Microsoft Word? - php

I am writing documentation for code that I have written, mainly in PHP. I will also write about other languages, and I am wondering what the easiest way is to display code within a Word document. I could just paste in a screenshot of a Notepad++ document, but I would like an easy way to include code in Microsoft Word without having to take a new screenshot every time I want to make a change. I am looking for something that will let me edit the code within Word, though obviously it does not need to be functional. I would like some sort of syntax highlighting so that, as in Notepad++, the code is more readable.

You don't need to take a screenshot of Notepad++. You can export or copy the text as RTF, which preserves syntax highlighting and formatting.
I'm not on my PC at the moment, but the option is under either the TextFX menu or the Plugins menu.
It works very nicely.
Edit:
In that menu, choose 'Copy RTF to Clipboard', and you can then paste into Word.

Instead of using Word to document your code, you could check out a document markup language called LaTeX.
It allows for easy documentation of code (and math) and is therefore a really good tool for creating scientific reports.
http://www.latex-project.org/
Here is a YouTube video that explains the basics of how it works:
http://www.youtube.com/watch?v=SoDv0qhyysQ

PHP str_replace into microsoft word template

I have been given the task of building a contract letter generator in an HRMS.
I am already using CKEditor, but the result is very different from what I need, since CKEditor is not built for the same purpose as Microsoft Word or Google Docs.
So my idea is to create the template in Microsoft Word first and then use the PHP function str_replace to pass the data into the Word template.
My questions are:
1. Is that flow possible?
2. If it is, can you give me a sample?
Many Thanks,
Hendra
There are several libraries that can do at least part of what you are trying to do:
wrklst/docxmustache
openTBS – Tiny But Strong
PHPWord
docxtemplater pro (basic opensource / free version / MIT license available as of writing; image replacing is a commercial plugin)
docxpresso (commercial)
phpdocx (commercial)
The first 4 of these are at least partially open source, and investigating the code will help you understand the process, which is not trivial with Word. In addition, you can check out http://officeopenxml.com for the format details.
The main problem I see is proper HTML to OpenXML conversion, that is, converting the styling from CKEditor (which might be HTML) into the proper XML styling. The two work quite differently, and a direct translation is not trivial. Check out https://github.com/wrklst/docxmustache/blob/master/src/WrkLst/DocxMustache/HtmlConversion.php to see some basic HTML conversion on single runs of bold, italic and underlined text.
To my knowledge there is no maintained open source package that delivers proper HTML to OpenXML conversion. If you need this and cannot write it yourself, you will probably have to go for one of the paid solutions.
Good luck.
Docx is a zipped format that contains some xml. If you want to build a simple replace {tag} by value system, it can already become complicated, because the {tag} is internally separated into <w:t>{</w:t><w:t>tag</w:t><w:t>}</w:t>. If you want to embed loops to iterate over an array, it becomes a real hassle.
source : https://docxtemplater.readthedocs.io/en/latest/goals.html
You could use the library I created to address this problem: https://github.com/open-xml-templating/docxtemplater. It works with JS in the browser or with node.js.
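To make the run-splitting problem concrete, here is a minimal sketch of the plain search-and-replace flow the question asks about. It is written in Python for illustration only; the same steps map onto PHP's ZipArchive plus str_replace. The function names, the placeholder names and the tiny demo archive are assumptions made for this example, and the demo archive is a stand-in that Word would not open. A real template saved from Word would go through the same code path.

```python
import io
import zipfile
from xml.sax.saxutils import escape

DOC_PATH = "word/document.xml"  # the main body part inside a .docx archive


def fill_docx(template_bytes, values):
    """Copy the archive, replacing {tag} placeholders in the body XML."""
    out_buf = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(template_bytes)) as src, \
            zipfile.ZipFile(out_buf, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == DOC_PATH:
                xml = data.decode("utf-8")
                for tag, value in values.items():
                    # Escape the value so characters like & stay valid XML.
                    xml = xml.replace("{" + tag + "}", escape(value))
                data = xml.encode("utf-8")
            dst.writestr(item, data)
    return out_buf.getvalue()


def read_document_xml(docx_bytes):
    """Return the body XML of an archive as text."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        return zf.read(DOC_PATH).decode("utf-8")


def make_demo_archive():
    """Build a tiny stand-in archive (hypothetical, not a valid Word file)."""
    xml = (
        "<w:p><w:r><w:t>Dear {name},</w:t></w:r></w:p>"
        # Word often splits a placeholder across runs, like this one:
        "<w:p><w:r><w:t>{sal</w:t></w:r><w:r><w:t>ary}</w:t></w:r></w:p>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(DOC_PATH, xml)
    return buf.getvalue()


filled = fill_docx(make_demo_archive(), {"name": "Jane & Co", "salary": "1000"})
result = read_document_xml(filled)
print("Jane &amp; Co" in result)  # True: the intact placeholder was replaced
print("1000" in result)           # False: the split placeholder was missed
```

So the flow works as long as each placeholder sits inside a single text run, and it silently misses placeholders that Word has split. Handling the split case is what the templating libraries listed above are for.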

Where can I find text formatting scripts like the one used on Stack Overflow?

I want to give users options to format text: bold, italic, add an image, etc.
I want to offer a list of options like the one Stack Overflow shows when you ask a question.
Where can I find predefined scripts for that? I searched on Google, but I don't think I used the right search terms, and I couldn't find anything relevant.
Actually:
Markdown is used by SO
Prettify is the code colorizer that StackOverflow uses.
TinyMCE/ WMD Editor (used by SO)
Markdown for PHP is located at
http://michelf.com/projects/php-markdown/
Alternatives to Markdown can be found at
http://en.wikipedia.org/wiki/Lightweight_markup_language
Somewhat related is this blog entry about what StackOverflow was built with:
https://blog.stackoverflow.com/2008/09/what-was-stack-overflow-built-with/
You'll find many more answers about SO on https://meta.stackoverflow.com/
SO uses the WMD editor, which you can find here. It also uses MarkdownSharp to generate the HTML shown on the page. You'd need to replace this with a PHP version of Markdown; @Gordon's answer contains a link.
What you are after is something to 'parse' the text.
This will be a special function that looks at a string such as **my text** and notices the pair of asterisks before and after the string my text. It then converts the first pair into <b> and the second pair into </b>.
You can do it either in JavaScript or in server-side code, either before you store the text in the database or after you read it back.
There are lots of libraries that other people have mentioned. But if you want to do it yourself, that is the basic principle.
These are all editors. Use any of them and customize the input options.
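As a minimal sketch of that principle (in Python for illustration; PHP's preg_replace would follow the same pattern), the function below handles only the double-asterisk bold case. The name format_bold is hypothetical, and escaping the input first is an added assumption so that user-supplied HTML is not passed through.

```python
import html
import re

# A pair of asterisks, the shortest possible run of text, then another pair.
BOLD = re.compile(r"\*\*(.+?)\*\*")


def format_bold(text):
    """Escape user input, then turn **text** into <b>text</b>."""
    safe = html.escape(text)
    return BOLD.sub(r"<b>\1</b>", safe)


print(format_bold("This is **my text** here"))
# This is <b>my text</b> here
```

A full Markdown implementation deals with nesting, escaping and many more constructs, which is why an existing library is usually the better choice.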

Extract all text from a HTML page without losing context

For a translation program I am trying to get text that is about 95% accurate out of an HTML file, in order to translate the sentences and links.
For example:
<div>Overflow <span>Texts <b>go</b> here</span></div>
Should give me 2 results to translate:
Overflow
Texts <b>go</b> here
Any suggestions or commercial packages available for this problem?
I'm not exactly sure what you're asking, but look at simplehtmldom. Specifically the "Extract Contents from HTML" tab under quick start on that front page (can't link directly, sigh). With that you can extract the text of a website without all those pesky tags.
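One way to get the two results from the example is to treat a small set of inline formatting tags as part of the sentence and every other tag as a segment boundary. The sketch below does this with Python's standard html.parser module. The INLINE set, the class name and the function name are assumptions for illustration, and real pages would need more care (script and style contents, entities, attributes that need translating).

```python
from html.parser import HTMLParser

# Assumed set of tags that stay inside a translatable segment.
INLINE = {"b", "i", "em", "strong", "u", "a"}


class SegmentExtractor(HTMLParser):
    """Split markup into text segments, keeping inline tags in place."""

    def __init__(self):
        super().__init__()
        self.segments = []
        self._parts = []

    def _flush(self):
        text = "".join(self._parts).strip()
        if text:
            self.segments.append(text)
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in INLINE:
            # Keep the tag exactly as it appeared in the source.
            self._parts.append(self.get_starttag_text())
        else:
            self._flush()

    def handle_endtag(self, tag):
        if tag in INLINE:
            self._parts.append(f"</{tag}>")
        else:
            self._flush()

    def handle_data(self, data):
        self._parts.append(data)


def extract_segments(markup):
    parser = SegmentExtractor()
    parser.feed(markup)
    parser.close()
    parser._flush()  # emit any text after the last tag
    return parser.segments


print(extract_segments("<div>Overflow <span>Texts <b>go</b> here</span></div>"))
# ['Overflow', 'Texts <b>go</b> here']
```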

Text Parser with PHP, like Instapaper

I'm trying to write a text parser in PHP, like the one Instapaper has. What I want to do is fetch a webpage and parse it in text-only mode.
It's simple to get the webpage with cURL and strip the HTML tags. But every webpage has some common areas, like the header, navigation, sidebar, footer, banners, etc. I only want to get the article in text mode and exclude all the other parts. It's also simple to exclude those parts if I know their "id" or "class". But I'm trying to automate this process and apply it to any page, like Instapaper does.
I can get all the content, but I don't know how to exclude the header, sidebar or footer and get only the main article body. I have to develop some logic to get only the main article part.
It's not important for me to find the exact code. It would also be useful to understand how to exclude the unnecessary parts, as I can try to write my own code in PHP. It would also be useful if there are any examples in other languages.
Thanks for helping.
You might try looking at the algorithms behind this bookmarklet, Readability. It has a decent success rate at extracting content from amid all the web page rubbish.
A friend of mine made it, which is why I'm recommending it: I know it works, and I'm aware of the many techniques he's using to parse the data. You could apply these techniques to what you're asking.
You can take a look at the source of Goose. It already does a lot of this, like Instapaper's text extraction.
https://github.com/jiminoc/goose/wiki
Have a look at the ExtractContent code from Shuyo Nakatani.
See original Ruby source http://rubyforge.org/projects/extractcontent/ or a port of it to Perl http://metacpan.org/pod/HTML::ExtractContent
You really should consider using a HTML parser for this. Gather similar pages and compare the DOM trees to find the differing nodes.
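A simplified sketch of that idea follows. It compares the text blocks of pages from the same site rather than full DOM trees, which is a deliberate shortcut: blocks that also appear on a sibling page are assumed to be navigation or footer, and whatever is left is taken as the article. The class and function names and the two sample pages are made up for this example.

```python
from html.parser import HTMLParser


class TextBlocks(HTMLParser):
    """Collect the text found between tags, skipping script and style."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # normalise whitespace
        if text and not self._skip_depth:
            self.blocks.append(text)


def text_blocks(markup):
    parser = TextBlocks()
    parser.feed(markup)
    parser.close()
    return parser.blocks


def unique_blocks(page, sibling_pages):
    """Keep only the blocks of `page` that no sibling page shares."""
    shared = set()
    for other in sibling_pages:
        shared.update(text_blocks(other))
    return [block for block in text_blocks(page) if block not in shared]


page_a = "<div>Home | About</div><p>First article body.</p><div>Copyright Example</div>"
page_b = "<div>Home | About</div><p>Second article body.</p><div>Copyright Example</div>"
print(unique_blocks(page_a, [page_b]))
# ['First article body.']
```

This only works when you can fetch several pages built from the same template, and it will also drop article sentences that happen to repeat across pages.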
This article provides a comparison of different approaches. The Java library boilerpipe was rated highly. On the boilerpipe site you will find the author's scientific paper, which compares it to other algorithms.
Not all algorithms suit all purposes. The biggest application of such tools is simply to get the raw text for a search engine to index, the idea being that you don't want search results to be messed up by adverts. Such extraction can be destructive, meaning that it won't give you "the best reading area", which is what people want from Instapaper or Readability.

Text-to-HTML converter for PHP

What text to HTML converter for PHP would you recommend?
One example would be Markdown, which is used here at SO. The user just types some text into the text box with some natural formatting: Enter at the end of a line, an empty line at the end of a paragraph, asterisk-delimited bold text, etc. This syntax is then converted to HTML tags.
Simplicity is the main feature we are looking for. There do not need to be a lot of possibilities, but the basic ones that are there should be very intuitive (automatic conversion of URLs to links, emoticons, paragraphs).
A big plus would be if there is a WYSIWYG editor for it. Half-WYSIWYG, just like here at SO, would be even better.
Extra points would be if it would fit with Zend Framework well.
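To make the requirements concrete, here is a rough sketch of the "very basic" end of such a converter: blank lines become paragraphs, single newlines become line breaks, and bare URLs become links. It is written in Python for illustration and is not how Markdown or any of the libraries named in this thread work internally; the function name and the URL pattern are assumptions, and the pattern is deliberately naive (for example, it would swallow a trailing full stop).

```python
import html
import re

# Naive URL pattern: scheme, then everything up to whitespace or "<".
URL = re.compile(r"https?://[^\s<]+")


def text_to_html(text):
    """Convert plain text to paragraphs, line breaks and links."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    out = []
    for para in paragraphs:
        safe = html.escape(para)  # neutralise any HTML the user typed
        safe = URL.sub(
            lambda m: f'<a href="{m.group(0)}">{m.group(0)}</a>', safe
        )
        safe = safe.replace("\n", "<br>\n")
        out.append(f"<p>{safe}</p>")
    return "\n".join(out)


print(text_to_html("Hello\nworld\n\nSee http://example.com now"))
# <p>Hello<br>
# world</p>
# <p>See <a href="http://example.com">http://example.com</a> now</p>
```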
Take your pick at http://en.wikipedia.org/wiki/Lightweight_markup_language.
As for Markdown, there's one PHP parser that I've been using called PHP Markdown, and I especially like the Extra extension.
I have actually taken a stab at extending it with my own (undocumented) features. It's available on GitHub if you're interested (remember that it's the extra branch I've fixed, not master). I've intended to make it a 'proper fork' for a while, but that's another, largely off-topic, story.
The Zend Framework has a WYSIWYG editor bundled with its Dojo integration.
http://framework.zend.com/manual/en/zend.dojo.form.html#zend.dojo.form.elements.editor
... Bring on the extra points!
There's always textile. It is widely implemented, and has a few basic similarities with Markdown. However, I have never seen a WYSIWYG editor for Textile.
You might find upflow useful.
If you want WYSIWYG, I'm a big fan of FCKeditor. It converts user input to HTML before submitting the form, not after, but has a nice PHP library for using it, and a PHP connector for handling file uploading/browsing (along with several other languages).
If you want something that can be read as plain-text but output as HTML, I vote for Markdown.
I will stick with my original idea of adopting Texy.
None of the products mentioned here actually beats it. I had a problem with Texy's syntax, but it seems to be quite standard and is present in other products too.
It is very lightweight, supports a very natural syntax, and has a great "half" WYSIWYG editor, Texyla (the wiki is in Czech only).
