Textarea without wysiwyg editor but have nice format - php

I have a textarea where someone can input text. I do not want a wysiwyg editor.
But what would be great:
Strip all tags, but produce correct <p> and <br /> elements where the user input has new lines.
Additionally, convert all URLs, with or without http:// or query parameters, into clickable links.
I cannot find a solution.
So you could type into the textarea:
........
This is a paragraph
This is still in the paragraph
this is a new paragraph www.this-would-be-clickable
new paragraph `<strong>`this will be shown not bold`</strong>`
........
Thankful for any advice.

Take a look at CKEditor. It may be more than what you need, but still very good.
http://ckeditor.com/

Another more simplistic alternative is Wymeditor.
Seems to me that Markdown or Textile would get you a long way, though.
But if all you need is the newline/paragraph control and url to link, you could easily build it yourself with some regex.
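As a rough illustration of the "build it yourself" route, here is a minimal sketch. The function name formatPlainText and the URL pattern are assumptions made for this example; the pattern is deliberately simple and will, for instance, include trailing punctuation in a link.

```php
<?php
// Hypothetical helper: the name and the URL pattern are illustrative assumptions.
function formatPlainText(string $text): string
{
    // Remove tags, then escape what is left so nothing is interpreted as HTML.
    $text = htmlspecialchars(strip_tags($text), ENT_QUOTES, 'UTF-8');
    // Normalise Windows and old Mac line endings to "\n".
    $text = trim(str_replace(["\r\n", "\r"], "\n", $text));

    // Turn http://, https:// and bare www. addresses into links.
    $text = preg_replace_callback(
        '~\b(?:https?://|www\.)[^\s<]+~i',
        function (array $m): string {
            $url = $m[0];
            // Bare www. addresses need a scheme to work as an href.
            $href = stripos($url, 'http') === 0 ? $url : 'http://' . $url;
            return '<a href="' . $href . '">' . $url . '</a>';
        },
        $text
    );

    // A blank line starts a new paragraph; a single newline becomes <br />.
    $paragraphs = preg_split('~\n\s*\n~', $text, -1, PREG_SPLIT_NO_EMPTY);
    $html = '';
    foreach ($paragraphs as $paragraph) {
        $html .= '<p>' . str_replace("\n", "<br />\n", $paragraph) . "</p>\n";
    }
    return $html;
}
```

Stripping happens before linking so that the generated <a> tags are the only markup in the result.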

I found a function inside a famous blog software, plus this regex for links. The regex seems to work, but most probably there are better solutions:
/((http(s?)://)|(www.))([^\s()[]<>]+|([^\s)])|[[^\s]]])+(?

Related

Is there a PHP function like ucfirst() that will ignore HTML?

I am programmatically cleaning up some basic grammar in comments and other user submitted content. Capitalizing I, the first letter of sentence, etc. The comments and content are mixed with HTML as users have some options in formatting their text.
This is actually proving to be a bit more challenging than expected, especially for someone new to PHP and regex.
Is there a function like ucfirst that will ignore HTML to help capitalize sentences?
Also, any links or tutorials on cleaning up text like this in html, would be appreciated. Please leave anything you feel would help in the comments. thanks!
EDIT:
Sample Text:
<div><p>i wuz walkin thru the PaRK and found <strong>ur dog</strong>. <br />i hoPe to get a reward.<br /> plz call or text 7zero4 8two8 49 sevenseven</div>
I need for it to be (ultimately)
<div><p>I was walking through the park and found <strong>your dog</strong>.</p><p>I hope to get a reward.</p><p>Please call or text (704) 828-4977.</p></div>
I know this is going a little farther than the intended question, but my thought was to do this incrementally. ucfirst() is just one of many functions I was using to do one small cleanup at a time per scan. Even if I had to run the text 100 times through the filter, this runs on a cron run when the site has no traffic. I wish there was a discussion forum where this could continue as obviously there would be some great ideas on continuing the approach. Any thoughts on how to approach this as an overall project by all means please leave a comment.
I guess, in the spirit of the question itself, ucfirst would not be the best function for this, as it cannot take an argument list of things to ignore. A flag IGNORE_HTML would be great!
Given this is a PHP question, then the DOM parser recommended below sounds like the best answer? Thoughts?
You can also apply a CSS pseudo-element to your desired elements like this:
div:first-letter {
    text-transform: uppercase;
}
But you will probably need to change the way you print out your sentences (if you are printing them all in one huge tag), since CSS cannot detect the start of a new sentence inside a single tag :(
You should probably use a DOM parser (either the built-in one or for example this one, which is really easy to use).
Walk through all of the text nodes in your HTML and perform the clean-up with preg_replace_callback, ucfirst and a regular expression like this one:
'/(\s*)([^.?!]*)/'
This will match a string of whitespace, and then as many non-sentence-ending-punctuation characters as possible. The whitespace ends up in the first capturing group, and the actual sentence (starting with a letter, unless your sentence starts with ", which complicates things a bit) will then be found in the second capturing group.
But from your question, I suppose you are already doing something like the latter and your code is just choking on HTML tags. Here is some example code to get all text nodes with the second DOM parser I linked:
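require 'simple_html_dom.php';
$html = new simple_html_dom();
$html->load($fullHtmlStr);
// Write the result back to innertext; reassigning the loop variable would not change the document
foreach ($html->find('text') as $textNode) {
    $textNode->innertext = cleanupFunction($textNode->innertext);
}
$cleanedHtmlStr = $html->save();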
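The same walk over text nodes can be sketched with PHP's built-in DOM extension instead. This is an illustration under assumptions, not the answer's exact method: the function name capitalizeHtmlSentences is made up, the input is assumed to be an ASCII HTML fragment, and the simple regex is swapped for a small flag that survives across text nodes, so a sentence continuing inside <strong> is not capitalized again.

```php
<?php
// Hypothetical helper; assumes an ASCII HTML fragment as input.
function capitalizeHtmlSentences(string $html): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // keep parser warnings quiet
    // Wrap the fragment so the document has exactly one root element.
    $doc->loadHTML('<div>' . $html . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $capitalizeNext = true; // true at the start and after . ? or !
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//text()') as $node) {
        $node->nodeValue = preg_replace_callback(
            '/[.?!]|[A-Za-z]+/',
            function (array $m) use (&$capitalizeNext): string {
                if (!ctype_alpha($m[0])) {
                    // Sentence-ending punctuation: the next word starts a sentence.
                    $capitalizeNext = true;
                    return $m[0];
                }
                $word = $capitalizeNext ? ucfirst($m[0]) : $m[0];
                $capitalizeNext = false;
                return $word;
            },
            $node->nodeValue
        );
    }

    // Serialize the children of the wrapper, dropping the wrapper itself.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Because only text nodes are touched, tag names and attributes are never modified.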
In HTML it will be very difficult to do, as you will be building some kind of HTML parser. My suggestion would be to clean up the text before it is transformed into HTML, at the moment you pull it out of the database. Or even better, clean up the database once.
This should do it:
function html_ucfirst($s) {
    return preg_replace_callback('#^((<(.+?)>)*)(.*?)$#', function ($c) {
        return $c[1].ucfirst(array_pop($c));
    }, $s);
}
Converts
<b>foo</b> to <b>Foo</b>,
<div><p>test</p></div> to <div><p>Test</p></div>,
but also bar to Bar.
Edit: According to your detailed question, you probably want to apply this function to each sentence. You will have to parse the text first (e.g. splitting by periods).

Highlight piece of HTML code with matching text

I need to find text in HTML and highlight it without corrupting the HTML that is being highlighted.
Text to Replace:
This is text 2. This is Text 3.
HTML:
This is text 1. <p>
This is <span>Text 2</span>. This <div>is</div> text 3.
</p> This is Text 4.
Desired Output:
This is text 1.<p>
<strong class="highlight">This is <span>Text 2</span>. This <div>is</div> text 3. </strong>
</p> This is Text 4.
EDIT: Sorry if I was not able to explain properly.
I need to highlight a portion of an HTML document (in PHP or JavaScript) if the string I am searching for matches text in the HTML.
But remember that the text in the HTML may not be identical to the search string; it may contain some extra HTML.
For example, if I am searching for the string "This is text.", it should match "This is text.", "<anyhtmltag>This</anyhtmltag> is text.", "This <anyhtmltag>is</anyhtmltag> text.", "This<anyhtmltag> is text</anyhtmltag>." and so on.
You need to be more specific: do you want to do this server-side (for example with PHP, returning HTML to the browser that already contains the highlighted output) or client-side (for example with jQuery, finding and highlighting something in the HTML returned by the server)?
It seems to me that you asked the question without searching first, as finding a suitable jQuery (client-side) solution took me around ten seconds. The three most relevant search results were on Stack Exchange and in the jQuery documentation itself:
Find text using jQuery?
Find text string using jQuery?
jQuery .wrap() function description
Here is a very brief example:
<script>
// :contains() matches against text content, so the search string must not include markup
$('div:contains("This is Text 2")').wrap('<strong class="highlight"></strong>');
</script>
It finds what you are looking for and wraps it in whatever you want it wrapped in.
This works when the text you want to find is inside some div, which is why I used $('div:contains. If you want to search the whole page, you can use $('*:contains instead.
This example is for jQuery and client-side highlighting. For a PHP (server-side) version, do a little searching on Google or Stack Overflow and you will surely find many examples.
EDIT: As for your updated question: if you are using a text box to enter what you want to search for, you can of course use something like this:
<script>
$("#mysearchbox").change(function () {
    var term = $(this).val();
    $('div:contains("' + term + '")').wrap('<strong class="highlight"></strong>');
});
</script>
and define your search box somewhere else, for example like this:
<input id="mysearchbox">
This way, you are attaching a change handler to your search box. It fires when the value is committed (for example when the field loses focus) and should find and highlight whatever you entered.
Note that these examples are written from memory. I don't have access to jQuery from where I'm writing, so I can't check whether there are any mistakes in what I've just written. You need to test it yourself.
By using jQuery you can add classes to elements like this...
The HTML:
<p id='myParagraph'>I need highlighting!</p>
The jQuery:
$(document).ready(function(){
    $('#myParagraph').addClass('highlight');
});
I would have a look at where you can put certain HTML elements, as it is not advisable to be putting things like div tags in p tags.
I hope this helps!
UPDATED
Okay, well, you can use jQuery to wrap tags around your code.
If you need to remove the tags, you can use PHP's strip_tags() function. This might help with comparing the text string without the HTML formatting, although it obviously has to be done before the page has loaded in the browser. I'm not sure about a JavaScript equivalent.
The wrap will allow you to get from your HTML to your desired output. That said, I would seriously consider the structure of your HTML to make sure it is the best it can be; it might make life easier.
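The strip_tags() comparison mentioned above could look roughly like this. It is only a sketch of the matching step: the helper names normalizeText and htmlContainsPhrase are assumptions, and the code only reports whether the phrase occurs in the visible text; it does not wrap anything in the original markup.

```php
<?php
// Hypothetical helpers for comparing visible text while ignoring markup.
function normalizeText(string $html): string
{
    // Drop tags, decode entities, collapse whitespace runs, ignore case.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');
    return strtolower(trim(preg_replace('/\s+/', ' ', $text)));
}

function htmlContainsPhrase(string $html, string $phrase): bool
{
    return strpos(normalizeText($html), normalizeText($phrase)) !== false;
}

$html = "This is text 1. <p>\nThis is <span>Text 2</span>. This <div>is</div> text 3.\n</p> This is Text 4.";
var_dump(htmlContainsPhrase($html, 'This is text 2. This is Text 3.'));
```

One caveat: strip_tags() removes tags without inserting spaces, so words separated only by a tag (such as "a<br>b") end up glued together.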

PHP: Filter specific html tags out of a given text

I googled a lot, since this kind of problem has been asked about many times in the past. But I didn't find anything that matches my needs.
I have a html formatted text from a form. Just like this:
Hey, I am just some kind of <strong>formatted</strong> text!
Now, I want to strip all HTML tags that I don't allow. PHP's built-in strip_tags() function does that very well.
But I want to go a step further: I want to allow some tags only inside, or only outside, of other tags. I also want to define my own XML tags.
Another example:
I am a custom xml tag: <book><strong>Hello!</strong></book>. Ok... <strong>Hi!</strong>
Now, I want the <strong/> inside of <book/> to be stripped, but the <strong>Hi!</strong> can stay the way it is.
So, I want to define some rules of what I allow or don't allow, and want to have any filter do the rest.
Is there any easy way to do that? Regexes aren't what I'm looking for, since they can't parse HTML properly.
Regards, Jan Oliver
Don't think there is such a thing, I think not even HTML Purifier does that.
I suggest you parse the XHTML by hand using something like Simple HTML Dom.
Use a second argument to strip_tags, which is allowable tags.
$text = strip_tags($text, '<book><myxml:tag>');
I don't think there's a way to strip certain tags only when they are not inside other tags without using regex.
Also, regex isn't inherently bad at handling HTML, but it is slow compared to the alternatives. That's not what you're doing here anyway: you're going through the string and removing things you don't want. For your complex requirement I think your only option is to use regex.
To be completely honest, I think you should decide which tags are allowable and which aren't. Whether or not they are inside other tags shouldn't matter at all. It's markup, not a script.
The second argument shows that you can allow some tags:
string strip_tags ( string $str [, string $allowable_tags ] )
From php.net
I wrote my own Filter class based on the DOM classes of PHP. Look here: XHTMLFilter class
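The linked class is not reproduced here, but the general idea of a DOM-based rule can be sketched as follows. The function name unwrapTagInside is hypothetical; the sketch removes a given tag only where it appears inside a given ancestor and keeps its text content.

```php
<?php
// Hypothetical helper: unwrap every <$tag> that sits inside an <$ancestor>.
function unwrapTagInside(string $html, string $tag, string $ancestor): string
{
    $doc = new DOMDocument();
    // Custom tags such as <book> make the HTML parser emit warnings; collect them quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML('<div>' . $html . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Work on a plain array so removing nodes cannot disturb the iteration.
    $matches = iterator_to_array($xpath->query("//{$ancestor}//{$tag}"));
    foreach ($matches as $node) {
        // Move the children up in front of the node, then drop the empty node.
        while ($node->firstChild !== null) {
            $node->parentNode->insertBefore($node->firstChild, $node);
        }
        $node->parentNode->removeChild($node);
    }

    // Serialize the children of the wrapper, dropping the wrapper itself.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$input = 'I am a custom xml tag: <book><strong>Hello!</strong></book>. Ok... <strong>Hi!</strong>';
echo unwrapTagInside($input, 'strong', 'book');
```

Tags that are not allowed anywhere can still be removed beforehand with strip_tags() and its allowed-tags argument.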

How to add attribute to first P tag using PHP regular expression?

WordPress spits posts in this format:
<h2>Some header</h>
<p>First paragraph of the post</p>
<p>Second paragraph of the post</p>
etc.
To get my cool styling on the first paragraph (it's one of those things that looks good only sparingly) I need to hook into the get_posts function to filter its output with a preg_replace.
The goal is to get the above code to look like:
<h2>Some header</h>
<p class="first">First paragraph of the post</p>
<p>Second paragraph of the post</p>
I have this so far but it's not even working (the error is: "preg_replace() [function.preg-replace]: Unknown modifier ']'")
$output=preg_replace('<p[^>]*>', '<p class="first">', $content);
I can't use CSS3 meta-selectors because I need to support IE6, and I can't apply the :first-line meta-selector (this is one that IE6 supports) on the parent container because it would hit the H2 instead of the first P.
You may find it easier and more reliable to use an HTML parser such as this one. HTML is notoriously difficult to parse reliably (technically, impossible) with regular expressions, and the parser will give you a very simple means to find the nodes you're interested in. The first page of the doc has a tab labelled "How to modify HTML elements".
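For comparison, here is a minimal sketch of the parser approach using PHP's built-in DOMDocument rather than the linked library. The function name markFirstParagraph is an assumption for this example.

```php
<?php
// Hypothetical helper: add a class to the first <p> of an HTML fragment.
function markFirstParagraph(string $html, string $class = 'first'): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // keep parser warnings quiet
    // Wrap the fragment so the document has exactly one root element.
    $doc->loadHTML('<div>' . $html . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $first = $doc->getElementsByTagName('p')->item(0);
    if ($first !== null) {
        // Keep any class the paragraph already has.
        $existing = trim($first->getAttribute('class'));
        $first->setAttribute('class', $existing === '' ? $class : $existing . ' ' . $class);
    }

    // Serialize the children of the wrapper, dropping the wrapper itself.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo markFirstParagraph("<h2>Some header</h2>\n<p>First paragraph of the post</p>\n<p>Second paragraph of the post</p>");
```

Unlike a plain string replacement, this also copes with a first paragraph that already carries attributes.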
Two right possibilities:
Do that in JavaScript. Using jQuery, for example, it's a one-liner: $("h2").next().addClass("first")
Use an HTML parser. Regexes are not a good tool for what you want to do. But since loading a whole HTML parser just for this purpose is overkill, you'd really be better off using JavaScript.
The wrong way
Of course, in order to answer the question, here is the best way I can think of to make it happen with a regexp. I don't recommend it, though.
preg_replace('#(</h2>\s*<p[^>]*)>#im', '$1 class="first">', '<h2>Some header</h2> <p>First paragraph of the post</p> <p>Second paragraph of the post</p> ');
What we do is:
using preg_replace so we can use a regular expression for the replacement;
using the "i" flag so the match is case-insensitive; line breaks are already covered by \s, and the "m" flag only changes how ^ and $ behave, so it is optional here;
using </h2>\s* to match the closing "h2" tag and any spaces/line breaks after it;
using <p[^>]* to match the "p" tag and its current attributes;
using parentheses to capture that;
using "$1" to put the captured part back into the replacement;
adding the class and closing the ">".
The first drawback I can think of is that it doesn't handle the case where a class attribute already exists.
Oh, and by the way, you have <h2>...</h> instead of <h2>...</h2>. I don't know if it's a typo, but I assumed it was. Adjust the regexp accordingly if it's not.
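The problem is that the first character of the pattern in a preg_* function is taken as the delimiter. In '<p[^>]*>' the < opens the pattern and the first > closes it, so the pattern becomes p[^ and the remaining ]*> is read as modifiers, hence the "Unknown modifier ']'" error. What you'd need is something like: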
$output = preg_replace('~<p\b([^>]*)>~', '<p class="first" \1>', $content, 1);
This also puts back any extra attributes the <p> may have.
Overall, though, it's cleaner to do with CSS selectors and a JS fallback for IE.
EDIT: Added replacement limit and word break.
In this particular case, a regexp solution would be fairly easy:
echo preg_replace('~</h2>\s*<p~', "$0 class='first'", $html);
Reading through the answers, some of them will work, but all have drawbacks: they either rely on an external parsing library, or may match tags other than the P tag, or also match its attributes.
I ended up using this solution with the str_replace_once function from here (a user-defined helper, not a PHP built-in):
str_replace_once('<p>', '<p class="first">', $content);
Simple enough, and it works just as intended. Here's the full WordPress code snippet to filter the first paragraph any time the_content() is called:
add_filter('the_content', 'first_p_style');
function first_p_style($content) {
    $output = str_replace_once('<p>', '<p class="first">', $content);
    return $output;
}
Thanks for all the answers!
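Since str_replace_once is not part of PHP, here is one way such a helper could be written. This is an assumed implementation for illustration, not necessarily the one the post refers to.

```php
<?php
// Assumed implementation: replace only the first occurrence of $search.
function str_replace_once(string $search, string $replace, string $subject): string
{
    if ($search === '') {
        return $subject; // nothing sensible to search for
    }
    $pos = strpos($subject, $search);
    if ($pos === false) {
        return $subject; // no occurrence, return the input unchanged
    }
    return substr_replace($subject, $replace, $pos, strlen($search));
}

echo str_replace_once('<p>', '<p class="first">', '<p>First</p><p>Second</p>');
```

Note that the literal search for '<p>' skips a first paragraph that already has attributes.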

Help with changing how jWYSIWYG editor works

In jWYSIWYG editor, pushing enter inserts <br />s.
Instead of this, I would prefer that pushing enter would wrap chunks in <p> tags.
WHAT IS OUTPUT
line
<br />
new line
WHAT I WANT
<p>line</p>
<p>new line</p>
A quick look at the config suggests I can't do it without hacking it internally.
Do you suggest I hack the plugin, or use PHP to do it? The incoming HTML is parsed with HTML Purifier, so if that could do it, that would be great.
So - where should I do it, in the plugin or PHP?
Any quick implementations of how to do it?
Thanks
You could search and replace the <br>s with newlines, and then use HTML Purifier's %AutoFormat.AutoParagraph directive.
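If the conversion is done in plain PHP instead of through HTML Purifier, a stand-in could look like the sketch below. The function name brToParagraphs is hypothetical, and this is a different technique from AutoParagraph: it simply splits on <br> tags and wraps each chunk, assuming the input contains no block-level elements.

```php
<?php
// Hypothetical stand-in: split on <br> / <br /> and wrap each chunk in <p>.
function brToParagraphs(string $html): string
{
    // Whitespace around the line break belongs to the separator.
    $chunks = preg_split('~\s*<br\s*/?>\s*~i', $html, -1, PREG_SPLIT_NO_EMPTY);
    $out = '';
    foreach ($chunks as $chunk) {
        $chunk = trim($chunk);
        if ($chunk !== '') {
            $out .= '<p>' . $chunk . "</p>\n";
        }
    }
    return $out;
}

echo brToParagraphs("line\n<br />\nnew line");
```

The output can still be passed through HTML Purifier afterwards for sanitizing.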