I need to find text in HTML and highlight it without corrupting the HTML that is being highlighted.
Text to Replace:
This is text 2. This is Text 3.
HTML:
This is text 1. <p>
This is <span>Text 2</span>. This <div>is</div> text 3.
</p> This is Text 4.
Desired Output:
This is text 1.<p>
<strong class="highlight">This is <span>Text 2</span>. This <div>is</div> text 3. </strong>
</p> This is Text 4.
EDIT: Sorry if I did not explain this properly.
I need to highlight a portion of an HTML document (in PHP or JavaScript) if the string I am searching for matches text in the HTML.
But remember that the text in the HTML may not be identical to the search string; it may contain some extra HTML.
For example, if I am searching for the string "This is text.", it should match "This is text.", "<anyhtmltag>This</anyhtmltag> is text.", "This <anyhtmltag>is</anyhtmltag> text.", "This<anyhtmltag> is text</anyhtmltag>." and so on.
You need to be more specific: do you want to do this server-side (for example in PHP, returning HTML that already contains the highlighted output) or client-side (for example with jQuery, finding and highlighting something in the HTML returned by the server)?
It seems to me that you asked the question without doing any research first (like searching the net), as finding a solution for jQuery (client-side) took me around ten seconds. The three most relevant search results were on Stack Exchange and in the jQuery documentation itself:
Find text using jQuery?
Find text string using jQuery?
jQuery .wrap() function description
Here is a very brief example:
<script>
// :contains() compares against text content (case-sensitive), so the search string holds no markup.
$('div:contains("This is Text 2. This is text 3")').wrap('<strong class="highlight"></strong>');
</script>
It finds every div whose text contains the search string and wraps that whole element in the element you pass to .wrap().
This works when the text you want to find is inside some div, which is why I used $('div:contains(...)'). If you want to search the whole page, you can use $('*:contains(...)') instead.
This is an example of client-side highlighting with jQuery. For a PHP (server-side) version, a little searching on Google or Stack Overflow will turn up many examples.
EDIT: As for your updated question: if you are using a text box to enter what you want to search for, you can use something like this:
<script>
$("#mysearchbox").change(function () {
  // Build the :contains() selector from the current value of the search box.
  $('div:contains("' + $(this).val() + '")').wrap('<strong class="highlight"></strong>');
});
</script>
and define your search box somewhere else, for example like this:
<input id="mysearchbox">
This attaches a change handler to your search box. It fires when the value of the box has changed and the box loses focus (or Enter is pressed), and it should find and highlight whatever you entered.
Note that these examples are written from memory. I don't have access to jQuery from where I'm writing, so I can't check them for mistakes. You need to test them yourself.
With jQuery you can add classes to elements like this:
The HTML:
<p id='myParagraph'>I need highlighting!</p>
The jQuery:
$(document).ready(function () {
  $('#myParagraph').addClass('highlight');
});
I would also check where certain HTML elements are allowed, as block elements such as div are not valid inside a p element.
I hope this helps!
UPDATED
Okay, you can use jQuery's .wrap() to wrap tags around your markup.
If you need to remove the tags, you can use PHP's strip_tags() function. This might help with comparing the text string without the HTML formatting, though it has to be done on the server, before the page is sent to the browser. I'm not sure of a JavaScript equivalent.
The wrap will get you from your HTML to your Desired Output. That said, I would seriously reconsider the structure of your HTML to make sure it is the best it can be; it might make life easier.
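As a rough JavaScript counterpart to comparing the text without its tags, the idea can be sketched on a plain HTML string: collect the characters that sit outside tags, remember where each one came from, search that text, and wrap the matching range of the original markup. This is only a sketch under assumptions. The function name highlightAcrossTags is made up for illustration, the input is assumed to be well-formed HTML with no literal "<" in text or ">" in attribute values, lowercasing is assumed not to change the string length, and the whitespace must match exactly.

```javascript
// Hypothetical helper (not from any library): wraps the first occurrence of
// `needle` in `html`, ignoring any tags that interrupt the matching text.
function highlightAcrossTags(html, needle) {
  let text = "";        // the characters of `html` that are outside tags
  const offsets = [];   // offsets[i] is the index in `html` of text[i]
  let inTag = false;

  for (let i = 0; i < html.length; i++) {
    const ch = html[i];
    if (ch === "<") { inTag = true; continue; }
    if (ch === ">" && inTag) { inTag = false; continue; }
    if (!inTag) {
      text += ch;
      offsets.push(i);
    }
  }

  if (needle.length === 0) return html;

  // Case-insensitive search on the tag-free text.
  const start = text.toLowerCase().indexOf(needle.toLowerCase());
  if (start === -1) return html;

  // Map the match back to a range in the original markup.
  const from = offsets[start];
  const to = offsets[start + needle.length - 1] + 1;

  return (
    html.slice(0, from) +
    '<strong class="highlight">' +
    html.slice(from, to) +
    "</strong>" +
    html.slice(to)
  );
}
```

If the match starts inside one element and ends inside another, the wrapper will cut across those elements and the result will be badly nested, so a real solution would need to split the wrapper per text node.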
Scenario:
I need to apply a php function to the plain text contained inside HTML tags, and show the result, maintaining the original tags (with their original attributes).
Visualize:
Take this:
<p>Some text here pointing to the moon and that's it</p>
Return this:
<p>
phpFunction('Some text here pointing to the ')
phpFunction('moon')
phpFunction(' and that\'s it')
</p>
What I should do:
Use a PHP html parser (instead of using regexp) and iterate over every tag, applying the callback to the node text content.
Problem:
If I have, for example, an <a> tag inside a <p> tag, the text content of the parent <p> tag would consist of two different plain text parts, which the PHP callback should treat as separate.
Question:
How should I approach this in a clean and smooth way?
Thanks for your time, all the best.
In the end, I decided to use regex instead of including an external library.
For the sake of simplicity:
$expectedOutput = preg_replace_callback(
    '/>(.*)</U',
    function ($matches) {
        // $matches[1] is the text captured between '>' and '<'.
        return '>' . doStuff($matches[1]) . '<';
    },
    $fromInput
);
This will look for everything between > and <, which is, indeed, what I was looking for.
Of course any suggestion/comment is still welcome.
Peace.
Good morning,
Here's the problem:
I have some text being entered via a text editor (WYSIWYG/TinyMCE) and being displayed elsewhere as a posting. The problem is that the text loses its formatting when displayed as a posting. After digging through the code, I discovered that this was being done with a strip_tags() + echo preg_replace() combo. I'm still new to PHP, but I was able to figure out:
strip_tags() was taking out the formatting (b/c that's how it rolls)
I could add <strong> and <em> to the allowed tags to get the bold and italicized text to display
the underlined and strikethrough text are CSS styles and adding the code (as it is saved on the db table) to the strip_tags() list did NOT solve the problem
My question is: can I modify the existing code to solve this, or should I use something else (htmlentities() perhaps)?
EDIT: I tried htmlentities and it failed.
EDIT: I added just the tag and the problem is 50% solved. My text is underlined, but it shows lower than the non-underlined text that comes after it. It's as if the underlined text is being treated as subtext or something.
code snippet:
<div class="display_text_area">
<?php $text = strip_tags(str_ireplace("</p>", "</p><br/>",
$text_detail->description),
'<font><ul><li><br/><strong><em><span style="text-decoration: underline;">'); ?>
<?php echo preg_replace('/(<br[^>]*>\s*){2,}/', '<br/>', $text); ?>
</div>
I'm leaving the tag here to show that (a) I tried it, and (b) it didn't work. So (c) I know it needs to be removed or modified.
Many thanks in advance.
The point is that TinyMCE returns nominally valid rich HTML that doesn't need stripping or escaping before being used in an HTML page. However, you can't assume that the TinyMCE editor is actually running on the client, as you might be exploited by someone who directly posts a response containing an XSS attack.
IIRC, TinyMCE returns XHTML by default. You need to ensure that any returned HTML is safe and correct using a library such as HTML Purifier.
I'm working on an e-book that will be published on my website. I want to mimic the OS X Spotlight feature, where someone can type text into my fixed search bar and have it highlighted on the page. I tried to use Sphider, but had no luck getting this result.
I found this similar thread, but it is not exactly what I'm looking for.
You could use a string replace to surround all text that needs to be highlighted with a span tag. Then create a CSS class for that span tag.
<?php
// Escape the input first: $EBOOK is HTML, so the term must not carry raw markup into the page.
$searchString = htmlspecialchars($_POST['search']);
$EBOOK = str_replace($searchString, "<span class='highlighted'>$searchString</span>", $EBOOK);
Then some CSS
.highlighted {
background-color:yellow;
}
To take it to the next step you could use javascript to scroll the user's web browser to the first location of a span.highlighted.
Note that I wouldn't use a regular expression (i.e. preg_replace) to replace the search string, because the user's search input could contain regex special characters that would need to be escaped.
This is all theoretical of course... based on your question.
Edit: I just thought of something. The e-book content will contain HTML tags, so if you use a string replace function like I suggested, take care that the tags themselves are not searched and replaced. A regular expression replace may be needed in this case.
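To illustrate both points (escaping the user's input and leaving the tags alone), here is a rough sketch in JavaScript rather than PHP. It splits the markup into tags and text, and only replaces inside the text parts. The names escapeRegExp and highlightTextOnly are made up for this example, and it assumes simple markup: no ">" inside attribute values, no script or style blocks, and a search term that does not rely on HTML entities.

```javascript
// Escape characters that have a special meaning in a regular expression,
// so the user's input is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: highlight `search` in the text parts of `html` only.
function highlightTextOnly(html, search) {
  if (search === "") return html;
  const pattern = new RegExp(escapeRegExp(search), "gi");

  // Splitting on a capturing group keeps the tags in the result:
  // even indexes are text, odd indexes are tags.
  return html
    .split(/(<[^>]*>)/)
    .map((part, i) =>
      i % 2 === 1
        ? part // a tag: leave untouched
        : part.replace(pattern, (m) => '<span class="highlighted">' + m + "</span>")
    )
    .join("");
}
```

The matched text is reinserted exactly as it appeared in the document, so the search term itself never ends up in the output as markup. For anything beyond simple markup, a real HTML parser is the safer route.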
What is the best way to highlight a searched phrase within an HTML document?
I have the complete HTML document as a large string in a variable.
I want to highlight the searched term in the text only, excluding the tags.
For example, if the user searches for "img", the img tag should be ignored, but
the phrase "img" within the text should be highlighted.
Don't use regex.
Because regex cannot parse HTML (or even come close), any attempt to mess around with matching words in an HTML string risks breaking words that appear in markup. A badly-implemented HTML regex hack can even leave you with HTML-injection vulnerabilities which an attacker may be able to leverage to do cross-site-scripting.
Instead you should parse the HTML and do the searches on the text content only.
If you can accept a solution that adds the highlighting from JavaScript on the client side, this is really easy because the browser will already have parsed the HTML into a bunch of DOM objects you can manipulate. See eg. this question for a client-side example.
If you have to do it with PHP that's a bit more tricky. The simple solution would be to use DOMDocument::loadHTML and then translate the findText function from the above example into PHP. At least the DOM methods used are standardised so they work the same.
Edit: This was tagged as Java before, so this answer might not be applicable.
This is quick and dirty, but it might work for you, or at least be a starting point:
import java.util.regex.Pattern;
private String highlight(String search, String html) {
    // Pattern.quote() makes regex metacharacters in the search term match literally.
    return html.replaceAll("(>[^<]*)(" + Pattern.quote(search) + ")([^>]*<)", "$1<em>$2</em>$3");
}
This requires testing, and I make no guarantees that it's correct. The simplest way to explain it is that you ensure your term sits between two tags, and is therefore not itself a tag or part of a tag attribute.
var highlight = function(what){
var html = document.body.innerHTML,
word = "(" + what + ")",
match = new RegExp(word, "gi");
html = html.replace(match, "<span style='background-color: red'>$1</span>");
document.body.innerHTML = html;
};
highlight('ll');
This would highlight any occurrence of 'll'.
Be careful when calling highlight() with < or > or any tag name: it would replace those as well, breaking your markup. You might work around that by reading innerText instead of innerHTML, but that way you'll lose the markup information.
The best way is probably to implement a parser routine yourself.
Example: http://www.jsfiddle.net/DRtVn/
There is a free JavaScript library that might help you out: http://scott.yang.id.au/code/se-hilite/
You must be using some server-side language to render the search results on the web page.
So the best way I can think of is to highlight the word while rendering it, using the server-side language itself, which may be PHP, Java or any other language.
This way you would work only with the result strings, without HTML and without parsing overhead.
I have a textarea where someone can input text. I do not want a WYSIWYG editor.
But what would be great:
Strip all tags, but produce correct <p> and <br /> tags if the user input has new lines.
Additionally, convert all URLs, with or without http:// or query parameters, to clickable links.
I cannot find a solution.
So you could type into the textarea:
........
This is a paragraph
This is still in the paragraph
this is a new paragraph www.this-would-be-clickable
new paragraph <strong>this will be shown not bold</strong>
........
Thankful for any advice.
Take a look at CKEditor. It may be more than what you need, but still very good.
http://ckeditor.com/
Another more simplistic alternative is Wymeditor.
Seems to me that Markdown or Textile would get you a long way, though.
But if all you need is the newline/paragraph control and url to link, you could easily build it yourself with some regex.
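A rough idea of what building it yourself could look like, sketched in JavaScript. The function names are made up, and the URL pattern is deliberately naive (it takes everything up to the next whitespace, so trailing punctuation or an escaped quote would end up inside the link). Tags are neutralised by escaping them rather than removing them, so typed markup is shown as plain text.

```javascript
// Escape HTML special characters so typed tags are displayed, not rendered.
function escapeHtml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: blank lines become paragraphs, single newlines become
// <br />, and anything starting with http://, https:// or www. becomes a link.
function formatPlainText(input) {
  const linkify = (s) =>
    s.replace(/(?:https?:\/\/|www\.)[^\s<]+/g, (url) => {
      const href = url.startsWith("www.") ? "http://" + url : url;
      return '<a href="' + href + '">' + url + "</a>";
    });

  return input
    .replace(/\r\n?/g, "\n")          // normalise line endings
    .split(/\n{2,}/)                  // blank line = new paragraph
    .map((p) => p.trim())
    .filter((p) => p.length > 0)
    .map((p) => "<p>" + linkify(escapeHtml(p)).replace(/\n/g, "<br />\n") + "</p>")
    .join("\n");
}
```

The same steps translate directly to PHP if the formatting has to happen on the server.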
I found a function inside a well-known blog software, plus this regex for links. The regex seems to work, but there are most probably better solutions:
/((http(s?)://)|(www.))([^\s()[]<>]+|([^\s)])|[[^\s]]])+(?