highlight words on page just like jQuery.highlight()? - php

I am currently using jQuery.highlight() to highlight given words on a page and link them, but it takes a long time when there are lots of words to check and highlight.
I know how jQuery.highlight() works, but I can't think of how to implement the same functionality in PHP.
Is there any way in PHP to highlight words and link them just as jQuery.highlight() does? I should also mention that the page can contain both static and dynamic content.

If you know which words need to be highlighted, just create a CSS class highlight and use it as follows: <span class='highlight'>My highlighted text</span>.
As for the css, use for example
.highlight{
background-color: yellow;
}
You can also use str_replace to replace certain words, like so:
str_replace( $word, '<span class="highlight">'.$word.'</span>', $mytext)
$mytext would obviously be the whole text, and $word the word you wish to highlight. You could make an array of words that need to be highlighted and just loop over it with foreach().
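As a rough sketch of the array idea, the whole word list can also be handled in one pass with a single regular expression instead of a loop of str_replace calls. The helper name highlightWords and the sample words are made up for illustration, and the sketch assumes $text is plain text without HTML tags.

```php
<?php
// Hypothetical helper: wraps every listed word in a highlight span.
// Assumes $text is plain text without markup.
function highlightWords(string $text, array $words): string
{
    if ($words === []) {
        return $text;
    }
    // Escape regex metacharacters so each word is matched literally.
    $escaped = array_map(function ($w) {
        return preg_quote($w, '/');
    }, $words);
    // One alternation, matched on word boundaries and case-insensitively.
    $pattern = '/\b(' . implode('|', $escaped) . ')\b/i';
    return preg_replace($pattern, '<span class="highlight">$1</span>', $text);
}

echo highlightWords('PHP can highlight words; php is fine.', ['php', 'words']);
```

Doing it in one pass means a later word (say "span" or "class") can never match inside the markup that an earlier replacement inserted, which is a risk with repeated str_replace calls.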

Related

How to prevent PHP str_replace replace text in html tags?

I am developing a website where users can upload their texts. For management purposes, I want to
change every occurrence of the text "apple" to <a href="https://apple.com">apple</a> dynamically in PHP.
I am currently using str_replace() for this.
However, the word "apple" might already have been linked to an external source by a user. In that case, my replacement will mess up the original link.
Say the page has the following:
<a href="...">apple</a>
My code will change it to
<a href="..."><a href="https://apple.com">apple</a></a>
Is there any way I can identify if a certain "apple" was in an a tag or other html tags already?
Thank you
Use DOMDocument to turn the HTML into a DOM you can work with. Then, iterate over all text nodes, making the replacements.
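A minimal sketch of that DOMDocument approach might look like the following. The function name linkWord and the wrapper id are invented for the example, matching is case-sensitive, and character-encoding handling is left out, so treat it as a starting point rather than a finished implementation.

```php
<?php
// Hypothetical helper: links every $word that is not already inside an <a>.
function linkWord(string $html, string $word, string $url): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // tolerate sloppy user HTML
    // Wrap the fragment in a known container so it can be extracted again.
    $doc->loadHTML('<div id="wrap">' . $html . '</div>');
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Snapshot all text nodes that have no <a> ancestor.
    $nodes = iterator_to_array(
        $xpath->query('//div[@id="wrap"]//text()[not(ancestor::a)]')
    );
    $pattern = '/(' . preg_quote($word, '/') . ')/';

    foreach ($nodes as $node) {
        // Odd indices hold the matched word, even indices the text around it.
        $parts = preg_split($pattern, $node->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if (count($parts) < 2) {
            continue;                   // word not present in this text node
        }
        $frag = $doc->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($i % 2 === 1) {
                $a = $doc->createElement('a');
                $a->setAttribute('href', $url);
                $a->appendChild($doc->createTextNode($part));
                $frag->appendChild($a);
            } elseif ($part !== '') {
                $frag->appendChild($doc->createTextNode($part));
            }
        }
        $node->parentNode->replaceChild($frag, $node);
    }

    // Serialize only the children of the wrapper.
    $wrap = $xpath->query('//div[@id="wrap"]')->item(0);
    $out = '';
    foreach ($wrap->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo linkWord(
    '<p>An apple and <a href="https://example.com">apple</a> pie.</p>',
    'apple',
    'https://apple.com'
);
```

Because only text nodes without an a ancestor are touched, the existing link in the sample input should be left alone while the bare "apple" gets wrapped.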
Why not use an if statement to look for the <a href="..">, else do your replacement?
Would all occurrences of "Apple" be in regular sentences (i.e. preceded or followed by spaces or newlines)? If so, you could try something like this:
str_replace(' apple', ' <a href="https://apple.com">apple</a>', $string);
If that won't do what you need, do a catch-all str_replace and then use preg_match with a regex to clean up any nested links. Something along the lines of this would preserve the original link (though I don't recommend using regex to parse HTML).
preg_match('/\\\3', $string);

Is there a PHP function like ucfirst() that will ignore HTML?

I am programmatically cleaning up some basic grammar in comments and other user submitted content. Capitalizing I, the first letter of sentence, etc. The comments and content are mixed with HTML as users have some options in formatting their text.
This is actually proving to be a bit more challenging than expected, especially for someone new to PHP and regex.
Is there a function like ucfirst that will ignore HTML to help capitalize sentences?
Also, any links or tutorials on cleaning up text like this in html, would be appreciated. Please leave anything you feel would help in the comments. thanks!
EDIT:
Sample Text:
<div><p>i wuz walkin thru the PaRK and found <strong>ur dog</strong>. <br />i hoPe to get a reward.<br /> plz call or text 7zero4 8two8 49 sevenseven</div>
I need for it to be (ultimately)
<div><p>I was walking through the park and found <strong>your dog<strong>. <p>I hope to get a reward.</p><p> Please call or text (704) 828-4977.</p>
I know this is going a little farther than the intended question, but my thought was to do this incrementally. ucfirst() is just one of many functions I was using to do one small cleanup at a time per scan. Even if I had to run the text 100 times through the filter, this runs on a cron run when the site has no traffic. I wish there was a discussion forum where this could continue as obviously there would be some great ideas on continuing the approach. Any thoughts on how to approach this as an overall project by all means please leave a comment.
I guess in the spirit of the question itself. ucfirst then would not be the best function for this as it could not take an argument list of things to ignore. A flag IGNORE_HTML would be great!
Given this is a PHP question, then the DOM parser recommended below sounds like the best answer? Thoughts?
You can also add a CSS pseudo-element to your desired elements like this:
div:first-letter {
text-transform: uppercase;
}
But you will probably need to change the way you print out your sentences (if you are printing them all in one huge tag), since CSS cannot detect the start of a new sentence inside a single element :(
You should probably use a DOM parser (either the built-in one or for example this one, which is really easy to use).
Walk through all of the text nodes in your HTML and perform the clean-up with preg_replace_callback, ucfirst and a regular expression like this one:
'/(\s*)([^.?!]*)/'
This will match a string of whitespace, and then as many non-sentence-ending-punctuation characters as possible. The actual sentence (starting with a letter, unless your sentence starts with ", which complicates things a bit) will then be found in the second capturing group; the first one holds the leading whitespace.
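To make the mechanics concrete, here is a small sketch that applies that regular expression to a single plain-text string (one text node, say). The function name capitalizeSentences is made up for the example, and ucfirst only handles single-byte letters, so accented first letters would be left alone.

```php
<?php
// Hypothetical helper: capitalizes the first letter of each sentence in plain text.
function capitalizeSentences(string $text): string
{
    return preg_replace_callback(
        '/(\s*)([^.?!]*)/',
        function (array $m): string {
            // $m[1] is the leading whitespace, $m[2] the sentence body.
            return $m[1] . ucfirst($m[2]);
        },
        $text
    );
}

echo capitalizeSentences('i found ur dog. i hope to get a reward! plz call');
// prints: I found ur dog. I hope to get a reward! Plz call
```

The punctuation itself is never part of a match, so it is copied through unchanged between sentences.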
But from your question, I suppose you are already doing something like the latter and your code is just choking on HTML tags. Here is some example code to get all text nodes with the second DOM parser I linked:
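require 'simple_html_dom.php';
$html = new simple_html_dom();
$html->load($fullHtmlStr);
// Assigning to the loop variable would not change the document,
// so write the cleaned text back through the node's innertext property.
foreach ($html->find('text') as $textNode) {
    $textNode->innertext = cleanupFunction($textNode->innertext);
}
$cleanedHtmlStr = $html->save();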
In html it will be very difficult to do, as you will be building some kind of html parser. My suggestion would be to cleanup the text before it is transformed into html, at the moment you pull it out of the database. Or even better, cleanup the database once.
This should do it:
function html_ucfirst($s) {
return preg_replace_callback('#^((<(.+?)>)*)(.*?)$#', function ($c) {
return $c[1].ucfirst(array_pop($c));
}, $s);
}
Converts
<b>foo</b> to <b>Foo</b>,
<div><p>test</p></div> to <div><p>Test</p></div>,
but also bar to Bar.
Edit: According to your detailed question, you probably want to apply this function to each sentence. You will have to parse the text first (e.g. splitting by periods).

Creating a "spotlight search" in PHP

I'm working on an e-book that will be published on my website. I want to mimic the OS X Spotlight feature, where someone can type text into a fixed search bar and have it highlighted on the page for them. I tried using Sphider but had no luck getting this result.
I found this similar thread, but it is not exactly what I'm looking for.
You could use a string replace to surround all text that needs to be highlighted with a span tag. Then create a CSS class for that span tag.
<?php
$searchString = $_POST['search'];
$EBOOK = str_replace($searchString, "<span class='highlighted'>$searchString</span>", $EBOOK);
Then some CSS
.highlighted {
background-color:yellow;
}
To take it to the next step you could use javascript to scroll the user's web browser to the first location of a span.highlighted.
Note that I wouldn't pass the raw search string to a regular expression function (i.e. preg_replace), because the user's input could contain characters that are special in regex and would need to be escaped first, for example with preg_quote().
This is all theoretical, of course, based on your question.
Edit: I just thought of something. The e-book content will contain HTML tags, so if you use a string replace function like I suggested, take care not to let the tags themselves be searched and replaced. A regular expression replace may be needed in this case.
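One way to keep tags out of the search, sketched under simplifying assumptions: split the markup into tag and text chunks and only run the replacement on the text chunks. The function name highlightOutsideTags is invented here, and the naive tag pattern would be fooled by a ">" inside an attribute value, by comments, or by script blocks, so a real DOM parser is the more robust choice.

```php
<?php
// Hypothetical helper: highlights $search in the text between tags only.
function highlightOutsideTags(string $html, string $search): string
{
    if ($search === '') {
        return $html;
    }
    // Split into alternating chunks; captured tags land at odd indices.
    $chunks = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    // Escape the user's input for HTML first (so "&" matches "&amp;"),
    // then for the regex, so special characters are matched literally.
    $needle = preg_quote(htmlspecialchars($search, ENT_NOQUOTES), '/');
    foreach ($chunks as $i => $chunk) {
        if ($i % 2 === 0) {
            // $0 keeps the original casing of whatever was matched.
            $chunks[$i] = preg_replace(
                '/' . $needle . '/i',
                '<span class="highlighted">$0</span>',
                $chunk
            );
        }
    }
    return implode('', $chunks);
}

echo highlightOutsideTags('<p class="span">A span of text</p>', 'span');
// prints: <p class="span">A <span class="highlighted">span</span> of text</p>
```

Escaping the search string before it goes into the pattern also means user input such as "<b>" or "a.b" cannot inject markup or regex syntax.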

In PHP, how should I shorten long strings in headings and paragraphs so they fit inside fixed boundaries?

When building websites I'm forever chopping up strings to make them display nicely as headings and paragraphs. I use the substr function to chop-off unwanted characters and then add in ellipses. For example:
if ( strlen ( $mystring ) > 22 ) {
echo substr( $mystring,0,21 ).'...';
} else {
echo $mystring;
}
This works pretty good most of the time, but it is far from perfect. Check out how the shortened headings look on one of my sites. You can clearly see a lot of inconsistency in how the shortened headings look.
Surely, there is a better PHP method/ technique?
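For reference, a slightly tidier version of the snippet above could cut at the last space instead of mid-word and keep the ellipsis inside the limit. This is only a sketch: the function name truncateAtWord is made up, it assumes single-byte text (UTF-8 would need the mbstring functions), and it still counts characters rather than rendered width, so it cannot make headings in a proportional font line up.

```php
<?php
// Hypothetical helper: shortens $text to at most $max characters, ellipsis
// included, cutting at the last space where possible.
// Assumes single-byte text and $max larger than the ellipsis length.
function truncateAtWord(string $text, int $max = 22, string $ellipsis = '...'): string
{
    if (strlen($text) <= $max) {
        return $text;
    }
    $limit = $max - strlen($ellipsis);
    // Look one character past the limit so a word ending exactly there survives.
    $window = substr($text, 0, $limit + 1);
    $lastSpace = strrpos($window, ' ');
    $cut = ($lastSpace !== false && $lastSpace > 0)
        ? substr($window, 0, $lastSpace)   // cut at the word boundary
        : substr($text, 0, $limit);        // no space found: hard cut
    return rtrim($cut) . $ellipsis;
}

echo truncateAtWord('The quick brown fox jumps over');
// prints: The quick brown fox...
```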
Your problem is that normal fonts are not monospaced, i.e. the various letters have different widths. Because PHP can't tell the final width of the resulting string in the browser, it is impossible to tell what position one needs to cut the string at.
There are jQuery-based solutions for this (jQuery, running in the browser, does have access to the actual width information; @Dan shows a plugin in his answer). The downside, of course, is that this won't work without JavaScript.
If you want to invest the time, it would be possible to use GD's imagettfbbox() to calculate the approximate boundary using a common font like Arial. That would be far from perfectly reliable, but should give you a rough idea where to apply the cut.
No, because PHP doesn't know anything about how the text is going to end up rendered in the browser. Other people aren't even seeing the same thing you are for the same HTML, so how can changing the HTML your PHP generates fix this?
The only way to get consistent length text is to do the adjustments on the client side.
Something like the jQuery Ellipsis plugin:
http://plugins.jquery.com/plugin-tags/ellipsis
Edit: My bad, you want ellipses... Your ellipses look fine on that page you showed...
If you really want them to line up, you could put the text in an inline element with a max width of such and such and overflow: hidden, followed by a separate element with the ellipsis.
Another way is to play around with CSS. You don't cut your text (or you just shorten it a bit if it's very long) and then you place it in a fixed-width container with overflow: hidden. If you want the dots, you can add another element containing them above the end of the text with position: absolute.

php - preg_match string not within the href attribute

i find regex kinda confusing so i got stuck with this problem:
i need to insert <b> tags on certain keywords in a given text. problem is that if the keyword is within the href attribute, it would result to a broken link.
the code goes like this:
$text = preg_replace('/(\b'.$keyword.'\b)/i','<b>\1</b>',$text);
so for cases like
this keyword here
i end up with:
this <b>keyword</b> here
i tried all sorts of combinations but i still couldn't get the right pattern.
thanks!
You can't do that with regex alone. Regular expressions are powerful, but they can't parse a recursive grammar like HTML.
Instead, you should properly parse the HTML using an existing HTML parser. You just echo the HTML as-is until you encounter a text node; in that case, you run your preg_replace on the text before echoing it.
If your HTML is valid XHTML, you can use the xml_parse function. If it's not, use whatever HTML parser is available.
You can use preg_replace again after the first replacement to remove b tags from href:
$text=preg_replace('#(href="[^"]*)<b>([^"]*)</b>#i',"$1$2",$text);
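As an alternative single-pass sketch, a negative lookahead can skip any match that sits inside a tag, so there is nothing to clean up afterwards. The function name boldKeyword and the sample link are assumptions for illustration, and the trick relies on tags being well formed with no stray "<" or ">" inside attribute values.

```php
<?php
// Hypothetical helper: bolds $keyword only where it is not inside a tag.
function boldKeyword(string $text, string $keyword): string
{
    // (?![^<>]*>) rejects a match when the next angle bracket is a closing '>',
    // which means the match sits inside a tag such as <a href="...">.
    $pattern = '/\b(' . preg_quote($keyword, '/') . ')\b(?![^<>]*>)/i';
    return preg_replace($pattern, '<b>$1</b>', $text);
}

echo boldKeyword('<a href="keyword.php">this keyword here</a>', 'keyword');
// prints: <a href="keyword.php">this <b>keyword</b> here</a>
```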
Yes, you can use regex like that, but the code might become a little convoluted.
Here is a quick example
$string = '<a href="keyword.php">link text with keyword and stuff</a>';
$keyword = 'keyword';
// Capture the pieces of the link so that only the link text ($4) gets wrapped in <b>.
$text = preg_replace(
'/(<a href=")('.$keyword.')(\.php">)(.*)(<\/a>)/',
'$1$2$3<b>$4</b>$5',
$string
);
echo $string."\n";
echo $text."\n";
The content inside () is stored in the references $1, $2 ... $n, so I don't have to type stuff over again. The match can also be made more generic to match different kinds of URL syntax if needed.
Seeing this solution, you might want to rethink the way you plan to match keywords in your code. :)
Output:
<a href="keyword.php">link text with keyword and stuff</a>
<a href="keyword.php"><b>link text with keyword and stuff</b></a>
