Place content in between paragraphs without images - php

I am using the following code to place some ad code inside my content.
<?php
$content = apply_filters('the_content', $post->post_content);
$content = explode (' ', $content);
$halfway_mark = ceil(count($content) / 2);
$first_half_content = implode(' ', array_slice($content, 0, $halfway_mark));
$second_half_content = implode(' ', array_slice($content, $halfway_mark));
echo $first_half_content.'...';
echo ' YOUR ADS CODE';
echo $second_half_content;
?>
How can I modify this so that neither of the two paragraphs (top and bottom) enclosing the ad code contains an image? If the top or bottom paragraph has an image, then the next two paragraphs should be tried.
Example: Correct Implementation on the right.

preg_replace version
This code steps through every paragraph in the second half of the content, skipping those that contain image tags. The $pcount variable is incremented for every paragraph found without an image; if an image is encountered, however, $pcount is reset to zero. When a paragraph without an image is reached while $pcount is already 1 (that is, when it would hit two), the advert markup is inserted just before that paragraph. This should leave the advert markup between two safe paragraphs. The advert markup variable is then emptied so that only one advert is inserted.
The following code is just setup and could be modified to split the content differently. You could also modify the regular expression that is used, in case you are using double BRs or something else to delimit your paragraphs.
/// set our advert content
$advert = '<marquee>BUY THIS STUFF!!</marquee>' . "\n\n";
/// calculate mid point
$mpoint = floor(strlen($content) / 2);
/// modify back to the start of a paragraph
$mpoint = strripos($content, '<p', -$mpoint);
/// split html so we only work on second half
$first = substr($content, 0, $mpoint);
$second = substr($content, $mpoint);
$pcount = 0;
$regexp = '/<p>.+?<\/p>/si';
The rest is the bulk of the code that runs the replacement. This could be modified to insert more than one advert, or to support more involved image checking.
$content = $first . preg_replace_callback($regexp, function($matches){
global $pcount, $advert;
if ( !$advert ) {
$return = $matches[0];
}
else if ( stripos($matches[0], '<img ') !== FALSE ) {
$return = $matches[0];
$pcount = 0;
}
else if ( $pcount === 1 ) {
$return = $advert . $matches[0];
$advert = '';
}
else {
$return = $matches[0];
$pcount++;
}
return $return;
}, $second);
After this code has been executed the $content variable will contain the enhanced HTML.
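One caveat with the code above: `global` only reaches variables in the global scope, so it stops working if the snippet runs inside a function (for example a WordPress filter callback). A minimal sketch of the same logic with the state passed into the closure by reference is shown here. The function name `insert_advert_between_paragraphs` and the `$inserted` flag are assumptions for illustration, not part of the original answer, and it needs PHP 5.3 or later.

```php
<?php
// Hypothetical wrapper around the logic above; no global variables needed.
function insert_advert_between_paragraphs($content, $advert)
{
    // Find the start of the paragraph at or before the mid point.
    $half = (int) floor(strlen($content) / 2);
    $mpoint = strripos($content, '<p', -$half);
    if ($mpoint === false) {
        $mpoint = 0; // no paragraph before the mid point: scan everything
    }
    $first = substr($content, 0, $mpoint);
    $second = substr($content, $mpoint);

    $pcount = 0;       // consecutive image-free paragraphs seen so far
    $inserted = false; // set once the advert has been placed

    $second = preg_replace_callback(
        '/<p>.+?<\/p>/si',
        function ($matches) use (&$pcount, &$inserted, $advert) {
            if ($inserted) {
                return $matches[0];
            }
            if (stripos($matches[0], '<img') !== false) {
                $pcount = 0; // an image resets the run
                return $matches[0];
            }
            if ($pcount === 1) {
                $inserted = true; // second safe paragraph: advert goes before it
                return $advert . $matches[0];
            }
            $pcount++;
            return $matches[0];
        },
        $second
    );

    return $first . $second;
}

echo insert_advert_between_paragraphs(
    '<p>a</p><p>b</p><p><img src="x"></p><p>c</p><p>d</p>',
    '[AD]'
);
```

If no pair of image-free paragraphs exists in the second half, the content comes back unchanged, which mirrors the behaviour of the original snippet.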
PHP versions prior to 5.3
As your chosen testing area does not support PHP 5.3, and so does not support anonymous functions, you need to use a slightly modified and less succinct version that makes use of a named function instead.
Also, in order to support content that may not actually leave space for the advert in its second half, I have modified $mpoint so that it is calculated to be 80% from the end. This will have the effect of including more in the $second part, but it will also mean your adverts will generally be placed higher up in the mark-up. This code has not had any fallback implemented, because your question does not mention what should happen in the event of failure.
$advert = '<marquee>BUY THIS STUFF!!</marquee>' . "\n\n";
$mpoint = floor(strlen($content) * 0.8);
$mpoint = strripos($content, '<p', -$mpoint);
$first = substr($content, 0, $mpoint);
$second = substr($content, $mpoint);
$pcount = 0;
$regexp = '/<p>.+?<\/p>/si';
function replacement_callback($matches){
global $pcount, $advert;
if ( !$advert ) {
$return = $matches[0];
}
else if ( stripos($matches[0], '<img ') !== FALSE ) {
$return = $matches[0];
$pcount = 0;
}
else if ( $pcount === 1 ) {
$return = $advert . $matches[0];
$advert = '';
}
else {
$return = $matches[0];
$pcount++;
}
return $return;
}
echo $first . preg_replace_callback($regexp, 'replacement_callback', $second);

You could try this:
<?php
$ad_code = 'SOME SCRIPT HERE';
// Your code.
$content = apply_filters('the_content', $post->post_content);
// Split the content at the <p> tags.
$content = explode ('<p>', $content);
// Find the mid of the article.
$content_length = count($content);
$content_mid = floor($content_length / 2);
// Save no image p's index.
$last_no_image_p_index = NULL;
// Loop beginning from the mid of the article to search for images.
for ($i = $content_mid; $i < $content_length; $i++) {
// If we do not find an image, let it go down.
if (stripos($content[$i], '<img') === FALSE) {
// In case we already have a last no image p, we check
// if it was the one right before this one, so we have
// two p tags with no images in there.
if ($last_no_image_p_index === ($i - 1)) {
// We break here.
break;
}
else {
$last_no_image_p_index = $i;
}
}
}
// If no image-free p tag was found, we use the last one.
if (is_null($last_no_image_p_index)) {
$last_no_image_p_index = ($content_length - 1);
}
// Add ad code here with trailing </p>, so the implode later will work correctly.
// array_splice() inserts in place; the +1 puts the ad after the chosen paragraph.
array_splice($content, $last_no_image_p_index + 1, 0, $ad_code . '</p>');
$content = implode('<p>', $content);
?>
It will try to find a place for the ad from the mid of your article and if none is found the ad is put to the end.
Regards
func0der

I think this will work:
First explode the paragraphs, then you have to loop it and check if you find img inside them.
If you find it inside, you try the next.
Think of this as pseudo-code, since it's not tested. You will have to write a loop too (see the comments in the code) :) Sorry if it contains bugs, it's written in Notepad.
<?php
$i = 0; // counter
$arrBoolImg = array(); // array for the paragraph booleans
$content = apply_filters('the_content', $post->post_content);
$contents = str_replace ('<p>', '<explode><p>', $content); // here we add a custom tag, so we can explode
$contents = explode ('<explode>', $contents); // then explode it, so we can iterate the paragraphs
// fill array with boolean array returned
$arrBoolImg = hasImages($contents);
$halfway_mark = ceil(count($contents) / 2);
/*
TODO (by you):
---
When you have $arrBoolImg filled, you can iterate through it.
You then simply loop from the middle of the array $contents (plural), that is exploded from above.
The starting point for your loop is the middle, the upper bound is the +2 or whatever :-)
Then you simply insert your magic.. And then glue it back together, as you did before.
I think this will work. even though the code may have some bugs, since I wrote it in Notepad.
*/
function hasImages($contents) {
/*
This function loops through the $contents array and checks whether each paragraph has an image in it.
The return value is an array of booleans (true = contains an image), so one can iterate through it.
*/
$arrRet = array(); // array for the paragraph booleans
foreach ($contents as $v) { // iterate the content
// true if an <img tag was found (case-insensitive), false otherwise
$arrRet[] = (stripos($v, '<img') !== false);
} // end for each loop
return $arrRet; // an empty array when $contents is empty
} // end hasImages func
?>
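The TODO in the answer above can be sketched roughly as follows. This is only one possible reading of it: the helper name `find_ad_slot`, the sample paragraphs and the ad markup are assumptions for illustration, and the boolean array is assumed to be a plain list indexed from 0.

```php
<?php
// Hypothetical helper, not part of the answer above.
// $hasImage[$n] is true when paragraph $n contains an image.
// Returns the index of the first paragraph of the first image-free pair found
// from the halfway mark onwards, or null when there is no such pair.
function find_ad_slot(array $hasImage)
{
    $count = count($hasImage);
    for ($i = (int) ceil($count / 2); $i < $count - 1; $i++) {
        if (!$hasImage[$i] && !$hasImage[$i + 1]) {
            return $i;
        }
    }
    return null;
}

// Sample paragraphs standing in for the exploded post content.
$paragraphs = array(
    '<p>one</p>', '<p>two</p>', '<p><img src="a.png"></p>',
    '<p>four</p>', '<p>five</p>', '<p>six</p>',
);
$hasImage = array();
foreach ($paragraphs as $p) {
    $hasImage[] = (stripos($p, '<img') !== false);
}

$slot = find_ad_slot($hasImage);
if ($slot !== null) {
    // Insert the ad between paragraph $slot and paragraph $slot + 1.
    array_splice($paragraphs, $slot + 1, 0, array('<div class="ad">YOUR ADS CODE</div>'));
}
echo implode("\n", $paragraphs);
```

When `find_ad_slot()` returns null, nothing is inserted; what to do in that case (append the ad, skip it) is left open, as in the original question.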

[This is just an idea, I don't have enough reputation to comment...]
After calling @Olavxxx's method and filling your boolean array, you could loop through that array in an alternating manner, starting in the middle. Let's assume your array is 8 entries long. Calculating the middle using your method, you get 4. So you check the combination of values 4 + 3; if that doesn't work, you check 4 + 5, after that 3 + 2, ...
So your loop looks somewhat like
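$middle = (int) ceil(count($hasImagesArray) / 2);
$i = 1;
while ($i <= $middle) {
// pow(-1, $i) alternates the sign; in PHP, ^ is bitwise XOR, not exponentiation
$j = $middle + pow(-1, $i) * $i;
$k = $j + 1;
// isset() skips positions that fall outside the array
if (isset($hasImagesArray[$j], $hasImagesArray[$k]) && !$hasImagesArray[$j] && !$hasImagesArray[$k])
break; // position found
$i++;
}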
It's up to you to implement further constraints to make sure the ad is not shown too far up or down in the article...
Please note that you also need to take care of special cases such as very short arrays, so that the loop never reads an index outside the array.
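A bounds-checked variant of the alternating search might look like the sketch below. The function name `find_ad_slot_near_middle` is hypothetical, and the search order (4 + 3, then 4 + 5, then 3 + 2 for eight entries) follows the description above rather than the formula in the loop.

```php
<?php
// Hypothetical helper: search outward from the middle for two adjacent
// image-free paragraphs. $hasImage[$n] is true when paragraph $n has an image.
// Returns the index of the earlier paragraph of the pair, or null if none.
function find_ad_slot_near_middle(array $hasImage)
{
    $count = count($hasImage);
    $middle = (int) floor($count / 2);

    for ($distance = 0; $distance < $count; $distance++) {
        // Candidate $j is the later paragraph of the pair ($j - 1, $j).
        // Try the candidate after the middle first, then the one before it.
        foreach (array($middle + $distance, $middle - $distance) as $j) {
            if ($j < 1 || $j >= $count) {
                continue; // the pair would fall outside the array
            }
            if (!$hasImage[$j - 1] && !$hasImage[$j]) {
                return $j - 1;
            }
        }
    }
    return null;
}

// Paragraphs 3 and 4 contain images, so the pair (5, 6) is the closest fit.
var_dump(find_ad_slot_near_middle(
    array(false, false, false, true, true, false, false, false)
));
```

Returning null for arrays that are too short (or that have no clean pair) keeps the special cases in one place; the caller decides on the fallback.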

Related

Facebook-like "show more" button for a string with URLs

I'm trying to have a feature that acts like Facebook's show more behaviour.
I want it to trim the string if:
its length is more than 200 characters.
there are more than 5 \n (newline) occurrences.
It sounds simple and I already have an initial function (that does it only by length, I haven't implemented the \n occurrences yet):
function contentShowMore($string, $max_length) {
if(mb_strlen($string, 'utf-8') <= $max_length) {
return $string; // return the original string if haven't passed $max_length
} else {
$teaser = mb_substr($string, 0, $max_length); // trim to max length
$dots = '<span class="show-more-dots"> ...</span>'; // add dots
$show_more_content = mb_substr($string, $max_length); // get the hidden content
$show_more_wrapper = '<span class="show-more-content">'.$show_more_content.'</span>'; // wrap it
return $teaser.$dots.$show_more_wrapper; // connect all together for usage on HTML.
}
}
The problem is that the string might include URLs, so it breaks them. I need to find a way to make a functional show-more button that checks length, newlines and won't cut URLs.
Thank you!
Example:
input: contentShowMore("hello there http://google.com/ good day!", 20).
output:
hello there http://g
<span class="show-more-dots"> ...</span>
<span class="show-more-content">oogle.com/ good day!</span>
the output i want:
hello there http://google.com/
<span class="show-more-dots"> ...</span>
<span class="show-more-content"> good day!</span>
found a solution!
function contentShowMore($string, $max_length, $max_newlines) {
$trim_str = trim($string);
if(mb_strlen($trim_str, 'utf-8') <= $max_length && substr_count($trim_str, "\n") < $max_newlines) { // return the original if short, or less than X newlines
return $trim_str;
} else {
$teaser = mb_substr($trim_str, 0, $max_length); // text to show
$show_more_content = mb_substr($trim_str, $max_length);
// the cut might have split a word (or worse, a URL) in the middle.
// so we take the rest of the string up to the next space and add it back to the teaser.
$content_parts = explode(' ', $show_more_content, 2); // [0] - before first space, [1] - after first space
$teaser .= $content_parts[0];
// whatever follows the first space stays hidden; if nothing follows, there is nothing left to hide.
$show_more_content = isset($content_parts[1]) ? $content_parts[1] : '';
// now check the maximum number of newlines.
$teaser_parts = explode("\n", $teaser); // break to array.
$teaser = implode("\n", array_slice($teaser_parts, 0, $max_newlines)); // take the first $max_newlines lines and use them as teaser.
$show_more_content = trim(implode("\n", array_slice($teaser_parts, $max_newlines)) . ' ' . $show_more_content); // connect the rest to the hidden content.
if($show_more_content === '') {
return $trim_str; // nothing to hide - return original.
} else {
$show_more_wrapper = '<span class="show-more-content">'.$show_more_content.'</span>';
$dots = '<span class="show-more-dots"> ...</span>'; // dots will be visible between the teaser and the hidden.
$button = ' <span class="show-more">Show more</span>';
return $teaser.$dots.$button.$show_more_wrapper; // connect ingredients
}
}
}

Insert text in content after 300 words but after closing tag of a Paragraph

I am looking for a way to insert an ad or text after X amount of words and after the closing tag of the paragraph the last word appears in.
So far, I have only been able to do this after the X amount of characters. The problem with this approach is that HTML characters are counted which gives inaccurate results.
function chars1($content) {
// only inject google ads if post is longer than 2500 characters
$enable_length1 = 2500;
// insert after the 2100th character
$after_character1 = 2100;
if (is_single() && strlen($content) > $enable_length1) {
$before_content1 = substr($content, 0, $after_character1);
$after_content1 = substr($content, $after_character1);
$after_content1 = explode('</p>', $after_content1);
ob_start();
dynamic_sidebar('single-image-ads-1');
$text1 = ob_get_contents();
ob_end_clean();
array_splice($after_content1, 1, 0, $text1);
$after_content1 = implode('', $after_content1);
return $before_content1 . $after_content1;
} else {
return $content;
}
}
//add filter to WordPress with priority 49
add_filter('the_content', 'chars1',49);
Another approach I have tried is using:
strip_tags($content)
and counted the words using:
str_word_count()
The problem with this is that I have no way of returning the $content with the HTML tags.
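For what it's worth, strip_tags() returns a new string and leaves its argument alone, so the count can be taken from a stripped copy while the original markup is kept for output. A minimal sketch (the helper name `count_visible_words` and the sample markup are made up for illustration):

```php
<?php
// Hypothetical helper: count the words a reader would see, ignoring markup.
function count_visible_words($html)
{
    // strip_tags() works on a copy, so $html keeps all of its tags.
    return str_word_count(strip_tags($html));
}

$content = '<p>Hello <strong>big</strong> world.</p><p>Second paragraph here.</p>';
$words = count_visible_words($content);

// $content still holds the original HTML and can be returned as-is.
echo $words . " words\n";
```

One caveat: text from adjacent elements with nothing between them (for example `one</p><p>two`) runs together once the tags are gone, so the count is approximate.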
Depending on the size of the post, I will insert up to 5 ad units, with the functions I have above I would need to create a function for each ad. If there is a way to insert all 5 ads using one function that would be great.
Any help is appreciated.
Deciding what is a word or not can oftentimes be very hard. But if you're alright with an approximate solution, like defining a word as text between two whitespaces, I suggest you implement a simple function yourself.
This may be achieved by iterating over the characters of the string until the desired number of words (300 in the code below) has been counted and then jumping to the end of the current paragraph. Insert an ad there, then repeat until you've added enough of them.
Implementing this in your function might look like this:
function chars1($content) {
// only inject google ads if post is longer than 2500 characters
$enable_length1 = 2500;
// Insert at the end of the paragraph every 300 words
$after_word1 = 300;
// Maximum of 5 ads
$max_ads = 5;
if (strlen($content) > $enable_length1) {
$len = strlen($content);
$i=0;
// Keep adding until the end of the content or until $max_ads ads have been inserted
while($i<$len && $max_ads-->0) {
// Work our way up to the appropriate length
$word_cout = 0;
$in_tag = false;
while(++$i < $len && $word_cout < $after_word1) {
if(!$in_tag && ctype_space($content[$i])) {
// Whitespace
$word_cout++;
}
else if(!$in_tag && $content[$i] == '<') {
// Begin tag
$in_tag = true;
$word_cout++;
}
else if($in_tag && $content[$i] == '>') {
// End tag
$in_tag = false;
}
}
// Find the next '</p>'
$i = strpos($content, "</p>", $i);
if($i === false) {
// No more paragraph endings
break;
}
else {
// Add the length of </p>
$i += 4;
// Get ad as string
ob_start();
dynamic_sidebar('single-image-ads-1');
$ad = ob_get_contents();
ob_end_clean();
$content = substr($content, 0, $i) . $ad . substr($content, $i);
// Set the correct i
$i+= strlen($ad);
}
}
}
return $content;
}
With this approach, it's easy to add new rules.
I've just had to do this myself. This is how I did it. First explode the content on </p> tags. Loop over the resulting array, put the end </p> back onto the paragraph, do a count on the paragraph with the tags stripped and add it to the global count. Compare the global word count against our word positions. If it's greater, append the content and unset that word position. Stringify and return.
function insert_after_words( $content, $words_positions = array(), $content_to_insert = 'Insert Me' ) {
$total_words_count = 0;
// Explode content on paragraphs.
$content_exploded = explode( '</p>', $content );
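foreach ( $content_exploded as $key => $paragraph ) {
// Skip the empty remainder after the final </p>, so no stray closing tag is added.
if ( trim( $paragraph ) === '' ) {
continue;
}
// Put the paragraph tags back.
$content_exploded[ $key ] .= '</p>';
$total_words_count += str_word_count( strip_tags( $content_exploded[ $key ] ) );
// Check the total word count against the word positioning.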
foreach ( $words_positions as $words_key => $words_count ) {
if ( $total_words_count >= $words_count ) {
$content_exploded[ $key ] .= PHP_EOL . $content_to_insert;
unset( $words_positions[ $words_key ] );
}
}
}
// Stringify content.
return implode( '', $content_exploded );
}

How to truncate string to first n words in PHP

I would like to truncate a very long string, formatted via html elements.
I need the first 500 words (the function must not count HTML tags such as <p> and <br> while truncating the string), but the result has to keep those HTML elements, because it should be formatted like the original, complete text.
What's the best way to truncate my string?
Example:
Original text
> <p>The Huffington Post (via <a
> href="/t/daily-mail">Daily Mail</a>) is reporting that <a
> href="/t/misty">Misty</a> has been returned to a high kill shelter for
> farting too much! She appeared on Greenville County Pet Rescue’s
> “urgent” list, which means if she doesn’t get readopted, she will be
> euthanized!</p>
I need the first n words (n=10)
> <p>The Huffington Post (via <a
> href="/t/daily-mail">Daily Mail</a>) is reporting that.. </p>
A brute force method would be to just split all elements on blanks, then iterate over them. You count only non-tag elements up to a maximum, while you output tags nonetheless. Something along these lines:
$string = "your string here";
$output = "";
$count = 0;
$max = 10;
$tokens = preg_split('/ /', $string);
foreach ($tokens as $token)
{
if (preg_match('/<.*?>/', $token)) {
$output .= "$token ";
} else if ($count < $max) {
$output .= "$token ";
$count += 1;
}
}
print $output;
You could have found something like this with some Googling.
// Original PHP code by Chirp Internet: www.chirp.com.au
// Please acknowledge use of this code by including this header.
function restoreTags($input)
{
$opened = array();
// loop through opened and closed tags in order
if(preg_match_all("/<(\/?[a-z]+)>?/i", $input, $matches)) {
foreach($matches[1] as $tag) {
if(preg_match("/^[a-z]+$/i", $tag, $regs)) {
// a tag has been opened
if(strtolower($regs[0]) != 'br') $opened[] = $regs[0];
} elseif(preg_match("/^\/([a-z]+)$/i", $tag, $regs)) {
// a tag has been closed
$keys = array_keys($opened, $regs[1]); // array_pop() needs a variable, not a function result
if($keys) unset($opened[array_pop($keys)]);
}
}
}
// close tags that are still open
if($opened) {
$tagstoclose = array_reverse($opened);
foreach($tagstoclose as $tag) $input .= "</$tag>";
}
return $input;
}
When you combine it with another function mentioned in the article:
function truncateWords($input, $numwords, $padding="")
{
$output = strtok($input, " \n");
while(--$numwords > 0) $output .= " " . strtok(" \n");
if($output != $input) $output .= $padding;
return $output;
}
Then you can just achieve what you're looking for by doing this:
$originalText = '...'; // some original text in HTML format
$output = truncateWords($originalText, 500); // This truncates to 500 words (ish...)
$output = restoreTags($output); // This fixes any open tags

create dynamic read more when the info is coming from WYSIWYG

So I have a ton of content that is already set up through tinyMCE and I need to somehow grab x number of characters or the first <p> tag, then add a 'read more' link that will show the rest of the content. I have looked online, but I can't seem to find one that would work. And the ones I did find only really work if I strip all the HTML out of the string.
Does anyone have any ideas?
I kinda answered in code and ramblings at first, as this turned out to be more complex than I thought... But here's a more coherent version of what I had said before :)
My proposed solution is to take the substring of characters 0-100 in the original string. That's the dangerous string that might contain unterminated HTML tags.
Then, only if this chopped String is shorter than the input string (there's no point in further processing if this didn't actually cut the string), start magic:
/**
* 1. Loop from char 0 to 100, at each character, check if
* it might be an open tag "<"
* 2. Check if it's a "<TAG>" or "</TAG>"
* 3. Each time "<TAG>" occurs, add to array of
* "tags to close" Strings
* 4. Each time "</TAG>" occurs, pop the last entered "<TAG>"
* from the array
* 5. Once all the string is examined, foreach "tags to close",
* write "</".TAG.">"
**/
working example on this page: Concatenate a string, and terminate html within
Possible problems that are only partially resolved (see if(!$endloop){ section of track_tag()):
The last HTML tag is chopped by initial substr() method. For example, string ends with: <di or <div or </h, etc...
The input HTML string is not valid HTML, e.g. <div><h1>invalid html</div></h1> or </p><p>what's with the random </p> before this <p>, man?</p>
Unfortunately, that opens a whole can of worms I won't get into, and you are likely to see them in a WYSIWYG-generated string...
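For comparison, the five steps listed above can also be sketched with a regular expression instead of a character-by-character loop. This is a simplified illustration, not the class from this answer: the function name `truncate_and_close_tags` and the list of void elements are assumptions, it cuts by bytes (so multibyte text may be split), and it does not try to repair invalid HTML.

```php
<?php
// Hypothetical helper: cut $html to $length bytes and close any tags left open.
function truncate_and_close_tags($html, $length = 100, $end = '...')
{
    if (strlen($html) <= $length) {
        return $html; // nothing was cut, so no further processing is needed
    }
    $summary = substr($html, 0, $length);
    // Drop a tag that the cut left unfinished, e.g. "<di" or "</h".
    $summary = preg_replace('/<[^>]*$/', '', $summary);

    $void = array('br', 'hr', 'img', 'input', 'meta', 'link');
    $open = array();
    preg_match_all('/<(\/?)([a-z][a-z0-9]*)\b[^>]*>/i', $summary, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $tag = strtolower($m[2]);
        if (in_array($tag, $void, true)) {
            continue; // void elements never need a closing tag
        }
        if ($m[1] === '/') {
            array_pop($open); // step 4: a closing tag pops the last opened tag
        } else {
            $open[] = $tag;   // step 3: remember tags that still need closing
        }
    }

    // Step 5: close what is still open, innermost first.
    $closing = '';
    foreach (array_reverse($open) as $tag) {
        $closing .= '</' . $tag . '>';
    }
    return $summary . $end . $closing;
}

echo truncate_and_close_tags('<p>Hello <b>world</b> and more text</p>', 14);
```

Like the class below, this assumes the input is reasonably well nested; a closing tag simply pops whatever was opened last.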
<?php
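// $input (the WYSIWYG HTML) and $length (the summary length) are assumed to be set already.
$killer = new tagEndingConcatMachine ();
$killer->summaryLength = $length;
$killer->end = "...";
echo $killer->chop_and_append($input)."Read more"; // the result has to be echoed (or stored) to be of any use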
class tagEndingConcatMachine {
public $end = '...';
public $summaryLength = 100;
private $tags_to_end = array();
public function chop_and_append($x){
$summary = substr($x, 0 ,$this->summaryLength);
if($summary !== $x){
$this->end_tags($summary);
return $summary . $this->end;
}
return $summary;
}
private function end_tags(&$summary){
for($i = 0; $i<strlen($summary); $i++){
if($summary[$i]=='<'){
$this->track_tag($summary, $i);
}
}
for($i = count($this->tags_to_end); $i>=0; $i--){
if($this->tags_to_end != '' && isset($this->tags_to_end[$i]))
$this->end .= '</'.$this->tags_to_end[$i].">";
}
}
private function track_tag(&$summary, $i){
$this_tag = '';
$endloop = false;
$ending = false;
$k = $i+1;
// stop at the end of $summary, so we never read past the last character
while($k<strlen($summary) && !$endloop){
$thischar = $summary[$k];
if($thischar=='/' && $summary[$k-1]== '<'){
$ending = true;
}elseif($thischar=='>'){
if($this_tag!=''){
if($ending)
array_pop($this->tags_to_end);
else
$this->tags_to_end[] = strtolower($this_tag);
}
$endloop = true;
}else{
$this_tag .= $thischar;
}
$k++;
}
if(!$endloop){
if($ending){
//opened end tag, never closed
//could be trouble... but tags_to_end knows which to close
$this->end = '>'.$this->end;
}else{
//open opening tag... remove it from the end of the summary
$summary = substr($summary, 0, strlen($summary)-strlen($this_tag)-1);
}
}
}
}
?>
Let's say $x is the content you received from the WYSIWYG.
echo substr($x, 0, 100) . '... Read More';
This will echo out the first 100 chars of $x and concatenate the read more link to the end.
Hope this helps, also check this out.

PHP: Display the first 500 characters of HTML

I have a huge HTML code in a PHP variable like :
$html_code = '<div class="contianer" style="text-align:center;">The Sameple text.</div><br><span>Another sample text.</span>....';
I want to display only the first 500 characters of this code. The character count must consider only the text inside the HTML tags and exclude the HTML tags and attributes themselves when measuring the length.
But while trimming the code, it should not affect the DOM structure of the HTML code.
Is there any tutorial or working example available?
If it's the text you want, you can also do this with the following:
substr(strip_tags($html_code),0,500);
Ooohh... I know this. I can't get it exactly off the top of my head, but you want to load the text you've got as a DOMDocument
http://www.php.net/manual/en/class.domdocument.php
then grab the text from the entire document node (as a DOMnode http://www.php.net/manual/en/class.domnode.php)
This won't be exactly right, but hopefully this will steer you onto the right track.
Try something like:
$html_code = '<div class="contianer" style="text-align:center;">The Sameple text.</div><br><span>Another sample text.</span>....';
$dom = new DOMDocument();
$dom->loadHTML($html_code);
$text_to_strip = $dom->textContent;
$stripped = mb_substr($text_to_strip,0,500);
echo "$stripped"; // The Sameple text.Another sample text.....
edit ok... that should work. just tested locally
edit2
Now that I understand you want to keep the tags, but limit the text, lets see. You're going to want to loop the content until you get to 500 characters. This is probably going to take a few edits and passes for me to get right, but hopefully I can help. (sorry I can't give undivided attention)
First case is when the text is less than 500 characters. Nothing to worry about. Starting with the above code we can do the following.
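if (mb_strlen($text_to_strip) > 500) {
// this is where we do our work.
$characters_so_far = 0;
// loadHTML() wraps the fragment in <html><body>, so walk the children of <body>.
$body = $dom->getElementsByTagName('body')->item(0);
// copy the live node list first, so removing a node does not skip its sibling
foreach (iterator_to_array($body->childNodes) as $ChildNode) {
// should check if $ChildNode->hasChildNodes();
// probably put some of this stuff into a function
$characters_in_next_node = mb_strlen($ChildNode->textContent);
if ($characters_so_far + $characters_in_next_node > 500) {
// remove the node that would take us past the limit
$ChildNode->parentNode->removeChild($ChildNode);
}
$characters_so_far += $characters_in_next_node;
}
// note: saveHTML() returns the whole document, including the added <html><body> wrapper
$final_out = $dom->saveHTML();
} else {
$final_out = $html_code;
}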
i'm pasting below a php class i wrote a long time ago, but i know it works. its not exactly what you're after, as it deals with words instead of a character count, but i figure its pretty close and someone might find it useful.
class HtmlWordManipulator
{
var $stack = array();
function truncate($text, $num=50)
{
if (preg_match_all('/\s+/', $text, $junk) <= $num) return $text;
$text = preg_replace_callback('/(<\/?[^>]+\s+[^>]*>)/', array($this, '_truncateProtect'), $text);
$words = 0;
$out = array();
$text = str_replace('<',' <',str_replace('>','> ',$text));
$toks = preg_split('/\s+/', $text);
foreach ($toks as $tok)
{
if (preg_match_all('/<(\/?[^\x01>]+)([^>]*)>/',$tok,$matches,PREG_SET_ORDER))
foreach ($matches as $tag) $this->_recordTag($tag[1], $tag[2]);
$out[] = trim($tok);
if (! preg_match('/^(<[^>]+>)+$/', $tok))
{
if (!strpos($tok,'=') && !strpos($tok,'<') && strlen(trim(strip_tags($tok))) > 0)
{
++$words;
}
else
{
/*
echo '<hr />';
echo htmlentities('failed: '.$tok).'<br /)>';
echo htmlentities('has equals: '.strpos($tok,'=')).'<br />';
echo htmlentities('has greater than: '.strpos($tok,'<')).'<br />';
echo htmlentities('strip tags: '.strip_tags($tok)).'<br />';
echo str_word_count($text);
*/
}
}
if ($words > $num) break;
}
$truncate = $this->_truncateRestore(implode(' ', $out));
return $truncate;
}
function restoreTags($text)
{
foreach ($this->stack as $tag) $text .= "</$tag>";
return $text;
}
private function _truncateProtect($match)
{
return preg_replace('/\s/', "\x01", $match[0]);
}
private function _truncateRestore($strings)
{
return preg_replace('/\x01/', ' ', $strings);
}
private function _recordTag($tag, $args)
{
// XHTML
if (strlen($args) and $args[strlen($args) - 1] == '/') return;
else if ($tag[0] == '/')
{
$tag = substr($tag, 1);
for ($i=count($this->stack) -1; $i >= 0; $i--) {
if ($this->stack[$i] == $tag) {
array_splice($this->stack, $i, 1);
return;
}
}
return;
}
else if (in_array($tag, array('p', 'li', 'ul', 'ol', 'div', 'span', 'a')))
$this->stack[] = $tag;
else return;
}
}
truncate is what you want, and you pass it the html and the number of words you want it trimmed down to. it ignores html while counting words, but then rewraps everything in html, even closing trailing tags due to the truncation.
please don't judge me on the complete lack of oop principles. i was young and stupid.
edit:
so it turns out the usage is more like this:
$content = $manipulator->restoreTags($manipulator->truncate($myHtml,$numOfWords));
stupid design decision. allowed me to inject html inside the unclosed tags though.
I'm not up to coding a real solution, but if someone wants to, here's what I'd do (in pseudo-PHP):
$html_code = '<div class="contianer" style="text-align:center;">The Sameple text.</div><br><span>Another sample text.</span>....';
$aggregate = '';
$document = XMLParser($html_code);
foreach ($document->getElementsByTagName('*') as $element) {
$aggregate .= $element->text(); // This is the text, not HTML. It doesn't
// include the children, only the text
// directly in the tag.
}
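The pseudo-PHP idea above could be made runnable along these lines with DOMDocument and DOMXPath. This is only a sketch under some assumptions: the function name `truncate_html_text` is made up, the dom and mbstring extensions are assumed to be loaded, the input is assumed to be ASCII or otherwise correctly decoded by loadHTML(), and elements left empty by the cut are kept rather than pruned.

```php
<?php
// Hypothetical helper: keep the first $limit characters of visible text and
// leave the surrounding tags in place.
function truncate_html_text($html, $limit = 500)
{
    $previous = libxml_use_internal_errors(true); // silence parser warnings
    $dom = new DOMDocument();
    // Wrap the fragment in a known container so it can be read back later.
    $dom->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $root = $dom->getElementsByTagName('div')->item(0);
    $xpath = new DOMXPath($dom);
    // Copy the text nodes into an array first, so removing nodes is safe.
    $textNodes = iterator_to_array($xpath->query('.//text()', $root));

    $remaining = $limit;
    foreach ($textNodes as $textNode) {
        $length = mb_strlen($textNode->data, 'UTF-8');
        if ($remaining <= 0) {
            // Budget used up: drop the text but keep the element around it.
            $textNode->parentNode->removeChild($textNode);
        } elseif ($length > $remaining) {
            // This node crosses the limit: keep only the part that fits.
            $textNode->data = mb_substr($textNode->data, 0, $remaining, 'UTF-8');
            $remaining = 0;
        } else {
            $remaining -= $length;
        }
    }

    // Serialise only the children of the wrapper, not the whole document.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo truncate_html_text(
    '<div class="a">The sample text.</div><br><span>Another sample text.</span>',
    10
);
```

Because the text is cut inside the DOM, every element that was open at the cut point is still closed properly when the fragment is serialised. The serialiser may add line breaks between block elements, so the output is best compared after normalising whitespace.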
