Cutting text in between destroys the design - php

I'm currently having an issue with my website. I take roughly the first 150 words of a post and display them as intro text on my website, but cutting the text this way causes a problem.
When we have something like this in the text:
<div>
////TEXT TEXT TEXT TEXT TEXT////
----> Reach 150 words here <------
////TEXT TEXT TEXT TEXT TEXT////
</div>
It will print this on the front page:
<div>
////TEXT TEXT TEXT TEXT TEXT////
----> Reach 150 words here <------
and the unclosed <div> tag breaks the design, as you would expect.
How can I overcome this issue? Can we process unclosed tags and close them at the end?

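Use PHP's strip_tags to remove the div from your copy, then add it back in afterwards.
For example:
<?php
$html = '<div>Content goes here</div>';
$stripped = strip_tags($html);
// Find the first space at or after character 150, so no word is cut in half.
// Note that the offset counts characters, not words.
$excerpt_pos = strlen($stripped) > 150 ? strpos($stripped, ' ', 150) : false;
// Short texts, or texts with no later space, are shown in full.
$excerpt = $excerpt_pos === false ? $stripped : substr($stripped, 0, $excerpt_pos);
?>
<div><?php echo $excerpt; ?></div>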
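Since the question counts words rather than characters, a word-based cut may be closer to what is wanted. The following is only a minimal sketch, not the answerer's code: excerpt_words is a hypothetical helper name, and it assumes the markup can simply be dropped from the intro text.

```php
<?php
// Hypothetical helper (not from the original answer): returns the first
// $limit words of the plain text contained in $html.
function excerpt_words(string $html, int $limit = 150): string
{
    // Remove all tags first, so the cut can never leave a tag unclosed.
    $text = trim(strip_tags($html));
    if ($text === '') {
        return '';
    }
    // Split on runs of whitespace to get the individual words.
    $words = preg_split('/\s+/', $text);
    if (count($words) <= $limit) {
        return implode(' ', $words);
    }
    // Keep only the first $limit words and mark the cut.
    return implode(' ', array_slice($words, 0, $limit)) . '...';
}
```

It could then be used in the template as `<div><?php echo excerpt_words($html); ?></div>`, with the wrapping div written by hand so it is always closed.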

Related

PHP Strip all content around text

I have text that looks like this or a billion variant of this, for example:
<div>content goes here... </div><div style="some style..."><span style="some styles..."><strong>[END_CONTACT]</strong></span></div><div>content goes here... </div>
<div>content goes here... </div><div style="other style..."><span style="other styles..."><strong>[END_CONTACT]</strong></span></div><div>content goes here... </div>
<div>content goes here... </div><div style="random stuff..."><span style="random stuff..."><strong>[END_CONTACT]</strong></span></div><div>content goes here... </div>
and a billion variations of this...
I want to be able to remove any variation of the markup surrounding [END_CONTACT] so that all I am left with is this:
<div>content goes here... </div><div>[END_CONTACT]</div><div>content goes here... </div>
How do I strip the content between the opening div tag and [END_CONTACT] and the content between [END_CONTACT] and the ending div tag?
Thanks
Use regular expressions! The following example using preg_replace will work as long as your content doesn't contain angle brackets, which you should not put in HTML.
$result = preg_replace('#<div\b[^>]*><span\b[^>]*><strong\b[^>]*>([^<]*)</strong></span></div>#i', '<div>$1</div>', $html);
How do I strip the content between the opening div tag and [END_CONTACT] and the content between [END_CONTACT] and ending div tag?
If the terms [END_CONTACT] and the <div> tag are always present, you can use PCRE REGEX in preg_replace():
$string = preg_replace('/<div[^>]*>.*\[END_CONTACT\].*<\/div>/i','<div>[END_CONTACT]</div>',$string);
Example:
$data = [];
$data[] = 'some text <div style="some style..."><span style="some styles..."><strong>[END_CONTACT]</strong></span></div>';
$data[] = 'something else etc.<div style="other style..."><span style="other styles..."><strong>[END_CONTACT]</strong></span></div>';
$data[] = '<div style="random stuff..."><span style="random stuff..."><strong>[END_CONTACT]</strong></span></div>';
$data[] = 'and a billion variations of this...';
foreach ($data as $row){
$string = preg_replace('/<div[^>]*>.*\[END_CONTACT\].*<\/div>/i','<div>[END_CONTACT]</div>',$row);
print $string."<BR>";
}
Output:
<div>[END_CONTACT]</div>
<div>[END_CONTACT]</div>
<div>[END_CONTACT]</div>
and a billion variations of this...
UPDATE:
Sorry, wasn't clear about that in my original post. Is there any way to keep text or code outside of the string in question but still do the operation as you've suggested?
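Try this regex in the above PHP code. It refuses to cross another <div> or </div> on either side of [END_CONTACT], so content outside the matched div is left alone:
<div\b[^>]*>(?:(?!<\/?div\b).)*\[END_CONTACT\](?:(?!<\/?div\b).)*<\/div>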
Example:
content content content... <div style="random stuff..."><span style="random stuff..."><strong>[END_CONTACT]</strong></span></div> content content content
Output:
content content content... <div>[END_CONTACT]</div> content content content
NOTE:
It must be stated that you should use a DOM parser to work with HTML elements in complex compositions rather than Regex.
I have tested my answer and it does what is desired. And as stated above, what you should be using to deal with multilayered complex HTML is a proper PHP DOM Parser.
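The DOM parser route recommended in the note could look roughly like the sketch below. It is an illustration under stated assumptions rather than a tested drop-in: simplify_end_contact and the wrapper id "wrap" are hypothetical names, the input is assumed to be an ASCII HTML fragment, and the marker is assumed to appear as literal text inside a div.

```php
<?php
// Hypothetical helper: replaces every innermost <div> that contains the
// [END_CONTACT] marker with a bare <div>[END_CONTACT]</div>.
function simplify_end_contact(string $html): string
{
    $previous = libxml_use_internal_errors(true); // keep parser warnings quiet
    $doc = new DOMDocument();
    // Wrap the fragment in a known root so it can be extracted again later.
    $doc->loadHTML('<div id="wrap">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $marker = 'contains(., "[END_CONTACT]")';
    // Divs inside the wrapper that contain the marker but have no
    // descendant div that also contains it (that is, the innermost ones).
    $query = '//div[@id="wrap"]//div[' . $marker . ' and not(.//div[' . $marker . '])]';
    foreach (iterator_to_array($xpath->query($query)) as $div) {
        $clean = $doc->createElement('div', '[END_CONTACT]');
        $div->parentNode->replaceChild($clean, $div);
    }

    // Serialize only the children of the wrapper, not the wrapper itself.
    $out = '';
    $wrap = $xpath->query('//div[@id="wrap"]')->item(0);
    foreach ($wrap->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

The serializer may insert line breaks between adjacent block elements, so the result can differ from the input in whitespace between tags even where nothing was replaced.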

Wordpress: preg_replace inside a loop only works occasionally

I'm trying to make a custom RSS feed with some alteration to the HTML content of each post.
Inside the template file rss-custom.php I have this:
<?php while (have_posts()) : the_post(); ?>
<?php echo processPostContent(); ?>
<?php endwhile; ?>
In functions.php, there are three replacements, as follows:
function processPostContent() {
$post = get_post(get_the_ID());
$post_content = strval($post->post_content);
// replace h3 and h4 tags with h2
$post_content = preg_replace('/<(\/?)h((?![12])\d)/im', "<$1h2", $post_content);
// strip every attribute of <img> other than src
$post_content = preg_replace('/<img[^>]*(src="[^"]*")[^>]*>/im', "<img $1 />", $post_content);
// insert text after some closing tags
$post_content = preg_replace('/<\/(h2|p|figure)>/im', "</$1><p>Inserted</p>", $post_content);
return $post_content;
}
Then I get a strange result: out of 20 posts, only 7-8 of them will have been fully replaced. The remaining get the first two replacements but not the third one. Does anyone know why that is?
The solution, it turns out, has nothing to do with the loop or with preg_replace. Some posts' contents do not include any HTML tags, only plain text. That's why preg_replace didn't have any effect on them. When those contents are rendered in the RSS feed, however, <p> tags are automatically inserted. That's what led me to believe the third replacement was skipped.
First paragraph.
Second paragraph.
is turned into
<p>First paragraph.</p>
<p>Second paragraph.</p>
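One way to confirm this kind of diagnosis is to ask preg_replace how many replacements it made. The sketch below isolates the third replacement and runs it on two made-up sample strings (both are assumptions for illustration, not real post data):

```php
<?php
// The third replacement from processPostContent(), isolated.
$pattern = '/<\/(h2|p|figure)>/i';
$replacement = '</$1><p>Inserted</p>';

// Sample inputs (made up): one with markup, one plain text only.
$with_tags = '<p>First paragraph.</p>';
$plain_text = "First paragraph.\n\nSecond paragraph.";

// The fifth argument receives the number of replacements performed.
$tagged_result = preg_replace($pattern, $replacement, $with_tags, -1, $tagged_count);
$plain_result = preg_replace($pattern, $replacement, $plain_text, -1, $plain_count);

printf("tagged: %d replacement(s), plain: %d replacement(s)\n", $tagged_count, $plain_count);
```

The plain-text input comes back unchanged with a count of zero. If the paragraphs in the feed are added by WordPress's wpautop() filter (an assumption; the post does not say which step adds them), running the content through it before the replacements would be one way to make all posts behave alike.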

putting paragraph tags on content paragraphs

I'm trying to put paragraph tags around the paragraphs. This code pulls the blockquotes out of my WordPress post and outputs everything else:
<?php
$block2 = get_the_content();
$block2 = preg_replace('~<blockquote>([\s\S]+?)</blockquote>~', '', $block2);
echo '<p>'.$block2.'</p>';
?>
But it only puts <p> tags around the first paragraph and not the others.
If I've understood this correctly, you could try splitting $block2 by newlines, looping through the resulting array and wrapping each element of the array in <p> tags as you have done.
Currently, your code wraps the entire content of $block2 in <p> tags, where I assume you wanted it to wrap the sections separated by newlines.
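Example:
// Split on line breaks (\R matches \n, \r\n and \r).
$split_block = preg_split('/\R+/', trim($block2));
foreach ($split_block as $i => $paragraph) {
    $split_block[$i] = '<p>'.$paragraph.'</p>';
}
// An array cannot be echoed directly, so join the pieces first.
echo implode("\n", $split_block);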
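Putting the blockquote removal from the question and the per-paragraph wrapping together, a self-contained version might look like this. wrap_paragraphs is a hypothetical helper name, and the sketch assumes paragraphs are separated by line breaks in the raw content.

```php
<?php
// Hypothetical helper: strips blockquotes, then wraps each remaining
// non-empty paragraph in <p> tags.
function wrap_paragraphs(string $content): string
{
    // Drop every <blockquote>...</blockquote>, as in the question.
    $content = preg_replace('~<blockquote>[\s\S]+?</blockquote>~', '', $content);
    // Split on line breaks (\R matches \n, \r\n and \r).
    $chunks = preg_split('/\R+/', trim($content), -1, PREG_SPLIT_NO_EMPTY);
    $paragraphs = [];
    foreach ($chunks as $chunk) {
        $chunk = trim($chunk);
        if ($chunk === '') {
            continue; // skip lines that held only spaces
        }
        $paragraphs[] = '<p>' . $chunk . '</p>';
    }
    return implode("\n", $paragraphs);
}
```

In the template this would replace the echo line, for example `echo wrap_paragraphs(get_the_content());`.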

HTML manipulation: Match the first X number of HTML tags and move them

Let's say I have code like this:
<img src="001">
<img src="002">
<p>Some content here.</p>
<img src="003">
What I want to do now is to match the first two images (001 and 002) and store that part of the code in a variable. I don't want to do anything with the third image.
I used something like preg_match_all('/<img .*>/', $result); but it obviously matched all the images, not just those that appear at the top of the code. How can I modify that regular expression to select just the images at the top of the code?
Here is what I want to achieve: I have an <h2> tag with the title in one variable and the code above in a second one. I want to move the first X images before the <h2> tag, or insert the <h2> tag after the first X images. All of that in back-end PHP. It would be fun to do it with CSS, but flexbox is not here yet.
You need to divide the problem to solve it. You have got two main parts here:
Division of the HTML into Top and Bottom parts.
Doing the DOMDocument manipulation on (both?) HTML strings.
Let's just do that:
The first part is actually quite simple. Let's say all line separators are "\n" and the empty line is actually an empty line "\n\n". Then this is a simple string operation:
list($top, $bottom) = explode("\n\n", $html, 2);
This already solves the first part. The top HTML is in $top, and the rest, which we do not need to care much about, is stored in $bottom.
Let's go on with the second part.
With simple DOMDocument operations you can now for example get a list of all images:
$topDoc = new DOMDocument();
$topDoc->loadHTML($top);
$topImages = $topDoc->getElementsByTagname('img');
The only thing you need to do now is to remove each image from its parent:
$image->parentNode->removeChild($image);
And then insert it before the <h2> element:
$anchor = $topDoc->getElementsByTagName('h2')->item(0);
$anchor->parentNode->insertBefore($image, $anchor);
And you're fine. Full code example:
$html = <<<HTML
<h2>Title here</h2>
<img src="001">
<p>Some content here. (for testing purposes)</p>
<img src="002">

<h2>Second Title here (for testing purposes)</h2>
<p>Some content here.</p>
<img src="003">
HTML;
list($top, $bottom) = explode("\n\n", $html, 2);
$topDoc = new DOMDocument();
$topDoc->loadHTML($top);
$topImages = $topDoc->getElementsByTagname('img');
$anchor = $topDoc->getElementsByTagName('h2')->item(0);
foreach($topImages as $image) {
$image->parentNode->removeChild($image);
$anchor->parentNode->insertBefore($image, $anchor);
}
foreach($topDoc->getElementsByTagName('body')->item(0)->childNodes as $child)
echo $topDoc->saveHTML($child);
echo $bottom;
Output:
<img src="001"><img src="002"><h2>Title here</h2>
<p>Some content here. (for testing purposes)</p>
<h2>Second Title here (for testing purposes)</h2>
<p>Some content here.</p>
<img src="003">
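For the narrower case in the question, where the images sit at the very start of the string and the title lives in a separate variable, anchoring the pattern at the start may be enough. This is a hedged sketch, not part of the answer above: both function names are hypothetical, and it assumes nothing but whitespace appears between the leading images.

```php
<?php
// Hypothetical helper: returns the run of <img> tags at the very start
// of $html (whitespace between them allowed), or '' if there is none.
function leading_images(string $html): string
{
    // ^ anchors at the start, so images further down are never matched.
    return preg_match('/^(?:\s*<img\b[^>]*>)+/i', $html, $m) ? $m[0] : '';
}

// Hypothetical helper: inserts $title right after the leading images.
function insert_title_after_leading_images(string $html, string $title): string
{
    $images = leading_images($html);
    if ($images === '') {
        return $title . "\n" . $html; // no leading images: title goes first
    }
    return $images . "\n" . $title . substr($html, strlen($images));
}
```

With the markup from the question, leading_images() picks up images 001 and 002 and leaves 003 where it is.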

Trim leading white space with PHP?

There seems to be a bug in a Wordpress PHP function that leaves whitespace in front of the title of the page generated by <?php echo wp_title(''); ?> I've been through the Wordpress docs and forums on that function without any luck.
I'm using it this way <body id="<?php echo wp_title(''); ?>"> in order to generate an HTML body tag with the id of the page title.
So what I need to do is strip that white space, so that the body tag looks like this <body id="mypage"> instead of this <body id=" mypage">
The extra white space kills the CSS I'm trying to use to highlight menu items of the active page. When I manually add a correct body tag without the white space, my CSS works.
So how would I strip the white space? Thanks, Mark
Part Two of the Epic
John, a hex dump was a good idea; it shows the white space as two "20" spaces. But none of the solutions that strip leading spaces and white space worked.
And, <?php ob_start(); $title = wp_title(''); ob_end_clean(); echo $title; ?>
gives me <body id="">
and <?php ob_start(); $title = wp_title(''); echo $title; ?>
gives me <body id=" mypage">
Puzzle. The root of the problem is that wp_title has optional page title leading characters - that look like chevrons - that are supposed to be dropped when the option is false, and they are, but white space gets dumped in.
Is there a nuclear option?
Yup, tried them both before; they still return two leading spaces... arrgg
Strip all whitespace from the left end of the title:
<?php echo ltrim(wp_title('')); ?>
Strip all whitespace from either end:
<?php echo trim(wp_title('')); ?>
Strip all spaces from the left end of the title:
<?php echo ltrim(wp_title(''), ' '); ?>
Remove the first space, even if it's not the first character:
<?php echo preg_replace('/ /', '', wp_title(''), 1); ?>
Strip only a single space (not newline, not tab) at the beginning:
<?php echo preg_replace('/^ /', '', wp_title('')); ?>
Strip the first character, whatever it is:
<?php echo substr(wp_title(''), 1); ?>
Update
From the Wordpress documentation on wp_title, it appears that wp_title displays the title itself unless you pass false for the second parameter, in which case it returns it. So try:
<?php echo trim(wp_title('', false)); ?>
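A function that prints instead of returning would also explain the empty id in the output-buffering attempt from the question: the assignment receives nothing, and ob_end_clean() throws the buffered text away. If a printing function ever has to be captured, the buffer contents need to be read before the buffer is closed. The sketch below shows that pattern with print_title, a hypothetical stand-in that is not the WordPress function.

```php
<?php
// Hypothetical stand-in: prints (does not return) a title with two
// leading spaces, similar to the behaviour described above.
function print_title(): void
{
    echo '  mypage';
}

// Runs $printer, captures whatever it echoes, and returns it trimmed.
function capture_trimmed(callable $printer): string
{
    ob_start();
    $printer();
    // ob_get_clean() returns the buffer contents and ends buffering.
    return trim((string) ob_get_clean());
}
```

For example, `<body id="<?php echo capture_trimmed('print_title'); ?>">` would print the id without the leading spaces. Passing false as the second parameter, as shown above, remains the simpler choice when the function supports it.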
ltrim()
ltrim($str)
Just to throw in some variety here: trim
<body id="<?=trim(wp_title('', false));?>">
Thanks for this info! I was in the same boat in that I needed to generate page ids for CSS purposes based on the page title and the above solution worked beautifully.
I ended up having an additional hurdle in that some pages have titles with embedded spaces, so I ended up coding this:
<?php echo str_replace(' ','-',trim(wp_title('',false))); ?>
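Titles can also contain capitals, punctuation, or repeated spaces, which are awkward in an id used from CSS. A slightly stricter variant is sketched below; title_to_id is a hypothetical helper name, and the rule of keeping only lowercase ASCII letters and digits is an assumption that may be too strict for non-English titles.

```php
<?php
// Hypothetical helper: turns a page title into a lowercase,
// hyphen-separated string suitable for use as an HTML id.
function title_to_id(string $title): string
{
    $id = strtolower(trim($title));
    // Collapse every run of characters other than a-z and 0-9 into one hyphen.
    $id = preg_replace('/[^a-z0-9]+/', '-', $id);
    // Remove hyphens left over at either end.
    return trim($id, '-');
}
```

It would be used as `<body id="<?php echo title_to_id(wp_title('', false)); ?>">`.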
Add this to your functions.php:
add_filter('wp_title', function ($a, $b) { return str_replace(" $b ", "", $a); }, 10, 2);
It should work like a charm. (An anonymous function is used because create_function() was removed in PHP 8.)
