I'm trying to scrape a product description from a div, selected by its id. If the description is too long, the page shows a "Show More" link. The div ids are short-description and long-description.
My code works well for short-description, but long-description appears to be invisible to the DOM parser. How can I toggle "Show More" and then read long-description?
function my_function($str){
$url = $str;
// break apart ugly affiliate URL
list($chuck, $keep) = explode('=', $url);
// call simple DOM library
require_once 'libs/simple_html_dom.php';
// clean up unencoded URL
$keep = preg_replace("/%u([0-9a-f]{3,4})/i","&#x\\1;",urldecode($keep));
// Get HTML from clean URL
$html = file_get_html($keep);
//find div element
foreach($html->find('*/div[id="short-description"]') as $element){
echo $element;
}
}
Content I'm trying to get.
<div id='short-description'>
Eos inciderint interpretaris ea. Eam ut legere aperiri qualisque.
In propriae perfecto gubergren vix. Omnesque perpetua id duo,
no mel habeo appetere persecuti. Dico phaedrum qui ad, discere euripidis
delicatissimi vim cu.....
<a class='more'>Show More</a>
</div>
<div id='long-description' style='display: none;'>
Eos inciderint interpretaris ea. Eam ut legere aperiri
qualisque. In propriae perfecto gubergren vix. Omnesque
perpetua id duo, no mel habeo appetere persecuti. Dico phaedrum
qui ad, discere euripidis delicatissimi vim cu.Eos inciderint
interpretaris ea. Eam ut legere aperiri qualisque.
<a class='less'>Show Less</a>
</div>
</div>
Update: Solved
The code was working the entire time. I hadn't realized that the raw output still had its tags attached, so the echoed div included
style='display: none;'
and the browser hid the result. I changed the echo to
echo strip_tags($element);
and then I could see the output.
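For anyone landing here with the same symptom: display: none is only a styling hint for browsers, so a server-side parser still sees the element. Below is a minimal sketch of reading the hidden div as plain text. It swaps in PHP's built-in DOMDocument for simple_html_dom so that the snippet is self-contained, and the markup is made up to resemble the question's.

```php
<?php
// Hypothetical markup modelled on the question; only the id values come from the post.
$html = <<<HTML
<div>
<div id='short-description'>Short text... <a class='more'>Show More</a></div>
<div id='long-description' style='display: none;'>Full text of the description. <a class='less'>Show Less</a></div>
</div>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$node = $xpath->query('//div[@id="long-description"]')->item(0);

$longText = '';
if ($node !== null) {
    // Drop the "Show Less" link so that only the description text remains.
    foreach (iterator_to_array($xpath->query('.//a[@class="less"]', $node)) as $link) {
        $link->parentNode->removeChild($link);
    }
    // textContent returns the text without any tags, much like strip_tags().
    $longText = trim($node->textContent);
}
echo $longText, PHP_EOL;
```

No "toggle" step is needed here, because the hidden text is already present in the downloaded HTML.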
I am working on a WordPress site and want to edit a PHP file. I want the text to be displayed with line breaks rather than all on one line.
Here is my code (I want "john" on one line and "travolta" on the next, but both are displayed on one line):
<div class="slide">
<img class="animated fade_left" src='<?php echo esc_url(onepage_get_option('onepage_testimonial_2_image', ONEPAGE_DIR_URI . "assets/images/team2.jpg")); ?>' onmouseover="javascript: this.title = '';" title="">
<div class="bx-caption animated fade_right"><span><a class="arrow"></a><?php echo esc_attr(onepage_get_option('onepage_testimonial_2_content', __('Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam.','one-page'))); ?><a class="testimonial"><?php echo esc_attr(onepage_get_option('onepage_testimonial_2_name', __('john \n travolta','one-page'))); ?></a></span></div>
Any suggestions?
If you want the link / element with the class testimonial on a new line, I would use CSS, as that keeps it flexible and easy to change if you want to do it differently in the future.
So in your CSS file:
.testimonial {
display: block;
}
In general, I would try to keep presentational stuff out of the php code.
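As a side note on why the \n in the question has no visible effect, here is a small sketch (plain PHP, no WordPress functions). The variable names are made up for illustration.

```php
<?php
// Single quotes keep \n as two literal characters; double quotes produce a real newline.
$single = 'john \n travolta';
$double = "john\ntravolta";

// nl2br() inserts <br /> in front of real newlines only.
echo nl2br($single), PHP_EOL; // no <br /> is added
echo nl2br($double), PHP_EOL; // a <br /> is added between the two names
```

Even a real newline is collapsed to a space by the browser unless it is turned into markup or styled. In the template from the question the value also passes through esc_attr(), which escapes markup, so a <br /> would have to be added after escaping. The CSS rule above avoids both problems.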
I have JSON data that looks like this:
{
"project_no":1693,
"project_name":"Theresa Project",
"description":"Nonumy euismod ornatus usu te, quodsi viderer accommodare sea cu, ut alterum officiis nec. At deleniti eloquentiam vis. Explicari definitionem ei sea. No nec erat fugit voluptaria, in his elit discere fastidii. Aperiri virtute no eos. Te per habemus vulputate, partem iuvaret intellegebat eam in.",
"project_cost":10000.00,
}
{
"project_no":1664,
"project_name":"School Supplies for Children",
"description":"Nonumy euismod ornatus usu te, quodsi viderer accommodare sea cu, ut alterum officiis nec. At deleniti eloquentiam vis. Explicari definitionem ei sea. No nec erat fugit voluptaria, in his elit discere fastidii. Aperiri virtute no eos. Te per habemus vulputate, partem iuvaret intellegebat eam in. ",
"project_cost":8000.00,
},
I have over 60 records. With PHP, I want to show 10 records on each page and dynamically generate the page numbers based on how many records I have.
Here's how I'm displaying the data:
$json = file_get_contents('http://linktojsondata.com');
$obj = json_decode($json, true);
<?php
$i = 0;
foreach ($obj as $project_name => $project_info) { ?>
<a href="single-project-detail.php/<?php echo $project_info['project_no'];?>">
<img class="img-thumbnail" alt="" src="<?php echo $project_info['featured_image_url']; ?>">
</a>
<a href="single-project-detail.php/<?php echo $project_info['project_no'];?>">
<?php echo $project_info['project_name']; ?>
</a>
<p>
<?php $string = strip_tags($project_info['description']);?>
</p>
<?php if (++$i == 10) break; } ?>
Here is a start: split the decoded JSON array into blocks of 10 using array_chunk, then loop over the block at index $_GET['p'] - 1. Your page URL might look like page.php?p=2, which selects the second set of data.
$pages = array_chunk(json_decode($json, true), 10, true);
foreach ($pages[$_GET['p'] - 1] as $project_name => $project_info) {
// your code
}
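To also generate the page numbers, the same chunks can be counted. The following is only a sketch under a few assumptions: $projects is a made-up stand-in for the decoded JSON, $perPage and the ?p=N link format are illustrative, and the requested page is clamped so that a missing or out-of-range p does not index a chunk that does not exist.

```php
<?php
// Hypothetical stand-in for json_decode($json, true): 64 records with the question's keys.
$projects = [];
for ($n = 1; $n <= 64; $n++) {
    $projects[] = ['project_no' => 1600 + $n, 'project_name' => "Project $n"];
}

$perPage = 10;
$pages = array_chunk($projects, $perPage, true);
$pageCount = count($pages);

// Read ?p=N, default to page 1, and clamp to the valid range.
$requested = isset($_GET['p']) ? (int) $_GET['p'] : 1;
$current = max(1, min($pageCount, $requested));

// Show the records of the current page (empty list if there is no data at all).
foreach ($pages[$current - 1] ?? [] as $project) {
    echo htmlspecialchars($project['project_name']), "<br>\n";
}

// One link per chunk; the current page is printed as plain text.
$links = [];
for ($p = 1; $p <= $pageCount; $p++) {
    $links[] = $p === $current ? (string) $p : sprintf('<a href="?p=%d">%d</a>', $p, $p);
}
echo implode(' ', $links), "\n";
```

With 64 records this yields seven pages, the last one holding the remaining four records.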
I've got a weird layout to get around and am at a loss, even in the planning stage. Essentially I need to separate out all content that's not a .gallery and put it into an <aside />. I initially considered a plugin using the edit_post hook from the Plugin API, but have since decided against it because this content change is layout specific and I want to maintain a clean database. So...
How can I parse through WP's the_content for content that's not .gallery? Admittedly not a PHP guy, so I doubly appreciate the help!
As per Michael's comment below - here's an example of WP's the_content class output:
HTML
<div class="entry-content">
<div class="gallery">
<dl class="gallery-item">
<dt class="gallery-icon portrait">
<img src="/imagePath/etc.jpg" class="attachment-thumbnail">
</dt>
</dl>
<dl class="gallery-item">
<dt class="gallery-icon portrait">
<img src="/imagePath/etc.jpg" class="attachment-thumbnail">
</dt>
</dl>
<dl class="gallery-item">
<dt class="gallery-icon portrait">
<img src="/imagePath/etc.jpg" class="attachment-thumbnail">
</dt>
</dl>
</div>
<p>Curabitur vulputate, ligula lacinia scelerisque tempor, lacus lacus ornare ante, ac egestas est urna sit amet arcu. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Sed molestie augue sit amet.</p>
<ul>
<li>Item A</li>
<li>Item B</li>
<li>Item C</li>
</ul>
</div>
Desired Output
<div class="entry-content">
<div class="gallery">
<dl class="gallery-item">
<dt class="gallery-icon portrait">
<img src="/imagePath/etc.jpg" class="attachment-thumbnail">
</dt>
</dl>
<dl class="gallery-item">
<dt class="gallery-icon portrait">
<img src="/imagePath/etc.jpg" class="attachment-thumbnail">
</dt>
</dl>
<dl class="gallery-item">
<dt class="gallery-icon portrait">
<img src="/imagePath/etc.jpg" class="attachment-thumbnail">
</dt>
</dl>
</div>
<aside>
<p>Curabitur vulputate, ligula lacinia scelerisque tempor, lacus lacus ornare ante, ac egestas est urna sit amet arcu. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Sed molestie augue sit amet.</p>
<ul>
<li>Item A</li>
<li>Item B</li>
<li>Item C</li>
</ul>
</aside>
</div>
You'll want to use a DOM parser for this. Here's an example of how you can go about it, using your markup. Testing yielded the desired results, so hopefully this gives you the head start you need:
add_filter( 'the_content', 'wrap_nongallery_aside', 20 );
function wrap_nongallery_aside($content){
$dom = new DOMDocument();
$dom->loadHTML($content); // Replace with Edit below if PHP >= 5.4
$aside = $dom->createElement('aside');
$xpath = new DOMXPath($dom);
$not_gallery = $xpath->query('//div[@class="entry-content"]/*[not(contains(@class, "gallery"))]');
foreach($not_gallery as $ng){
$aside->appendChild($ng);
}
$dom->getElementsByTagName('div')->item(0)->appendChild($aside);
return $dom->saveHTML();
}
Edit:
If you're using PHP >= 5.4, then you can easily remove any extra <html> and <body> tags from the generated markup by using the following:
$dom->loadHTML($content, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
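For trying the idea outside WordPress, here is a standalone sketch of the same approach. The function name wrap_nongallery_in_aside, the early return when the wrapper div is missing, and the sample markup are assumptions made for illustration; the XPath logic follows the answer above.

```php
<?php
// Standalone sketch (no WordPress needed): move non-gallery children into an <aside>.
function wrap_nongallery_in_aside(string $content): string
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // silence warnings about tags libxml does not know
    $dom->loadHTML($content, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $wrapper = $xpath->query('//div[@class="entry-content"]')->item(0);
    if ($wrapper === null) {
        return $content; // assumption: leave the content untouched if the wrapper is missing
    }

    $aside = $dom->createElement('aside');
    // Copy the node list first, because appendChild() moves nodes out of their parent.
    $nodes = iterator_to_array($xpath->query('*[not(contains(@class, "gallery"))]', $wrapper));
    foreach ($nodes as $node) {
        $aside->appendChild($node);
    }
    $wrapper->appendChild($aside);

    return $dom->saveHTML();
}

$sample = '<div class="entry-content"><div class="gallery"></div><p>Text</p></div>';
echo wrap_nongallery_in_aside($sample);
```

Note that the two libxml flags expect the fragment to have a single root element, which is the case for the entry-content div used here.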
Maiorano84's answer worked beautifully, but prior to his reply I worked out an alternate method that's less specific to my situation, so I figured it'd be good to share.
I had originally written off the plugin approach because it requires changing the post content itself, not just the format of the output, but then realized that plugins live independently of the theme installation. Below is a very simple, developer-targeted plugin that converts [aside /] shortcodes into HTML elements. It's entirely based on SBD Aside by Sean B. Durkin. I'll eventually include a button for the WP text editor and open source it.
<?php
/*
Plugin Name: RW Content Aside
Description: Inserts aside formatting into post content via shortcodes
Author: Daniel Redwood
Version: 0.1
Author URI: http://www.rdwd.fm/
Based on SBD Aside by Sean B. Durkin:
Original Plugin: http://seanbdurkin.id.au/pascaliburnus2/archives/51
Author: http://www.seanbdurkin.id.au
*/
if ( !is_admin() ){
add_filter('the_content', 'handle_rw_aside');
}
function generate_random_str( $length=10)
{
return substr(md5(rand()), 0, $length);
}
function generate_place_marker()
{
return '#' . generate_random_str( 10) . '#';
}
function GetBody( $aside_instruction) {
return preg_replace( '~^((<p>)? \S+\s*=\s*.*?(<br \/>|<\/p>)\n?)*~mi', '', $aside_instruction);
}
function handle_rw_aside($the_content)
{
$begin = generate_place_marker();
$end = generate_place_marker();
$new_content = preg_replace(
'~^((<p>)?\[aside\](<br />|</p>))(.*?)(^(<p>)?\[\/aside\](<br />|</p>))~ms',
$begin . '$4' . $end,
$the_content);
$new_content = preg_replace_callback(
'~^(<p>)?(!+\[\/?aside\])~m',
function ($match) {
return $match[1] . substr( $match[2], 1);
},
$new_content);
$pattern = '~'.$begin.'(.*?)'.$end.'~s';
return preg_replace_callback(
$pattern,
function ($match) {
$aside_instruction = $match[1];
$body = GetBody( $aside_instruction);
$aside = '<aside class="contentAside">' . $body . '</aside>';
return $aside;
},
$new_content);
}
?>
I have a lot of attributes like jQuery1332199407617="01" in a file that need removing. However, the run of digits is always different. Is there any way I can delete everything from jQuery through ="..." (the number inside the quotes is also always different)?
Thanks in advance.
Sample of part of the file as requested:
(As you can see, another jQuery attribute is added each time the file is saved, hence the need to remove them.)
<H2 editing="false" revert="Projects:" jQuery1332198888840="12" jQuery1332199361841="12" jQuery1332199407617="12">ProjectsTesting</H2>
<UL class=list1 jQuery1332198888840="17" jQuery1332199361841="17" jQuery1332199407617="17">
<LI jQuery1332198888840="16" jQuery1332199361841="16" jQuery1332199407617="16">Praesent vestibulum molestie
<LI jQuery1332198888840="19" jQuery1332199361841="19" jQuery1332199407617="19">Aenean nonummy
<LI jQuery1332198888840="21" jQuery1332199361841="21" jQuery1332199407617="21">Hendrerit mauris phasellus
<LI jQuery1332198888840="23" jQuery1332199361841="23" jQuery1332199407617="23">Porta fusce suscipit varius
<LI jQuery1332198888840="25" jQuery1332199361841="25" jQuery1332199407617="25">Cum sociis natoque
<LI jQuery1332198888840="27" jQuery1332199361841="27" jQuery1332199407617="27">Penatibus et magnis disI
<LI jQuery1332198888840="29" jQuery1332199361841="29" jQuery1332199407617="29">Parturient montes </LI></UL></DIV>
This will remove the jQuery attributes, no matter what numbers are present. It will also take out the excess whitespace left at the end of the tags after the jQuery attributes have been removed.
$old = '<H2 editing="false" revert="Projects:" jQuery1332198888840="12" jQuery1332199361841="12" jQuery1332199407617="12">ProjectsTesting</H2> <UL class=list1 jQuery1332198888840="17" jQuery1332199361841="17" jQuery1332199407617="17"> <LI jQuery1332198888840="16" jQuery1332199361841="16" jQuery1332199407617="16">Praesent vestibulum molestie <LI jQuery1332198888840="19" jQuery1332199361841="19" jQuery1332199407617="19">Aenean nonummy <LI jQuery1332198888840="21" jQuery1332199361841="21" jQuery1332199407617="21">Hendrerit mauris phasellus <LI jQuery1332198888840="23" jQuery1332199361841="23" jQuery1332199407617="23">Porta fusce suscipit varius <LI jQuery1332198888840="25" jQuery1332199361841="25" jQuery1332199407617="25">Cum sociis natoque <LI jQuery1332198888840="27" jQuery1332199361841="27" jQuery1332199407617="27">Penatibus et magnis disI <LI jQuery1332198888840="29" jQuery1332199361841="29" jQuery1332199407617="29">Parturient montes </LI></UL></DIV>';
//This will erase all the jQuery strings.
$new = preg_replace('/jQuery\d+="\d+"/', '', $old);
//This will take out the extra spaces at the end of the tags that was left open.
$new = preg_replace('/\s+>/', '>', $new);
echo $new;
For more information see: http://php.net/manual/en/function.preg-replace.php
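As a variation on the two calls above, the whitespace in front of each attribute can be matched together with the attribute, so a single pass is enough. This is a sketch on a shortened sample line, not a replacement for testing against the real file.

```php
<?php
// Shortened sample taken from the question's markup.
$old = '<H2 editing="false" revert="Projects:" jQuery1332198888840="12" jQuery1332199361841="12">ProjectsTesting</H2>';

// Match the whitespace in front of each jQuery attribute together with the attribute itself.
$new = preg_replace('/\s+jQuery\d+="\d+"/', '', $old);

echo $new, PHP_EOL;
```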
If I understand you correctly, this should work:
// PHP has no "g" modifier: preg_replace already replaces every match.
$myContent = preg_replace('/jQuery\d+="(\d+)"/', 'jQuery="${1}"', $myContent);
See: http://php.net/manual/en/function.preg-replace.php
I have a blog entry that will sometimes contain a lot of text and images, and I want to cut an excerpt from it. To be more specific, I want to match everything up to and including the second image tag.
Below is some sample text.
I've tried negative lookaheads like
/[\w\r\n;:',."&\s*<>=-_]+(?!<img)/i
but I can't figure out a way to make the lookahead apply to the '+' quantifier. If anyone has a clue, I'd be really grateful.
*override*
I've been stuck in a room lately, and though it's hard to stay creative all the time, sometimes you need that extra kick. Well for some us we have to throw pictures of true creative genius at ourselves to stimulate us.
So sit back and soak in some inspiration I've come across the past year.
<figure>
<a href="">
<img class="aligncenter" src="http://funnypagenet.com/wp-content/uploads/2011/07/Talesandminimalism_12_www.funnypagenet.com_.jpg" alt="" width="574" height="838" />
</a>
<figcaption></figcaption>
</figure>
<h4 style="text-align: center;">
source
</h4>
Couldn't find who did this, but couldn't explain the movie any simpler
<figure>
<img class="aligncenter" src="http://brickhut.files.wordpress.com/2011/05/theempirestrikesback1.jpg" alt="" width="540" height="800" />
<figcaption></figcaption>
</figure>
Obviously, straightforward string cutting is not suitable for your second image:
...
<figure>
<img class="aligncenter" src="http://brickhut.files.wordpress.com/2011/05/theempirestrikesback1.jpg" alt="" width="540" height="800" />
<figcaption></figcaption>
</figure>
Cutting after the image would leave unclosed elements:
...
<figure>
<img class="aligncenter" src="http://brickhut.files.wordpress.com/2011/05/theempirestrikesback1.jpg" alt="" width="540" height="800" />
This could break the rendering of the page in the browser. It makes no difference whether you use preg_match with a regular expression or plain string functions here.
What you need is a DOM parser like DOMDocument that is able to process the HTML.
Given some sample HTML code similar to the one in your question:
$html = <<<HTML
dolor sit amet, consectetuer adipiscing elit. <img src="http://example.com/img-a.jpg"> Aenean commodo
ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes,
nascetur ridiculus mus.
<figure>
<img src="http://example.com/img-b.jpg">
<figcaption>Figure Caption</figcaption>
</figure>
Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut.
HTML;
You can now use the DOMDocument class to load the HTML chunk inside a <body> tag, because it is your whole HTML body for the manipulation. As you use tags that libxml's HTML parser does not recognize (<figure> and <figcaption>), you should disable warnings about those when loading the string, using libxml_use_internal_errors:
$doc = new DOMDocument();
libxml_use_internal_errors(1);
$doc->loadHTML(sprintf('<body>%s</body>', $html));
This is the basic setup of the DOM parser; your HTML is now inside it. Now comes the interesting part. You want the excerpt to run up to the second image of the document, which means everything after that element should be removed. That sounds just like cutting a string, which we know does not work, but this time the DOM parser does all the work for us.
You only need to obtain all nodes (<tag>, text, <!-- comments -->, ...) that follow the second <img> tag in document order and delete them. Such a selection can be expressed with XPath:
/descendant::img[position()=2]/following::node()
PHP's DOM parser comes with XPath, so let's do it:
$xp = new DOMXPath($doc);
$delete = $xp->query('/descendant::img[position()=2]/following::node()');
foreach ($delete as $node)
{
$node->parentNode->removeChild($node);
}
The only thing left is to output the excerpt that remains (simply echoed here as an example). As we know, it's all inside the <body> tag:
foreach ($doc->getElementsByTagName('body')->item(0)->childNodes as $child)
{
echo $doc->saveHTML($child);
}
Which will give you the following:
dolor sit amet, consectetuer adipiscing elit. <img src="http://example.com/img-a.jpg"> Aenean commodo
ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes,
nascetur ridiculus mus.
<figure><img src="http://example.com/img-b.jpg"></figure>
As this example shows, the <figure> tag is properly closed now.
A similar scenario is to create an excerpt after a specific text-length or word-count: Wordwrap / Cut Text in HTML string
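Put together, the steps of this answer could be wrapped in one function. This is a sketch only: the name excerpt_until_image, the $n parameter, and the sample string are assumptions added for illustration.

```php
<?php
// Hypothetical helper: keep everything up to and including the $n-th <img>.
function excerpt_until_image(string $html, int $n = 2): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // suppress warnings while loading
    $doc->loadHTML(sprintf('<body>%s</body>', $html));
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Select every node that follows the $n-th image in document order.
    $xp = new DOMXPath($doc);
    $query = sprintf('/descendant::img[position()=%d]/following::node()', $n);
    foreach (iterator_to_array($xp->query($query)) as $node) {
        if ($node->parentNode !== null) {
            $node->parentNode->removeChild($node);
        }
    }

    // Serialize what is left inside <body>.
    $excerpt = '';
    foreach ($doc->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $excerpt .= $doc->saveHTML($child);
    }
    return $excerpt;
}

$sample = 'Intro <img src="a.jpg"> middle <figure><img src="b.jpg"><figcaption>Cap</figcaption></figure> tail';
echo excerpt_until_image($sample), PHP_EOL;
```

If the input contains fewer than $n images, the XPath query selects nothing and the content comes back uncut.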
Well, it's not regex, but it should work:
$post = str_ireplace('<img', '!!!<img', $post);
list($p1, $p2) = explode('!!!', $post);
$keep = $p1 . $p2;
Puts a split marker before the image tags (!!!), splits on them and keeps the first two chunks, which should be everything up to the second image tag. No regex required.
Edit: Because this is for an excerpt, you might want to run strip_tags() on the result. If you don't, you may end up with opened HTML tags that never get closed.
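A slightly more defensive sketch of the same split idea follows. It assumes a longer marker string (made up here) that is unlikely to occur in a post, and it uses array_slice so that posts with fewer than two images do not trigger an undefined-offset notice.

```php
<?php
// Hypothetical sample post with two images.
$post = 'Intro <img src="a.jpg"> middle <img src="b.jpg"> tail';

// Assumption: this marker never appears in real post content.
$marker = '!!!SPLIT!!!';

// Put the marker in front of every image tag, then split on it.
$parts = explode($marker, str_ireplace('<img', $marker . '<img', $post));

// Keep the first two chunks: everything before the second image tag.
$keep = implode('', array_slice($parts, 0, 2));

echo strip_tags($keep), PHP_EOL;
```

As in the answer above, the kept text stops just before the second image tag rather than after it.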
If you really want regex based solution then here it is:
// assuming $str is your full HTML text
if ( preg_match_all('~^(.*?<img\s.*?<img\s[^>]*>)~si', $str, $m) )
print_r ( $m[1] );