I am parsing a webpage and following the links in order to map links from one page to another. I am only pulling the title of the page the link is on, the URL used to link the pages, and the title of the page the URL leads to.
My code works smoothly to discover the links I am interested in and descends into subpages to find additional product links. There are a few hundred of these across at least a hundred pages, so there are many HTML files to parse. I am building an array in which each $products[index] is itself an array of the form array('url' => URLToPage, 'title' => TitleOfPage, 'link_title' => TitleOfLinkedPage), as I hope this demonstrates.
The script works fine until I add this snippet, after which the script will stop execution with no errors, warnings, notices or anything; it simply never reaches the end of the script. I have included set_time_limit(0) to prevent the execution time from expiring, as this script takes some time to complete. This code is executed after the $products array has been populated (if any links were found), and $products is always an array. I have also output the $link_html_string values in test cases to verify that the pages are being retrieved as expected. This is the offending code:
// Populate the destination link titles
if ( isset( $products ) && count( $products ) > 0 )
{
foreach( $products as $id => $product )
{
$from_this_page = $product['url'];
if ( $DEBUG ) echo 'Parsing ' . $from_this_page . '.<br />';
$link_html_string = file_get_contents( $from_this_page, NULL, NULL, NULL, 500 );
$string_parts = explode( '<title>', $link_html_string );
$string_parts = explode( '</title>', $string_parts[1] );
$products[$id]['link_title'] = $string_parts[0];
if ( $DEBUG ) echo 'Found title: ' . $products[$id]['link_title'] . '<br />';
ob_flush();
flush();
}
}
There should never really be more than 500 characters needed; however, I had some concern about memory usage when reading the entire files, so I reduced the load (I think) by limiting the read. I thought perhaps the script was using up all the memory allocated to PHP. When this code is included, it will iterate through the loop several times but at some point stop execution, and this point is not always exactly the same. I will get several echoes showing which file is being parsed.
This is the full code for the script, included to answer questions regarding contents of $products in comments.
<?php
// PHP HTML DOM Parser from http://simplehtmldom.sourceforge.net/
require_once( 'includes/simple_html_dom.php' );
//error_reporting( E_ALL );
set_time_limit( 0 );
// Debugging flag
$DEBUG = false;
function reportProducts( $category, $products )
{
echo '<table width="90%" align="center"><tr><th colspan="3">';
echo $category . ' has ' . count( $products ) . ' products listed, or in subpages.';
echo '</th></tr>';
echo '<tr><td bgcolor="#777777" width="30%">This page</td>
<td bgcolor="#bbbbbb" width="30%">links with</td>
<td bgcolor="#777777" width="30%">to this page</td></tr>';
foreach( $products as $product )
{
echo '<tr><td bgcolor="#777777">' . $product['title'] . '</td>
<td bgcolor="#bbbbbb"><a href="' . $product['url'] . '">' . $product['url'] .
'</a></td><td bgcolor="#777777">' . $product['link_title'] . '</td></tr>';
}
echo '</table><br />';
ob_flush(); // Server may buffer again, preventing incremental display
flush();
}
function parseProductsForPage( $page_to_parse )
{
global $DEBUG;
$failed = false;
$product_id = 0;
$page_dom = new simple_html_dom();
$page_html_string = @file_get_contents( $page_to_parse->href );
$load_state = @$page_dom->load( $page_html_string );
if ( $load_state === NULL )
{
// Find any direct product pages for this page
if ( $DEBUG ) echo $page_to_parse->href . ' being checked for products... ';
$possible = $page_dom->find( 'a[onclick]' );
foreach( $possible as $link )
{
if ( $link->innertext == "[ Add to cart ]" )
{
$products[$product_id]['url'] = $link->href;
$titles = $page_dom->find( 'title' );
$products[$product_id]['title'] = $titles[0]->innertext;
$product_id++;
}
}
if ( $DEBUG )
{
if ( isset( $products ) )
{
echo count( $products ) . ' found on page.<br />';
} else
{
echo '0 found on page.<br />';
}
}
// Find subpages...
if ( $DEBUG ) echo $page_to_parse->href . ' being checked for links... ';
$subpages = $page_dom->find( 'a[class=buy]' );
if ( $DEBUG ) echo count( $subpages ) . ' found.<br />';
// ... and parse
foreach( $subpages as $subpage )
{
$subpage_dom = new simple_html_dom();
$subpage_html_string = @file_get_contents( $subpage->href );
$load_state = @$subpage_dom->load( $subpage_html_string );
if ( $load_state === NULL )
{
// Find any direct product pages for this page
if ( $DEBUG ) echo $subpage->href . ' being checked for products... ';
$possible = $subpage_dom->find( 'a[onclick]' );
foreach( $possible as $link )
{
if ( $link->innertext == "[ Add to cart ]" )
{
$products[$product_id]['url'] = $link->href;
$titles = $page_dom->find( 'title' );
$products[$product_id]['title'] = $titles[0]->innertext;
$product_id++;
}
}
if ( $DEBUG )
{
if ( isset( $products ) )
{
echo count( $products ) . ' found on page.<br />';
} else
{
echo '0 found on page.<br />';
}
}
$subpage_dom->clear();
} else
{
$failed[] = $subpage->href;
}
$subpage_dom->clear();
unset( $subpage_dom );
}
// Populate the destination link titles
if ( isset( $products ) && count( $products ) > 0 )
{
foreach( $products as $id => $product )
{
// $from_this_page = $product['url'];
// if ( $DEBUG ) echo 'Parsing ' . $from_this_page . '.<br />';
// $link_html_string = file_get_contents( $from_this_page, NULL, NULL, NULL, 500 );
// $string_parts = explode( '<title>', $link_html_string );
// $string_parts = explode( '</title>', $string_parts[1] );
// $products[$id]['link_title'] = $string_parts[0];
// if ( $DEBUG ) echo 'Found title: ' . $products[$id]['link_title'] . '<br />';
// ob_flush();
// flush();
}
}
} else
{
$failed[] = $page_to_parse->href;
}
$titles = $page_dom->find( 'title' );
if ( isset( $products ) ) reportProducts( $titles[0]->innertext, $products );
$page_dom->clear();
unset( $page_dom );
return $failed;
}
// Initialize the object
$html = new simple_html_dom();
$html->load_file( 'index.html' );
// Start output buffer
ob_start();
// Find all product categories listed on the website
if ( $DEBUG ) echo '<h1>Collecting links from LHN...</h1>';
$sidelinks = $html->find( 'a[class=sidelink_main]' );
$html->clear();
unset( $html );
echo '<h1>Found ' . count( $sidelinks ) . ' categories.</h1><br />';
ob_flush(); // Server may buffer output, preventing incremental display
flush();
// Find links and products for each category
foreach( $sidelinks as $sidelink )
{
if ( $DEBUG ) echo 'Sending ' . $sidelink->href . ' to parser.<br />';
$parse_failed = parseProductsForPage( $sidelink );
if ( $parse_failed )
{
foreach( $parse_failed as $failure )
{
$failures[] = $failure;
}
}
}
echo count( $failures ) . ' pages failed to parse.<br />';
echo '<br />FIN!<br />'; // Easily searched to verify end of script was reached, also
// celebratory.
ob_end_flush(); // Clear output buffer
flush();
?>
Are you sure that set_time_limit has any effect? (When PHP runs with safe_mode on, it has no effect.)
Also make sure that $string_parts = explode( '<title>', $link_html_string ); actually gives a usable result (there may be no title element, or the tag name may be written in uppercase), because otherwise $string_parts[1] will not exist.
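As a minimal sketch of the second point, the title could be pulled out with a case-insensitive pattern that also copes with a missing title element. The helper name extract_title() is hypothetical and not part of the original script; the commented usage lines assume the same 500-byte read as in the question.

```php
<?php
// Hypothetical helper (not part of the original script): returns the text
// inside the first <title> element, or null when no complete one is found.
function extract_title( string $html ): ?string {
    // "i" matches <TITLE> as well as <title>; "s" lets "." span newlines;
    // [^>]* tolerates attributes on the opening tag.
    if ( preg_match( '/<title\b[^>]*>(.*?)<\/title>/is', $html, $matches ) ) {
        return trim( $matches[1] );
    }
    return null;
}

// Possible use inside the loop. file_get_contents() returns false on failure,
// so that case is checked before parsing:
// $html  = @file_get_contents( $from_this_page, false, null, 0, 500 );
// $title = ( $html === false ) ? null : extract_title( $html );
```

A null result can then be reported or skipped instead of indexing into an array element that may not exist.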
Related
I'm running into an issue with ACF, and I just can't figure out what's going on, and nothing on the internet is helping out.
I've added some fields to the Image Slider block:
But no matter what I try inside of our custom block code: image-slider.php I cannot get the values of any of the auto_play fields. get_field always returns null. I know the value is there, because if I dump out get_fields( $postID ), I can see the ['page_builder'][2] element has the value I want. I could get to it that way, but I can't seem to determine which index I'm on (the 2) programmatically.
So if you know either, how I can access the field directly, or figure out my current 'page_builder' index, that would be extremely helpful.
It's super confusing, because the have_rows( 'slide_setting' ) call obviously knows where to look, and works as expected.
The custom block php looks like:
<?php
if(have_rows( 'slide_setting' ) ) {
$digits = 3;
$randID = rand(pow(10, $digits-1), pow(10, $digits)-1);
echo '<div class="container"><div class="row"><div id="swiper_'.$randID.'" class="col-md-12 wiche-swiper-top-navigation-wrapper">';
echo '<div class="swiper-container wiche-swiper-top-navigation">';
// var_dump( get_fields( get_the_ID() )['page_builder'][2] );
// var_dump( get_post_field( 'auto_play' ) );
// var_dump(get_field('image_slider_settings_auto_play'));
// var_dump(get_row_index());
// var_dump(get_field_objects( $post->ID ));
// var_dump( get_row_index() );
// var_dump( acf_get_field_group( 'slide_setting' ) );
// die();
if ( get_field( 'auto_play' ) ) {
echo '<div class="swiper-wrapper" data-swiper-autoplay="' . get_field( 'auto_play_delay' ) . '" data-swiper-disable-on-interaction="' . get_field( 'auto_play_disable_on_interaction' ) . '">';
} else {
echo '<div class="swiper-wrapper">';
}
while( have_rows( 'slide_setting' ) ) {
the_row();
$title = get_sub_field( 'title' );
$image = get_sub_field( 'image' );
$content = get_sub_field( 'content' );
if ( $image || $content ) {
echo '<div class="swiper-slide swiper-banner-slide swiper-no-swiping">';
if ( $title ) {
echo '<div class="text-center slider-top-title">';
echo $title;
echo '</div>';
}
if ( $image ) {
echo '<div class="banner-image">';
echo wp_get_attachment_image( $image, 'full', '', array( 'loading' => false ) );
echo '</div>';
}
if ( $content ) {
echo '<div class="banner-content">';
echo $content;
echo '</div>';
}
echo '</div>';
}
}
echo '</div>';
echo '</div>';
echo '<div class="swiper-button-next swiper-button-next-outsite">Next</div><div class="swiper-button-prev swiper-button-prev-outsite">Prev</div>';
echo '</div></div></div>';
}
So I wasn't able to get a perfect answer to my question, looks like the API to get what I want doesn't exist (dumb).
What I ended up with - I set up a new function in my theme's functions.php file that looks like the following:
$post_slider_config_index = 0;
function get_the_slider_config( $post_id ) {
global $post_slider_config_index;
$page_builder = get_fields( $post_id )['page_builder'];
$slider_config = null;
foreach ($page_builder as $key => $value) {
if ( $value['acf_fc_layout'] === 'image_slider_settings' ) {
if ( $key > $post_slider_config_index ) {
$slider_config = $value;
$post_slider_config_index = $key;
break;
}
}
}
return $slider_config;
}
And then inside my image-slider.php file I call it like so:
$slider_config = get_the_slider_config( get_the_ID() );
if ( $slider_config[ 'auto_play' ] ) {
echo '<div class="swiper-wrapper" data-swiper-autoplay="' . $slider_config[ 'auto_play_delay' ] . '" data-swiper-disable-on-interaction="' . $slider_config[ 'auto_play_disable_on_interaction' ] . '">';
} else {
echo '<div class="swiper-wrapper">';
}
The $post_slider_config_index variable keeps track of the last index retrieved so that if there are multiple sliders on a page, it'll grab the right one as it's rendered.
It's not perfect, it's not super efficient, but it does what I needed. Annoying WP doesn't just give you the information it obviously has already regarding where you are in the page.
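The index-tracking idea can be tried outside WordPress with a plain array standing in for get_fields( $post_id )['page_builder']. This is only a sketch under that assumption: next_layout_row() and the sample rows are hypothetical. One difference from the function above is that the counter starts at -1, so a slider in the very first row (index 0) is not skipped by the "$key > last index" comparison.

```php
<?php
// Hypothetical helper: returns the next row with the given layout that comes
// after $last_index, and advances $last_index to that row's key.
function next_layout_row( array $page_builder, string $layout, int &$last_index ): ?array {
    foreach ( $page_builder as $key => $row ) {
        if ( ( $row['acf_fc_layout'] ?? null ) === $layout && $key > $last_index ) {
            $last_index = $key;
            return $row;
        }
    }
    return null; // No further row with this layout.
}

// Sample data shaped like a flexible content field (assumed structure).
$page_builder = array(
    array( 'acf_fc_layout' => 'image_slider_settings', 'auto_play' => true ),
    array( 'acf_fc_layout' => 'text_block' ),
    array( 'acf_fc_layout' => 'image_slider_settings', 'auto_play' => false ),
);

$last_index = -1; // Start before the first row.
$first  = next_layout_row( $page_builder, 'image_slider_settings', $last_index );
$second = next_layout_row( $page_builder, 'image_slider_settings', $last_index );
```

Passing the counter by reference avoids the global variable, at the cost of the caller having to keep it around between block renders.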
Based on the answer on my previous question, Issue display shipping data on WooCommerce single product page, I'm trying to replace the 0 with FREE or whatever word I choose.
Problem is, whatever I do, it does not work. This is my attempt and I need help with it:
add_action('woocommerce_after_add_to_cart_form', 'display_shipping_on_product_page', 10, 0);
function display_shipping_on_product_page(){
// get all zones
$zones = WC_Shipping_Zones::get_zones();
// get the shop base country
$base_country = WC()->countries->get_base_country();
$base_city = WC()->countries->get_base_city();
// start display of table
echo '<div>' . __( '<b>Available Shipping</b>', 'woocommerce' );
echo '<br><small><span class="shipping-time-cutoff">All orders are shipped from the '.$base_country.'. Order before 12AM Mon-Fri for same day delivery within '.$base_city.'. Order before 3PM Mon-Thu for next day delivery.</span></small>';
echo '<small><table class="shipping-and-delivery-table">';
// get name of each zone and each shipping method for each zone
foreach ( $zones as $zone_id => $zone ) {
echo '<tr><td>';
echo '<strong>' . $zone['zone_name'] . '</strong>' . '</td><td>';
$zone_shipping_methods = $zone['shipping_methods'];
foreach ( $zone_shipping_methods as $index => $method ) {
$instance = $method->instance_settings;
if ( isset( $instance['min_amount'] ) ) {
$instance_min_amount = $instance['min_amount'];
} else {
$instance_min_amount = 0;
}
if ( isset( $instance['cost'] ) ) {
$instance_cost = $instance['cost'];
} else {
$instance_cost = str_replace($instance_cost, "FREE" );
}
$cost = $instance_cost ? $instance_cost : $instance_min_amount;
$above = $instance_min_amount ? 'above ' : '';
echo $instance['title'] . ': ' . $above . '<strong>' . wc_price( $cost ) . '</strong>' . '<br>';
}
echo '</td></tr>';
}
echo '</table></small></div>';
}
This is what I tried to add / edit without success:
$instance_cost = str_replace($instance_cost, "FREE" );
To display 'free' instead of 0, you need to change your if/else conditions. Explanation via comment tags added to my answer.
So you get:
function action_woocommerce_after_add_to_cart_form() {
// get all zones
$zones = WC_Shipping_Zones::get_zones();
// get the shop base country
$base_country = WC()->countries->get_base_country();
$base_city = WC()->countries->get_base_city();
// start display of table
echo '<div>' . __( 'Available Shipping', 'woocommerce' );
echo '<br><small><span class="shipping-time-cutoff">All orders are shipped from the '.$base_country.'. Order before 12AM Mon-Fri for same day delivery within '.$base_city.'. Order before 3PM Mon-Thu for next day delivery.</span></small>';
echo '<small><table class="shipping-and-delivery-table">';
// get name of each zone and each shipping method for each zone
foreach ( $zones as $zone_id => $zone ) {
echo '<tr><td>';
echo '<strong>' . $zone['zone_name'] . '</strong>' . '</td><td>';
$zone_shipping_methods = $zone['shipping_methods'];
foreach ( $zone_shipping_methods as $index => $method ) {
$instance = $method->instance_settings;
// Initialize
$above = '';
$output_cost = __( 'Free', 'woocommerce' );
// Cost isset
if ( isset( $instance['cost'] ) ) {
// NOT empty
if ( ! empty ( $instance['cost'] ) ) {
// Output
$output_cost = wc_price( $instance['cost'] );
}
}
// Min amount isset
if ( isset( $instance['min_amount'] ) ) {
// NOT empty
if ( ! empty ( $instance['min_amount'] ) ) {
// Above
$above = __( 'above ', 'woocommerce' );
// Output
$output_cost = wc_price( $instance['min_amount'] );
}
}
echo $instance['title'] . ': ' . $above . '<strong>' . $output_cost . '</strong>' . '<br>';
}
echo '</td></tr>';
}
echo '</table></small></div>';
}
add_action( 'woocommerce_after_add_to_cart_form', 'action_woocommerce_after_add_to_cart_form', 10, 0 );
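The decision logic in the answer can be isolated into a small function and checked without WooCommerce. The sketch below is an illustration only: shipping_cost_label() is a hypothetical name, number_format() stands in for wc_price(), and the costs are assumed to be plain numeric strings.

```php
<?php
// Hypothetical helper mirroring the if/else structure of the answer.
// Returns array( $above, $output_cost ).
function shipping_cost_label( array $instance ): array {
    $above       = '';
    $output_cost = 'Free'; // Default when no non-empty cost is set.

    // empty() is true for '', '0', 0 and null, so a zero cost stays "Free".
    if ( ! empty( $instance['cost'] ) ) {
        $output_cost = number_format( (float) $instance['cost'], 2 );
    }

    // A non-empty minimum amount takes precedence and adds the "above" prefix.
    if ( ! empty( $instance['min_amount'] ) ) {
        $above       = 'above ';
        $output_cost = number_format( (float) $instance['min_amount'], 2 );
    }

    return array( $above, $output_cost );
}
```

The key point is that "Free" is the starting value and is only overwritten by a non-empty amount, rather than trying to swap out a 0 after the fact.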
I have a special function on some of my pages that returns a modified embed code. It works great:
function add_embed_parameter( $content ) {
if (/*is_page() &&*/ has_category('Krajské zprávy')) {
/* get subtitle */
$subtitle = apply_filters( 'plugins/wp_subtitle/get_subtitle', '', array(
) );
if ((strpos($subtitle , 'kraj') == false) && (strpos($subtitle , 'Praha') == false) && (strpos($subtitle , 'Vysočina') == false)) {
$subtitle = "Všechny kraje";
};
/* get page id with embed code */
$id_short = 357;
$tag_id = array (
'Ovzduší' => 357
);
foreach ($tag_id as $k => $v) {
if (has_tag($k)) {$id_short = $v;}
}
/* get embed code & replace */
$acf_kod = get_field('embed_kod', $id_short /*,false*/);
$replace_sub = "<param name='filter' value='Parameters.Kraj=" . $subtitle . "'>";
preg_match_all('/<param [^>]*>/', $acf_kod, $matches);
$m_1 = $matches[0][0]; $m_2 = $matches[0][1];
$pos = strpos( $acf_kod, $m_1) + strlen( $m_1 );
if (strpos($acf_kod,$m_2)-(strpos($acf_kod,$m_1)+strlen($m_1))<2) {
$before = substr ($acf_kod, 0, $pos);
$after = substr( $acf_kod, $pos, strlen( $acf_kod ) );
$whole = $before . $replace_sub . $after;
$content = $whole .'<!--more-->' . $content;
};
/*echo $whole;*/
};
return $content;
};
add_filter('the_content', 'add_embed_parameter');
However, I also have a filter on my site that filters pages by categories and tags and returns them via Ajax. That, too, works just fine - but only with "standard pages" that don't use the code above. For the pages that do, it returns nothing.
This is a snippet of the php code that is used by the filter:
$query = new WP_Query( $args );
if( $query->have_posts() ) :
while( $query->have_posts() ): $query->the_post();
/*$content = get_post_field( 'post_content', get_the_ID() );*/
$content = get_the_content (/*get_the_ID()*/);
$content_parts = get_extended( $content );
echo '<h2>' . $query->post->post_title . '</h2>',
$content/*$content_parts['main'] /*'<p>' . $query->post->post_excerpt . '</p>'*/;
endwhile;
wp_reset_postdata();
else :
echo 'No posts found';
endif;
die();
This is the whole AJAX thing (it's not mine, I'm using the tutorial here):
jQuery(function($){
$('#filter').submit(function(){
var filter = $('#filter');
$.ajax({
url:filter.attr('action'),
data:filter.serialize(), // form data
type:filter.attr('method'), // POST
beforeSend:function(xhr){
filter.find('button').text('Processing...'); // changing the button label
},
success:function(data){
filter.find('button').text('Apply filter'); // changing the button label back
$('#response').html(data); // insert data
}
});
return false;
});
});
Any idea where the problem might be? Many thanks!
I made a function for my Woo store to display custom taxonomies. Somehow the span containers for each one are lost. Here's the code:
add_action( 'woocommerce_product_meta_start', 'add_my_meta', 1 );
function add_my_meta() {
$series = the_terms($post->ID, 'series');
if ($series) {
$meta_output = '<span style="display:block;">Series: ';
$meta_array = array();
foreach ($series as $serie) {
$meta_array[] = '' . $serie->name . '';
}
$meta_output .= join( ', ', $meta_array ) . '</span>';
}
return $meta_output;
}
So expected output is <span style="display:block;">Series: My Series</span>
Current output is My Series
The spans and the label text are removed. I have never faced this problem before; what is causing it, and how can I solve it?
Found some mistakes — needed to use get_the_terms (was the_terms) and $serie->term_id (was $serie->slug)
add_action( 'woocommerce_product_meta_start', 'add_my_meta', 1 );
function add_my_meta() {
global $post; // $post is not otherwise defined inside the function
$series = get_the_terms($post->ID, 'series');
if ( is_array($series) ) {
$meta_array = array();
foreach ($series as $serie) {
$meta_array[] = '' . $serie->name . '';
}
echo '<span class="tagged_as">Series: ' . implode( ', ', $meta_array ) . '</span>';
}
}
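The underlying difference is that the_terms() prints its output instead of returning an array of terms, and a value returned from an action callback is discarded, so the markup has to be echoed. The string-building part can be sketched without WordPress. Here series_span() is a hypothetical helper, plain objects stand in for term objects, and htmlspecialchars() stands in for esc_html().

```php
<?php
// Hypothetical helper: builds the span from a list of term-like objects,
// or returns an empty string when there is nothing to show.
function series_span( array $terms ): string {
    if ( count( $terms ) === 0 ) {
        return '';
    }
    $names = array();
    foreach ( $terms as $term ) {
        // Escape each name so it cannot break the surrounding markup.
        $names[] = htmlspecialchars( $term->name );
    }
    return '<span class="tagged_as">Series: ' . implode( ', ', $names ) . '</span>';
}

// In the hooked function the result would be echoed rather than returned:
// echo series_span( $series );
```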
I have a PHP file which is part of a WordPress plugin. I need to debug an issue we are having, and I want to find out what a variable's value is. How can I print the variable's value to the console? echo, or Chrome or Firefox extensions, have been suggested. I couldn't get echo to output to the console (echo "$variablename";), nor could I get it working with the FirePHP extension for Firefox.
To answer your question, you can do this:
echo '<script>console.log("PHP error: ' . $error . '")</script>';
but I would recommend doing one of the things @Ishas suggested instead. Make sure $error doesn't contain anything that can mess up your script.
If you are thinking about the javascript console, you can not do this from PHP.
You have a few options you could choose from:
echo
var_dump
create a log file
xdebug
For a quick check of a variable's value I would use var_dump, which will also show you the data type of the variable. This will be output to the browser when you request the page.
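For the log file option, a minimal sketch might look like the following. The helper name debug_to_file() and the file path are assumptions made for illustration; var_export() is used so that arrays and scalars are written in a readable form.

```php
<?php
// Hypothetical helper: appends a timestamped dump of $value to $file.
function debug_to_file( $value, string $file ): void {
    $line = date( 'Y-m-d H:i:s' ) . ' ' . var_export( $value, true ) . PHP_EOL;
    // FILE_APPEND keeps earlier entries; LOCK_EX avoids interleaved writes.
    file_put_contents( $file, $line, FILE_APPEND | LOCK_EX );
}

// Example call (the path is an assumption):
// debug_to_file( $variablename, __DIR__ . '/debug.log' );
```

The file can then be followed with a tool such as tail while the page is requested.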
Logging to the DevTools console from PHP in WordPress
Here you can see my solution for the problem in action while debugging coupon logic in WooCommerce. This solution is meant for debug purposes, only. (Note: Screenshot not up to date, it will also expose private members.)
Features
Allow printing before and after rendering has started
Works in front-end and back-end
Print any amount of variables
Encode arrays and objects
Expose private and protected members of objects
Also log to the log file
Safely and easily opt-out in the production environment (in case you keep the calls)
Print the caller class, function and hook (quality of life improvement)
Solution
wp-debug.php
function console_log(): string {
list( , $caller ) = debug_backtrace( false );
$action = current_action();
$encoded_args = [];
foreach ( func_get_args() as $arg ) try {
if ( is_object( $arg ) ) {
$extract_props = function( $obj ) use ( &$extract_props ): array {
$members = [];
$class = get_class( $obj );
foreach ( ( new ReflectionClass( $class ) )->getProperties() as $prop ) {
$prop->setAccessible( true );
$name = $prop->getName();
if ( isset( $obj->{$name} ) ) {
$value = $prop->getValue( $obj );
if ( is_array( $value ) ) {
$members[$name] = [];
foreach ( $value as $item ) {
if ( is_object( $item ) ) {
$itemArray = $extract_props( $item );
$members[$name][] = $itemArray;
} else {
$members[$name][] = $item;
}
}
} else if ( is_object( $value ) ) {
$members[$name] = $extract_props( $value );
} else $members[$name] = $value;
}
}
return $members;
};
$encoded_args[] = json_encode( $extract_props( $arg ) );
} else {
$encoded_args[] = json_encode( $arg );
}
} catch ( Exception $ex ) {
$encoded_args[] = '`' . print_r( $arg, true ) . '`';
}
$msg = '`📜`, `'
. ( array_key_exists( 'class', $caller ) ? $caller['class'] : "\x3croot\x3e" )
. '\\\\'
. $caller['function'] . '()`, '
. ( strlen( $action ) > 0 ? '`🪝`, `' . $action . '`, ' : '' )
. '` ➡️ `, ' . implode( ', ', $encoded_args );
$html = '<script type="text/javascript">console.log(' . $msg . ')</script>';
add_action( 'wp_enqueue_scripts', function() use ( $html ) {
echo $html;
} );
add_action( 'admin_enqueue_scripts', function() use ( $html ) {
echo $html;
} );
error_log( $msg );
return $html;
}
wp-config.php (partially)
// ...
define( 'WP_DEBUG', true );
// ...
/** Include WP debug helper */
if ( defined( 'WP_DEBUG' ) && WP_DEBUG && file_exists( ABSPATH . 'wp-debug.php' ) ) {
include_once ABSPATH . 'wp-debug.php';
}
if ( ! function_exists( 'console_log' ) ) {
function console_log() {
}
}
/** Sets up WordPress vars and included files. */
require_once( ABSPATH . 'wp-settings.php' );
Usage
Before the HTML <head> is rendered:
console_log( $myObj, $myArray, 123, "test" );
After the HTML <head> is rendered (in templates, etc. / use when the above does not work):
echo console_log( $myObj, $myArray, 123, "test" );
Output format
📜 <caller class>\<caller function>() 🪝 <caller action/hook> ➡️ <variables ...>
Special thanks to
Andre Medeiros for the property extraction method
You can write a utility function like this:
function prefix_console_log_message( $message ) {
$message = htmlspecialchars( stripslashes( $message ) );
//Replacing Quotes, so that it does not mess up the script
$message = str_replace( '"', "-", $message );
$message = str_replace( "'", "-", $message );
return "<script>console.log('{$message}')</script>";
}
Then you may call the function like this:
echo prefix_console_log_message( "Error Message: This is really a 'unique' problem!" );
and this will output to console like this:
Error Message: This is really a -unique- problem!
Notice the quotes replaced with "-". This is done so that the message does not mess up your script, as pointed out by @Josef-Engelfrost.
You may also go one step further and do something like this:
function prefix_console_log_message( $message, $type = "log" ) {
$message_types = array( 'log', 'error', 'warn', 'info' );
$type = ( in_array( strtolower( $type ), $message_types ) ) ? strtolower( $type ) : $message_types[0];
$message = htmlspecialchars( stripslashes( $message ) );
//Replacing Quotes, so that it does not mess up the script
$message = str_replace( '"', "-", $message );
$message = str_replace( "'", "-", $message );
return "<script>console.{$type}('{$message}')</script>";
}
and call the function like this:
echo prefix_console_log_message( "Error Message: This is really a 'unique' problem!" , 'error');
It will output the message as an error in the console.
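An alternative to replacing quotes, sketched here as an assumption rather than as part of the answer above, is to let json_encode() produce the JavaScript literal. With the JSON_HEX_* flags, quotes, ampersands and angle brackets inside the value are emitted as \u escapes, so the original text is preserved and a stray </script> in the message cannot end the script element early. The function name console_log_script() is hypothetical.

```php
<?php
// Hypothetical helper: returns a <script> element that logs $value.
function console_log_script( $value, string $type = 'log' ): string {
    $allowed = array( 'log', 'error', 'warn', 'info' );
    $type    = in_array( $type, $allowed, true ) ? $type : 'log';

    // Escape <, >, &, ' and " inside the encoded value as \uXXXX sequences.
    $flags = JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT;
    $json  = json_encode( $value, $flags );
    if ( $json === false ) {
        $json = '"[value could not be encoded]"';
    }

    return "<script>console.{$type}({$json})</script>";
}

// Example call:
// echo console_log_script( "This is really a 'unique' problem!", 'error' );
```

Because the value is encoded rather than interpolated into a quoted string, arrays can be passed as well and show up as objects in the console.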