Currently, I have the following problem:
I have created a WordPress environment that sends personalized emails to subscribers based on their preferences. This worked well for quite some time, but for the past couple of months we have been experiencing inconsistencies. They are as follows:
Once in a while, the foreach loop that sends the emails stops in the middle of its execution. For example, with a newsletter of 4,000 subscribers, the sending procedure sometimes stops at around 2,500 emails. When this happens there is no sign of any error, and nothing shows up in the debug log.
I have tried the following things to fix the issue:
Different sender: we switched from SendGrid to SMTPeter (a Dutch SMTP service).
Delays: we tested whether pausing after every x emails would help, in case we were sending too many requests per minute, but it made no difference.
Disabling plugins: for five weeks we thought we had found the culprit in Wordfence, but last week the send function stopped again, so that was not the cause either. This shows how unstable it really is: it can run fine for five weeks and then fail for two.
Rewriting functions.
Logging: we write values to a txt file after every important step to track where the send function stops. This lets us see which users have already received the email and which still need it, so we can resume sending from there.
Debug log: frustratingly, even with WP_DEBUG enabled, nothing shows up that points to a cause of the crash.
To schedule the sender I use WP-Cron to run the task in the background; the send function below is triggered from there.
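The scheduling itself looks roughly like this (a minimal sketch; the hook name is simplified and not the actual code):

// Illustrative only: the real hook name and arguments differ.
add_action('send_newsletter_edition', 'send_email', 10, 2);

// Queue a single background run for this edition via WP-Cron.
wp_schedule_single_event(time(), 'send_newsletter_edition', array($edition_id, $post));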
Below is the send function I wrote, in stripped-down form. I removed all the $message concatenations, since they are just HTML with some ACF variables for the email, and translated everything to make it easier to follow.
<?php
function send_email($edition_id, $post)
{
    require_once('SMTPeter.php'); // Init SMTPeter sender
    $myfile = fopen("log.txt", "a") or die("Unable to open file!"); // Open custom log file
    $editionmeta = get_post_meta($edition_id); // Get data of edition
    $users = get_users();

    $args = array(
        'post_type'      => 'articles',
        'post_status'    => 'publish',
        'posts_per_page' => -1,
        'order'          => 'asc',
        'meta_key'       => 'position',
        'orderby'        => 'meta_value_num',
        'meta_query'     => array(
            array(
                'key'     => 'edition_id',
                'value'   => $edition_id,
                'compare' => 'LIKE',
            ),
        ),
    );
    $all_articles = new WP_Query($args); // Get all articles of the edition

    $i = 0; // Counter: users interested in the topic
    $j = 0; // Counter: sent emails

    foreach ($users as $user) { // Loop over all users <-- this is the loop that does not always finish all iterations
        $topic_ids = get_field('topicselect_', 'user_' . $user->ID);
        $topic_id = $editionmeta['topic_id'][0];
        if (in_array($editionmeta['topic_id'][0], $topic_ids)) { // Check if the user is interested in the topic
            $i++; // Counter: interested in topic +1

            // Header info
            $headerid = $editionmeta['header_id'][0];
            $headerimage = get_field('header_image', $headerid);
            $headerimagesmall = get_field('header_image_small', $headerid);

            // Footer info
            $footerid = $editionmeta['footer_id'][0];
            $footer1 = get_field('footerblock_1', $footerid);
            $footer2 = get_field('footerblock_2', $footerid);
            $footer3 = get_field('footerblock_3', $footerid);

            $message = '*HTML header newsletter*'; // First piece of email content

            if ($all_articles->have_posts()) :
                $articlecount = 0; // Article count, to check for empty newsletters
                while ($all_articles->have_posts()) : $all_articles->the_post();
                    global $post;
                    $art_categories = get_the_category($post->ID); // Get categories of the article
                    $user_categories = get_field('user_categories_', 'user_' . $user->ID); // Get categories the user is interested in
                    $user_cats = array();
                    foreach ($user_categories as $user_category) {
                        $user_cats[] = $user_category->name; // Right format for comparison
                    }
                    $art_cats = array();
                    foreach ($art_categories as $art_category) {
                        $art_cats[] = $art_category->name; // Right format for comparison
                    }
                    $catcheck = array_intersect($user_cats, $art_cats); // Check whether one of the article's categories matches one of the user's categories
                    if (count($catcheck) > 0) { // If at least one category matches, the article is added to the newsletter
                        $message .= "*Content of article*"; // Append article to the newsletter content
                        $articlecount++;
                    }
                endwhile;
            endif;

            if ($articlecount > 0) { // As soon as the newsletter contains at least one article, it is sent
                $j++; // Sent-email counter
                $mailtitle = $editionmeta['mail_subject'][0]; // Subject of the email
                $sender = new SMTPeter("*API Key*"); // SMTPeter sender class
                $output = $sender->post("send", array(
                    'recipients'   => $user->user_email, // The receiving email address
                    'subject'      => $mailtitle,        // MIME's subject
                    'from'         => "*Sender*",        // MIME's sending email address
                    'html'         => $message,
                    'replyto'      => "*Reply To*",
                    'trackclicks'  => true,
                    'trackopens'   => true,
                    'trackbounces' => true,
                    'tags'         => array("$edition_id")
                ));
                error_log(print_r($output, TRUE));
                fwrite($myfile, print_r($output, true));
            }
        }
    }
    fclose($myfile);
}
All I want to know is the following:
Why doesn't my code run the foreach to completion every time? It is quite frustrating to see it work like a charm one time and then get stuck the next.
Some things I thought about but have not yet implemented:
Rewrite parts of the function into separate functions. Retrieving the content and building the HTML for the newsletter could be done in a separate function. Besides obviously being cleaner code, I wonder whether the current structure could actually be the problem.
Can a foreach crash because fwrite tries to write to a file that is already being written to? In other words, does our log cause the function to stop running properly? (Concurrency, but is that a thing in PHP with its workers?)
Could the entire sending process be written in a different way?
Thanks in advance,
Really looking forward to your feedback and findings
I am creating a plugin to insert WooCommerce products from an API. Everything works fine for what I need, but because there are a lot of products the script fails after a while: it inserts about 170-180 products and then hits the maximum execution time. I am looking for a way to make sure the script can install at least 4k-5k products.
I know I can increase the maximum execution time, but that does not seem like a professional way to handle this job: I would have to raise it manually depending on how many products need to be created or updated, which feels very wrong. I am sure there must be a much better way to handle things like this. Here is my code so far:
public static function bb_products_api_call()
{
    // Fetch products from API
    $url = 'http://all-products-api-endpoint-here.com';
    $args = [
        'timeout' => 55,
        'headers' => array(
            'Authorization' => 'XXXXXXXXX'
        )
    ];
    $external_products = wp_remote_retrieve_body( wp_remote_get( $url, $args ) );
    $products = json_decode( $external_products );

    echo "<div class=\"wrap\">";
    echo "<pre>";
    foreach ($products as $key => $product) {
        if ( $product->situation > 0 ) {
            $str = $product->description;
            $dash = strpos($str, '-');
            $dashPostion = $dash + 1;
            $bar = strpos($str, '|');
            $barPosition = $bar + 1;
            if ($dash && $bar !== false) {
                $sD = "";
                $sB = "";
                $secondDash = strpos($str, '-', $dashPostion);
                if ($secondDash !== false) {
                    //echo "more than 1 - people!\n ";
                    $sD = $secondDash;
                }
                $secondBar = strpos($str, '|', $barPosition);
                if ($secondBar !== false) {
                    //echo "more than 1 | ffs!\n ";
                    $sB = $secondBar;
                }
                if ($sD == "" && $secondBar == "") {
                    //echo "all good";
                    // getting final product list
                    $inStock[] = array(
                        "productID"          => $product->productID,              // ID
                        "modelAndColor"      => $product->code2,                  // model and color
                        "name"               => $product->subGroupDescription,    // product name (title)
                        "description"        => $product->longDescription,        // product description
                        "sku"                => $product->description,            // product SKU
                        "color"              => $product->classifier1Description, // color
                        "size"               => $product->classifier2Description, // size
                        "category"           => $product->classifier4Description, // category
                        "subCategory"        => $product->classifier6Description, // sub category
                        "regularPrice"       => $product->salesPriceDefault,      // product price
                        "hasDiscount"        => $product->hasDiscount,             // 1 for discount, 0 for not on discount
                        "discountPercentage" => $product->discountPercentage,     // discount percentage
                        "stock"              => $product->situation,              // stock
                    );
                    foreach ($inStock as $item) {
                        $hash = $item['sku'];
                        $hash = substr( $hash, 0, strpos( $hash, "-" ) );
                        $uniqueArray[$hash] = $item;
                    }
                    $parentProducts = array_values( $uniqueArray );
                    if (!empty( $parentProducts )) {
                        foreach ($parentProducts as $product) {
                            $variable = $product['sku'];
                            $variable = substr( $variable, 0, strpos( $variable, "-" ) );
                            $product_id = wc_get_product_id_by_sku( $variable );
                            $product['sku'] = $variable;
                            if ( empty( $product_id ) ) {
                                $product_id = self::createOrUpdateProduct( $product );
                            } else {
                                $product_id = self::createOrUpdateProduct( $product, $product_id );
                            }
                        }
                    }
                }
            }
        }
    }
    //print_r( $inStock );
    print_r( $parentProducts );
    echo "</pre>";
    echo "</div>";
}
I also tried adding a for loop that counts how many products have been installed and lets the script sleep for 2-3 seconds, hoping that might reset the max execution time and prevent the failure, like so (no luck with this):
for ($i = 0; $i >= 25; $i++) {
    $variable = $product['sku'];
    $variable = substr( $variable, 0, strpos( $variable, "-" ) );
    $product_id = wc_get_product_id_by_sku( $variable );
    $product['sku'] = $variable;
    if ( empty( $product_id ) ) {
        // $product_id = self::createOrUpdateProduct( $product );
        if ( $product_id = self::createOrUpdateProduct( $product ) ) {
            $count = $count + 1;
        }
    } else {
        // $product_id = self::createOrUpdateProduct( $product, $product_id );
        if ( $product_id = self::createOrUpdateProduct( $product, $product_id ) ) {
            $count = $count + 1;
        }
    }
    if ( $count >= 25 ) {
        sleep(3);
        $count = 0;
    }
}
Note: please don't mind what I am doing to the SKU, extracting a certain part of it to find only the distinct model numbers and then using those as the SKU; that part works fine.
If anyone has had a similar experience and found a way to implement a script that does not exceed the execution time, I would really appreciate it if you could share a solution. Thank you.
Approach 1: Send requests for smaller batches using JavaScript
The user visits a page that contains JavaScript that does "Ajax" requests for small parts of the task.
Determine a batch size that is guaranteed to complete before any timeout. Given that the import fails at about 170 products, somewhere around 50 to 100 products per batch should leave enough margin.
Now add a REST API endpoint (or a WP Ajax action) that processes that many products per call.
Since the bottleneck here is probably not the API call, you could still fetch the whole API response in one go, store it somewhere (wp_cache_add for example), and process it in chunks.
You then need some JavaScript on the front end that sends one of these batches, waits for it to complete, then sends another one, ...
Pro:
server has no long running requests
relatively easy to implement
easy to visualize processing status on front end
Con:
need to add JavaScript
processing doesn't complete if user closes browser tab
Examples
How to turn code into an endpoint?
You should, first of all, split your logic into functions. Currently your code is all inside a single function, which makes it quite inflexible. Let's assume for the example code that you have a function saveProduct($data), and a function fetchAllFromApi() that returns an array of $data arrays. For brevity I won't include their implementations; you can pretty much just move the code from the big function into them with no major changes.
$cache_key = 'someAPIsProducts';
$cache_lifetime = 3600; // 1 hour
$batch_size = 50;
$start = $_GET['start']; // Or any other way to accept request params...

$from_cache = wp_cache_get($cache_key);
$data = $from_cache === false ? fetchAllFromApi() : $from_cache;
if ($from_cache === false) {
    // Cache for next time.
    wp_cache_add($cache_key, $data, null, $cache_lifetime);
}

$batch = array_slice($data, $start, $batch_size);
foreach ($batch as $data) {
    saveProduct($data);
}

// Return successful HTTP response.
wp_send_json_success( ... );
I'll explain the front end details using WooCommerce's solution.
WooCommerce
Back end
Here they register an endpoint for executing a batch.
add_action( 'wp_ajax_woocommerce_do_ajax_product_import', array( $this, 'do_ajax_product_import' ) );
The exact details of their endpoint are not that important. In your case you can just put a loop in there that calls your product-save function for the specified number of items per batch, as per the code above.
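As a rough sketch (the action name, nonce, and helper functions are placeholders, not WooCommerce code), such an endpoint could look like this:

// Hypothetical Ajax endpoint that processes one batch per request.
add_action('wp_ajax_bb_import_products_batch', 'bb_import_products_batch');

function bb_import_products_batch() {
    check_ajax_referer('bb_import_products', 'security'); // Nonce sent along by the front end

    $batch_size = 50;
    $start      = isset($_POST['start']) ? (int) $_POST['start'] : 0;

    $all   = fetchAllFromApi();                   // Cached API response, as above
    $batch = array_slice($all, $start, $batch_size);

    foreach ($batch as $data) {
        saveProduct($data);                       // Your createOrUpdateProduct() logic goes here
    }

    $next = $start + $batch_size;
    wp_send_json_success(array(
        'position' => $next >= count($all) ? 'done' : $next, // The front end stops when it sees 'done'
    ));
}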
Front end
I'll just include the relevant fragments from their front end file handling the import batching.
They initiate a request to this endpoint in the run_import method.
$.ajax( {
    type: 'POST',
    url: ajaxurl,
    data: {
        action          : 'woocommerce_do_ajax_product_import',
        position        : $this.position,
        mapping         : $this.mapping,
        file            : $this.file,
        update_existing : $this.update_existing,
        delimiter       : $this.delimiter,
        security        : $this.security
    },
In the response handler the function calls itself as long as the response didn't indicate this was the last batch.
if ( 'done' === response.data.position ) {
    // ...
} else {
    $this.run_import();
}
And that's basically it. Their implementation is a bit more complex because they handle mapping CSV fields, which you don't need to do.
ElasticPress plugin
This plugin integrates WordPress with Elasticsearch. While it's quite a different use case, they face the same problem when synchronizing a large amount of data to the Elasticsearch index.
If you don't have the plugin set up (and it requires setting up Elasticsearch too), the code can be a bit hard to follow. I still mention it because, if you do have it, it's easy to see their solution in action on the admin index page.
Front end
They use essentially the React version of WooCommerce's jQuery code (source).
const doIndex = useCallback(
    /**
     * Start or continues a sync.
     *
     * @param {boolean} isDeleting Whether to delete and sync.
     * @returns {void}
     */
    (isDeleting) => {
        index(isDeleting)
            .then(updateSyncState)
            .then(
                /**
                 * If an existing sync has been found just check its status,
                 * otherwise continue syncing.
                 *
                 * @param {string} method Sync method.
                 */
                (method) => {
                    if (method === 'cli') {
                        doIndexStatus();
                    } else {
                        doIndex(isDeleting);
                    }
                },
            )
            .catch(syncFailed);
    },
    [doIndexStatus, index, syncFailed, updateSyncState],
);
Back end
I didn't find time to locate this file, but they work similarly to WooCommerce.
Approach 2: Asynchronous server side processing
You could set up an offline job runner of some sort that has a really long max execution time. Then, when the user requests a product import from the API, you can immediately send a response saying the job has been accepted for processing.
A user can then track on your page the status of this processing job (or multiple).
You can use WP-Cron as a "poor man's cron job"; it executes these jobs piggy-backed on regular page requests. That's not advisable, though, especially if the job takes long to run.
At least that's the default behavior. You can set it up to only run when triggered by the system cron.
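Disabling the built-in trigger and calling wp-cron.php from the system cron is the standard setup for that:

// In wp-config.php: stop WP-Cron from firing on normal page requests.
define('DISABLE_WP_CRON', true);

// Then trigger it from the system crontab instead, e.g. every 5 minutes:
// */5 * * * * wget -q -O - https://example.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1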
You could still batch these jobs, of course.
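A minimal sketch of batching with WP-Cron (hook and function names are placeholders): each run processes one slice and schedules the next.

// Hypothetical self-rescheduling batch job.
add_action('bb_import_products_cron', 'bb_import_products_cron', 10, 1);

function bb_import_products_cron($start = 0) {
    $batch_size = 50;
    $all   = fetchAllFromApi();                   // Cached API response
    $batch = array_slice($all, $start, $batch_size);

    foreach ($batch as $data) {
        saveProduct($data);
    }

    if ($start + $batch_size < count($all)) {
        // Schedule the next batch one minute from now.
        wp_schedule_single_event(time() + 60, 'bb_import_products_cron', array($start + $batch_size));
    }
}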
Pro:
doesn't need web requests so will always complete
no impact on web traffic response times
Con:
requires much more complex infrastructure
challenging to implement
Approach 3: Run batch SQL queries to improve capacity
This approach is generally not advisable in WordPress, as it bypasses a whole bunch of filters and actions that may be used by other plugins. For some types of data it's not a problem, but you should only do this if you're entirely sure it's safe.
The slowness is likely due to many separate database queries being executed for every product, one by one.
If you know the exact structure your data needs to have in the database, and are sure it doesn't need to pass through any filters first, you can write plain SQL queries that do batched inserts.
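Purely as an illustration (the column choices and meta key are made up; adapt them to your actual data, and only use this for data that really doesn't need to pass through filters), a batched insert into postmeta could look like this:

global $wpdb;

$values       = array();
$placeholders = array();
foreach ($batch as $item) {
    $placeholders[] = '(%d, %s, %s)';
    array_push($values, $item['post_id'], '_external_sku', $item['sku']);
}

// One INSERT with many value rows instead of one query per row.
$sql = "INSERT INTO {$wpdb->postmeta} (post_id, meta_key, meta_value) VALUES "
     . implode(', ', $placeholders);
$wpdb->query($wpdb->prepare($sql, $values));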
There's also this WordPress StackExchange answer which suggests using wp_defer_term_counting( true ) before batch importing. I mention it because it likely won't be enough of a speedup on its own, but you never know.
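Using it is a one-liner on each side of the import:

wp_defer_term_counting(true);   // Defer term count updates during the bulk import
// ... create/update products here ...
wp_defer_term_counting(false);  // Re-enable counting and run the deferred recount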
Pro:
orders of magnitude faster
Con:
generally unsafe to do
need to maintain a complex SQL query
Approach 4: Use WooCommerce's built in CSV importer
WooCommerce has a built-in CSV importer that already handles importing large numbers of items. It's implemented like Approach 1; here's the endpoint they call from the front end to execute batches.
This involves writing the API data to a CSV file and inserting the right records to initiate the WooCommerce import, as though a user had just uploaded that file. Alternatively, you could figure out how to create these import jobs directly without the intermediary file, but going through the file is probably easier to implement overall.
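For example, dumping the fetched products into a CSV with the importer's column names could look roughly like this (a sketch; verify the exact column headers against a CSV exported from your own shop):

// Hypothetical: write the API products to a CSV the WooCommerce importer understands.
$fh = fopen(wp_upload_dir()['basedir'] . '/products-import.csv', 'w');

// Column headers -- check these against a sample WooCommerce product export.
fputcsv($fh, array('Name', 'SKU', 'Regular price', 'Description', 'Categories'));

foreach ($inStock as $item) {
    fputcsv($fh, array(
        $item['name'],
        $item['sku'],
        $item['regularPrice'],
        $item['description'],
        $item['category'],
    ));
}
fclose($fh);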
I might come back to this with more detailed steps, as it would definitely be the overall best solution.
Pro:
no need to implement these features
more stable
Con:
need to figure out how WooCommerce CSV upload works
may require a lot of "wiring up" code
My code is based on the sample mentioned on this page:
use Google\Cloud\Logging\LoggingClient;

$filter = sprintf(
    'resource.type="gae_app" severity="%s" logName="%s"',
    strtoupper($level),
    sprintf('projects/%s/logs/app', 'MY_PROJECT_ID'),
);

$logOptions = [
    'pageSize' => 20,
    'resultLimit' => 20,
    'filter' => $filter,
];
$logging = new LoggingClient();
$logs = $logging->entries($logOptions);
foreach ($logs as $log) {
    /* Do something with the logs */
}
This code is (at best) slow to complete, and (at worst) times out on the foreach loop with a DEADLINE_EXCEEDED error.
How can I fix this?
If your query does not match the first few logs it finds, Cloud Logging will attempt to search your entire logging history for the matching logs.
If there are too many logs to filter through, the search will time out with a DEADLINE_EXCEEDED message.
You can fix this by specifying a time frame to search from in your filter clause:
// Specify a time frame to search (e.g. the last 5 minutes)
$fiveMinAgo = date(\DateTime::RFC3339, strtotime('-5 minutes'));

// Add the time frame constraint to the filter clause
$filter = sprintf(
    'resource.type="gae_app" severity="%s" logName="%s" timestamp>="%s"',
    strtoupper($level),
    sprintf('projects/%s/logs/app', 'MY_PROJECT_ID'),
    $fiveMinAgo
);
I am trying to send a notification when a certain event is one hour away.
For testing purposes I currently have the cron process run once a minute, but I suspect there is a more efficient way to go about this.
I am trying to avoid keeping track of sent notifications, so I am trying to build in some logic that triggers each notification exactly once.
Here is my current process:
function webinar_starts_onehour() {
    // Get all lessons that are in the future
    $today = time();
    $args = array(
        'post_type'      => 'lessons',
        'post_status'    => 'publish',
        'posts_per_page' => -1,
        'meta_query'     => array(
            'key'     => 'webinar_time',
            'value'   => $today,
            'compare' => '>='
        )
    );
    $lessons = get_posts( $args );
    $notifications = get_notifications( 'webinar_starts_onehour' );

    // Foreach lesson
    foreach ( $lessons as $lesson ) {
        $webinar_time = strtotime($lesson->webinar_time);
        $difference = round(($webinar_time - $today) / 3600, 2);
        if (($difference > .98) && ($difference < 1.017)) {
            // do something
        }
    }
}
So what I am trying to do is have it trigger if the event is a little less than an hour away or a little more (i.e. plus or minus one minute).
I suspect my condition could match twice in some situations, so I am trying to figure out a more solid way to make sure that, with a cron firing every minute, this condition triggers only once.
Ideas?
And if you think this is really unreliable (which I know it is), what would be a pragmatic way to add a table to track this sort of notification? Would I just create a table, say sent_notifications, with user_id, notification_id, lesson_id, and status columns,
then check whether a successful notification already exists for a particular lesson, and use another cron to keep retrying the failed sends?
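Something like this is what I have in mind, just as a sketch ($user_id would come from wherever the notification loop runs):

// Sketch of the tracking table and the duplicate check.
global $wpdb;
$table = $wpdb->prefix . 'sent_notifications';

require_once ABSPATH . 'wp-admin/includes/upgrade.php';
dbDelta("CREATE TABLE {$table} (
    id bigint(20) unsigned NOT NULL AUTO_INCREMENT,
    user_id bigint(20) unsigned NOT NULL,
    notification_id bigint(20) unsigned NOT NULL,
    lesson_id bigint(20) unsigned NOT NULL,
    status varchar(20) NOT NULL DEFAULT 'pending',
    PRIMARY KEY  (id),
    KEY lesson_user (lesson_id, user_id)
) {$wpdb->get_charset_collate()};");

// Before sending, check whether this lesson/user combination was already handled.
$already_sent = $wpdb->get_var($wpdb->prepare(
    "SELECT id FROM {$table} WHERE lesson_id = %d AND user_id = %d AND status = 'sent'",
    $lesson->ID, $user_id
));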
thanks,
Brian
Trying to update a batch of emails. I think I've tried every way to do this, but my use of DrewM's MailChimp wrapper only returns the following $result content:
Array ( [id] => 1234abcd [status] => pending [total_operations] => 0 [finished_operations] => 0
And so on. No errors, but no operations!
Essentially, my code looks like this, where $emails stores all the emails in an array.
include("MailChimp.php");
include("Batch.php");
$list_id = "1234abcd";
use \DrewM\MailChimp\MailChimp;
use \DrewM\MailChimp\Batch;
$apiKey = 'aslkjf84983hg84938h89gd-us13';
if(!isset($emails)){ // If not sending bulk requests
$MailChimp = new MailChimp($apiKey);
$subscriber_hash = $MailChimp->subscriberHash($email);
$result = $MailChimp->patch("lists/$list_id/members/$subscriber_hash",
array(
'status' => 'subscribed',
)
);
/* SENDING BATCH OF EMAILS */
} else if($emails){
$MailChimp = new MailChimp($apiKey);
$Batch = $MailChimp->new_batch();
$i = 1;
foreach($emails as &$value){
$Batch->post("op".$i, "lists/$list_id/members", [
'email_address' => $value,
'status' => 'subscribed',
]);
$i++;
}
$result = $Batch->execute(); // Send the request (not working I guess)
$MailChimp->new_batch($batch_id); // Now get results
$result = $Batch->check_status();
print_r($result);
}
If anyone can see what I'm not seeing, I'll be very grateful!
Problem solved. After talking with a rep at MailChimp, we found two major problems.
First, instead of using a POST method, he said to use PUT when working with already-existing emails. POST is best for adding new emails, while PUT can both add and update them.
So, change
$Batch->post
to
$Batch->put
Secondly, after the requests were sent successfully but errors showed up in $result, he found they were 405 errors and told me to append the MD5 subscriber hash of each email to the endpoint.
So, change
$Batch->post("op".$i, "lists/$list_id/members", [ ...
to
$subscriber_hash = $MailChimp->subscriberHash($value);
$Batch->put("op$i", "lists/$list_id/members/$subscriber_hash", [ ...
And they sent me a MailChimp stocking cap for being a good sport :-)
Veni. Vidi. Vici.
I have successfully created a simple RSS feed, but entries keep coming back as unread and updated, and entries deleted from the client reappear every time I ask Mail to update the feed.
What am I doing wrong?
I use this simple function to create an RSS feed:
public static function getFeed($db)
{
    $title = 'Latest feeds';
    $feedUri = '/rss/';
    // Link from which the feed is available
    $link = 'http://' . $_SERVER['HTTP_HOST'] . $feedUri;

    // Create array according to structure defined in Zend_Feed documentation
    $feedArr = array(
        'title' => $title,
        'link' => $link,
        'description' => $title,
        'language' => 'en-us',
        'charset' => 'utf-8',
        //'published' => 1237281011,
        'generator' => 'Zend Framework Zend_Feed',
        'entries' => array()
    );

    $itemObjs = array();
    $select = $db->select('id')->from('things')
        ->order('createddate desc')
        ->limit(10);
    $results = $db->fetchAll($select->__toString());
    $count = count($results);
    for ($i = 0; $i < $count; $i++) {
        $itemObjs[] = SiteUtil::getItemObjectInstance($db, $results[$i]['id']);
    }

    $count = count($itemObjs);
    for ($i = 0; $i < $count; $i++) {
        $obj = & $itemObjs[$i];
        $feedArr['entries'][] = array(
            'title' => $obj->getSummary(),
            'link' => 'http://' . $_SERVER['HTTP_HOST'] . $obj->getDetailUri(),
            'description' => $obj->description,
            'publishdate' => $obj->publishedDate,
            'guid' => 'http://' . $_SERVER['HTTP_HOST'] . $obj->getDetailUri()
        );
    }

    $feed = Zend_Feed::importArray($feedArr, 'rss');
    return $feed;
}
The action in the controller class is:
public function rssAction()
{
    $feed = FeedUtil::getFeed($this->db);
    $feed->send();
}
So to access the feed, I point the client to:
http://mysite.com/rss
I am using Mac Mail's RSS client to test. The feed downloads just fine, showing all five items I have in the database for testing. The problems are as follows:
1) If I mark one or more items as 'read' and then tell Mail to update the feed, it pulls all items again as if I had never downloaded them in the first place.
2) If I delete one or more items, they come back again, unread, again as if it were the first time I subscribed to the feed.
3) Feeds are always marked as updated. Is that supposed to be the case?
Is it something to do with the parameters I'm setting, am I omitting something, or could the solution be something more subtle, like setting HTTP headers (e.g. '304 Not Modified')?
My understanding of RSS is that once an item has been marked as read or deleted from the client, it should never come back, which is the behaviour I'm after.
Just to note, the 'link' and 'guid' parameters are always unique, and I have tried experimenting with the 'published' and 'publishdate' (both optional) attributes, only to get the same result. The above code is a simplified version of what I have, showing only the relevant bits, and finally, yes, I have read the RSS specification.
Thanks in advance for any help offered here, I'll be happy to clarify any point.
According to the Zend Framework documentation, you must use the lastUpdate parameter to set the last modification date of an entry.
'entries' => array(
    array(
        [...]
        'lastUpdate' => 'timestamp of the publication date', // optional
        [...]
So published for the feed, and lastUpdate for the entries.
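Applied to the code in the question, that means adding one more key to each entry (assuming publishedDate is a date string that strtotime can parse):

$feedArr['entries'][] = array(
    'title'       => $obj->getSummary(),
    'link'        => 'http://' . $_SERVER['HTTP_HOST'] . $obj->getDetailUri(),
    'description' => $obj->description,
    'lastUpdate'  => strtotime($obj->publishedDate), // timestamp of the publication date
    'guid'        => 'http://' . $_SERVER['HTTP_HOST'] . $obj->getDetailUri()
);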