PHP loop db query until value changes - php

Basically, I have a rate limit from my API provider of 50 simultaneous jobs. As an example, let's say I have to run 500 jobs:
$jobs = $this->db->get('jobs')->result_array(); // loads all 500 jobs
The code below loops the API query over all 500 jobs.
foreach($jobs as $job)
{
//API call
}
Each job has a status parameter, $job['status']. I want to do the following:
To avoid abusing the rate limit, I want to count the rows with busy status
$busy = $this->db->get_where('jobs', ['status' => 'busy'])->num_rows();
and keep checking (looping) until $busy < 50.
Final results
foreach($jobs as $job)
{
//API call
//Count busy jobs
//If busy >= 50 wait(50) and check again - (I need help with this part, no idea how to do it)
//If busy < 50 continue foreach loop
}
Hope all details are in place.
P.S. I am using CodeIgniter 3 and PHP 7.4.
Edit:
To avoid confusion about the line in the final result (//If busy >= 50 wait(50) and check again - (I need help with this part, no idea how to do it)):
I am looking for a way to loop the $busy SQL query until num_rows() is less than 50.

Found solution
$busy = $this->db->get_where('jobs', ['status' => 'busy'])->num_rows();
while ($busy >= 50) {
    // re-run the count until fewer than 50 jobs are busy
    $busy = $this->db->get_where('jobs', ['status' => 'busy'])->num_rows();
}
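The loop above re-runs the count query as fast as PHP can issue it. A minimal sketch of the same idea with a pause between checks and an upper bound on the number of checks follows. The function name waitForFreeSlot, its parameters, and the default values are assumptions made for illustration; they are not part of CodeIgniter.

```php
<?php
declare(strict_types=1);

/**
 * Block until $countBusy() reports fewer than $limit busy jobs.
 *
 * $countBusy is a hypothetical callable wrapping the real count query.
 * $sleep is injectable so the function can be exercised without waiting.
 * Returns true once a slot is free, false if $maxChecks is exhausted.
 */
function waitForFreeSlot(
    callable $countBusy,
    int $limit = 50,
    int $pauseSeconds = 5,
    int $maxChecks = 120,
    ?callable $sleep = null
): bool {
    $sleep = $sleep ?? 'sleep';
    if ($maxChecks < 1) {
        throw new InvalidArgumentException('maxChecks must be at least 1');
    }
    for ($check = 0; $check < $maxChecks; $check++) {
        if ($countBusy() < $limit) {
            return true;
        }
        // Give running jobs time to finish before asking the database again.
        $sleep($pauseSeconds);
    }
    return false;
}
```

Inside the foreach loop from the question this might be called as waitForFreeSlot(fn () => $this->db->get_where('jobs', ['status' => 'busy'])->num_rows()); before each API call, skipping or aborting when it returns false.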

Related

mongodb deferred update while running the script

I do not understand what is going on with my migration script. I have a collection with 40M+ records in it, and historically that collection did not have a strict model, so I'm adding default values for some optional fields; for example, if a document does not have deleted_at, I'll add it with a null value.
Basically, I'm taking documents in batches of 300, checking whether each document should be updated and, if so, updating it. All was fine: I was able to update 12M documents in 9 hours. But after that, something weird started to happen. First of all, it started to work much slower, around 100k documents an hour, which is ~10x slower than before. Also, from the logs I can see that the script is updating documents pretty fast (I have a bunch of log entries for updated documents every second), but if I run the count query to get the number of modified documents, the number does not increase as often. For example, according to the logs, 400 rows were updated in 10 seconds, but the number of modified documents did not increase when the count query ran. The number of modified documents only increases once in a while: it can stay the same for 2-3 minutes and then, at some point, jump by 4k rows.
So I do not understand why at some point Mongo starts running updates with a delay, as if it were scheduling them, and why it starts to work slower.
The script is pretty big, but I'll try to share a simplified version so you can see how I'm looping through the documents:
class Migration {
private Connection $connection;
public function __construct(Connection $collection)
{
$this->connection = $collection;
}
public function migrate(): void
{
$totalAmount = $this->connection->collection('collection')->count();
$chunkSize = 300;
$lastIdInBatch = null;
for ($i = 0; $i < $totalAmount; $i += $chunkSize) {
$aggregation = [];
$aggregation[] = [
'$sort' => ['_id' => 1],
];
if ($lastIdInBatch !== null) {
$aggregation[] = [
'$match' => [
'_id' => [
'$gt' => new ObjectId($lastIdInBatch),
],
],
];
}
$aggregation[] = [
'$limit' => $chunkSize,
];
$documents = $this->connection->collection('collection')->raw()->aggregate(
$aggregation
);
$lastIdInBatch = $documents[array_key_last($documents)]['_id'];
foreach ($documents as $document) {
// checks to see if we need to update the document
// ....
if (!empty($changes)) {
$updated = $this->connection
->collection('collection')
->where('_id', $document['_id'])
->update($changes);
if ($updated) {
Log::info('row updated', ['product_id' => $document['_id']]); // I see several of these log entries every second, but no changes in the database
}
}
}
}
}
}
The issue resolved itself after a restart of the Kubernetes pod, so it seems it wasn't an issue with Mongo.

Cake PHP prevent retrieving same model rows from database with multiple cron jobs

I'm working inside a Cake PHP 2 web application. I have a database table called jobs where data is stored, and a Console command which runs on a cron every minute. When it runs, it grabs data from my jobs table in a function called getJobsByQueuePriority and then does something.
The issue I'm facing is that I have multiple cron jobs that need to run every minute at the same time. When they run, they both grab the same sets of data from the database table. How can I prevent this and ensure that if a row was already retrieved by one cron, the other cron picks a different row?
I initially tried adding 'lock' => true to my queries as per the docs, but this isn't achieving the result I need: when logging data to a file, both running crons are pulling the same database entry IDs.
I then tried using transactions. I put a begin before the queries and a commit afterwards. Maybe this is what I need to use, but I am using it slightly wrong?
The function which performs the required query, with my attempt at transactions, is:
/**
* Get queues in order of priority
*/
public function getJobsByQueuePriority($maxWorkers = 0)
{
$jobs = [];
$queues = explode(',', $this->param('queue'));
// how many queues have been set for processing?
$queueCount = count($queues);
$this->QueueManagerJob = ClassRegistry::init('QueueManagerJob');
$this->QueueManagerJob->begin();
// let's first figure out how many jobs are in each of our queues,
// this is so that if a queue has no jobs then we can reassign
// how many jobs can be allocated based on our maximum worker
// count.
foreach ($queues as $queue) {
// count jobs in this queue
$jobCountInQueue = $this->QueueManagerJob->find('count', array(
'conditions' => array(
'QueueManagerJob.reserved_at' => null,
'QueueManagerJob.queue' => $queue
)
));
// if there's no jobs in the queue, subtract a queue
// from our queue count.
if ($jobCountInQueue <= 0) {
$queueCount = $queueCount - 1;
}
}
// just in case we end up on zero.
if ($queueCount <= 0) {
$queueCount = 1;
}
// the amount of jobs we should grab
$limit = round($maxWorkers / $queueCount);
// now let's get all of the jobs in each queue with our
// queue count limit.
foreach ($queues as $queue) {
$job = $this->QueueManagerJob->find('all', array(
'conditions' => array(
'QueueManagerJob.reserved_at' => null,
'QueueManagerJob.queue' => $queue
),
'order' => array(
'QueueManagerJob.available_at' => 'desc'
),
'limit' => $limit
));
// if there's no job for this queue
// skip to the next so that we don't add
// an empty item to our jobs array.
if (!$job) {
continue;
}
// add the job to the list of jobs
array_push($jobs, $job);
}
$this->QueueManagerJob->commit();
// return the jobs
return $jobs[0];
}
What am I missing, or is there a small change I need to make to my function to prevent multiple crons from picking the same entries?
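One common way to avoid two workers reading the same rows is to let a single UPDATE statement both pick and mark the rows, then read back only the rows carrying this worker's marker. The sketch below illustrates that idea with plain PDO against SQLite rather than CakePHP's model layer. The reserved_by column, the claimJobs function, and the worker token are assumptions added for illustration; the other table and column names mirror the question. On MySQL the IN (subquery with LIMIT) form is not accepted, so an UPDATE ... ORDER BY ... LIMIT statement would be needed instead.

```php
<?php
declare(strict_types=1);

/**
 * Claim up to $limit unreserved jobs for one worker and return their ids.
 * reserved_by is a hypothetical extra column holding a per-run worker token.
 */
function claimJobs(PDO $db, string $queue, int $limit, string $workerToken): array
{
    // One UPDATE selects and marks the rows in a single statement,
    // so a second worker running afterwards no longer sees them as free.
    $claim = $db->prepare(
        'UPDATE jobs
            SET reserved_at = :now, reserved_by = :token
          WHERE id IN (
                SELECT id FROM jobs
                 WHERE reserved_at IS NULL AND queue = :queue
                 ORDER BY available_at DESC
                 LIMIT :limit)'
    );
    $claim->bindValue(':now', time(), PDO::PARAM_INT);
    $claim->bindValue(':token', $workerToken);
    $claim->bindValue(':queue', $queue);
    $claim->bindValue(':limit', $limit, PDO::PARAM_INT);
    $claim->execute();

    // Read back only the rows this worker marked.
    $mine = $db->prepare('SELECT id FROM jobs WHERE reserved_by = :token ORDER BY id');
    $mine->execute([':token' => $workerToken]);

    return array_map('intval', $mine->fetchAll(PDO::FETCH_COLUMN));
}
```

Each cron run would pass its own unique token (for example one built from uniqid()), so the read-back query returns only the rows claimed in that run.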

Getting more than 50 000 results crashes the server on Laravel 5.8

I'm trying to get 400 000 results from a database with Laravel. My problem is that the server gives me a 500 error code, but does not generate a log file.
When I put a limit on my query, it works, but the limit is around 20 000.
Maybe it's the php.ini file, but I configured it like this:
max_execution_time: -1
max_input_time: -1
max_input_vars: -1
memory_limit: -1
post_max_size: -1
While Laravel gives a 500 error code with the limit, the server doesn't give an error. I need the 400 000 results.
(from OP in the comments):
The code simply gets the whole table, ordered by datetime.
Thanks for your help
Check this out:
Eloquent collection chunk
query builder chunk
Use chunk to get records in batch, and insert them in batch:
Model::orderBy('id')
->select(['col1', 'col2', ...])
->chunk(100, function ($records) {
// Create your array here:
$inserted_array = [];
foreach($records as $record) {
$inserted_attr = [];
$inserted_attr['column1'] = $record->col1;
$inserted_attr['column2'] = $record->col2;
...
$inserted_array []= $inserted_attr;
}
// And inserted 100 records at once.
History::insert($inserted_array);
});
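To show why chunking keeps memory bounded, here is a framework-free sketch of the same pattern: rows are consumed from any iterable and handed on in fixed-size groups, so only one group is held at a time. The inChunks function is a hypothetical helper written for illustration; it is not Laravel's chunk implementation.

```php
<?php
declare(strict_types=1);

/**
 * Yield arrays of at most $size rows from any iterable source.
 * Only the current chunk is kept in memory.
 */
function inChunks(iterable $rows, int $size): Generator
{
    if ($size < 1) {
        throw new InvalidArgumentException('size must be at least 1');
    }
    $buffer = [];
    foreach ($rows as $row) {
        $buffer[] = $row;
        if (count($buffer) === $size) {
            yield $buffer;
            $buffer = []; // release the previous chunk
        }
    }
    if ($buffer !== []) {
        yield $buffer; // final, possibly shorter, chunk
    }
}
```

A caller would loop with foreach (inChunks($source, 100) as $records) and perform one bulk insert per chunk, much like the closure passed to chunk() above.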

Running for loop function inside Laravel Observable

I have an orders table with orderStatus and paymentStatus fields. When an order is made, the orderStatus is set to initialized and paymentStatus set to pending.
When the order is created, I want to check whether paymentStatus has changed. If it has not changed after 12 minutes, I want to update orderStatus to completed and paymentStatus to aborted.
I have a scheduled task that checks every minute, but unfortunately I have not been able to run cron jobs on Bluehost. So I tried using a for loop in the created method of OrderObserver, but the code doesn't work.
public function created(Order $order)
{
// check if user reservation record exist
$reservation = Reservation::where([
['user_id', $order->user_id],
['product_id', $order->product_id]
]);
if ($reservation) {
// delete reservation record
$reservation->delete();
}
// start 12 mins count down for payment
$period = ($order->created_at)->diffInMinutes();
for ($counter = 0; $period >= 12; ++$counter) {
$order->update([
'orderStatus' => 'completed',
'paymentStatus' => 'aborted'
]);
}
}
From php artisan tinker, I can see that this part of the code works
for ($counter = 0; $period >= 12; ++$counter) {
$order->update([
'orderStatus' => 'completed',
'paymentStatus' => 'aborted'
]);
}
Why does the code not run in the observable?
This might have something to do with the fact that you are blocking PHP execution for 12 minutes by keeping it stuck in the same for loop. You're probably exceeding the maximum execution time.
max_execution_time integer
This sets the maximum time in seconds a script is allowed to run before it is terminated by the parser. This helps prevent poorly written scripts from tying up the server. The default setting is 30. When running PHP from the command line the default setting is 0.
Seeing as artisan tinker runs from the command line, it makes sense that it works there.
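Rather than holding a request open in a loop, the deadline could be checked with a single timestamp comparison whenever the order is next loaded, or whenever a scheduled task is able to run. A minimal sketch in plain PHP follows; paymentTimedOut is a hypothetical helper, and the 'pending' status value and 12-minute default are taken from the question.

```php
<?php
declare(strict_types=1);

/**
 * Decide whether a pending payment has passed its deadline.
 * Intended to be called once per check, not inside a waiting loop.
 */
function paymentTimedOut(
    DateTimeImmutable $createdAt,
    DateTimeImmutable $now,
    string $paymentStatus,
    int $timeoutMinutes = 12
): bool {
    if ($paymentStatus !== 'pending') {
        return false; // payment already resolved, nothing to abort
    }
    $elapsedSeconds = $now->getTimestamp() - $createdAt->getTimestamp();

    return $elapsedSeconds >= $timeoutMinutes * 60;
}
```

When this returns true, the caller would apply the same update shown in the question (orderStatus to completed, paymentStatus to aborted).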

FB Ads API (#17) User request limit reached

I am working with the Facebook Ads API to get account campaign data. What I am doing here is getting a list of all campaigns and then looping over each campaign to get its stats:
$campaignSets = $account->getCampaigns(array(
CampaignFields::ID,
CampaignFields::NAME
));
foreach ($campaignSets as $campaign) {
$campaign = new Campaign($campaign->id);
$fields = array(
InsightsFields::CAMPAIGN_NAME,
InsightsFields::IMPRESSIONS,
InsightsFields::UNIQUE_CLICKS,
InsightsFields::REACH,
InsightsFields::SPEND,
InsightsFields::TOTAL_ACTIONS,
InsightsFields::TOTAL_ACTION_VALUE
);
$params = array(
'date_preset' => InsightsPresets::TODAY
);
$insights = $campaign->getInsights($fields, $params);
}
When executing the above code, I get the error (#17) User request limit reached.
Can anyone help me solve this kind of error?
Thanks,
Ronak Shah
You should consider generating a single report against the ad account which returns insights for all of your campaigns. This should significantly reduce the number of requests required.
Cursor::setDefaultUseImplicitFetch(true);
$account = new AdAccount($account_id);
$fields = array(
InsightsFields::CAMPAIGN_NAME,
InsightsFields::CAMPAIGN_ID,
InsightsFields::IMPRESSIONS,
InsightsFields::UNIQUE_CLICKS,
InsightsFields::REACH,
InsightsFields::SPEND,
InsightsFields::TOTAL_ACTIONS,
InsightsFields::TOTAL_ACTION_VALUE,
);
$params = array(
'date_preset' => InsightsPresets::TODAY,
'level' => 'ad',
'limit' => 1000,
);
$insights = $account->getInsights($fields, $params);
foreach($insights as $i) {
echo $i->campaign_id.PHP_EOL;
}
If you run into API limits, your only option is to reduce calls. You can do this easily by delaying API calls. I assume you are already using a Cron Job, so implement a counter that stores the last campaign you have requested the data for. When the Cron Job runs again, request the data of the next 1-x campaign data (you have to test how many are possible per Cron Job call) and store the last one again.
Also, you should batch the API calls - it will not avoid limits, but it will be a lot faster. As fast as the slowest API call in the batch.
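The counter idea described above (remember where the last run stopped, continue from there on the next run) can be sketched as a small pure function. The name nextCampaignBatch and the choice to wrap back to the start of the list are assumptions for illustration; where the offset is persisted between cron runs (file, database, cache) is left to the caller.

```php
<?php
declare(strict_types=1);

/**
 * Return the next slice of campaign ids plus the offset to store for the following run.
 * The returned offset wraps to 0 once the end of the list has been reached.
 */
function nextCampaignBatch(array $campaignIds, int $offset, int $batchSize): array
{
    $total = count($campaignIds);
    if ($total === 0 || $batchSize <= 0) {
        return [[], 0];
    }
    // Tolerate a stale or negative stored offset, e.g. after campaigns were removed.
    $offset = max(0, $offset) % $total;
    $batch = array_slice($campaignIds, $offset, $batchSize);
    $next = $offset + count($batch);

    return [$batch, $next >= $total ? 0 : $next];
}
```

Each cron run would load the stored offset, request insights only for the returned batch, and then store the returned offset for the next run.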
Add this to your code and you'll never have to worry about FB's Rate Limiting/User Limit Reached.
Your script will automatically sleep as soon as you approach the limit, and then pick up from where it left after the cool down. Enjoy :)
import logging
import time
import requests as rq

# account_number and my_access_token must be defined before check_limit() is called.

# Function to find the string between two strings or characters
def find_between(s, first, last):
    try:
        start = s.index(first) + len(first)
        end = s.index(last, start)
        return s[start:end]
    except ValueError:
        return ""

# Function to check how close you are to the FB Rate Limit
def check_limit():
    check = rq.get('https://graph.facebook.com/v3.3/act_' + account_number + '/insights?access_token=' + my_access_token)
    call = float(find_between(check.headers['x-business-use-case-usage'], 'call_count":', '}'))
    cpu = float(find_between(check.headers['x-business-use-case-usage'], 'total_cputime":', '}'))
    total = float(find_between(check.headers['x-business-use-case-usage'], 'total_time":', ','))
    usage = max(call, cpu, total)
    return usage

# Check if you reached 75% of the limit; if yes, back off for 5 minutes
# (put this chunk in your loop, every 200-500 iterations)
if check_limit() > 75:
    print('75% Rate Limit Reached. Cooling Time 5 Minutes.')
    logging.debug('75% Rate Limit Reached. Cooling Time 5 Minutes.')
    time.sleep(300)
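Since the question itself uses PHP, the same usage check might be sketched in PHP by decoding the header as JSON instead of slicing strings. This is a sketch under an assumption about the header's shape: an object keyed by an id, each value being a list of entries with call_count, total_cputime and total_time percentages. maxUsagePercent is a hypothetical helper, and fetching the header from a response is left to the caller.

```php
<?php
declare(strict_types=1);

/**
 * Return the highest usage percentage found in an X-Business-Use-Case-Usage header value.
 * Assumed shape: {"<id>": [{"call_count": n, "total_cputime": n, "total_time": n, ...}]}
 */
function maxUsagePercent(string $headerValue): float
{
    $decoded = json_decode($headerValue, true);
    if (!is_array($decoded)) {
        return 0.0; // missing or malformed header: no usage information available
    }
    $max = 0.0;
    foreach ($decoded as $entries) {
        if (!is_array($entries)) {
            continue;
        }
        foreach ($entries as $entry) {
            if (!is_array($entry)) {
                continue;
            }
            foreach (['call_count', 'total_cputime', 'total_time'] as $key) {
                if (isset($entry[$key]) && is_numeric($entry[$key])) {
                    $max = max($max, (float) $entry[$key]);
                }
            }
        }
    }

    return $max;
}
```

A loop over campaigns could then pause, for instance with sleep(300), whenever maxUsagePercent($header) exceeds 75, mirroring the threshold and cool-down used in the Python version.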
