I have a module called parser2 that parses HTML pages. Everything works perfectly except the cron schedule. My task is never added to the cron tasks, and in the logs I see 0 scheduled tasks every time cron runs. The second issue: if I start cron manually a few times, I get the white screen of death, and the only thing that helps is deleting the parser2 tables from the DB and the module's entry from the system table, and then running update.php.
Here is the code that should be doing all this work, but I can't see where the error is:
function parser_cron_queue_info() {
$info = array();
$info['get_parser_weather'] = array(
'worker callback' => 'parser_weather',
'time' => 10,
);
$query = db_select('parser_jobs', 'pn')
->fields('pn', array('id','time_run_in_crone'))
->condition('run_in_crone', 1)
->execute();
foreach ($query as $job){
$info['get_parser_weather_'.$job->id] = array(
'worker callback' => 'parser_weather',
'time' => $job->time_run_in_crone,
);
}
return $info;
}
function parser_cron() {
$query = db_select('parser_jobs', 'pn')
->fields('pn', array('id','time_run_in_crone'))
->condition('run_in_crone', 1)
->execute();
foreach ($query as $job){
$queue = DrupalQueue::get('get_parser_weather_'.$job->id);
$queue->createItem($job->id);
}
}
function parser_weather($job_id){
$job = parser_job_load($job_id);
_parser_url_delete_all();
_parser_url_add($job->start_url);
while (_parser_url_get_not_parsed())
{
parser_parse2($job);
};
}
Try capturing the cron command the code generates and running it from your command line to make sure it is correct. Also double-check that the user your script runs as has permission to add new tasks.
I do not understand what is going on with my migration script. I have a collection with 40M+ records in it, and historically that collection did not have a strict model, so I'm adding default values for some optional fields. For example, if a document does not have deleted_at, I add it with a null value.
Basically, I take documents in batches of 300, check whether each document should be updated, and update it if so. All was fine at first: I was able to update 12M documents in 9 hours. After that, something weird started to happen. First, it became much slower, about 100k documents an hour, which is ~10x slower than before. Also, from the logs I can see the script updating documents pretty fast (I have a bunch of log entries for updated documents every second), but when I run a count query to get the number of modified documents, the number does not increase nearly as often. For example, according to the logs 400 rows were updated in 10 seconds, but the number of modified documents did not increase when the count query ran. It only increases once in a while: it can stay the same for 2-3 minutes and then jump by 4k rows at some point.
So I do not understand why at some point mongo starts applying updates with a delay (scheduling them or something), and why it gets slower.
The script is pretty big, but I'll share a simplified version so you can see how I loop through the documents:
class Migration {
private Connection $connection;
public function __construct(Connection $collection)
{
$this->connection = $collection;
}
public function migrate(): void
{
$totalAmount = $this->connection->collection('collection')->count();
$chunkSize = 300;
$lastIdInBatch = null;
for ($i = 0; $i < $totalAmount; $i += $chunkSize) {
$aggregation = [];
$aggregation[] = [
'$sort' => ['_id' => 1],
];
if ($lastIdInBatch !== null) {
$aggregation[] = [
'$match' => [
'_id' => [
'$gt' => new ObjectId($lastIdInBatch),
],
],
];
}
$aggregation[] = [
'$limit' => $chunkSize,
];
$documents = $this->connection->collection('collection')->raw()->aggregate(
$aggregation
);
$lastIdInBatch = $documents[array_key_last($documents)]['_id'];
foreach ($documents as $document) {
// checks to see if we need to update the document
// ....
if (!empty($changes)) {
$updated = $this->connection
->collection('collection')
->where('_id', $document['_id'])
->update($changes);
if ($updated) {
Log::info('row updated', ['product_id' => $document['_id']]); // I see several of these log entries every second, but no changes in the database
}
}
}
}
}
}
The issue resolved itself after the Kubernetes pod was restarted, so it seems it wasn't a problem with mongo.
I'm working inside a CakePHP 2 web application. I have a database table called jobs where data is stored, and a Console command that runs on a cron every minute. When it runs, it grabs data from my jobs table in a function called getJobsByQueuePriority and then does something with it.
The issue I'm facing is that I have multiple cron jobs that need to run every minute, at the same time. When they run, they both grab the same set of rows from the database table. How can I prevent this and ensure that if a row was already retrieved by one cron, the other cron picks a different row?
I initially tried adding 'lock' => true to my queries as per the docs, but this isn't achieving the result I need: when logging data to a file, both running crons are pulling the same database entry IDs.
I then tried using transactions. I put a begin before the queries and a commit afterwards. Maybe this is what I need but I'm using it slightly wrong?
The function that performs the required query, with my attempt at transactions, is:
/**
* Get queues in order of priority
*/
public function getJobsByQueuePriority($maxWorkers = 0)
{
$jobs = [];
$queues = explode(',', $this->param('queue'));
// how many queues have been set for processing?
$queueCount = count($queues);
$this->QueueManagerJob = ClassRegistry::init('QueueManagerJob');
$this->QueueManagerJob->begin();
// let's first figure out how many jobs are in each of our queues,
// this is so that if a queue has no jobs then we can reassign
// how many jobs can be allocated based on our maximum worker
// count.
foreach ($queues as $queue) {
// count jobs in this queue
$jobCountInQueue = $this->QueueManagerJob->find('count', array(
'conditions' => array(
'QueueManagerJob.reserved_at' => null,
'QueueManagerJob.queue' => $queue
)
));
// if there's no jobs in the queue, subtract a queue
// from our queue count.
if ($jobCountInQueue <= 0) {
$queueCount = $queueCount - 1;
}
}
// just in case we end up on zero.
if ($queueCount <= 0) {
$queueCount = 1;
}
// the amount of jobs we should grab
$limit = round($maxWorkers / $queueCount);
// now let's get all of the jobs in each queue with our
// queue count limit.
foreach ($queues as $queue) {
$job = $this->QueueManagerJob->find('all', array(
'conditions' => array(
'QueueManagerJob.reserved_at' => null,
'QueueManagerJob.queue' => $queue
),
'order' => array(
'QueueManagerJob.available_at' => 'desc'
),
'limit' => $limit
));
// if there's no job for this queue
// skip to the next so that we don't add
// an empty item to our jobs array.
if (!$job) {
continue;
}
// add the job to the list of jobs
array_push($jobs, $job);
}
$this->QueueManagerJob->commit();
// return the jobs
return $jobs[0];
}
What am I missing? Is there a small change I can make to my function to prevent multiple crons from picking the same entries?
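One common way to stop two workers from reading the same rows is to have each worker mark its rows first, in a single UPDATE, and only then select what it marked. Below is a minimal sketch of that idea using plain PDO rather than the CakePHP model layer. The reserved_by column and the claimJobs() function are assumptions made for illustration and are not part of the original schema. The SQL is written for SQLite so the example runs standalone; on MySQL the same idea is usually written as a single-table UPDATE ... ORDER BY ... LIMIT n.

```php
<?php
// Sketch only: `reserved_by` and claimJobs() are hypothetical, not from the original code.
function claimJobs(PDO $db, string $queue, int $limit, string $workerToken): array
{
    // Step 1: in one statement, mark up to $limit unreserved rows as owned by this worker.
    $claim = $db->prepare(
        'UPDATE jobs SET reserved_at = :now, reserved_by = :token
         WHERE id IN (
             SELECT id FROM jobs
             WHERE reserved_at IS NULL AND queue = :queue
             ORDER BY available_at DESC
             LIMIT :limit
         )'
    );
    $claim->bindValue(':now', date('Y-m-d H:i:s'));
    $claim->bindValue(':token', $workerToken);
    $claim->bindValue(':queue', $queue);
    $claim->bindValue(':limit', $limit, PDO::PARAM_INT);
    $claim->execute();

    // Step 2: read back only the rows carrying this worker's token.
    $select = $db->prepare('SELECT id FROM jobs WHERE reserved_by = :token ORDER BY id');
    $select->execute([':token' => $workerToken]);

    return array_map('intval', $select->fetchAll(PDO::FETCH_COLUMN));
}
```

Each cron run would pass its own unique token (for example uniqid('worker-', true)), so a row marked by one run no longer matches reserved_at IS NULL when the other run executes its UPDATE.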
Basically, my API provider rate-limits me to 50 simultaneous jobs. As an example, let's say I have to run 500 jobs:
$jobs = $this->db->get('jobs')->result_array(); // loads all 500 jobs
The code below loops the API query over all 500 jobs.
foreach($jobs as $job)
{
//API call
}
Each job has a status parameter, $job['status']. I want to do the following:
To avoid exceeding the rate limit, I want to count the rows with busy status
$busy = $this->db->get_where('jobs', ['status' => 'busy'])->num_rows();
and keep checking (looping) until $busy < 50.
Final result
foreach($jobs as $job)
{
//API call
//Count busy jobs
//If busy >= 50 wait(50) and check again - (I need help with this part, no idea how to do it)
//If busy < 50 continue foreach loop
}
Hope all details are in place.
p.s.
I am using:
CodeIgniter3, PHP7.4
Edit:
To avoid confusion, as mentioned in the final result (//If busy >= 50 wait(50) and check again - (I need help with this part, no idea how to do it)
)
I am looking for a way to loop the $busy SQL query until num_rows() is less than 50.
Found solution
$busy = $this->db->get_where('jobs', ['status' => 'busy'])->num_rows();
while($busy >= 50 ) {
$busy = $this->db->get_where('jobs', ['status' => 'busy'])->num_rows();
}
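Note that the loop above re-runs the count query as fast as PHP can issue it, with no pause between checks. A variant that sleeps between checks and gives up after a fixed number of attempts might look like the sketch below. waitForFreeSlot() and its parameters are hypothetical names chosen for illustration; the counting itself is passed in as a callback so the helper does not depend on CodeIgniter.

```php
<?php
// Hypothetical helper: polls $countBusy until it reports fewer than $limit busy jobs,
// sleeping between checks so the database is not queried in a tight loop.
function waitForFreeSlot(callable $countBusy, int $limit = 50, int $sleepSeconds = 5, int $maxChecks = 120): bool
{
    for ($check = 0; $check < $maxChecks; $check++) {
        if ($countBusy() < $limit) {
            return true; // a slot is free, the caller may start the next job
        }
        sleep($sleepSeconds); // pause before counting again
    }

    return false; // still at the limit after $maxChecks attempts
}
```

Inside a CodeIgniter 3 controller or model, the callback could wrap the same query as above, e.g. function () { return $this->db->get_where('jobs', ['status' => 'busy'])->num_rows(); }, and the foreach loop would call waitForFreeSlot() before each API call.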
I have an orders table with orderStatus and paymentStatus fields. When an order is made, orderStatus is set to initialized and paymentStatus is set to pending.
Once the order is created, I want to check whether paymentStatus has changed. If it has not changed after 12 minutes, I want to update orderStatus to completed and paymentStatus to aborted.
I have a scheduled task that checks every minute, but unfortunately I have not been able to run cron jobs on Bluehost. So I tried using a for loop in the created method of OrderObserver, but the code doesn't work.
public function created(Order $order)
{
// check if user reservation record exist
$reservation = Reservation::where([
['user_id', $order->user_id],
['product_id', $order->product_id]
]);
if ($reservation) {
// delete reservation record
$reservation->delete();
}
// start 12 mins count down for payment
$period = ($order->created_at)->diffInMinutes();
for ($counter = 0; $period >= 12; ++$counter) {
$order->update([
'orderStatus' => 'completed',
'paymentStatus' => 'aborted'
]);
}
}
From php artisan tinker, I can see that this part of the code works
for ($counter = 0; $period >= 12; ++$counter) {
$order->update([
'orderStatus' => 'completed',
'paymentStatus' => 'aborted'
]);
}
Why does the code not run in the observer?
This might have something to do with the fact that you are blocking PHP execution for 12 minutes by keeping it stuck in the same for loop. You're probably exceeding the maximum execution time.
max_execution_time integer
This sets the maximum time in seconds a script is allowed to run before it is terminated by the parser. This helps prevent poorly written scripts from tying up the server. The default setting is 30. When running PHP from the command line the default setting is 0.
Seeing as artisan tinker runs from the command line, it makes sense that it works there.
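It is also worth noting that $period is computed once, before the loop starts, so the condition $period >= 12 is either false from the first check or never becomes false. One alternative that avoids waiting inside the request is to compare timestamps whenever the order is next looked at. The sketch below shows only that comparison, in plain PHP. paymentWindowExpired() is a hypothetical helper, not part of Laravel, and where it gets called from is left open.

```php
<?php
// Hypothetical helper: true when the order is still pending and the
// payment window (12 minutes by default) has already passed.
function paymentWindowExpired(
    DateTimeImmutable $createdAt,
    string $paymentStatus,
    DateTimeImmutable $now,
    int $windowMinutes = 12
): bool {
    if ($paymentStatus !== 'pending') {
        return false; // payment already changed state, nothing to abort
    }

    $elapsedSeconds = $now->getTimestamp() - $createdAt->getTimestamp();

    return $elapsedSeconds >= $windowMinutes * 60;
}
```

When this returns true, the caller would run the same $order->update([...]) shown in the question.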
So I am developing a Laravel application and I am trying to optimise my seeds so that they run faster.
http://bensmith.io/speeding-up-laravel-seeders
This guide helped a ton. According to it, I should minimise the number of SQL queries by doing mass insertions, and that cut the time down to 10% of the original seeding time, which is awesome.
So now I am doing something like:
$comments = [];
for ($i = 0; $i < 50; $i++) {
$bar->advance();
$comments[] = factory(Comment::class)->make([
'created_at' => Carbon\Carbon::now(),
'updated_at' => Carbon\Carbon::now(),
'comment_type_id' => $comment_types->shuffle()->first()->id,
'user_id' => $users->shuffle()->first()->id,
'commentable_id' => $documents->shuffle()->first()->id,
])->toArray();
}
Comment::insert($comments);
This works like a charm. It gets the queries down to a single one.
But then I have other seeders where I have to work with dumps, and they are more complex:
$dump = file_get_contents(database_path('seeds/dumps/serverdump.txt'));
DB::table('server')->truncate();
DB::statement($dump);
$taxonomies = DB::table('server')->get();
foreach($taxonomies as $taxonomy){
$bar->advance();
$group = PatentClassGroup::create(['name' => $taxonomy->name]);
preg_match_all('/([a-zA-Z0-9]+)/', $taxonomy->classes, $classes);
foreach(array_pop($classes) as $key => $class){
$type = strlen($class) == 4 ? 'GROUP' : 'MAIN';
$inserted_taxonomies[] = PatentClassTaxonomy::where('name', $class)->get()->count()
? PatentClassTaxonomy::where('name', $class)->get()->first()
: PatentClassTaxonomy::create(['name' => $class, 'type' => $type]);
}
foreach($inserted_taxonomies as $inserted_taxonomy){
try{
$group->taxonomies()->attach($inserted_taxonomy->id);
}catch(\Exception $e){
//
}
}
}
So here, when I am attaching taxonomies to groups, I use native Eloquent code, so collecting the records and mass inserting them is difficult.
Yes, I can fiddle around and figure out a way to mass insert that too, but my problem is that I would have to rewrite and optimise every seed, and every part of those seeds, to mass insert.
Is there a way to listen to the DB queries Laravel is trying to execute while seeding? I know I can do something like this:
DB::listen(function($query) {
//
});
But the query would still be executed, right? What I would like to do is somehow catch the query in a variable, add it to a stack, and then execute the whole stack when the seed comes to an end, or in between too, since I might need some ids for some seeds. What is a good workaround for this? And how can I really optimise seeds in Laravel with a smart solution?
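Since a DB::listen callback only observes queries and does not hold them back, the "stack" idea would have to live on the seeder side: collect rows instead of writing them, and flush when a batch is full or when ids are needed. A minimal sketch of such a buffer follows. InsertBuffer is a hypothetical class written for illustration, and the actual write is injected as a callback so the sketch runs without Laravel.

```php
<?php
// Hypothetical helper: collects rows per table and hands them to $insertMany in batches.
final class InsertBuffer
{
    /** @var array<string, array<int, array>> rows waiting to be written, keyed by table */
    private array $pending = [];

    /** @var callable receives (string $table, array $rows) */
    private $insertMany;

    private int $batchSize;

    public function __construct(callable $insertMany, int $batchSize = 500)
    {
        $this->insertMany = $insertMany;
        $this->batchSize = $batchSize;
    }

    // Queue one row; write the table's batch as soon as it reaches the batch size.
    public function add(string $table, array $row): void
    {
        $this->pending[$table][] = $row;
        if (count($this->pending[$table]) >= $this->batchSize) {
            $this->flush($table);
        }
    }

    // Write out everything queued for one table, or for all tables when none is given.
    public function flush(?string $table = null): void
    {
        $tables = $table === null ? array_keys($this->pending) : [$table];
        foreach ($tables as $name) {
            if (!empty($this->pending[$name])) {
                ($this->insertMany)($name, $this->pending[$name]);
            }
            unset($this->pending[$name]);
        }
    }
}
```

In a seeder, the callback could be function ($table, $rows) { DB::table($table)->insert($rows); }, with pivot rows queued through add() instead of calling attach() one at a time, and flush() called at the end of the seed or right before ids from that table are needed.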