Drupal provides the QueueWorker and QueueWorkerManager classes, which help with managing a queue. One benefit of managing tasks in a queue is that it offers a way of handling tasks or jobs in an orderly way. Think First In, First Out.
The way the QueueWorker in Drupal works is tied in with cron.
Let's have a look at the run function in web/core/lib/Drupal/Core/Cron.php:
/**
 * {@inheritdoc}
 */
public function run() {
  // Allow execution to continue even if the request gets cancelled.
  @ignore_user_abort(TRUE);

  // Force the current user to anonymous to ensure consistent permissions on
  // cron runs.
  $this->accountSwitcher->switchTo(new AnonymousUserSession());

  // Try to allocate enough time to run all the hook_cron implementations.
  Environment::setTimeLimit(240);

  $return = FALSE;

  // Try to acquire cron lock.
  if (!$this->lock->acquire('cron', 900.0)) {
    // Cron is still running normally.
    $this->logger->warning('Attempting to re-run cron while it is already running.');
  }
  else {
    $this->invokeCronHandlers();

    // Process cron queues.
    $this->processQueues();

    $this->setCronLastTime();

    // Release cron lock.
    $this->lock->release('cron');

    // Return TRUE so other functions can check if it did run successfully
    $return = TRUE;
  }

  // Restore the user.
  $this->accountSwitcher->switchBack();

  return $return;
}
Here we can see the following:
- Once a cron run has started, it carries on even if the request that triggered it is cancelled. See https://www.php.net/manual/en/function.ignore-user-abort.php
- Cron is run as an anonymous user
- The time limit is set to 240 seconds (4 minutes)
- The return value is set to FALSE (default)
- The cron function then tries to acquire the cron lock and sets its timeout to 900 seconds (15 minutes)
- If the lock is not acquired, a warning is logged to that effect
- If the lock is acquired, we start by invoking all the modules that implement hook_cron
- Then the cron queues are processed. We can look at this in more detail later.
- Then the cron last run time is updated
- The cron lock is then released. More details on how the lock is handled can be found in web/core/lib/Drupal/Core/Lock/LockBackendInterface.php
- Once the lock has been released the function sets the $return value to TRUE
- Next the original user session is restored
- Finally the return value is returned
This shows us a few things:
- If the cron lock is in place (see the semaphore table), Cron::run() will return FALSE
- The cron lock times out after 15 minutes
- Drupal's queue workers are processed when running cron
- hook_cron implementations are executed before the queue worker tasks
If the tasks exceed 4 minutes, the chances are cron will not get as far as returning TRUE, because PHP may stop the request once the time limit is reached. If everything completes within that time, it will return TRUE.
In other words, if a task takes more than 4 minutes but less than 15 minutes, it may or may not run to completion. Once it takes longer than 4 minutes, there is no way of knowing whether it completed successfully.
If a task takes longer than 15 minutes, again we have no way of knowing whether it was successful.
There is also a chance that, after the cron lock clears, a subsequent cron run will interrupt the previous task and cause it to not complete successfully. An example of this might be a long-running migration task that has its own lock mechanism to ensure a migration cannot be started if one is already running. If the migration does not exit correctly, it will still be marked as running, so no subsequent migration will be able to start.
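To make the lock-expiry problem concrete, here is a minimal sketch of a lock with a timeout. ExpiringLock is a hypothetical in-memory class written for this illustration, and the clock is passed in by hand so the timings are easy to follow. It is not how Drupal's lock backends are implemented, only the same idea in miniature.

```php
<?php

declare(strict_types=1);

/**
 * Minimal in-memory lock with an expiry time, for illustration only.
 * Hypothetical class; Drupal's real lock backends are more involved.
 */
final class ExpiringLock {

  /** @var array<string, float> Lock name => expiry timestamp. */
  private array $locks = [];

  public function acquire(string $name, float $timeout, float $now): bool {
    // A lock that has not expired yet blocks every other caller.
    if (isset($this->locks[$name]) && $this->locks[$name] > $now) {
      return FALSE;
    }
    $this->locks[$name] = $now + $timeout;
    return TRUE;
  }

  public function release(string $name): void {
    unset($this->locks[$name]);
  }

}

$lock = new ExpiringLock();
$start = 0.0;

// The first cron run takes the lock for 900 seconds and then hangs.
$first = $lock->acquire('cron', 900.0, $start);
// A second run 5 minutes later is refused.
$during = $lock->acquire('cron', 900.0, $start + 300.0);
// A third run after 15 minutes gets the lock, although the first run never
// released it and may still be working.
$after = $lock->acquire('cron', 900.0, $start + 901.0);
```

The third call succeeding is the risky case: nothing in the lock itself tells the new run whether the old one finished.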
One benefit of using a queue is that it is designed to process one task at a time. A queue can be set to process as many jobs as it can within a given time limit. This is different to a cron task, which will likely try to process ALL of its work sequentially in one go. It means you can use a queue to throttle the demands on your website. If we don't have time to execute a task now, we can make sure we run it later.
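The throttling idea can be sketched in plain PHP. The function process_with_budget and the job names below are hypothetical and exist only for this example. It uses PHP's built-in SplQueue rather than Drupal's queue API, so treat it as an outline of the approach, not core's implementation.

```php
<?php

declare(strict_types=1);

/**
 * Processes items from a FIFO queue until it is empty or the time budget
 * is used up. Returns the items that were processed, in order.
 *
 * Hypothetical helper for illustration; not part of Drupal core.
 */
function process_with_budget(SplQueue $queue, callable $worker, float $budget_seconds): array {
  $deadline = microtime(TRUE) + $budget_seconds;
  $processed = [];
  // Check the clock before taking each item, so no item is started after
  // the budget has expired.
  while (!$queue->isEmpty() && microtime(TRUE) < $deadline) {
    $item = $queue->dequeue();
    $worker($item);
    $processed[] = $item;
  }
  return $processed;
}

$queue = new SplQueue();
foreach (['a', 'b', 'c', 'd'] as $job) {
  $queue->enqueue($job);
}

// Each job takes roughly 20 ms, so a 30 ms budget cannot fit all four.
$slow_worker = function (string $job): void {
  usleep(20000);
};

$first_run = process_with_budget($queue, $slow_worker, 0.03);
// Whatever was not reached is still queued and is picked up by the next run.
$second_run = process_with_budget($queue, $slow_worker, 1.0);
```

Jobs that do not fit into the first run are not lost. They stay in the queue in their original order and are handled by the following run.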
Here is the logic at the start of processQueues:
/**
 * Processes cron queues.
 */
protected function processQueues() {
  $max_wait = (float) $this->queueConfig['suspendMaximumWait'];

  // Build a stack of queues to work on.
  /** @var array<array{process_from: int<0, max>, queue: \Drupal\Core\Queue\QueueInterface, worker: \Drupal\Core\Queue\QueueWorkerInterface}> $queues */
  $queues = [];
  foreach ($this->queueManager->getDefinitions() as $queue_name => $queue_info) {
    if (!isset($queue_info['cron'])) {
      continue;
    }
    $queue = $this->queueFactory->get($queue_name);
    // Make sure every queue exists. There is no harm in trying to recreate
    // an existing queue.
    $queue->createQueue();
    $worker = $this->queueManager->createInstance($queue_name);
    $queues[] = [
      // Set process_from to zero so each queue is always processed
      // immediately for the first time. This process_from timestamp will
      // change if a queue throws a delayable SuspendQueueException.
      'process_from' => 0,
      'queue' => $queue,
      'worker' => $worker,
    ];
  }
OK, so here we can see that the maximum wait is configurable. By default it is set to 30 seconds. This is taken from default.services.yml:
queue.config:
  # The maximum number of seconds to wait if a queue is temporarily suspended.
  # This is not applicable when a queue is suspended but does not specify
  # how long to wait before attempting to resume.
  suspendMaximumWait: 30
If $queue_info['cron'] is not defined for a queue, that queue is skipped.
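As a rough illustration of that check, the sketch below runs the same isset() test over two made-up definitions. The queue names and the array layout are assumptions for this example. In a real site the definitions come from the queue worker plugin manager.

```php
<?php

declare(strict_types=1);

// Hypothetical queue definitions, shaped loosely like plugin definitions.
$definitions = [
  // Declares a cron key, so cron will pick it up.
  'my_module_emails' => ['title' => 'Emails', 'cron' => ['time' => 30]],
  // No cron key, so cron ignores it. It has to be run some other way.
  'my_module_manual' => ['title' => 'Manual only'],
];

$cron_queues = [];
foreach ($definitions as $queue_name => $queue_info) {
  // Same test as in processQueues(): skip queues without a cron key.
  if (!isset($queue_info['cron'])) {
    continue;
  }
  $cron_queues[] = $queue_name;
}
```

Only my_module_emails ends up in $cron_queues. A queue without the cron key keeps collecting items until something other than cron processes it.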
One thing of note is that modules like Migrate Queue Importer, which work with Drupal's queue mechanism, actually attempt to bypass this limitation by depending on hook_cron rather than on the cron queue processing. This is taken from migrate_queue_importer.module:
/**
 * Implements hook_cron().
 */
function migrate_queue_importer_cron() {
  // Some migrations import process might cause timeout, thus avoid executing
  // this from the CRON form.
Summary
While Drupal cron works well with shorter tasks, using it for longer running tasks can be a challenge. Queues in Drupal are also dependent on a timely and functioning cron system. Hopefully this article gives you an introduction to how it works in practice, and goes some way towards explaining any challenges you might be experiencing when implementing long-running tasks with cron in Drupal.
Comments
cron queue explanation- timely (pun intended)
Thanks for this explanation. I have a Drupal cron use case that constantly generates the message "Attempting to re-run cron while it is already running." It seems that several custom modules use cron via hook_cron to access external databases over the network, and the system log indicates that cron completion for the custom modules takes longer than 4 minutes in total: 144 seconds for one custom module, 95 seconds for another custom module, 42 seconds for a third custom module, etc. Another custom module relies on hook_cron for all its internal object housekeeping, so its logic demands that cron run reliably.
One approach to fixing the issue is to use "The Ultimate Cron" to experiment with different cron schedule timings. My concern is that at best "The Ultimate Cron" will mask the issues without repairing them, and at worst it will require considerable time to establish settings that work for the custom modules.
Thanks for a nice overview.
Timeouts, 4 minute and 15 minute. FALSE return.
Thanks again for the great post. Questions arise:
What happens when tasks exceed the four minute threshold? What happens when tasks exceed the 15 minute lock timeout?
How does a FALSE return value impact the Drupal system?
Thanks.