  1. Moodle
  2. MDL-82252

Failure of cron runner causes ad-hoc tasks to be stuck in "running" state


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • None
    • 4.3.5, 4.4.6, 4.5.2
    • Tasks
    • Any
    • MOODLE_403_STABLE, MOODLE_404_STABLE, MOODLE_405_STABLE
    • MDL-82252_404_STABLE
    • MDL-82252_405_STABLE
    • MDL-82252_main
    • Testing instructions:

      From the web UI:

          Navigate to "Site Administration -> Courses -> Backups -> Asynchronous backup/restore"
          Enable setting "Enable asynchronous backups"
          Click "Save changes"

          Navigate to "Site Administration -> Development -> Make a test course"
          Select Size of Course L
          Set short name and full name to "Test backup course"
          Click "Create course"
          Once progress bars are complete click "Continue" to be directed to the course

          Navigate to "More -> Course reuse -> Backup"
          Leave settings as default and click "Jump to final step"

          Navigate to "Site Administration -> Server -> Ad hoc tasks -> asynchronous_backup_task"
          Note the backupid in the payload column (for example: 808f4b81dc2e1e471435559f4d800262)

          Navigate to "Site Administration -> Server -> Task processing"
          Set "Ad hoc task runner lifetime" to 10 minutes
          Click "Save changes"

      From the root directory on the CLI:

          Run 

      php admin/cli/adhoc_task.php --classname="\core\task\asynchronous_backup_task"

          When the following output (replacing [BACKUP_ID] with the backupid from above) appears press CTRL + z:

      Processing asynchronous backup for backup: [BACKUP_ID]

          Note the output, for example:

      [3]+  Stopped php admin/cli/adhoc_task.php --classname="\core\task\asynchronous_backup_task"

          3 in this example is the job number.
          Using the job number above run (replacing [NUMBER] with the job number from the above output):

      kill -9 %[NUMBER]

          This kills the process, emulating a server crash or shutdown.

      From the web UI:

          Navigate to "Site Administration -> Server -> Ad hoc tasks -> asynchronous_backup_task"
          Confirm that our task has the value "Started" in the "Next run" column

          Wait 10 minutes.

      From the root directory on the CLI:

          Run 

      php admin/cli/adhoc_task.php --classname="\core\task\asynchronous_backup_task"

      Confirm that the adhoc task now re-processes

    • Code verified against automated checks.

      Checked MDL-82252 using repository: https://github.com/SimonThornett/moodle

          MOODLE_404_STABLE (0 errors / 0 warnings) [branch: MDL-82252_404_STABLE]
          MOODLE_405_STABLE (0 errors / 0 warnings) [branch: MDL-82252_405_STABLE]
          main (0 errors / 0 warnings) [branch: MDL-82252_main]

      Built on: Thu Mar 13 10:28:26 UTC 2025

      We recently observed a cron container being cycled while an adhoc task was running. This left the task stuck in the "running" state and blocked subsequent tasks of the same class.

      The task is considered "running" because its timestarted, pid, and hostname values remain set. Because the process is terminated outright, no exception is thrown for the try-catch statement in `run_inner_adhoc_task` to catch, so the task is never marked as failed. No further instances of the same task run, because the task manager believes one is already in progress, even though it has exceeded the max runtime setting.

      The proposed solution is to check currently running tasks against the `task_adhoc_max_runtime` setting within the `\core\task\manager::get_next_adhoc_task` function. If a task has exceeded this limit (plus a small margin of error), mtrace the failed adhoc task and call `\core\task\manager::adhoc_task_failed($task);`
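
      The proposed check comes down to a timestamp comparison. The following is a minimal shell sketch of that arithmetic only, not the Moodle implementation. The function name `task_is_stale`, the 60-second margin, and the sample timestamps are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of Moodle): succeeds (exit 0) when a task that
# started at $1 has been "running" longer than max runtime $2 plus margin $3,
# evaluated at time $4. All values are in seconds.
task_is_stale() {
    timestarted=$1
    max_runtime=$2
    margin=$3
    now=$4
    [ $(( now - timestarted )) -gt $(( max_runtime + margin )) ]
}

# Example: a 10-minute runner lifetime (600 s) with an assumed 60 s margin.
# The task started at t=1000 and it is now t=1700, so it has run for 700 s.
if task_is_stale 1000 600 60 1700; then
    echo "stale: mark the task as failed so it can be re-queued"
else
    echo "still within the allowed runtime"
fi
```

      With these sample values the elapsed time (700 s) exceeds the limit plus margin (660 s), so the task would be treated as stale. The margin avoids failing a task that is legitimately finishing right at the limit.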

       

      Replication steps:

       

      From the web UI:

          Navigate to "Site Administration -> Courses -> Backups -> Asynchronous backup/restore"
          Enable setting "Enable asynchronous backups"
          Click "Save changes"

          Navigate to "Site Administration -> Development -> Make a test course"
          Select Size of Course L
          Set short name and full name to "Test backup course"
          Click "Create course"
          Once progress bars are complete click "Continue" to be directed to the course

          Navigate to "More -> Course reuse -> Backup"
          Leave settings as default and click "Jump to final step"

          Navigate to "Site Administration -> Server -> Ad hoc tasks -> asynchronous_backup_task"
          Note the backupid in the payload column (for example: 808f4b81dc2e1e471435559f4d800262)

          Navigate to "Site Administration -> Server -> Task processing"
          Set "Ad hoc task runner lifetime" to 10 minutes
          Click "Save changes"

      From the root directory on the CLI:

          Run 

      php admin/cli/adhoc_task.php --classname="\core\task\asynchronous_backup_task"

          When the following output (replacing [BACKUP_ID] with the backupid from above) appears press CTRL + z:

      Processing asynchronous backup for backup: [BACKUP_ID]

          Note the output, for example:

      [3]+  Stopped php admin/cli/adhoc_task.php --classname="\core\task\asynchronous_backup_task"

          3 in this example is the job number.
          Using the job number above run (replacing [NUMBER] with the job number from the above output):

      kill -9 %[NUMBER]

          This kills the process, emulating a server crash or shutdown.

      From the web UI:

          Navigate to "Site Administration -> Server -> Ad hoc tasks -> asynchronous_backup_task"
          Confirm that our task has the value "Started" in the "Next run" column

          Wait 10 minutes.

      From the root directory on the CLI:

          Run 

      php admin/cli/adhoc_task.php --classname="\core\task\asynchronous_backup_task"

      Expected: The adhoc task that was previously stopped has now exceeded the runner lifetime setting, so it should be reset to allow re-processing.

      Actual: The adhoc task remains in the "Started" status and, if the "Ad hoc task concurrency limit" setting is low enough, blocks subsequent tasks of the same class from running.
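
      The stop-and-kill steps in the replication procedure can be rehearsed without Moodle. This sketch uses `sleep` as a stand-in for the task runner (an assumption for illustration) and shows that a process ended with SIGKILL is terminated by the signal rather than exiting on its own. The process gets no opportunity to run error handling, which is consistent with the task record never being reset.

```shell
#!/bin/sh
# Stand-in for the ad hoc task runner; "sleep" is only a placeholder process.
sleep 30 &
pid=$!

kill -s STOP "$pid"     # same effect as pressing CTRL + z on a foreground job
kill -9 "$pid"          # SIGKILL cannot be caught, so no handler can run

# The shell reports 128 + signal number for a process ended by a signal.
wait "$pid"
status=$?
echo "runner exit status: $status"
```

      An exit status of 137 (128 + 9) confirms the process was ended by SIGKILL instead of returning normally.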

            Assignee: Unassigned
            Reporter: Simon Thornett (SimonThornett)
            Votes: 0
            Watchers: 3
