Uploaded image for project: 'Moodle'
  1. Moodle
  2. MDL-65722

Change postgres lock implementation to not depend on mdl_lock_db

XMLWordPrintable

    • MOODLE_37_STABLE, MOODLE_38_STABLE, MOODLE_39_STABLE
    • MOODLE_39_STABLE
    • MDL-65722-no-lock-db
    • Hide

      Mostly covered by existing unit tests. To show that the system is using less db calls overall lets load test the adhoc task queue before and after the patch:

      1) Install this plugin to make testing easier:

      https://github.com/catalyst/moodle-tool_testtasks

      2) To expose the number of housekeeping db calls temp patch this code:

      +++ b/admin/tool/task/cli/adhoc_task.php
      @@ -108,10 +108,13 @@ mtrace("Server Time: {$humantimenow}\n");
       
       // Run all adhoc tasks.
       $taskcount = 0;
      +$predbqueries = $DB->perf_get_queries();
       while (!\core\task\manager::static_caches_cleared_since($timenow) &&
               $task = \core\task\manager::get_next_adhoc_task($timenow)) {
           cron_run_inner_adhoc_task($task);
           $taskcount++;
           unset($task);
      +    mtrace("... used " . ($DB->perf_get_queries() - $predbqueries) . " overhead dbqueries");
      +    $predbqueries = $DB->perf_get_queries();
       }
       mtrace("Ran {$taskcount} adhoc tasks found at {$humantimenow}");
      

      3) Test a single ad hoc task in the queue:

      $ php admin/tool/testtasks/cli/queue_adhoc_tasks.php -n=1 -d=1
      $ php admin/tool/task/cli/adhoc_task.php --execute | grep overhead
      </pre>... used 12 overhead dbqueries
      

      after the patch:

      $ php admin/tool/task/cli/adhoc_task.php --execute | grep overhead
      </pre>... used 9 overhead dbqueries
      
      

      At scale this gets tons worse / better. Queue up 1000 tasks:

      $ php admin/tool/testtasks/cli/queue_adhoc_tasks.php -n=1000 -d=1
      

      Now run this ad hoc task processors 5 times:

      $ php admin/tool/task/cli/adhoc_task.php --execute | grep overhead &
      

      and you should see these before the patch:

      ... used 19 overhead dbqueries
      

      and after the patch:

      ... used 13 overhead dbqueries 
      

      Show
      Mostly covered by existing unit tests. To show that the system is using less db calls overall lets load test the adhoc task queue before and after the patch: 1) Install this plugin to make testing easier: https://github.com/catalyst/moodle-tool_testtasks 2) To expose the number of housekeeping db calls temp patch this code: +++ b/admin/tool/task/cli/adhoc_task.php @@ - 108 , 10 + 108 , 13 @@ mtrace( "Server Time: {$humantimenow}\n" ); // Run all adhoc tasks. $taskcount = 0 ; +$predbqueries = $DB->perf_get_queries(); while (!\core\task\manager::static_caches_cleared_since($timenow) && $task = \core\task\manager::get_next_adhoc_task($timenow)) { cron_run_inner_adhoc_task($task); $taskcount++; unset($task); + mtrace( "... used " . ($DB->perf_get_queries() - $predbqueries) . " overhead dbqueries" ); + $predbqueries = $DB->perf_get_queries(); } mtrace( "Ran {$taskcount} adhoc tasks found at {$humantimenow}" ); 3) Test a single ad hoc task in the queue: $ php admin/tool/testtasks/cli/queue_adhoc_tasks.php -n= 1 -d= 1 $ php admin/tool/task/cli/adhoc_task.php --execute | grep overhead </pre>... used 12 overhead dbqueries after the patch: $ php admin/tool/task/cli/adhoc_task.php --execute | grep overhead </pre>... used 9 overhead dbqueries At scale this gets tons worse / better. Queue up 1000 tasks: $ php admin/tool/testtasks/cli/queue_adhoc_tasks.php -n= 1000 -d= 1 Now run this ad hoc task processors 5 times: $ php admin/tool/task/cli/adhoc_task.php --execute | grep overhead & and you should see these before the patch: ... used 19 overhead dbqueries and after the patch: ... used 13 overhead dbqueries

      The postgres advisory locks works with an integer, but the moodle lock API works with a string. So the postgres impl re-uses the mdl_lock_db table to manage this mapping:

      https://github.com/moodle/moodle/blob/master/lib/classes/lock/postgres_lock_factory.php#L144-L156

      This impl made some sense back when the number of resource keys was discrete, and essentially limited to just the names of scheduled tasks.

      Now I don't think this makes any sense, there are lots of ad hocs tasks and other clients of the Lock API which are single use (eg objectfs leverages this heaps). In our largest client we have 10's of million rows in mdl_lock_db and it is all junk.

      So I'm proposing to eliminate the db lookup and polling and convert this to a deterministic fast hash, eg something like string -> md5 / sha256 -> truncate to N chars -> convert to integer that fits inside a postgres bigint

      https://www.postgresql.org/docs/current/datatype-numeric.html

      bigint 8 bytes large-range integer -9223372036854775808 to 9223372036854775807

      And an upgrade task to truncate that table as needed

       

            brendanheywood Brendan Heywood
            brendanheywood Brendan Heywood
            Matt Porritt Matt Porritt
            Eloy Lafuente (stronk7) Eloy Lafuente (stronk7)
            Janelle Barcega Janelle Barcega
            Votes:
            4 Vote for this issue
            Watchers:
            9 Start watching this issue

              Created:
              Updated:
              Resolved:

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0 minutes
                0m
                Logged:
                Time Spent - 6 hours, 20 minutes
                6h 20m

                  Error rendering 'clockify-timesheets-time-tracking-reports:timer-sidebar'. Please contact your Jira administrators.