Uploaded image for project: 'Moodle'
  1. Moodle
  2. MDL-37577

Deleting users and very slow quiz_get_attempt_usertime_sql

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Fixed
    • Icon: Critical Critical
    • 2.3.5, 2.4.2
    • 2.3.4, 2.4, 2.5
    • Quiz
    • MOODLE_23_STABLE, MOODLE_24_STABLE, MOODLE_25_STABLE
    • MOODLE_23_STABLE, MOODLE_24_STABLE
    • Hide

      -1. Run unit tests on all DBS [Dan is volunteering]

      0. You need a course with at least one quiz Q and one group G1.

      1. Add student S1 to group G1, then start a quiz attempt as S1. Also start a quiz attempt as student S2 who is not currently in group G1.

      2. Add an time-limit or close data override for Group G1 in Quiz Q. Verify that there are no errors.

      3. Add student S2 to G1, and remove student S1. Verify that there are no errors.

      4. Run cron, verify that the quiz overdue attempts processing cron runs, and that there are no errors.

      5. Edit the quiz settings and change the time-limit and/or close date. Verify that there are no errors when you save the form.

      6. Delete group G1. Verify that there are no errors.

      Show
      -1. Run unit tests on all DBS [Dan is volunteering] 0. You need a course with at least one quiz Q and one group G1. 1. Add student S1 to group G1, then start a quiz attempt as S1. Also start a quiz attempt as student S2 who is not currently in group G1. 2. Add an time-limit or close data override for Group G1 in Quiz Q. Verify that there are no errors. 3. Add student S2 to G1, and remove student S1. Verify that there are no errors. 4. Run cron, verify that the quiz overdue attempts processing cron runs, and that there are no errors. 5. Edit the quiz settings and change the time-limit and/or close date. Verify that there are no errors when you save the form. 6. Delete group G1. Verify that there are no errors.

      We have a large database and are attempting to delete users from it. That coupled with a large number if quiz attempts results in being only able to delete about 8 users per hour.
      What we are seeing is multiple database calls involving the sql query in quiz_get_attempt_usertime_sql(). These take approximately 30 seconds each. See below for further details.
      It appears that open quizes are recalculated for each course when a user is deleted. This means can get a large number of 30 seconds blocks for deleting of a single user.

      Having read all the related materials, this is not going to be a fun fix and producing the current situation was complex enough. We are running PostgreSQL so all information
      here is developed in that context and not tested against other databases.

      There are a couple of options to improve this;
      1. Update the delete users to not continually call quiz_get_attempt_usertime_sql(). It's not known if this is possible and wasn't the focus of investigation
      2. Improve the SQL query to run faster in large environments.

      Attempts have been made to improve the SQL performance, but haven't been tested against other databases.

      Tables sizes

      mdl_groups_members: 294686
      mdl_quiz_overrides: 558
      mdl_quiz: 14336
      mdl_quiz_attempts: 836894

      Commit introducing this SQL: 8e771aed93eea08cc3e9410283f5354e02311281

      Related Tracker Items:
      https://tracker.moodle.org/browse/MDL-35717
      https://tracker.moodle.org/browse/MDL-36842

      See attached: Initial Plan.txt

      With this plan we end up producing about 9 million rows before completing an aggregate which is nearly 500MB of data to process. This is slow and completed on disk.

      In an attempt to limit the number of rows produced at any nodes, the results need to be summarized where possible to ensure the smallest set of rows bubble up.
      The groups_members joins to mdl_quiz_overrides creates a large number of output rows to build a set of group overrides for some quizes. By only outputting the one
      set of results for each quiz and user in the system with a group override, the result set at this point is much smaller. This is then joined back into the original
      main query and re-aggregates the user overrides and default quiz limits. The results are a much smaller set of rows bubbling up at the expense of a potentially greater subset
      of rows produced at the qogroup level.

      In our large environment the changes resulted in a 10x speed increase in the results. The executed per course for user deletes make them complete substantially faster.

      This is the adjusted query that operates on PostgreSQL

      SELECT iquiza.id,
      COALESCE(MAX(quo.timeclose), MAX(qogroup.timeclose), iquiz.timeclose) AS usertimeclose,
      COALESCE(MAX(quo.timelimit), MAX(qogroup.timelimit), iquiz.timelimit) AS usertimelimit
      FROM mdl_quiz_attempts iquiza
      JOIN mdl_quiz iquiz ON iquiz.id = iquiza.quiz

      – Process user quiz overrides
      LEFT JOIN mdl_quiz_overrides quo ON quo.quiz = iquiza.quiz AND quo.userid = iquiza.userid
      LEFT JOIN
      – Building the aggregate of overrides considerably constrains the number of rows processed at the top aggregate.
      – In our environment it's a factor of 11 small at the top aggregate processing
      (SELECT gm.userid, qgo1.quiz,
      COALESCE(MAX(qgo1.timeclose), MAX(qgo2.timeclose)) AS timeclose,
      COALESCE(MAX(qgo3.timelimit), MAX(qgo4.timelimit)) AS timelimit
      FROM mdl_groups_members gm
      LEFT JOIN mdl_quiz_overrides qgo1 ON qgo1.groupid = gm.groupid AND qgo1.timeclose = 0
      LEFT JOIN mdl_quiz_overrides qgo2 ON qgo2.groupid = gm.groupid AND qgo2.timeclose > 0
      LEFT JOIN mdl_quiz_overrides qgo3 ON qgo3.groupid = gm.groupid AND qgo3.timelimit = 0
      LEFT JOIN mdl_quiz_overrides qgo4 ON qgo4.groupid = gm.groupid AND qgo4.timelimit > 0
      GROUP BY gm.userid, qgo1.quiz
      ) qogroup
      ON qogroup.quiz = iquiza.quiz AND qogroup.userid = iquiza.userid
      GROUP BY iquiza.id, iquiz.id, iquiz.timeclose, iquiz.timelimit;

      For the resulting Query plan for our dataset see Updated Plan.txt

      Please review and confirm these queries are equivilent and whether my approach to this problem is appropriate.
      It will also need to be confirmed this format of subquery works on all databases. I expect it does as the query itself is a subquery itself and I've just created another nested level.

            timhunt Tim Hunt
            mr-russ Russell Smith (Inactive)
            Andrew Lyons Andrew Lyons
            Eloy Lafuente (stronk7) Eloy Lafuente (stronk7)
            Andrew Davis Andrew Davis
            Votes:
            1 Vote for this issue
            Watchers:
            9 Start watching this issue

              Created:
              Updated:
              Resolved:

                Error rendering 'clockify-timesheets-time-tracking-reports:timer-sidebar'. Please contact your Jira administrators.