Uploaded image for project: 'Moodle'
  1. Moodle
  2. MDL-83708

Manage question categories page slow to load due to a complex query

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Not a bug
    • Icon: Minor Minor
    • None
    • 4.4.5
    • Questions
    • MOODLE_404_STABLE

      The code in question technically works, but the performance exponentially deteriorates with an increasing number of question versions, so I chose to mark this as a bug.

      The endpoint question/bank/managecategories/category.php?courseid=X is supposed to display all question categories in the question bank associated with course X. We noticed an incredibly long page load time (~1 minute) for a large course on our Moodle instance. I did some profiling and found the culprit. When the qbank_managecategories\question_category_object is constructed, the qbank_managecategories\question_category_list::get_records method is called, which in turn calls the qbank_managecategories\helper::get_categories_for_contexts method. This method performs a single big database query and that query is responsible for 99% of the entire load time.

      I manually constructed the corresponding SQL statement to run EXPLAIN ANALYZE directly in Postgres to try and figure out what was wrong. Here is the query:

      SELECT c.*,
             (SELECT COUNT(1)
                FROM mdl2_question q
                JOIN mdl2_question_versions qv ON qv.questionid = q.id
                JOIN mdl2_question_bank_entries qbe ON qbe.id = qv.questionbankentryid
               WHERE q.parent = '0'
                     AND (qv.status = 'ready' OR qv.status = 'draft')
                     AND c.id = qbe.questioncategoryid
                     AND (0 = 1  -- $showallversions = 0
                          OR (qv.version = (SELECT MAX(v.version)
                                              FROM mdl2_question_versions v
                                              JOIN mdl2_question_bank_entries be ON be.id = v.questionbankentryid
                                             WHERE be.id = qbe.id)
                          )
                    )
            ) AS questioncount
      FROM mdl2_question_categories c
      WHERE c.contextid IN (3300476) -- $topwhere = ''
      ORDER BY parent, sortorder, name
      

      Prepending EXPLAIN ANALYZE yields the following results:

      QUERY PLAN
      Sort  (cost=633315.36..633315.39 rows=11 width=139) (actual time=87613.673..87613.746 rows=143 loops=1)
        Sort Key: c.parent, c.sortorder, c.name
        Sort Method: quicksort  Memory: 45kB
        ->  Index Scan using mdl2_quescate_con_ix on mdl2_question_categories c  (cost=0.42..633315.17 rows=11 width=139) (actual time=213.334..87613.350 rows=143 loops=1)
              Index Cond: (contextid = 3300476)
              SubPlan 2
                ->  Aggregate  (cost=57571.64..57571.65 rows=1 width=8) (actual time=611.173..611.174 rows=1 loops=143)
                      ->  Nested Loop  (cost=1.30..57571.64 rows=1 width=0) (actual time=459.046..611.169 rows=6 loops=143)
                            ->  Nested Loop  (cost=0.86..57568.60 rows=5 width=8) (actual time=1.067..495.207 rows=36420 loops=143)
                                  ->  Index Scan using mdl2_quesbankentr_que_ix on mdl2_question_bank_entries qbe  (cost=0.43..2191.63 rows=930 width=8) (actual time=0.081..21.137 rows=36420 loops=143)
                                        Index Cond: (questioncategoryid = c.id)
                                  ->  Index Scan using mdl2_quesvers_que_ix on mdl2_question_versions qv  (cost=0.43..59.54 rows=1 width=24) (actual time=0.011..0.011 rows=1 loops=5208028)
                                        Index Cond: (questionbankentryid = qbe.id)
                                        Filter: ((((status)::text = 'ready'::text) OR ((status)::text = 'draft'::text)) AND ((SubPlan 1) = version))
                                        Rows Removed by Filter: 0
                                        SubPlan 1
                                          ->  Aggregate  (cost=17.02..17.03 rows=1 width=8) (actual time=0.008..0.008 rows=1 loops=5214244)
                                                ->  Nested Loop  (cost=0.86..17.01 rows=3 width=8) (actual time=0.004..0.006 rows=1 loops=5214244)
                                                      ->  Index Only Scan using mdl2_quesbankentr_id_pk on mdl2_question_bank_entries be  (cost=0.43..8.45 rows=1 width=8) (actual time=0.001..0.002 rows=1 loops=5214244)
                                                            Index Cond: (id = qbe.id)
                                                            Heap Fetches: 0
                                                      ->  Index Scan using mdl2_quesvers_que_ix on mdl2_question_versions v  (cost=0.43..8.53 rows=3 width=16) (actual time=0.001..0.002 rows=1 loops=5214244)
                                                            Index Cond: (questionbankentryid = qbe.id)
                            ->  Index Scan using mdl2_ques_id_pk on mdl2_question q  (cost=0.43..0.61 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=5208019)
                                  Index Cond: (id = qv.questionid)
                                  Filter: (parent = '0'::bigint)
                                  Rows Removed by Filter: 1
      Planning Time: 22.131 ms
      JIT:
        Functions: 34
        Options: Inlining true, Optimization true, Expressions true, Deforming true
        Timing: Generation 2.535 ms, Inlining 67.840 ms, Optimization 61.909 ms, Emission 81.007 ms, Total 213.291 ms
      Execution Time: 87850.144 ms
      

      As you can see, due to the double-nesting of sub-queries, the innermost loops over 5.2 million times in this particular case resulting in ~90 seconds of execution time. As far as I can tell, this is not an index issue since only index scans are performed here.

      So far I have not tried to figure out, how the query can be modified to improve performance, but I thought I should post the issue anyway. Maybe I'll come up with a suggestion, but for now we will be working on a script to delete as many unused/old question versions in a specified context as possible, in the hopes that this will significantly reduce the execution time for this query.

      PS:

      Not technically related to the problem itself, but the fact that the get_records_sql call in that get_categories_for_contexts method uses no parameters, but instead just inserts unsanitized function arguments directly into the SQL is a huge security risk.

      Oh, and it seems the function is almost an exact copy of this one. Not sure, which came first, but both suffer from the same problems.

      PPS (Solution):
       
      After bit of digging we figured out that the vast majority of questions in our database (literally millions or ~ 99%) were legacy random type questions that are no longer used anywhere in Moodle. (See MDL-61267) We are not sure, how exactly we ended up with this staggering number, but this allowed us to solve the performance problem regardless.

      Using the DELETE queries suggested by Tim Hunt in this forum post we simply purged all of those questions. This obviously had a huge impact on the performance of the query mentioned above and all similar queries, thus speeding up the responsiveness of most question bank operations/page load times substantially.

      Since it seems that this was an XY Problem, I am closing this issue.

            Unassigned Unassigned
            daniil-berg Daniil Fajnberg
            Votes:
            3 Vote for this issue
            Watchers:
            5 Start watching this issue

              Created:
              Updated:
              Resolved:

                Error rendering 'clockify-timesheets-time-tracking-reports:timer-sidebar'. Please contact your Jira administrators.