Since MDL-59694, not all the site analysables are analysed the first time the training scheduled task runs. We flag a model as trained the first time it is trained, but on a big site it can take a few scheduled task runs until the site is completely analysed. If the model is flagged as trained straight after the first scheduled task run, it will start generating predictions immediately. Theoretically, those predictions will be less accurate because the machine learning backends will base them on very little data. We should define a % of the site analysables that must be analysed before we flag a model as trained.
From what I remember, we could do that easily at the end of the training process, where we currently flag the model as trained. There we can check the analytics_used_analysables table, see the % of analysables that have been analysed, and only flag the model as trained if we reach that %.
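A rough sketch of the check, just to illustrate the idea. It is written in Python rather than against the real analytics API, and every name in it (`analysed_percentage`, `should_flag_as_trained`, `required_percent`) and the 80% default are assumptions made for illustration. In practice the two counts would come from the analytics_used_analysables table and from the total number of analysables the model's analyser returns.

```python
def analysed_percentage(analysed: int, total: int) -> float:
    """Return the share of analysables already analysed, as a percentage."""
    if total <= 0:
        # Nothing to analyse yet, so treat the site as 0% analysed.
        return 0.0
    # Cap at total in case stale rows make the analysed count too high.
    return 100.0 * min(analysed, total) / total


def should_flag_as_trained(analysed: int, total: int,
                           required_percent: float = 80.0) -> bool:
    """Only flag the model as trained once enough analysables are covered.

    required_percent is a hypothetical setting; the 80% default is arbitrary.
    """
    return analysed_percentage(analysed, total) >= required_percent


if __name__ == "__main__":
    # Hypothetical counts: 150 of 1000 analysables after the first task run.
    print(should_flag_as_trained(150, 1000))
    # A few runs later, 900 of 1000 have been analysed.
    print(should_flag_as_trained(900, 1000))
```

With a check like this, the model would keep training on each scheduled task run but would only start generating predictions once the threshold is reached.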
- is child of MDL-62166 Project Inspire Phase II proposal (Closed)