This design pattern helps detect when a deployed model is no longer fit for purpose and take action accordingly.
## Reasons for model degradation
- Concept drift
- Data drift
| Concept drift | Data drift |
|---|---|
| The relationship between the model inputs and the target has changed. | Any change in the data being fed to the model for prediction, compared with the data that was used for training. |
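As an illustration, a common way to check for data drift is to compare the distribution of incoming prediction inputs against the training data. Below is a minimal sketch using a two-sample Kolmogorov-Smirnov test; the feature values and the p-value threshold are assumptions chosen only for illustration.

```python
# Minimal data drift check: compare a numeric feature's distribution at
# serving time against its training distribution (illustrative sketch).
import numpy as np
from scipy.stats import ks_2samp

def detect_data_drift(train_values, serving_values, p_threshold=0.01):
    """Return True if the serving distribution differs significantly
    from the training distribution (two-sample KS test)."""
    statistic, p_value = ks_2samp(train_values, serving_values)
    return p_value < p_threshold

# Hypothetical feature values: serving data is shifted relative to training
train_feature = np.random.normal(loc=0.0, scale=1.0, size=5_000)
serving_feature = np.random.normal(loc=0.4, scale=1.2, size=1_000)

if detect_data_drift(train_feature, serving_feature):
    print("Data drift detected: serving inputs no longer match training data.")
```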
## Identifying model deterioration
- Continuous monitoring of the model's predictive performance over time
- Assess this performance with the same evaluation metrics used during development
Continuous model evaluation provides a framework to evaluate a deployed model's performance exclusively on new data, so staleness can be detected as early as possible. This information helps us do either of the following:
- Retrain the model
- Replace the existing model with a new version entirely.
This is done by capturing the following (a minimal sketch follows this list):
- ground truth labels
- prediction inputs and outputs, for comparison with the ground truth values
- model versions
- timestamps of prediction requests
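A minimal sketch of what such a capture-and-evaluate loop might look like is shown below. The record structure, in-memory storage, and accuracy metric are assumptions for illustration, not a prescribed implementation.

```python
# Illustrative sketch: capture prediction records, then evaluate the model
# once ground-truth labels arrive (all names here are hypothetical).
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional
from sklearn.metrics import accuracy_score

@dataclass
class PredictionRecord:
    model_version: str
    inputs: dict
    prediction: int
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    ground_truth: Optional[int] = None  # filled in later, once the label is known

records: list[PredictionRecord] = []

def log_prediction(model_version, inputs, prediction):
    """Capture inputs, output, model version, and timestamp for each request."""
    records.append(PredictionRecord(model_version, inputs, prediction))

def evaluate_recent(records, model_version):
    """Compute the same metric used during development (accuracy here)
    on records that have received ground-truth labels."""
    labeled = [r for r in records
               if r.model_version == model_version and r.ground_truth is not None]
    if not labeled:
        return None
    y_true = [r.ground_truth for r in labeled]
    y_pred = [r.prediction for r in labeled]
    return accuracy_score(y_true, y_pred)
```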
## Triggers for retraining
Whether to retrain based on an evaluation report depends on how much performance deterioration is acceptable relative to the cost of retraining.
Setting a higher performance threshold ensures a higher-quality model in production, but it requires more frequent, and therefore more costly, retraining jobs.
A lower threshold, on the other hand, is more cost-effective but increases the chance of serving a stale model in production.
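In practice, such a trigger can be as simple as comparing the latest evaluation metric against a chosen threshold. The threshold and metric values below are assumptions used only to illustrate the idea.

```python
# Illustrative retraining trigger: retrain only when the evaluated metric
# drops below an acceptable threshold (values here are hypothetical).
PERFORMANCE_THRESHOLD = 0.90  # acceptable accuracy before retraining

def should_retrain(current_metric: float, threshold: float = PERFORMANCE_THRESHOLD) -> bool:
    """Return True when performance has deteriorated past the point judged
    acceptable relative to the cost of retraining."""
    return current_metric < threshold

# Hypothetical result from the latest continuous evaluation run
latest_accuracy = 0.87
if should_retrain(latest_accuracy):
    print("Performance below threshold: trigger the retraining pipeline.")
```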
## Scheduled retraining
| Continuous evaluation | Scheduled retraining |
|---|---|
| May happen every day | May occur only every week or every month |
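The two approaches can also be combined: retrain on a fixed schedule, or earlier if continuous evaluation detects deterioration. The interval and threshold below are assumptions for illustration.

```python
# Illustrative sketch combining both triggers: retrain when the schedule is
# overdue OR when continuous evaluation reports degraded performance.
from datetime import datetime, timedelta, timezone

RETRAIN_INTERVAL = timedelta(days=7)   # hypothetical weekly schedule
PERFORMANCE_THRESHOLD = 0.90           # hypothetical acceptable accuracy

def needs_retraining(last_trained_at, current_metric):
    overdue = datetime.now(timezone.utc) - last_trained_at >= RETRAIN_INTERVAL
    degraded = current_metric is not None and current_metric < PERFORMANCE_THRESHOLD
    return overdue or degraded

last_trained_at = datetime.now(timezone.utc) - timedelta(days=10)
print(needs_retraining(last_trained_at, current_metric=0.93))  # True: schedule is overdue
```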