This is why I find a disconnect between an ML team and a product team so broken. Same for SE, to be fair.
Accuracy rates, F1, anything, they're all just rough guides. The company cares about making money and some errors are much bigger than others.
We'd manually review changes for updates to our algos and models. Even with a golden set, breaking one case to fix five could be awesome or terrible.
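To make the "break one to fix five" point concrete, here's a minimal sketch, assuming every golden-set case carries a hand-assigned business cost for getting it wrong. `GoldenCase`, `error_cost` and `compare_models` are made-up names for illustration, and the costs are invented numbers, not anything from a real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoldenCase:
    case_id: str
    expected: str
    error_cost: float  # hypothetical business cost if this case is wrong


def compare_models(cases, old_preds, new_preds):
    """Summarise which golden-set cases a model update fixed or broke."""
    fixed, broken = [], []
    for case in cases:
        old_ok = old_preds[case.case_id] == case.expected
        new_ok = new_preds[case.case_id] == case.expected
        if new_ok and not old_ok:
            fixed.append(case)
        elif old_ok and not new_ok:
            broken.append(case)
    return {
        "fixed": [c.case_id for c in fixed],
        "broken": [c.case_id for c in broken],
        # Positive means the update is "more accurate" on the golden set.
        "accuracy_delta": (len(fixed) - len(broken)) / len(cases),
        # Positive means the update made things worse in business terms.
        "cost_delta": sum(c.error_cost for c in broken)
        - sum(c.error_cost for c in fixed),
    }


# Five cheap cases (dog breeds) plus one case you really cannot get wrong.
cases = [GoldenCase(f"dog_{i}", "husky", 1.0) for i in range(5)]
cases.append(GoldenCase("sensitive_0", "label_a", 1000.0))

# Old model: wrong on the dogs, right on the sensitive case.
old_preds = {f"dog_{i}": "malamute" for i in range(5)}
old_preds["sensitive_0"] = "label_a"

# New model: fixes all five dogs, breaks the sensitive case.
new_preds = {f"dog_{i}": "husky" for i in range(5)}
new_preds["sensitive_0"] = "label_b"

report = compare_models(cases, old_preds, new_preds)
print(report)
```

Here accuracy goes up (five fixed, one broken) while the cost delta is strongly positive, which is exactly the case a headline metric hides. The list of broken cases is what a human still needs to look at. The costs themselves have to come from the product side.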
I've given talks about this. My classic example is this somewhat imagined scenario (because it's unfair of me to accuse people of not checking at all):
It's 2015. You get an update to your classification model. Accuracy rates go up on a classic dataset, hooray! Let's deploy.
Your boss's boss's boss gets a call at 2am because you're in the news:
https://www.bbc.co.uk/news/technology-33347866
Ah. Turns out the classification of types of dogs improved, but... that wasn't as important as this.
Issues and errors must be understood in the context of the business. If your ML team is chucking models over the fence, at best you're going to move slowly. At worst you're leaving yourself open to this kind of problem.