• liv@lemmy.nz · 3 days ago

    When we look at passing scores, is there any way to quantitatively grade them for magnitude?

    Not all bad advice is created equal.

    • jarfil@beehaw.org · 15 hours ago

      The grading is a mess: it mixes qualitative measures, quantitative ones… and statistical corrections “to make it fair”.

      Anyway, there is a ~30% margin on the passing scores, so chances are that 9% is better than the worst doctor who still “passed”.

      • liv@lemmy.nz · 8 hours ago

        I’d hope the bar for medical advice is higher than “better than the worst doctor”.

        It will be interesting to see where liability lies with this one. In the example given, following the advice could permanently harm patients.

        Given that the advice is proven wrong and goes against official medical guidance for doctors, it could potentially be material for a class-action lawsuit.

        • jarfil@beehaw.org · 2 hours ago

          It’s like in the joke: “What do you call someone who barely finished medical school?.. Doctor.”

          Every doctor is allowed to provide medical advice, even those who would do better to keep quiet. As for liability: after a botched operation, a friend of mine confronted her doctor and was told, “Sue me, that’s what my insurance is for”.

          I’d like to see the actual final assessment of an AI on these tests, but if it’s just “9% vs 15% error rate”, I’d take it.

          My guess would be that the AI might not be great at every kind of assessment, but a panel of specialized AIs, much like the multiple specialists who cooperate on cases today, sounds like a reasonable idea. Having a transcript of such a meeting analyzed by a GP could be even better.