Why One Metric Is Never Enough to Evaluate Generative AI

A QA‑focused breakdown of ROUGE, BLEU, BERTScore, and why evaluation needs humans

Apr 30, 2026 · 3 min read