Teaching language models to support answers with verified quotes

To do: learn about SentencePiece. They claim it's ~4x more token-efficient than raw byte tokenization.
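A rough way to sanity-check a claim like this is to count UTF-8 bytes per token for a given tokenizer (raw byte tokenization scores exactly 1.0). The sketch below is only an illustration: `token_efficiency` and `whitespace_encode` are hypothetical names, and the whitespace splitter is a stand-in for a real subword tokenizer, not SentencePiece itself.

```python
from typing import Callable, List, Sequence


def byte_tokens(text: str) -> List[int]:
    """Raw byte tokenization: one token per UTF-8 byte."""
    return list(text.encode("utf-8"))


def token_efficiency(text: str, encode: Callable[[str], Sequence]) -> float:
    """Bytes per token: how many raw-byte tokens each tokenizer token replaces."""
    n_tokens = len(encode(text))
    if n_tokens == 0:
        raise ValueError("tokenizer produced no tokens")
    return len(byte_tokens(text)) / n_tokens


def whitespace_encode(text: str) -> List[str]:
    """Stand-in tokenizer (assumption): split on whitespace."""
    return text.split()


sample = "language models support answers with quotes"
ratio = token_efficiency(sample, whitespace_encode)
baseline = token_efficiency(sample, byte_tokens)  # 1.0 by construction
```

With the `sentencepiece` package and a trained model file, the `encode` method of a `SentencePieceProcessor` could be passed in place of `whitespace_encode`; a ratio near 4 on representative text would match the claim.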

They saw ~85% agreement between researchers and some raters, and promoted the raters who agreed most to “super-raters”.