Learn about SentencePiece - they claim it's ~4x more token-efficient than raw byte-level tokenization (i.e. roughly 4x fewer tokens for the same text)
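A rough way to sanity-check a claim like "~4x fewer tokens than bytes" is to measure UTF-8 bytes per token on sample text. The sketch below is only an illustration: `bytes_per_token` is a hypothetical helper, and the whitespace split is a stand-in tokenizer, not SentencePiece itself.

```python
def bytes_per_token(text, tokenize):
    """Average number of UTF-8 bytes covered by one token.

    Byte-level tokenization scores exactly 1.0, so a value near 4.0
    corresponds to roughly 4x fewer tokens than raw bytes.
    """
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("tokenizer produced no tokens")
    return len(text.encode("utf-8")) / len(tokens)


sample = "the quick brown fox"

# Baseline: one token per byte.
byte_ratio = bytes_per_token(sample, lambda s: list(s.encode("utf-8")))

# Stand-in tokenizer (assumption): whitespace split, NOT SentencePiece.
word_ratio = bytes_per_token(sample, str.split)

print(byte_ratio, word_ratio)
```

To test the actual claim, the `tokenize` argument would be replaced by the encode function of a trained SentencePiece model, run over a representative corpus rather than one sentence.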
they saw ~85% agreement between researchers and some raters - the raters who agreed highly were promoted to “super-raters”
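For reference, an agreement rate like this can be computed as the fraction of items where a rater's label matches the researcher's label. This is a minimal sketch under assumptions: the labels, rater names, and the 0.85 promotion cutoff are all made up for illustration (the notes give the observed agreement, not the actual cutoff used).

```python
def agreement_rate(rater_labels, researcher_labels):
    """Fraction of items on which the rater matches the researcher."""
    if not rater_labels or len(rater_labels) != len(researcher_labels):
        raise ValueError("label lists must be non-empty and the same length")
    matches = sum(r == g for r, g in zip(rater_labels, researcher_labels))
    return matches / len(researcher_labels)


def select_super_raters(labels_by_rater, researcher_labels, threshold=0.85):
    """Names of raters whose agreement meets the (assumed) threshold."""
    return sorted(
        name
        for name, labels in labels_by_rater.items()
        if agreement_rate(labels, researcher_labels) >= threshold
    )


# Hypothetical data: 10 items labeled "A" or "B".
researcher = ["A", "B", "A", "A", "B", "A", "B", "B", "A", "A"]
labels_by_rater = {
    "rater_1": ["A", "B", "A", "A", "B", "A", "B", "B", "A", "B"],  # 1 mismatch
    "rater_2": ["B", "B", "B", "A", "A", "A", "B", "A", "A", "A"],  # 4 mismatches
}

super_raters = select_super_raters(labels_by_rater, researcher)
print(super_raters)
```

Raw percent agreement does not correct for chance agreement; a chance-corrected statistic such as Cohen's kappa would be a stricter alternative.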