Evaluating translation quality...
BLASER scores range from 1 to 5, where 5 indicates perfect semantic equivalence.
COMET scores are mapped to match BLASER's 1-5 range.
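
The text does not say how COMET scores are mapped onto the 1-5 range. One simple possibility is a linear rescaling, sketched below. It assumes raw COMET scores fall roughly in [0, 1], which is not true of every COMET model, and clamps anything outside that interval. The function name `comet_to_blaser_scale` and the `lo`/`hi` parameters are hypothetical, chosen for illustration only.

```python
def comet_to_blaser_scale(score: float, lo: float = 0.0, hi: float = 1.0) -> float:
    """Linearly map a COMET score from [lo, hi] onto a 1-5 scale.

    Assumption: [lo, hi] is the expected range of raw COMET scores.
    Scores outside that range are clamped to its endpoints first.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clamped = min(max(score, lo), hi)
    # lo maps to 1.0 and hi maps to 5.0; everything in between is linear.
    return 1.0 + 4.0 * (clamped - lo) / (hi - lo)


if __name__ == "__main__":
    print(comet_to_blaser_scale(0.5))   # 3.0
    print(comet_to_blaser_scale(0.75))  # 4.0
    print(comet_to_blaser_scale(1.2))   # 5.0 (clamped)
```

A linear map preserves the ranking of systems but does not make the two metrics interchangeable: a rescaled COMET value of 4.0 need not mean the same thing as a BLASER score of 4.0.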