MapReduce LoRA: train reward-specific LoRA experts in parallel (Map) and iteratively merge them (Reduce) with configurable weights (default 1:1:1). Reward-aware Token Embedding (RaTE): learn ...