Abstract: In this paper, we study model merging for large language models (LLMs) under a two-stage optimization framework. Traditional merging methods usually apply fixed blending rates ...
Abstract: In recent years, Transformer-based large language models (LLMs) have demonstrated excellent performance in code generation, but there have been fewer studies on data flow ...