Greetings! I am Yunze Song, a research intern at Tencent and a graduate student at the National University of Singapore.

Email: YunzeSong77 [at] gmail.com

Publications

ICLR 2026

TrustJudge: Inconsistencies of LLM-as-a-Judge and How to Alleviate Them

Yidong Wang*, Yunze Song*, Tingyuan Zhu, Xuanwang Zhang...

We identify two fundamental inconsistencies in LLM-as-a-judge evaluation: Score-Comparison inconsistency and Pairwise Transitivity inconsistency. We propose TrustJudge, a probabilistic framework that uses distribution-sensitive scoring and likelihood-aware aggregation to alleviate these issues and improve the reliability of automated assessment.

EMNLP 2024

RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation

Xuanwang Zhang*, Yunze Song*, Yidong Wang, Shuyun Tang...

Two major obstacles hinder RAG's progress: (1) a lack of thorough and fair comparisons between emerging RAG algorithms; (2) high-level abstractions in open-source tools that reduce transparency and limit innovation. We introduce RAGLAB, a modular, research-focused open-source library that replicates existing algorithms and provides an ecosystem for exploring RAG techniques, enabling fair comparisons across benchmarks.

Education

Internships