
Contact


🌍 Champaign, IL

📩 **s[email protected]**

Google Scholar

X


Misc.

I like Minions, watching stand-up comedy, cars, and planes.


Acknowledgement

I borrowed this template from my amazing lab-mate Deema.

🍟 About Me


I am Dylan, a Ph.D. student in Computer Science at the University of Illinois Urbana-Champaign (UIUC) 🌽, advised by Prof. Hao Peng. I am a proud member of our lab (as of the most recent update, its official name is ALTA).

I’m broadly interested in language model post-training. My previous work has focused on the data-centric aspects of post-training. Moving forward, I’m eager to explore new directions such as offline-to-online reinforcement learning, incentivizing proactive reasoning behaviors for knowledge discovery, and enabling the rapid deployment of foundation model–based agents in novel environments.

I’m also deeply excited about the growing role of foundation models in scientific discovery, and about the rich post-training algorithmic challenges that come with it.

I am looking for research internship opportunities for Summer 2026.

🎓 Education



Ph.D. in Computer Science

University of Illinois Urbana-Champaign 🌽 🐿️

Advisor: Prof. Hao Peng

2022 - Present

🔔 News


📰 Selected Works


2025

The Best Instruction-Tuning Data are Those That Fit

Improving Influence-based Instruction Tuning Data Selection for Balanced Learning of Diverse Capabilities

Entropy-Regularized Process Reward Model

ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting


2024

Only-IF: Revealing the Decisive Effect of Instruction Diversity on Generalization

SciCode: A Research Coding Benchmark Curated by Scientists


2023

Making Large Language Models Better Reasoners with Step-Aware Verifier


2021

Pre-training Co-evolutionary Protein Representation via a Pairwise Masked Language Model

💼 Internship Experience



Student Researcher

Google

Research Intern

Microsoft Research

Research Intern

Microsoft Research


Last updated: September 2025