Yangruibo (Robin) Ding

Email  /  Google Scholar  /  LinkedIn  /  Twitter

Bio:  I am an incoming Assistant Professor of Computer Science at the University of California, Los Angeles (UCLA).

I am finishing my Ph.D. in Computer Science at Columbia University, advised by Prof. Baishakhi Ray and Prof. Gail Kaiser. Previously, I also had wonderful experiences at Google DeepMind, Amazon AWS AI Labs, and IBM Research.

Research:  My research focuses on developing large language models (LLMs) for code. I train LLMs to generate, analyze, and refine software programs. I also construct benchmarks to systematically evaluate LLMs' capabilities in solving software engineering tasks. Most recently, I have been interested in improving LLMs' reasoning capabilities to tackle complex programming tasks, such as debugging and patching.

For an overview of my research, this video might help.

[Prospective Students] I am actively looking for brilliant students to join my research group @ UCLA CS. A solid background in large language models, program analysis, or security is strongly preferred.

If you are interested in working with me, drop me an email with (1) your CV and (2) a brief introduction of your research interests and background.

News

📣 May 2025: I will join the Department of Computer Science @ UCLA as an Assistant Professor!

Sep. 2024: "SemCoder: Training Code Language Models with Comprehensive Semantics Reasoning" was accepted to NeurIPS 2024.

🎉 July 2024: "Vulnerability Detection with Code Language Models: How Far Are We?" was accepted to ICSE 2025.

May 2024: I joined Google DeepMind as a Student Researcher, working on Code LLMs.

Honors and Awards
Publications
SemCoder: Training Code Language Models with Comprehensive Semantics Reasoning
Yangruibo Ding, Jinjun Peng, Marcus J. Min, Gail Kaiser, Junfeng Yang, Baishakhi Ray

NeurIPS 2024
Vulnerability Detection with Code Language Models: How Far Are We?
Yangruibo Ding, Yanjun Fu, Omniyyah Ibrahim, Chawin Sitawarin, Xinyun Chen, Basel Alomair,
David Wagner, Baishakhi Ray, Yizheng Chen

ICSE 2025
CYCLE: Learning to Self-Refine Code Generation
Yangruibo Ding, Marcus J. Min, Gail Kaiser, Baishakhi Ray

OOPSLA 2024
Beyond Accuracy: Evaluating Self-Consistency of Code Large Language Models with IdentityChain
Marcus J. Min, Yangruibo Ding, Luca Buratti, Saurabh Pujar, Gail Kaiser, Suman Jana, Baishakhi Ray

ICLR 2024
TRACED: Execution-aware Pre-training for Source Code
Yangruibo Ding, Ben Steenhoek, Kexin Pei, Gail Kaiser, Wei Le, Baishakhi Ray

ICSE 2024
Automated Code Editing with Search-Generate-Modify
Changshu Liu, Pelin Cetin, Yogesh Patodia, Baishakhi Ray, Saikat Chakraborty, Yangruibo Ding

IEEE Transactions on Software Engineering (TSE)
CoCoMIC: Code Completion By Jointly Modeling In-file and Cross-file Context
Yangruibo Ding*, Zijian Wang*, Wasi Uddin Ahmad*, Murali Krishna Ramanathan, Ramesh Nallapati, Parminder Bhatia, Dan Roth, Bing Xiang (* equal contribution)

LREC-COLING 2024
CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion
Yangruibo Ding*, Zijian Wang*, Wasi Uddin Ahmad*, Hantian Ding, Ming Tan, Nihal Jain, Murali Krishna Ramanathan, Ramesh Nallapati, Parminder Bhatia, Dan Roth, Bing Xiang (* equal contribution)

NeurIPS 2023 (Datasets & Benchmarks)
CONCORD: Clone-aware Contrastive Learning for Source Code
Yangruibo Ding, Saikat Chakraborty, Luca Buratti, Saurabh Pujar, Alessandro Morari, Gail Kaiser, Baishakhi Ray
🏆 ACM SIGSOFT Distinguished Paper Award

ISSTA 2023
NatGen: Generative Pre-training by "Naturalizing" Source Code
Saikat Chakraborty, Toufique Ahmed, Yangruibo Ding, Premkumar Devanbu, Baishakhi Ray

ESEC/FSE 2022
Towards Learning (Dis)-Similarity of Source Code from Program Contrasts
Yangruibo Ding, Luca Buratti, Saurabh Pujar, Alessandro Morari, Baishakhi Ray, Saikat Chakraborty

ACL 2022
Deep Learning Based Vulnerability Detection: Are We There Yet?
Saikat Chakraborty, Rahul Krishna, Yangruibo Ding, Baishakhi Ray
🏆 IEEE TSE Best Paper Award Runner-up

ICSE 2022 (Journal-First), IEEE Transactions on Software Engineering (TSE)
VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements
Yangruibo Ding, Sahil Suneja, Yunhui Zheng, Jim Laredo, Alessandro Morari, Gail Kaiser, Baishakhi Ray

SANER 2022
CODIT: Code Editing With Tree-Based Neural Models
Saikat Chakraborty, Yangruibo Ding, Miltiadis Allamanis, Baishakhi Ray

ICSE 2021 (Journal-First), IEEE Transactions on Software Engineering (TSE)
Patching as Translation: the Data and the Metaphor
Yangruibo Ding, Baishakhi Ray, Premkumar Devanbu, Vincent J Hellendoorn

ASE 2020
Work Experiences
Recent Talks
  • Feb. - Apr. 2025: "From Code Generation Towards Software Engineering: Advancing Code Intelligence w/ Language Models" @ UW, UMD, CMU, UCLA, UTD, JHU, Georgia Tech, Stony Brook, Dartmouth, NUS.
  • Oct. 2024: "Training Code Language Models with Comprehensive Semantics Reasoning" @ UIUC.
  • Oct. 2024: "Semantic-aware Source Code Modeling" @ UMD, NCSU, ASE'24.
  • Aug. 2024: "Training Code Language Models with Comprehensive Semantics Reasoning" @ Google DeepMind.
  • Apr. 2024: "Vulnerability Detection with Code Language Models: How Far Are We?" @ Columbia SSS Seminar.

Services

Program Committee

Conference Reviewer

Journal Reviewer

Teaching


Last Updated: May 2025.

Photo by Lingyi. Website Template by Jon Barron