Yangruibo (Robin) Ding
Email / Google Scholar / LinkedIn / Twitter
I am a final-year Ph.D. student in Computer Science at Columbia University, advised by Prof. Baishakhi Ray and Prof. Gail Kaiser. I have also been a Student Researcher at Google DeepMind since May 2024.
My research focuses on developing large language models for code. I am interested in training language models to learn code-specific semantics (e.g., dynamic execution) and properties (e.g., functionality and constraints) to generate, analyze, and refine software programs. Most recently, I have been working on improving LLMs' reasoning capabilities to tackle complex programming tasks such as debugging and patching.
📣 Office Hours: I hold office hours on Tuesdays, 3-4 PM, offering mentorship and advice to Columbia undergraduate and master's students. If you would like to discuss research ideas with me, please fill out this Form by EOD Monday.
News
⭐ Sep. 2024: "SemCoder: Training Code Language Models with Comprehensive Semantics Reasoning" was accepted to NeurIPS 2024.
🎉 July 2024: "Vulnerability Detection with Code Language Models: How Far Are We?" was accepted to ICSE 2025.
May 2024: I joined Google DeepMind as a Student Researcher, working on Code LLMs.
Jan. 2024: "CYCLE: Learning to Self-Refine Code Generation" was accepted to OOPSLA 2024.
Jan. 2024: "Beyond Accuracy: Evaluating Self-Consistency of Code Large Language Models with IdentityChain" was accepted to ICLR 2024. Congrats to Marcus!
Dec. 2023: "Deep Learning Based Vulnerability Detection: Are We There Yet?" received the IEEE TSE Best Paper Award Runner-up. Congrats to Saikat!
Nov. 2023: I will serve as a Program Committee Member of ASE 2024.
Sep. 2023: Our Datasets and Benchmarks paper, CrossCodeEval, was accepted to NeurIPS 2023.
July 2023: CONCORD received the ACM SIGSOFT Distinguished Paper Award.
SemCoder: Training Code Language Models with Comprehensive Semantics Reasoning
Yangruibo Ding,
Jinjun Peng, Marcus J. Min, Gail Kaiser, Junfeng Yang, Baishakhi Ray
NeurIPS 2024
Vulnerability Detection with Code Language Models: How Far Are We?
Yangruibo Ding, Yanjun Fu, Omniyyah Ibrahim, Chawin Sitawarin, Xinyun Chen, Basel Alomair,
David Wagner, Baishakhi Ray, Yizheng Chen
ICSE 2025
CYCLE: Learning to Self-Refine Code Generation
Yangruibo Ding,
Marcus J. Min, Gail Kaiser, Baishakhi Ray
OOPSLA 2024
Beyond Accuracy: Evaluating Self-Consistency of Code Large Language Models with IdentityChain
Marcus J. Min, Yangruibo Ding, Luca Buratti, Saurabh Pujar, Gail Kaiser, Suman Jana, Baishakhi Ray
ICLR 2024
TRACED: Execution-aware Pre-training for Source Code
Yangruibo Ding,
Ben Steenhoek, Kexin Pei, Gail Kaiser, Wei Le, Baishakhi Ray
ICSE 2024
Automated Code Editing with Search-Generate-Modify
Changshu Liu, Pelin Cetin, Yogesh Patodia, Baishakhi Ray, Saikat Chakraborty, Yangruibo Ding
IEEE Transactions on Software Engineering (TSE)
CoCoMIC: Code Completion By Jointly Modeling In-file and Cross-file Context
Yangruibo Ding*,
Zijian Wang*, Wasi Uddin Ahmad*, Murali Krishna Ramanathan, Ramesh Nallapati, Parminder Bhatia, Dan Roth, Bing Xiang (* equal contribution)
LREC-COLING 2024
CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion
Yangruibo Ding*,
Zijian Wang*, Wasi Uddin Ahmad*, Hantian Ding, Ming Tan, Nihal Jain, Murali Krishna Ramanathan, Ramesh Nallapati, Parminder Bhatia, Dan Roth, Bing Xiang (* equal contribution)
NeurIPS 2023 (Datasets & Benchmarks)
CONCORD: Clone-aware Contrastive Learning for Source Code
Yangruibo Ding,
Saikat Chakraborty, Luca Buratti, Saurabh Pujar, Alessandro Morari, Gail Kaiser, Baishakhi Ray
ACM SIGSOFT Distinguished Paper Award
ISSTA 2023
NatGen: Generative pre-training by "Naturalizing" source code
Saikat Chakraborty, Toufique Ahmed,
Yangruibo Ding,
Premkumar Devanbu, Baishakhi Ray
ESEC/FSE 2022
Towards Learning (Dis)-Similarity of Source Code from Program Contrasts
Yangruibo Ding,
Luca Buratti, Saurabh Pujar, Alessandro Morari, Baishakhi Ray, Saikat Chakraborty
ACL 2022
Deep Learning Based Vulnerability Detection: Are We There Yet?
Saikat Chakraborty, Rahul Krishna,
Yangruibo Ding,
Baishakhi Ray
IEEE TSE Best Paper Award Runner-up
ICSE 2022 (Journal-First), IEEE Transactions on Software Engineering (TSE)
VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements
Yangruibo Ding,
Sahil Suneja, Yunhui Zheng, Jim Laredo, Alessandro Morari, Gail Kaiser, Baishakhi Ray
SANER 2022
CODIT: Code Editing With Tree-Based Neural Models
Saikat Chakraborty,
Yangruibo Ding,
Miltiadis Allamanis, Baishakhi Ray
ICSE 2021 (Journal-First), IEEE Transactions on Software Engineering (TSE)
Patching as Translation: the Data and the Metaphor
Yangruibo Ding,
Baishakhi Ray, Premkumar Devanbu, Vincent J Hellendoorn
ASE 2020
Services
Program Committee
Conference Reviewer
Journal Reviewer
Last Updated: Oct 2024.
Photo by Lingyi. Website Template by Jon Barron