About Me

Hi! I’m Yufan Cao, a second-year Ph.D. student in EECS at BAIR, supervised by Professor Yun S. Song. I also collaborate actively with Lawrence Berkeley National Laboratory and the Innovative Genomics Institute.

Feel free to reach out for a coffee chat - I really love coffee ☕️

If you’d like to support my work, you can 🥺👉buy me a coffee👈🥺

Education

Ph.D. in EECS

2024-09-01

UC Berkeley

B.Eng. in EE

2020-09-01 to 2024-07-01

Tsinghua University

Research Interests

Biological Language Models
Evolutionary Genomics
Generative Sequence Modeling
Computational Biology

🔬 My Research

My research explores how machine learning models can capture structure, evolution, and function in biological sequences. I am particularly interested in generative and evolutionary modeling frameworks that connect sequence data with the underlying biological mechanisms. This interest grew out of my earlier work on graph neural networks and symbolic modeling, where I studied how expressive models can represent the mechanisms that govern physical systems.

Over time, my focus shifted toward biological data, particularly single-cell and genomic settings, where complexity and noise demand statistically grounded, generative, and evolutionary perspectives. I am now interested in building AI systems that not only model biological sequences, but also help reveal principles that can guide biological understanding and experimentation.

Publications

AI for Physics

(2022). Learning Symbolic Models for Graph-Structured Physical Mechanism. In ICLR 2022.

AutoML for Recommender Systems

(2025). Towards Automated Model Design on Recommender Systems. ACM TORS, 3(3), 1-23.
(2023). Farthest Greedy Path Sampling for Two-shot Recommender Search.