Bo Liu (Benjamin Liu)

Email: benjaminliu [dot] eecs [at] gmail [dot] com

I am a Ph.D. student at the National University of Singapore in the Department of Computer Science, advised by Prof. Wee Sun Lee and Prof. David Hsu. My research interests lie at the intersection of Reinforcement Learning, Reasoning, and Machine Learning Systems, with applications in complex, real-world environments.

I recently worked at DeepSeek as a Student Researcher on foundation models. I hope to build upon scaling laws with my research to create an autonomous decision-making system that can act intelligently in any unknown environment.

Before that, I worked as a Research Assistant with Prof. Jun Wang. I also had the privilege of working closely with Prof. Yaodong Yang. I received my B.S. in Machine Intelligence and B.A. in Economics from Peking University in 2020, where I was advised by Prof. Zongqing Lu.

I love playing soccer in my free time. I am also open to collaborating on exploring the potential of reinforcement learning across various fields. If you’re interested in discussing new ideas or collaborating, feel free to drop me an email or schedule a meeting with me here!

News

Aug 15, 2024 One model DeepSeek-Prover-V1.5 has been released. This model enhances theorem proving in Lean 4 with state-of-the-art performance, including a 63.5% success rate on miniF2F. Available for research and commercial use.
May 23, 2024 One model DeepSeek-Prover achieves new state-of-the-art results in theorem proving, leveraging large-scale synthetic data to outperform GPT-4 on benchmarks like miniF2F and FIMO. Available for research and commercial use.
May 7, 2024 One model DeepSeek-V2, a 236B Mixture-of-Experts language model, has been released. It offers stronger performance with 42.5% lower training costs and a 93.3% smaller KV cache. Available for research and commercial use.
Mar 8, 2024 One model DeepSeek-VL, a state-of-the-art Vision-Language model designed for real-world applications, is now available. Supports multimodal tasks like logical diagrams, scientific literature, and more. Released for research and commercial use.
Feb 24, 2024 One paper Grasp Multiple Objects With One Hand accepted at RA-L 2024. Presented at IROS 2024 as an oral presentation.
Jan 5, 2024 One model DeepSeek-LLM, a 67B parameter language model, outperforms LLaMA2 70B in key tasks. Available for research and commercial use.
Nov 10, 2023 One paper TorchOpt: An Efficient Library for Differentiable Optimization accepted at JMLR 2023.
May 19, 2023 One open source project TorchOpt accepted as a PyTorch Ecosystem project. Check out the blog post!
Oct 21, 2022 One paper TorchOpt: An Efficient Library for Differentiable Optimization accepted at NeurIPS 2022 Workshop OPT.
Sep 27, 2022 Two papers A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning and EnvPool: A Highly Parallel Reinforcement Learning Environment Execution Engine accepted at NeurIPS 2022.

Selected Publications [Full List]

  1. Preprint
    DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
    Huajian Xin, Z. Z. Ren, Junxiao Song, Zhihong Shao, Wanjia Zhao, Haocheng Wang, Bo Liu, Liyue Zhang, Xuan Lu, Qiushi Du, Wenjun Gao, Qihao Zhu, Dejian Yang, Zhibin Gou, Z. F. Wu, Fuli Luo, and Chong Ruan
    ArXiv preprint, 2024.
  2. Preprint
    DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
    DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Deng, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, Hao Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding, Huajian Xin, Huazuo Gao, Hui Li, Hui Qu, J. L. Cai, Jian Liang, Jianzhong Guo, Jiaqi Ni, Jiashi Li, Jin Chen, Jingyang Yuan, Junjie Qiu, Junxiao Song, Kai Dong, Kaige Gao, Kang Guan, Lean Wang, Lecong Zhang, Lei Xu, Leyi Xia, Liang Zhao, Liyue Zhang, Meng Li, Miaojun Wang, Mingchuan Zhang, Minghua Zhang, Minghui Tang, Mingming Li, Ning Tian, Panpan Huang, Peiyi Wang, Peng Zhang, Qihao Zhu, Qinyu Chen, Qiushi Du, R. J. Chen, R. L. Jin, Ruiqi Ge, Ruizhe Pan, Runxin Xu, Ruyi Chen, S. S. Li, Shanghao Lu, Shangyan Zhou, Shanhuang Chen, Shaoqing Wu, Shengfeng Ye, Shirong Ma, Shiyu Wang, Shuang Zhou, Shuiping Yu, Shunfeng Zhou, Size Zheng, Tao Wang, Tian Pei, Tian Yuan, Tianyu Sun, W. L. Xiao, Wangding Zeng, Wei An, Wen Liu, Wenfeng Liang, Wenjun Gao, Wentao Zhang, X. Q. Li, Xiangyue Jin, Xianzu Wang, Xiao Bi, Xiaodong Liu, Xiaohan Wang, Xiaojin Shen, Xiaokang Chen, Xiaosha Chen, Xiaotao Nie, and Xiaowen Sun
    ArXiv preprint, 2024.
  3. Preprint
    DeepSeek-VL: Towards Real-World Vision-Language Understanding
    Haoyu Lu, Wen Liu, Bo Zhang, Bingxuan Wang, Kai Dong, Bo Liu, Jingxiang Sun, Tongzheng Ren, Zhuoshu Li, Hao Yang, Yaofeng Sun, Chengqi Deng, Hanwei Xu, Zhenda Xie, and Chong Ruan
    ArXiv preprint, 2024.
  4. RA-L & IROS Oral
    Grasp Multiple Objects With One Hand
    Yuyang Li, Bo Liu, Yiran Geng, Puhao Li, Yaodong Yang, Yixin Zhu, Tengyu Liu, and Siyuan Huang
    The IEEE Robotics and Automation Letters, 2024. Abridged in the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2024.
    Oral Presentation
  5. JMLR
    TorchOpt: An Efficient Library for Differentiable Optimization
    The Journal of Machine Learning Research, 2023. Abridged in the 36th Conference on Neural Information Processing Systems OPT Workshop, 2022.
    PyTorch Ecosystem Project
  6. NeurIPS
    A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning
    The 36th Conference on Neural Information Processing Systems, 2022.
  7. NeurIPS
    EnvPool: A Highly Parallel Reinforcement Learning Environment Execution Engine
    The 36th Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2022.
  8. AAMAS Oral
    Learning Correlated Communication Topology in Multi-Agent Reinforcement Learning
    The 20th International Conference on Autonomous Agents and Multiagent Systems, 2021.
    Oral Presentation
Last updated: September 11, 2024.