Tianxiang Sun

I am a 5th-year Ph.D. student at the NLP Lab at Fudan University, advised by Prof. Xipeng Qiu and Prof. Xuanjing Huang. I am also interning at Shanghai AI Laboratory. Previously, I interned at Alibaba DAMO Academy and the Amazon Shanghai AI Lab.

My research interests lie in machine learning and natural language processing, particularly in pre-trained language models and methods for making them optimization-, inference-, and data-efficient. Reach out to me via email: txsun19@fudan.edu.cn.

CV  /  Google Scholar  /  Github  /  Twitter  /  Zhihu

  • [May 2023] Four papers accepted to ACL 2023!
  • [Feb. 2023] We are excited to release MOSS, a conversational language model. Try it now at moss.fastnlp.top.
  • [Oct. 2022] Three papers accepted to EMNLP 2022!
  • [Aug. 2022] I gave a talk on LMaaS and black-box tuning at AI Time.
  • [Aug. 2022] I am co-organizing a PLM-tuning competition (total prize of 1 million RMB) with Zhengfu He. Welcome!
  • [July 2022] I gave a talk on derivative-free optimization for pre-trained language models at MLNLP. Slides here.
  • [July 2022] We have released a paper list on Language-Model-as-a-Service (LMaaS). Feel free to submit pull requests!
  • [May 2022] One paper accepted to ICML 2022 (21.9% acceptance rate)!
Selected Publications (All Papers)

Google Scholar / Semantic Scholar / DBLP / ORCID

(*: Equal contribution)

Black-Box Tuning for Language-Model-as-a-Service
Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
ICML, 2022   (Spotlight)
pdf / code / slides

We propose a promising and practical scenario, Language-Model-as-a-Service (LMaaS), where users cannot access model parameters or gradients but only the language model's output probabilities. For this scenario, we propose black-box tuning, which optimizes continuous prompts via derivative-free optimization.
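The idea can be illustrated with a toy sketch. Here the quadratic `black_box_loss` is a hypothetical stand-in for the scores a user could derive from a service's output probabilities, and a simple (1+1) evolution strategy stands in for the stronger derivative-free optimizer (e.g., CMA-ES over a low-dimensional subspace) used in the paper; names and settings are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
target = rng.normal(size=10)  # hidden optimum, unknown to the optimizer

def black_box_loss(prompt):
    # Stand-in for one service call: the user observes only a scalar
    # score derived from output probabilities, never any gradients.
    return float(np.sum((prompt - target) ** 2))

def dfo_tune(dim=10, iters=500, sigma=0.3, seed=1):
    """(1+1) evolution strategy: keep a candidate prompt, propose a
    Gaussian perturbation, and accept it if the black-box loss improves."""
    rng = np.random.default_rng(seed)
    best = np.zeros(dim)
    best_loss = black_box_loss(best)
    for _ in range(iters):
        cand = best + sigma * rng.normal(size=dim)
        loss = black_box_loss(cand)
        if loss < best_loss:
            best, best_loss = cand, loss
    return best, best_loss

prompt, loss = dfo_tune()
```

Because every query costs one API call, query-efficient optimizers matter far more here than in the gradient-based setting.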

Paradigm Shift in Natural Language Processing
Tianxiang Sun, Xiangyang Liu, Xipeng Qiu, Xuanjing Huang
Machine Intelligence Research, 2022   (Invited Paper)
pdf / project / slides

Recent years have witnessed a paradigm shift across a variety of NLP tasks: a task originally performed with one paradigm (e.g., sequence labeling) is solved with another paradigm (e.g., machine reading comprehension).

Towards Efficient NLP: A Standard Evaluation and A Strong Baseline
Xiangyang Liu*, Tianxiang Sun*, Junliang He, Jiawen Wu, Lingling Wu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu
NAACL, 2022   (Oral Presentation)
pdf / code / benchmark / slides

We propose a benchmark, ELUE (Efficient Language Understanding Evaluation), for efficient NLP models and a strong baseline/backbone pre-trained model, ElasticBERT.

CoLAKE: Contextualized Language and Knowledge Embedding
Tianxiang Sun, Yunfan Shao, Xipeng Qiu, Qipeng Guo, Yaru Hu, Xuanjing Huang, Zheng Zhang
COLING, 2020
pdf / code / slides

We pre-train a model called CoLAKE that jointly learns language and knowledge representations by unifying language and knowledge into word-knowledge graphs.

Pre-trained Models for Natural Language Processing: A Survey
Xipeng Qiu, Tianxiang Sun, Yige Xu, Yunfan Shao, Ning Dai, Xuanjing Huang
SCIENCE CHINA Technological Sciences, 2020   (Invited Paper, Most Influential Paper of SCTS in 2020)

We provide a comprehensive survey of pre-trained models (PTMs) for NLP, ranging from non-contextual word embeddings to state-of-the-art language models. This is a hands-on guide for understanding, using, and developing PTMs for various NLP tasks.

Learning Sparse Sharing Architectures for Multiple Tasks
Tianxiang Sun*, Yunfan Shao*, Xiaonan Li, Pengfei Liu, Hang Yan, Xipeng Qiu, Xuanjing Huang
AAAI, 2020   (Oral Presentation)
pdf / code / slides

We propose a new parameter-sharing mechanism for multi-task learning, sparse sharing, which allocates each task a subnet based on the lottery ticket hypothesis. Sparse sharing successfully avoids negative transfer between tasks.
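A minimal sketch of the sharing mechanism (the names and the magnitude-pruning rule below are illustrative assumptions, not the paper's exact procedure): each task keeps a binary mask over one shared parameter matrix, so a task only uses, and would only update, its own subnet, while tasks overlap wherever their masks agree.

```python
import numpy as np

rng = np.random.default_rng(0)
shared = rng.normal(size=(8, 8))  # parameters shared by all tasks

def magnitude_mask(weights, keep):
    """Keep the largest-magnitude fraction of weights (lottery-ticket style)."""
    k = int(weights.size * keep)
    thresh = np.sort(np.abs(weights).ravel())[-k]
    return (np.abs(weights) >= thresh).astype(float)

# Each task extracts its own subnet from the shared parameters;
# sparser tasks (smaller `keep`) claim fewer of the shared weights.
masks = {task: magnitude_mask(shared, keep)
         for task, keep in [("pos", 0.5), ("ner", 0.3)]}

def task_forward(task, x):
    # Only the task's subnet participates in the forward pass.
    return x @ (shared * masks[task])

y = task_forward("ner", np.ones(8))
```

Negative transfer is mitigated because two tasks interact only through the weights present in both masks, rather than through every shared parameter.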

Projects & Resources
MOSS: A Conversational Language Model
project led by Tianxiang Sun

MOSS is a conversational language model like ChatGPT. It is capable of following users' instructions to perform various natural language tasks, including question answering, text generation, text summarization, code generation, etc. MOSS is also able to challenge incorrect premises and reject inappropriate requests. Here is a brief introduction to MOSS.

Paper List on Language-Model-as-a-Service (LMaaS)
maintained by Tianxiang Sun

Pre-trained large language models (LLMs) such as GPT-3 are usually released as a service rather than as open-sourced model weights. We call this scenario "Language-Model-as-a-Service (LMaaS)", in which users access powerful LLMs only through their inference APIs. We maintain a curated list of papers that fit this scenario.

Awards

  • WAIC Yunfan Award (Rising Star, 2023)
  • Academic Star (Fudan University; 10 winners across all STEM graduate schools, 2023)
  • Most Influential Paper Award of Sci. China Tech. Sci. (2022)
  • National Scholarship (Ministry of Education, China, 2020)
  • Outstanding Graduate (Xidian University, 2019)
  • First Prize in the China High School Biology Olympiad (2014)

Services

Student Seminar Co-Chair

  • CCL 2023

Reviewer / Program Committee Member

  • ACL (2021, 2022, 2023)
  • EMNLP (2021, 2022, 2023)
  • COLING (2020, 2022)
  • ICML (2022)
  • NeurIPS (2022, 2023)
  • AAAI (2021)
  • IJCAI (2021)
Since 2022, I no longer submit papers or reviews to AAAI/IJCAI :)

Design and source code from Jon Barron's website