Tianxiang Sun

I am a 5th-year Ph.D. student at the NLP Lab at Fudan University, advised by Prof. Xipeng Qiu and Prof. Xuanjing Huang. Previously, I interned at Shanghai AI Laboratory (2023), Alibaba DAMO Academy (2022), and Amazon Shanghai AI Lab (2019-2020).

My research interests lie in machine learning and natural language processing, particularly in pre-trained large language models and in optimization-, inference-, and data-efficient methods for them. Reach me by email at txsun19@fudan.edu.cn.

CV  /  Google Scholar  /  Github  /  Twitter  /  OpenMOSS

News
  • [Mar. 2024] Excited to announce OpenMOSS!
  • [May 2023] Four papers accepted to ACL 2023!
  • [Feb. 2023] We are excited to release MOSS, a conversational language model.
  • [Oct. 2022] Three papers accepted to EMNLP 2022!
  • [Aug. 2022] I gave a talk on LMaaS and black-box tuning at AI Time.
  • [Aug. 2022] I am co-organizing a PLM-tuning competition (total prize of 1 million RMB) with Zhengfu He. Welcome!
  • [July 2022] I gave a talk on derivative-free optimization for pre-trained language models at MLNLP. Slides here.
  • [July 2022] We have released a paper list on Language-Model-as-a-Service (LMaaS). Feel free to submit pull requests!
  • [May 2022] One paper accepted to ICML 2022 (21.9% acceptance rate)!
Highlighted Papers

A full list of papers can be found at Google Scholar / Semantic Scholar / DBLP / ORCID

(*: Equal contribution)

Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic Interpretability: A Case Study on Othello-GPT
Zhengfu He, Xuyang Ge, Qiong Tang, Tianxiang Sun, Qinyuan Cheng, Xipeng Qiu
arXiv:2402.12201, 2024
pdf / blog on OpenMOSS

Sparse dictionary learning has rapidly become a key technique in mechanistic interpretability for tackling superposition and extracting more human-understandable features from model activations. Building on these more monosemantic features, we ask a further question: how do we identify circuits connecting the enormous number of dictionary features? We propose a circuit discovery framework that serves as an alternative to activation patching.
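For readers unfamiliar with the underlying technique, here is a minimal sketch of sparse dictionary learning on model activations via a sparse autoencoder. It is illustrative only, not the paper's code: dimensions, the ReLU encoder, and the L1 coefficient are assumptions.

```python
# Minimal sparse-autoencoder sketch for dictionary learning on activations.
# Illustrative only; sizes and hyperparameters are assumptions, not the paper's setup.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, n_features: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_features)   # activations -> feature coefficients
        self.decoder = nn.Linear(n_features, d_model)   # dictionary features -> reconstruction

    def forward(self, acts: torch.Tensor):
        codes = torch.relu(self.encoder(acts))          # sparse, non-negative codes
        recon = self.decoder(codes)
        return recon, codes

def sae_loss(recon, acts, codes, l1_coef=1e-3):
    # Reconstruction error plus an L1 penalty encouraging sparse codes.
    return ((recon - acts) ** 2).mean() + l1_coef * codes.abs().mean()

# Usage: cache activations from a model layer, then train the SAE on them.
sae = SparseAutoencoder(d_model=512, n_features=4096)
acts = torch.randn(64, 512)                             # stand-in for cached activations
recon, codes = sae(acts)
loss = sae_loss(recon, acts, codes)
loss.backward()
```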

Can AI Assistants Know What They Don't Know?
Qinyuan Cheng*, Tianxiang Sun*, Xiangyang Liu, Wenwei Zhang, Zhangyue Yin, Shimin Li, Linyang Li, Zhengfu He, Kai Chen, Xipeng Qiu
arXiv:2401.13275, 2024
pdf / code / blog on OpenMOSS

We ask the question: "Can AI assistants know what they don't know and express this through natural language?" To answer it, we construct a model-specific "I don't know" (Idk) dataset for an assistant, containing its known and unknown questions, based on existing open-domain question answering datasets. We then align the assistant with its Idk dataset and observe whether it refuses to answer its unknown questions after alignment.
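A hypothetical sketch of the data-construction step, under the assumption that "known" vs. "unknown" is decided by whether the assistant answers a question correctly across repeated samples; `generate_answer` and `is_correct` are placeholder names for model inference and answer matching, not functions from the paper's code.

```python
# Hypothetical sketch: build a model-specific "I don't know" (Idk) dataset from
# an open-domain QA set. Questions the assistant usually answers correctly are
# "known"; the rest become "unknown" with a refusal target.
def build_idk_dataset(qa_pairs, generate_answer, is_correct, n_samples=5):
    known, unknown = [], []
    for question, gold in qa_pairs:
        # Sample several answers; the accuracy rate decides known vs. unknown.
        hits = sum(is_correct(generate_answer(question), gold) for _ in range(n_samples))
        if hits / n_samples >= 0.5:
            known.append({"question": question, "answer": gold})
        else:
            unknown.append({"question": question, "answer": "I don't know."})
    return known, unknown
```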

Black-Box Tuning for Language-Model-as-a-Service
Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
ICML, 2022   (Spotlight)
pdf / code / slides

We propose a promising and practical scenario, Language-Model-as-a-Service (LMaaS), in which users cannot access model parameters or gradients and can only query the language model's output probabilities. For this scenario, we propose black-box tuning, which optimizes continuous prompts via derivative-free optimization.
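A minimal sketch of the idea: a low-dimensional vector is projected into the prompt-embedding space by a fixed random matrix and optimized without gradients. The paper uses CMA-ES; for brevity the sketch below uses a simple (1+1) evolution strategy, and `query_loss` stands in for an API call that returns a loss computed from the service's output probabilities. All names and dimensions are illustrative assumptions.

```python
# Sketch of black-box prompt tuning via derivative-free optimization.
# A low-dim variable z is projected to a continuous prompt and optimized
# by querying the service for a loss, with no access to gradients.
import numpy as np

def black_box_tune(query_loss, d_low=50, prompt_len=10, d_emb=768,
                   sigma=0.1, steps=200, seed=0):
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(prompt_len * d_emb, d_low))    # fixed random projection
    z = np.zeros(d_low)                                  # low-dim variable to optimize
    best = query_loss((A @ z).reshape(prompt_len, d_emb))
    for _ in range(steps):
        cand = z + sigma * rng.normal(size=d_low)        # perturb; no gradients needed
        loss = query_loss((A @ cand).reshape(prompt_len, d_emb))
        if loss < best:                                   # keep the candidate if it improves
            z, best = cand, loss
    return (A @ z).reshape(prompt_len, d_emb), best
```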

Paradigm Shift in Natural Language Processing
Tianxiang Sun, Xiangyang Liu, Xipeng Qiu, Xuanjing Huang
Machine Intelligence Research, 2022   (Invited Paper)
pdf / project / slides

Recent years have witnessed a paradigm shift across a variety of NLP tasks: a task originally solved with one paradigm (e.g., sequence labeling) is reformulated and solved with another (e.g., machine reading comprehension).

Towards Efficient NLP: A Standard Evaluation and A Strong Baseline
Xiangyang Liu*, Tianxiang Sun*, Junliang He, Jiawen Wu, Lingling Wu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu
NAACL, 2022   (Oral Presentation)
pdf / code / benchmark / slides

We propose ELUE (Efficient Language Understanding Evaluation), a benchmark for efficient NLP models, together with a strong baseline/backbone pre-trained model, ElasticBERT.

CoLAKE: Contextualized Language and Knowledge Embedding
Tianxiang Sun, Yunfan Shao, Xipeng Qiu, Qipeng Guo, Yaru Hu, Xuanjing Huang, Zheng Zhang
COLING, 2020
pdf / code / slides

We pre-train a model called CoLAKE for jointly learning language and knowledge representation by unifying language and knowledge into word-knowledge graphs.

Pre-trained Models for Natural Language Processing: A Survey
Xipeng Qiu, Tianxiang Sun, Yige Xu, Yunfan Shao, Ning Dai, Xuanjing Huang
SCIENCE CHINA Technological Sciences, 2020   (Invited Paper, Most Influential Paper of SCTS in 2020)
pdf

We provide a comprehensive survey of pre-trained models (PTMs) for NLP, ranging from non-contextual word embeddings to state-of-the-art language models. This is a hands-on guide for understanding, using, and developing PTMs for various NLP tasks.

Learning Sparse Sharing Architectures for Multiple Tasks
Tianxiang Sun*, Yunfan Shao*, Xiaonan Li, Pengfei Liu, Hang Yan, Xipeng Qiu, Xuanjing Huang
AAAI, 2020   (Oral Presentation)
pdf / code / slides

We propose a new parameter sharing mechanism for multi-task learning, sparse sharing, which allocates a subnetwork to each task based on the lottery ticket hypothesis. Sparse sharing successfully avoids negative transfer between tasks.
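A hypothetical sketch of the sharing mechanism: all tasks share one set of base parameters, and each task applies its own binary mask selecting its subnetwork (in the paper, masks come from lottery-ticket-style pruning; here they are random stand-ins). Shapes and names are illustrative.

```python
# Sketch of sparse sharing: shared weights plus one fixed binary mask per task.
import torch
import torch.nn as nn

class SparseSharedLinear(nn.Module):
    def __init__(self, d_in, d_out, n_tasks):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(d_out, d_in) * 0.02)  # shared weights
        # One binary mask per task, selecting that task's subnetwork
        # (random here; found by pruning in practice).
        self.register_buffer("masks", (torch.rand(n_tasks, d_out, d_in) > 0.5).float())

    def forward(self, x, task_id):
        # Only the task's subnet participates in the forward (and backward) pass.
        return x @ (self.weight * self.masks[task_id]).t()

layer = SparseSharedLinear(d_in=128, d_out=64, n_tasks=3)
out = layer(torch.randn(8, 128), task_id=1)
```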

Projects & Resources
MOSS: A Conversational Language Model
project led by Tianxiang Sun

MOSS is a conversational language model like ChatGPT. It can follow users' instructions to perform various natural language tasks, including question answering, text generation, text summarization, code generation, etc. MOSS can also challenge incorrect premises and reject inappropriate requests. Here is a brief introduction to MOSS.

Paper List on Language-Model-as-a-Service (LMaaS)
maintained by Tianxiang Sun

Pre-trained large language models (LLMs) such as GPT-3 are usually released as a service rather than by open-sourcing model weights. We call this scenario "Language-Model-as-a-Service (LMaaS)", in which users access powerful LLMs through their inference APIs. We maintain a curated list of papers that fit this scenario.
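To make the scenario concrete, here is a sketch of an LMaaS-style client: the user sends text to a hosted model and reads back output probabilities, with no access to weights or gradients. The endpoint URL, request fields, and response schema are all made up for illustration and do not correspond to any real provider's API.

```python
# Hypothetical LMaaS client: query a hosted model for label probabilities.
import requests

def query_label_probs(text: str, labels: list[str]) -> dict[str, float]:
    resp = requests.post(
        "https://api.example.com/v1/completions",    # placeholder endpoint
        json={"prompt": text, "candidates": labels, "return_probs": True},
        timeout=30,
    )
    resp.raise_for_status()
    probs = resp.json()["probs"]                     # assumed response field
    return dict(zip(labels, probs))

# Usage: classify a sentence by comparing label probabilities from the service.
# probs = query_label_probs("The movie was great.", ["positive", "negative"])
```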

Awards
  • ByteDance Scholarship (13 winners in China, 2023)
  • National Scholarship (Ministry of Education, China, 2023)
  • WAIC Yunfan Award - Rising Star (15 winners across the world, 2023)
  • Fudan Academic Star (10 winners across STEM graduate schools, 2023)
  • Most Influential Paper Award of Sci. China Tech Sci. (2022)
  • National Scholarship (Ministry of Education, China, 2020)
  • Outstanding Graduate (Xidian University, 2019)
  • First Prize in China High School Biology Olympiad (2014)
Service

Student Seminar Co-Chair

  • CCL 2023

Reviewer / Program Committee Member

  • ACL (2021, 2022, 2023)
  • EMNLP (2021, 2022, 2023)
  • COLING (2020, 2022)
  • ICML (2022)
  • ICLR (2023)
  • NeurIPS (2022, 2023)
  • AAAI (2021)
  • IJCAI (2021)

Design and source code from Jon Barron's website