About me

Currently, I work at Alibaba Tongyi Lab. I received my Ph.D. degree from ShanghaiTech University in 2023, where I was advised by Prof. Kewei Tu. My research focuses mainly on Retrieval-Augmented Generation (RAG), knowledge boundary, and agents for textual LLMs and multi-modal LLMs. Before I received my Ph.D., my research focused mainly on Structured Prediction (Syntactic/Semantic Dependency Parsing, Sequence Labeling (such as Named Entity Recognition), and Neural Architecture Search in structured prediction), Knowledge Distillation, and Multilingual NLP. I also visited Prof. Lu Wei’s lab at SUTD as a visiting research fellow.

Here are my Publications and CV.

I am currently hiring self-motivated researchers and interns in Shanghai and Hangzhou. Please feel free to send your resume to tomas.wxy@alibaba-inc.com.


2024-12 Check out our new work OmniSearch: A Self-Adaptive Planning Agent For Multimodal RAG for VQA. GitHub and Demo

2024-12 Our paper Let LLMs Take on the Latest Challenges! A Chinese Dynamic Question Answering Benchmark was accepted to COLING 2025!

2024-07 Our paper Improving Retrieval Augmented Open-Domain Question-Answering with Vectorized Contexts was accepted to ACL-Findings 2024!

2024-04 Our new book 《动手学自然语言处理》 (Hands-On Natural Language Processing) is released! It is a Chinese NLP introductory book for beginners. Book Link

2023-12 I received an Outstanding Ph.D. Thesis nomination from CIPS (Chinese Information Processing Society of China).

2023-06 I received the President’s Award at ShanghaiTech University, which goes to the top 2 out of all 717 graduate students.

2022-10 Our paper Named Entity and Relation Extraction with Multi-Modal Retrieval was accepted to the Findings of EMNLP 2022!

2022-07 Our paper A Knowledge-based System for Multilingual Named Entity Recognition won the Best System Paper Award (top 0.45% = 1/221) at SemEval 2022!

2022-04 Our paper ITA: Image-Text Alignments for Multi-Modal Named Entity Recognition was accepted to the Main Conference of NAACL 2022!

2022-02: Our system won the SemEval-2022 Multilingual Complex Named Entity Recognition shared task in 10 out of 13 tracks! Paper

2021-05: Our paper Automated Concatenation of Embeddings for Structured Prediction was accepted to the Main Conference of ACL 2021!

2021-05: Our paper Structural Knowledge Distillation: Tractably Distilling Information for Structured Predictor was accepted to the Main Conference of ACL 2021!

2021-05: Our paper Improving Named Entity Recognition by External Context Retrieving and Cooperative Learning was accepted to the Main Conference of ACL 2021!

2021-04: I received the Excellent Intern award, which is given to the top 12 outstanding interns at Alibaba Group in 2020.

2020-09: Our paper AIN: Fast and Accurate Sequence Labeling with Approximate Inference Network was accepted to the Main Conference of EMNLP 2020!

2020-09: Our paper More Embeddings, Better Sequence Labelers? was accepted to Findings of EMNLP 2020!

2020-09: Our paper Second-Order Neural Dependency Parsing with Message Passing and End-to-End Training was accepted to AACL 2020!

2020-04: Our paper Structure-Level Knowledge Distillation For Multilingual Sequence Labeling was accepted to ACL 2020!