Te-Lin Wu (吳德霖)

PhD Student
University of California, Los Angeles
Email: telinwu [at] cs (dot) ucla (dot) edu

Google Scholar / LinkedIn / GitHub


I am open to 2024 industry job opportunities!
If you think I might be a good fit for your team/group, I'd be delighted to chat!

I am a final-year PhD student at UCLA PlusLab, advised by Nanyun (Violet) Peng, where my research focuses on multimodal models across NLP and computer vision.
I have also worked with Joseph J. Lim on reinforcement learning and vision for robotics.
Prior to my PhD, I obtained my M.S. at Stanford University, where I was advised by Silvio Savarese, and I did my undergrad at National Tsing-Hua University (國立清華大學).

Over the summers, I've been lucky to work as a research intern in several wonderful groups, including Google Research, Meta Reality Labs, Amazon AI, and Adobe Research.

Selected Publications

For a full list of my publications, please see here. (* denotes equal contribution.)
 
Localizing Active Objects from Egocentric Vision with Symbolic World Knowledge
Te-Lin Wu*, Yu Zhou*, Nanyun Peng
EMNLP 2023 / Paper / Video

A novel technique to ground active objects in egocentric vision with LLM-enhanced symbolic world knowledge.
 
ACQUIRED: A Dataset for Answering Counterfactual Questions In Real-Life Videos
Te-Lin Wu*, Zi-Yi Dou*, Qingyuan Hu*, Yu Hou, Nischal Chandra, Marjorie Freedman, Ralph Weischedel, Nanyun Peng
EMNLP 2023 / Paper / Video

A novel dataset for understanding counterfactual commonsense reasoning in videos.
 
SIMMC-VR: A Task-oriented Multimodal Dialog Dataset with Situated and Immersive VR Streams
Te-Lin Wu, Satwik Kottur, Andrea Madotto, Mahmoud Azab, Pedro Rodriguez, Babak Damavandi, Nanyun Peng, Seungwhan Moon
ACL 2023 / Paper / Video

A dataset for situated conversational agents with applications in AR/VR shopping domains.

Learning Action Conditions from Instructional Manuals for Instruction Understanding
Te-Lin Wu, Caiqi Zhang, Qingyuan Hu, Alex Spangher, Nanyun Peng
ACL 2023 / Paper / Video

We learn a model to infer pre- and post-conditions of actionables in instructional manuals via a weakly supervised method.

Character-Centric Story Visualization via Visual Planning and Token Alignment
Hong Chen, Rujun Han, Te-Lin Wu, Hideki Nakayama, Nanyun Peng
EMNLP 2022 / Paper / Code

A method that utilizes Grad-CAM to propose plausible character plans for the story visualization task.

Understanding Multimodal Procedural Knowledge by Sequencing Multimodal Instructional Manuals
Te-Lin Wu, Alex Spangher, Pegah Alipoormolabashi, Marjorie Freedman, Ralph Weischedel, Nanyun Peng
ACL 2022 / Paper / Video

We propose several sequence-aware pre-training objectives to equip multimodal models with task-order knowledge.

HyperExpan: Taxonomy Expansion with Hyperbolic Representation Learning
Mingyu Ma, Muhao Chen*, Te-Lin Wu*, Nanyun Peng
Findings of EMNLP 2021

Using a hyperbolic representation learning scheme is more effective for KG taxonomy expansion.

COM2SENSE: A Commonsense Reasoning Benchmark with Complementary Sentences
Shikhar Singh*, Nuan Wen*, Yu Hou, Pegah Alipoormolabashi, Te-Lin Wu, Xuezhe Ma, Nanyun Peng
Findings of ACL 2021 / Paper / Dataset & Code

A dataset for complementary commonsense reasoning, collected via a gamified model-in-the-loop process.

MELINDA: A Multimodal Dataset for Biomedical Experiment Method Classification
Te-Lin Wu, Shikhar Singh, Sayan Paul, Gully Burns, Nanyun Peng
AAAI 2021 / Paper / Dataset & Code

A multimodal dataset for biomedical experiment method classification.

LAMPRET: Layout-Aware Multimodal PreTraining for Document Understanding
Te-Lin Wu, Cheng Li, Mingyang Zhang, Tao Chen, Spurthi Amba Hombaiah, Michael Bendersky
ViGIL Workshop, NAACL 2021 / Paper

A pre-training paradigm that exploits document layout to learn document representations.

Program Guided Agent
Shao-Hua Sun, Te-Lin Wu, Joseph J. Lim
ICLR 2020

A framework to programmatically control an RL-trained agent.

Demo2Vec: Reasoning Object Affordances from Online Videos
Te-Lin Wu*, Kuan Fang*, Daniel Yang, Silvio Savarese, Joseph J. Lim
CVPR 2018

Learning to infer object affordances from video demonstrations of how to interact with objects.

Feedback Networks
Te-Lin Wu*, Amir R. Zamir*, Lin Sun, William B. Shen, Bertram E. Shi, Jitendra Malik, Silvio Savarese
CVPR 2017

A study of feedback mechanisms in convolutional neural networks.