Publications

Improving Probability-based Prompt Selection Through Unified Evaluation and Analysis

Large Language Models (LLMs) have demonstrated great capabilities in solving a wide range of tasks in a resource-efficient manner …

In-Context Instruction Learning

Instruction learning of Large Language Models (LLMs) has enabled zero-shot task generalization. However, instruction learning has been …

Knowledge Unlearning for Mitigating Privacy Risks in Language Models

Pretrained Language Models (LMs) memorize a vast amount of knowledge during initial pretraining, including information that may violate …

Contextualized Generative Retrieval

The text retrieval task is mainly performed in two ways: the bi-encoder approach and the generative approach. The bi-encoder approach …

Generative Multi-hop Retrieval

Multi-hop retrieval is the task of retrieving a series of multiple documents that together provide sufficient evidence to answer a …

TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models

Language Models (LMs) become outdated as the world changes; they often fail to perform tasks requiring recent factual information which …

Towards Continual Knowledge Learning of Language Models

Large Language Models (LMs) are known to encode world knowledge in their parameters as they pretrain on a vast web corpus, …

Spatial Dependency Parsing for Semi-Structured Document Information Extraction

Information Extraction (IE) for document images is often approached as a BIO tagging problem, where the model sequentially goes through …

Designing a Minimal Retrieve-and-Read System for Open-Domain Question Answering

In open-domain question answering, the retrieve-and-read mechanism has the inherent benefits of interpretability and ease of adding, …

NeurIPS 2020 EfficientQA Competition: Systems, Analyses and Lessons Learned

We review the EfficientQA competition from NeurIPS 2020. The competition focused on open-domain question answering (QA), where systems …

Is Retriever Merely an Approximator of Reader?

The state of the art in open-domain question answering (QA) relies on an efficient retriever that drastically reduces the search space …

ClovaCall: Korean Goal-Oriented Dialog Speech Corpus for Automatic Speech Recognition of Contact Centers

Automatic speech recognition (ASR) via call is essential for various applications, including AI for contact center (AICC) services. …

Efficient Dialogue State Tracking by Selectively Overwriting Memory

Recent works in dialogue state tracking (DST) focus on an open vocabulary-based setting to resolve scalability and generalization …

Large-Scale Answerer in Questioner's Mind for Visual Dialog Question Generation

Answerer in Questioner’s Mind (AQM) is an information-theoretic framework that has been recently proposed for task-oriented …