Publications

Do Large Language Models Perform Latent Multi-Hop Reasoning without Exploiting Shortcuts?

We evaluate how well Large Language Models (LLMs) latently recall and compose facts to answer multi-hop queries like “In the year …

How Do Large Language Models Acquire Factual Knowledge During Pretraining?

Despite the recent observation that large language models (LLMs) can store substantial factual knowledge, there is a limited …

Hopping Too Late: Exploring the Limitations of Large Language Models on Multi-Hop Queries

Large language models (LLMs) can solve complex multi-step problems, but little is known about how these computations are implemented …

Exploring the Practicality of Generative Retrieval on Dynamic Corpora

Benchmarking the performance of information retrieval (IR) methods is mostly conducted with a fixed set of documents (static corpora); …

Do Large Language Models Latently Perform Multi-Hop Reasoning?

We study whether Large Language Models (LLMs) latently perform multi-hop reasoning with complex prompts such as “The mother of …

Improving Probability-based Prompt Selection Through Unified Evaluation and Analysis

Previous works in prompt engineering for large language models have introduced different gradient-free probability-based prompt …

Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following

In this paper, we present our finding that prepending a Task-Agnostic Prefix Prompt (TAPP) to the input improves the …

Knowledge Unlearning for Mitigating Privacy Risks in Language Models

Pretrained Language Models (LMs) memorize a vast amount of knowledge during initial pretraining, including information that may violate …

Contextualized Generative Retrieval

The text retrieval task is mainly performed in two ways: the bi-encoder approach and the generative approach. The bi-encoder approach …

Generative Multi-hop Retrieval

Multi-hop retrieval is the task of retrieving a series of multiple documents that together provide sufficient evidence to answer a …

TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models

Language Models (LMs) become outdated as the world changes; they often fail to perform tasks requiring recent factual information which …

Towards Continual Knowledge Learning of Language Models

Large Language Models (LMs) are known to encode world knowledge in their parameters as they pretrain on a vast web corpus, …

Spatial Dependency Parsing for Semi-Structured Document Information Extraction

Information Extraction (IE) for document images is often approached as a BIO tagging problem, where the model sequentially goes through …

Designing a Minimal Retrieve-and-Read System for Open-Domain Question Answering

In open-domain question answering, the retrieve-and-read mechanism has the inherent benefits of interpretability and the ease of adding, …

NeurIPS 2020 EfficientQA Competition: Systems, Analyses and Lessons Learned

We review the EfficientQA competition from NeurIPS 2020. The competition focused on open-domain question answering (QA), where systems …

Is Retriever Merely an Approximator of Reader?

The state of the art in open-domain question answering (QA) relies on an efficient retriever that drastically reduces the search space …

ClovaCall: Korean Goal-Oriented Dialog Speech Corpus for Automatic Speech Recognition of Contact Centers

Automatic speech recognition (ASR) via call is essential for various applications, including AI for contact center (AICC) services. …

Efficient Dialogue State Tracking by Selectively Overwriting Memory

Recent works in dialogue state tracking (DST) focus on an open vocabulary-based setting to resolve scalability and generalization …

Large-Scale Answerer in Questioner's Mind for Visual Dialog Question Generation

Answerer in Questioner’s Mind (AQM) is an information-theoretic framework that has recently been proposed for task-oriented …