This post is extended from a Natural Language Processing course project I did with two teammates. The task was to generate questions from a Wikipedia article and to answer questions about it. We got the 2nd highest grade, and the instructor asked me to be a teaching assistant for the course (but I graduated that semester).
In a nutshell, for the question-asking part we applied the Berkeley Neural Parser and built basic parsing rules for both binary and Wh-questions. For the question-answering part we used Sentence-BERT, which trains sentence embeddings with a siamese network, found candidate answers by cosine similarity, and handled different types of question variations.
Task Setup and Our Results
Inputs of the question-asking task are a raw Wikipedia article (no preprocessing) and the number of questions required, n. We need to generate different types of questions. Our system makes extensive use of part-of-speech tags and generates different question types (Do/Be/What/Who/When/Where/Which/Why/How many/..), with some parts of a question substituted with synonyms or deliberately incorrect information. For example, these are the questions our system generated from the Wikipedia entry about kangaroos.
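Under the hood, each rule pattern-matches on the constituency parse of a sentence. Below is a minimal sketch of one binary-question rule, assuming spaCy with the benepar plugin and its public `benepar_en3` model; the function name and the copular-verb check are illustrative, and our full rule set covered far more patterns than this.

```python
# A toy binary-question rule on top of a benepar constituency parse.
# Assumes: spaCy "en_core_web_sm" and benepar "benepar_en3" are installed.
import benepar
import spacy

# benepar.download("benepar_en3")  # first-time model download
nlp = spacy.load("en_core_web_sm")
nlp.add_pipe("benepar", config={"model": "benepar_en3"})

def binary_question(sentence: str):
    """Turn 'The kangaroo is a marsupial.' into 'Is the kangaroo a marsupial?'"""
    sent = list(nlp(sentence).sents)[0]
    children = list(sent._.children)
    # Only handle the simple top-level (NP)(VP) pattern.
    if len(children) < 2 or "NP" not in children[0]._.labels or "VP" not in children[1]._.labels:
        return None
    np_span, vp_span = children[0], children[1]
    aux = vp_span[0]
    # Only copular verbs are fronted here; other verbs would need do-support.
    if aux.text.lower() not in {"is", "are", "was", "were"}:
        return None
    # Lowercase the subject unless it starts with a proper noun.
    subject = np_span.text if np_span[0].pos_ == "PROPN" else np_span.text[0].lower() + np_span.text[1:]
    rest = vp_span[1:].text.rstrip(".")
    return f"{aux.text.capitalize()} {subject} {rest}?"

print(binary_question("The kangaroo is a marsupial from the family Macropodidae."))
# -> Is the kangaroo a marsupial from the family Macropodidae?
```

Wh-questions follow the same general recipe: find the constituent that would answer the question (an NP for who/what, a PP for where/when), replace it with the matching wh-word, and front the auxiliary.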
Inputs of the question-answering task are a Wikipedia article and a set of questions. We need to answer each question correctly and as concisely as possible. Below are our answers about kangaroos; unfortunately, some of them are incorrect.
Sentence-BERT (SBERT) Overview
SBERT is a pre-trained sentence embedding model that builds on the BERT architecture and uses a siamese network to generate high-quality sentence embeddings. The siamese network is composed of two identical subnetworks, each taking a sentence as input and producing a fixed-length vector representation. Because the subnetworks share weights and architecture, they learn to encode sentences into vectors that capture semantic meaning.
During training, SBERT is fed pairs of sentences and learns to predict whether the two sentences have the same meaning. The network then adjusts its weights to minimize the contrastive loss function, which measures the difference between the predicted similarity scores and the true labels.
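To make this concrete, here is a hedged sketch of pair-based training with a contrastive objective, using the classic `model.fit` API of the sentence-transformers library. The base checkpoint, the toy sentence pairs, and the hyperparameters are assumptions for illustration, not the configuration used to train the released SBERT models.

```python
# Sketch: fine-tune a sentence embedding model on labeled sentence pairs
# (label 1 = same meaning, 0 = different meaning) with a contrastive loss.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("all-MiniLM-L6-v2")

train_examples = [
    InputExample(texts=["Kangaroos are marsupials.",
                        "The kangaroo is a marsupial."], label=1),
    InputExample(texts=["Kangaroos are marsupials.",
                        "Sydney is Australia's largest city."], label=0),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)
train_loss = losses.ContrastiveLoss(model)

# One short pass is enough to illustrate the training loop.
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=0)
```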
Once trained, SBERT can encode any given sentence into a dense, fixed-length vector representation that captures its semantic meaning. These sentence embeddings can be used for a range of NLP tasks, such as text classification, semantic similarity analysis, and information retrieval.
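The information-retrieval use is essentially how our question-answering module found candidate answers: encode every article sentence, encode the question, and return the sentence with the highest cosine similarity. The sketch below assumes the public `all-MiniLM-L6-v2` checkpoint and a hypothetical `answer` helper; the exact checkpoint and post-processing we used in the project differed.

```python
# Sketch: retrieve the article sentence most similar to a question.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

def answer(question: str, article_sentences: list[str]) -> str:
    """Return the article sentence closest to the question in embedding space."""
    sent_emb = model.encode(article_sentences, convert_to_tensor=True)
    q_emb = model.encode(question, convert_to_tensor=True)
    scores = util.cos_sim(q_emb, sent_emb)[0]   # shape: (num_sentences,)
    return article_sentences[int(scores.argmax())]

sentences = [
    "Kangaroos are marsupials from the family Macropodidae.",
    "Kangaroos are endemic to Australia and New Guinea.",
]
print(answer("Where do kangaroos live?", sentences))
```

In practice the retrieved sentence is only a candidate; trimming it down to a concise answer (and handling question variations) is a separate step.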
One key benefit of SBERT is its ability to handle sentence-level similarity. By encoding similar sentences into similar vector representations, SBERT can be used for tasks such as finding the most similar pair of sentences or determining whether two sentences have the same meaning. Additionally, SBERT can be fine-tuned on a specific task using only a small amount of task-specific data, i.e., few-shot learning.
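For instance, finding the most similar pair among a handful of sentences reduces to an argmax over the pairwise cosine-similarity matrix. The checkpoint and sentences below are again only illustrative.

```python
# Sketch: pick the most similar pair of sentences by pairwise cosine similarity.
from itertools import combinations
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
sentences = [
    "Kangaroos are marsupials from the family Macropodidae.",
    "The kangaroo is a marsupial belonging to Macropodidae.",
    "Sydney is the largest city in Australia.",
]
emb = model.encode(sentences, convert_to_tensor=True)
scores = util.cos_sim(emb, emb)
best = max(combinations(range(len(sentences)), 2),
           key=lambda p: scores[p[0], p[1]].item())
print(best)  # expected: the indices of the two kangaroo sentences
```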