The open-domain QA problem is defined as finding answers to questions in collections of unstructured documents. Many earlier models address open-domain QA with a Knowledge Base (KB), but KBs suffer from incompleteness and fixed schemas. This paper addresses these issues by answering from unstructured text instead. Moreover, recent deep learning architectures such as attention-based and memory-augmented neural networks give the authors hope of taking a fresh look at this problem.
In this paper the authors combine several techniques: inverted index lookup followed by term-vector scoring, multitask learning, distant supervision, bigram hashing, and TF-IDF matching, together with a multi-layer recurrent neural network.
I think one of the main contributions of this paper is combining these components to solve such a critical problem. Another contribution is that the authors combine several QA datasets via multitask learning, which yields a general system capable of answering different kinds of open-domain questions. The architecture could also be applied to other content-rich websites. Because the system uses only one source to answer questions, it cannot rely on information redundancy across sources, which forces the model to be very precise when answering.
Apart from this, the authors develop a text comprehension model that uses token features and an aligned question embedding, and they report that it outperforms previous models.
The system described in this paper has two components: (1) a Document Retriever that finds articles relevant to the question, and (2) a Document Reader that extracts the answer from the articles returned by the Document Retriever.
For the first component the model uses a non-machine-learning system that narrows the search space.
To narrow the search space it uses an inverted index lookup followed by term-vector scoring. It then uses bigram counts to take local word order into account. With this design the retriever performs better than previous approaches, and even better than the built-in ElasticSearch-based Wikipedia API.
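To make the retrieval step concrete, here is a minimal sketch of TF-IDF scoring over hashed unigrams and bigrams. It is not the authors' implementation: the bucket count, the use of `zlib.crc32` as the hash, the whitespace tokenizer, the smoothed IDF formula, and the function names are all my assumptions for illustration.

```python
import math
import zlib
from collections import Counter

NUM_BUCKETS = 2 ** 20  # assumed bucket count, chosen only for this sketch


def hashed_ngrams(text):
    """Return hashed bucket ids for the unigrams and bigrams of `text`."""
    tokens = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return [zlib.crc32(g.encode("utf-8")) % NUM_BUCKETS for g in tokens + bigrams]


def build_index(docs):
    """Count hashed n-grams per document and compute a smoothed IDF table."""
    counts = [Counter(hashed_ngrams(d)) for d in docs]
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())
    n = len(docs)
    # Smoothed IDF (an assumption of this sketch); it is always positive.
    idf = {b: math.log((n + 1) / (df + 1)) + 1.0 for b, df in doc_freq.items()}
    return counts, idf


def rank(question, counts, idf, k=1):
    """Return ids of the k documents with the highest TF-IDF dot product."""
    q = Counter(hashed_ngrams(question))
    scored = []
    for doc_id, c in enumerate(counts):
        # Only buckets shared by question and document contribute to the score.
        score = sum(
            math.log1p(q[b]) * math.log1p(c[b]) * idf[b] ** 2
            for b in q
            if b in c
        )
        scored.append((-score, doc_id))
    scored.sort()
    return [doc_id for _, doc_id in scored[:k]]
```

Hashing the bigrams into a fixed number of buckets keeps the feature space bounded, which is presumably where the memory efficiency mentioned later in this review comes from.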
The Document Reader described in this paper combines several techniques to achieve better performance. Word embeddings are used, and those of frequent question words such as what, how, and which matter especially for QA. Exact match uses binary features indicating whether a paragraph token exactly matches a question word in its original, lowercase, or lemma form. Token features describe a token in its context and include POS (part-of-speech) tags, NER (named entity recognition) tags, and TF (term frequency). Moreover, the model uses an aligned question embedding to compare similar but non-identical words.
Here the aligned question embedding and the exact match features perform very well and play similar but complementary roles. If both are removed from the Document Reader, its performance drops dramatically.
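The two features can be sketched as follows. This is only an illustration under my own assumptions: the default `lemmatize` function is a placeholder that lowercases (a real system would use an actual lemmatizer), the vectors are plain Python lists, and the attention score is a raw dot product with no learned projection applied beforehand.

```python
import math


def exact_match(token, question_tokens, lemmatize=lambda w: w.lower()):
    """Three binary flags: match in original, lowercase, and lemma form."""
    original = token in question_tokens
    lower = token.lower() in {q.lower() for q in question_tokens}
    lemma = lemmatize(token) in {lemmatize(q) for q in question_tokens}
    return [float(original), float(lower), float(lemma)]


def aligned_question_embedding(p_vec, q_vecs):
    """Soft-align one paragraph token to the question words.

    Returns the attention-weighted average of the question word vectors,
    where the weights are a softmax over dot products with `p_vec`.
    """
    scores = [sum(a * b for a, b in zip(p_vec, q)) for q in q_vecs]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(p_vec)
    return [sum(w * q[d] for w, q in zip(weights, q_vecs)) for d in range(dim)]
```

Exact match gives a hard 0/1 signal, while the aligned embedding gives a soft signal that still fires for related words such as "car" and "vehicle", which is why I read the two as complementary.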
The question is encoded by applying a recurrent neural network on top of the word embeddings of the question words. While encoding the question, they learn a weight vector that determines how much each question word contributes to the final question vector.
At the prediction level, they take the paragraph token vectors and the question vector as input and train two classifiers independently, one for each end of the answer span. More specifically, they use a bilinear term to capture the similarity between each paragraph token and the question.
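The question encoding and the prediction step can be sketched together. This is my own simplified reading, not the authors' code: the hidden states are given as plain lists rather than produced by an RNN, the matrices `w_start` and `w_end` stand in for learned parameters, and the span search with its `max_len` limit is an assumption I add to show how two independent classifiers can be combined into one answer.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    top = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def encode_question(hidden_states, w):
    """Weighted sum of question hidden states; weights are softmax(w . h_j)."""
    weights = softmax([dot(w, h) for h in hidden_states])
    dim = len(w)
    return [sum(b * h[d] for b, h in zip(weights, hidden_states)) for d in range(dim)]


def bilinear_probs(paragraph_vecs, matrix, q):
    """Probability per paragraph token from the bilinear score p_i^T W q."""
    wq = [dot(row, q) for row in matrix]
    return softmax([dot(p, wq) for p in paragraph_vecs])


def best_span(p_start, p_end, max_len=15):
    """Pick (i, j) with i <= j <= i + max_len maximizing p_start[i] * p_end[j]."""
    best, best_score = (0, 0), -1.0
    for i, ps in enumerate(p_start):
        for j in range(i, min(i + max_len + 1, len(p_end))):
            if ps * p_end[j] > best_score:
                best, best_score = (i, j), ps * p_end[j]
    return best
```

The start and end classifiers share the same inputs but have separate matrices, so each can be trained independently and they are only combined at inference time.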
The system uses quite standard benchmark datasets, and I think the data was sufficient to train and test the system. Evaluating the individual components as well as the full system across multiple benchmarks has shown the efficacy of the system.
I can now say that machine reading at scale is a key and challenging task for researchers to focus on, and that neither machine comprehension nor page ranking alone can solve this real-world problem.
An inelegant part of this paper is that it does not give any quantitative measurement of time or memory complexity. It only says in general terms that the system is memory efficient, without any mathematical or statistical support. This is one place where the paper cuts corners. Apart from this, the paper gives no idea of how to deploy the model in the real world or how to identify the impediments it would face.
I think the main purpose described in the abstract has been achieved. The main insight of this paper is that different deep learning systems can be combined to solve one problem, namely open-domain QA.