Coursera Capstone Project

Next Word Prediction using Katz Backoff Model - Part 3: Prediction Model Implementation

Executive Summary The Capstone Project of the Johns Hopkins Data Science Specialization is to build an NLP application that predicts the next word of a user's text input. In Part 1, we analysed the training dataset and identified characteristics that can be exploited in the implementation. In Part 2, we discussed the Good-Turing smoothing estimate and the Katz backoff model that power our text prediction application.
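
For reference, the trigram form of the Katz backoff estimate discussed in Part 2 can be written as below. This is the standard textbook formulation using Good-Turing discounted probabilities; the notation is illustrative and may differ from the symbols used in Part 2.

$$
P_{katz}(w_i \mid w_{i-2}\, w_{i-1}) =
\begin{cases}
P^{*}(w_i \mid w_{i-2}\, w_{i-1}) & \text{if } C(w_{i-2}\, w_{i-1}\, w_i) > 0 \\
\alpha(w_{i-2}\, w_{i-1})\, P_{katz}(w_i \mid w_{i-1}) & \text{otherwise}
\end{cases}
$$

where $P^{*}$ is the Good-Turing discounted probability estimate and $\alpha(w_{i-2}\, w_{i-1})$ is the backoff weight that redistributes the discounted probability mass to trigrams unseen in the training corpus.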

Next Word Prediction using Katz Backoff Model - Part 2: N-gram model, Katz Backoff, and Good-Turing Discounting

Executive Summary The Capstone Project of the Johns Hopkins Data Science Specialization is to build an NLP application that predicts the next word of a user's text input. In Part 1, we analysed the data and found that many uncommon words and word combinations (2- and 3-grams) can be removed from the corpora to reduce memory usage and speed up the model building time.
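
As an illustration of that pruning step, the minimal sketch below drops n-grams whose counts fall under a threshold before the model is built. It is written in Python for brevity (the project itself may use another language), and the names `prune_ngrams` and `min_count` are illustrative, not taken from the project code.

```python
from collections import Counter

def prune_ngrams(ngram_counts: Counter, min_count: int = 2) -> Counter:
    """Keep only n-grams observed at least min_count times to shrink the model."""
    return Counter({ng: c for ng, c in ngram_counts.items() if c >= min_count})

# Example: frequent bigrams survive, one-off combinations are discarded
bigram_counts = Counter({("of", "the"): 120, ("in", "the"): 95, ("purple", "zebra"): 1})
pruned = prune_ngrams(bigram_counts, min_count=2)
print(pruned)  # Counter({('of', 'the'): 120, ('in', 'the'): 95})
```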

Next Word Prediction using Katz Backoff Model - Part 1: The Data Analysis

Executive Summary The Capstone Project of the Data Science Specialization offered on Coursera by Johns Hopkins University is to build an NLP application that predicts the next word of a user's text input. This report will discuss the nature of the project and its data, the model and algorithm powering the application, and the implementation of the application. Part 1 will focus on the analysis of the datasets provided, which will guide the direction of the implementation of the actual text prediction program.