Intro to Predictive Coding: Overview & Interpretation of Terminology June 2014 | Page 11

8. Linguistic Analysis. Linguists examine responsive and non-responsive documents to derive classification rules that maximize the correct classification of documents.

9. Naïve Bayesian Classifier. A system that estimates, for each word in a new document, the probability that the word came from the word distribution derived from the responsive training documents or from the one derived from the non-responsive training documents. The system is naïve in the sense that it assumes that all words are independent of one another.

All of these approaches involve machine learning, except, typically, Linguistic Analysis (which may or may not include machine learning components). A computational process extracts pertinent information from example documents and builds a mathematical model that allows responsive and non-responsive documents to be distinguished from one another based on the text that they contain. The accuracy of these systems will depend on the specifics of the implementation and on the qu