How to apply Naive Bayes Classifiers to document classification problems.

Over the last few decades it has turned out that it is often much easier to tell a computer how to learn a specific task than to tell it exactly how to perform it. The umbrella term for this approach is Machine Learning, which Wikipedia summarizes as:

Machine learning, a branch of artificial intelligence, is a scientific discipline concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data, such as from sensor data or databases.

In this blog post I will introduce a simple theory-meets-practice example showing how to classify documents using a Naive Bayes Classifier together with a supervised learning strategy.
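
To give a rough idea of where this is heading, here is a minimal sketch of a multinomial Naive Bayes classifier in plain Python. It is only an illustration under a few assumptions of mine: documents are split on whitespace, add-one (Laplace) smoothing is used, words never seen during training are ignored, and the function names `train` and `classify` as well as the tiny spam/ham data set are made up for this example.

```python
import math
from collections import Counter, defaultdict


def train(labeled_docs):
    """Count documents per class and words per class from (text, label) pairs."""
    class_counts = Counter()            # number of training documents per label
    word_counts = defaultdict(Counter)  # word frequencies per label
    vocab = set()                       # all distinct words seen in training
    for text, label in labeled_docs:
        words = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab


def classify(model, text):
    """Return the label with the highest log posterior for the given text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        # Log prior: share of training documents carrying this label.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # assumption: skip words never seen in training
            # Log likelihood with add-one (Laplace) smoothing.
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set, invented for illustration only.
training_data = [
    ("cheap pills buy now", "spam"),
    ("win money now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project status and meeting notes", "ham"),
]
model = train(training_data)
print(classify(model, "buy cheap money"))
print(classify(model, "meeting notes monday"))
```

The "supervised" part is simply that every training document comes with a label. The "naive" part is that each word contributes to the score independently of all the others, which is why the per-word log probabilities can just be added up.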
