How to apply Naive Bayes Classifiers to document classification problems.

Over the last few decades it has turned out that it is often much easier to tell a computer how to learn a specific task than to tell it exactly how to perform that task. The generic term for this approach is machine learning, which Wikipedia summarizes as:

Machine learning, a branch of artificial intelligence, is a scientific discipline concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data, such as from sensor data or databases.

In this blog post I will introduce a simple theory-meets-practice example showing how to classify documents using a Naive Bayes Classifier together with a supervised learning strategy.
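To give a rough idea of what such a classifier looks like, here is a minimal sketch in Python. It is not the implementation from the post: it assumes a multinomial Naive Bayes model with add-one (Laplace) smoothing and whitespace tokenization, and the function names (train, classify) and the tiny spam/ham training set are made up for illustration.

```python
import math
from collections import Counter, defaultdict


def train(labeled_docs):
    """Count documents per class and words per class from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_docs:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def classify(model, text):
    """Return the label with the highest log posterior for the given text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        # Log prior: fraction of training documents carrying this label.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training are ignored
            # Add-one smoothing keeps unseen (word, label) pairs above zero.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data, for illustration only.
model = train([
    ("cheap pills buy now", "spam"),
    ("buy cheap watches", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project meeting notes", "ham"),
])
print(classify(model, "buy cheap pills"))
print(classify(model, "monday meeting"))
```

The supervised part is the train step: the labels come from a human, and the classifier only estimates word frequencies per label. Working with log probabilities avoids numeric underflow when documents get long.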

How to embed actions within every match of a one-or-more (rule+) or zero-or-more (rule*) rule in ANTLR?

Yeah, well, this particular task is very simple once you know the solution. But when you are just getting started with ANTLR, it is not that obvious what you can and cannot do within a grammar. I looked around the web for a while and had a hard time finding a solution, so I will share it here.

