Over the last few decades it has turned out that it is often much easier to tell a computer how to learn to do a specific task than to tell it exactly how to do it. The generic term for this approach is machine learning, which Wikipedia summarizes as:
Machine learning, a branch of artificial intelligence, is a scientific discipline concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data, such as from sensor data or databases.
In this blog post I will introduce a simple theory-meets-practice example showing how to classify documents using a Naive Bayes Classifier together with a supervised learning strategy.
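To give a first feel for the idea, here is a minimal sketch of a multinomial Naive Bayes classifier trained on a handful of labeled documents. It is an illustration under assumptions, not necessarily the implementation developed in this post: the function names `train` and `classify`, the whitespace tokenization, the add-one (Laplace) smoothing, and the tiny spam/ham training set are all made up for this example.

```python
import math
from collections import Counter, defaultdict


def train(labeled_docs):
    """Count documents per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_docs:
        words = text.lower().split()  # assumption: naive whitespace tokenization
        label_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return label_counts, word_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log posterior for the given text."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, doc_count in label_counts.items():
        # Log prior: how common this label was in the training data.
        score = math.log(doc_count / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # words never seen in training are ignored
            # Log likelihood with add-one smoothing to avoid log(0).
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data, invented for this sketch.
training_data = [
    ("cheap pills buy now", "spam"),
    ("buy cheap watches now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]
model = train(training_data)
```

Calling `classify("buy cheap pills", model)` picks the label whose training documents make those words most likely. Working in log space keeps the product of many small probabilities from underflowing.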
While working on an upcoming blog post, I ran into the problem of extracting plain text from HTML. If I had been interested in the whole plain text, I could have just run html2text in bash and fed it the HTML, but what I needed was only a specific part of the plain text, between two particular comments. As it was hard to google a simple solution for this, I decided to share mine.
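For comparison, one way to approach the same task is sketched below in Python, using only the standard library's `html.parser`. This is not necessarily the solution shared in this post; the class name `BetweenCommentsExtractor`, the helper `text_between_comments`, and the marker names `content-start` and `content-end` are assumptions chosen for illustration.

```python
from html.parser import HTMLParser


class BetweenCommentsExtractor(HTMLParser):
    """Collect text that appears between two marker comments."""

    def __init__(self, start_marker, end_marker):
        super().__init__()
        self.start_marker = start_marker
        self.end_marker = end_marker
        self.inside = False
        self.chunks = []

    def handle_comment(self, data):
        # Comment text arrives without the surrounding <!-- and -->.
        marker = data.strip()
        if marker == self.start_marker:
            self.inside = True
        elif marker == self.end_marker:
            self.inside = False

    def handle_data(self, data):
        # Keep text nodes only while we are between the two markers.
        if self.inside:
            self.chunks.append(data)


def text_between_comments(html, start_marker, end_marker):
    parser = BetweenCommentsExtractor(start_marker, end_marker)
    parser.feed(html)
    parser.close()
    # Join the text nodes and collapse runs of whitespace.
    return " ".join("".join(parser.chunks).split())


# Hypothetical input, invented for this sketch.
sample = """<html><body>
<p>Navigation</p>
<!-- content-start -->
<h1>Title</h1>
<p>Some <b>important</b> text.</p>
<!-- content-end -->
<p>Footer</p>
</body></html>"""

print(text_between_comments(sample, "content-start", "content-end"))
```

The sketch drops all tags and keeps only the text nodes, so it does not reproduce the layout that html2text would generate (headings, lists, links).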
Yeah, well, the particular task is very simple once you know the solution. But when you are just getting started with ANTLR, it is not that obvious what you can do within a grammar and what you cannot. I looked around the web for a while, but it was difficult to find a solution, so I will share mine here.