Displaying posts categorized under

Nokogiri

From http://nokogiri.org/ :

Nokogiri (鋸) is an HTML, XML, SAX, and Reader parser. Among Nokogiri’s many features is the ability to search documents via XPath or CSS3 selectors.

XML is like violence – if it doesn’t solve your problems, you are not using enough of it.

How to extract plain text from HTML with Nokogiri

While working on an upcoming blog post, I ran into the problem of extracting some plain text from HTML. If I were interested in the whole plain text, I could just run html2text in bash and feed it the HTML, but what I needed was only the part of the plain text between two specific comments. [...]