jsoup is a Java library for working with real-world HTML and XML. It offers an easy-to-use API for URL fetching, data parsing, extraction, and manipulation using DOM API methods, CSS, ...
It is a port of the Python library html5lib and passes all relevant html5lib tests. It is not tied to a specific DOM implementation. Requires SBCL or ECL, along with CL-PPCRE and FLEXI-STREAMS. Might work with CLISP ...