Audienti-Indexer
Audienti-Indexer is a gem that takes an HTML document and produces a "merged" version of it, in which both the HTML elements and the plain-text elements are available as scannable items.
It can take an HTML file and return cleaned, sanitized HTML. It can remove the boilerplate from the HTML and extract the sentences from it. You can also run queries against the document and get answers.
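To give a rough feel for the sentence-extraction step, here is a minimal sketch in plain Ruby. It does not use Audienti-Indexer and does not reflect the gem's actual API: the `HtmlSketch` module and its `plain_text` and `sentences` methods are hypothetical names invented for this illustration, and the regex-based tag stripping is a naive stand-in for a real HTML parser.

```ruby
require "cgi"

# Hypothetical helper for illustration only; NOT part of Audienti-Indexer.
module HtmlSketch
  # Reduce an HTML string to plain text: drop <script>/<style> blocks and
  # comments, replace every remaining tag with a space, decode entities,
  # and collapse runs of whitespace.
  def self.plain_text(html)
    text = html.gsub(%r{<(script|style)\b[^>]*>.*?</\1>}mi, " ")
    text = text.gsub(/<!--.*?-->/m, " ")
    text = text.gsub(/<[^>]+>/, " ")
    CGI.unescapeHTML(text).gsub(/\s+/, " ").strip
  end

  # Naive sentence split: break after ".", "!" or "?" followed by whitespace.
  # Abbreviations such as "e.g." would be split incorrectly.
  def self.sentences(html)
    plain_text(html).split(/(?<=[.!?])\s+/)
  end
end

html = "<html><head><style>p { color: red; }</style></head>" \
       "<body><p>Hello <b>big</b> world. Is this &amp; that fine?</p>" \
       "<script>alert('x');</script><p>Yes!</p></body></html>"

puts HtmlSketch.sentences(html)
```

A real implementation would typically rely on an HTML parser and a proper sentence tokenizer; the sketch only shows the shape of the problem the gem addresses.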