2018-01-09
As of 8 January 2018
Goals
Complete extraction of the Retriever and Enricher functions into their own discrete gems (and the Single Responsibility Principle components that this requires).
Retriever
Enricher
Get a V1 prototype of the React app working in the mention-pipeline context
Get the mention pipeline up and running, merging the front end and back end, so that we can configure and use the app in the mention context
Refine the front end and back end together to make the app easier to use.
Complete the data science part of the app, and make sure we have a good strategy for creating People and Companies from the profiles we find.
Implement the data science algorithm in the application so that we can produce People and Companies
Update the front end to configure and use People and Companies in automation.
Introduce mention analytics into the system.
DONE: Major milestone. Replacement of the app with a new, React-based version.
Status
Substantial progress has been made on the front-end application in React. We have hired two developers who are actively working on this part of the application. In doing so, we've implemented a number of additional back-end/API features to aid front-end development, such as Swagger documentation.
The other major job we've been working on for the last two months is extracting the "Retriever" and "Enricher" parts of the application into separate gems. Long term, the idea is to put them behind an API; extracting them into gems is the interim step toward that.
And, as a part of this task, we've built a number of additional gems. They are:
Scour - An HTTP client that can use a Splash proxy to handle JavaScript, scripting, etc., while exposing a Ruby interface and Nokogiri documents
Haystack - Text and HTML parsing methods that return basic information about a page or text. Contains all our logic for producing "features" of a retrieved page, such as the author, tags, important phrases in the text, and sentiment.
Blackbook - Parsers that handle each unique page/service we encounter. Mostly this does profile parsing (like extracting information from a Twitter profile page), but it also has the parsers for Google and similar services.
Allusion - A URL parser. Given a URL, it attempts to extract information directly from the URL itself and gives indications of what the URL points to. Gems like Blackbook use this to figure out which parser to use (a hypothetical sketch of this hand-off follows the list).
Audienti-Indexer - Takes an HTML document and returns the plain text, sentences, etc. from it.
Audienti-Api - Lets us make API calls in a standard way, since many services are better "resolved" by hitting their APIs than by retrieving and parsing their pages.
Audienti-Retriever - The wrapper for the function we are extracting. It lets us call Audienti::Retriever.new(term: 'food', via: 'facebook', since: 1.day.ago).call and get back a list of mentions of the term "food" from the last day (a usage sketch follows this list).
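To make the Retriever interface concrete, here is a minimal usage sketch. Only the Audienti::Retriever.new(...).call signature comes from the description above; the require paths, the ActiveSupport dependency for 1.day.ago, and the shape of each returned mention are assumptions for illustration.

    # Minimal sketch of the extracted Retriever gem in use.
    # Assumption: the require paths and the shape of each mention are
    # illustrative; only new(...).call is taken from the description above.
    require 'active_support/all'   # provides the 1.day.ago duration helper
    require 'audienti/retriever'   # hypothetical require path

    # Fetch mentions of "food" seen via Facebook in the last day.
    mentions = Audienti::Retriever.new(
      term:  'food',
      via:   'facebook',
      since: 1.day.ago
    ).call

    mentions.each { |mention| puts mention.inspect }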
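And here is a hypothetical sketch of how the lower-level gems could compose into a single retrieve-and-parse flow. The gem names are real, but none of the method names (Allusion.parse, Scour.get, Blackbook.parser_for, Haystack.features) are confirmed APIs; they are assumptions used only to illustrate the division of responsibility described above.

    # Hypothetical composition of the gems described above.
    # Every method name below is an assumption, not a confirmed API.
    url = 'https://twitter.com/someprofile'

    hint     = Allusion.parse(url)                     # what does this URL point to?
    page     = Scour.get(url)                          # fetch via Splash; returns a Nokogiri doc
    profile  = Blackbook.parser_for(hint).parse(page)  # service-specific parsing
    features = Haystack.features(page.to_s)            # author, tags, key phrases, sentiment

    puts profile.inspect
    puts features.inspect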