
Setting up OMA on local

(created by Nicholas; last updated by William, April 2014)

Required infrastructure

  • Elasticsearch (0.9 branch)

  • Redis

  • MongoDB

  • PostgreSQL
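
Once those are installed, a quick way to confirm each service is actually up (a suggestion only, assuming default ports and the stock client tools):

curl -s localhost:9200         # Elasticsearch: returns a JSON status blob
redis-cli ping                 # Redis: should answer PONG
mongo --eval 'db.version()'    # MongoDB: prints the server version
psql -l                        # PostgreSQL: lists databases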

Setting up local

There is a PostgreSQL backup on the team's Dropbox.

Create a local development database:

 createdb oma_dev

Load the backup data:

gunzip -c marketfu_production.sql.gz  | psql oma_dev
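
To confirm the import worked, list the tables in the new database:

psql oma_dev -c '\dt'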

Make sure you have a config/database.yml which sets up the PG database. Example:

development:
  username: postgre_username
  database: oma_dev
  adapter: postgresql
test:
  username: postgre_username
  database: oma_test
  adapter: postgresql
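
Note that the test stanza references an oma_test database the earlier createdb didn't create; if you plan to run the test suite, create it the same way:

createdb oma_test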

Now copy config/application.yml.example to application.yml and make sure that the names of the development and test Postgres databases match those in database.yml. At this point you should be able to start the application:

bundle exec thin start -R oma.ru -p 3000 -e development

Viewing the landing page

After running the app you'll notice that http://localhost:3000 gets you redirected to http://getoma.com and you're not viewing the development app anymore. The problem is that the landing page expects a subdomain tied to a company. First you need to make sure that domains like omadev.localhost.dev (this is an example) also point to localhost.

Ubuntu 12.04: you can achieve this by editing the /etc/hosts file:

127.0.0.1       localhost
127.0.0.1       localhost.dev
127.0.0.1       omadev.localhost.dev

Note: this works only for explicitly declared domains. If you want a generic wildcard solution (*.localhost.dev), consider using dnsmasq.
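
For reference, the wildcard is a one-line rule in dnsmasq's config (e.g. /etc/dnsmasq.conf); it maps localhost.dev and every subdomain of it to 127.0.0.1:

address=/localhost.dev/127.0.0.1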

Now you should "Sign Up" the omadev company and expect its landing page to be available at http://omadev.localhost.dev:3000. The signup form is here: http://localhost.dev:3000/company_signup.

In the process you'll receive an activation email with an incorrect link. Replace the production domain (omadev.omaengine.com) with the local one (omadev.localhost.dev:3000) and paste it into the browser to activate the newly created account. This isn't right and should be fixed.
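
The rewrite is mechanical, so a one-liner does it (LINK is a hypothetical shell variable holding the URL from the email):

echo "$LINK" | sed 's/omadev\.omaengine\.com/omadev.localhost.dev:3000/'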

Running the backend processors

Crawler

The crawler is run with rake tasks. The entire crawler consists of several running processors:

  1. crawler

  2. link

  3. hydra

  4. attribute

  5. issues

  6. writer

In addition there's a token_tap task, which provides rate limiting, and there are commands to turn the crawler on and off.

The token tap needs to be running at all times:

bundle exec rake opportunity_pipeline:token_tap
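
Since it has to stay up, one convenient option (a suggestion only; any terminal multiplexer works) is to park it in a tmux session:

tmux new -s token_tap 'bundle exec rake opportunity_pipeline:token_tap'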

Queue a domain for crawling

rails console:

rc = RedisCrawler::Console.new
rc.queue domain_id

Once the crawler is running, the queued domain will be crawled immediately.

Run the crawler

Crawler and link processor

The link and crawler processors need to run at the same time. The crawler task fetches pages from the internet and stores them in Redis, while the link processor analyzes those pages for new links to crawl and forwards each crawled page to the hydra.

bundle exec rake redis_crawler:crawler
bundle exec rake redis_crawler:link

Hydra processor

The hydra processor checks links that are not part of the crawl domain for their status codes. When all the links are checked, it pushes the page to the attribute processor.

bundle exec rake redis_crawler:hydra

Attribute processor

The attribute processor analyzes the page and extracts attributes from it.

bundle exec rake redis_crawler:attribute

Issue processor

The issue processor analyzes the page for issues.

bundle exec rake redis_crawler:issue

Writer

This processor writes the page out to MongoDB, Elasticsearch, and S3.

bundle exec rake redis_crawler:writer
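
All six processors follow the same invocation pattern, so a throwaway helper can bring the whole set up at once. A sketch, assuming you're in the app root and want each processor's output in log/; in practice each usually gets its own terminal:

# start every crawler processor in the background
for task in crawler link hydra attribute issue writer; do
  bundle exec rake redis_crawler:$task > log/$task.log 2>&1 &
done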

Opportunities

The opportunity pipeline is another important part of our infrastructure; it pulls in data from social media and third-party resources.

Sources:

  1. Facebook

  2. Twitter

  3. Bing News

  4. forums

  5. SERPs

  6. profile providers for enrichment, e.g. the FullContact API

Initiating opps retrieval for a project

rails console:

> include OpportunityPipeline::Console
> queue_project_keywords_for(project.id)
> queue_states
2013-03-13 15:56:51 UTC
Enrichment queue: 0
Facebook queue: 9
Twitter queue: 9
News queue: 9
Forum queue: 9
Serps queue: 9
Twitter Write Count: 0

Retrieving news opps

News opportunities are retrieved by running a news retriever and a news processor:

bundle exec rake opportunity_pipeline:news_retriever
bundle exec rake opportunity_pipeline:news

Retrieving Facebook opps

Facebook opportunities are retrieved by running a Facebook retriever and a Facebook processor:

bundle exec rake opportunity_pipeline:facebook_retriever
bundle exec rake opportunity_pipeline:facebook

Retrieving Twitter opps

Twitter opportunities are retrieved by running a Twitter retriever and a Twitter processor:

bundle exec rake opportunity_pipeline:twitter_retriever
bundle exec rake opportunity_pipeline:twitter

Retrieving forum opps

Forum opportunities are keyword mentions on forums. They are retrieved by running a forum retriever and a forum processor:

bundle exec rake opportunity_pipeline:forum_retriever
bundle exec rake opportunity_pipeline:forum

Retrieving SERPs opps

SERPs are loaded from the serps table and then treated as a source for opportunities: each generates a mention and a potential lead, which are loaded up for enrichment (the task names below are assumed by analogy with the other retrievers):

bundle exec rake opportunity_pipeline:serps_retriever
bundle exec rake opportunity_pipeline:serps

Enrichment

During opportunity retrieval we've identified entities: potentially contactable items such as websites or social media users. The enrichment phase consists of digging in, trying to find out more about them, and hopefully isolating contact details.

bundle exec rake opportunity_pipeline:enrichment

Some more processing

The mentions and leads section doesn't work yet, as we still need to write to Elasticsearch and do some Twitter processing. Run these Resque workers:

resqueworker: bundle exec rake resque:work QUEUE=twitter_processor
mention_resqueworker: bundle exec rake resque:work QUEUE=opp_mention_writer_queue
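
The label: command form above suggests these are Procfile entries. If you'd rather drain both queues with a single worker, Resque's rake task also accepts a comma-separated QUEUES variable:

bundle exec rake resque:work QUEUES=twitter_processor,opp_mention_writer_queue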