Backlink pipeline logic
Attributes
URL-specific attributes
title (String) title attribute
title_words (Array) tokenized title attribute
anchor_text (String) anchor text
anchor_words (Array) tokenized anchor text
alt (String) alt text
follow (boolean) whether the link is a follow link
image_link (boolean) whether the link is an image link
locations (Hash) The location where the link is found (in body, in navigation, ...)
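
How title_words and anchor_words are derived from title and anchor_text is not described here. A minimal sketch, assuming tokenization means lowercasing the text and splitting it on non-alphanumeric characters (the `tokenize` helper is hypothetical, not the pipeline's actual method):

```ruby
# Hypothetical helper: the real pipeline's tokenizer may differ.
# Lowercases the text and extracts runs of letters and digits.
def tokenize(text)
  text.to_s.downcase.scan(/[[:alnum:]]+/)
end

# Example link with made-up values
link = { "title" => "PBS Kids: Games & Videos", "anchor_text" => "Play now!" }
link["title_words"]  = tokenize(link["title"])
link["anchor_words"] = tokenize(link["anchor_text"])
```

A missing title or anchor text (nil) yields an empty array in this sketch, so the tokenized fields can always be treated as arrays.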
Connection attributes
source_url (String)
source_host (String)
destination_url (String)
destination_host (String)
code (integer)
destination_page_code (integer)
ip_address (String)
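
The host fields can presumably be derived from the corresponding URL fields. A minimal sketch using Ruby's standard `uri` library; the `host_of` helper and the example URLs are assumptions for illustration, not the pipeline's actual code:

```ruby
require "uri"

# Hypothetical helper: returns the host part of a URL,
# or nil when the URL cannot be parsed.
def host_of(url)
  URI.parse(url).host
rescue URI::InvalidURIError
  nil
end

# Example connection with made-up URLs
connection = {
  "source_url"      => "http://example.com/blog/post",
  "destination_url" => "http://pbs.org/kids/games"
}
connection["source_host"]      = host_of(connection["source_url"])
connection["destination_host"] = host_of(connection["destination_url"])
```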
SEO analysis of the backlink
match_type (String) (no_match, host_match, path_match)
host_match_link_count (integer) Number of links that point to the configured host but not to its path (relevant when your money domain includes a path, e.g. pbs.org/kids/)
path_match_link_count (integer) Number of links that point to the full path of the configured domain (e.g. pbs.org/kids)
backlinks_count (integer) Number of backlinks to the source_url, retrieved from an API. Needed for our own market_rank calculation and for the link juice/value calculations.
links_count (integer) Number of links on the source page
juice (float) How much SEO-value this link represents
link_value (float) not sure what the difference is with juice, might be redundant
market_rank (integer) OMA custom calculation of PageRank
page_pr (integer) PageRank of source_url
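
The three match_type values can be illustrated with a small classifier. This is a sketch under stated assumptions, not the pipeline's implementation: the `match_type` method, the "host/path" format of the configured domain, and the rule that a configured domain without a path always yields path_match are all hypothetical.

```ruby
require "uri"

# Hypothetical classifier. configured_domain is assumed to be given as
# "host/path" without a scheme, e.g. "pbs.org/kids/".
def match_type(destination_url, configured_domain)
  host, path = configured_domain.split("/", 2)
  wanted_path = "/#{path}".chomp("/") # "" when no path is configured
  uri = URI.parse(destination_url)
  return "no_match" unless uri.host.to_s.downcase == host.downcase

  # Assumption: without a configured path, every link to the host is a path_match
  return "path_match" if wanted_path.empty?

  on_path = uri.path == wanted_path || uri.path.start_with?("#{wanted_path}/")
  on_path ? "path_match" : "host_match"
rescue URI::InvalidURIError
  "no_match"
end

# Made-up destination URLs, classified against pbs.org/kids/
urls  = ["http://pbs.org/kids/games", "http://pbs.org/about", "http://example.com/"]
types = urls.map { |url| match_type(url, "pbs.org/kids/") }

# Assumption: the *_link_count fields are tallies of these classifications
host_match_link_count = types.count("host_match")
path_match_link_count = types.count("path_match")
```

The path comparison requires either an exact match or a following slash, so a URL such as pbs.org/kidsfoo counts as a host_match and not as a path_match.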
Attributes of the source page
content_type (String) content type of the page
page_title (String) page title of the page
page_title_array (Array) tokenized page title
Metadata
domain_id (String)
created_at (date)
updated_at (date)
id (mongodb id)
tags (Array) contains the kind and match_type
data_source (String) Source of the data (SEOmoz or Ahrefs); not used in the frontend
kind (String) This contains the classification (missing, momentum, relevance, authority, strategic)
link_status (String) missing or live
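
Since tags holds the kind and the match_type, it could be built from those two fields. A minimal sketch, with the allowed values taken from the lists above; the constant names and the `build_tags` helper are hypothetical:

```ruby
# Allowed values as listed in this document; the constant names are made up.
KINDS         = %w[missing momentum relevance authority strategic].freeze
LINK_STATUSES = %w[missing live].freeze

# Hypothetical helper: builds the tags array from kind and match_type,
# dropping any value that is not set.
def build_tags(backlink)
  [backlink["kind"], backlink["match_type"]].compact
end

# Example record with made-up values
backlink = { "kind" => "authority", "match_type" => "path_match", "link_status" => "live" }
backlink["tags"] = build_tags(backlink)

valid = KINDS.include?(backlink["kind"]) &&
        LINK_STATUSES.include?(backlink["link_status"])
```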
Useless data
status (String) Status is always "Active"
page_digest (String) Who cares about the page digest really?
data (Hash)
destination_digest (String) Hash of destination, no value afaik
source_digest (String) Hash of source, no value afaik
history (Array) Seems like there's a history record on some, this needs to be solved differently imo.
domain_pr (integer) This is a bogus calculation that isn't grounded in SEO; the method that calculates it even contained a comment saying so.