DEV Community

Luke Kline for Hightouch


Understanding Data Enrichment: A Comprehensive Guide | Hightouch

What is Data Enrichment?

Over the last five years, the technology landscape has changed dramatically. Today, nearly every organization collects both first-party and third-party data (e.g., Clearbit) to power insights and make business decisions. However, one of the core challenges across all industries is data enrichment. Most companies can capture raw data and generate insights to make informed decisions, but few are able to turn those insights into action. This is why data enrichment is so important.

In its simplest form, data enrichment (or data enhancement) is the process of enhancing existing datasets with information from additional sources, whether product analytics, marketing analytics, sales analytics, billing analytics, etc. The goal is to pair this customer data together to enable cross-analysis and deeper insights. Emphasizing data enrichment processes improves data accuracy. This translates into more personalization, which in turn leads to a better customer experience.

A good way to view a data enrichment process is through the lens of an operational system. At a basic level, a CRM tool like Salesforce or HubSpot provides high-quality data on various properties like contacts, companies, deals, etc. Typically these properties have a set of sub-properties like company headquarters, first name, last name, email, phone number, deal stage, deal owner, etc. All of this contributes to the overall customer profile. The same analogy can be applied to pretty much any operational system. With data enrichment, an organization might add information from other sources to its CRM. One example is product data, since this is not something that CRMs innately capture.
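As a rough illustration of the idea above, the sketch below merges product data into CRM contact records by matching on email. The field names, sample records, and the `enrich_contacts` helper are all hypothetical; a real CRM would be updated through its own API rather than in-memory dictionaries.

```python
from typing import Any, Dict, List

# Hypothetical CRM contact records (illustrative data only).
crm_contacts = [
    {"email": "ana@example.com", "first_name": "Ana", "deal_stage": "demo"},
    {"email": "raj@example.com", "first_name": "Raj", "deal_stage": "trial"},
]

# Hypothetical product data from another source, keyed by email.
product_usage = {
    "ana@example.com": {"signup_date": "2021-03-02", "weekly_sessions": 14},
}


def enrich_contacts(
    contacts: List[Dict[str, Any]],
    extra: Dict[str, Dict[str, Any]],
    key: str = "email",
) -> List[Dict[str, Any]]:
    """Return copies of the contacts with extra fields merged in by key."""
    enriched = []
    for contact in contacts:
        # Existing CRM fields win on conflict; unmatched contacts pass through.
        merged = {**extra.get(contact[key], {}), **contact}
        enriched.append(merged)
    return enriched


enriched = enrich_contacts(crm_contacts, product_usage)
```

Here the first contact gains `signup_date` and `weekly_sessions`, while the second is returned unchanged because no product record matches it. Letting CRM fields win on conflict is one possible design choice, not a rule.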

Image of Hubspot with Custom Fields

All of this information is extremely valuable and super helpful for outreach and customer personalization. Marketers use it on a daily basis to experiment and run campaigns, and salespeople leverage it to grow their sales pipeline. However, operational systems are only designed to answer simple questions, so it is often difficult to maintain a holistic view of the customer.

Why is Data Enrichment so Important?

To see why data enrichment is so important, it helps to first understand the pieces of a modern data stack. For most organizations, the end goal is to create an end-to-end flow between data collection, data integration, and data consumption. Data is generally collected through a variety of data sources (Google, Facebook, Salesforce, HubSpot, Marketo, Amplitude, Zendesk, Asana, etc.). This raw information is then ingested into a cloud data platform (e.g., Snowflake) using a data integration tool (e.g., Fivetran).

Once the data is in the warehouse, the next order of business is to transform and model it for analysis, in most cases using a tool like dbt. After that, the data is consumed through a dashboard using a tool like Power BI, Tableau, ThoughtSpot, Looker, etc. This information is then distributed to various stakeholders and business teams as needed. Reports and dashboards only provide business direction, though; they don’t make data actionable. Even worse, they only show a zoomed-out view of the data, which means it’s not associated with a specific prospect, user, or customer.

The problem is that all of this data is stuck in a dashboard and not actionable by anyone except data teams. Other teams like marketing, product, sales, etc. cannot access the detailed version of this information without going directly through an analyst or engineer, making it impossible to answer questions like: “Who is the most active user in an account?” or “How many active users does this account have?” or “Which contacts have downloaded X marketing resource?” or “What pricing plan should customer ABC be on based on their usage?”
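To make the first two questions concrete, here is a minimal sketch that answers them from user-level event rows. The event tuples and the `account_activity` function are assumptions for illustration; in practice this would more likely be a SQL model in the warehouse.

```python
from collections import Counter, defaultdict

# Hypothetical product events as (account, user, event_name) tuples.
events = [
    ("acme", "ana@acme.com", "report_created"),
    ("acme", "ana@acme.com", "dashboard_viewed"),
    ("acme", "bo@acme.com", "dashboard_viewed"),
    ("globex", "cy@globex.com", "report_created"),
]


def account_activity(rows):
    """Summarize active users and the most active user for each account."""
    per_account = defaultdict(Counter)
    for account, user, _event in rows:
        # Count one unit of activity per event, grouped by account and user.
        per_account[account][user] += 1
    return {
        account: {
            "active_users": len(counts),
            "most_active_user": counts.most_common(1)[0][0],
        }
        for account, counts in per_account.items()
    }


summary = account_activity(events)
```

The point is that these answers need row-level data tied to specific users and accounts, which an aggregated dashboard does not expose.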

This means that various teams (specifically marketing and sales) can’t answer all of the questions they have because the information doesn’t exist in their native tools. Every tool is limited to the data it captures. Data enrichment de-silos information across the entire organization and democratizes it for everyone. When more information is available, teams can ask and answer more questions than ever before. Best of all, it creates a single unified vision across the entire organization because every team has access to the same information. With data enrichment, businesses can gain a deeper understanding of their consumers.

Data Enrichment Examples

Data enrichment can solve an assortment of use cases for companies in every industry. Conventionally, a large focus has been placed on demographic data enrichment (information about the customer) and geographic data enrichment (information around the customer). However, businesses in every industry struggle with creating a 360-degree view of their consumers, so the emphasis should be placed on defining the common attributes of ideal customers, even ones as simple as income level, marital status, or physical address, to create a more valuable data set. With that in mind, there are four main data types that provide the most value when it comes to data enrichment.

Product data refers to all information about the customer that is captured directly through the product. Some examples could be:

  • Purchases
  • Number of users
  • Signup date
  • Product usage metrics
  • Use Case

Sales data refers to all of the information about the customer that is captured in the sales process and pipeline. Some examples could be:

  • Active deals
  • Companies in POC/trial
  • First meeting
  • Product demo date

Marketing data refers to all of the information that is captured in the customer journey. Some examples could be:

  • Web pages viewed
  • Resources downloaded
  • Links clicked
  • Session length

Billing data refers to all of the information that is captured throughout the payment process. Some examples could be:

  • Contract size
  • Annual recurring revenue (ARR)
  • Last payment date

When this type of information is made available outside of the native system that it was captured in and copied into other tools, it enables some really powerful actions. For instance, marketing teams can create email lists to target specific people with ads, campaigns, and offers if they are able to associate properties like pages viewed, resources downloaded, and links clicked with specific users (for example, promoting deals to customers who viewed the pricing page). Refining this information even further, marketing could use these signals to score leads based on intent. Likewise, when product data is added to a platform like HubSpot or Salesforce, it helps salespeople identify which leads to target, and it increases personalization from a marketing standpoint because both teams can see which users are most active in a given account. Additionally, when sales data is made available to marketers, targeting customers in active deal cycles becomes extremely easy. Lastly, when billing data is made available to other systems, customer support teams can trigger emails to remind customers about upcoming payments. Best of all, every scenario just listed can be fully automated.
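The intent-scoring idea can be sketched in a few lines. The weights, event names, and threshold below are assumptions chosen for illustration, not recommended values.

```python
# Assumed weights per marketing event; tune these to your own funnel.
INTENT_WEIGHTS = {
    "pricing_page_viewed": 5,
    "resource_downloaded": 3,
    "link_clicked": 1,
}

# Hypothetical marketing events already associated with specific users.
marketing_events = [
    {"email": "ana@example.com", "event": "pricing_page_viewed"},
    {"email": "ana@example.com", "event": "link_clicked"},
    {"email": "raj@example.com", "event": "resource_downloaded"},
    {"email": "lee@example.com", "event": "link_clicked"},
]


def score_leads(events, weights=INTENT_WEIGHTS):
    """Sum the weight of every known event for each lead."""
    scores = {}
    for item in events:
        # Unknown event types contribute nothing to the score.
        points = weights.get(item["event"], 0)
        scores[item["email"]] = scores.get(item["email"], 0) + points
    return scores


def build_target_list(scores, threshold):
    """Return the leads whose score meets the threshold, sorted by email."""
    return sorted(email for email, score in scores.items() if score >= threshold)


scores = score_leads(marketing_events)
target_list = build_target_list(scores, threshold=3)
```

With this sample data, the lead who only clicked a link falls below the threshold and is left off the list, while the two leads who viewed pricing or downloaded a resource are included.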

Data Enrichment Use Cases for Marketing and Sales

The core use cases for Data Enrichment are most often centered around marketing and sales teams. These teams are often looking for more detailed information on various leads and accounts as it relates to the customer journey as a whole.

Consider this scenario. Product-led growth companies like Slack and Grammarly give users the ability to sign up for a free version of their products. Both of these companies offer additional features in the premium version of their products. The typical adoption path begins with a single user and expands when additional team members see the value in the product. Once enough users are leveraging the tool, management will purchase an enterprise license to cover the entire organization. This is a fantastic go-to-market strategy because the product itself amplifies sales and marketing efforts to spread awareness and increase adoption. Obviously, this model only works with a strong product, hence the name “product-led growth.”

Converting free users to paid customers is a major challenge, so in most cases, the role of marketing and sales in PLG companies is to accelerate the adoption cycle. This means delivering highly personalized content, messages, and offers. When nearly all of the information about the customer is captured in-product, it is really difficult to leverage because it doesn’t exist in native business systems like Salesforce or HubSpot, where marketers and salespeople live on a daily basis.

What is Lead Enrichment?

Lead enrichment is all about tracking the activity of specific customers or prospects to enhance internal data. Ultimately, lead enrichment provides additional insight into existing leads by adding additional information from other sources. Typically this is done by enhancing lead information in an existing database or CRM (i.e. CRM data enrichment). This is especially necessary for companies with a PLG/self-serve signup process because all of the data about the customer is captured in-product and business teams typically don’t have access to tools like Amplitude, Heap, or Mixpanel which capture product analytics.

Knowing exactly where a prospect or customer is in their journey is absolutely crucial for sales and marketing teams because conversion rates increase when personalization increases. Being able to associate product usage, emails opened, integrations installed, links clicked, pages visited, resources downloaded, etc. with specific leads is priceless. Retool solved this personalization problem by leveraging Hightouch to sync their product data in real time back into their CRM. By syncing fresh product usage data to HubSpot and Salesforce, the marketing and SDR teams were able to launch personalized campaigns extremely quickly. This led to a 32% increase in response rate on personalized emails and a 500% increase in clicks and feature adoption.

What is Account Enrichment? (for Account-Based Marketing)

Account enrichment is nearly identical to lead enrichment with the only difference being that the focus is placed on accounts rather than individual leads and prospects. When product data is inaccessible to business teams it also poses a problem on an account level because individual account executives and marketers don’t know who to target.

Figuring out which inbound leads to focus on within a given account can be a huge pain and even more so when there is no data available to differentiate between them. Identifying which users are most active based on product usage helps to simplify this problem. Scoring leads based on product usage is a great way to solve this challenge and it is exactly what Zeplin did using Hightouch to sync product data back into their CRM.

Platforms for Data Enrichment

Since the end goal for data enrichment is to enhance existing data sets and the main use cases are focused on marketing and sales, it’s obvious that the optimal tools to perform data enrichment should be sales and marketing platforms. That is to say, it makes the most sense to perform data enrichment in platforms like Salesforce, Hubspot, Marketo, Acoustic, Pipedrive, etc.

Data Enrichment Tools and Services

Although many types of data enrichment companies exist, there are two main categories of data enrichment tools: iPaaS (Integration Platform as a Service) and Reverse ETL (Extract, Transform, Load).


iPaaS

Simply stated, iPaaS solutions move data between apps or external data sources with little or no transformation. They only give users the ability to send data from one source at a time, and the data is still raw. Likewise, the data can’t be combined to create a 360-degree view of the customer. They are strictly point-to-point solutions. This makes it impossible to send aggregated information like ARR (annual recurring revenue); something simple like individual purchases would have to be sent instead. iPaaS solutions like Tray and Workato are also largely based on event triggers. A trigger represents an event that takes place in an individual system.

That event is then transmitted to the integration platform through an API call or webhook, and the platform performs predefined actions set in place by the user. Because these solutions are often based on individual events or records, they often run into rate limit errors. Additionally, with iPaaS solutions, the user has to worry about painful “edge cases” like foreign keys, API limits, and the inevitable tree of if/else statements. One of the main draws of iPaaS solutions is that they provide an extremely simple UI that requires no technical knowledge. This means non-technical users can control their workflow automation needs. The downside is that things can get complicated very quickly. Likewise, it is important to note that data cannot be moved unless there is an event trigger, and this can cause serious problems.
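The trigger-and-action pattern described above can be sketched as a small dispatcher. This is an illustrative model only, not how Tray or Workato are implemented; the event types, payload fields, and handler names are all hypothetical.

```python
# Registry mapping an event type to the actions configured for it.
handlers = {}


def on(event_type):
    """Decorator that registers an action for a given event type."""
    def register(fn):
        handlers.setdefault(event_type, []).append(fn)
        return fn
    return register


@on("contact.created")
def add_to_newsletter(payload):
    return f"subscribe {payload['email']}"


@on("deal.closed")
def route_to_billing(payload):
    # The kind of if/else branching that tends to grow as edge cases appear.
    if payload.get("amount", 0) >= 10_000:
        return f"route deal {payload['deal_id']} to enterprise billing"
    return f"route deal {payload['deal_id']} to self-serve billing"


def handle_webhook(event):
    """Run every action registered for the event type.

    Nothing happens unless an event arrives, and each event carries
    one raw record from one source system.
    """
    return [fn(event["payload"]) for fn in handlers.get(event["type"], [])]


actions = handle_webhook(
    {"type": "deal.closed", "payload": {"deal_id": "D-1", "amount": 12_000}}
)
```

Because each call handles a single event, a burst of events turns into a burst of downstream API calls, which is one way the rate limit problems mentioned above arise.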

Reverse ETL

Unlike iPaaS solutions, Reverse ETL solutions like Hightouch integrate directly with the data warehouse, meaning that data can be synced to various operational systems so that it updates in real time. This is extremely useful because the data warehouse is typically the single source of truth for most organizations. With Reverse ETL, operational systems can show the same information that is stored in the warehouse - all in a matter of minutes.

Better yet, Reverse ETL solutions can leverage existing data models (e.g., lifetime value, propensity scores, customer health scores, ARR/MRR, funnel stages) that have been built on top of the data warehouse. Reverse ETL can sync all of the data, broken down by each user. It also automatically handles rate limits, retries, etc., and uses bulk APIs without requiring any user input. Hightouch lets users easily define data, map the appropriate fields, and send that information to the tool of their choosing.
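A minimal sketch of that sync loop, under stated assumptions, might look like the following. It is not Hightouch's implementation: `RateLimitError`, `FakeCRM`, and `bulk_upsert` are hypothetical stand-ins for a destination's bulk API, and the batch size and backoff values are arbitrary.

```python
import time


class RateLimitError(Exception):
    """Raised by the hypothetical destination when it is called too often."""


def diff_rows(warehouse_rows, last_synced):
    """Return only the rows that are new or changed since the previous sync."""
    return [row for row in warehouse_rows if last_synced.get(row["id"]) != row]


def sync(warehouse_rows, last_synced, bulk_upsert,
         batch_size=2, max_retries=3, base_delay=0.01):
    """Send changed rows in batches, retrying with backoff on rate limits."""
    changed = diff_rows(warehouse_rows, last_synced)
    sent = 0
    for start in range(0, len(changed), batch_size):
        batch = changed[start:start + batch_size]
        for attempt in range(max_retries + 1):
            try:
                bulk_upsert(batch)  # one bulk call instead of one call per row
                break
            except RateLimitError:
                if attempt == max_retries:
                    raise
                time.sleep(base_delay * 2 ** attempt)  # exponential backoff
        for row in batch:
            last_synced[row["id"]] = row  # remember what the destination has
        sent += len(batch)
    return sent


class FakeCRM:
    """Stand-in destination that rejects its first call to simulate a limit."""

    def __init__(self):
        self.calls = 0
        self.records = {}

    def bulk_upsert(self, batch):
        self.calls += 1
        if self.calls == 1:
            raise RateLimitError("slow down")
        for row in batch:
            self.records[row["id"]] = row


# Hypothetical modeled rows from the warehouse, including a computed ARR.
rows = [
    {"id": 1, "email": "ana@example.com", "arr": 12000},
    {"id": 2, "email": "raj@example.com", "arr": 4800},
    {"id": 3, "email": "lee@example.com", "arr": 0},
]
state = {1: {"id": 1, "email": "ana@example.com", "arr": 12000}}
crm = FakeCRM()
synced = sync(rows, state, crm.bulk_upsert)
```

In this run only the two rows missing from `state` are sent, in a single batch that succeeds on the second attempt. Calling `sync` again with the same rows sends nothing, since the state now matches the warehouse.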

The use case should always be at the forefront of any decision when considering the adoption of a tool. For more information on iPaaS and Reverse ETL, check out this Guide to Data Integration.

The Benefits of Data Enrichment

Nearly every company captures the data it needs to make business decisions. However, very few leverage that data to turn insights into actions because it is kept in a dashboard or report. Even worse, the data is often siloed in a way that makes it inaccessible to non-technical team members, and these are the exact teams that need this data to drive their day-to-day decisions. This is why data enrichment is valuable. When data is accessible by everyone, it’s actionable by everyone, and this means that different teams will be working towards the same goals because they all have the same view of the customer. Every company is different, but data should always be a valuable asset. With that in mind, the importance of Reverse ETL and Operational Analytics cannot be overstated.

Top comments (1)

Olga • Edited

Data enrichment involves enhancing existing datasets with information from additional sources, such as product, marketing, sales, and billing analytics. This process aims to improve data accuracy and enable cross-analysis for deeper insights, leading to better customer experiences through personalization. Data enrichment addresses the challenge of turning data insights into actionable outcomes. By doing CRM enrichment, organizations can break down data silos, democratize access to information across the entire organization, and empower various teams to ask and answer more questions, leading to a unified understanding of consumers.