Over the past few years, the technology industry has exploded. It seems like there is a new record-breaking IPO every single year, and companies are not slowing down either. Corporate spending has kept increasing as businesses budget more and more for strategic initiatives and overall digital transformation. Gartner estimates that worldwide IT spending will reach $4.2 trillion in 2021, an 8.6% increase from 2020, with enterprise software accounting for about 14.2% of that total. Gone are the days of expensive hardware appliances. Today, nearly every company uses an assortment of SaaS (software as a service) technologies that have replaced conventional back-end systems. For instance, Sales teams typically use a customer relationship management (CRM) platform, Finance teams use enterprise resource planning (ERP) systems, Human Resources teams use human resources information systems (HRIS), and Support teams use customer support (CS) tools. Common examples include HubSpot, Salesforce, NetSuite, SAP, Workday, Gainsight, and Zendesk. These are just a few; the list could go on and on.
Simply stated, SaaS is a software licensing model where the software is hosted on the vendor’s external servers rather than on a company’s own internal servers. Typically, SaaS tools are built on top of the major cloud providers (e.g. AWS, Azure, GCP). In most cases, the software is provided through a subscription or license. Users typically access SaaS tools through a web browser, logging in with a username and password, rather than installing software on their computer. With SaaS tools, companies don’t have to worry about the infrastructure or maintenance needed to keep the service up and running. In most cases, SaaS tools also have a lower up-front cost, since most vendors offer consumption-based or flat-rate pricing models. Additionally, the largest SaaS companies are constantly innovating and rolling out new features and updates to improve their products and win more customers. SaaS tools have one huge problem, though. Since every SaaS app holds its own unique data set, every new SaaS tool a company adopts creates a data silo. This is because each SaaS tool is designed for a set purpose, and most don’t play nicely with other tools.
SaaS integrations have emerged as a way to combat the problem of data silos. Simply stated, a SaaS integration connects a SaaS-based application with either another cloud-based app or on-premise software through an application programming interface (API). Once this integration is complete, both applications can request and share data with each other easily. However, APIs only provide the blueprints for connecting applications together. To actually connect two different tools, a business has to build the integration itself, so building and maintaining data pipelines often becomes the most time-consuming part of a data engineering team’s work, even though it is not typically what those engineers were hired for. The work is especially difficult because the average mid-sized company often has hundreds of applications and SaaS tools. Additionally, since most SaaS companies are constantly rolling out updates and new features, these integrations and pipelines are extremely prone to failure. Even worse, the data that arrives is often still in its raw, untransformed state, so it is not even that usable.
In order to get around the problems mentioned above, companies have consistently leveraged an assortment of data integration solutions:
iPaaS (integration platform as a service) solutions move data directly between applications, doing little to no transformation on the data. Typically they offer a visual interface for building integrations. If an API endpoint is exposed by a SaaS vendor, then an iPaaS solution can push data to it or pull data from it. In general, iPaaS solutions perform actions when a trigger is met. Simply stated, when an event takes place in one system, that information is transmitted to another application via an API or webhook, and the receiving application then performs one or more predefined actions. Fundamentally, all iPaaS solutions work the same way, in that they send data from point A to point B or vice versa. Since these point-to-point connections are just large workflows, they can become even more challenging to maintain than a “homegrown integration” created by a data engineering team. Additionally, iPaaS solutions are only designed to handle simple objects, so companies cannot easily send more complex information like ARR (annual recurring revenue) or product usage data and combine it with additional information in a marketing platform like Marketo or HubSpot. At their core, iPaaS solutions are imperative: they have to be told exactly what to do and how to do it. Data integration should be declarative.
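The trigger-and-action pattern described above can be sketched in a few lines of Python. This is only an illustration: `WorkflowEngine`, `Workflow`, the `crm.lead_created` trigger name, and the in-memory `marketing_contacts` list are all hypothetical stand-ins, not the API of any real iPaaS product.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical, in-memory stand-ins: no real iPaaS product or vendor API is modeled.
Event = Dict[str, str]
Action = Callable[[Event], None]


@dataclass
class Workflow:
    """One point-to-point rule: when `trigger` fires, run `actions` in order."""
    trigger: str
    actions: List[Action] = field(default_factory=list)


class WorkflowEngine:
    def __init__(self) -> None:
        self._workflows: List[Workflow] = []

    def register(self, workflow: Workflow) -> None:
        self._workflows.append(workflow)

    def emit(self, trigger: str, event: Event) -> int:
        """Run every action bound to `trigger` and return how many actions ran."""
        ran = 0
        for workflow in self._workflows:
            if workflow.trigger == trigger:
                for action in workflow.actions:
                    action(event)
                    ran += 1
        return ran


# Destination "app": a plain list standing in for a marketing tool's contact list.
marketing_contacts: List[Event] = []


def add_contact(event: Event) -> None:
    # Imperative step: copy two simple fields as-is, with no transformation.
    marketing_contacts.append({"email": event["email"], "name": event["name"]})


engine = WorkflowEngine()
engine.register(Workflow(trigger="crm.lead_created", actions=[add_contact]))

# An event in the source system fires the trigger, which runs the action.
actions_run = engine.emit(
    "crm.lead_created", {"email": "ada@example.com", "name": "Ada"}
)
```

Every new pair of tools needs another workflow like this one, with its steps spelled out by hand, which is why a large set of point-to-point rules gets hard to maintain.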
In its simplest form, a customer data platform (CDP) collects and consolidates customer data from various sources into a single repository and then sends that information to various destinations. CDPs provide a lot more functionality than iPaaS solutions. They give marketers and growth teams the ability to consolidate data from various sources, create segments based on user behavior and traits, and then sync these segments directly back into their third-party tools to build customized experiences without having to rely on data engineering teams. CDPs were created solely for marketing purposes and exist to remove the friction that often exists between marketing and data teams. However, CDPs typically rely on extremely limited, predefined data models centered on users and accounts. In reality, every company has its own unique objects (e.g. subscriptions, carts, products, playlists, artists) that do not fit neatly into those models. Additionally, CDPs do not integrate well with other technologies. For example, when it comes to data transformation, organizations are limited to the native integrations and product features built into the CDP and cannot use other tools on top of it.
ETL (extract, transform, load) is a relatively old data integration process that dates back to the 1970s. An ETL tool extracts data from first-party databases and third-party sources. After this data is extracted, it is transformed to meet the needs of analysts and data scientists and then loaded directly into the data warehouse. Tools like Informatica have popularized this approach to data integration. The problem is that the data ends up stuck in the data warehouse. Cloud data warehouses are well suited to deriving insights and creating reports, but they are less practical when it comes to using that data to create meaningful campaigns and tailored customer experiences in the way that CDPs and iPaaS solutions are used.
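The extract, transform, load order can be made concrete with a small sketch. It assumes an in-memory SQLite database as a stand-in for the data warehouse, hard-coded rows in place of a real source, and a hypothetical `customers` table; ARR is computed here as monthly recurring revenue times 12 purely for illustration.

```python
import sqlite3
from typing import Dict, List, Tuple


def extract() -> List[Dict[str, str]]:
    # Hypothetical source rows, standing in for records pulled from a database or SaaS API.
    return [
        {"email": " Ada@Example.com ", "plan": "pro", "mrr": "100"},
        {"email": "bob@example.com", "plan": "free", "mrr": "0"},
    ]


def transform(rows: List[Dict[str, str]]) -> List[Tuple[str, str, int]]:
    # The "T" happens BEFORE loading: normalize emails, cast types, derive ARR.
    return [
        (row["email"].strip().lower(), row["plan"], int(row["mrr"]) * 12)
        for row in rows
    ]


def load(conn: sqlite3.Connection, rows: List[Tuple[str, str, int]]) -> None:
    # Only the already-transformed rows reach the warehouse.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(email TEXT PRIMARY KEY, plan TEXT, arr INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO customers VALUES (?, ?, ?)", rows)
    conn.commit()


# SQLite in memory plays the role of the data warehouse in this sketch.
warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract()))

loaded = warehouse.execute(
    "SELECT email, plan, arr FROM customers ORDER BY email"
).fetchall()
```

Note that the pipeline ends at the warehouse: nothing in it sends the cleaned rows back to the tools business teams use.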
All of the problems just described have given rise to a new line of thinking focused on ELT (extract, load, transform). This has largely been fueled by innovations in the cloud data warehousing space. Solutions like Snowflake and BigQuery have become extremely efficient and reliable for analytics purposes. ELT tools like Fivetran have made it really simple for businesses to move data from various sources to the data warehouse. As a native SaaS solution, Fivetran provides nearly 200 connectors for various data sources and SaaS applications that are designed to handle the “E” and “L” aspects of ELT, automating the data pipelining process for engineers. On the other hand, dbt (data build tool) has completely revolutionized the “T” in ELT by creating a tool that runs on top of the data warehouse to transform data with SQL. With dbt, companies can create reusable data models to orchestrate and transform their data. ELT should be thought of as the solution that empowers the data warehouse. ELT combined with the data warehouse has completely changed the data ecosystem by eliminating data silos. However, by eliminating data silos, the data warehouse has itself become a data silo. Data warehouses are useful for creating dashboards and reports, which are often powered by a business intelligence tool like Power BI, Looker, or Tableau. However, neither ELT nor data warehousing has addressed the problem of SaaS integrations, which is really about pushing data back into the tools that non-technical business users work in.
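To contrast with ETL, here is a minimal ELT sketch: raw rows are loaded first, and the transformation runs afterwards as SQL inside the warehouse. It assumes in-memory SQLite as the warehouse and hypothetical `raw_subscriptions` and `customer_arr` names. The view only imitates the idea of a SQL model running on top of the warehouse; it is not dbt's actual project format or API.

```python
import sqlite3

# SQLite in memory plays the role of the cloud data warehouse in this sketch.
warehouse = sqlite3.connect(":memory:")

# "E" and "L": land the source rows untouched in a raw table.
# In practice a connector tool would automate this step.
warehouse.execute("CREATE TABLE raw_subscriptions (email TEXT, plan TEXT, mrr TEXT)")
warehouse.executemany(
    "INSERT INTO raw_subscriptions VALUES (?, ?, ?)",
    [
        (" Ada@Example.com ", "pro", "100"),
        ("bob@example.com", "free", "0"),
        ("ada@example.com", "addon", "20"),
    ],
)

# "T": a reusable SQL model defined on top of the raw data, inside the warehouse.
# ARR is assumed to be total monthly recurring revenue times 12, for illustration only.
warehouse.execute(
    """
    CREATE VIEW customer_arr AS
    SELECT lower(trim(email)) AS email,
           SUM(CAST(mrr AS INTEGER)) * 12 AS arr
    FROM raw_subscriptions
    GROUP BY lower(trim(email))
    """
)

model_rows = warehouse.execute(
    "SELECT email, arr FROM customer_arr ORDER BY email"
).fetchall()
```

Because the raw table is kept, the model can be changed and re-run later without extracting the data again.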
This is the exact problem that Hightouch solves with Reverse ETL. “Reverse ETL is the process of copying data from a cloud data warehouse (i.e. Amazon Redshift, Google BigQuery, Snowflake, Azure Synapse, etc.) to operational systems of record, including but not limited to SaaS tools used for growth, marketing, sales, and support.” Whereas ETL and ELT read from the source and write to the warehouse, Reverse ETL reads from the warehouse and writes back to the source systems. The data warehouse already has the information from every data source across the organization, so it is only logical to standardize on it as the single source of truth. With Hightouch, companies can leverage their existing data models (churn rate, lifetime value, workspaces created, etc.) to sync that information directly back into a destination of their choosing in real time. These syncs can be run manually or scheduled at a set interval (every few minutes, hourly, or daily). They can also be scheduled using a custom recurrence or a cron expression. Syncs can even be set to run after a dbt job is complete. Better yet, Hightouch simply runs on top of the data warehouse and doesn’t actually store any data. With Hightouch, organizations can take full control of their data stack and eliminate the bottlenecks found in conventional data integration solutions. This removes the friction between data engineers and business teams: data engineers can finally focus on the jobs they were hired for, and business teams can access the data they need in their native tools. Democratizing the data in the warehouse creates a single source of truth across every operational system and SaaS application. Ultimately, Hightouch eliminates the need for SaaS integrations. Every data integration solution should be declarative to ensure there is alignment across teams and everyone is working towards the same goals.
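The Reverse ETL direction, and what “declarative” means in practice, can be sketched as follows. This is not Hightouch's implementation or configuration format: `SyncConfig`, `FakeCRM`, `run_sync`, and every field name are hypothetical, in-memory SQLite stands in for the warehouse, and the change detection (skipping rows whose mapped fields are unchanged since the last run) is an assumption made for the example. The cron expression is only stored, not evaluated.

```python
import sqlite3
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class SyncConfig:
    """Declarative description of a sync: WHAT to send, not HOW (hypothetical fields)."""
    model_sql: str            # query that defines the model in the warehouse
    primary_key: str          # column used to match records in the destination
    mapping: Dict[str, str]   # warehouse column -> destination field
    schedule: str             # cron expression; stored only, not evaluated here


class FakeCRM:
    """In-memory stand-in for a SaaS destination; not a real vendor API."""

    def __init__(self) -> None:
        self.records: Dict[str, Dict[str, object]] = {}
        self.writes = 0

    def upsert(self, key: str, fields: Dict[str, object]) -> None:
        self.records[key] = fields
        self.writes += 1


def run_sync(conn: sqlite3.Connection, config: SyncConfig,
             destination: FakeCRM, state: Dict[str, Dict[str, object]]) -> int:
    """Read the model from the warehouse and push only new or changed rows."""
    cursor = conn.execute(config.model_sql)
    columns = [col[0] for col in cursor.description]
    written = 0
    for values in cursor.fetchall():
        row = dict(zip(columns, values))
        key = str(row[config.primary_key])
        fields = {dest: row[src] for src, dest in config.mapping.items()}
        if state.get(key) != fields:  # assumed diffing: skip unchanged rows
            destination.upsert(key, fields)
            state[key] = fields
            written += 1
    return written


warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE customer_metrics (email TEXT PRIMARY KEY, lifetime_value INTEGER)"
)
warehouse.executemany(
    "INSERT INTO customer_metrics VALUES (?, ?)",
    [("ada@example.com", 1440), ("bob@example.com", 0)],
)

config = SyncConfig(
    model_sql="SELECT email, lifetime_value FROM customer_metrics",
    primary_key="email",
    mapping={"lifetime_value": "ltv"},
    schedule="0 * * * *",  # standard cron syntax for "at minute 0 of every hour"
)

crm = FakeCRM()
sync_state: Dict[str, Dict[str, object]] = {}
first_run = run_sync(warehouse, config, crm, sync_state)

# A value changes in the warehouse; the next run pushes only that row.
warehouse.execute(
    "UPDATE customer_metrics SET lifetime_value = 2000 WHERE email = 'ada@example.com'"
)
second_run = run_sync(warehouse, config, crm, sync_state)
```

The business-facing part is the `SyncConfig` alone: a query, a key, a field mapping, and a schedule. The steps for moving the rows live in one generic function rather than in a hand-built workflow per pair of tools.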