Team RudderStack for RudderStack

Posted on • Originally published at rudderstack.com

How to Migrate from Segment to RudderStack

Why Migrate From Segment To RudderStack?

On the surface, RudderStack and Segment look like very similar applications. If you've already gone through the pain of standing up your Segment instance, you're probably kicking yourself for not having read this sooner. But if you're still on the fence about why you should switch from Segment to RudderStack, here are some things to consider:

The fundamentals

RudderStack is a developer-focused CDP featuring robust APIs built to use your data warehouse. The bottom line is that we do not believe in holding your data hostage. For a full head-to-head rundown of the tools, check out our RudderStack versus Segment feature comparison.

Advanced features

We built RudderStack for developers, and the product includes a number of advanced features that are worth highlighting before we jump into how easy it is to switch from Segment.

Transformations: You can run JavaScript functions on the live event stream or dbt models on at-rest data, then send the results to your entire stack. Transformations use standard JavaScript libraries, so you can code in a robust language you already know.

Data Governance API: Extend JSON payloads with rich metadata on your event stream, integrate with your existing alerting system and CI/CD pipeline, then push schema enforcement back into the system, all via API.

Rich Grafana Dashboards: Get a real-time view of events, see performance under load and get rich statistics about event delivery.

Platform Flexibility: Your place or ours! RudderStack provides dedicated VPC as well as open-source (run on your own) versions in addition to our secure Cloud option.
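To make the Transformations feature concrete, here is a minimal sketch of a function that could run on each event in the live stream. The `transformEvent(event, metadata)` name and signature, the convention that returning nothing drops the event, and the property names are all assumptions for illustration; check the Transformations documentation for the exact contract.

```javascript
// Sketch of a per-event transformation (name and signature are assumptions).
// It drops internal test traffic and masks an email property before delivery.
function transformEvent(event, metadata) {
  const props = event.properties || {};

  // Returning nothing is assumed here to mean "drop this event".
  if (props.environment === "test") return;

  // Mask the local part of an email address, keeping the domain.
  if (typeof props.email === "string" && props.email.includes("@")) {
    const domain = props.email.split("@")[1];
    event.properties = { ...props, email: "***@" + domain };
  }
  return event;
}
```

Events without an `email` property pass through unchanged.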

Price

We don't charge for MTUs. Our flexible, volume-based pricing is predictable and designed to scale with you.

Warehouse flexibility

It's your data. Host it wherever you want, and we'll bring the data to you.

Switch with minimal engineering work and no data loss

Now that we've made our case, let's walk you through the basics of switching from Segment to RudderStack. If you haven't already, go ahead and sign up for a free hosted instance on our website.

Step 1: Setting Up RudderStack

Once you have set up your account on RudderStack, you are just a few steps away from migrating. Our goal here is to lay out the steps necessary to switch the instrumentation code that generates your events from the Segment SDK to the RudderStack SDK with minimal changes.

NOTE: You can also check out our guide on how to add sources and destinations in RudderStack.

Start with creating an account on the RudderStack dashboard. Similar to Segment, you will create sources and destinations here. This will help you create the necessary connections for the event data to flow from your sources to the destination.

RudderStack requires a data plane for the events to flow through. You can set it up yourself within your cloud computing environment, or you can have us host it within our VPC. To set it up yourself, check our installation guide.

If you prefer to have us host it within our VPC, turn on the RudderStack Hosted Service button on the Connections page of your dashboard to get started with it.

If you need more support or want us to manage your hosting, please feel free to contact us.

Step 2: Updating SDK implementation

In this example, we will highlight adding the JavaScript SDK to your existing website. For iOS and Android sample code, see our documentation on Migrating From Segment To RudderStack. The JavaScript snippet looks like this:


<script>
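    // Create the global queue that buffers calls until the SDK loads.
    var rudderanalytics = window.rudderanalytics = [];

    // Methods to stub before the SDK is available.
    var methods = [
        "load",
        "page",
        "track",
        "identify",
        "reset"
    ];
    for (var i = 0; i < methods.length; i++) {
        var method = methods[i];
        // Each stub records its name and arguments so the SDK can replay them.
        rudderanalytics[method] = (function(methodName) {
            return function() {
                rudderanalytics.push([methodName, ...arguments]);
            };
        })(method);
    }
    // Replace both placeholder strings with your own values.
    rudderanalytics.load("YOUR_WRITE_KEY", "DATA_PLANE_URI");
    rudderanalytics.page();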
</script>
<script src="https://cdn.rudderlabs.com/rudder-analytics.min.js"></script>


Note the change to the global object: we use rudderanalytics where Segment uses analytics. Apart from that name, you can use the rest of your code as is, because the RudderStack SDK is fully API-compatible with Segment.
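Because the method names match, one way to stage the switch is to resolve the tracking object in a single place instead of renaming every call site at once. The sketch below is an illustration only: `getTracker` is a hypothetical helper and is not part of either SDK.

```javascript
// Hypothetical helper: prefer the RudderStack global when it is present,
// otherwise fall back to Segment's `analytics` object.
function getTracker(globalObj) {
  const tracker = globalObj.rudderanalytics || globalObj.analytics;
  if (!tracker) throw new Error("No analytics SDK found on the page");
  return tracker;
}

// Example call site (in a browser, pass `window`):
// getTracker(window).track("Order Completed", { total: 42 });
```

Once the Segment snippet is removed, the helper can be replaced by direct calls on rudderanalytics.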

The only thing left to do is create your destinations within the RudderStack app. For a quick video tutorial, check out Sending Data Using RudderStack In Under 5 Minutes.

Step 3: Migrating (not) your warehouse schemas

A major advantage of switching from Segment to RudderStack is the ability to store all of your event and user data in your own warehouse. Migrating your data warehouse destinations from Segment to RudderStack is fairly straightforward as RudderStack can leverage the existing schemas you've already created as storage destinations within Segment. This ensures no historical data will be lost as the event sources are switched. In other words, you do not have to migrate data!

Exactly how we tackle migrating Segment event sources depends on whether your sources are running in Cloud Mode or Device Mode and whether your source events are sent from a web client, a server, or a device.

Step 1: Create the warehouse destination

Create a new warehouse destination and set the namespace to be the same as the schema that Segment is writing to. RudderStack will then write to the same set of tables as Segment. For help with a specific warehouse type, check out our documentation on warehouse destinations.

Step 2: Route server-side source events to RudderStack

We recommend starting with server-side event sources. Since we have complete control over the timing of the transition, migrating these sources is more straightforward than migrating those running on a client, which might require a user to upgrade an app or clear their cache before RudderStack-bound events are fired.

Within the RudderStack application, verify that your server-side sources are connected to the data warehouse destination created in step 1. Next, switch your server-side clients to route your event data to RudderStack. Because RudderStack writes to the same tables as Segment, any legacy events not yet processed by Segment will continue to load into those tables even as the new RudderStack events start flowing. Also, because Segment server source events only run in cloud mode, there is no need to keep the Segment connection running once you have made the switch.
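In practice you would normally make this switch by swapping in a RudderStack server SDK. To keep the example dependency-free, the sketch below instead builds a raw HTTP request for a track event. It assumes the data plane accepts POST requests at /v1/track and uses HTTP Basic auth with the source write key as the username and an empty password; confirm both against the HTTP API documentation. `buildTrackRequest` is a hypothetical helper name.

```javascript
// Build (but do not send) a track request for a RudderStack data plane.
// Assumptions: POST /v1/track, Basic auth with "<writeKey>:" as credentials.
function buildTrackRequest(dataPlaneUrl, writeKey, userId, event, properties) {
  const token = Buffer.from(writeKey + ":").toString("base64");
  return {
    // Trim trailing slashes so the path is joined cleanly.
    url: dataPlaneUrl.replace(/\/+$/, "") + "/v1/track",
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: "Basic " + token,
      },
      body: JSON.stringify({ userId, event, properties }),
    },
  };
}
```

On a runtime with a global `fetch` (for example, Node 18 or later), the result can be sent with `const { url, options } = buildTrackRequest(...)` followed by `await fetch(url, options)`.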

Step 3: Client-side events - configure Segment as a RudderStack source


Unlike server-side events, client-side events may run in device mode or cloud mode, so you may not always have complete control over when events start flowing through RudderStack. This is especially true for iOS and Android clients, where the end user must upgrade to a new version of your app before the change takes effect. To accommodate these situations, we recommend creating a Segment source in RudderStack and pointing it to your data warehouse destination.

Then copy the Webhook URL on the Settings tab, replacing <DATAPLANE_URL> with your data plane URL, which can be found at the top of the Connections screen.
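If you script this step, the substitution can be sketched as follows. `fillDataPlaneUrl` is a hypothetical helper, and the template shown in the usage note is made up for illustration; always copy the real Webhook URL from the Settings tab.

```javascript
// Replace the <DATAPLANE_URL> placeholder in the copied webhook URL.
function fillDataPlaneUrl(webhookTemplate, dataPlaneUrl) {
  if (!webhookTemplate.includes("<DATAPLANE_URL>")) {
    throw new Error("Template has no <DATAPLANE_URL> placeholder");
  }
  // Trim trailing slashes so the result has no "//" after the host.
  const base = dataPlaneUrl.replace(/\/+$/, "");
  return webhookTemplate.replace("<DATAPLANE_URL>", base);
}
```

For example, with a made-up template of `"<DATAPLANE_URL>/example-path"` and a data plane at `https://dp.example.com/`, the helper returns `https://dp.example.com/example-path`.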


Next, create a Webhook Destination in Segment using the webhook URL created above. There are no additional header values required. Connect your source to the webhook destination, and RudderStack will begin sending those events to your warehouse.

At this point, you will want to disconnect your warehouse destination in Segment. Otherwise, duplicate events will be created in your data warehouse.

Step 4: Backfilling anonymous IDs from Segment

When migrating from Segment, you likely already have some amount of anonymous traffic that has not yet been identified. When Segment or RudderStack track events for non-identified users, both assign a random UUID as an anonymousId. This ID is used to track an unknown user until they are identified and allows us to stitch together user behavior, journeys, and first touch attribution before and after they are identified.
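To illustrate why preserving the same anonymousId matters, the sketch below stitches pre-identification events to the user they were later identified as. The event shape is deliberately simplified and is an assumption, as are the helper names `buildIdentityMap` and `stitch`; real warehouse stitching would typically happen in SQL.

```javascript
// Map each anonymousId to the userId it was later identified as.
function buildIdentityMap(events) {
  const map = new Map();
  for (const e of events) {
    if (e.anonymousId && e.userId) map.set(e.anonymousId, e.userId);
  }
  return map;
}

// Attach the resolved userId to events recorded before identification.
function stitch(events) {
  const map = buildIdentityMap(events);
  return events.map((e) => ({
    ...e,
    userId: e.userId || map.get(e.anonymousId) || null,
  }));
}
```

If the SDK switch assigned a fresh anonymousId, the earlier events would have no matching entry in the map and would stay unattributed.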

To avoid duplicating these previously assigned anonymous users, we recommend loading the RudderStack SDK in the ready callback of the Segment SDK for a period of time. By loading RudderStack in the callback, we can retrieve the previously assigned anonymousId from the Segment cookie and assign that same anonymousId to the RudderStack user while initializing the RudderStack SDK. Once the SDKs have overlapped long enough for us to feel confident that the majority of anonymousId values have been backfilled, we can remove the Segment SDK and begin using only the RudderStack SDK.

A code snippet for loading the SDKs in parallel is shown below:


<script type="text/javascript">
!function(){var e=window.rudderanalytics=window.rudderanalytics||[];e.methods=["load","page","track","identify","alias","group","ready","reset","getAnonymousId","setAnonymousId"],e.factory=function(t){return function(){var r=Array.prototype.slice.call(arguments);return r.unshift(t),e.push(r),e}};for(var t=0;t<e.methods.length;t++){var r=e.methods[t];e[r]=e.factory(r)}e.loadJS=function(e,t){var r=document.createElement("script");r.type="text/javascript",r.async=!0,r.src="https://cdn.rudderlabs.com/v1/rudder-analytics.min.js";var a=document.getElementsByTagName("script")[0];a.parentNode.insertBefore(r,a)}}()
!(function(){
// Create a queue, but don't obliterate an existing one!
var analytics = window.analytics = window.analytics || [];
// If the real analytics.js is already on the page return.
if (analytics.initialize) return;
// If the snippet was invoked already show an error.
if (analytics.invoked) {
if (window.console && console.error) {
console.error('Segment snippet included twice.');
}
return;
}
// Invoked flag, to make sure the snippet
// is never invoked twice.
analytics.invoked = true;
// A list of the methods in Analytics.js to stub.
analytics.methods = [
'trackSubmit',
'trackClick',
'trackLink',
'trackForm',
'pageview',
'identify',
'reset',
'group',
'track',
'ready',
'alias',
'debug',
'page',
'once',
'off',
'on',
'addSourceMiddleware',
'addIntegrationMiddleware',
'setAnonymousId',
'addDestinationMiddleware'
];
// Define a factory to create stubs. These are placeholders
// for methods in Analytics.js so that you never have to wait
// for it to load to actually record data. The `method` is
// stored as the first argument, so we can replay the data.
analytics.factory = function(method){
return function(){
var args = Array.prototype.slice.call(arguments);
args.unshift(method);
analytics.push(args);
return analytics;
};
};
// For each of our methods, generate a queueing stub.
for (var i = 0; i < analytics.methods.length; i++) {
var key = analytics.methods[i];
analytics[key] = analytics.factory(key);
}
// Define a method to load Analytics.js from our CDN,
// and that will be sure to only ever load it once.
analytics.load = function(key, options){
// Create an async script element based on your key.
var script = document.createElement('script');
script.type = 'text/javascript';
script.async = true;
script.src = 'https://cdn.segment.com/analytics.js/v1/'
+ key + '/analytics.min.js';
// Insert our script next to the first script element.
var first = document.getElementsByTagName('script')[0];
first.parentNode.insertBefore(script, first);
analytics._loadOptions = options;
};
// Add a version to keep track of what's in the wild.
analytics.SNIPPET_VERSION = '4.1.0';
// Load Analytics.js with your key, which will automatically
// load the tools you've enabled for your account. Boosh!
analytics.load("SEGMENT_WRITE_KEY");
// Make the first page call to load the integrations. If
// you'd like to manually name or tag the page, edit or
// move this call however you'd like.
analytics.page();
// analytics ready callback
analytics.ready(function() {
// INITIALIZE RUDDER SDK with setAnonymousId
window.rudderanalytics.unshift(["setAnonymousId", window.analytics.user().anonymousId()])
window.rudderanalytics.unshift(["load", "RudderStack_WRITE_KEY", "RudderStack_DATAPLANE_URL"])
window.rudderanalytics.page()
window.rudderanalytics.loadJS()
})})();
</script>

NOTE: You will need to enter your SEGMENT_WRITE_KEY, RudderStack_WRITE_KEY, and RudderStack_DATAPLANE_URL in the above example.
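While both SDKs are running, it can be useful to confirm that they report the same anonymousId. The sketch below is a hypothetical check, not part of either SDK. It relies on `analytics.user().anonymousId()` and on the `getAnonymousId` method stubbed in the snippet above, and it assumes both SDKs have finished loading (before that, the stubs only queue calls).

```javascript
// Compare the anonymousId reported by each SDK once both have loaded.
// `segment` and `rudder` stand in for window.analytics and window.rudderanalytics.
function anonymousIdsMatch(segment, rudder) {
  const segmentId = segment.user().anonymousId();
  const rudderId = rudder.getAnonymousId();
  return Boolean(segmentId) && segmentId === rudderId;
}
```

In a browser, one option is to call `anonymousIdsMatch(window.analytics, window.rudderanalytics)` from inside a `rudderanalytics.ready` callback and log any mismatch.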

Advanced options

If you use Segment's Personas and User Traits features, stay tuned for part two and part three of this series, where we'll detail how RudderStack supports identity resolution, SQL Traits, and building and distributing custom audiences using dbt and RudderStack Warehouse Actions!

Ready to start migrating?

Sign up for free today to test out our event stream, ELT, and reverse-ETL pipelines. Use our HTTP source to send data in less than 5 minutes, or install one of our 12 SDKs in your website or app. Get started.
