loading...

GraphQL + SQL Magic for faster downloads on Shopify

arjunrajkumar profile image Arjun Rajkumar ・3 min read

Shopify's GraphQL Admin API lets you download large volumes of data asynchronously in bulk.

For example, here is a bulk query for getting a list of all the products with its images and variants.

def structure_the_query
    context.download_products_query = context.client.parse <<-'GRAPHQL'
      mutation {
        bulkOperationRunQuery(
         query: """
          {
            products {
              edges {
                node {
                  id
                  title
                  images {
                    edges {
                      node {
                        id
                        originalSrc
                      }
                    }
                  }
                  variants {
                    edges {
                      node {
                        id
                        title
                        position
                        price
                        image {
                          id
                        }
                      }
                    }
                  }
                }
              }
            }
          }
          """
        ) {
          bulkOperation {
            id
            status
          }
          userErrors {
            field
            message
          }
        }
      }
    GRAPHQL
  end

And, when the bulk query operation is complete, you get back a JSON file.. This is a sample of how it looks like

{"id":"gid:\/\/shopify\/Product\/5311829246111","title":"Guitars"}
{"id":"gid:\/\/shopify\/ProductImage\/17459197116575","originalSrc":"https:\/\/cdn.shopify.com\/s\/files\/1\/0414\/2294\/6463\/products\/shopping.png?v=1592737453","__parentId":"gid:\/\/shopify\/Product\/5311829246111"}
{"id":"gid:\/\/shopify\/ProductVariant\/34792999878815","title":"Default Title","position":1,"price":"0.00","image":null,"__parentId":"gid:\/\/shopify\/Product\/5311829246111"}

-

This is my flow on how I used GraphQL with Active Record Import to improve performance.

1. Retrieve all the products from the shop

def call
  activate_graphql_client
  structure_the_query
  make_the_query
  poll_status_of_bulk_query
  retrieve_products
end

The five methods above in the interactor focusses on making the connection with Shopify and retrieving the products. The retrieve_products method does the job of checking if the bulk query is completed, and saving the products to the database.

def retrieve_products
    if context.result_poll_status.data.current_bulk_operation.status == 'COMPLETED'
      context.url = context.result_poll_status.data.current_bulk_operation.url

      SaveProducts.call!(shop_id: context.shop.id, url: context.url)
    else
      sleep 2
      poll_status_of_bulk_query
      retrieve_products
    end
  end

2. Save the products in your database

The retrieve_products method calls this interactor: SaveProducts.call!(shop_id: context.shop.id, url: context.url) whose main purpose is to download the products from the file which Shopify shared with us, and to save the products in the database.

class SaveProducts
  include Interactor

  def call
    download_products_from_file
  end

  private

  def download_products_from_file
    context.products = []
    line = nil

    URI.open(context.url) do |f|
      f.each do |new_line|
        parse_line(line, new_line) if line
        line = new_line
      end
      parse_line(line)
    end

    Product.import context.products, recursive: true
  end

  def parse_line(line, next_line=nil)
    json = JSON.parse(line)

    if belongs_to_product(json['id'])
      create_new_product(line, json)

      move_product_into_products(next_line)
    elsif belongs_to_image(json['id'])
      create_new_image(line, json)

      move_product_into_products(next_line)
    elsif belongs_to_variant(json['id'])
      create_new_variant(line, json)

      move_product_into_products(next_line)
    end
  end

  def belongs_to_product(id)
    !belongs_to_image(id) && !belongs_to_variant(id)
  end

  def belongs_to_image(id)
    id.include? 'ProductImage'
  end

  def belongs_to_variant(id)
    id.include? 'ProductVariant'
  end

  def create_new_product(line, json)
    context.product = Product.new(shopify_product_id: json['id'].delete('^0-9'), title: json['title'], shop_id: context.shop_id)
  end

  def move_product_into_products(next_line=nil)
    if next_line
      json_next_line = JSON.parse(next_line)
      context.products << context.product if belongs_to_product(json_next_line['id'])
    else
      context.products << context.product
    end
  end

  def create_new_image(line, json)
    context.product.images.build(shopify_image_id: json['id'].delete('^0-9'), src: json['originalSrc'], shop_id: context.shop_id)
  end

  def create_new_variant(line, json)
    context.product.variants.build(shopify_variant_id: json['id'].delete('^0-9'), title: json['title'], position: json['position'], price: json['price'], shopify_image_id: json.dig('image', 'id')&.delete('^0-9'), shop_id: context.shop_id)
  end
end

This line Product.import context.products, recursive: true also helps in improving the performance of the downloads. This comes from the ActiveRecord Import gem.

If you had to download 1000 products, and were hitting the DB each time, you would end up making 1000 SQL calls. Instead Product.import lets you batch all the products together, so that you can save this with just 1 SQL call.

Move on from manually paginating results and managing a client-side throttle, to performing bulk operations with the Shopify's GraphQL Admin API. And combine this with some SQL magic to get some real performance boosts!

Discussion

pic
Editor guide