In this article, we'll go over the methods used to achieve async CSV export with Ruby on Rails.
Problem: When a large amount of data is exported, the server may not be able to build the CSV before the request times out, so large exports occasionally fail with a timeout error.
The solution is to move the CSV export to a background job so that it doesn't block the web request, and to notify the client when the CSV file is ready.
We implemented a general solution using the Command design pattern, which lets us run the same job code for many types of CSV output.
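As a rough illustration of the Command pattern in this setting, here is a minimal plain-Ruby sketch. The class names `MembersCsvCommand` and `TasksCsvCommand`, the sample rows, and the `run_export` helper are hypothetical and not part of the application; `Object.const_get` stands in for Rails' `constantize`.

```ruby
require 'csv'

# Hypothetical command: every command exposes the same two methods,
# `call` (returns the CSV as a String) and `file_name`.
class MembersCsvCommand
  def initialize(params = {})
    @members = params.fetch(:members, [])
  end

  def call
    CSV.generate do |csv|
      csv << %w[name email]
      @members.each { |m| csv << [m[:name], m[:email]] }
    end
  end

  def file_name
    'members.csv'
  end
end

# A second hypothetical command with a different CSV format.
class TasksCsvCommand
  def initialize(params = {})
    @tasks = params.fetch(:tasks, [])
  end

  def call
    CSV.generate do |csv|
      csv << %w[title done]
      @tasks.each { |t| csv << [t[:title], t[:done]] }
    end
  end

  def file_name
    'tasks.csv'
  end
end

# Hypothetical invoker: it only knows the command interface, not the CSV format.
def run_export(command_name, params)
  command = Object.const_get(command_name).new(params)
  { file_name: command.file_name, data: command.call }
end

result = run_export('MembersCsvCommand', { members: [{ name: 'Ann', email: 'ann@example.com' }] })
puts result[:file_name]
puts result[:data]
```

Adding a new export format then only requires a new command class; the invoker stays unchanged.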
Technology used
- Heroku
- Ruby on Rails
- Sidekiq as the background job processor
- Redis for keeping the state of CSV export processing
- Filestack or any other cloud storage service
Two frozen hashes, referenced later as Constants::EXPORTABLE_REDIS_STATUSES and Constants::EXPORTABLE_REDIS_KEYS, serve as enums:
EXPORTABLE_REDIS_STATUSES = {
  processing: 'Processing',
  complete: 'Processed',
}.freeze

EXPORTABLE_REDIS_KEYS = {
  members_csv: 'MEMBERS_CSV_GENERATORS',
  all_tasks_csv: 'ALL_TASKS_CSV',
  my_tasks_csv: 'MY_TASKS_CSV',
}.freeze
Here, we have two states: processing and processed. The first is set when the job is enqueued, and the second is set once the CSV file has been generated and uploaded.
The second hash registers the available CSV exports, mapping each one to the Redis hash that holds its state; it is used as a filter against exports that aren't registered with the application.
This is how the command invoker class, which runs as a Sidekiq job through Active Job, looks.
class Exports::ExportableCommandJob < ApplicationJob
  # Mark the export as "Processing" as soon as the job has been enqueued.
  after_enqueue do |job|
    uuid = job.arguments.first[:uuid]
    redis_key = redis_collection_key(job.arguments.first[:redis_key])
    REDIS.hset(
      redis_key,
      uuid,
      { status: Constants::EXPORTABLE_REDIS_STATUSES[:processing] }.to_json
    )
  end

  # cleanup_interval is not used in perform; after_perform reads it from job.arguments.
  def perform(uuid:, redis_key:, command:, params: {}, cleanup_interval: nil)
    params = JSON.parse(params).symbolize_keys unless params.is_a?(Hash)
    command = command.constantize.new(params)
    redis_key = redis_collection_key(redis_key)

    # Run the command and write the CSV data to a binary temp file.
    file_data = command.call
    tmp_file = Tempfile.new('upload', encoding: 'ascii-8bit')
    begin
      tmp_file << file_data
      tmp_file.flush
      tmp_file.rewind

      file_name = command.file_name
      uploaded_file = UploadFileService::UploadableFile.new(file: tmp_file, filename: file_name)
      details = UploadFileService.upload_file(uploaded_file)
    ensure
      # Always close and delete the temp file, even if the upload fails.
      tmp_file.close!
    end
    file_path = details.metadata[:fileurl]

    # Store the download URL and flip the status to "Processed".
    # Fall back to an empty record if the status entry is missing.
    generator = JSON.parse(REDIS.hget(redis_key, uuid) || '{}')
    generator['status'] = Constants::EXPORTABLE_REDIS_STATUSES[:complete]
    generator['exportable'] = file_path
    REDIS.hset(redis_key, uuid, generator.to_json)
  end

  # Schedule removal of the Redis entry and the uploaded file.
  after_perform do |job|
    uuid = job.arguments.first[:uuid]
    redis_key = redis_collection_key(job.arguments.first[:redis_key])
    ExportableCleanup.set(wait: job.arguments.first[:cleanup_interval] || 1.hour)
                     .perform_later(uuid: uuid, redis_key: redis_key)
  end

  private

  # Resolve a registered export name to its Redis hash name; otherwise use the key as given.
  def redis_collection_key(key)
    redis_key = key.to_sym
    Constants::EXPORTABLE_REDIS_KEYS[redis_key] || key
  end
end
We use the uuid and redis_key to set the job's status to processing right after it is enqueued, which lets us check its progress at any time.
The perform method accepts a command class name and its arguments via params, instantiates the command, and calls it, expecting CSV data in return. We then write the data to a temporary file and upload it using Filestack or any other cloud storage service.
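The temporary-file step can be sketched in isolation with the standard library. This is a minimal illustration rather than the application's code: the `with_export_tempfile` helper is hypothetical, and the block passed to it stands in for the upload call.

```ruby
require 'tempfile'

# Hypothetical helper: writes the generated data to a binary temp file,
# yields the rewound file to the caller (e.g. for uploading), and always
# removes the file afterwards, even if the block raises.
def with_export_tempfile(file_data)
  tmp_file = Tempfile.new('upload', encoding: 'ascii-8bit')
  tmp_file << file_data
  tmp_file.flush   # push buffered bytes to disk
  tmp_file.rewind  # move the read position back to the start
  yield tmp_file
ensure
  tmp_file&.close! # close and unlink in one step
end

csv_data = "name,email\nAnn,ann@example.com\n"
# Reading the file back stands in for handing it to an upload service.
uploaded_bytes = with_export_tempfile(csv_data) { |file| file.read }
puts uploaded_bytes
```

Wrapping the cleanup in `ensure` keeps temp files from piling up on the worker when an upload fails midway.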
After the upload, we obtain the file's URL and store it in Redis while changing the task state from processing to processed. The client can now poll for the status and, once the export is complete, receive the generated URL for downloading the file.
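To make the state transition concrete, here is a small sketch of the record stored under each uuid. A nested Ruby `Hash` stands in for the Redis hash (`hset`/`hget`), and the helper names, the `STATUSES` and `STORE` constants, and the example URL are hypothetical.

```ruby
require 'json'
require 'securerandom'

STATUSES = { processing: 'Processing', complete: 'Processed' }.freeze

# In-memory stand-in for Redis: { redis_key => { uuid => json_string } }.
STORE = Hash.new { |hash, key| hash[key] = {} }

# Mirrors what the job does right after it is enqueued.
def mark_processing(redis_key, uuid)
  STORE[redis_key][uuid] = { status: STATUSES[:processing] }.to_json
end

# Mirrors what the job does once the file has been uploaded.
def mark_complete(redis_key, uuid, file_url)
  record = JSON.parse(STORE[redis_key][uuid] || '{}')
  record['status'] = STATUSES[:complete]
  record['exportable'] = file_url
  STORE[redis_key][uuid] = record.to_json
end

uuid = SecureRandom.uuid
mark_processing('MEMBERS_CSV_GENERATORS', uuid)
puts STORE['MEMBERS_CSV_GENERATORS'][uuid]

mark_complete('MEMBERS_CSV_GENERATORS', uuid, 'https://files.example.com/members.csv')
puts STORE['MEMBERS_CSV_GENERATORS'][uuid]
```

The record starts with only a status and gains the `exportable` URL when the export finishes, which is what the status endpoint later returns to the client.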
For filtering purposes, the private method redis_collection_key is used here: it resolves a registered export name to its Redis hash name, and a key with no match is used as given.
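The lookup can be tried on its own. The sketch below copies the hash and the method body from the code above into a standalone script (without the `Constants` namespace) to show what it returns for registered and unregistered keys.

```ruby
EXPORTABLE_REDIS_KEYS = {
  members_csv: 'MEMBERS_CSV_GENERATORS',
  all_tasks_csv: 'ALL_TASKS_CSV',
  my_tasks_csv: 'MY_TASKS_CSV'
}.freeze

# Same lookup as the private method in the job, extracted so it runs on its own.
def redis_collection_key(key)
  EXPORTABLE_REDIS_KEYS[key.to_sym] || key
end

p redis_collection_key(:members_csv)             # registered name given as a symbol
p redis_collection_key('my_tasks_csv')           # registered name given as a string
p redis_collection_key('MEMBERS_CSV_GENERATORS') # no match: returned unchanged
```

Because unmatched keys are returned unchanged, the method by itself does not reject unregistered exports; any strict rejection has to happen where the key enters the application, such as in the controller.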
Finally, after_perform schedules a cleanup job, which looks like this.
class ExportableCleanup < ApplicationJob
  def perform(uuid:, redis_key:)
    exportable_json = REDIS.hget(redis_key, uuid)
    unless exportable_json.nil?
      generator = JSON.parse(exportable_json)
      file_url = generator['exportable']
      UploadFileService.remove_file(file_url) unless file_url.blank?
    end
    REDIS.hdel(redis_key, uuid)
  end
end
It simply removes the entry from Redis and the file from cloud storage.
The call to the invoker class looks like this:
def export_csv_async(args, redis_key)
  uuid = SecureRandom.uuid
  Exports::ExportableCommandJob.perform_later(
    uuid: uuid,
    redis_key: redis_key,
    command: 'CsvExportDataGenerator',
    params: args.to_h.to_json,
  )
  uuid
end
The method returns the uuid, which is passed back to the client so it can make the state-checking calls described above.
This is a simple controller action that checks the process state in Redis:
class ExportableGeneratorsController < ActionController::API
  include HttpErrorHandling

  before_action :load_resource

  def show
    render json: { status: @generator['status'], fileUrl: @generator['exportable'] }
  end

  private

  def load_resource
    # Only keys registered in EXPORTABLE_REDIS_KEYS are accepted;
    # to_s guards against a missing :key parameter.
    @exportable_key = Constants::EXPORTABLE_REDIS_KEYS[params[:key].to_s.to_sym]
    return not_found('Process not found') if @exportable_key.nil?

    gen = REDIS.hget(@exportable_key, params[:uuid])
    return not_found('Process not found') if gen.nil?

    @generator = JSON.parse(gen)
  end
end
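On the client side, checking the state usually means polling this endpoint until the status changes. The following is a minimal sketch of such a loop in plain Ruby, not part of the application: `wait_for_export` is a hypothetical helper, and `fetch_status` is any callable that returns the parsed JSON body of the status endpoint (a lambda simulates it here instead of a real HTTP call).

```ruby
# Hypothetical polling helper. `fetch_status` must return a Hash such as
# { 'status' => 'Processed', 'fileUrl' => 'https://...' }.
def wait_for_export(fetch_status, interval: 2, max_attempts: 30)
  max_attempts.times do
    body = fetch_status.call
    return body['fileUrl'] if body['status'] == 'Processed'

    sleep(interval)
  end
  nil # gave up: the export did not finish within the allowed attempts
end

# Simulated endpoint: reports 'Processing' twice, then 'Processed'.
responses = [
  { 'status' => 'Processing', 'fileUrl' => nil },
  { 'status' => 'Processing', 'fileUrl' => nil },
  { 'status' => 'Processed', 'fileUrl' => 'https://files.example.com/members.csv' }
]
fake_endpoint = -> { responses.shift }

puts wait_for_export(fake_endpoint, interval: 0)
```

Capping the number of attempts keeps a client from polling forever if the job fails or its record has already been cleaned up.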