Problem Statement:
Let’s design a file hosting service like Dropbox or Google Drive.
Cloud file storage enables users to store their data on remote servers.
Usually, these servers are maintained by cloud storage providers and made available to users over a network (internet).
Users pay for their cloud data storage on a monthly basis.
Similar Services: OneDrive, Google Drive
Step-1: Why Cloud Storage?
Cloud file storage services have become very popular recently as they simplify the storage and exchange of digital resources among multiple devices.
The shift from single personal computers to multiple devices with different platforms and operating systems, such as smartphones and tablets, accessible from many geographical locations at any time, is largely responsible for the huge popularity of cloud storage services.
Some of the top benefits of such services are:
Availability: The motto of cloud storage services is to have data availability anywhere anytime. Users can access their files/photos from any device whenever and wherever they like.
Reliability and Durability: Cloud storage also offers high reliability and durability of data. By keeping multiple copies of the data on servers in different geographical locations, cloud storage ensures that users are extremely unlikely to lose their data.
Scalability: Users will never have to worry about running out of storage space; they get effectively unlimited storage, of course at a price.
Step-2: Requirements and Goals of the System
Top-level requirements:
Users should be able to upload and download their files/photos from any device.
Users should be able to share files or folders with other users.
Our Service should support automatic synchronization between devices, i.e., after updating a file on one device, it should get synchronized on all devices.
The system should support storing large files up to a GB.
ACID properties are required: Atomicity, Consistency, Isolation, and Durability of all file operations should be guaranteed.
Our system should support offline editing. Users should be able to add/delete/modify files while offline, and as soon as they come online, all their changes should be synced to the remote servers and other online devices.
Extended Requirements:
The system should support snapshotting of the data, so that users can go back to any version of the files.
Step-3: Some Design Considerations
Internally, files can be stored in small parts or chunks (say 4MB). This provides several benefits, e.g., failed operations need to be retried only for the affected parts of a file: if a user's upload fails, only the failing chunks are retried.
We can reduce the amount of data exchange by transferring updated chunks only.
By removing duplicate chunks, we can save storage space and bandwidth usage.
Keeping a local copy of the metadata (file name, size, etc.) with the client can save us a lot of round trips to the server.
For small changes, clients can intelligently upload the diffs instead of the whole chunk.
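The chunk-and-diff ideas above can be sketched as follows. This is a toy illustration, not any real client's code; the 4MB constant comes from the text, while the function names and the per-chunk SHA-256 fingerprints are assumptions.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MB, the fixed chunk size assumed above


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Split a file's bytes into fixed-size chunks (the last may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_fingerprints(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """One SHA-256 per chunk; matching fingerprints mean the chunk is unchanged."""
    return [hashlib.sha256(c).hexdigest() for c in split_into_chunks(data, chunk_size)]


def changed_chunk_indexes(old: bytes, new: bytes, chunk_size: int = CHUNK_SIZE) -> list[int]:
    """Indexes of the chunks that actually need to be re-uploaded after an edit."""
    old_fp = chunk_fingerprints(old, chunk_size)
    new_fp = chunk_fingerprints(new, chunk_size)
    return [i for i, fp in enumerate(new_fp) if i >= len(old_fp) or fp != old_fp[i]]
```

Comparing fingerprints also enables deduplication later: two chunks with the same hash only need to be stored and transferred once.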
Step-4: Capacity Estimation and Constraints
Let’s assume that we have 500M total users, and 100M daily active users.
Let’s assume that on average each user connects from three different devices.
If a user has 200 files or photos on average, we will have 100 billion total files.
Let’s assume that the average file size is 100KB; then total storage would be: 100B * 100KB = 10PB
Let’s also assume that we will have one million active connections per minute.
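As a quick sanity check, the arithmetic behind these estimates (assuming decimal units, 1 KB = 1000 bytes) works out as:

```python
# Back-of-the-envelope numbers from the estimates above
total_users = 500_000_000
files_per_user = 200
avg_file_size = 100 * 1000                       # 100 KB in bytes

total_files = total_users * files_per_user       # 100 billion files
PB = 1000 ** 5                                   # one petabyte, decimal units
total_storage = total_files * avg_file_size      # 1e16 bytes = 10 PB
```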
Step-5: High Level Design
The user will specify a folder as the workspace on their device.
Any file/photo/folder placed in this folder will be uploaded to the cloud, and whenever a file is modified or deleted, it will be reflected in the same way in the cloud storage.
The user can specify similar workspaces on all their devices and any modification done on one device will be propagated to all other devices to have the same view of the workspace everywhere.
At a high level, we need to store files and their metadata information like File Name, File Size, Directory, etc., and with whom each file is shared.
So, we need some servers that can help the clients to upload/download files to Cloud Storage and some servers that can facilitate updating metadata about files and users.
We also need some mechanism to notify all clients whenever an update happens so they can synchronize their files.
As shown in the diagram below, Block servers will work with the clients to upload/download files from cloud storage, and Metadata servers will keep metadata of files updated in a SQL or NoSQL database.
Synchronization servers will handle the workflow of notifying all clients about different changes for synchronization.
Step-6: Detailed Component Design
a) Client
The Client Application monitors the workspace folder on the user’s machine and syncs all files/folders in it with the remote Cloud Storage.
The client application works with the storage servers to upload, download, and modify actual files in the backend Cloud Storage.
The client also interacts with the remote Synchronization Service to handle any file metadata updates, e.g., changes in the file name, size, modification date, etc.
How do we handle file transfer efficiently?
We can break each file into smaller chunks so that we transfer only those chunks that are modified and not the whole file.
Let’s say we divide each file into fixed-size 4MB chunks.
We can statically calculate an optimal chunk size based on:
Storage devices we use in the cloud to optimize space utilization and Input/output operations per second
Network bandwidth
Average file size in the storage etc.
In our metadata, we should also keep a record of each file and the chunks that constitute it.
Should we keep a copy of the metadata with the client? Yes: keeping a local copy of the metadata enables offline updates and saves a lot of round trips to the server.
Based on the above considerations, we can divide our client into the following four parts:
Internal Metadata Database: keeps track of all the files, chunks, their versions, and their location in the file system.
Chunker: splits files into smaller pieces called chunks and reconstructs a file from its chunks. The chunking algorithm detects the parts of a file that have been modified by the user and transfers only those parts to the Cloud Storage, which saves bandwidth and synchronization time.
Watcher: monitors the local workspace folders and notifies the Indexer of any action performed by the user, e.g., creating, deleting, or updating files or folders. The Watcher also listens for changes happening on other clients that are broadcast by the Synchronization Service.
Indexer: processes the events received from the Watcher and updates the internal metadata database with information about the chunks of the modified files. Once the chunks are successfully uploaded to or downloaded from the Cloud Storage, the Indexer communicates with the remote Synchronization Service to broadcast the changes to other clients and update the remote metadata database.
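A minimal sketch of the client's internal metadata database, here using SQLite purely for illustration; the table and column names are assumptions, not part of any real client.

```python
import sqlite3


def open_metadata_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create the local tables the Indexer keeps up to date."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS files (
            file_id INTEGER PRIMARY KEY,
            path    TEXT UNIQUE NOT NULL,
            version INTEGER NOT NULL DEFAULT 1
        );
        CREATE TABLE IF NOT EXISTS chunks (
            file_id INTEGER NOT NULL REFERENCES files(file_id),
            seq     INTEGER NOT NULL,   -- position of the chunk in the file
            sha256  TEXT NOT NULL,      -- fingerprint used for diffing/dedup
            PRIMARY KEY (file_id, seq)
        );
    """)
    return db


def record_file(db: sqlite3.Connection, path: str, chunk_hashes: list[str]) -> int:
    """What the Indexer might do after the Chunker processes a new file."""
    cur = db.execute("INSERT INTO files (path) VALUES (?)", (path,))
    file_id = cur.lastrowid
    db.executemany("INSERT INTO chunks VALUES (?, ?, ?)",
                   [(file_id, i, h) for i, h in enumerate(chunk_hashes)])
    db.commit()
    return file_id
```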
b) Metadata Database
The metadata database maintains indexes of the various chunks. The information includes file/chunk names and their different versions, along with information about users and workspaces. You can use an RDBMS or NoSQL, but make sure you meet the data consistency requirement, because multiple clients may be working on the same file.

With an RDBMS, strong consistency comes by default; with NoSQL you typically get eventual consistency, so you need different configurations for different databases (for example, Cassandra's consistency is tuned through its replication factor together with the per-query consistency level).

Relational databases are difficult to scale, so if you use MySQL you will need a database sharding technique (or a master-slave setup) to scale the application. With sharding you add multiple MySQL databases, but it becomes difficult to manage them whenever the schema is updated or new information is added. To overcome this, we can build an edge wrapper around the sharded databases. This edge wrapper provides an ORM, and clients interact with the wrapper's ORM instead of with the databases directly.
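One way to picture the edge wrapper over sharded databases is a thin router that hashes the user ID to pick a shard. This is a toy sketch: the shard count, the plain dicts standing in for MySQL instances, and the method names are all illustrative.

```python
import hashlib


class ShardedMetadataStore:
    """Toy edge wrapper: routes each user's metadata to one shard."""

    def __init__(self, num_shards: int = 4):
        # Each dict stands in for one MySQL shard behind the wrapper.
        self.shards = [dict() for _ in range(num_shards)]

    def _shard_for(self, user_id: str) -> dict:
        h = int(hashlib.md5(user_id.encode()).hexdigest(), 16)
        return self.shards[h % len(self.shards)]

    def put(self, user_id: str, key: str, value) -> None:
        self._shard_for(user_id)[(user_id, key)] = value

    def get(self, user_id: str, key: str):
        return self._shard_for(user_id).get((user_id, key))
```

Callers never see which shard holds their data; that indirection is what makes re-sharding manageable later.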
c) Synchronization Service
The client communicates with the Synchronization Service either to receive the latest updates from the cloud storage or to send its latest changes to it. The Synchronization Service receives requests from the request queue of the messaging service and updates the metadata database with the latest changes. It also broadcasts the latest update to the other clients (if there are multiple clients) through their response queues, so that each client's Indexer can fetch the changed chunks from cloud storage and recreate the files in their latest state, and it updates the local database with the information stored in the Metadata Database. If a client is not connected to the internet or is offline for some time, it polls the system for new updates as soon as it comes back online.
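The request-queue/response-queue flow can be sketched with in-process queues, one response queue per client as described above. Class and method names are assumptions; a real deployment would use a distributed message broker rather than `queue.Queue`.

```python
from queue import Empty, Queue


class SyncService:
    """Toy synchronization service: one shared request queue,
    plus one response queue per registered client."""

    def __init__(self):
        self.request_queue = Queue()
        self.response_queues: dict[str, Queue] = {}

    def register(self, client_id: str) -> None:
        self.response_queues[client_id] = Queue()

    def process_one(self) -> None:
        """Take one update off the request queue and broadcast it
        to every client except the sender."""
        sender, update = self.request_queue.get()
        for client_id, q in self.response_queues.items():
            if client_id != sender:
                q.put(update)

    def poll(self, client_id: str) -> list:
        """What a client does when it comes (back) online: drain its queue."""
        updates = []
        q = self.response_queues[client_id]
        while True:
            try:
                updates.append(q.get_nowait())
            except Empty:
                return updates
```

Because each client has its own response queue, an offline client simply finds its pending updates waiting when it polls again.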
d) Message Queuing Service (MQS)
An important part of our architecture is a messaging middleware that should be able to handle a substantial number of requests.
A scalable MQS that supports asynchronous message-based communication between clients and the Synchronization Service instances best fits the requirements of our application.
e) Cloud/Block Storage
You can use any cloud storage service like Amazon S3 to store the chunks of the files uploaded by the user. The client communicates with the cloud storage for any action performed in the files/folders using the API provided by the cloud provider.
Block Service
Block Service interacts with block storage for uploading and downloading of files. Clients connect with Block Service to upload or download file chunks.
When a client finishes downloading a file, the Block Service notifies the Meta Service to update the metadata. When a client uploads a file, the Block Service, on finishing the upload to block storage, notifies the Meta Service to update the metadata corresponding to this client and to broadcast messages to the other clients.
Block Storage can be implemented using a distributed file system like GlusterFS, or an object store like Amazon S3. These provide the high reliability and durability needed to make sure uploaded files are never lost.
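The upload path described above might look like the sketch below. `FakeBlockStorage` stands in for S3/GlusterFS, and the notification callback stands in for the call to the Meta Service; all names and the chunk-key format are assumptions.

```python
from typing import Callable


class FakeBlockStorage:
    """Stand-in for S3/GlusterFS: stores chunk bytes by key."""

    def __init__(self):
        self._blobs: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


class BlockService:
    def __init__(self, storage: FakeBlockStorage, notify_meta_service: Callable):
        self.storage = storage
        self.notify = notify_meta_service  # invoked after a successful upload

    def upload(self, client_id: str, file_path: str, chunks: list[bytes]) -> list[str]:
        keys = []
        for i, chunk in enumerate(chunks):
            key = f"{file_path}:{i}"       # illustrative key scheme: path + chunk index
            self.storage.put(key, chunk)
            keys.append(key)
        # Only after every chunk lands do we tell the Meta Service,
        # which then updates metadata and broadcasts to other clients.
        self.notify(client_id, file_path, keys)
        return keys
```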
Step-7: File Processing Workflow
The sequence below shows the interaction between the components of the application in a scenario when Client A updates a file that is shared with Client B and C, so they should receive the update too.
If the other clients are not online at the time of the update, the Message Queuing Service keeps the update notifications in separate response queues for them until they come online.
Step-8: Data Deduplication
Data deduplication is a technique used for eliminating duplicate copies of data to improve storage utilization.
It can also be applied to network data transfers to reduce the number of bytes that must be sent.
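Content-addressed chunk storage gives deduplication almost for free: store each chunk under a hash of its content, and identical chunks collapse into one stored copy. A minimal sketch (the class and method names are illustrative):

```python
import hashlib


class DedupStore:
    """Chunks are keyed by their SHA-256, so duplicates are stored only once."""

    def __init__(self):
        self._chunks: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        if key not in self._chunks:
            # A duplicate chunk never reaches this branch: no extra storage,
            # and a client that knows the key can skip the transfer entirely.
            self._chunks[key] = data
        return key

    def unique_chunks(self) -> int:
        return len(self._chunks)
```

For the network-transfer case, the client can first send only the chunk hashes and upload just the chunks the server does not already have.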
Step-9: Metadata Partitioning
To scale out metadata DB, we need to partition it so that it can store information about millions of users and billions of files/chunks.
We need to come up with a partitioning scheme that would divide and store our data to different DB servers.
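One scheme commonly used for this kind of partitioning is consistent hashing, which keeps the amount of remapped data small when DB servers are added or removed. The sketch below is a toy version: the virtual-node count, MD5 as the ring hash, and the class name are all arbitrary choices, not something the article prescribes.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash partitioning: each key maps to the first node
    clockwise from its hash position on the ring."""

    def __init__(self, nodes: list[str], vnodes: int = 100):
        self._ring: list[tuple[int, str]] = []  # sorted (hash, node) pairs
        for node in nodes:
            self.add(node, vnodes)

    @staticmethod
    def _h(s: str) -> int:
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def add(self, node: str, vnodes: int = 100) -> None:
        # Virtual nodes spread each server over many ring positions,
        # which evens out the key distribution.
        for i in range(vnodes):
            bisect.insort(self._ring, (self._h(f"{node}#{i}"), node))

    def node_for(self, key: str) -> str:
        idx = bisect.bisect(self._ring, (self._h(key), ""))
        if idx == len(self._ring):
            idx = 0  # wrap around the ring
        return self._ring[idx][1]
```

Partitioning by a hash of the file ID (rather than by user ID) avoids hot partitions for users with huge numbers of files, at the cost of scattering one user's metadata across servers.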
Notification Service
Notification Service broadcasts file changes to connected clients, making sure any change to a file is reflected on all watching clients instantly.
Notification Service can be implemented using HTTP Long Polling, WebSockets, or Server-Sent Events. WebSockets establish a persistent duplex connection between client and server, but they are not the best choice in this scenario: we only need to push messages from the service to clients, not two-way communication, so a full-duplex connection is overkill.
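The long-polling option can be sketched with a condition variable: each client's request blocks until a change newer than the last one it saw arrives, or a timeout elapses and the client simply re-issues the poll. This is the in-process core only, with assumed names; a real service would sit behind an HTTP handler.

```python
import threading


class NotificationService:
    """Toy long-poll core: clients wait for a version newer than last_seen."""

    def __init__(self):
        self._cond = threading.Condition()
        self._version = 0
        self._latest = None

    def publish(self, change) -> None:
        with self._cond:
            self._version += 1
            self._latest = change
            self._cond.notify_all()  # wake every waiting long-poll request

    def long_poll(self, last_seen: int, timeout: float = 30.0):
        with self._cond:
            self._cond.wait_for(lambda: self._version > last_seen, timeout)
            if self._version > last_seen:
                return self._version, self._latest
            return last_seen, None   # timed out; client re-issues the poll
```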
Database Design:
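The article's schema diagram is not reproduced here. As a hedged placeholder consistent with the components above, one plausible set of entities for the metadata database might be (every name here is an assumption, not the article's schema):

```python
from dataclasses import dataclass, field


@dataclass
class User:
    user_id: int
    email: str
    devices: list[str] = field(default_factory=list)  # devices syncing this account


@dataclass
class File:
    file_id: int
    owner_id: int
    path: str
    version: int                 # supports the snapshot/versioning requirement
    chunk_hashes: list[str]      # ordered chunk fingerprints pointing into block storage


@dataclass
class SharePermission:
    file_id: int
    user_id: int                 # the user the file is shared with
    can_write: bool
```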