Task
Not long ago I was looking for an additional project as a Go developer and came across a vacancy at a little-known company. The test task was to write a simple client-server app that uploads large files over a gRPC connection.
I thought: OK, why not.
Spoiler: I got an offer but declined it.
Solution
Preparation
The official gRPC docs describe two styles of communication: unary and streaming.
To upload big files we can use client-side streaming: the client sends the file to the server as a sequence of byte chunks.
Let's write a simple proto file for it:
// grpc-filetransfer/pkg/proto
syntax = "proto3";

package proto;

option go_package = "./;uploadpb";

message FileUploadRequest {
	string file_name = 1;
	bytes chunk = 2;
}

message FileUploadResponse {
	string file_name = 1;
	uint32 size = 2;
}

service FileService {
	rpc Upload(stream FileUploadRequest) returns (FileUploadResponse);
}
As you can see, FileUploadRequest contains file_name and chunk, and FileUploadResponse is a simple response sent after a successful upload. The service has a single method, Upload.
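Let's generate the Go files in your proto dir: protoc --go_out=. --go_opt=paths=source_relative --go-grpc_out=. --go-grpc_opt=paths=source_relative upload.proto
The --go_out flag produces the message types and --go-grpc_out produces the service stubs, so both the protoc-gen-go and protoc-gen-go-grpc plugins must be installed.
OK, we have the generated Go code and can start writing our services.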
Server side
The server reads its config, accepts incoming gRPC clients through the Upload procedure, writes the file parts to disk, and replies once the client's stream ends.
The server should embed UnimplementedFileServiceServer, which was generated by protoc:
type FileServiceServer struct {
	uploadpb.UnimplementedFileServiceServer
	l   *logger.Logger
	cfg *config.Config
}
and implement the Upload method, which takes the stream as its argument:
func (g *FileServiceServer) Upload(stream uploadpb.FileService_UploadServer) error {
	file := NewFile()
	var fileSize uint32
	defer func() {
		// Close only if a file was actually created.
		if file.OutputFile == nil {
			return
		}
		if err := file.OutputFile.Close(); err != nil {
			g.l.Error(err)
		}
	}()
	for {
		req, err := stream.Recv()
		// Check the error before touching req: on io.EOF or a failure there is no message.
		if err == io.EOF {
			break
		}
		if err != nil {
			return g.logError(status.Error(codes.Internal, err.Error()))
		}
		if file.FilePath == "" {
			// Keep only the base name so a client cannot write outside the storage dir.
			name := filepath.Base(req.GetFileName())
			if err := file.SetFile(name, g.cfg.FilesStorage.Location); err != nil {
				return g.logError(status.Error(codes.Internal, err.Error()))
			}
		}
		chunk := req.GetChunk()
		fileSize += uint32(len(chunk))
		g.l.Debug("received a chunk with size: %d", len(chunk))
		if err := file.Write(chunk); err != nil {
			return g.logError(status.Error(codes.Internal, err.Error()))
		}
	}
	fileName := filepath.Base(file.FilePath)
	g.l.Debug("saved file: %s, size: %d", fileName, fileSize)
	return stream.SendAndClose(&uploadpb.FileUploadResponse{FileName: fileName, Size: fileSize})
}
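The receive loop can be tried out without a running gRPC server by swapping the stream for a fake. The sketch below is only an illustration: chunkReceiver, sliceStream and receiveAll are hypothetical names, and the interface models just the Recv call rather than the real uploadpb.FileService_UploadServer. It shows the order of checks the loop relies on: first io.EOF, then other errors, and only then the message itself.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// chunkReceiver is a hypothetical stand-in for the generated server stream;
// only the Recv call is modeled, and it returns raw bytes instead of a message.
type chunkReceiver interface {
	Recv() ([]byte, error)
}

// sliceStream is a fake stream that yields prepared chunks and then io.EOF.
type sliceStream struct {
	chunks [][]byte
	pos    int
}

func (s *sliceStream) Recv() ([]byte, error) {
	if s.pos >= len(s.chunks) {
		return nil, io.EOF
	}
	c := s.chunks[s.pos]
	s.pos++
	return c, nil
}

// receiveAll drains the stream and returns the number of bytes received.
// The error from Recv is inspected before the chunk is used.
func receiveAll(stream chunkReceiver) (uint64, error) {
	var total uint64
	for {
		chunk, err := stream.Recv()
		if errors.Is(err, io.EOF) {
			return total, nil // the sender closed the stream normally
		}
		if err != nil {
			return total, err
		}
		total += uint64(len(chunk))
	}
}

func main() {
	s := &sliceStream{chunks: [][]byte{[]byte("hello "), []byte("world")}}
	n, err := receiveAll(s)
	fmt.Println(n, err) // prints "11 <nil>"
}
```

An empty sliceStream returns io.EOF on the first call, which is the case where reading the file name before checking the error would go wrong.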
I used a simple File struct with three methods for file operations: SetFile, Write and Close.
type File struct {
	FilePath   string
	buffer     *bytes.Buffer
	OutputFile *os.File
}

func (f *File) SetFile(fileName, path string) error {
	err := os.MkdirAll(path, os.ModePerm)
	if err != nil {
		log.Fatal(err)
	}
	f.FilePath = filepath.Join(path, fileName)
	file, err := os.Create(f.FilePath)
	if err != nil {
		return err
	}
	f.OutputFile = file
	return nil
}

func (f *File) Write(chunk []byte) error {
	if f.OutputFile == nil {
		return nil
	}
	_, err := f.OutputFile.Write(chunk)
	return err
}

func (f *File) Close() error {
	return f.OutputFile.Close()
}
The server writes each chunk to disk as soon as it arrives from the client, so it never holds the whole file in memory. The maximum file size is therefore limited only by the file system.
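One detail worth checking for very large files is the size counter: the proto declares size as uint32 and the handler accumulates into a uint32, which can hold at most 4 GiB minus one byte. The sketch below is a standalone illustration, not code from the project; countBytes is a hypothetical helper that keeps a uint32 counter next to a uint64 one to show where they diverge.

```go
package main

import "fmt"

// countBytes mimics the size accounting for `chunks` messages of `chunkSize`
// bytes each. narrow uses uint32 like the handler; wide uses uint64.
func countBytes(chunkSize, chunks int) (narrow uint32, wide uint64) {
	for i := 0; i < chunks; i++ {
		narrow += uint32(chunkSize) // wraps around at 2^32
		wide += uint64(chunkSize)
	}
	return narrow, wide
}

func main() {
	// 5 GiB sent as 5120 chunks of 1 MiB each.
	narrow, wide := countBytes(1<<20, 5120)
	fmt.Println(narrow, wide) // prints "1073741824 5368709120"
}
```

The file on disk is unaffected; only the reported size wraps. Switching the field and the counter to uint64 would presumably avoid this, at the cost of regenerating the Go code.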
I know that calling log.Fatal in SetFile isn't a good idea because it stops the whole server, so don't do that in your production apps; return the error instead.
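Returning an error lets the handler reject one bad upload while the server keeps running. Here is a minimal sketch of that idea for the path-building step of SetFile. The helper name safeFilePath is hypothetical and not part of the project. It also reduces the client-supplied name to its base name, since the name arrives from the network and is joined with the storage directory.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// safeFilePath is a hypothetical helper: it builds the destination path for
// an upload and reports bad names as errors instead of exiting the process.
func safeFilePath(storageDir, clientName string) (string, error) {
	// Drop any directory components the client may have sent.
	base := filepath.Base(clientName)
	// filepath.Base returns "." for an empty name and the separator for a root path.
	if base == "." || base == ".." || base == string(filepath.Separator) {
		return "", fmt.Errorf("invalid file name %q", clientName)
	}
	return filepath.Join(storageDir, base), nil
}

func main() {
	p, err := safeFilePath("storage", "../../etc/passwd")
	fmt.Println(p, err)

	_, err = safeFilePath("storage", "")
	fmt.Println(err)
}
```

The first call stays inside the storage directory and the second one returns an error, which the handler could turn into a gRPC status for that single request.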
Now the server side is complete. I didn't include all of the code here, so you can check the rest on my GitHub.
Client side
Our client app is a simple CLI with two required options: the gRPC server address and the path of the file to upload.
For the CLI I chose the cobra framework, just because it's simple to use and shows that I know it =) It is overkill for an app with two parameters, though.
An example of the client app usage:
./grpc-filetransfer-client -a=':9000' -f=8GB.bin
Let's write the client's upload logic. The app should connect to the server, upload the file, and close the connection once the transfer is done.
type ClientService struct {
	addr      string
	filePath  string
	batchSize int
	client    uploadpb.FileServiceClient
}
The client reads the file in chunks of batchSize bytes and sends each chunk to the gRPC stream.
func (s *ClientService) upload(ctx context.Context, cancel context.CancelFunc) error {
	stream, err := s.client.Upload(ctx)
	if err != nil {
		return err
	}
	file, err := os.Open(s.filePath)
	if err != nil {
		return err
	}
	// Release the file handle on every return path.
	defer file.Close()
	// Send only the base name (needs the path/filepath import); the server decides where to store the file.
	fileName := filepath.Base(s.filePath)
	buf := make([]byte, s.batchSize)
	batchNumber := 1
	for {
		num, err := file.Read(buf)
		if err == io.EOF {
			break
		}
		if err != nil {
			return err
		}
		chunk := buf[:num]
		if err := stream.Send(&uploadpb.FileUploadRequest{FileName: fileName, Chunk: chunk}); err != nil {
			return err
		}
		log.Printf("Sent - batch #%v - size - %v\n", batchNumber, len(chunk))
		batchNumber++
	}
	// Close the sending side and wait for the server's single response.
	res, err := stream.CloseAndRecv()
	if err != nil {
		return err
	}
	log.Printf("Sent - %v bytes - %s\n", res.GetSize(), res.GetFileName())
	cancel()
	return nil
}
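The chunking loop itself does not depend on gRPC, so it can be exercised in isolation. The sketch below is an illustration under that assumption: sendInChunks and chunkSizes are hypothetical names, the send callback stands in for stream.Send, and an in-memory reader replaces the file.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// sendInChunks reads r in pieces of at most batchSize bytes and hands each
// piece to send, a stand-in for stream.Send. The buffer is reused between
// iterations, so send must not keep a reference to the slice.
func sendInChunks(r io.Reader, batchSize int, send func([]byte) error) error {
	if batchSize <= 0 {
		return errors.New("batch size must be positive")
	}
	buf := make([]byte, batchSize)
	for {
		n, err := r.Read(buf)
		if n > 0 {
			if sendErr := send(buf[:n]); sendErr != nil {
				return sendErr
			}
		}
		if errors.Is(err, io.EOF) {
			return nil
		}
		if err != nil {
			return err
		}
	}
}

// chunkSizes runs sendInChunks over an in-memory payload and records the
// size of every chunk that would have been sent.
func chunkSizes(data []byte, batchSize int) ([]int, error) {
	var sizes []int
	err := sendInChunks(bytes.NewReader(data), batchSize, func(chunk []byte) error {
		sizes = append(sizes, len(chunk))
		return nil
	})
	return sizes, err
}

func main() {
	sizes, err := chunkSizes(make([]byte, 10), 4)
	fmt.Println(sizes, err) // prints "[4 4 2] <nil>"
}
```

The last chunk is shorter than batchSize whenever the payload length is not a multiple of it, which is why the loop slices the buffer to the number of bytes actually read.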
Conclusion
We wrote a client-server gRPC file transfer app that can upload a file of any size. The upload speed depends on the batch size, so experiment to find the best value for your setup.
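To get a feeling for the trade-off, it helps to see how many stream messages a given batch size produces. The sketch below is plain arithmetic with a hypothetical helper, messageCount; the batch sizes are arbitrary examples, not measured recommendations. Keep in mind that grpc-go limits received messages to 4 MiB by default, so a batch has to stay below that unless the server raises the limit.

```go
package main

import "fmt"

// messageCount returns how many stream messages a file of fileSize bytes
// produces when it is split into chunks of batchSize bytes (ceiling division).
func messageCount(fileSize, batchSize int64) int64 {
	if fileSize <= 0 || batchSize <= 0 {
		return 0
	}
	return (fileSize + batchSize - 1) / batchSize
}

func main() {
	const fileSize = 8 << 30 // 8 GiB, similar to the 8GB.bin example
	for _, batch := range []int64{64 << 10, 1 << 20, 3 << 20} {
		fmt.Printf("batch %d bytes -> %d messages\n", batch, messageCount(fileSize, batch))
	}
}
```

Fewer, larger messages mean less per-message overhead, while smaller ones keep memory use per request low; where the sweet spot lies is something to measure rather than assume.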
The full code is on my GitHub.
I used some code from these tutorials:
- Upload file in chunks with client-streaming gRPC - Go
- Transferring files with gRPC client-side streams using Golang
Got cover image from that place.
This is my first article, so any comments are welcome.