GitHub Repository: Streaming_API
In the project I developed, we return large response payloads to our customers' API calls. Processing that much data takes a long time: memory usage balloons and server CPU load climbs. To prevent this, we started using a streaming approach.
I ran an analysis to show what this change bought us, comparing a normal (standard) response with a streamed response to the same call.
Working environment
- Python 3.6
- aiohttp
- PostgreSQL (SQLAlchemy)
- a 500,000-row data set
(Taken from the GitHub README.)
How to install
git clone git@github.com:tolgahanuzun/Streaming_API.git
virtualenv -p python3 venv
source venv/bin/activate
cd Streaming_API
pip install -r requirement.txt
vim settings.py
Then edit the settings file to match your own environment.
How to run
Serve
python run.py
Client
- The API requires a JWT; you can remove that requirement if you want.
curl -H 'Accept: text/plain' -H "Authorization: JWT eyJ0eX....." -v http://0.0.0.0:8080/standard
curl -H 'Accept: text/plain' -H "Authorization: JWT eyJ0eX....." -v http://0.0.0.0:8080/chunked
Process analysis
With a chunked response, the data reaches you piece by piece. With the standard response, all the data arrives at once.
Sample of the incoming data:
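As background, HTTP chunked transfer encoding frames each piece with its size in hexadecimal (RFC 7230, section 4.1). A minimal sketch of that framing, separate from the project code:

```python
def encode_chunk(data: bytes) -> bytes:
    # One chunk: size in hex, CRLF, payload, CRLF.
    return f"{len(data):x}\r\n".encode("ascii") + data + b"\r\n"

def encode_chunked_body(chunks) -> bytes:
    # A zero-length chunk marks the end of the stream.
    return b"".join(encode_chunk(c) for c in chunks) + b"0\r\n\r\n"
```

This is what travels on the wire when the server answers with `Transfer-Encoding: chunked`; aiohttp produces it for you, so the handlers never build these frames by hand.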
Standard Response
All the data is fetched, processed, and sent in one go, so each stage blocks the next. In this example that means 500,000 rows.
- No data flows for the first 0.16 seconds; everything arrives after that.
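The all-at-once behaviour can be sketched like this (the names are illustrative, not the project's actual code):

```python
def standard_response(rows, process):
    # Every row is processed before a single byte is sent, so the
    # whole payload sits in memory at once and time-to-first-byte
    # grows with the size of the data set.
    return [process(row) for row in rows]
```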
Streaming Response
The data set is the same 500,000 rows, but it is fetched from the database 1,000 rows at a time. Each batch is processed and streamed immediately, without waiting for the rest.
- No data flows for the first 0.04 seconds; batches start arriving after that.
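The batching described above can be sketched as a pure-Python generator (a simplification: the real handler would write each batch to the aiohttp response, and these names are illustrative):

```python
def stream_in_batches(rows, process, batch_size=1000):
    # Yield processed rows in fixed-size batches: the first batch is
    # ready after only `batch_size` rows of work, instead of after
    # the full 500,000.
    batch = []
    for row in rows:
        batch.append(process(row))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly short, batch
        yield batch
```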
Look out! The streaming API finished before the standard API had even started sending data. Because it sends and receives data in pieces, no stage blocks the others.