Introduction
What is Flyte?
- Kubernetes-native workflow automation platform
- Open-source
- Makes it easy to create concurrent, scalable, and maintainable workflows
- LF AI & Data Incubation Project
- Opinionated, scalable & hosted workflow automation platform
- Extensible, Auditable, Observable
Integrations
Flyte integrates with many tools and platforms, including TensorFlow, Google Cloud, Apache Spark, PyTorch, Hive, and Kubernetes.
You can find the full list of supported integrations in the Flyte documentation.
Trusted by Companies
Flyte is used in production at Lyft, Spotify, Freenome and others.
Setting Up Flyte
Requirements
- Docker
- Python
Ensure that the Docker daemon is running before you continue.
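If you'd like to check the Docker requirement from Python, here is a minimal sketch. It assumes the `docker` CLI is on your PATH and relies on `docker info` exiting with a non-zero status when it cannot reach the daemon. The function name `docker_daemon_running` is made up for this example and is not part of Flyte.

```python
import shutil
import subprocess


def docker_daemon_running(timeout_seconds: float = 10.0) -> bool:
    """Return True if the docker CLI exists and can reach a running daemon."""
    if shutil.which("docker") is None:
        # The Docker CLI is not installed or not on PATH.
        return False
    try:
        # `docker info` talks to the daemon and fails if it is unreachable.
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
    except (subprocess.TimeoutExpired, OSError):
        # Treat a hung or unlaunchable CLI as "not running".
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("Docker daemon running:", docker_daemon_running())
```

If this prints False, start Docker and try again before moving on to the installation steps.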
Installation
Installing Flytekit
pip install flytekit
NOTE: If you are on an M1 Mac, run the following command before installing Flytekit so that grpcio is built from source on your computer:
pip install --no-binary :all: grpcio --ignore-installed
Installing FlyteCTL
FlyteCTL is a command-line interface for Flyte
macOS
brew install flyteorg/homebrew-tap/flytectl
Other Operating Systems
curl -L https://raw.githubusercontent.com/flyteorg/flytectl/HEAD/install.sh | bash
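Once both installs finish, a quick way to confirm that the command-line tools are reachable is to look them up on your PATH. The sketch below assumes that `pip install flytekit` placed the `pyflyte` command on your PATH and that the FlyteCTL binary is named `flytectl`. The helper name `find_tools` is just for illustration.

```python
import shutil
from typing import Dict, Iterable, Optional


def find_tools(names: Iterable[str] = ("pyflyte", "flytectl")) -> Dict[str, Optional[str]]:
    """Map each command name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If a tool shows up as not found, the install script may have placed the binary in a directory that is not on your PATH yet.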
Creating an Example Flyte Script
To check that your setup works, and to have a bit of fun with Flyte, let's create an example script that:
- Generates a dataset of numbers drawn from a normal distribution.
- Computes the mean and standard deviation of those numbers.
Here's the script. Save it in any Python file:
import typing

import pandas as pd
import numpy as np
from flytekit import task, workflow


@task
def generate_normal_df(n: int, mean: float, sigma: float) -> pd.DataFrame:
    return pd.DataFrame({"numbers": np.random.normal(mean, sigma, size=n)})


@task
def compute_stats(df: pd.DataFrame) -> typing.Tuple[float, float]:
    return float(df["numbers"].mean()), float(df["numbers"].std())


@workflow
def wf(n: int = 200, mean: float = 0.0, sigma: float = 1.0) -> typing.Tuple[float, float]:
    return compute_stats(df=generate_normal_df(n=n, mean=mean, sigma=sigma))
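To get a feel for what the two tasks compute, here is a rough plain-Python sketch of the same calculation using only the standard library (no Flyte, pandas, or NumPy). The names `plain_generate_normal` and `plain_compute_stats` and the `seed` parameter are made up for this example. It uses the sample standard deviation, which matches the default of pandas' `std()`.

```python
import random
import statistics
from typing import List, Tuple


def plain_generate_normal(n: int, mean: float, sigma: float, seed: int = 0) -> List[float]:
    """Draw n numbers from a normal distribution (seeded so runs are repeatable)."""
    rng = random.Random(seed)
    return [rng.gauss(mean, sigma) for _ in range(n)]


def plain_compute_stats(numbers: List[float]) -> Tuple[float, float]:
    """Return the mean and the sample standard deviation (n - 1 denominator)."""
    return statistics.fmean(numbers), statistics.stdev(numbers)


if __name__ == "__main__":
    sample_mean, sample_std = plain_compute_stats(plain_generate_normal(500, 42.0, 2.0))
    print(f"mean={sample_mean:.3f} std={sample_std:.3f}")
```

The printed values should land close to the mean and sigma you passed in, which is a handy sanity check for the output of the Flyte workflow too.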
Running Flyte workflows
You can run the workflow in your Python file in a local Python environment or on a Flyte cluster.
Running a workflow using a local Python env
Run this command to execute your newly created workflow in a local Python environment.
NOTE: Replace main.py with the name of your Python file!
pyflyte run main.py wf --n 500 --mean 42 --sigma 2
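Notice that the flags `--n`, `--mean`, and `--sigma` have the same names as the parameters of `wf`. The sketch below is only an analogy for that mapping, built with `argparse` and `inspect` from the standard library. It is not how pyflyte is implemented, and `wf_stub`, `parser_from_signature`, and `run_with_args` are hypothetical names for this example.

```python
import argparse
import inspect
from typing import Callable, List, Tuple


def wf_stub(n: int = 200, mean: float = 0.0, sigma: float = 1.0) -> Tuple[int, float, float]:
    """Hypothetical stand-in with the workflow's signature; it only echoes its inputs."""
    return n, mean, sigma


def parser_from_signature(func: Callable) -> argparse.ArgumentParser:
    """Create one --flag per parameter, typed and defaulted from the signature."""
    parser = argparse.ArgumentParser(prog=func.__name__)
    for name, param in inspect.signature(func).parameters.items():
        parser.add_argument(f"--{name}", type=param.annotation, default=param.default)
    return parser


def run_with_args(func: Callable, argv: List[str]):
    """Parse argv into keyword arguments and call func with them."""
    args = parser_from_signature(func).parse_args(argv)
    return func(**vars(args))


if __name__ == "__main__":
    print(run_with_args(wf_stub, ["--n", "500", "--mean", "42", "--sigma", "2"]))
```

Any flag you leave out falls back to the default in the function signature, which is why the workflow above can also be run without passing any inputs.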
Creating a Demo Flyte Cluster
Run this command to start a demo Flyte cluster on your local system.
flytectl demo start
If you have set up everything correctly, you should receive a message confirming that the demo cluster has started.
Now run the workflow on the cluster using this command:
pyflyte run --remote main.py wf --n 500 --mean 42 --sigma 2
Great! You have successfully set up Flyte on your computer and run a workflow with it.
Conclusion
🎉 Congratulations! In this getting started guide, you:
- 🤓 Learned what Flyte is
- 💻 Set up Flyte on your computer
- 📜 Created a Flyte script
- 🛥 Created a demo Flyte cluster on your local system
- 👟 Ran a workflow locally and on a demo Flyte cluster
Flyte is a great workflow automation tool for data and machine learning processes.
Lastly, don't forget to leave a LIKE and share your feedback in the comments!