After a lot of struggling with this, I finally found a simple way.
If you want to save a model/pipeline and load it again later without hitting a ModuleNotFoundError, make sure the model is built in the same place it gets saved. For a neural network, that means compiling, fitting, and saving in the same module. This has been a big headache for me, so I hope you can avoid it.
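As far as I can tell, the error comes from how pickle (which joblib builds on) works: it stores a reference to the class by module path and name, not the class code itself. Here is a minimal stdlib-only sketch of that failure. The names `my_training_code` and `Scaler` are made up for illustration, and the module is created in memory only so the demo is self-contained.

```python
import pickle
import sys
import types

# Hypothetical module standing in for the file where the model class lives.
# It is created in memory here only to keep the demo self-contained.
MODULE_NAME = "my_training_code"
module = types.ModuleType(MODULE_NAME)
exec(
    "class Scaler:\n"
    "    def __init__(self, factor):\n"
    "        self.factor = factor\n",
    module.__dict__,
)
sys.modules[MODULE_NAME] = module

# Pickle records the module name and class name plus the instance state,
# not the source code of the class.
payload = pickle.dumps(module.Scaler(2.0))

# Loading works while the module can still be imported.
restored = pickle.loads(payload)
print(restored.factor)

# Simulate loading in a process where that module does not exist.
del sys.modules[MODULE_NAME]
try:
    pickle.loads(payload)
    error_name = None
except ModuleNotFoundError as exc:
    error_name = type(exc).__name__
    print(error_name, exc)
```

The second load fails because the module named in the pickled data can no longer be imported. Saved models behave the same way: whatever module path was recorded at save time has to be importable at load time.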
We can write and read sklearn models/pipelines using joblib.
Local Write / Read
    from pathlib import Path

    import joblib

    path = Path("<local path>")

    # WRITE (model is an already-fitted sklearn model/pipeline)
    with path.open("wb") as f:
        joblib.dump(model, f)

    # READ
    with path.open("rb") as f:
        model = joblib.load(f)
We can do the same thing on AWS S3 using a temporary file as an intermediate buffer.
AWS S3 Write / Read
    import tempfile

    import boto3
    import joblib

    s3_client = boto3.client('s3')
    bucket_name = "my-bucket"
    key = "model.pkl"

    # WRITE
    with tempfile.TemporaryFile() as fp:
        joblib.dump(model, fp)
        fp.seek(0)  # rewind so read() starts at the beginning
        s3_client.put_object(Body=fp.read(), Bucket=bucket_name, Key=key)

    # READ
    with tempfile.TemporaryFile() as fp:
        s3_client.download_fileobj(Fileobj=fp, Bucket=bucket_name, Key=key)
        fp.seek(0)  # rewind before handing the file to joblib
        model = joblib.load(fp)

    # DELETE
    s3_client.delete_object(Bucket=bucket_name, Key=key)