SQLAlchemy has been around for a long time now and is one of the best ORMs available. When working with backend frameworks such as Flask or FastAPI, we usually come across this ORM.

There are two approaches to using SQLAlchemy:
- Creating schemas and tables manually using `declarative_base` objects and migrating them.
- Reflecting existing objects in the database using `metadata`.
The issue with the latter approach is that when there are many tables to work with, the initial reflection of all of them takes a long time and increases your application's boot time. I came up with an approach for such situations that lets you reflect existing tables without degrading application performance.
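For contrast, here is a minimal sketch of the eager approach, using an in-memory SQLite database as a stand-in for a real schema: `metadata.reflect()` loads every table up front, which is exactly what inflates boot time on large schemas.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine

engine = create_engine("sqlite://")

# Build a toy schema to stand in for a large production database.
setup = MetaData()
Table("users", setup,
      Column("id", Integer, primary_key=True), Column("name", String))
Table("orders", setup,
      Column("id", Integer, primary_key=True), Column("user_id", Integer))
setup.create_all(engine)

# The eager approach: reflect() introspects *every* table up front.
metadata = MetaData()
metadata.reflect(bind=engine)
print(sorted(metadata.tables))  # → ['orders', 'users']
```

On a schema with hundreds of tables, that single `reflect()` call issues introspection queries for each one, even if your application only ever touches a handful of them.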
The idea is to reflect tables and views lazily, on demand, rather than loading everything at once. We do not need all the tables at once, do we? Since APIs typically query or perform CRUD operations on only a subset of tables, we can skip loading the rest while still keeping already-reflected tables around.
I have created a lazy wrapper for this purpose, which reflects a table only once, on first use, and persists it so that you can reuse the same object going forward.
```python
from sqlalchemy import Table

# CustomTable is assumed to be a Table subclass provided by the package.


class LazyDBProp:
    """Descriptor that returns a SQLAlchemy Table object,
    which can be used to query a table from the schema.
    """

    def __init__(self) -> None:
        self._table = None
        self._name = None

    def __set_name__(self, _, name):
        self._name = name

    def __set__(self, instance, value):
        if isinstance(value, (CustomTable, Table)):
            self._table = value

    def __get__(self, instance, _):
        if self._table is None:
            # Reflect the table from the database on first access only.
            self._table = CustomTable(
                self._name, instance.metadata, autoload=True)
        return self._table
```
This class uses descriptors under the hood to persist table or view objects. I have also created a wrapper that generates a dynamic class to hold these descriptor-based table objects.
```python
from sqlalchemy import MetaData
from sqlalchemy.engine import Engine


def get_lazy_class(engine: Engine) -> object:
    """
    Create a Lazy class for pulling table objects
    using SQLAlchemy metadata.
    """
    def __init__(self, engine: Engine):
        self.metadata = MetaData(engine)
        self.engine = engine

    def __getattr__(self, attr):
        # Only called when normal attribute lookup fails,
        # i.e. the first time a table name is requested.
        if attr not in self.__dict__:
            obj = self.__patch(attr)
            return obj.__get__(self, type(self))

    def __patch(self, attribute):
        # Install a LazyDBProp descriptor on the class so that
        # subsequent accesses bypass __getattr__ entirely.
        obj = LazyDBProp()
        obj.__set_name__(self, attribute)
        setattr(type(self), attribute, obj)
        return obj

    # Naming classes uniquely for different schemas
    # to avoid cross-referencing.
    LazyClass = type(f"LazyClass_{engine.url.database}", (), {})
    LazyClass.__init__ = __init__
    LazyClass.__getattr__ = __getattr__
    LazyClass.__patch = __patch
    return LazyClass(engine)
```
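Putting the two pieces together, here is a self-contained end-to-end sketch, assuming an in-memory SQLite database and using a plain `Table` with `autoload_with` (SQLAlchemy 1.4+) in place of the package's `CustomTable`:

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine
from sqlalchemy.engine import Engine


class LazyDBProp:
    def __init__(self):
        self._table = None
        self._name = None

    def __set_name__(self, _, name):
        self._name = name

    def __get__(self, instance, _):
        if self._table is None:
            # autoload_with reflects the table on first access only.
            self._table = Table(self._name, instance.metadata,
                                autoload_with=instance.engine)
        return self._table


def get_lazy_class(engine: Engine):
    def __init__(self, engine):
        self.metadata = MetaData()
        self.engine = engine

    def __getattr__(self, attr):
        # First access: install a descriptor on the class, then use it.
        prop = LazyDBProp()
        prop.__set_name__(self, attr)
        setattr(type(self), attr, prop)
        return prop.__get__(self, type(self))

    LazyClass = type(f"LazyClass_{engine.url.database}", (), {})
    LazyClass.__init__ = __init__
    LazyClass.__getattr__ = __getattr__
    return LazyClass(engine)


# Stand-in schema: one table in an in-memory SQLite database.
engine = create_engine("sqlite://")
setup = MetaData()
Table("users", setup, Column("id", Integer, primary_key=True),
      Column("name", String))
setup.create_all(engine)

lazy_db = get_lazy_class(engine)
users = lazy_db.users          # reflected here, on first access
print(users.c.keys())          # → ['id', 'name']
print(lazy_db.users is users)  # → True (served from the descriptor now)
```

The second access returns the identical `Table` object without touching the database, which is the persistence behaviour described above.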
The above class can be used simply as follows:

```python
from sqlalchemy import create_engine
from sqlalchemy.orm import Session

from lazy_alchemy import get_lazy_class

db_engine = create_engine(DB_CONNECT_STRING)
lazy_db = get_lazy_class(db_engine)

session = Session(db_engine)
db_model = lazy_db.my_db_table_foo  # reflected on first access
# Columns of a reflected Table are accessed via the .c collection.
query = session.query(db_model).filter(db_model.c.foo == "bar").all()
```
Once reflected, these objects can be referenced repeatedly. Reflecting only the required objects improves application performance with minimal overhead. This cut my application's boot time from more than a minute to a couple of seconds :).
If you want to use this approach in your project, you can simply install my PyPI package, Lazy Alchemy.

I would be happy to hear your views on, and alternatives to, this approach. Thank you for reading; I hope you found some helpful tips.