Not so long ago Microsoft suffered an outage with their package repo over at https://packages.microsoft.com. The short version is that it looks as though the
pools directory was removed, forcing Microsoft to synchronize everything back out. The outage also pushed traffic to other endpoints, which added bottlenecks, and the result was a lot of pain for around 26 hours.
Whilst this affected a lot of devs across a variety of projects, the impact I felt most was on Databricks environments which pull in the Linux ODBC driver for Microsoft SQL Server. The standard init script for doing this installs the packages at the point the cluster is created. If the repo isn't available then the init script takes longer to run and eventually fails, but the cluster still starts; anything which needs the driver is simply broken.
I wanted to come up with a way to try and prevent this happening again. Whilst it's unlikely to happen again, adding some resiliency is never a bad thing.
So I created a Databricks notebook which downloads the package and its dependencies from the Microsoft repo to the cluster. It downloads each required package and, only once they have all completed successfully, swaps out the current set of files for the new ones. If any of the files can't be downloaded for any reason then the existing files are left intact. The init script then installs the driver from these downloaded files instead of from the repo. In the event of another outage the local files are not replaced and the cluster will start up as normal with the previously downloaded packages.
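The all-or-nothing download step described above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the function name `download_all`, the `staging_dir` argument and the injectable `fetch` parameter are all hypothetical, and the package URLs would need to come from wherever the notebook keeps its package list.

```python
import shutil
from pathlib import Path
from urllib.request import urlretrieve


def download_all(urls, staging_dir, fetch=urlretrieve):
    """Download every URL into staging_dir; return True only if all succeed.

    `fetch` is a hypothetical hook (defaulting to urllib's urlretrieve) so the
    function can be exercised without touching the network.
    """
    staging_dir = Path(staging_dir)
    # Start from an empty staging directory so stale files never leak through.
    shutil.rmtree(staging_dir, ignore_errors=True)
    staging_dir.mkdir(parents=True)
    try:
        for url in urls:
            # Name each file after the last path segment of its URL.
            fetch(url, str(staging_dir / url.rsplit("/", 1)[-1]))
    except Exception:
        # One failure discards the whole batch; the live files are never touched.
        shutil.rmtree(staging_dir, ignore_errors=True)
        return False
    return True
```

Because nothing outside the staging directory is modified, a `False` result simply means the cluster keeps using whatever packages were downloaded last time.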
Each time the packages are successfully downloaded the script updates the init script, just in case any more packages have been added or the file locations have been changed.
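The swap at the end of the download looks like this:

```python
import shutil

# previous, current and new are assumed to be pathlib.Path objects
# pointing at the three package directories.
try:
    # Download packages
    if previous.exists():
        shutil.rmtree(previous)
    shutil.move(current, previous)
    shutil.move(new, current)
except Exception:
    # Remove any newly downloaded files
    shutil.rmtree(new, ignore_errors=True)
```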
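Regenerating the init script after each successful download could look something like the sketch below. It is only an illustration under several assumptions: `build_init_script` is a hypothetical helper, the packages are assumed to be `.deb` files sitting in a single directory, and installing them with `dpkg -i` is my guess at a reasonable approach instead of a description of what the notebook does. Microsoft's install instructions set `ACCEPT_EULA=Y` when installing the driver with `apt-get`; passing it to `dpkg` here is likewise an assumption.

```python
import shlex
from pathlib import Path


def build_init_script(package_dir):
    """Return a bash init script that installs every .deb found in package_dir."""
    debs = sorted(Path(package_dir).glob("*.deb"))
    if not debs:
        raise ValueError(f"no .deb files found in {package_dir}")
    # Quote each path so unusual file names cannot break the command line.
    quoted = " ".join(shlex.quote(str(p)) for p in debs)
    return "\n".join([
        "#!/bin/bash",
        "set -e",
        "# Install from the local copies; no call to the package repo is made.",
        f"ACCEPT_EULA=Y dpkg -i {quoted}",  # EULA variable: an assumption, see above
        "",
    ])
```

Building the script from the files actually present means that newly added dependencies or changed file names are picked up automatically. On Databricks the resulting string could then be written to the init script location, for example with `dbutils.fs.put`.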
The notebook is available on GitHub, including as a
dbc file. Hopefully someone else finds it useful.