MLOps Community

Monitoring Unstructured Data // Aparna Dhinakaran & Jason Lopatecki // Lightning Sessions #2

Lightning Sessions #2 with Aparna Dhinakaran, Co-Founder and Chief Product Officer, and Jason Lopatecki, CEO and Co-Founder of Arize. Lightning Sessions is sponsored by Arize.

// Abstract  
Let's be honest: monitoring embeddings on unstructured data is no easy feat. Most of us know what embeddings are but don't understand them one hundred percent.

Thanks to Aparna and Jason of Arize for breaking down embeddings so clearly. At the end of this Lightning talk, we get to see a demo of how Arize deals with unstructured data and how you can use Arize to tackle these monitoring challenges.

// Bio
Aparna Dhinakaran
Aparna is the Co-Founder and Chief Product Officer at Arize AI, a pioneer and early leader in machine learning (ML) observability. A frequent speaker at top conferences and thought leader in the space, Dhinakaran was recently named to the Forbes 30 Under 30. Before Arize, Dhinakaran was an ML engineer and leader at Uber, Apple, and TubeMogul (acquired by Adobe). During her time at Uber, she built several core ML infrastructure platforms, including Michelangelo.

Aparna has a BA from Berkeley's Electrical Engineering and Computer Science program, where she published research with Berkeley's AI Research group. She is on a leave of absence from the Computer Vision Ph.D. program at Cornell University.

Jason Lopatecki
Jason is the Co-Founder and CEO of Arize AI, a machine learning observability company. He is a garage-to-IPO executive with an extensive background in building market-leading products and businesses that heavily leverage analytics. Prior to Arize, Jason was co-founder and chief innovation officer at TubeMogul, where he scaled the business into a public company that was eventually acquired by Adobe.

Jason has hands-on knowledge of big data architectures, programmatic advertising systems, distributed systems, and machine learning and data processing architectures. In his free time, Jason tinkers with personal machine learning projects as a hobby, with a special interest in unsupervised learning and deep neural networks. He holds an electrical engineering and computer science degree from UC Berkeley.

// MLOps Jobs board  
https://mlops.pallet.xyz/jobs

// Related Links
arize.com

--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/

Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Aparna on LinkedIn: https://www.linkedin.com/in/aparnadhinakaran/
Connect with Jason on LinkedIn: https://www.linkedin.com/in/jason-lopatecki-9509941/

Timestamps:
[00:00] Introduction to the topic
[01:13] Troubleshooting unstructured ML models is difficult
[01:40] Challenges with monitoring unstructured data
[02:10] What the data looks like
[03:02] Embeddings are the backbone of unstructured models
[03:28] ML teams need a common tool
[04:06] What are embeddings?
[05:08] The real WHY behind AI
[06:41] ML observability for unstructured data
[07:08] Index and monitor every embedding
[08:05] Measuring drift of unstructured data
[08:54] Interactive visualizations  
[09:34] Fix the underlying data issue
[09:44] Data-centric AI workflow
[10:08] Demo of the product
[12:48] Wrap up
