This is an anonymous post sent in by a member who does not want their name disclosed. Please be thoughtful with your responses, as these are usually tough posts to write. Email email@example.com if you'd like to leave an anonymous comment.
Some background for reference:
I'm a longtime "ops" guy who codes web apps for fun. I spent a large part of my career as manager of an ops team that managed apps, but not hosts. We partnered with infra to set up servers to our specs, mount storage, and open ports. We were given credentials to provision the machine. We handled the day-to-day support operations for the applications. I was lucky enough to be a part of a grassroots movement to implement DevOps before it became a buzzword.
From there I did some consulting and bounced around for a few years. Then I was offered the opportunity to manage a Site Reliability Engineering team. The term was new to me, but after doing some research, I realized the work itself wasn't.
A couple of years in, there is one blazing truth: we are not an SRE team. I work for a large company with a microservice architecture, around 300 services in total. Ops teams are not integrated with development teams. I'm now trying to re-tool my team to better fit the company's needs.
My proposal is that our team of eight works with product owners to understand the portfolio of applications under their domain. We would help make sure things like graceful degradation and observability are considered alongside functional features. We would help build monitors and alerts, and work closely with our 24/7 NOC. We would build and manage alerting and incident management tools for our NOC, act as level 2 support, and serve as incident managers for severe incidents. We would also be responsible for unplanned production changes, like feature flags, cache flushes, and ad hoc batch jobs.
We are not positioned to take part in things like infrastructure architecture or server management. We are more dev-facing than infra ops. There is a need there, and a great opportunity. I see a lot of similarities to the team I used to manage, described in the first paragraph up there. Is anyone aware of a team structure or concept similar to this? I'm looking for examples or resources to help set direction.
TL;DR: Are there any devs who are aware of a team that "manages" and supports their apps in production, but isn't a traditional operations team?