Discussion on: Why SREs Should be Responsible for Development Environments


The development environment must be managed centrally by the DevOps team responsible for it.

I wouldn't recommend this, as you're creating a barrier between the development team and the tools that are essential for doing their job. The greater the separation between the people involved in different parts of the process, the longer the feedback loops, the weaker the shared understanding, and the more often other teams will block yours.

For example: "We need to add a new service so we can consume things from this queue. Okay, but we need to wait for the SRE team to create an environment for us". This is now an impediment to the team, and it increases the chance that they will skip creating a separate service and settle for a less robust option because it's easier.

So, what's a better option? Let's remind ourselves of the three ways of DevOps:

First Way: Work always flows in one direction – downstream
Second Way: Create, shorten and amplify feedback loops
Third Way: Continued experimentation, to learn from mistakes, and achieve mastery

freshservice.com/itsm/phoenix-proj...

Returning to the earlier example, what is the best way to shorten feedback loops, ensure work flows in one direction and enable experimentation?

Don't have a separate "DevOps" team. Embed individuals with SRE/DevOps skills into your teams so that those teams are capable of delivering end-to-end solutions themselves.

Stephen Leyva (He/Him) • Aug 2 '20

I’ve found embedding SREs in teams has trade-offs as well. One is that knowledge sharing across teams becomes difficult (especially if there are a lot of teams), and you end up with a hundred different ways of solving the same problems, one per team.

Dedicating teams to build layers of abstraction on top of common tooling kind of gives the best of both worlds, as long as the abstractions are clean. This way, you’re building tools for developers who may not have deep ops expertise. The developers still have to be familiar with your abstractions, but not the implementation.

At scale, in my experience, the embedded model starts to break apart if you silo teams off and one guy becomes the “DevOps guy”.

Just my perspective: it’s ok to have an ops team, as long as it takes a different approach and solves problems through automation, being proactive as opposed to reactive. You can still follow the three ways:

System thinking: Viewing yourself as a stage of the software pipeline. Are you facilitating velocity or becoming a bottleneck?
Shorten feedback loops: release early and often, dogfood (I hate this term :D) your own tooling where possible, and constantly collaborate and talk with development teams.
Continual improvement: learn from your developers as they learn from you :)

This is just to show there are different implementations of the DevOps philosophy, each with its own trade-offs. Just my humble opinion based on my experience :)

Stephen Leyva (He/Him) • Aug 2 '20

Example: The ops team owns environment creation but builds a service on top of their infrastructure to make environment creation self-service.

You increase velocity while still owning your problem domain with your specific expertise.
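A minimal sketch of what such a self-service layer might look like, purely for illustration. The `EnvironmentService` class, the template names, and the resource fields are all hypothetical and not taken from any real tool. The ops team would own the templates and the provisioning behind `create`, while a development team calls it without waiting on a ticket.

```python
from dataclasses import dataclass, field

# Hypothetical templates curated by the ops team (names and fields are assumptions).
TEMPLATES = {
    "queue-consumer": {"cpu": 1, "memory_mb": 512, "needs_queue": True},
    "web-api": {"cpu": 2, "memory_mb": 1024, "needs_queue": False},
}


@dataclass
class EnvironmentService:
    """Stand-in for a self-service layer owned by the ops team."""

    environments: dict = field(default_factory=dict)

    def create(self, team: str, service: str, template: str) -> dict:
        # Developers pick from ops-approved templates instead of raw infrastructure.
        if template not in TEMPLATES:
            raise ValueError(f"unknown template: {template}")
        name = f"{team}-{service}"
        if name in self.environments:
            raise ValueError(f"environment already exists: {name}")
        # A real implementation would call the ops team's provisioning tooling here.
        env = {"name": name, "template": template, **TEMPLATES[template]}
        self.environments[name] = env
        return env


# A development team requests an environment for a new queue consumer.
svc = EnvironmentService()
env = svc.create("payments", "invoice-consumer", "queue-consumer")
print(env["name"], env["cpu"], env["memory_mb"])
```

The point of the design is that developers only need to know the template names (the abstraction), not how the environment is actually provisioned (the implementation).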

Simon Bracegirdle • Aug 2 '20

Yeah I completely agree that there are different implementations of the devops philosophy.

Theoretically you could do this with separate ops and dev teams, but in my experience it makes it harder because you don't have that mix of disciplines and the diversity of perspectives that it brings. There are also more hand-offs, as you pass it to ops to run the thing after building.

It doesn't make it impossible, just less conducive in my experience.

At scale, in my experience, the embedded model starts to break apart if you silo teams off and one guy becomes the “DevOps guy”.

Yeah I agree that actually sounds worse. It doesn't sound like DevOps if it's just one person responsible for the "ops" part.

The first way mentions removing impediments; that's not possible if one person is a single point of failure.

The second way mentions feedback loops. Those loops are going to be longer if a single person is blocking the ops part of the process.

The third way mentions continuous learning, which isn't happening if one person is hoarding all of the ops knowledge. It's also not happening if teams are siloed and not sharing their learning with other teams.

For it to be effective, it has to be an organisation-wide change, not something that a single person or team can do in isolation.