- Evaluate:
- Internal and external customer needs
- Threats to the business (liabilities, info security)
- Impact of risks and tradeoffs between approaches
- Understand:
- Team members' roles in supporting the workload
- Support required to achieve business outcomes
- Team members' roles in the success of other teams (and vice versa)
- Responsibility, ownership, how decisions are made, and who has authority to make decisions
- Ensure:
- There are identified owners for each application, workload, platform, and infrastructure component
- Each process and procedure has an identified owner responsible for its definition, and owners responsible for their performance
- Team members have the resources to be successful and scale to support your business outcomes
- Define:
- Guidelines or obligations based on organizational governance and external factors, such as regulatory compliance requirements and industry standards
- Responsibilities of team members
- Agreements between teams describing how they work together to support each other and your business outcomes
- Ask:
- How do you determine what your priorities are?
- How do you structure your organization to support your business outcomes?
- How does your organizational culture support your business outcomes?
- Design your workload to provide information necessary to understand its internal state
- Capture a broad set of information to enable situational awareness
- Adopt approaches that improve the flow of changes into production and that enable refactoring, fast feedback on quality, and bug fixing
- Adopt approaches that provide fast feedback on quality and enable rapid recovery from changes that do not have desired outcomes
- Plan for unsuccessful changes so that you can respond faster if necessary, and test and validate the changes you make
- Evaluate the operational readiness of your workload, processes, procedures, and personnel to understand the operational risks related to your workload
- Ask:
- How do you design your workload so that you can understand its state?
- How do you reduce defects, ease remediation, and improve flow into production?
- How do you mitigate deployment risks?
- How do you know that you are ready to support a workload?
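One way to make a workload report its internal state is to emit structured, machine-readable log events. The following is a minimal sketch using only the Python standard library. The `JsonFormatter` class, the `build_logger` helper, the `orders` logger name, and the `context` field convention are all hypothetical names chosen for illustration, not part of the framework.

```python
import json
import logging
import time


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": time.strftime(
                "%Y-%m-%dT%H:%M:%SZ", time.gmtime(record.created)
            ),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Assumed convention: callers pass extra={"context": {...}} to attach
        # workload state such as queue depth or request identifiers.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload, sort_keys=True)


def build_logger(name: str = "orders") -> logging.Logger:
    """Return a logger that writes JSON events to stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger()
    log.info(
        "order accepted",
        extra={"context": {"order_id": "example-1", "queue_depth": 7}},
    )
```

Because every event is a flat JSON object, a log aggregator can filter and chart fields such as `queue_depth` without parsing free text, which supports the situational awareness goal above.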
- Define expected outcomes
- Identify metrics to measure success
- Establish metrics baselines for improvement, investigation, and intervention
- Use established runbooks for well-understood events, and use playbooks to aid in investigation and resolution of issues
- Communicate operational status of workloads through dashboards and notifications tailored to the target audience
- Develop scripted responses to well-understood events and automate their execution when the event is recognized
- Ask:
- How do you understand the health of your workload?
- How do you understand the health of your operations?
- How do you manage workload and operations events?
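The baseline and scripted-response ideas above can be sketched together. This example assumes a simple statistical baseline (mean and sample standard deviation of recent values) and treats two and three standard deviations as the investigation and intervention thresholds. Those thresholds, the `classify` and `handle` functions, the `RUNBOOKS` registry, and the example handlers are illustrative assumptions, not prescribed by the framework.

```python
from statistics import mean, stdev
from typing import Callable, Dict, List


def classify(
    history: List[float],
    value: float,
    investigate_sigma: float = 2.0,
    intervene_sigma: float = 3.0,
) -> str:
    """Compare a new metric value with a baseline built from history."""
    baseline = mean(history)
    spread = stdev(history)  # requires at least two historical samples
    if spread == 0:
        return "ok" if value == baseline else "intervene"
    deviation = abs(value - baseline) / spread
    if deviation >= intervene_sigma:
        return "intervene"
    if deviation >= investigate_sigma:
        return "investigate"
    return "ok"


# Hypothetical registry mapping a status to a scripted response.
RUNBOOKS: Dict[str, Callable[[str, float], str]] = {}


def runbook(status: str):
    """Decorator that registers a scripted response for a status."""
    def register(func: Callable[[str, float], str]):
        RUNBOOKS[status] = func
        return func
    return register


@runbook("investigate")
def open_ticket(metric: str, value: float) -> str:
    # Placeholder action; a real handler would call a ticketing system.
    return f"ticket opened: {metric}={value}"


@runbook("intervene")
def restart_workers(metric: str, value: float) -> str:
    # Placeholder action; a real handler would trigger remediation.
    return f"restart requested: {metric}={value}"


def handle(metric: str, history: List[float], value: float) -> str:
    """Classify the value and run the matching scripted response, if any."""
    action = RUNBOOKS.get(classify(history, value))
    return action(metric, value) if action else "no action"


if __name__ == "__main__":
    latency_history = [100.0, 102.0, 98.0, 101.0, 99.0]
    for sample in (101.0, 104.0, 110.0):
        print(sample, handle("latency_ms", latency_history, sample))
```

Keeping the classification separate from the registered handlers means thresholds can be tuned without touching the response scripts, and new responses can be added without changing the detection logic.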
- Dedicate work cycles to making continuous incremental improvements
- Perform post-incident analysis of all customer-impacting events
- Identify the contributing factors and preventative action to limit or prevent recurrence
- Communicate contributing factors with affected communities as appropriate
- Ask:
- How do you evolve operations?