
Sean Atukorala

Key Points from Software Engineering At Google By Titus Winters, Tom Manshreck & Hyrum Wright

Here are the key points worthy of mention from each chapter in the great and popular book Software Engineering At Google by Titus Winters, Tom Manshreck & Hyrum Wright.

Figure 1: Software Engineering At Google Book Cover

Key ideas from the Preface

  • Software engineering encompasses not just the act of writing code but all of the tools and processes an organization uses to build and maintain that code over time
  • The three fundamental principles that software organizations should keep in mind when designing, architecting and writing their code are:
    • Time and Change: How code will need to adapt over the length of its life
    • Scale and Growth: How an organization will need to adapt as it evolves
    • Trade-offs and Costs: How an organization makes decisions, based on the lessons of Time and Change and Scale and Growth
  • The main aspects of Google's software engineering landscape are as follows:
    • Culture
    • Processes
    • Tools
  • What this book isn't:
    • This book doesn't cover software design. That subject would require its own separate book.

Key ideas for Chapter 1: What is Software Engineering?

  • "Software engineering" differs from "programming" in dimensionality: programming is about producing code. Software engineering extends that to include the maintenance of that code for its useful life span
  • There is a factor of at least 100,000 times between the life spans of short-lived code and long-lived code. It is silly to assume that the same best practices apply universally on both ends of that spectrum
  • Software is sustainable when, for the expected life span of the code, we are capable of responding to changes in dependencies, technology, or product requirements. We may choose to not change things, but we need to be capable
  • Hyrum's Law: with a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviours of your system will be depended on by somebody
  • Every task your organization has to do repeatedly should be scalable (linear or better) in terms of human input. Policies are a wonderful tool for making processes scalable
  • Process inefficiencies and other software-development tasks tend to scale up slowly. Be careful about boiled-frog problems
  • Expertise pays off particularly well when combined with economies of scale
  • "Because I said so" is a terrible reason to do things
  • Being data driven is a good start, but in reality, most decisions are based on a mix of data, assumption, precedent, and argument. It's best when objective data makes up the majority of those inputs, but it can rarely be all of them
  • Being data driven over time implies the need to change directions when the data changes (or when assumptions are dispelled). Mistakes or revised plans are inevitable
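
Hyrum's Law is easier to appreciate with a tiny example. The sketch below is my own illustration, not something from the book: the function names and the sample data are made up. The documented contract promises no ordering, but the first implementation happens to return a sorted list, and a caller quietly starts depending on that.

```python
def list_usernames_v1(users):
    """Contract: returns all usernames, in no particular order."""
    # Happens to sort: observable behaviour that the contract never promised.
    return sorted(users)


def list_usernames_v2(users):
    """Same contract, new implementation that keeps insertion order."""
    return list(users)


def first_alphabetically(list_fn, users):
    # A caller that wrongly relies on the unpromised ordering.
    return list_fn(users)[0]


users = ["zoe", "adam", "mia"]
before = first_alphabetically(list_usernames_v1, users)  # "adam"
after = first_alphabetically(list_usernames_v2, users)   # "zoe"
```

Both versions honour the written contract, yet swapping one for the other changes what the caller sees. That is the sense in which every observable behaviour ends up being depended on by somebody.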

Key ideas for Chapter 2: How to Work Well on Teams

  • Be aware of the trade-offs of working in isolation
  • Acknowledge the amount of time that you and your team spend communicating and in interpersonal conflict. A small investment in understanding personalities and working styles of yourself and others can go a long way toward improving productivity
  • If you want to work effectively with a team or a large organization, be aware of your preferred working style and that of others

Key ideas for Chapter 3: Knowledge Sharing

  • Psychological safety is the foundation for fostering a knowledge-sharing environment
  • Start small: ask questions and write things down
  • Make it easy for people to get the help they need from both human experts and documented references
  • At a systemic level, encourage and reward those who take time to teach and broaden their expertise beyond just themselves, their team, or their organization
  • There is no silver bullet: empowering a knowledge-sharing culture requires a combination of multiple strategies, and the exact mix that works best for your organization will likely change over time

Key ideas for Chapter 4: Engineering for Equity

  • Bias is the default
  • Diversity is necessary to design properly for a comprehensive user base
  • Inclusivity is critical not just to improving the hiring pipeline for underrepresented groups, but to providing a truly supportive work environment for all people
  • Product velocity must be evaluated against providing a product that is truly useful to all users. It's better to slow down than to release a product that might cause harm to some users

Key ideas for Chapter 5: How to Lead a Team

  • Don't "manage" in the traditional sense; focus on leadership, influence, and serving your team
  • Delegate where possible; don't DIY (Do It Yourself)
  • Pay particular attention to the focus, direction, and velocity of your team

Key ideas for Chapter 6: Leading at Scale

  • Always Be Deciding: Ambiguous problems have no magic answer; they're all about finding the right trade-offs of the moment, and iterating
  • Always Be Leaving: Your job, as a leader, is to build an organization that automatically solves a class of ambiguous problems - over time - without you needing to be present
  • Always Be Scaling: Success generates more responsibility over time, and you must proactively manage the scaling of this work in order to protect your scarce resources of personal time, attention, and energy

Key ideas for Chapter 7: Measuring Engineering Productivity

  • Before measuring productivity, ask whether the result is actionable, regardless of whether the result is positive or negative. If you can't do anything with the result, it is likely not worth measuring
  • Select meaningful metrics using the GSM framework. A good metric is a reasonable proxy to the signal you're trying to measure, and it is traceable back to your original goals
  • Select metrics that cover all parts of productivity (QUANTS). By doing this, you ensure that you aren't improving one aspect of productivity (like developer velocity) at the cost of another (like code quality)
  • Qualitative metrics are metrics, too! Consider having a survey mechanism for tracking longitudinal metrics about engineers' beliefs. Qualitative metrics should also align with the quantitative metrics; if they do not, it is likely the quantitative metrics that are incorrect
  • Aim to create recommendations that are built into the developer workflow and incentive structures. Even though it is sometimes necessary to recommend additional training or documentation, change is more likely to occur if it is built into the developer's daily habits

Key ideas for Chapter 8: Style Guides and Rules

  • Rules and guidance should aim to support resilience to time and scaling
  • Know the data so that rules can be adjusted
  • Not everything should be a rule
  • Consistency is key
  • Automate enforcement when possible

Key ideas for Chapter 9: Code Review

  • Code review has many benefits, including ensuring code correctness, comprehension, and consistency across a codebase
  • Always check your assumptions through someone else; optimize for the reader
  • Provide the opportunity for critical feedback while remaining professional
  • Code review is important for knowledge sharing throughout an organization
  • Automation is critical for scaling the process
  • The code review itself provides a historical record

Key ideas for Chapter 10: Documentation

  • Documentation is hugely important over time and scale
  • Documentation changes should leverage the existing developer workflow
  • Keep documents focused on one purpose
  • Write for your audience, not yourself

Key ideas for Chapter 11: Testing Overview

  • Automated testing is foundational to enabling software to change
  • For tests to scale, they must be automated
  • A balanced test suite is necessary for maintaining healthy test coverage
  • "If you liked it, you should have put a test on it"
  • Changing the testing culture in organizations takes time

Key ideas for Chapter 12: Unit Testing

  • Strive for unchanging tests
  • Test via public APIs
  • Test state, not interactions
  • Make your tests complete and concise
  • Test behaviours, not methods
  • Structure tests to emphasize behaviours
  • Name tests after the behaviour being tested
  • Don't put logic in tests
  • Write clear failure messages
  • Follow DAMP over DRY when sharing code for tests
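
To make a few of these points concrete, here is a minimal sketch using Python's built-in `unittest`. The `ShoppingCart` class is a hypothetical example of mine, not code from the book. The tests go through the public API only, check resulting state rather than internal calls, are named after behaviours, contain no logic, and repeat a little setup so that each one reads on its own (DAMP over DRY).

```python
import unittest


class ShoppingCart:
    """Hypothetical system under test."""

    def __init__(self):
        self._prices = {}
        self._quantities = {}

    def add(self, name, unit_price, quantity=1):
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self._prices[name] = unit_price
        self._quantities[name] = self._quantities.get(name, 0) + quantity

    def total(self):
        return sum(self._prices[n] * q for n, q in self._quantities.items())


class ShoppingCartTest(unittest.TestCase):
    def test_empty_cart_has_zero_total(self):
        cart = ShoppingCart()
        self.assertEqual(cart.total(), 0, "an empty cart should cost nothing")

    def test_adding_same_item_twice_accumulates_quantity(self):
        cart = ShoppingCart()
        cart.add("pen", unit_price=2, quantity=1)
        cart.add("pen", unit_price=2, quantity=3)
        self.assertEqual(cart.total(), 8, "4 pens at 2 each should total 8")

    def test_non_positive_quantity_is_rejected(self):
        cart = ShoppingCart()
        with self.assertRaises(ValueError):
            cart.add("pen", unit_price=2, quantity=0)
```

Saved to a file, this can be run with `python -m unittest`. Note that the tests never look at `_prices` or `_quantities`, so the internals of `ShoppingCart` can be rewritten without touching them.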

Key ideas for Chapter 13: Test Doubles

  • A real implementation should be preferred over a test double
  • A fake is often the ideal solution if a real implementation can't be used in a test
  • Overuse of stubbing leads to tests that are unclear and brittle
  • Interaction testing should be avoided when possible: it leads to tests that are brittle because it exposes implementation details of the system under test
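
As a rough illustration of why a fake is often nicer than stubbing, here is a small sketch. All of the names (`InMemoryUserStore`, `UserService`) are hypothetical and of my own choosing. The fake is a lightweight but working implementation of the store API, so the test can assert on the resulting state instead of on which methods were called.

```python
class InMemoryUserStore:
    """Fake: behaves like a real user store, but keeps everything in a dict."""

    def __init__(self):
        self._users = {}

    def save(self, user_id, name):
        self._users[user_id] = name

    def load(self, user_id):
        return self._users.get(user_id)


class UserService:
    """Hypothetical system under test; it only needs save() and load()."""

    def __init__(self, store):
        self._store = store

    def rename(self, user_id, new_name):
        if self._store.load(user_id) is None:
            raise KeyError(user_id)
        self._store.save(user_id, new_name.strip())


# State-based test: check what ended up in the store,
# not how UserService talked to it.
store = InMemoryUserStore()
store.save(1, "ada")
UserService(store).rename(1, "  Ada Lovelace ")
renamed = store.load(1)
```

If `rename` were later changed to skip the `load` call or to batch its writes, this check would still pass as long as the stored name is right. An interaction test that verified "`save` was called exactly once" would not be as forgiving.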

Key ideas for Chapter 14: Larger Testing

  • Larger tests cover things unit tests cannot
  • Large tests are composed of a System Under Test, Data, Action, and Verification
  • A good design includes a test strategy that identifies risks and larger tests that mitigate them
  • Extra effort must be made with larger tests to keep them from creating friction in the developer workflow

Key ideas for Chapter 15: Deprecation

  • Software systems have continuing maintenance costs that should be weighed against the costs of removing them
  • Removing things is often more difficult than building them to begin with because existing users are often using the system beyond its original design
  • Evolving a system in place is usually cheaper than replacing it with a new one, when turndown costs are included
  • It is difficult to honestly evaluate the costs involved in deciding whether to deprecate: aside from the direct maintenance costs involved in keeping the old system around, there are ecosystem costs involved in having multiple similar systems to choose between and that might need to interoperate. The old system might implicitly be a drag on feature development for the new. These ecosystem costs are diffuse and difficult to measure. Deprecation and removal costs are often similarly diffuse

Key ideas for Chapter 16: Version Control and Branch Management

  • Use version control for any software development project larger than "toy project with only one developer that will never be updated"
  • There's an inherent scaling problem when there are choices in "which version of this should I depend upon?"
  • One-Version Rules are surprisingly important for organization efficiency. Removing choices in where to commit or what to depend upon can result in significant simplification
  • In some languages, you might be able to spend some effort to dodge this with technical approaches like shading, separate compilation, linker hiding, and so on. The work to get those approaches working is entirely lost labor - your software engineers aren't producing anything, they're just working around technical debts
  • Previous research (DORA/State of DevOps/Accelerate) has shown that trunk-based development is a predictive factor in high-performing development organizations. Long-lived dev branches are not a good default plan
  • Use whatever version control system makes sense for you. If your organization wants to prioritize separate repositories for separate projects, it's still probably wise for interrepository dependencies to be unpinned/"at head"/"trunk based". There are an increasing number of VCS and build system facilities that allow you to have both small, fine-grained repositories as well as a consistent "virtual" head/trunk notion for the whole organization

Key ideas for Chapter 17: Code Search

  • Helping your developers understand code can be a big boost to engineering productivity. At Google, the key tool for this is Code Search
  • Code Search has additional value as a basis for other tools and as a central, standard place that all documentation and developer tools link to
  • The huge size of the Google codebase made a custom tool - as opposed to, for example, grep or an IDE's indexing - necessary
  • As an interactive tool, Code Search must be fast, allowing a "question and answer" workflow. It is expected to have low latency in every respect: search, browsing, and indexing
  • It will be widely used only if it is trusted, and will be trusted only if it indexes all code, gives all results, and gives the desired results first. However, earlier, less powerful, versions were both useful and used, as long as their limits were understood

Key ideas for Chapter 18: Build Systems and Build Philosophy

  • A fully featured build system is necessary to keep developers productive as an organization scales
  • Power and flexibility come at a cost. Restricting the build system appropriately makes it easier on developers
  • Build systems organized around artifacts tend to scale better and be more reliable than build systems organized around tasks
  • When defining artifacts and dependencies, it's better to aim for fine-grained modules. Fine-grained modules are better able to take advantage of parallelism and incremental builds
  • External dependencies should be versioned explicitly under source control. Relying on "latest" versions is a recipe for disaster and unreproducible builds
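
The benefit of artifact-based builds with fine-grained modules can be sketched in a few lines. This is only a toy model of my own, with hypothetical target names, and not how any real build system is implemented. Targets declare their dependencies, the build order falls out of the dependency graph, and a change only forces a rebuild of the targets that transitively depend on it.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical targets: each maps to the set of targets it depends on.
DEPS = {
    "app": {"ui", "core"},
    "ui": {"core"},
    "core": set(),
    "docs": set(),
}


def build_order(deps):
    """Return targets ordered so that dependencies are built first."""
    return list(TopologicalSorter(deps).static_order())


def affected_targets(deps, changed):
    """Return `changed` plus every target that transitively depends on it."""
    affected = {changed}
    grew = True
    while grew:
        grew = False
        for target, direct in deps.items():
            if target not in affected and direct & affected:
                affected.add(target)
                grew = True
    return affected


order = build_order(DEPS)
rebuild_after_core_change = affected_targets(DEPS, "core")
rebuild_after_docs_change = affected_targets(DEPS, "docs")
```

Changing `docs` rebuilds only `docs`, while changing `core` rebuilds everything that sits on top of it. The finer the modules, the smaller these affected sets tend to be, which is where the incremental-build and parallelism gains come from.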

Key ideas for Chapter 19: Critique: Google's Code Review Tool

  • Trust and communication are core to the code review process. A tool can enhance the experience, but it can't replace them
  • Tight integration with other tools is key to a great code review experience
  • Small workflow optimizations, like the addition of an explicit "attention set", can increase clarity and reduce friction substantially

Key ideas for Chapter 20: Static Analysis

  • Focus on developer happiness. We have invested considerable effort in building feedback channels between analysis users and analysis writers in our tools, and aggressively tune analyses to reduce the number of false positives
  • Make static analysis part of the core developer workflow. The main integration point for static analysis at Google is through code review, where analysis tools provide fixes and involve reviewers. However, we also integrate analyses at additional points (via compiler checks, gating code commits, in IDEs, and when browsing code)
  • Empower users to contribute. We can scale the work we do building and maintaining analysis tools and platforms by leveraging the expertise of domain experts. Developers are continuously adding new analyses and checks that make their lives easier and our codebase better
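
For a feel of how small a useful analysis can be, here is a sketch of a check built on Python's standard `ast` module. It is my own toy example and has nothing to do with Google's internal tooling: it flags bare `except:` clauses, which swallow every exception, and reports the line numbers so that a reviewer or an author can act on them.

```python
import ast


def find_bare_excepts(source):
    """Return the line numbers of `except:` clauses that name no exception type."""
    tree = ast.parse(source)
    return [
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler) and node.type is None
    ]


# Hypothetical code under analysis; it is parsed, never executed.
SAMPLE = """\
try:
    risky()
except:
    pass
try:
    risky()
except ValueError:
    pass
"""

findings = find_bare_excepts(SAMPLE)
```

Only the first handler is reported. Keeping a check this narrow is one way to keep false positives low, which matters a lot if developers are going to trust the results.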

Key ideas for Chapter 21: Dependency Management

  • Prefer source control problems to dependency management problems: if you can get more code from your organization to have better transparency and coordination, those are important simplifications
  • Adding a dependency isn't free for a software engineering project, and the complexity in establishing an "ongoing" trust relationship is challenging. Importing dependencies into your organization needs to be done carefully, with an understanding of the ongoing support costs
  • A dependency is a contract: there is a give and take, both providers and consumers have some rights and responsibilities in that contract. Providers should be clear about what they are trying to promise over time
  • SemVer is a lossy-compression shorthand estimate for "How risky does a human think this change is?". SemVer with a SAT-solver in a package manager takes those estimates and escalates them to function as absolutes. This can result in either overconstraint (dependency hell) or underconstraint (versions that should work together that don't)
  • By comparison, testing and CI provide actual evidence of whether a new set of versions work together
  • Minimum-version update strategies in SemVer/package management are higher fidelity. This still relies on humans being able to assess incremental version risk accurately, but distinctly improves the chance that the link between API provider and consumer has been tested by an expert
  • Unit testing, CI, and (cheap) compute resources have the potential to change our understanding and approach to dependency management. That phase-change requires a fundamental change in how the industry considers the problem of dependency management, and the responsibilities of providers and consumers both
  • Providing a dependency isn't free: "throw it over the wall and forget" can cost you reputation and become a challenge for compatibility. Supporting it with stability can limit your choices and pessimize internal usage. Supporting without stability can cost goodwill or expose you to risk of important external groups depending on something via Hyrum's Law and messing up your "no stability" plan
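
The minimum-version idea mentioned above can be sketched roughly as follows. This is a simplified toy of my own, loosely modeled on the minimal version selection approach used by Go modules; the package names, versions, and the `REQUIREMENTS` table are all made up. Every package states the minimum version it needs of each dependency, and the resolver picks, for each package, the highest of those stated minimums rather than the newest release available.

```python
# Hypothetical data: (package, version) -> list of (dependency, minimum version).
REQUIREMENTS = {
    ("app", (1, 0)): [("libA", (1, 2)), ("libB", (1, 0))],
    ("libA", (1, 2)): [("libC", (1, 3))],
    ("libB", (1, 0)): [("libC", (1, 4))],
    ("libC", (1, 3)): [],
    ("libC", (1, 4)): [],
}


def minimal_version_selection(root, requirements):
    """For each reachable package, select the highest minimum anyone asked for."""
    selected = {}
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        name, version = node
        selected[name] = max(selected.get(name, version), version)
        stack.extend(requirements[node])
    return selected


chosen = minimal_version_selection(("app", (1, 0)), REQUIREMENTS)
```

Here `libC` resolves to 1.4 because that is the highest version somebody explicitly required, even if a newer release exists. Every selected version is one that at least one package author named, and presumably tested against, which is the "higher fidelity" the chapter refers to.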

Key ideas for Chapter 22: Large-Scale Changes

  • An LSC (Large-Scale Change) process makes it possible to rethink the immutability of certain technical decisions
  • Traditional models of refactoring break at large scales
  • Making LSCs means making a habit of making LSCs

Key ideas for Chapter 23: Continuous Integration

  • A CI system decides what tests to use, and when
  • CI systems become progressively more necessary as your codebase ages and grows in scale
  • CI should run quicker, more reliable tests on presubmit and slower, less deterministic tests on post-submit
  • Accessible, actionable feedback allows a CI system to become more efficient
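
One very simplified way to picture the presubmit/post-submit split is shown below. The test names, the 30-second limit, and the `flaky` flag are assumptions of mine, and a real CI system would use far richer signals such as which files a change touches. Fast, deterministic tests gate the change; everything else runs after it lands.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CiTest:
    name: str
    seconds: float  # typical runtime
    flaky: bool     # True if the test is known to be nondeterministic


def split_for_ci(tests, presubmit_limit_seconds=30.0):
    """Put fast, reliable tests on presubmit and the rest on post-submit."""
    presubmit, post_submit = [], []
    for test in tests:
        if not test.flaky and test.seconds <= presubmit_limit_seconds:
            presubmit.append(test.name)
        else:
            post_submit.append(test.name)
    return presubmit, post_submit


# Hypothetical test inventory.
TESTS = [
    CiTest("unit_parser", 0.4, flaky=False),
    CiTest("unit_cache", 1.2, flaky=False),
    CiTest("integration_checkout", 240.0, flaky=False),
    CiTest("browser_login", 20.0, flaky=True),
]

presubmit, post_submit = split_for_ci(TESTS)
```

With this inventory the two unit tests block submission, while the slow integration test and the flaky browser test run post-submit, where a failure is investigated without holding up every author.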

Key ideas for Chapter 24: Continuous Delivery

  • Velocity is a team sport: The optimal workflow for a large team that develops code collaboratively requires modularity of architecture and near-continuous integration
  • Evaluate changes in isolation: Flag-guard any features to be able to isolate problems early
  • Make reality your benchmark: Use a staged rollout to address device diversity and the breadth of the userbase. Release qualification in a synthetic environment that isn't similar to the production environment can lead to late surprises
  • Ship only what gets used: Monitor the cost and value of any feature in the wild to know whether it's still relevant and delivering sufficient user value
  • Shift left: Enable faster, more data-driven decision making earlier on all changes through CI and continuous deployment
  • Faster is safer: Ship early and often and in small batches to reduce the risk of each release and to minimize time to market
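
Flag-guarding and staged rollouts go hand in hand, and a bare-bones version might look like the sketch below. The flag name, the bucketing scheme, and the `checkout_page` function are hypothetical choices of mine rather than anything prescribed by the book. Each user is hashed into a stable bucket from 0 to 99, and the new code path is enabled only for buckets below the current rollout percentage.

```python
import hashlib


def in_rollout(flag_name, user_id, percent):
    """Deterministically decide whether this user is inside the rollout."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket in 0..99
    return bucket < percent


def checkout_page(user_id, rollout_percent):
    # The new feature is flag-guarded, so it can be dialed up or switched off
    # without shipping a new release.
    if in_rollout("new_checkout", user_id, rollout_percent):
        return "new checkout"
    return "old checkout"
```

Because the bucket depends only on the flag name and the user ID, a user who sees the new checkout at 10% keeps seeing it at 50%, and setting the percentage back to 0 acts as a kill switch if a problem shows up in production.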

Key ideas for Chapter 25: Compute as a Service

  • Scale requires a common infrastructure for running workloads in production
  • A compute solution can provide a standardized, stable abstraction and environment for software
  • Software needs to be adapted to a distributed, managed compute environment
  • The compute solution for an organization should be chosen thoughtfully to provide appropriate levels of abstraction

Conclusion

Thanks for reading this blog post!

If you have any questions or concerns, please feel free to leave a comment on this post and I will get back to you when I find the time.

If you found this article helpful, please share it, and make sure to follow me on Twitter and GitHub, connect with me on LinkedIn, and subscribe to my YouTube channel.
