Vitaly Sharovatov

Posted on Nov 2, 2022 • Originally published at qase.io

Designing a team that would produce software of good quality: get rid of individual performance reviews

Most companies practice individual performance reviews (also called performance appraisals).

This practice is well established, even though 40 years of management science have concluded that it does great damage to teams in multiple areas.

I can only guess why the practice still exists: a lack of knowledge, the sunk cost fallacy, or both.

W. Edwards Deming condemned performance evaluations as a deadly disease afflicting American management. He argued that performance evaluations nourish fear, encourage short-term thinking, stifle teamwork, and are no better than lotteries.

This article focuses mostly on the detrimental effects of performance reviews on quality.

Teams have emergent properties: only teams can produce value for the client.

There’s almost no value in merely producing something; companies prosper by producing a quality product.

Quality demands effective information flow. It also requires the following team characteristics:

Low turnover
Constant learning from peers
Common goal and common interest in collaborative problem solving
Focus on teamwork

All these properties are severely damaged by the individual performance review practice.

Low turnover

The practice of individual performance reviews increases turnover.

The usual performance assessment is unavoidably subjective on both sides:

the assessee is subjective in presenting their own results for a given period of time
the assessor is subjective in assessing the presented results

The assessee compiles their self-assessment based on their subjective understanding of the achieved results.

The assessee’s judgement is clouded even more by a spectrum of biases, from the Dunning-Kruger effect to impostor syndrome.

NB: Rationalwiki.org lists ~90 well-studied cognitive biases which affect human behaviour, cognition, thinking and decision-making.

NB: People are also largely unaware of how strongly these biases affect them.

The assessor reads the compiled self-assessment and rates the results subjectively too.

The assessor’s judgement is likewise clouded by a great number of biases, from the halo effect to central tendency bias.

The longer the assessment period, the more clouded, subjective and biased both the assessee’s and the assessor’s judgements are.

Human memory is full of wonders: events are not stored and retrieved precisely and accurately, but relived and changed every time a person recalls them.

There’s a separate list of memory biases, effects which not only cloud the recalled event, but sometimes change it quite significantly:

Rosy retrospection bias. We tend to remember the past as having been better than it really was, which leads to judging the past disproportionately more positively than we judge the present. As the Romans said: memoria praeteritorum bonorum, or “the past is always well remembered.”

Consistency bias. We incorrectly remember our past attitudes and behaviour as resembling our present attitudes and behaviour, so that we feel we are acting in accordance with our general self-image.

Mood-congruent memory bias. We better recall memories that are consistent with our current mood. For instance, feeling relaxed may bring back relaxing memories; feeling stressed may bring back stressful memories.

Hindsight bias. We have an inclination to consider past events as being predictable—also called the knew-it-all-along bias.

Egocentric bias. We recall the past in a self-serving manner, such as remembering our exam grades as being better than they really were, or remembering a caught fish as bigger than it was.

Availability bias. We often think that memories that come readily to mind are more representative than is actually the case. This is why people tend to overestimate the likelihood of attacks by sharks or the number of lottery winners.

Recency effect. We best remember the most recently presented information. At a trial, evidence presented last may be the clearest in the juror’s memory.

Choice-supportive bias. We remember chosen options as having been better than rejected options.

Fading affect bias. Our emotions associated with unpleasant memories fade more quickly than our emotions associated with pleasant memories.

Confirmation bias. We tend to seek and interpret memories in a way that confirms our prior hypotheses or personal beliefs.

To sum it up:

the assessee only believes they recalled something right (the longer the period, the worse)
this belief is clouded even more by the assessee’s self-esteem-related and other personality biases
the assessor perceives the result and tries to match it with their own clouded recollection of the events (beliefs, essentially)
the matching process is clouded even more by the assessor’s biases

It’s almost as if an unknown amount of noise is applied to a signal several times over, where the signal is the event being assessed and the noise is the whole process of recollection, reformulation and comparison in the assessee’s and the assessor’s brains.

Does it make any sense to conclude anything based on a result with a completely unknown signal-to-noise ratio?
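The signal and noise analogy can be made concrete with a small simulation. This is only an illustrative sketch, not a model of any real review process: the function names, the number of noise stages and the noise level are all assumptions of mine, and in a real review the noise level is exactly what nobody knows. The sketch gives each person a "true" contribution, distorts it with independent random noise once per stage (recollection, self-presentation, the assessor's recollection, the assessor's rating), and then checks how well the final rating still tracks the true contribution.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def correlation_after_noise(noise_stages, n_people=2000, noise_sd=1.0, seed=0):
    """Correlate true contributions with ratings after several noisy stages.

    All parameters are hypothetical: noise_sd is the assumed spread of the
    distortion added at each stage, relative to a spread of 1.0 in the true
    contributions.
    """
    rng = random.Random(seed)
    # The "signal": what each person actually contributed.
    true_scores = [rng.gauss(0, 1) for _ in range(n_people)]
    ratings = list(true_scores)
    # Each stage adds its own independent distortion on top of the last one.
    for _ in range(noise_stages):
        ratings = [r + rng.gauss(0, noise_sd) for r in ratings]
    return pearson(true_scores, ratings)


if __name__ == "__main__":
    for stages in (0, 1, 2, 4):
        r = correlation_after_noise(stages)
        print(f"{stages} noisy stage(s): correlation with reality = {r:.2f}")
```

Under these assumptions the correlation between rating and reality drops with every stage that adds distortion. The numbers themselves mean little, since the real noise levels are unknown, and that is the point of the question above.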

Studies show that this subjectivity often triggers a feeling of unfairness, where the assessee thinks the assessor’s judgement is unfair:

They [assessments] are in many cases perceived as being procedurally unfair in terms of the Folger and Bies (1989) criteria. Inadequate consideration is given to employee views; there is evidence of bias, inconsistency, poorly justified feedback and a lack of honesty in the process.

Our research confirms findings that the appraisal reinforces power relationships and is often perceived as being open to abuse

Dissatisfaction with processes and the resulting distributive justice had a negative impact on employee perceptions of fairness and could act as a barrier to organizational effectiveness

Another study shows once again that people are all different, that their memory and judgement are affected by biases differently, and that there’s almost no way the assessee’s judgement of their work and effort will match the assessor’s judgement.

Judgements about fairness can only ever be based on an individual’s expectations

Adding to this, people irrationally compare results from the current assessment with the previous ones, as this study shows:

Expectations about performance marks, and the comparison of those marks with the previous year, had a very strong influence over levels of satisfaction.

Multiple studies explore the detrimental effects of perceived unfairness on productivity, such as this cross-cultural one:

Our results revealed that both unfairness approaches (OJ and ERI) were positively related to job-burnout and turnover intention

It makes perfect sense: as performance reviews make people feel that their efforts and results are judged unfairly, their motivation drops and they might consider leaving.

The situation gets much worse if pay rises are tied to performance reviews.

It’s a well-known fact that in most companies new joiners negotiate higher salaries than existing employees.

The perceived unfairness of the judgement is even stronger in this situation: employees might even feel that the whole practice exists to persuade them that their knowledge and skills are inferior, and so to postpone a pay rise.

Loss of interest and motivation, like increased turnover, has a detrimental effect on quality.

Focus on teamwork

If the employee is somehow still motivated to stay and work at the company, the performance review’s focus on individual contribution will shift their focus from collaborative teamwork to achieving exactly what’s measured.

Goodbye teamwork, hello Goodhart’s law:

When a measure becomes a target, it ceases to be a good measure

Every measure which becomes a target becomes a bad measure – is inexorably, if ruefully, becoming recognized as one of the overriding laws of our times. Ruefully, for this law of the unintended consequence seems so inescapable. But it does so, I suggest, because it is the inevitable corollary of that invention of modernity: accountability

As soon as the employee is focused on improving the individual contribution measured in the assessment, quality decreases because:

the employee loses the motivation to help others, to teach and learn
the employee focuses only on what’s measured

While, as Dr. Deming says:

the most important figures that one needs for management are unknown or unknowable, but successful management must nevertheless take account of them.

All the effort managers and employees spend on executing the practice is waste; it could simply be spent on improving the system.

There’s a beautiful thought experiment illustrating Dr. Deming’s ‘95/5 rule’:

“95% of variation in the performance of a system is caused by the system itself; only 5% is caused by the people.”

The original goal of individual performance reviews was to boost teams’ productivity, and yet the practice:

focuses on individuals, reducing the teams to mere groups of competing individuals
generates stress for everyone
negatively impacts motivation and interest
increases turnover
brings the feeling of unfairness
focuses people on what’s measured, distorting the system
simply wastes money

Samantha Evans and Dennis Tourish put this beautifully in their study ‘Agency theory and performance appraisal: How bad theory damages learning and contributes to bad management practice’:

conventional appraisals prioritise hierarchy over intrinsic motivation, distrust over trust, and the importance of individual effort over that of building sustainable, co-operative systems.

References: