Okay, now that I have your attention… :) But seriously.
Let’s start by defining terms. When I use the word “software,” I mean real-world, production business software. You know, the kind that makes the world go round. I admit there are many other kinds of software, but I’ll limit myself here to the kinds of software I have built in various capacities for over eighteen years now.
And when I say “developers,” I mean software (see above) developers, most typically what we now call “full stack,” but I think the fails below apply to more specialized roles within software development — to a point. My criticism of algorithm evaluation, for instance, may not be appropriate if the software being developed is heavily related to the algorithms being evaluated.
Nota Bene: I have been interviewing software developers (and other software professionals like designers and product owners/managers) for at least fourteen of my eighteen years. I by no means claim that there is only one way to successfully screen developers, but it seems to me there are definitely some common ways of doing this that tend to feel a lot like fail to me, from both sides of the equation, as well as better ways, such as those suggested below.
WITH FURTHER ADO, in ascending order of failfulness…
3) Ask “clever” brain teaser questions. I don’t care if they are “clever” coding problems or non-coding (of the manhole cover variety).
As the SVP of People Operations at Google, Laszlo Bock, writes in *Work Rules!: Insights from Inside Google That Will Transform How You Live and Lead* (Ch. 5, “Don’t Trust Your Gut”):
Performance on these kinds of questions is at best a discrete skill that can be improved through practice, eliminating their utility for assessing candidates. At worst, they rely on some trivial bit of information or insight that is withheld from the candidate, and serve primarily to make the interviewer feel clever and self-satisfied. They have little if any ability to predict how candidates will perform in a job. This is in part because of the irrelevance of the task (how many times in your day job do you have to estimate how many gas stations there are?), in part because there’s no correlation between fluid intelligence (which is predictive of job performance) and insight problems like brainteasers, and in part because there is no way to distinguish between someone who is innately bright and someone who has just practiced this skill.
And he adds:
Some of these interview questions have been and I’m sure continue to be used at Google. Sorry about that. We do everything we can to discourage this, as it’s really just a waste of everyone’s time. And when our senior leaders — myself included — review applicants each week, we ignore the answers to these questions.
I remember when these were all the rage. I’m glad that time is mostly over. (I personally was asked once about how I’d design the ideal alarm clock…)
2) Ask obscure computer science questions.
You know what I mean. The “find the nearest node in a tree” or “capitalize the second letter in an array of words” or “what is the big O notation for X sorting algorithm” variety. In almost every case, a quick internet search can tell a developer what they need to know about known algorithms for a particular kind of problem, or half a day reviewing the literature (e.g., on Wikipedia) can cover a particular problem area. And in many cases, such optimized algorithms are baked into popular frameworks, because they are general purpose and not domain-specific — so an average dev will rarely need to know them. But even when they do, it’s not hard to refresh, and knowing them offhand is not at all a big indicator of any important factor of success in a developer role.
As with #3 above, more often than not what these kinds of questions do is serve to make the interviewer feel smarter. (We get excited because we can stump people with a problem we already know the solution to.) And because these are very common interview questions, candidates will often prep for them. So again, not very helpful in determining anything useful.
I’ve interviewed fresh CS grads who could easily solve one of these algorithm problems but who barely knew what a relational database is, had barely touched source control, and wouldn’t know TDD from HIV. Which of those is more important for a professional software developer’s success?
1) Ask a candidate to live code in front of you.
I mean it. This is the most egregious thing you can do as an interviewer, and yet it is absolutely the most common thing devs get asked to do. It’s one thing if it comes naturally out of an interview discussion, say, you’re talking about code and it would just be easier for the candidate to write some code or pseudocode to show what they mean. Usually, I wait for a candidate to volunteer to do this, but I never give them a problem and say “code the solution for me right now” on a whiteboard, in a shared screen program, or whatever.
“But everyone does this!” I can hear almost everyone exclaim. Indeed. I think maybe it’s a hazing ritual at this point. Like, “we all had to go through this, so you do, too.” LOL. No, I get it. We want to see how well they can code, or at least that they can code. But there are other, better ways.
This problem is exacerbated when the problem definition changes on the fly. Interviewers think they are adding layers of complexity to mimic real life changing requirements when in reality they are mostly adding layers of very artificial stress that will induce stupid mistakes that obscure a candidate’s true abilities.
This is an important point. Interviewees are already stressed and nervous. It’s easy as an interviewer to forget this. As an interviewee, you are putting yourself out there for someone else to judge you and your work, and deem you worthy or unworthy of a job that, in many cases, you see as key to your (and your family’s) livelihood. That’s kind of a big deal.
So as an interviewer, you’re usually dealing with someone who is in the midst of a significant life stress event. Add to this that developers tend to be introverts — they are not performing artists — and that few things stress introverts out more than being in unfamiliar social situations with unfamiliar people.
And then we ask them to essentially become a performing artist in code? To solve a surprise challenge on the fly while their every pen/keystroke is being scrutinized? What about this screams “this is how this dev will perform under normal circumstances” to you?
Nothing. That’s what. The closest thing in real dev life might be pair programming, but that happens in a far, far less stressful context: typically with people you know relatively well (and hopefully get along with), dealing with a problem domain you have had time to get to know, and with no goal of testing your skills. The goal is to help each other — you are partners, with a shared goal of making the best software together. So pair programming resembles live coding in an interview only very superficially.
And all of this assumes that you have come up with a reasonably good problem that is scoped small enough and has a domain common enough to be readily familiar to any interviewee. For example, asking someone to code a tic-tac-toe game is a bad idea when nothing they will build day to day is remotely game-like or requires sophisticated graphical layout skills. So, if you’re gonna do this, keep it stupid simple: here’s some sample data (a list of movies); now display those in a table or a card list or some such.
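To make the “stupid simple” idea concrete, here is a minimal sketch of the kind of exercise I mean, in Python. The movie data and the `render_table` helper are invented for illustration; any small, familiar dataset and output format would do.

```python
# A hypothetical "keep it stupid simple" exercise: given a small, familiar
# dataset, display it as a plain-text table. Data and names are made up.

movies = [
    {"title": "Alien", "year": 1979, "rating": 8.5},
    {"title": "Arrival", "year": 2016, "rating": 7.9},
    {"title": "Moon", "year": 2009, "rating": 7.8},
]

def render_table(rows, columns):
    """Render a list of dicts as an aligned plain-text table."""
    # Each column is as wide as its widest value (or its header).
    widths = {c: max(len(c), *(len(str(r[c])) for r in rows)) for c in columns}
    header = " | ".join(c.ljust(widths[c]) for c in columns)
    divider = "-+-".join("-" * widths[c] for c in columns)
    body = [" | ".join(str(r[c]).ljust(widths[c]) for c in columns) for r in rows]
    return "\n".join([header, divider, *body])

print(render_table(movies, ["title", "year", "rating"]))
```

The point of a problem this small is that the domain needs zero explanation, so all of the candidate’s attention goes to how they structure and present a solution, not to decoding the puzzle.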
“Ah, but giving them a novel, unfamiliar problem and watching them work gives me insight into how they think!” goes the response. No. To reasonably infer how a developer thinks (in general) from such an ad hoc, on-the-fly coding challenge far exceeds the psychological skills and knowledge of pretty much any software developer (and we’re the ones who usually do the interviewing). We’re fooling ourselves to think we are so clever as to analyze in any generally meaningful way how someone else thinks. It’s too unusual and too small a sample in any case.
And really, judging how someone thinks about and approaches problems is rather presumptuous. As if there is a “right” way to think to solve problems. Talk about bias! What matters is that they can solve problems and, more importantly, that they can turn those problems into maintainable, reliable solutions. And no ad hoc, on-the-fly test is gonna tell you that.
Just for fun, I want to mention that you can double- or even triple-fail here, by asking someone to live code the solution to a “clever” computer science problem. And yet this happens. And to make it worse, it happens on whiteboards — not even a friggin keyboard and screen. “Writing code… on a whiteboard…” (I imagine Roy from The IT Crowd saying that the way he said, “A fire… at a Sea Parks…”)
The common thread in all these fails is that what is being tested is only very tangentially connected to what really makes a software developer successful or not. This simple principle should guide what we strive for in shaping our technical evaluations: mirror real-world development as closely as possible.
Again, as our friend Laszlo Bock writes in *Work Rules!* (based on real data and science):
The best predictor of how someone will perform in a job is a work sample test (29 percent). This entails giving candidates a sample piece of work, similar to that which they would do in the job, and assessing their performance at it. Even this can’t predict performance perfectly, since actual performance also depends on other skills, such as how well you collaborate with others, adapt to uncertainty, and learn.
So concretely this suggests, for a developer technical evaluation:
- Let devs work in an environment in which they feel comfortable (e.g., at home) or in the worst case at a computer relatively isolated so that they can quietly focus.
- Let devs work without the artificial distraction and stress of being observed while they are working.
- Let devs have access to the resources they normally use. When I first started interviewing in 2003, I told interviewees to bring books they like and gave them full internet access for an on-site test. These days, I prefer to let devs work from home/remotely.
- Craft a problem that is realistic — that is, similar to what they will face on a day-to-day basis. If you are hiring for front end, that’d be writing a small front end. If for a mobile app, a small mobile app. If full stack, a small slice of full stack. You get the idea.
Software development is a design exercise. This has a few implications for evaluating it. One is, of course, creating an environment that best facilitates and stimulates the mind; high-stress situations work to the detriment of the higher mental functions. It also implies that the best solution is rarely the first one that jumps to mind. We arrive at the best solution by freely exploring alternatives, going down dead ends, learning new things, and ultimately combining and iterating (and usually iterating further over time).
None of that happens in a high stress, ad-hoc, live coding situation. But by giving an applicant time on their own to come to terms with a problem and explore solutions, we get a lot closer to what real-world development is like.
Another well-established method for more reliable interviewing is what they call structured interviewing. As Bock writes:
Tied with tests of general cognitive ability are structured interviews (26 percent), where candidates are asked a consistent set of questions with clear criteria to assess the quality of responses. Structured interviews are used all the time in survey research. The idea is that any variation in candidate assessment is a result of the candidate’s performance, not because an interviewer has higher or lower standards, or asks easier or harder questions.
Compare that to unstructured interviewing (the typical ask-more-or-less-whatever-comes-to-mind-in-any-given-interview-and-follow-it-where-it-randomly-goes). The research puts that at 14% (in terms of predicting actual future employee performance). And you can imagine how that could be terribly variable: immensely subject to bias, as well as influenced by how you, the interviewer, might feel on any given day. In short, it’s not really fair, in addition to being far less reliable in predicting performance. (BTW, years of experience is only a 3% indicator!)
What I have landed on is an at-home project for the primary technical evaluation, and then doing a code review of it with the interviewee (again, mirroring good, real-world development practices). The code review gives immense insight into 1) whether or not the applicant actually understands what they are doing and 2) how they interact with other team members (I think including at least two future peers is ideal).
I’ve actually been amazed by how some folks can’t really explain what they are doing in their own solution. When someone has just copied and pasted a framework example from the Web, it is usually very obvious, so the code review also addresses the concern that the work might not be their own. If you really want to know how a dev approaches development solutions, this is a much better approach. (I always ask for automated tests as a bonus, and it’s amazing how few devs actually include them!)
Coupling this with structured interviewing leads to a whopping 55% indicator, and if you include what Bock refers to as “assessment of conscientiousness,” that adds another 10% of predictor power. You can get an amazing amount of detail if you use “behavioral structured interviewing” (a.k.a., performance-based interviewing — PBI), where you ask candidates for specific examples of when they did something or used a particular technology. Drilling into these gives a much deeper sense of what a candidate knows (rather than what they don’t know, which you get from asking questions with specific right answers).
As an example: “Tell me about a project where you used REST services.” Make sure they stay specific and don’t drift into hypotheticals. Ask questions to clarify and dig deeper. You will get a good sense of whether they understand the ins and outs, what they do or don’t know, and how deep their experience actually is.
For the conscientiousness evaluation, I actually write up a handful of important attributes with descriptions, and then work on PBI questions that ask for examples in their experience, e.g., “tell me about a time you helped a co-worker with a particularly tough problem.”
Also, as noted/quoted above, “actual performance also depends on other skills, such as how well you collaborate with others, adapt to uncertainty, and learn.” This is easy for hardcore devs to overlook. We like the concreteness of the technical eval, but even a great eval like the one I propose here only tests technical competence, which is arguably not as important as attitude, passion, curiosity, helpfulness, and the like, qualities that dramatically impact a candidate’s ability to adapt to new situations and to work well with others.
Asking performance-based and/or “situational” (i.e., hypothetical) questions to drill into these areas is hugely important. “Can you give some examples of something cool you learned recently?” If they can’t answer quickly, that probably indicates they’re not the type who is always trying to learn. That might be a flag for you. (I think it’s a BIG RED FLAG.) As with the others, these should be the same questions you ask everyone — it’s fair, and it gives you the same criteria to evaluate people by.
To do this efficiently, either buy/use some specialized software (like Breezy.hr or Greenhouse.io) to let you create interview kits that have the same questions you will ask all candidates, or just write one up in, e.g., Google Docs. Have the interviewers take detailed notes, and then have interviewers individually fill out a scorecard for the various things you are looking to evaluate (which should map to the questions being asked).
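If you roll your own instead of using specialized software, the kit boils down to a simple data structure: a fixed question list, each question mapped to a scorecard criterion. Here is a minimal sketch in Python; the criteria names and the 1–5 rating scale are my invented example, not a prescription from any particular tool.

```python
# A hypothetical minimal "interview kit": every candidate gets the same
# questions, and each question maps to one scorecard criterion.
# Criteria names and the 1-5 scale are invented for illustration.

interview_kit = [
    {"question": "Tell me about a project where you used REST services.",
     "criterion": "api_experience"},
    {"question": "Tell me about a time you helped a co-worker with a tough problem.",
     "criterion": "helpfulness"},
    {"question": "What is something cool you learned recently?",
     "criterion": "curiosity"},
]

def score_candidate(ratings):
    """Build one interviewer's scorecard from per-criterion ratings (1-5)."""
    criteria = [item["criterion"] for item in interview_kit]
    missing = [c for c in criteria if c not in ratings]
    if missing:
        # Force interviewers to rate every criterion, so candidates
        # are always compared on identical grounds.
        raise ValueError(f"Scorecard incomplete, missing: {missing}")
    return {"ratings": dict(ratings), "average": sum(ratings.values()) / len(ratings)}

card = score_candidate({"api_experience": 4, "helpfulness": 5, "curiosity": 3})
print(card["average"])  # 4.0
```

The key design point is that the criteria are fixed before any interviews happen, so variation in scores reflects the candidates, not the interviewer or the day.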
Finally, you discuss with your hiring team. Google actually hires by committee (or so Bock says), and interviewers are not necessarily on the team being hired for. While I can see how that reduces bias (especially the hiring manager’s), I haven’t tried it, and it may only work at scale. A drawback of that approach is that you don’t get a read from the actual team members on fit and chemistry, so YMMV on that one.
I’m not gonna claim this will guarantee success, but if you add it up, it gives you something like a 65% chance (29% for the work sample, 26% for structured interviews, 10% for conscientiousness), much better than the 14% for stereotypical unstructured interviewing. You can also add years of experience (3%) and reference checks (7%) to get to 75% — pretty good odds. As far as I’m concerned, the more we can do to increase our odds of selecting the right/best candidates, the better.
This approach works in tiny companies as well as huge ones like Google. I definitely don’t think I have it perfect — I learn more every time I go through a hiring process, but these are some good guidelines, grounded in scientific research and experience. It seems good and fair to candidates, as well. All around a win-win.
Bock cites research saying that general cognitive ability evaluations give you a 26% indicator, tied for second place with structured interviewing. I have evaluated a couple of these tools. A colleague/friend of mine and I both took a few, reviewed each other’s results, and talked about how well we thought they reflected our impressions of ourselves and each other. (We have a good enough relationship to be honest about things like that.) One test tool seemed better than the other — pretty accurate, in fact. The problem is, they add another longish part of the evaluation (e.g., 30–45 minutes or more). Bock also notes that such general intelligence tests discriminate against non-white and non-male test takers.
So far, I haven’t felt the need, but I have actually taken one as part of an application process as well. If you find you can’t get people to take a sample test project, it might be a good way to increase confidence in the hiring process.
My colleagues and I also evaluated a few online coding-test platforms. My experience was that they tend to ask the kinds of questions above that I feel are not great indicators, because they are too disconnected from day-to-day dev work. That said, some people do swear by them. I’d say they are better than live coding in front of someone.
I have taken (and passed) a number of these timed coding tests in my time. I don’t think they’re good, in that they tend to add significant time pressure, which creates that same highly artificial situation. You are testing someone’s immediate memory, or their ability to code a specific solution to a surprise problem in a short amount of time — not terribly realistic, and also very limited in scope: you find out specific things they may not know by heart, versus seeing what they do know and can put together in a reasonable amount of time (as you do with a sample project).
As such, I don’t recommend or use them.
There are plenty — especially hipster types — who put a LOT of value on OSS contributions. If you are hiring specifically for someone who will need to do a lot of OSS contribution, that’s absolutely a real-world level thing to evaluate. Or if it is a core cultural value in your team/company. But I think a lot of folks conflate OSS contribution with general competence. Many very highly competent devs have only worked on closed source (and have families and other things they prefer to do with their free time).
I do think OSS work can be a great help in evaluating ability, if someone has substantial contributions that you can look at. I’d even say it could be a reasonably good substitute for the sample project. The drawback is that, without time-consuming digging, it may not be entirely obvious what exactly their contribution was. And with the sample project, you apply the same evaluation to all candidates, which is more fair and gives a better comparative sense between applicants.
But hey, if someone is an OSS rockstar, then it’s hard to argue with that. :)
** Cover image courtesy of http://m1psychology.com/performance-anxiety/