Interview Reliability: Statistics vs. Personal Experience

Topic(s): interviewing, selection
Publication: International Journal of Selection and Assessment
Article: Employment interview reliability: New meta-analytic estimates by structure and format
Authors: A.I. Huffcutt, S.S. Culbertson, W.S. Weyhrauch
Reviewed by: Megan Leasher

This article focuses on the reliability of interviews: the more error introduced into an interview, the less reliable it is. The researchers examined different sources of error from both interviewees and interviewers. Interviewees introduce error when they answer similar questions from the same or different interviewers inconsistently, while interviewers introduce error when they interpret, evaluate, and rate identical responses differently.
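
For readers who want the psychometric backdrop (this is standard classical test theory, not something the article itself spells out), the idea can be written as a single ratio: an observed interview rating is modeled as a true score plus error, and reliability is the share of rating variance that is true-score variance.

\[
X = T + E, \qquad \rho_{XX'} = \frac{\sigma^2_T}{\sigma^2_T + \sigma^2_E}
\]

The more error variance $\sigma^2_E$ the interviewee or the interviewers add, the smaller that ratio becomes, which is exactly what "less reliable" means here.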

The researchers found that the more structured the interview, the more reliable its results tended to be. In addition, panel interviews were more reliable than interviews conducted by a single interviewer.

I have to admit that I am a little torn about striving for reliability in interviews. I know from a statistical perspective that interview reliability is important: the more reliable an interview is, the more capable it is of statistically predicting the future job performance of candidates who become hires.
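
The statistical pull comes from the classic attenuation result (again, standard test theory rather than anything derived in the article): an interview's correlation with later job performance is capped by the square root of its reliability.

\[
r_{xy} \le \sqrt{\rho_{XX'}}
\]

So however insightful an unreliable interview feels, its ceiling for predicting performance stays low; raising reliability raises that ceiling.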

In industrial and organizational psychology, this conflict between ideal statistics and personal experience is the crossroads of “where research meets practice.” Sometimes what is most helpful in practice isn’t the statistically ideal scenario or course of action. This crossroads can make for great discussion fodder, salty arguments, ideas for new research, or all of the above.

If interviewees answered the same or similar questions in exactly the same, regurgitated way, and all interviewers heard, interpreted, evaluated, and rated a candidate’s responses identically, you would have a perfectly reliable interview. From a statistical perspective, that is.

But would you be missing something?

Think of all the times in an interview when you heard, saw, or interpreted something very differently from a colleague who asked the candidate a similar question. Then, when you debriefed after the interviews were complete, those differences gave you something really good to discuss: nuances the other person didn’t pick up on, unique reactions to answers, and so on. Perhaps a third interviewer had another fresh perspective to share, turning it into a real debate. Debates lead to unexpected and more in-depth realizations about a candidate that no single interviewer could have conjured up alone. Statistical reliability would not have wanted this scenario to take place. It would be unhappy that the interviewers disagreed, because that disagreement would jeopardize the high reliability it strives for.

Which is more important? Reliability or the ability to learn or interpret something unique?