Internet-based Data Collection: Just Do It Already!

Topic: Measurement, Statistics
Publication: Computers in Human Behavior
Article: From paper to pixels: A comparison of paper and computer formats in psychological assessment.
Author: M.J. Naus, L.M. Philipp, M. Samsi
Featured by: Benjamin Granger

Although many organizations
have jumped onto the internet-based data collection bandwagon, several issues
still need to be addressed. For example,
are paper-pencil and internet-based tests of the same trait (e.g., personality
questionnaire) or ability (e.g., cognitive ability test) really equivalent?
Similarly, are there any reasons to
believe that employees respond to internet-based tests differently than they
would to a paper-pencil test of the same trait or ability?

Naus, Philipp, and Samsi
(2008) set out to investigate these questions using three commonly used
psychological scales (Beck Depression Inventory, Short Form Health Survey, and
the Neo-Five Factor Inventory).

Although Naus et al. found
that the paper-pencil and internet-based survey formats performed equivalently
for the Beck Depression Inventory and the Short Form Health Survey, there were
differences for the Neo-Five Factor Inventory (a commonly used personality
assessment tool). What’s going on
here?

One possibility is that
responses were more socially desirable in the paper-pencil format, since a
researcher was present at the time. That is, in the presence of an authority
figure (i.e., the researcher), participants may have responded in a way that
made them appear more self-controlled
and self-focused. This is likely
much less of a concern when completing the same survey on a computer at home
(in PJs!).

Overall, respondents
perceived the internet-based format to be convenient, user-friendly,
comfortable, and secure (all great things!).

So what can we conclude
from these findings? Although
internet-based data collection methods have some advantages over paper-pencil
methods, there are some caveats to their use. In some cases, a test may
operate differently depending on the format in which it is given.
Unfortunately,
not much is known about how the formats might differ. However, Naus et al.’s
findings suggest that internet-based
methods receive good reactions from employees and can save an organization time
and money!

Is interrater correlation really a proper measurement of reliability?

Topic: Measurement, Research Methodology, Statistics
Publication: Human Performance
Article: Exploring the relationship between interrater correlations and validity of peer ratings
Blogger: Rob Stilson

Interrater reliability (still with me? OK, good) is often
used as the main reliability estimate for the correction of validity
coefficients when the criterion is job performance. Issues arise with this
practice when one considers that the errors present between raters may not be
random, but due to bias, while agreement between raters may also stem from bias
instead of actual consistency. In this study, the authors’ main goal was to
explore the relationship between interrater correlations and validity and also
to explore the relationship between the number of raters and validity.

In order to do this, the authors gathered information from
3,072 Israeli policemen from 281 work teams who took part in peer rating.
These work teams averaged about 12 people and ranged in size
from 5 all the way to 33. The measure used was overall performance (on a
7-point Likert scale). The predictor employed in this study was the ICC(C,k)
model, which is equivalent to Cronbach’s alpha. Measurement indices were
computed at the team level because rating took place only within work teams.
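
For the curious, here is a minimal sketch of why ICC(C,k) and Cronbach's alpha come out the same. It assumes a complete ratee-by-rater matrix, which is simpler than a real peer-rating design, and the scores and function names are hypothetical, made up for illustration rather than taken from the study.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha, treating each rater (column) as an 'item'.

    ratings: list of rows, one row per ratee, one score per rater.
    """
    k = len(ratings[0])
    rater_vars = [variance([row[j] for row in ratings]) for j in range(k)]
    total_var = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(rater_vars) / total_var)


def icc_c_k(ratings):
    """ICC(C,k): two-way consistency ICC for the average of k raters."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: ratees (rows), raters (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / ms_rows


# Hypothetical 7-point ratings: 5 ratees (rows) scored by 3 raters (columns).
example = [
    [5, 6, 5],
    [3, 4, 4],
    [6, 7, 6],
    [2, 3, 2],
    [4, 4, 5],
]

print(cronbach_alpha(example), icc_c_k(example))
```

The two printed values agree up to floating-point rounding, which is the sense in which the two indices are "equivalent."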

The predicted variable for the study was the validity
coefficient for each work team. This is the part of the study where you could
really feel the sweat involved. Here the authors gathered information on
supervisor evaluations, absenteeism data, and discipline data collected over
several years (for over 3000 policemen)! The authors then converted this
information into z scores, with higher
scores indicating better performance.
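
A rough sketch of that kind of conversion is below. The sample values, the `higher_is_better` flag, and the choice of the population standard deviation are all assumptions for illustration; the post does not say exactly how the authors standardized their data.

```python
from statistics import mean, pstdev


def z_scores(values, higher_is_better=True):
    """Standardize values to mean 0 and SD 1.

    When higher raw values mean worse performance (e.g., days absent),
    the sign is flipped so that a higher z score always means better
    performance.
    """
    m = mean(values)
    s = pstdev(values)  # assumption: population SD
    sign = 1 if higher_is_better else -1
    return [sign * (v - m) / s for v in values]


# Hypothetical data for four officers.
supervisor_ratings = [5, 6, 3, 4]  # higher is better
days_absent = [2, 0, 5, 1]         # higher is worse, so reverse it

print(z_scores(supervisor_ratings))
print(z_scores(days_absent, higher_is_better=False))
```

Putting every criterion on the same scale and in the same direction is what lets measures as different as supervisor evaluations and absenteeism be compared with the peer ratings.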

Results showed a weak positive linear relationship between
interrater correlations and the various validity indexes. This is not what you
want to hear if you are doing peer-rated performance evaluations. The authors
suggest that the correlation between raters is a conglomeration of factors
having different theoretical relationships with validity (i.e., bias and other
idiosyncrasies).

Practical implications from the information gleaned here
concern the correction of validity coefficients for attenuation. If the
reliability estimate used in the correction reflects nonrandom error (such as
shared rater bias), the ensuing
calculations will be off. A positive finding for the work world was that
validity in small units (fewer than 10 people) was about the same as that in
larger units.
The authors believe this finding may be due to observation opportunity level,
which is seemingly greater in smaller work units.
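
To see why a shaky reliability estimate matters, here is a small sketch of the classic correction for attenuation due to criterion unreliability (observed validity divided by the square root of the criterion's reliability). The function name and all numbers are hypothetical and are not taken from the study.

```python
from math import sqrt


def correct_for_attenuation(observed_validity, criterion_reliability):
    """Estimate validity as if the criterion were measured without error."""
    if not 0 < criterion_reliability <= 1:
        raise ValueError("criterion_reliability must be in (0, 1]")
    return observed_validity / sqrt(criterion_reliability)


# Hypothetical observed validity of .30, corrected with two different
# interrater reliability estimates.
print(correct_for_attenuation(0.30, 0.64))
print(correct_for_attenuation(0.30, 0.81))
```

The same observed validity yields a noticeably different corrected value depending on the reliability estimate plugged in, so an interrater correlation that is pushed up or down by bias carries its error straight into the corrected coefficient.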

Kasten, R., & Nevo, B. (2008). Exploring the relationship between interrater
correlations and validity of peer ratings. Human Performance, 21(2), 180-197.