Dispelling Training Evaluation Myths
A few months ago, I attended a local professional group meeting where the topic of training evaluation was discussed. I learned a lot about what many people think about training evaluation. Most of the attendees were not training or human resources professionals, so I would not expect them to be experts on the topic. I’m glad they felt comfortable asking the questions that were on their minds. It was a good reminder for me that many misconceptions exist about training evaluation. Let’s talk about two of those myths or misconceptions and what the real story is!
Myth #1: A “Level 1” is the survey that students complete after a course; a “Level 3” is a survey that they complete a few months after the course.
It’s easy to understand how the Kirkpatrick Evaluation Levels™ (Disclaimer: I’m not related to Donald, Jim, or Wendy Kirkpatrick, but I did learn a lot when I took the Bronze Certification class!) became synonymous with the timing or method of collecting evaluation data. However, in the New World Kirkpatrick Model™, which is described in the OPM Training Evaluation Field Guide, the levels refer to the type of information that is collected. The Four Levels are:
- Level 1: Reaction
- Level 2: Learning
- Level 3: Behavior
- Level 4: Results
Information on the four levels can be collected at different points in time and using different methods. For example, an end-of-course evaluation could ask students to answer questions about each of these levels. Yes, asking them to predict whether they think they will apply the skills they learned (Level 3) is hypothetical. But asking this question serves as an early warning indicator. Negative or low responses to that question should trigger further diagnosis into why they are responding that way.
Additionally, you should ask the same question on a follow-up evaluation to confirm whether, in fact, the student did apply what was learned. Things can change once students get back to the job. It is not uncommon for students to have trouble applying what they learned to their job; sometimes you can modify the training so that it is easier for students to know how to use their new skills. Students also may encounter other obstacles, which you should ask about on the follow-up evaluation. Some common obstacles include not having the right tools or encountering resistance from the supervisor or other team members.
Myth #2: Levels 3 and 4 are hard to measure.
The attendees at the meeting seemed to think that measuring Levels 3 and 4 was nearly impossible. One gentleman recounted a story about how he spent a whole year leading a six-figure study simply to measure Level 4. As I listened to him, I began to wonder whether the study cost more than the training! I looked around the room at the other attendees, and their eyes were wide as they presumably thought about having to run such a study one day.
When viewing the levels as types of information, measuring Levels 3 and 4 doesn’t seem so hard. As we already learned, you can ask questions about those levels in end-of-course evaluations and follow-up evaluations. No single measurement is perfect, so you must act like a detective and collect information from different sources and in different ways.
With Level 4, agencies often already measure key indicators related to their mission. For example, a transportation organization might monitor the number of injuries or accidents that passengers have while traveling in different modes of transportation. The Department of Health and Human Services and certain institutes within the National Institutes of Health might monitor the number of people who are diagnosed with various diseases.
Again, no single measurement is perfect. Other factors impact these high-level metrics. With transportation, the number of passenger injuries might be impacted by weather, equipment, and funding. But, by gathering information from many sources, one can piece together a plausible argument that the training had the desired effect.
Once Level 4 is defined, it is often easier to identify the specific behaviors (Level 3) that will impact the Level 4 result. For a transportation agency, passenger injuries might be impacted by training flight attendants or train conductors in standard safety protocols. Injuries also might be impacted by research funding for testing new materials, such as types of flooring that prevent slipping. Training grants program managers on what new research is needed would be another intervention to consider; the number of studies they fund on these safety-related materials would be another Level 3 measure.
Again, these information sources often already exist and, along with other information, can help discern a training program’s true impact. By dispelling these myths, I hope that you can better evaluate your training program to ensure that it has a positive impact on your agency’s mission!