
Spinning Out of Randomized Controls

The drumbeat for more “scientific evaluation techniques” has become a mantra in higher education research. Despite clear and much more reasonable alternatives, external groups preach the need for randomized controls to determine whether a given innovation “works.” These calls come from outside the institutions with which I work, from otherwise well-intentioned experts who’ve been thoroughly indoctrinated in narrow research designs. Social Research Methods does a nice job of covering the evolution of the terms “experimental” and “quasi-experimental designs,” first coined by Campbell and Stanley.

The drumbeat has grown louder over the past decade for those seduced by the mandate to use control and treatment groups. During the Bush years, the federal government was at the forefront of advocating a “Scientifically Based Evaluation Methods” policy, sending the meek members of the education establishment scurrying.

In its purest form, Scientifically Based Evaluation Methods calls for randomized controlled trials (RCT’s), in which subjects are randomly assigned to treatment and control groups. As the mantra reads, no true research in education can be conducted without RCT’s or, failing that, some variation of quasi-experimental design. Funders and foundations echo the need for comparison groups.

There are alternatives, however, especially when RCT’s are preposterous, clumsy, and energy draining. Enter Michael Quinn Patton with some solid advice disputing the slavish attraction to RCT’s. Standards for research may be quite different from those used in evaluation (a dichotomy that I find only artificial, incidentally), but we’re plagued in higher education by university researchers who insist that RCT’s are the only route to new knowledge.

Admittedly, Patton indicates that there are times when RCT’s are appropriate: drug studies, fertilizer and crop yield studies, and single health practices. The fertilizer reference, of course, gives me a broad chuckle. In my keynote as president of the Association for Institutional Research several years ago, I pointed out that treatment and control group methodologies were a call to “explain a multivariate world with a two-variable model.” Six years later, I still think I’m right and am encouraged when Patton agrees.

Times when RCT’s are not appropriate include situations that are complex, multi-dimensional and highly context-specific. Patton uses community health interventions as an example; more broadly I use any intervention that seeks to change complex human behavior such as learning and skill interventions. My monograph late in the 1990’s sought to explain how complex and interrelated factors come together to predict student learning and cognitive development. Readers wanting a quick overview of how complex events might be related to produce a desired outcome can find a visual in my monograph.

So, knowing that all of this is complex and that only a splinter of it can be explained by RCT’s or quasi-experimental design, where does this leave all those gentle souls who merely want to prove that their programs work? Having to learn much more than the common mantra, I suspect. Patton talks about both the possible and the appropriate, a good place to visit. Multiple sources of data about each case, triangulation of sources, modus operandi analysis, and epidemiological field methods are his keywords. To me, this all sounds a lot like attending to context and generating meaning from each program’s reality rather than hammering on an RCT. Patton goes further, though, and offers that RCT’s aren’t needed when face validity is high, the observed changes are dramatic, and the link between treatment and outcome is direct. It seems that we agree that educators and evaluators frequently treat every program as a nail because the only tool they have is the RCT hammer!

Now, there are times when RCT’s are appropriate. Most often, however, the assumptions they carry limit what can be learned about the intervention under scrutiny. In higher education, we assume that small, micro-level programs can either assign students randomly to control and treatment groups or have the sophistication to scientifically match students in treatment and control groups. The former is virtually impossible from a moral as well as a logistical perspective, while the latter does violence to the complexity of an intervention. My experience working widely with programs throughout the United States has taught me the underlying assumption: matching subjects on gender, age, and race/ethnicity neutralizes the effect that gender, age, and race/ethnicity have on an intervention. After that, as the song goes, “true differences” emerge. I don’t think so.

Does anyone besides me think that student outcomes depend more on other key factors such as the structure of the intervention, student motivation, and the quality of teaching? Does the lack of an RCT mean that any other data gathered about that program are meaningless?

I’ve lately come to use the term developmental evaluation in my consulting practice to distinguish our approach from summative and formative evaluation. Most funders are interested in summative and formative evaluation, while some are moving toward developmental approaches. The difference? Summative evaluations are for making final judgments, and formative evaluations are directed at improving programs. Developmental evaluation, on the other hand, focuses on ongoing development and knowledge building. The programs I work with aspire to be innovative and cutting-edge; they don’t have a long history from which to draw. They’re not ready for rigid summative judgments, although they’re receptive to formative help. They are works in progress.


1 comment to Spinning Out of Randomized Controls

  • Jacki Stirn

    We had an interesting discussion at one of my Round One colleges about the ethical dilemmas of trying to keep comparison groups “pure.” Since it was early in Achieving the Dream, there were few resources concerning the ethics. This is a college that has been successful, and I attribute some of that to the truly student-centered atmosphere of the college. In hindsight, the ethics discussion was an early cue.

    They decided to keep the research as pure as possible but to serve any student requesting the intervention, whether or not they were in the selected pilot group.
