Principal Investigator's Guide, Chapter 1: True Stories from your Peers: The Interplay Between Evaluation and Project Implementation

I am not a professional evaluator, nor are the other contributors to this chapter of the Guide, but we all have traveled to the land of evaluation and are here to report on some of the opportunities and adventures that lie therein. Welcome, fellow travelers. The true stories that follow are intended to engage you in the equivalent of a headlong rush through a foreign market full of colorful textiles, delicious-smelling spices, and tasty treats. I've interviewed many project leads, and added a few stories drawn from my own experience, to paint a vivid picture of the interplay between evaluation and project implementation...

Rachel Hellenga develops exhibitions and performs exhibit-specific strategic planning and fundraising. She is a Chicago-based consultant and self-professed "Sam-I-Am" of evaluation, owing to her many positive experiences working with professional evaluators over the course of a twenty-year career in the museum field. She has an insatiable appetite for visitor input, which has been reinforced by the results of integrating evaluation into projects such as the NSF-funded Inventing Lab and Skyline exhibitions at the Chicago Children's Museum, featuring a flying machine tower and construction materials replicated by other museums around the country; and the Science Storms exhibition at the Museum of Science and Industry, winner of the 2011 AAM Excellence in Exhibitions Award and the ASTC 2011 Roy L. Shafer Leading Edge Award. Rachel received her B.A. in psychology from Harvard University, and her particular areas of interest include education research on encouraging persistence; tinkering/making/engineering themes; Reggio-inspired design; bullying prevention; and novel uses of technology in exhibitions.


In the following chapters, friendly colleagues who have been in your shoes will address key topics and practical considerations to help you successfully integrate evaluation into your informal STEM education projects. You'll learn about the principles of evaluation and how different types of evaluation can be used (chapter 2). You'll learn how to find an evaluator whose expertise matches your project needs (chapter 3) and how to work with that individual throughout all phases of project development and implementation (chapter 4). You'll delve into the details of designing an evaluation plan (chapter 5). And, you'll learn how evaluation findings can be used to improve your project, inform the work of your peers, and even influence the overall success of your institution and the entire field of informal STEM education (chapter 6). Throughout these chapters you'll find stories told by PIs and other project leads who have integrated evaluation into the implementation of exhibitions, television shows, science cafés, after-school programs, science festivals, and other informal science education projects.

Who should read this chapter?

Read this chapter if you like to see concrete examples before diving into underlying principles. You may also find this chapter helpful if you are new to the informal STEM education field and want to understand the operating context for evaluation activities. If you prefer to get straight to the details, feel free to jump right to chapter 2.

If you haven't jumped ahead, you're about to read several vignettes organized by the phases of typical project development.

  • First we'll look at setting project goals and objectives so you'll know what success looks like when it's time to measure your project outcomes.
  • Next we'll explore how front-end evaluation can help you better understand the needs and interests of your audience.
  • Following that, a few stories about formative evaluation will demonstrate how participant feedback can surprise you with valuable insights and can inspire course corrections.
  • Next we'll examine how summative evaluation can help you determine whether your project objectives have been achieved.
  • Finally, we'll talk about using your evaluation findings at the reporting and dissemination stages of project development.

Addressing Some Concerns about Evaluation: Time, Money, and Control

Some project leads fear that embracing evaluation means putting visitors or project participants in charge of the design process, a bit like Aesop's fable about a man and his donkey.

A centuries-old dilemma: getting to your destination while trying to please everyone

A man is travelling to market with his son and donkey. The first passerby suggests that the man is foolish because "donkeys are to ride upon." The man puts his son on the donkey, only to hear from the next passerby that "the lazy youngster rides while his father walks." Father and son switch places only to hear "what a bad father to let his son trudge along." When both then hop on the donkey, someone observes that they are overloading it. Eventually the man and his son carry the donkey until they trip on a bridge and the donkey goes overboard. The moral of Aesop's tale: Please all and you will please none.

In fact, the intent of evaluation is to inform project design rather than to generate or dictate it. This Guide will outline best practices in the integration of evaluation and project implementation and share examples of evaluation in action.

Let's acknowledge another concern: Project leads sometimes view evaluation as a process to be contained so that it doesn't derail the "real work" of project completion. It's true that evaluation can add to project cost and may extend your implementation schedule. However, when evaluation keeps your project on track and enables you to maximize your project's impact, it's time and money well spent. We'll look at a few situations in which evaluation saved money by preventing expensive mistakes; helped attract new funding; and eliminated lengthy debates in the conference room in favor of data collection and reality-testing.

Evaluation shouldn't substitute participants' judgment for yours

Here's a cautionary tale that acknowledges the fine line between inviting participant feedback and handing over control of the design. [David Schaller] Eighteen months after launch, WolfQuest had attracted a substantial audience: 450,000 game downloads and 150,000 registered users. So, for the next stage of this 3D wildlife simulation game developed by EduWeb and the Minnesota Zoo, we gave players a larger role in the design process. In fact, we put them in charge. In a slight departure from the methods of more traditional formal evaluation, we solicited ideas for new game features with a contest rather than through focus groups or surveys. After receiving nearly 1,000 proposals from players, we selected the best and then put them up for a vote on the WolfQuest forum.

To our surprise, proposals to improve gameplay and learning (e.g., adding bison as prey animals, creating better multiplayer missions) attracted fewer votes than proposals to add environmental enhancements (rain, snow, and changing times of day). Because we had made the vote public, we had to abide by the results. In retrospect, we realized our two big mistakes:

  • Instead of using player input to inform project development, we had allowed it to dictate a particular direction.
  • We had no sampling strategy, so the vote was skewed by a core group of players who used the game mainly as a chat room and who preferred improvements that would enhance their sense of immersion in our wolf-world.

Democracy has many merits, but benevolent dictatorship is generally a better model for formative evaluation.

Getting Started: Defining Project Outcomes

Determining goals and objectives requires describing specific outcomes that you hope your project will achieve for a visitor or participant. This process is described in detail in chapter 5, but here's a brief explanation. Typically, outcomes are stated first as a series of goals that describe the broadest possible impacts that a project will have within its community, and then as a series of measurable objectives that indicate specific and observable effects on the intended audience. Many evaluators coach project teams by asking some variant of the question: "What will your participants think, do, or feel as a result of their experience?" This exercise requires the team to define and articulate project outcomes that represent the desired effects on the end user. Outcomes can be framed in many ways: in addition to "think" (cognitive outcomes), "do" (behavioral outcomes), or "feel" (affective outcomes), participants may show particular types of engagement, develop certain skills, or develop new interests in science (types of outcomes are elaborated in chapter 5).

If you'd rather skip reading about setting goals and objectives and jump right into project implementation, please first consider one more tale about the perils of ignoring best practices in project design:

"Ready, aim, fire!" vs. "Ready, fire, fire!"

Early in my career I found myself on a project team that was racing to develop a series of exhibit components under the time pressure that we all typically face. We had hired an evaluator to satisfy the requirements of our grant. As an exercise to understand the team's vision, she asked each team member to review a stack of images and to choose those most representative of the exhibition content. Our choices were all over the map! We had not come to an early consensus on project direction or clearly articulated our project outcomes. As a result, exhibit components were chosen based on the advocacy of individual team members. Instead of "Ready, aim, fire," it was "Ready, fire, fire." Unfortunately, the exhibition never really jelled-although it did open on time!

Rachel Hellenga

A goal- and objective-setting exercise is NOT the time to talk about outputs, which may be project deliverables ("we will produce a video") or logistics ("we will reach 5,000 students in the first year") or content ("the topic is earthquake modeling"). Rather, outcomes are the effects that you hope to achieve, and continuing to develop your program model requires spelling them out in detail. Not only will the process provide you with measures of success, it also will suggest strategies for project development that will help you achieve your outcomes.

Learn about your Audience: Front-end Evaluation

Front-end evaluation provides information such as the intended audience's general knowledge, questions, expectations, experiences, motivations to learn, and concerns regarding the topics or themes that will be addressed by your project. Some might argue that this phase of project design is not "evaluation" per se. However, understanding your audience is critically important in project design, and audience research is often facilitated by an evaluator as part of the overall project development process.

Defining outcomes (not outputs) leads to new program strategies and measurable results

When developing a body image program for teen girls, answering the question "How will the participants be transformed as a result of this project?" was a useful exercise for staff at the Robert Crown Center for Health Education. The word "transform" conveyed the idea of important change rather than big deliverables, and focused their conversation about desired outcomes for the new program. What would girls think, do, or feel differently as a result of participating in our program?

Kathleen Burke

One cognitive outcome defined by the team was for girls who participated in the project to "think critically about the media and realize they can refuse to imitate what they see." This desired outcome led to the development of an activity showing "before and after" pictures of magazine models, which revealed how photo manipulation made the models look thinner. Girls in the pilot program rated this activity very highly, and their responses to an open-ended question after participating in the pilot-"I will not compare myself to people and magazines;" "I don't need to look like a model;" "Models don't look the same on TV as they do in real life;" and "Magazines are liars!!!"-showed that the girls were definitely thinking critically about the media.

Facilitate good decisions: Formative evaluation

Formative evaluation provides information to improve a project during its design and development. Project leads may be tempted to rely on their own opinions about how a project is shaping up and to skip formative evaluation, first with the argument that "it's too early in the process," followed shortly by the argument that "it's too late in the process." I believe the right time for formative evaluation is always right now. Each round of data collection provides a snapshot of one point in time, delivering rapid feedback to inform your project design.

Learn about your audience: front-end evaluation can identify "hooks" for a tricky subject.

When replacing the genetics section of the Tech Museum's LifeTech gallery, my team suggested removing a DNA crime-fighting component and encountered vigorous resistance from members of the marketing department. They were concerned that the exhibition would be all doom and gloom if we focused solely on high-tech strategies for addressing genetic disease. We pointed out that crime fighting didn't fit in a gallery about technology and health. They countered that the crime-fighting exhibit could be "extra"-it wouldn't do any harm to insert something fun to offset the gloomy content.

Rachel Hellenga

We addressed the "gloom and doom" risk by conducting front-end evaluation. The evaluators conducted interviews with visitors and presented various written scenarios such as 1) a woman deciding if she needed breast cancer screening based on her family history and 2) a boy who might benefit from human growth hormone. [Please note that the evaluators did not go out on the floor and ask visitors to vote on inclusion of a crime-fighting exhibit!] We learned that visitors were not turned off by the dark side of the topic. In fact, the personal stories were a "hook" that motivated visitors to absorb scientific information so they could perform diagnoses and advise patients.

Once Genetics: Technology with a Twist opened to the public, our summative evaluation found that two-thirds of interviewees described what they took away from the exhibition by naming exhibits related to personal stories. Some visitors expressed surprise at the extent to which genetic disorders impact families, while others found the information from the stories personally relevant. Front-end evaluation had given us confidence to pursue this direction, and using the "hook" of personal stories had paid off.

Learn a little before investing a lot

Sometimes an evaluator can help a team resolve questions and reach consensus before incurring the full expense of producing a polished deliverable. [Rachel Hellenga] During the design phase of the Science Storms exhibition at the Museum of Science and Industry, the design team was planning a component that would allow visitors to race two solar-powered cars around a track. The design showed visitors using a joystick to tilt a solar panel into or out of the path of the light as a method to generate more or less power and control the car. The exhibit developer suggested an alternate strategy of covering and uncovering the solar panels to more clearly indicate how much light was reaching them. The "covering" approach seemed superior, but the "tilting" approach had already been detailed, so the team debated whether it was worth the time and money to change the component before spending tens of thousands of dollars on fabrication.

The developer produced a $20 mockup consisting of a regular light bulb, solar panel, and 3-inch diameter fan. The prototype was less than rudimentary; not only was it a fraction of the real size but it used a solar-powered fan instead of a car. Nevertheless, pairing it with exhibit text gave the evaluator enough to work with. Formative evaluation determined that visitors understood the concept much better when they were allowed to cover and uncover the solar panel instead of tilting it toward and away from the sun. The team changed the design.

What did they learn? Or do? Or feel? Summative Evaluation

Summative evaluation measures the degree to which objectives for the intended audience have been realized, which explains the importance of setting objectives that can actually be measured. While sometimes thought of as "final" evaluation, summative evaluation can begin long before a project is complete, and it can take many forms-tracking and timing for exhibits; pre- and post-surveys for community projects; analyses of journals for community science projects; interviews with project participants; and many, many more. The following stories about three diverse projects show the power of painting a clear picture of your destination at the outset of a project, not in the form of specific exhibit layouts or project materials or other deliverables, but as a shared vision of the desired outcomes for your visitors or project participants. If you take the time to define your vision of success, you certainly will increase your odds of achieving it.

Summative Evaluation: Measuring cognitive outcomes (What will they learn?)

As we embarked on the summative evaluation of Dragonfly TV, we wanted to measure whether watching the show successfully changed children's appreciation for and understanding of scientific inquiry. In the planning stages for this show featuring "real kids doing real science," we had identified specific aspects of inquiry that we wanted to convey and incorporated those concepts into the project. Now it was time to see if the message got across. I worked with our evaluator to craft evaluation questions to measure whether we had achieved this outcome. The dialogue between PI and evaluator was essential to arriving at a solution. We landed on a strategy of asking kids questions such as "How important is it to you to write down information during an experiment?" and "How important is it to you to keep some things the same each time you repeat the experiment?" The kids ranked the importance of each aspect of inquiry before and after viewing episodes of Dragonfly TV, and through this simple set of questions we were able to demonstrate that watching our show resulted in a significant increase in children's understanding of the process of scientific inquiry.

Richard Hudson
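
As a side note for the statistically curious, here is a minimal sketch, in Python, of how a pre/post comparison like the one Richard describes might be analyzed. It is not the actual analysis from the Dragonfly TV study; the ratings, the five-point scale, and the choice of a paired t-test are all illustrative assumptions.

  # Minimal sketch: comparing hypothetical pre- and post-viewing importance
  # ratings (1-5 scale) for one aspect of inquiry. Not the actual study data.
  from scipy.stats import ttest_rel

  pre_ratings = [3, 2, 4, 3, 2, 3, 4, 2, 3, 3]    # before viewing (hypothetical)
  post_ratings = [4, 4, 5, 5, 3, 4, 5, 3, 4, 4]   # after viewing (hypothetical)

  # Paired t-test on the same children's before/after ratings; a nonparametric
  # alternative such as the Wilcoxon signed-rank test is also common for Likert data.
  t_statistic, p_value = ttest_rel(post_ratings, pre_ratings)
  print(f"t = {t_statistic:.2f}, p = {p_value:.4f}")
  # A small p-value suggests the post-viewing ratings are reliably higher.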

Summative Evaluation: Measuring behavioral outcomes (What will they do?)

I was on a team that completed a radical overhaul of a popular construction exhibit with the aim of engaging more girls and ensuring that more visitors succeeded at the building activity. Over the years, staff at the Chicago Children's Museum had seen all kinds of wonderful free-standing structures built by visitors to Under Construction every day, but a formal front-end study revealed that only 11 percent of the visitors observed created these structures; others were connecting pieces together or engaging in other activities before moving on to another gallery. One of our desired behavioral outcomes for the Skyline exhibition, to be measured through observation, was to increase the number of children who built a free-standing structure. This desired outcome became a driver for various design strategies, such as the substitution of larger nuts and bolts to cut the assembly time in half. The evaluation team determined a set of criteria for "free-standing structure" and observed 100 children using the new exhibition. We were pleased to find that 40 percent of them built stable, free-standing structures.

Rachel Hellenga

A second objective was to increase girls' involvement, as the original exhibit was engaging only about half of the girls who visited, versus about two-thirds of the boys. One of the modifications inspired by this objective was the addition of fabric with appealing textures and colors to the array of available materials. Keeping those outcomes in mind during early stages of the design process helped our team achieve the results we were aiming for by the time we reached the summative evaluation stage. Once Skyline opened, observations conducted by the evaluation team determined that 71 percent of children became engaged in building, with no statistically significant differences between males and females.
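
For readers who wonder how an evaluator might judge whether the jump from 11 percent to 40 percent reflects a real change rather than chance, here is a minimal sketch of a standard test of proportions. The counts are hypothetical (the size of the original front-end sample is assumed to be 100), and this is not necessarily the analysis the Skyline evaluation team performed.

  # Minimal sketch: chi-square test comparing the proportion of observed children
  # who built free-standing structures before and after the redesign.
  # Counts are hypothetical; only the percentages come from the story.
  from scipy.stats import chi2_contingency

  #           built  did not build
  counts = [[11, 89],   # original Under Construction exhibit (assumed n = 100)
            [40, 60]]   # new Skyline exhibition (n = 100, per the story)

  chi2, p_value, dof, expected = chi2_contingency(counts)
  print(f"chi-square = {chi2:.1f}, p = {p_value:.4f}")
  # A very small p-value indicates the increase is unlikely to be random variation.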

Summative Evaluation: Measuring affective outcomes (What will they feel?)

Bouncemania integrates elements of a play and a science demonstration in a Wrestlemania-style match between two rubber balls to teach the relationship between polymer structure and function. The show was created by Fusion Science Theater, a project led by the Madison Area Technical College, and features a live performer who presents the balls as characters: "In this corner, weighing in at 10.5 grams, the five-time title winner, a ball with experience, a ball with bearing. Give it up for . . . B.B. the King Bouncer!" The performer shares information about the molecular structures of the balls and asks the audience to vote for which one they think will bounce higher. Then the performer has audience members act as atoms in a physical model of the structures and, once again, predict which ball will bounce higher. The demonstration itself is not that fancy (one ball bounces and one drops with a thud), but by that point in the show the audience is cheering and exploding with excitement. Committing to a prediction motivates the children to apply what they've learned and to take a strong interest in the outcome of the demonstration.

Holly Walter Kerby

In addition to looking for cognitive gains, we studied affective outcomes by measuring whether children experienced a positive change in their perception of themselves as individuals who are interested in and capable of learning and doing science. We believed that applying a playwright's bag of tricks (dramatic irony, plot twists, and character development) would engage children's emotions at a deeper level than a typical science demonstration. Our in-house evaluator worked with us to develop questionnaires for use before and after the performance. Children used Likert scales to indicate their interest in science and their confidence in their ability to learn science. Their ratings revealed that participating in the performance had a dramatic impact (no pun intended)!

On a more informal level, we hold conversations with audience members afterward. One of my favorite conversations went something like this: I asked a boy what he liked about the show and he said "I like BB the King." I asked, "Why?" and he said "Because he won." I responded, "You like it when things win?" and his answer was "I liked it because I knew he was going to win."

Maximizing your Strategic Impact: Reporting and Dissemination

Not only can evaluation findings provide a picture of a project's effectiveness, but they can also be valuable in identifying strategic next steps for an organization, in making the case for further funding, and in informing the practice of others. That value is realized only if the information is embraced by the project team and shared with others who could benefit. Dissemination should be a fundamental part of your overall project evaluation plan and something you factor in from the start.

Integrating evaluation with your strategic vision

In an ideal world your evaluators go beyond assessing project outcomes; they help to gather strategic information for your organization. In anticipation of its 2011 documentary series The Fabric of the Cosmos, NOVA hosted a series of "Cosmic Cafés" around the country with the assistance of chapters of the Society of Physics Students (SPS). Based on the format of a science café, featuring an informal discussion with an expert, the Cosmic Cafés addressed topics raised by the program. We needed our evaluation study to go beyond simply assessing what the attendees at these specific cafés learned about Fabric of the Cosmos, because we intended to use the evaluation findings to help shape the ongoing national network of science cafés.

It was important for the PI to articulate the big picture to our evaluation team and to plan for this broader scope of work from the start, so that the evaluation team understood the cafés were part of an ongoing national network that needed to grow. We worked with the evaluators to frame questions about how these cafés could inform the field of informal science education. Armed with the big picture, the evaluators conducted the study and informed us that they saw great potential in further collaboration with professional organizations like the Society of Physics Students. The SPS was a very valuable partner in implementing the cafés; in return, the project helped the undergraduates learn how to present publicly and impressed on them the importance of communicating with the public as they advance in their careers. The recommendation to invest further in that partnership informed our strategy going forward.

Rachel Connolly & Pamela Rosenstein

Disseminating evaluation findings can attract resources for your project

In the early years of the Science Festival Alliance, a common critique that we heard was "we don't need a once-a-year party, we need a sustained effort." When the Science Festival Alliance was in its second year, the evaluation covered a huge swath of events that had different formats, served diverse intended audiences, and took place at venues ranging from retirement homes and elementary schools to bars and tattoo parlors. The evaluation findings helped to demonstrate that these festivals are worth the energy and resources: the festivals were reaching new and diverse audiences, changing attitudes, resulting in cognitive gains, and increasing awareness of the science happening in the region. Over 40 percent of the festival collaborators reported follow-up contacts from the public, and 89 percent of the STEM practitioners surveyed indicated an interest in ongoing participation in public outreach for the rest of the year.

Perhaps the most significant finding was that interaction with a STEM professional was identified as the number one predictor of positive learning outcomes, whether in terms of science learning, increased interest, or the perception that science is fun. That was a unique finding and revealed a distinct strength of festivals. The festival organizers disseminated the evaluation results in a wide range of formats such as PowerPoint, conference poster presentations, online PDF documents, and video, all of which helped to get buy-in for annual festivals and supported new festival organizers in making the case to their communities.

Ben Wiehe

Conclusion

Project planning, evaluation, and implementation are all parts of a whole, working best when they are synchronized and coordinated. At its best, evaluation works to answer questions that give a project team a deeper and richer understanding of its own practice: before, during, and after a specific project implementation. Therefore, an evaluation study should evolve with and be guided by the project, while the project in turn should be informed and guided by its evaluation. This Guide examines how you can implement and manage evaluation to inform your practice, facilitate decision-making on project-development teams, gather evidence of success, attract further funding, and, most importantly, make a difference in the lives of all the visitors and project participants whom you touch through your work.