Improving STEM self-efficacy with a scalable classroom intervention targeting growth mindset and success attribution



I. INTRODUCTION
The U.S. ranks 20th out of 24 countries in the percentage of 24-year-olds who earn a first degree in a STEM field [1]. Fewer than 6% of U.S. college graduates earn a STEM degree [2]. Gender and ethnic disparities in these fields are high: Hispanics, African Americans, and Native Americans make up 27% of the workforce, but only 11% of STEM workers, and men are employed in STEM occupations at twice the rate of women [3-5]. Academe is even more unbalanced: Only 10% of U.S. STEM professors are women, and only 4% of Physics and Engineering professors, one of the lowest rates among developed nations [6, 7]. African Americans comprise just 2% of Physics faculty nationwide [8].
Bandura sees self-efficacy as a determinant of how people feel, think, and behave, and as the foundation for human motivation and personal accomplishment [19, 34-37]. He hypothesized that people with low self-efficacy tend to avoid difficult tasks, while those who feel capable will embrace a challenge [38]. Mastery experiences are one of the four major generators of positive self-efficacy [34] and the one that tends to have the strongest effect on self-efficacy beliefs [39]. Our interpretation is that self-efficacy impacts student success because it improves motivation to persist, a prerequisite for the mastery experiences that further raise self-efficacy: a virtuous cycle leading to learning and academic success.
An efficient, easily disseminated, easily adopted intervention for improving university students' STEM self-efficacy would be broadly useful and valuable. While several interventions have been designed to increase university students' academic self-efficacy in STEM domains [40, 41], they are too time- and personnel-intensive for practical, widespread replication. They are also discipline-specific; we have been unable to find any self-efficacy interventions targeted at STEM disciplines collectively. We therefore sought to develop such an intervention.
Self-efficacy is a complex construct that includes two distinct components: understanding that success is based on one's actions and behaviors, and internal belief that one has the capacity to succeed [19, 35]. We looked for existing interventions that target one or both of these components, and found two types with a strong research base. The first is attributional retraining (AR), which targets a student's attribution of success or failure to internal vs. external factors. Over 30 years of empirical data support the effectiveness of AR for improving academic performance [15, 22, 42-48].
The second intervention type targets mindset by persuading students that intelligence is growable rather than fixed [13, 49-51]. Students tend to believe that the ability to do science is somehow innate or fixed rather than the result of effort [52], and challenging that belief with a mindset intervention can increase academic performance [13, 49, 51, 53-55]. Growth mindset has also been linked to other variables associated with strong performance [56, 57], although perhaps only for academically at-risk students [58]. Its applicability to university-level physics has not been extensively studied [59].
Self-efficacy Intervention to Improve STEM Performance (SIISP) is a four-year NSF-funded project to develop and test an efficient, practical, disseminable, easily adopted intervention to increase university students' STEM self-efficacy through a combination of attributional retraining and growth mindset instruction. Its primary hypothesis is that a brief intervention teaching STEM students about growth mindset and identifying ways they can enact it to take control of their academic success will increase their STEM self-efficacy.
Because the project's aim is to develop a self-efficacy intervention useful across STEM fields, the designs of our intervention and instrumentation are discipline-agnostic. However, we did not have the capacity to include multiple courses and campuses in each of multiple disciplines. To preserve comparability across courses and campuses, we chose to focus on one discipline. We selected physics as our working context, given that physics is a required gateway course for most STEM majors and has a reputation for being quite difficult. Since students' self-efficacy or mindset attributions toward physics may differ from those toward other disciplines [60], subsequent research should include different disciplines.
The project is currently in its final analysis stages. In this paper, we will briefly summarize the intervention, measurement instrumentation, and study design, and then present evidence that although the intervention successfully increased students' growth mindset to a statistically significant degree, it did not cause a detectable increase in STEM self-efficacy.

II. METHODS
The SIISP project employed a large-scale quasi-experimental design in which students from multiple introductory Physics courses at three demographically distinct universities were divided into intervention and control groups by lab section. Students were quantified on each of three psychosocial variables, STEM self-efficacy (SE), growth mindset (GM), and perceived academic control (PAC), via Likert-scale questionnaire three times: before the treatment, a few weeks after the treatment, and late the following semester. We repeated the design for three consecutive semesters (N = 265, 201, 387) with small adjustments to the treatment protocols in each iteration. Hierarchical linear modeling (HLM) was used to determine the impact of the intervention vs. the control and to identify moderating factors.

A. Context and Population
Three public universities with distinct student body demographics participated in the study: the University of North Carolina at Greensboro (UNCG), a former women's college with a female majority and a racially diverse population (66% female, 51% minority, 49% with Pell grants); North Carolina Agricultural and Technical State University (NCA&T), a Historically Black College or University (HBCU; 57% female, 95% minority, 62% with Pell grants); and North Carolina State University (NCSU), a highly rated engineering-focused school (45% female, 29% minority, 22% with Pell grants) [61]. Our study population was drawn from algebra- and calculus-based Introductory Physics I courses at UNCG and NCA&T, and from the calculus-based course at NCSU. The student demographics of each of these courses were not representative of its campus as a whole. In general, calculus-based courses at UNCG and NCA&T contained a smaller percentage of female or minority students than the campus as a whole.
From Fall 2016 through Summer 2017, we conducted pilot tests on early versions of the intervention, control treatment, and instrumentation as we designed and iteratively improved them, drawing students from UNCG and NCA&T. Data collection with the finalized survey instrument and the final (aside from small tweaks) intervention and control occurred during the Fall 2017, Spring 2018, and Fall 2018 semesters.
In Fall 2017, we recruited unpaid volunteers from students enrolled in the algebra- and calculus-based courses at NCA&T and from the algebra-based course at UNCG. All enrolled students were told that one laboratory section meeting would include, as a regular mandatory part of the course, an informational unit designed to help them succeed in a STEM major and career. A representative of the SIISP project explained that we were conducting a research study about the effectiveness of this unit, provided the informed consent document, and asked for volunteers.
We repeated this process in Spring 2018, in both algebra- and calculus-based courses at UNCG and NCA&T; and in Fall 2018, in the algebra-based course at UNCG, both courses at NCA&T, and the calculus-based course at NCSU. Table I lists participant counts by course type, campus, and semester.

B. Intervention and Control Treatments
The primary goal motivating the SIISP project was to develop and validate a practical, efficient, scalable intervention to increase university students' STEM self-efficacy by inculcating a growth mindset while increasing their sense of control over their academic success. We wanted it to be self-contained enough that physics instructors or others without specialized training or background could adopt it. We also wanted it to be interactive, engaging, and responsive to students' thoughts. We therefore decided on a format in which an in-person facilitator would introduce the topic and then present two short, animated, narrated videos presenting the key ideas of the intervention. After each video, the facilitator would elicit student thoughts and promote class discussion with a few planned prompt questions. For the second and third semesters, we added an additional element: Each student received a small paper booklet with the discussion prompts; for each prompt, participants were asked to record their thoughts before the whole-class discussion began. The session elicited students' thoughts about why two equally hard-working students in introductory physics might experience different levels of success; presented the idea of and evidence for growth mindset and discussed it; explained when and how growth mindset applies to taking a STEM course; presented strategies for practicing it by engaging in "productive struggle" and focusing on process; elicited students' ideas about strategies they might personally try; and ended with a written prompt asking students to give growth-mindset advice to an incoming freshman. It required 30-40 minutes of class time.
To reinforce the message, we followed the primary intervention session with a short follow-up about four weeks later. In the follow-up, we distributed a single-page handout containing a brief, graphical summary of growth mindset and productive struggle as they apply to a STEM course (reusing a key graphic from the original session) and two open-ended prompts asking students to briefly describe how the ideas in the original session had affected their thinking and what else they could try in the future. The follow-up session required 10-15 minutes of class time.
In order to conduct an appropriate test of the intervention, we also designed a control treatment with the same general structure as the growth mindset intervention and as many similar or parallel design elements as possible. However, instead of growth mindset, the content focus of the control session and its follow-up was on developing an understanding of the complexities of "diversity" in the modern STEM workplace, on the importance of "cultural competence" for success in it, and on specific skills necessary for cultural competence. The control also had an associated follow-up session, with a single-page handout summarizing the key ideas, re-presenting a graphic, and providing analogous writing prompts. In Fall 2018, the final iterations of the control treatment required approximately 30 minutes for the primary session and 10-15 for the follow-up.

C. Protocol and Data Collection
During the first half of each semester, we solicited volunteers from each participating course and completed the informed consent process. We then administered a pre-test questionnaire consisting of demographics questions and our psychosocial variables instrument (see subsection D).
We conducted the intervention and control treatments mid-semester, after the first course exam. These were conducted in course laboratory section meetings. We assigned each lab section to either the intervention or control treatment, arbitrarily except for an attempt to balance student counts and section distributions over time of day and day of week. We conducted the follow-up treatment sessions three to four weeks later, also in lab section meetings. As supplemental data, we collected booklets from the main sessions and handout pages from the follow-ups, both containing student responses to writing prompts.
Near the end of the semester, we administered a post-test identical to the initial questionnaire except that it omitted the demographics section. About half a year later, near the end of the subsequent academic semester, we contacted participants via email and asked them to repeat the post-test. We incentivized completion by entering all respondents into a drawing for a $50 Amazon gift card.
We visited each class or lecture section to conduct participant recruitment and the main treatment session. We also conducted the follow-up session in person in Fall 2017, Spring 2018, and at NCA&T in Fall 2018. Similarly, a team member administered the pre-test and post-tests via paper bubble-form questionnaire in Fall 2017, Spring 2018, and at NCA&T in Fall 2018. For logistical reasons and to explore scaling potential, in Fall 2018 at UNCG and NCSU we conducted the pre- and post-tests online via Qualtrics and administered the treatment follow-up sessions online and asynchronously via each course's learning management system.

D. Instrumentation and Analysis Process
When we initially conceived the SIISP project, we intended to use extant instruments for measuring the impact of our intervention relative to our control treatment. For our first pilot test rounds, we used a questionnaire assembled from relevant sections of published instruments. However, we quickly discovered that these instruments were not sufficient for our needs. The primary problem was that the items were too easy to agree with, such that we experienced very strong ceiling effects and therefore insufficient ability to detect changes. Over subsequent pilot test rounds we developed and refined a new instrument, the final version of which we employed in our three main study semesters.
This new SIISP instrument gauges three psychosocial variables: STEM-specific academic self-efficacy (SE), growth mindset (GM), and perceived academic control (PAC). It contains 38 five-choice Likert-scale items: 20 targeting SE, 7 targeting GM (two reversed), and 7 targeting PAC. The 20 SE items group into three subsets based on the approach they take to eliciting self-efficacy: ten statements about capacity (e.g., "I can correctly solve the most difficult homework problems in my STEM courses"), six statements about behavior (e.g., "I study enough to do well on STEM quizzes and exams"), and four statements about larger-scale success (e.g., "I can excel in an undergraduate STEM major"). We will provide details about the grounding, development, validation, and use of the SIISP instrument in a forthcoming publication.
A trait value for each variable (SE, GM, and PAC) is estimated from each completed questionnaire via Rasch modeling. Specifically, we amassed all questionnaires from all survey rounds in all three study semesters that used the finalized instrument; eliminated those with the same option selected for all 38 items, including the two reversed items ("column marking"); used WinSteps software [62] to fit a five-category Rasch Rating Scale Model to the resulting ensemble of responses, separately for each of the three variables; and extracted person trait estimates for SE, GM, and PAC for each respondent in each survey round. Rasch measures are reported in logits, defined such that a score increase of one logit multiplies the odds that an individual will select any particular response category rather than the next-lower one on a rating scale probing that trait by a factor of e ≈ 2.7.
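As a concrete sketch of the model being fit, here is a minimal implementation of the Rating Scale Model's category probabilities. This is our own illustration with hypothetical parameter values; the study itself used WinSteps for estimation.

```python
import math

def rating_scale_probs(theta, delta, taus):
    """Category-response probabilities under the Rasch Rating Scale Model.

    theta : person trait estimate, in logits
    delta : item difficulty, in logits
    taus  : K threshold parameters for a (K+1)-category rating scale
    """
    # Numerator for category k is exp of the cumulative sum of
    # (theta - delta - tau_j) for j = 1..k; category 0 uses exp(0) = 1.
    numerators = [1.0]
    running = 0.0
    for tau in taus:
        running += theta - delta - tau
        numerators.append(math.exp(running))
    total = sum(numerators)
    return [n / total for n in numerators]

# Hypothetical five-category item: person at +1 logit, item difficulty 0,
# centered threshold values chosen purely for illustration.
probs = rating_scale_probs(theta=1.0, delta=0.0, taus=[-1.5, -0.5, 0.5, 1.5])
```

In this parameterization, the odds of selecting category k rather than category k-1 equal exp(theta - delta - tau_k), so a difference of one logit in the trait estimate corresponds to an e-fold change in each adjacent-category odds; this is what makes logit differences directly interpretable.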
These trait estimates by person and round, along with categorical data such as treatment (intervention vs. control), campus, course level, gender, and ethnicity, were used for a hierarchical linear modeling (HLM) analysis of the effect of our intervention on students' SE, GM, and PAC scores, as moderated by various possible categorical variables. At level 1, students were nested within lecture sections, and pre-test to post-test changes in the students' SE, GM, and PAC scores were predicted from whether they participated in the intervention or control treatment. At level 2, we controlled for any effects of semester, institution, and course level. In additional analyses, we also examined the interaction of students' gender or ethnicity with the treatment effect at level 1.
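Schematically, the two-level model just described can be written as follows. The notation is ours, introduced for illustration; the paper does not give explicit model equations.

```latex
\begin{align*}
\text{Level 1 (student $i$ in section $j$):}\quad
  \Delta y_{ij} &= \beta_{0j} + \beta_{1j}\,\mathrm{Treat}_{ij} + r_{ij},\\
\text{Level 2 (section):}\quad
  \beta_{0j} &= \gamma_{00} + \gamma_{01}\,\mathrm{Sem}_{j}
             + \gamma_{02}\,\mathrm{Inst}_{j}
             + \gamma_{03}\,\mathrm{Level}_{j} + u_{0j},\\
  \beta_{1j} &= \gamma_{10},
\end{align*}
```

where $\Delta y_{ij}$ is a student's pre-test to post-test change in SE, GM, or PAC, $\mathrm{Treat}_{ij}$ indicates intervention vs. control, and the treatment effect of interest is $\gamma_{10}$; in the additional analyses, gender or ethnicity and their interactions with treatment enter as further level 1 predictors.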

III. RESULTS
HLM analysis revealed that, averaged over the entire study sample, participants' growth mindset (GM) scores increased from pre-test to post-test by 0.169 ± 0.060 logits (0.12 standard deviations, p = 0.005). It found a statistically significant effect of treatment on change in growth mindset, with the gain larger for the intervention group by 0.311 ± 0.079 logits (0.213 SD, p < 0.001). Most of the level 2 variables (campus, course level, gender, and ethnicity) had no significant moderating effect, alone or in combination. Semester had a significant effect on the impact of treatment when comparing Fall 2018 and Spring 2018 to Fall 2017 (−0.203 ± 0.097 logits, −0.139 SD, p = 0.037), but no effect for Fall 2018 vs. Spring 2018.
Participants' average perceived academic control (PAC) scores decreased from pre-test to post-test by 0.308 ± 0.097 logits (−0.131 SD, p = 0.002). Treatment did not have a significant effect on PAC. Additionally, no level 2 variables or combinations thereof had any significant moderating effect.
Participants' average self-efficacy (SE) scores increased from pre-test to post-test by 0.141 ± 0.062 logits (0.107 SD, p = 0.022). Treatment did not have a significant effect on SE change. Additionally, no level 2 variables or combinations thereof had any significant moderating effect.
We also examined the effect of treatment on students' final grades for the physics course in question, with the same level 2 moderating variables. The mean course grade across all participants was 2.339 ± 0.142 on a 4-point scale (corresponding to a C+). No effect of treatment on grade was detected.
Our data may not meet the complex assumptions of HLM [63, 64]. Because HLM analysis showed little variability between lecture sections, we also ran a non-hierarchical multiple regression omitting the lecture section variable, predicting changes in students' SE, GM, and PAC scores from treatment condition while controlling for semester, institution, and course level. Results agreed with HLM, finding the same significant effects as summarized above, with slightly lower p-values. The regression also attributed significance (p < 0.05) to two moderating effects that HLM had not: the effect of treatment on growth mindset was 0.242 logits higher at UNCG than at either of the other institutions (0.166 SD, p = 0.041), and self-efficacy increased 0.149 logits more in Spring and Fall 2018 than in Fall 2017 (0.113 SD, p = 0.037).
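The non-hierarchical check can be sketched as an ordinary least-squares fit of change scores on a treatment indicator plus dummy-coded controls. The snippet below runs on synthetic data with variable names of our own choosing (and, for brevity, only semester as a control), not on the study dataset:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300

# Synthetic stand-ins for the study variables (illustrative only).
treat = rng.integers(0, 2, size=n)        # 1 = intervention, 0 = control
semester = rng.integers(0, 3, size=n)     # 0 = F2017, 1 = S2018, 2 = F2018
delta_gm = 0.17 + 0.31 * treat + rng.normal(0.0, 1.4, size=n)  # GM change (logits)

# Design matrix: intercept, treatment indicator, dummy-coded semester controls
# (Fall 2017 is the reference category).
X = np.column_stack([
    np.ones(n),
    treat,
    (semester == 1).astype(float),
    (semester == 2).astype(float),
])
beta, *_ = np.linalg.lstsq(X, delta_gm, rcond=None)
# beta[1] is the estimated treatment effect on the change score.
```

Because treatment enters as a single fixed coefficient, this is equivalent to the HLM specification with the section-level random effects removed, which is why agreement between the two analyses is informative when between-section variability is small.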

IV. DISCUSSION
The data suggest that our intervention was successful in its direct goal of encouraging students to adopt a growth mindset perspective: It increased students' average growth mindset score by 0.213 standard deviations, a medium effect size with high statistical significance. However, we saw no corresponding increase in self-efficacy scores; our results did not support our primary hypothesis. We also saw no differential benefit for women or minorities, although small sub-populations limited our statistical power to detect such effects.
One possible explanation is that our instrumentation was inadequate to detect self-efficacy changes. Measuring self-efficacy via self-report among adults, individuals sophisticated enough to care about face, to maintain bravado, and to detect what authority figures and tests are looking for, is notoriously difficult [59]. Systematic biases are one concern. Inherent noisiness is a second: Students' self-efficacy trait estimates ranged from −6.77 to 6.98 logits, with uncertainties ranging from 0.28 to 1.9 logits. Many individuals also had suspiciously large "misfit" statistics, indicating an improbable response pattern. These persons may have been interpreting survey items idiosyncratically or just answering thoughtlessly. Another threat to sensitivity is ceiling effects: Despite our efforts to tune the survey items to the population, the majority of participants selected responses from the top few categories, reducing test sensitivity, and 2.6% of SE scores were "hard ceiling" cases. Rasch analysis can assign a score to such cases, but cannot determine just how far above the top end of the scale they really are.
However, the growth mindset measurement suffered from these limitations even more severely: uncertainties for GM and PAC scores were larger, and 8.1% of GM scores were at the hard ceiling. And yet we detected a statistically significant increase in GM. We also detected a significant increase in SE from pre- to post-test; we simply found no effect of treatment on it. This suggests that our conceptual model was incomplete, and that increasing students' appreciation of growth mindset does not lead immediately to higher self-efficacy.
Interestingly, we found that students' perceived academic control, their sense of being in control of their academic success, decreased from pre-test to post-test regardless of treatment condition. Perhaps students did not feel enough agency to believe that their appreciation for growth mindset could translate into better academic outcomes. This suggests a second possible explanation: that increasing both growth mindset and perceived academic control is necessary to increase academic self-efficacy, and our intervention was effective for the first but not the second.
A third explanation is that our model is accurate and our intervention effective, but the study was too brief to detect a self-efficacy impact. Growth mindset supposedly inclines students to take on harder challenges and persevere in the face of difficulties, until ultimately they have mastery experiences which generate positive self-efficacy [65]. Half a semester may have been too short a time for this process to unfold.
A fourth is that our model of self-efficacy development is simply incomplete or incorrect, and increasing students' STEM self-efficacy requires more than addressing growth mindset and perceived academic control.
In summary, we have developed a brief, efficient intervention that successfully increases university STEM students' growth mindset but not their self-efficacy. In the process, we developed a survey instrument to measure self-efficacy, growth mindset, and perceived academic control in university STEM courses. Our results suggest a need for further research into the dynamics of academic self-efficacy.

TABLE I. Student participant counts by semester, campus, and course level, counting those who completed both pre- and post-surveys and participated in the intervention or control treatment.