Effects of facilitating collaboration in large-enrollment introductory physics courses



I. INTRODUCTION
Introductory physics courses at the University of Washington (UW) made a sudden and indefinite transition to remote instruction at the beginning of the Spring 2020 term. In a questionnaire administered during that term, many students reported that, compared with academic terms of in-person instruction, they felt frustration or disappointment in their course performance, lacked motivation to fully engage in courses, and missed the role of casual interactions with peers and instructors in supporting their learning.
In response, we implemented our Learning Pod intervention in introductory physics courses to help students collaborate more frequently and effectively, and to lower barriers for student engagement with instructors and teaching assistants (TAs).
We piloted the intervention in a relatively small summer course, and then scaled up in subsequent academic terms to more courses with larger enrollments. As we scaled up to reach more students each term, we scaled back on some resource-intensive elements of the intervention. The purpose of this paper is to report on differences in student self-efficacy, student-to-student engagement, and the level of student-TA interaction as metrics for comparing the effectiveness of differing implementations of the Learning Pod intervention.

II. BACKGROUND
Self-efficacy is a person's confidence in their ability to succeed in a task. Sawtelle, Brewe, and Kramer [1] developed the Physics Self-Efficacy Questionnaire as a measurement, and found that self-efficacy predicts success in introductory physics courses and can be improved by elements of effective collaboration.
Wilson's work [2] with sophomore-level engineering students suggests that engagement with TAs and instructors can mitigate feelings of worry, anxiety, and discouragement. These feelings are correlated with low success rates, and disproportionately impact women students who identify as traditionally underrepresented minorities (URM) in engineering.

III. CONTEXT: LEARNING POD INTERVENTION
The Learning Pod Intervention was implemented during remote instruction at UW. Over two-thirds of students in the introductory physics courses identify as non-Caucasian, the majority of whom identify as either Asian or Latinx. The interventions discussed in this paper took place in both the regular and honors versions of the calculus-based introductory physics courses. The majority of students taking these courses are interested in pursuing engineering majors, and over two-thirds identify as male.
We report on courses in which instructors prerecorded lectures that students viewed asynchronously, and TAs led synchronous recitation sessions in which students worked collaboratively through Tutorials in Introductory Physics [3]. Labs were asynchronous, developed from Pivot Interactives [4] activities and administered on their online platform. The honors courses followed the same structure, except that lectures were synchronous, and one course's labs involved multi-week group projects [5] aligned with the Investigative Science Learning Environment (ISLE) [6] framework.
In all intervention implementations, students were grouped into learning pods of 3-5 students by course staff, and students and instructors connected through Slack (a professional communication platform). Variations in intervention implementation arose as we scaled up across academic terms. The following tables provide comparisons to highlight the differences between the implementations: Table I compares categories of support for student collaboration, and Table II includes details about what category of support was available in each course.
TABLE I. Categories of support for student collaboration in courses where the Learning Pod Intervention was implemented. All implementations included connecting students and instructors through Slack. Our "teamwork agreement" is a group activity where students contract group goals and individual contributions to the group, and students explicitly discuss how the group will collaborate and be socially sensitive in online settings. Video lectures and TA discussions on collaboration addressed literature on collective intelligence [7] and roles that individuals could take on to support their group.
(Table body not recovered; surviving row fragment: "[6] Multi-week group projects with substantial group deliverables".)

IV. METHODS
In order to assess the effectiveness of different structures to support collaboration (Table I), we present results from a Learning Pod Survey (discussed in more detail later) and discuss the level of messaging activity on Slack. We make four comparisons of these results, shown in Table III. The validity of these comparisons is discussed in the next two paragraphs.

TABLE II. [...] Table I, and the specific course (I represents Mechanics, II Electromagnetism, III Waves and Optics, and (H) indicates an honors version of the course). "Sum", "Aut", and "Win" correspond to the Summer 2020, Autumn 2020, and Winter 2021 academic terms, respectively. The percentage of enrolled students who responded to our Learning Pod Survey is given for each course administration. In courses where the Learning Pod Intervention was implemented, students received extra credit for completing the survey in the last week of the quarter. In None III, students did not receive extra credit for completing the survey, and we suspect this is why we saw a lower response rate. The inclusion of multi-week group projects in honors labs [...]

TABLE IV. Learning Pod Survey items discussed in this paper, administered as 5-point Likert scale questions from -2 (strongly disagree) to +2 (strongly agree).

Category
Survey Item

Student-TA interaction
At least one TA in this class cares about how much I learn.
I have messaged at least one of the TAs in this class for assistance.

Self-efficacy
After I work through an activity in this course on my own, I am generally confident that I can explain the main ideas correctly.
After I work through an activity in this course with other students, I am generally confident that I can explain the main ideas correctly.

Student-to-student engagement
I have found students in this class with whom I am comfortable working.

Comparisons A, B and C compare the same course offered in different terms. Based on previous work [9] done at UW, we do not expect systematic variation between autumn and winter iterations in these introductory physics courses, where course policies are decided by a committee, so we believe that these comparisons provide a clean comparison of the effects of the intervention implementations. Sync+ I was offered during a summer term, which was not assessed in said previous work. However, only 8% of students were non-matriculated, so we believe it is comparable to the autumn and winter iterations.
In Comparison D, we compare the difference between Sync I (H) and GP II (H) to the difference between Sync I and Async II. We acknowledge there are differences in student expectations and population between physics I and physics II, in both the honors and calculus-based sequences, and discuss the validity of this comparison in Section V. However, considering only the population of students who responded to our survey in both physics I and physics II, the average responses were within a standard error (σx) for all survey items except one, self-efficacy from collaboration. In this item, there was a difference of 1.5 σx (relatively small compared to the other effects we describe). We believe Comparison D is clear enough that we can make some inferences.
Most of the comparisons we discuss were made using the Learning Pod Survey items in Table IV. These items probe student interaction with TAs, collaboration with other students, and self-efficacy. The survey was administered at the end of each term. Survey response rates by course are listed in Table II. We suspect that students who didn't respond are less engaged and would have been more likely to give negative answers to our survey items had they been included in the sample, so courses with low response rates may have lower actual scores than reported. We report the average responses (scale described in Table IV), the p-values of the shifts in the average values, effect size in terms of standard error, and the percentage of students who gave negative responses on the survey items. Springuel et al. [10] explore some limitations to this approach of Likert scale analysis. In refining the analysis we will explore alternative analysis methods suggested in this body of work.
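As a rough illustration of the quantities reported for each survey item, the sketch below computes an average response, its standard error, the percentage of negative responses, and the shift between two course offerings in units of standard error. It rests on several assumptions that the text does not confirm: responses are coded as integers from -2 to +2, the shift is divided by the combined standard error of the two means, and the p-value comes from a two-sided normal approximation. The function names and sample data are hypothetical.

```python
import math
from statistics import mean, stdev


def summarize(responses):
    """Return (average, standard error of the mean, percent negative)
    for Likert responses coded from -2 to +2."""
    n = len(responses)
    avg = mean(responses)
    sem = stdev(responses) / math.sqrt(n)  # sample standard deviation / sqrt(n)
    pct_negative = 100.0 * sum(1 for r in responses if r < 0) / n
    return avg, sem, pct_negative


def shift_in_standard_errors(baseline, treated):
    """Return (shift, p_value), where shift is the difference in averages
    divided by the combined standard error (an assumed convention)."""
    avg_b, sem_b, _ = summarize(baseline)
    avg_t, sem_t, _ = summarize(treated)
    combined_sem = math.sqrt(sem_b ** 2 + sem_t ** 2)
    shift = (avg_t - avg_b) / combined_sem
    # Two-sided p-value under a normal approximation (assumed, not from the paper).
    p_value = math.erfc(abs(shift) / math.sqrt(2))
    return shift, p_value


# Hypothetical responses from two offerings of the same course.
baseline = [-2, -1, 0, 1, 2]
treated = [0, 1, 1, 2, 2]
shift, p_value = shift_in_standard_errors(baseline, treated)
print(f"shift = {shift:.2f} standard errors, p = {p_value:.3f}")
```

With real class sizes, a t-based test or one of the alternative Likert analyses mentioned above may be more appropriate than the normal approximation used here.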
Most of the Learning Pod Survey items were drawn from prior research; one item was developed for this study. The two items probing student engagement with TAs are drawn from Wilson's work [2], validated with a student population analogous to ours (described in Section II). The two items probing student self-efficacy were inspired by Lindstrom's Physics Self Efficacy Questionnaire [11] as well as Baldwin's Biology Self-Efficacy Scale [12], modified to probe whether students felt they were learning through collaboration, and validated through interviews with introductory physics students taking calculus-based physics in a different context. The student-to-student engagement item was written for this study. None of these items have yet been validated with our students because of timing (the intervention was designed to solve a critical, urgent problem), so we emphasize that we consider results on these items preliminary.
We also used messaging activity on Slack as a metric for student engagement. We determined the number of weekly posts that represent unprompted student discussions and comments (as opposed to posts required for course credit, or posts between instructors). We don't assume that messaging on Slack captures all student interactions (Discord was also widely used among students), and we acknowledge that some students were likely much more active on Slack than others. We interpret relatively high average messaging activity as an indication that students were generally more engaged.
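The counting described above can be sketched as follows. This is only an illustration under stated assumptions: each post is represented as a plain record with hypothetical fields `day` (a date), `role` ("student" or "instructor"), and `prompted` (True when the post was required for course credit). How posts were actually exported and classified is not described in the text, and no Slack API is used here.

```python
from collections import Counter
from datetime import date


def weekly_unprompted_posts(posts, term_start):
    """Count unprompted student posts per week of term (week 1 begins on term_start)."""
    counts = Counter()
    for post in posts:
        # Skip instructor posts and posts required for course credit.
        if post["role"] != "student" or post["prompted"]:
            continue
        week = (post["day"] - term_start).days // 7 + 1
        counts[week] += 1
    return dict(counts)


def average_weekly_activity(posts, term_start, n_weeks):
    """Average number of unprompted student posts per week over n_weeks."""
    counts = weekly_unprompted_posts(posts, term_start)
    return sum(counts.values()) / n_weeks


# Hypothetical sample records.
term_start = date(2020, 9, 30)
posts = [
    {"day": date(2020, 9, 30), "role": "student", "prompted": False},
    {"day": date(2020, 10, 1), "role": "instructor", "prompted": False},
    {"day": date(2020, 10, 7), "role": "student", "prompted": False},
    {"day": date(2020, 10, 8), "role": "student", "prompted": True},
]
print(weekly_unprompted_posts(posts, term_start))
print(average_weekly_activity(posts, term_start, n_weeks=2))
```

An average of this kind hides how unevenly activity is spread across students, which is the caveat noted above.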

V. RESULTS
In this section, we present preliminary results comparing the impacts of different implementations of structures to support collaboration (Table I) using our TA interaction, self-efficacy, and student-to-student interaction survey items. We show indications that the asynchronous intervention improved student-to-student interactions and that the synchronous intervention had the additional benefit of lowering barriers for students reaching out to TAs. We also suggest that having a highly committed teaching team (as described below) that was cohesive in its efforts to mentor and encourage student collaboration resulted in additional increases in student-to-student interactions and self-efficacy from individual work. GP II (H) had multi-week lab projects designed to scaffold a scientific discovery process [5] in addition to synchronous and asynchronous support for collaboration. This implementation had all the gains of a synchronous intervention, and students reported stronger beliefs that a TA cares that they learn.

A. Asynchronous intervention increased and improved student collaboration
Comparison A is shown on the left of Fig. 1. We see improvement in self-efficacy from collaborative work: a positive shift in the average (equal to 3.7 σx), and a reduction of two thirds in the percentage of students with negative responses (shown in pink). We also see a smaller improvement in whether students reported that they found others they were comfortable working with: a positive shift in the average (2.2 σx), and a reduction by almost a third in the percentage of students with negative responses. We note that the variation in survey response rates likely reduced these effects.

B. Synchronous intervention increased student-TA interaction
In addition to the benefits seen in the asynchronous intervention, Comparison B (in the right column of Fig. 1) shows that the synchronous intervention increased student-TA interactions. The average response to how strongly students believed "At least one TA in this class cares about how much I learn" increased by 4.3 σx, and the percentage of students with negative responses was reduced by almost half. The average response to whether students had messaged a TA for assistance increased by 3.0 σx, and the percentage of students with negative responses was reduced by a third. We believe this indicates that TAs seemed more approachable to students, which may have helped some students who might otherwise have felt marginalized.
FIG. 1. [...] Table IV. Note: We did not ask TA engagement questions over the summer and do not have messaging activity data when Slack was not used.

C. Cohesive teaching team effort increased and improved student collaboration
Sync+ I was very similar to Sync I in the structures implemented to support collaboration, but we see that weekly messaging activity more than doubled in Comparison C (on the top right of Fig. 1). We also see improvements in the survey items describing self-efficacy from individual work and whether students found others they were comfortable working with.
The differences may be partially explained by the smaller class size, but we believe a significant reason was the teaching team's coherence in mentoring students on effective collaboration. Sync+ I was taught by 1 lecturer and 5 TAs, all of whom are experienced and have been recognized through awards for their teaching. In comparison, Sync I had 3 lecturers and 18 TAs, most of whom were new to the graduate program and to teaching. This made training the entire team to effectively mentor their students more challenging.

D. Adding group project-based labs to the asynchronous intervention showed gains
The transition from Physics I (Sync I (H)) to Physics II (GP II (H)) in the honors courses resulted in a large increase in average weekly messaging activity (on the left of Fig. 2). Shifts in averages were within σx for all presented survey items except whether the "TA cares I learn", which increased by 2.4 σx. This is a comparison across different courses, so we look at a similar comparison between Sync I and Async II for reference. Here we see little change in messaging activity, and negative shifts in responses to all survey items, all but one of them at least 3 σx. Self-efficacy from collaboration shifted right by 1.6 σx. Reduction in negative responses was less drastic than it was in Comparison B, which was also between an asynchronous and a synchronous intervention implementation. Although we recognize that conclusions are not as obvious in this comparison, it is reasonable to infer that including research-validated multi-week group projects recovered any losses in moving from a synchronous intervention to an asynchronous one, and improved on it in some areas.

VI. DISCUSSION
Our goal in introducing Learning Pods was to reduce the number of students who report low self-efficacy, high barriers to engaging with TAs, and social isolation. In this paper, we present preliminary results from comparisons between differing implementations of this intervention.
We found indication (from Comparison A in Fig. 1) that a completely asynchronous intervention, focused on setting up structures that make it easier to interact (like using Slack) and on giving asynchronous instruction on effective collaboration, yielded significant gains in student interactivity and self-efficacy from collaboration. TA efforts to facilitate effective collaboration (from Comparison B in Fig. 1) seemed to lower barriers for student-TA interactions and to further increase student interactivity and self-efficacy from collaboration. We saw the strongest positive effects when students were required to collaborate on substantial team deliverables (Comparison D in Fig. 2), and when instructors and TAs formed a coherent team with a shared objective of synchronously mentoring effective student collaboration (Comparison C in Fig. 1).
The benefits of preparing the instructional staff of our gateway physics courses for STEM majors to mentor students in effective collaboration speak to a need for specific learning objectives related to collaboration, and for targeted, research-validated training materials and methods that can be taken up by non-PER faculty and TAs. Instructor guidance in developing more expert-like collaboration skills is valuable in mitigating low self-efficacy and low engagement, which have been shown to plague students from underrepresented groups in physics and engineering. These groups are more likely to benefit from engagement with their peers, TAs, and instructors. Our preliminary results indicate that Learning Pods are one intervention that can help, and that they are particularly effective when there is a coherent effort to mentor all students in effective collaboration.