Understanding centrality: Investigating student outcomes within a classroom social network

Collaborative learning environments in undergraduate introductory physics courses, such as those promoted by Modeling Instruction (MI), influence both student performance and student social interactions. Because collaborative learning is inherently a social activity, we applied network analysis methods to examine student social interactions within the classroom using a survey administered periodically in class. We then calculated centrality, a family of measures that quantify how connected or "central" a particular student is within the classroom social network. To understand what centrality means in this context, we investigated the relationships among centrality, student demographics, and student outcomes in a large-scale MI classroom with 70 students and 6 instructors. We addressed two research questions: "Is centrality predicted by sex, ethnicity, incoming GPA, or Force and Motion Conceptual Evaluation (FMCE) pre-score?" and "Does centrality predict FMCE gain or final grade in course?" A series of linear regressions showed that centrality can be predicted by sex and incoming GPA, and is a predictor of FMCE gain.
PACS: 01.40.Di, 01.40.Fk, 01.40.gb, 01.40.Ha


I. INTRODUCTION
Active learning has been shown to be advantageous over traditional lecture [1]. Much attention has been focused on the importance of collaborative learning environments, where students interact and work together. One such environment is the undergraduate introductory physics Modeling Instruction (MI) course at Florida International University (FIU), where students exhibit better performance [2] and attitude [3] outcomes than their lecture counterparts. However, the mechanisms by which student outcomes benefit from this collaborative environment are not clearly known [4]. This paper investigates these crucial student interactions using social network analysis.
Centrality is a concept that arises from network science, graph theory, and sociology, where it is rigorously defined [5]. In a classroom setting, centrality may be understood as how connected a given student is to other students in the class. To better understand what centrality means in an educational context, we build on the work of Bruun and Brewe [6] by incorporating centrality measures into conventional statistical methods, namely general linear regression. In this study, we treat centrality from two perspectives: as an outcome that may be predicted by initial-state variables such as incoming GPA, and as a predictor of final-state variables such as final grade in course.

II. BACKGROUND
Data were collected in the Fall 2014 semester of undergraduate introductory physics at FIU, in a large-scale MI classroom. FIU is a Hispanic-Serving Institution (HSI) located in Miami, FL, where 61% of the student body identifies as Hispanic [7]. There were 70 students enrolled in the course, seated at tables of 6 students each. There were two instructors, two teaching assistants, and three Learning Assistants.
One way of understanding the MI students' superior outcomes in learning gains [2], odds of success [2], and attitudes [3] is through Nora's student engagement model [8]. Nora explicitly connects involvement in learning communities and peer group interactions (among other variables) to students' academic performance and cognitive gains (both perceived and actual). The full impact of these factors on student engagement is part of a larger project beyond the scope of this paper, but the inherently social and collaborative nature of MI draws immediate attention to interactions among students. These interactions may be quantitatively understood with a network analysis methodology, where students are represented as nodes and interactions between them are represented as edges (connections between nodes). As our theoretical framework, we combine network analysis methods with Nora's student engagement model.

III. METHODS
Our data naturally divided into two types, depending on their scale: (i) student data, which describe individual students (such as demographics and performance), and (ii) network data, which describe the classroom social network as a whole through reported interactions. These two types of data were managed independently at first, then combined for statistical analysis. All data analysis was done in the statistical programming language R [9].

A. Student data
Student demographic information, including sex, ethnicity, and incoming GPA, was downloaded from the FIU online course record; final grade in course was provided in raw numerical format by the course instructor. Using the plyr package in R [12], student ethnicity was collapsed from multiple categories to a two-state representational status: White and Asian students were coded as majority, while Black and Hispanic students were coded as statistically underrepresented (UR).
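The two-state recoding described above can be illustrated with a minimal sketch. It is written in Python rather than the R/plyr code used in the study, and the category labels, the `collapse_ethnicity` helper, and the sample records are all hypothetical; the actual labels in the course record are not given in the text.

```python
# Hypothetical category labels; the course record's actual labels are not given.
REPRESENTATION = {
    "White": "majority",
    "Asian": "majority",
    "Black": "UR",
    "Hispanic": "UR",
}


def collapse_ethnicity(label):
    """Map a recorded ethnicity to 'majority' or 'UR'; unknown values become None (NA)."""
    return REPRESENTATION.get(label)


# Made-up records for illustration only; not the study's data.
students = [
    {"id": 1, "ethnicity": "Hispanic"},
    {"id": 2, "ethnicity": "Asian"},
    {"id": 3, "ethnicity": None},  # e.g. a person who could not be identified
]
for student in students:
    student["status"] = collapse_ethnicity(student["ethnicity"])
```

Returning `None` for an unrecognized label mirrors the NA coding that the text applies to people without demographic information.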

B. Network data
Network data were collected from a pencil-and-paper survey administered in class five times throughout the semester (every 3 weeks). The survey asked students to list with whom they had a "meaningful interaction" in class that day.
The question was open-ended with unlined whitespace so as to neither constrain nor inflate student responses via a set number of lines. Due to the survey's open-ended format, students were free to name instructors, teaching assistants, and learning assistants; some students also listed people who could not be identified on the course roster. All of these individuals were assigned custom identification numbers and included in the network, because the centrality of each individual student depends on every other person in the network (N_total = 84). Demographic information for these unidentifiable nodes was unavailable, and was coded as NA.
The response rate for data collections 1-4 was greater than 75% for students enrolled in the course. The response rate for data collection 5 was only 43%, so it was excluded from the analysis.

C. Network analysis
Applying our network analysis framework, each network survey collection was converted to a directional edge list showing the "source" and "target" of each reported interaction. If student A reports an interaction with student B, then a directional edge is drawn from A to B; if B also reports an interaction with A, then a second directional edge is drawn from B to A. To aggregate the four data collections, the four edge lists were combined into a master edge list. Edges that appeared in more than one collection were combined to express only whether an interaction occurred: an edge that appeared 1 time was treated the same as an edge that appeared (maximally) 4 times. From this master edge list, a directed network graph was created in R using the igraph package [13]. We may characterize the number of interactions using histograms of the degree distribution (see Fig. 1).
Four centrality measures were then computed: degree, outdegree, indegree, and PageRank. A given node's degree is simply the number of edges that it is involved in (either as the "source" or the "target"). Outdegree is the number of interactions that a given node is the source of (reported by a given student on their own survey). Indegree is the number of interactions that a given node is the target of (reported on all other surveys). PageRank is a more sophisticated "second order" measure that depends on a node's indegree as well as the indegree of the nearest neighbors it is connected to [5,14]. Once the centrality scores were computed for each node in the network, they were incorporated with the student data for statistical analysis.
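The aggregation step and the four centrality measures can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the igraph/R code used in the study: the function names and the toy edge lists are hypothetical, PageRank is computed by simple power iteration, and the damping factor of 0.85 is an assumed value that the text does not report.

```python
from collections import defaultdict


def aggregate(edge_lists):
    """Union of directed (source, target) edges; repeated edges count once."""
    master = set()
    for edges in edge_lists:
        master.update(edges)
    return master


def degree_measures(edges):
    """Return (outdegree, indegree, degree) dictionaries keyed by node."""
    out_deg, in_deg = defaultdict(int), defaultdict(int)
    for source, target in edges:
        out_deg[source] += 1
        in_deg[target] += 1
    nodes = set(out_deg) | set(in_deg)
    # Degree counts every edge a node is involved in, as source or target.
    degree = {n: out_deg[n] + in_deg[n] for n in nodes}
    return dict(out_deg), dict(in_deg), degree


def pagerank(edges, damping=0.85, iterations=100):
    """PageRank by power iteration; damping=0.85 is an assumed value."""
    nodes = sorted({n for edge in edges for n in edge})
    targets = defaultdict(list)
    for source, target in edges:
        targets[source].append(target)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Nodes with no outgoing edges spread their rank evenly over all nodes.
        dangling = sum(rank[n] for n in nodes if n not in targets)
        new_rank = {n: (1 - damping + damping * dangling) / len(nodes) for n in nodes}
        for source, outs in targets.items():
            share = damping * rank[source] / len(outs)
            for target in outs:
                new_rank[target] += share
        rank = new_rank
    return rank


# Two made-up survey collections; the edge A -> B is reported in both.
collections = [
    [("A", "B"), ("B", "A"), ("C", "A")],
    [("A", "B"), ("D", "A")],
]
master = aggregate(collections)
out_deg, in_deg, degree = degree_measures(master)
pr = pagerank(master)
```

In this toy network, node A is named by three others and so has the highest indegree and PageRank. Node B is named only by A, yet it ranks above C and D because the one node naming it is itself highly ranked, which is the "second order" behavior described above.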

IV. RESULTS AND DISCUSSION
General linear regression was used to test (i) whether centrality is predicted by "initial" factors, i.e., sex, ethnicity, incoming GPA, and FMCE pre-score, and (ii) whether centrality predicts "final" factors, i.e., FMCE gain and final grade in course. Note that when we use the term "centrality" we refer to the measures degree, outdegree, indegree, and PageRank; in cases where these measures exhibit distinct behavior, we refer to them individually.

A. Centrality as predicted by initial factors
Both degree and outdegree were predicted by sex at a significance level of α = 0.05 or better, as shown in Table I. We interpret this to mean that male students report fewer interactions as meaningful than female students do; whether male students are actually involved in fewer meaningful interactions, or simply report fewer of them, is unclear. However, we found no significant dependence of indegree or PageRank on sex.
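A one-factor model with a two-level predictor can be illustrated with a minimal Python sketch. The data below are made up for illustration and are not the study's data, the `simple_ols` helper is hypothetical, and no significance test is included. With male coded as 0 (the base level), the intercept is the male group mean and the slope is the female mean minus the male mean.

```python
def simple_ols(x, y):
    """Least-squares intercept and slope for the model y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up outdegree values for illustration only; not the study's data.
sex = ["M", "M", "M", "F", "F", "F"]
outdegree = [4, 5, 6, 7, 8, 9]

# Dummy coding: male = 0 is the base level, female = 1.
is_female = [1 if s == "F" else 0 for s in sex]
intercept, slope = simple_ols(is_female, outdegree)
```

Under this coding, a positive slope corresponds to female students reporting more interactions than male students on average.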
None of the centrality measures were predicted by ethnic representational status. This null result indicates that UR students are reporting interaction patterns similar to majority students, which provides evidence of an inclusive classroom environment. On the other hand, it may also be an artifact of FIU's HSI context: 53 out of 70 students (76%) were Hispanic.
Nor was any centrality measure predicted by FMCE pre-score. This suggests that student interactions during the course do not depend on incoming physics content knowledge, as measured by this metric, so students with and without prior physics knowledge interact with each other on even footing.

TABLE I. Summary of linear regressions for one-factor models of centrality as predicted by sex (using male as the base model).

Finally, all four centrality measures were significantly predicted by GPA (see Table II). We interpret this in two ways, which are not mutually exclusive. It may be that students with high incoming GPA have a greater number of meaningful interactions than other students. It may also be that high-GPA students are recognized as "high achievers" and thus are sought out by other students as resources to learn from.

B. Centrality as predictor of final factors
PageRank centrality was found to be significantly predictive of raw FMCE gain (Estimate = 1643.95, p < 0.05). This means that regardless of a student's FMCE pre-score, their learning gains are significantly predicted by their classroom interactions (as measured by PageRank). Such a result indicates that more interactions in class are associated with higher conceptual learning gains.
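The size of this estimate is easier to read once the scale of PageRank is taken into account. The short Python sketch below assumes that PageRank scores are normalized to sum to 1 over all nodes, which the text does not state explicitly; the 10% shift is an arbitrary illustrative choice, and the units of raw gain are whatever units the regression used.

```python
estimate = 1643.95  # slope reported in the text: raw FMCE gain per unit of PageRank
n_nodes = 84        # network size reported in the text (N_total)

# Assumption: PageRank scores sum to 1 over all nodes, so the mean score is 1/N.
mean_pagerank = 1 / n_nodes

# Predicted change in raw gain for a PageRank shift equal to 10% of the mean score.
delta_pagerank = 0.1 * mean_pagerank
delta_gain = estimate * delta_pagerank
```

Under that assumption a typical PageRank score is on the order of 0.01, so a slope in the thousands corresponds to a change of about 2 units of raw gain for this shift.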
We also found that each of the four centrality measures predicted final grade at α = 0.01 or better. This indicates that course performance is associated with interactions in the classroom social network. However, having found earlier that centrality depends on incoming GPA, we additionally tested whether course grade still depends on centrality when controlling for incoming GPA. In this case, we found that centrality was no longer a significant predictor of final grade in course. This means that previously high-achieving students continue to achieve highly, while also interacting more in the course than lower-achieving students. Such a result may suggest that high-achieving students recognize the social nature of the MI course and associate interpersonal interactions with success in the course. It may also imply that high-achieving students are more likely to perceive classroom interactions as meaningful. In any case, we interpret this to mean that centrality plays a mediating role, though not a strictly predictive one.
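The effect of controlling for incoming GPA can be illustrated with a small Python sketch. The data are made up and deliberately constructed so that grade depends only on GPA while centrality merely correlates with GPA; the `ols` helper is hypothetical, solves the normal equations directly, and omits significance tests. It is not the R code used in the study.

```python
def ols(columns, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented system (X'X | X'y).
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(k):
            if r != i:
                factor = a[r][i] / a[i][i]
                a[r] = [vr - factor * vi for vr, vi in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Made-up data for illustration only; not the study's data.
gpa = [2.0, 2.5, 3.0, 3.5, 4.0, 3.0]
centrality = [3, 6, 5, 9, 8, 7]          # correlated with GPA, but not a function of it
grade = [50 + 10 * g for g in gpa]       # grade constructed to depend on GPA alone

alone = ols([centrality], grade)             # one-factor model: grade ~ centrality
controlled = ols([centrality, gpa], grade)   # grade ~ centrality + GPA
```

In this constructed example, the centrality coefficient is positive in the one-factor model but essentially zero once GPA enters the model, which is the pattern described above.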

V. CONCLUSIONS
In this paper, we investigated the meaning of centrality in a large-scale Modeling Instruction physics classroom. We considered centrality both as a predicted quantity (predicted by initial factors, i.e., sex, ethnicity, incoming GPA, and FMCE pre-score) and as a predictive quantity (a predictor of final-state variables, i.e., FMCE gain and final grade in course).
From the first perspective, we found that degree and outdegree depended on student sex in favor of females, while indegree and PageRank did not. This is to say that female students reported more interactions than male students, as measured by degree and outdegree. We also found that none of the centrality measures depended on student ethnicity/representational status, which suggests an inclusive classroom environment in which UR students interact just as much as majority students. In-class interactions were also unaffected by prior physics content knowledge, as FMCE pre-score had no predictive effect on centrality. Finally, all four measures of centrality were predicted by incoming GPA with high significance, indicating that high incoming GPA is associated with more interactions in the classroom.
From the second perspective, we found that PageRank centrality is significantly predictive of raw FMCE gain; regardless of a student's incoming physics content knowledge, more interactions in class are associated with higher learning gains.
We also found that all four measures are significantly predictive of final grade in course when each is considered on its own, but not when incoming GPA is controlled for. In conjunction with centrality's dependence on incoming GPA, this implies that students who achieved highly before the course also (not surprisingly) achieved highly within the course, while interacting more in the classroom. This is indicative of some mediating, though not predictive, role that in-class interactions play in a student's final grade in course.
This paper finds that centrality was predicted by student sex and incoming GPA, but not ethnicity. It also shows that centrality predicted raw FMCE gain, and is associated with final grade in course. Although centrality is neither exclusively predicted by pre-course factors nor exclusively predictive of post-course outcomes, we believe centrality plays a mediating role in which the pre- to post-course shifts are influenced and modified by social interactions that take place within the classroom. Thus we demonstrate potential for network centrality measures to be considered as predictive and descriptive factors of student performance. In doing so, we build on previous work and take the next step toward incorporating network analysis with student engagement theory.

FIG. 1. Histograms showing the degree distributions of (a) all students, (b) female students, and (c) male students.

TABLE II. Summary of linear regressions for one-factor models of centrality as predicted by GPA.