Examining consistency of student errors in vector operations using module analysis

It has been well documented that introductory physics students struggle with vector addition and subtraction. We use the results of a multiple-choice assessment on one-dimensional vector addition and subtraction, administered to students in a large-enrollment algebra-based physics sequence, to explore the consistency in the types of errors students make. The assessment was analyzed using Module Analysis for Multiple Choice Responses, a type of network analysis that highlights groups of responses commonly given together. We find evidence for five distinct modules that are consistent across multiple semesters. Of these groups, two support the method of “closing-the-loop”, two are consistent with students performing the wrong operation, and one suggests that students provide an answer independently of whether they were prompted to add or subtract two vectors, though only for vectors that are anti-aligned.


I. INTRODUCTION
Manipulating vectors and vector quantities correctly is critical to success in introductory physics courses, and extensive literature exists examining student difficulties with vector operations, specifically addition and subtraction [1-11]. Prior work has shown that students' vector abilities are largely unchanged even after a year of physics instruction [2], that students' solution methods are influenced by the relative position and orientation of vectors [6,7], and that students tend to stick with the same solution method across a variety of problems, even though a different method may be more appropriate [5].
Prior research has investigated the types of errors students make when adding vectors as arrows, and the frequency of those mistakes. Hawkins et al. hypothesized that the solution method students would use when solving vector addition questions would depend on the specifics of the problem: how the vectors were initially drawn (head-to-head, tip-to-tail, tail-to-tail, separated), the presence or absence of a grid on which the vectors were drawn, and the relative orientation of the vectors to a grid [7]. In interviews involving 8 students solving 10 separate questions, they found that 7 of the 8 students used the same solution method on all problems presented. In a follow-up study, they found that the initial arrangement of vectors (tip-to-tail or tail-to-tail) had a significant effect on students' solution method, with more students using a "head-to-tail" method when the vectors were presented in a tip-to-tail format and more students choosing a "bisector" method when the vectors were presented tail-to-tail. Barniol and Zavala later investigated the effect that problem context and vector positioning have on student solution methods for vector addition [6]. Problems 1-3 were presented in three different contexts (displacement, force, and no explicit context), and problems 4-6 were presented with different initial positioning (tail-to-tail, head-to-tail, and separated, all presented on the same grid). They found that students were more likely to use a "head-to-tail" solution method in the displacement context, but were more likely to draw vectors tail-to-tail in the force context. When vectors were presented to the students "head-to-tail", students were more likely to make a "closing the loop" error than in the other orientations. In both the tail-to-tail and "separated" orientations, students were more likely to make a tip-to-tip error, draw a general bisector, or make an error by subtracting only one of the components of the second vector [6].
We extend this work by investigating whether certain incorrect responses tend to be chosen together across questions, using Module Analysis for Multiple Choice Responses (MAMCR).
Brewe et al. were the first to propose MAMCR to analyze the relationships among students' incorrect answers on the Force Concept Inventory (FCI) [15]. Examining the types of incorrect answers that are closely tied together in a network can provide insight into the consistency of student reasoning across different questions. An example is the "impetus" module found by Brewe et al., which is dominated by answers students could arrive at following a line of reasoning where a contact force (either constant or diminishing) continues to act on an object after contact has stopped. Recent work has expanded MAMCR to work with larger FCI data sets [19], applied this modified MAMCR to the FMCE [17], and extended it to include students' correct answers on the FCI [18].

FIG. 1. [...] Each of the four possible responses comes from the various combinations of ± A and ± B. In this case, a) is [...]
In this work, we apply Brewe et al.'s original MAMCR to the "arrows-on-a-grid" questionnaire originally designed to compare students' ability to add and subtract vectors in an "arrows-on-a-grid" representation and an ijk representation [10]. By using MAMCR, we hope to determine whether students make a particular type of mistake across multiple iterations of the same type of vector addition and subtraction problems. While we can use this methodology to examine whether student errors are consistent across problems, it does not tell us explicitly what mistakes students are making. Coupling knowledge of which incorrect answers are commonly given together with explanations of how students arrive at those answers should guide us toward targeted instruction aimed at each potential mistake.

II. METHODS
A. Data collection
The data for this study were collected in introductory algebra-based courses (both first and second semester). The courses met for 150 minutes per week with an optional 2-hour lab component. The courses used a mixture of Peer Instruction and traditional lecture, and included weekly online assignments [14]. Graphical vector addition was covered explicitly in the first-semester course, and was used in the second-semester course. One of the authors of this study (JB) was the instructor for all sections where data were collected.
Students were administered an online, multiple-choice quiz at the end of the semester. The quiz was offered on an opt-in (2015) and an opt-out (2017 and 2018) basis, and consisted of 12 one-dimensional (1-D) and 16 two-dimensional (2-D) vector addition and subtraction problems. In this paper we examine only the 1-D problems; additional analysis of the 2-D responses is in progress, including the use of handwritten work on some of the questions. For the 1-D questions there were six combinations of vectors A and B, and students were prompted to first add them (questions "+1" through "+6") and then subtract them (questions "-1" through "-6"). Each problem had four possible choices, corresponding to +A+B, +A-B, -A-B, and -A+B. In all problems, vectors were presented on separate grids. The image accompanying question "+1" is presented in Fig. 1.
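As a rough illustration of how the four answer choices relate to one another, the sketch below computes the signed 1-D resultant for each combination of ± A and ± B. The signed lengths (3 and 2 grid units) and the function names are hypothetical and are not taken from the assessment; positive values are assumed to point right.

```python
def response_options(a, b):
    """Return the four answer choices as signed 1-D resultants, keyed by label."""
    return {
        "+A+B": a + b,
        "+A-B": a - b,
        "-A-B": -a - b,
        "-A+B": -a + b,
    }


def correct_label(operation):
    """Label of the correct choice for an 'add' or 'subtract' prompt."""
    return {"add": "+A+B", "subtract": "+A-B"}[operation]


# Hypothetical signed lengths: positive points right, negative points left.
aligned = response_options(3, 2)        # both vectors point right
anti_aligned = response_options(3, -2)  # A points right, B points left

for name, options in (("aligned", aligned), ("anti-aligned", anti_aligned)):
    print(name, options)
```

For any nonzero pair with different lengths the four resultants are distinct, so each selected choice maps to exactly one label.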

B. Analysis
We employed Brewe et al.'s MAMCR process to analyze the student responses. All analysis was implemented using the R programming language and the igraph networking package. A network of the students' responses was created, with the possible choices represented as nodes weighted by the number of students who selected that response. Edges connecting two nodes were weighted by the number of students who selected both responses. The "backbone" (or underlying structure) of the network was extracted using locally adaptive network sparsification (LANS) [20]. LANS calculates how critical each edge is to both connected nodes. If an edge is deemed insignificant at both nodes, the edge is cut from the network, leaving behind the backbone. For our purposes, we use a significance level of α = 0.05.
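The network construction described above can be sketched as follows. This is a minimal Python illustration rather than the R/igraph code used for the analysis; the function name, the input layout, and the toy data are assumptions made for the example. Only incorrect responses are kept, and node labels follow a "question:response" pattern.

```python
from collections import Counter
from itertools import combinations


def build_response_network(students, answer_key):
    """Build node and edge weights from incorrect responses only.

    students   : list of dicts mapping question id -> chosen response label
    answer_key : dict mapping question id -> correct response label
    """
    node_weight = Counter()  # number of students selecting each response
    edge_weight = Counter()  # number of students selecting both responses
    for answers in students:
        wrong = sorted(
            f"{q}:{choice}"
            for q, choice in answers.items()
            if choice != answer_key[q]
        )
        node_weight.update(wrong)
        edge_weight.update(combinations(wrong, 2))
    return node_weight, edge_weight


# Hypothetical toy data: three students, three questions.
answer_key = {"+1": "+A+B", "-1": "+A-B", "-5": "+A-B"}
students = [
    {"+1": "+A+B", "-1": "+A+B", "-5": "+A+B"},
    {"+1": "+A+B", "-1": "+A+B", "-5": "+A+B"},
    {"+1": "+A-B", "-1": "+A-B", "-5": "-A-B"},
]
nodes, edges = build_response_network(students, answer_key)
print(nodes)
print(edges)
```

Sorting each student's incorrect responses before pairing them keeps every undirected edge under a single key.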
As an analogy, consider using Google Maps to travel between city A and city B. The cities are the nodes of our network, and the roads connecting them are the edges. If there are multiple ways to travel between these two cities (an interstate, a state highway, and a county road), then the backbone from city A to city B would be the interstate, and LANS would remove the state highway and the county road. If city C is connected to city A by only a small county road, LANS leaves that as part of the backbone.
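A minimal sketch of the sparsification step is given below, assuming one common formulation of LANS: at each node, an edge is locally significant when one minus the empirical cumulative distribution of that node's edge weights, evaluated at the edge, falls below α. The function name and the toy network are hypothetical, and this is not the implementation used in the analysis.

```python
from collections import defaultdict


def lans_backbone(edge_weight, alpha=0.05):
    """Keep edges that are locally significant at one or both endpoints.

    edge_weight : dict mapping (u, v) -> positive weight (undirected)
    """
    neighbors = defaultdict(dict)
    for (u, v), w in edge_weight.items():
        neighbors[u][v] = w
        neighbors[v][u] = w

    def significant_at(node, other):
        weights = list(neighbors[node].values())
        w = neighbors[node][other]
        # Empirical CDF at this edge: share of the node's edges whose weight
        # is <= w. Dividing every weight by the node's total weight would
        # not change this ranking, so raw weights are compared directly.
        cdf = sum(x <= w for x in weights) / len(weights)
        return 1 - cdf < alpha

    return {
        (u, v): w
        for (u, v), w in edge_weight.items()
        if significant_at(u, v) or significant_at(v, u)
    }


# Hypothetical toy network: C hangs off A by a single light edge.
toy_edges = {("A", "B"): 10, ("A", "D"): 3, ("A", "C"): 1, ("B", "D"): 4}
backbone = lans_backbone(toy_edges)
print(sorted(backbone))
```

In this toy case the light A-C edge survives because it is the only edge at C, while the A-D edge is cut because it is not the strongest edge at either endpoint.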
The backbone network was run through igraph's built-in implementation of the InfoMap community detection algorithm [21]. From examining the nodes in each group (community), we aim to classify and describe the kinds of mistakes that students make on one-dimensional vector addition and subtraction problems.

III. RESULTS
The data collected from four courses are summarized in Tab. I and Fig. 2. Since MAMCR uses only the incorrect answers in the analysis, we list these for each section in Tab. I. Figure 2 shows the percent of students answering correctly for each question, and shows that vector subtraction was consistently more difficult for students than vector addition in all courses, even in the 1-D case. We note that the results for question "-2" are excluded from Fig. 2 as there was an error in the version of the question that was administered during the Spring 2015 semester. The remaining eleven questions comprise six addition and five subtraction problems, with each subtraction question having a corresponding addition version. Questions 1, 3, and 4 had vectors in an "aligned" orientation, with both vectors pointing to the right, while questions 2, 5, and 6 had vectors "anti-aligned", with A pointing to the right and B pointing left. Students in these courses perform better on vector addition than subtraction, and perform better on vector subtraction when the two vectors are parallel ("aligned") than when they are anti-parallel ("anti-aligned"), consistent with previous literature [9,10].
In order to categorize groups, we applied a naming scheme that would allow us to find consistent patterns in answer types. This naming scheme is detailed in Tab. II, with each response referred to by one method students use to combine ± A and ± B. As an example, in Fig. 1, option a) is +1:+A+B, option b) is +1:+A-B, option c) is +1:-A-B, and option d) is +1:-A+B. We emphasize that, without written work, we cannot be sure how students are arriving at a given answer. However, if a student arrives at an answer of -A+B on one problem and then uses the same process on another problem, they will again get an answer of -A+B.

FIG. 3. Heatmap generated from the backbone networks from each of the four courses. The color of each square indicates how many times the two nodes appeared in the same groups across the four courses. The naming scheme is further explained in Tab. II. Some nodes are not present in some of the networks, resulting in an inconsistent maximum occurrence count. The ordering of the nodes was determined by R's "heatmap" function, which computes a dendrogram from the co-occurrence matrix to determine how dissimilar two nodes are, using the Euclidean distance between two rows as a measure of dissimilarity. Nodes that are less dissimilar are closer together on the axes. There appear to be two very strong groupings, groups 2 and 3, as well as three more moderate groupings, groups 1, 4, and 5. It appears plausible that some of these groups are connected in larger modules, but additional data is required to determine this.

TABLE III. [...] Fig. 3), the bold responses are those that follow the dominant trend of the group.

The communities for each course were extracted from 1000 trials of InfoMap. We then create a heatmap from the co-occurrence matrix across all courses, shown in Fig. 3. In this representation we can see how often InfoMap determined that two responses belonged to the same community across all four courses. The ordering of the nodes in Fig. 3 was determined by R's "heatmap" function, which computes a dendrogram from the co-occurrence matrix to determine how dissimilar two nodes are, using the Euclidean distance between two rows as a measure of dissimilarity. Nodes that are less dissimilar are closer together on the axes. It should be noted that the center diagonal represents how often a response was present in a backbone network. Some nodes are not present in every backbone: students in a course either never chose that response or, in rare cases, that response was chosen only with correct answers and therefore was dropped from the backbone network.
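The co-occurrence counting and the row-wise dissimilarity can be sketched as follows. This is a small Python illustration under stated assumptions: one community partition per course is taken as given, the two toy partitions are hypothetical, and the dendrogram step that R's "heatmap" function performs on these distances is not reproduced.

```python
from itertools import combinations
from math import dist


def cooccurrence(partitions):
    """Count how often two nodes share a community across partitions.

    partitions : list (one entry per course) of lists of communities,
                 each community being a set of node labels
    Returns (labels, matrix). The diagonal counts the partitions in which
    a node appears at all.
    """
    labels = sorted({n for part in partitions for comm in part for n in comm})
    index = {n: i for i, n in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for part in partitions:
        for comm in part:
            for n in comm:
                matrix[index[n]][index[n]] += 1
            for a, b in combinations(sorted(comm), 2):
                matrix[index[a]][index[b]] += 1
                matrix[index[b]][index[a]] += 1
    return labels, matrix


# Hypothetical community assignments for two courses.
course_1 = [{"-1:+A+B", "-5:+A+B"}, {"+5:-A-B", "-5:-A-B"}]
course_2 = [{"-1:+A+B", "-5:+A+B", "+5:-A-B"}, {"-5:-A-B"}]
labels, matrix = cooccurrence([course_1, course_2])

# Euclidean distance between rows, the dissimilarity named in the text.
i, j = labels.index("-1:+A+B"), labels.index("-5:+A+B")
k = labels.index("+5:-A-B")
print(dist(matrix[i], matrix[j]), dist(matrix[i], matrix[k]))
```

Two responses that always land in the same community have identical rows and therefore zero distance, which is why they end up adjacent on the heatmap axes.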
The heatmap in Fig. 3 shows two strong groups, brighter white at middle (group 3), and just below and to the left of middle (group 2), and three slightly weaker groups, two at top right (groups 4 and 5) and one at bottom left (group 1). The membership of each group is listed in Tab. III, where the groups are numbered from bottom-left to top-right. Some of these groups are possible subgroups of larger modules, but we would need more data sets in order to determine this.

IV. DISCUSSION
Based on the literature, we attempted to classify these resulting modules into possible error types. Each group indicated in Fig. 3 is addressed individually, from bottom-left to top-right, to present possible arguments or misconceptions for the bold responses in Tab. III.
Group 1 - The dominant mistake in this module, present in four of the seven nodes, is -Q:-A+B. This mistake is consistent with two plausible solution methods. The first method is for students to perform subtraction backwards, reversing the direction of A instead of B. Alternatively, students could be setting up the problem correctly, aligning A and -B tip to tail, but then closing the loop and going from the tip of -B to the tail of A, rather than from the tail of A to the tip of -B.
Group 2 - There appears to be a subset of students who will consistently do addition instead of subtraction. Group 2 contains all of the responses that follow this pattern (recall that we had to drop question "-2"). It is interesting that in each class section the largest nodes are -1:+A+B, -5:+A+B, and -6:+A+B. Since questions were presented sequentially, "-1" was the first combination of vectors that students were asked to subtract. It is possible that some of the weight of the -1:+A+B node simply came from the students being on autopilot; to examine this further, we plan to shuffle the order of the questions in future iterations of the quiz. The prevalence of -5:+A+B and -6:+A+B is interesting because these are the two subtraction questions with anti-aligned vectors.
Group 3 - This module includes the same answer, -A-B, for both the addition and subtraction versions of problems 5 and 6. These questions have anti-aligned vectors, and the fact that these nodes often end up in the same group suggests that students are making the same mistake for both addition and subtraction. This could stem from students correctly adding -A and -B, or closing the loop when combining A with B.
Group 4 - Intriguingly, there seems to be a subset of students who gave the answer consistent with doing subtraction when prompted to do addition. It is interesting to note that the three nodes that seem to have the strongest connections within this module are +2:+A-B, +5:+A-B, and +6:+A-B, which are the three pairs of anti-aligned vectors. If students were to reverse the direction of B and then add it to A, they would arrive at this result. This suggests that students are conflating vectors pointing to the right with a positive vector, and therefore addition.
Group 5 - In this last module, there are three responses that follow the -Q:-A-B pattern. Interestingly, these three responses are the three subtraction problems with aligned vectors, while the other two -Q:-A-B responses (from anti-aligned vectors) were in Group 3. This suggests that different students make this same mistake depending on the alignment of the vectors. There are again two plausible errors that we consider. First, these answers are consistent with adding -A to -B. Students may be inclined to do this if they associate vectors pointing to the left with negative, and therefore with subtraction as well. The other possible mistake is that students may have chosen the negative of the result from vector addition, again consistent with closing the loop.
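The candidate procedures discussed for the five groups can be summarized by tracking only the signs applied to A and B. The sketch below is an illustrative bookkeeping aid: the procedure descriptions are shorthand for the possibilities named above, the function names are hypothetical, and, as emphasized earlier, a response label alone cannot distinguish between procedures that lead to the same label.

```python
def label(coeffs):
    """Format a pair of signs (sA, sB) as a response label such as '+A-B'."""
    s_a, s_b = coeffs
    return f"{'+' if s_a > 0 else '-'}A{'+' if s_b > 0 else '-'}B"


def negate(coeffs):
    """Closing the loop reverses the resultant, which flips both signs."""
    return (-coeffs[0], -coeffs[1])


ADD = (1, 1)        # correct addition, A + B
SUBTRACT = (1, -1)  # correct subtraction, A - B

# Hypothesized procedures, keyed by (prompt given, procedure followed).
procedures = {
    ("subtract", "reverse A instead of B"): (-1, 1),
    ("subtract", "set up A - B, then close the loop"): negate(SUBTRACT),
    ("subtract", "add instead of subtracting"): ADD,
    ("add", "subtract instead of adding"): SUBTRACT,
    ("add", "set up A + B, then close the loop"): negate(ADD),
    ("subtract", "set up A + B, then close the loop"): negate(ADD),
}

for (prompt, procedure), coeffs in procedures.items():
    print(f"{prompt:8s} | {procedure:36s} | {label(coeffs)}")
```

The first two subtraction rows both produce -A+B, mirroring the two explanations offered for Group 1, and the last two rows both produce -A-B, mirroring the pattern seen in Groups 3 and 5.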

V. FUTURE WORK
Moving forward, collecting handwritten work on the same set of questions would validate or refute our interpretations of each group. We are currently analyzing the results of the two-dimensional questions, not discussed here, for which we have both multiple-choice responses and handwritten work. A modified version of the 1-D questions could also be useful, with some questions having vector A pointing to the left. This would aid in our interpretation of Groups 4 and 5.

VI. CONCLUSIONS
The results of this work support earlier findings that students are consistent in how they approach vector addition and subtraction in an "arrows-on-a-grid" format. We find evidence for five broad groupings of mistakes that students consistently make across multiple problems: performing subtraction in the wrong order, performing addition instead of subtraction, closing the loop regardless of whether they are asked to perform addition or subtraction for anti-aligned vectors, performing subtraction instead of addition for anti-aligned vectors, and closing the loop for aligned vectors. We again emphasize that we cannot be certain without written work that students used these specific methods, but the consistency of their choices and the groupings found strongly suggest this to be the case.