Using mobile eye tracking to capture joint visual attention in collaborative experimentation



I. INTRODUCTION
Laboratory learning experiments at school or university play a key role in science education (e.g., [1][2][3]). However, although experiments are accepted as an integral part of the learning process, previous studies have shown that their potential for learning is insufficiently exploited [4][5][6][7][8][9]. Since student experiments are often conducted in groups of two or three students, one possible reason for the low learning success could be that the students do not collaborate sufficiently with each other. In experiments, information is not available a priori but must first be identified or generated in the experimental learning environment (e.g., by reading off measured values). Coordination of visual attention should therefore be essential for successful collaboration, as has already been demonstrated empirically for other learning activities, especially problem solving (e.g., [10][11][12][13]).
Studying the processes associated with joint attention, i.e., with the simultaneous focusing of both participants' attention on the same aspect or object by means of different senses, is an important contribution to research on collaborative activities. To investigate such joint attention processes in the selection and extraction of visual information during collaborative experimentation, wearable eye-tracking devices are suitable as a process-based research methodology. Special glasses (fig. 1) make it possible to capture the experimenter's gaze data without restricting freedom of movement, so that experimental behavior remains unaffected by data collection.
In other contexts, mobile eye-tracking has already been successfully used to assess the quality of collaboration in joint problem-solving. For example, Schneider et al. used two mobile eye-trackers to capture levels of Joint Visual Attention of apprentices in logistics in co-located problem-solving settings and found that "joint attention can act as a proxy for students' quality of interaction" [10, p. 253].
Based on this approach, it is possible to investigate the interaction of the students to gain new insights into cooperative learning difficulties and their relation to learning success when conducting student experiments, which to our knowledge is still largely unexplored.

II. STATE OF RESEARCH
Tomasello [14] states that Joint Attention (JA) is the tendency of social partners to focus on a common reference and, furthermore, to monitor each other's attention to an external entity such as an object, a person or an event in various ways. The mere fact that two people concentrate on one element of the environment at the same time therefore does not in itself constitute joint attention. With the ability of JA, groups are able to form a common ground, take the perspective of their group members, build on their ideas, show empathy or solve a problem in cooperation [15].
A subconstruct of JA is Joint Visual Attention (JVA), which occurs when two (or more) individuals look at the same spot at the same time. There is ample work showing that JVA is a central mechanism by which group members coordinate their actions and establish a common ground [e.g., 16,17]. JVA has therefore been studied extensively by social and developmental psychologists and has been shown to be critical to many social interactions. As a result of this research, there is now agreement in principle that JVA plays a central role in supporting students' conceptual convergence [18]. In this context, JVA can act as a proxy for the quality of students' collaboration, since without the ability to establish JVA, groups are unlikely to establish a common ground, take the perspective of their peers, build on their ideas, express empathy or solve a problem together [10, p. 243]. In particular, it has been demonstrated that productive groups exhibit higher levels of JVA than less productive groups [11,19,20]. Against this background, Schneider et al. [10] even regard JVA as one of the central constructs for any scientist interested in collaborative learning. According to Barron [21], JVA is a prerequisite for virtually all social interactions. That study showed that a group of students who did not achieve JVA more often ignored correct proposals and did not perform as well as comparable groups.
Technological advances have enabled researchers to use (mobile) eye-trackers to capture JVA in collaborative situations. For example, Richardson and Dale [19] studied speaker-listener dyads and found that the higher the proportion of time during which the speaker's and listener's eye movements were coupled, the better the listener performed on a subsequent comprehension test. Similarly, Jermann et al. [22] used two eye-trackers to capture gaze data and to separate productive from less productive collaborative learning groups based on their common gaze patterns.

III. RESEARCH QUESTIONS
Based on the preliminary work presented, the goal of the study reported in this paper is to investigate the influence of JVA on learning outcomes in a collaborative experimental setting. Accordingly, the main research questions of the study are as follows.

RQ1: Is there a relationship between JVA and experimenters' learning success in collaborative experimentation?

Experimentation usually involves going through different phases, from setting up the experiment to evaluating the measurement results. Another research question of the study relates to the influence of JVA in these different experimentation phases.

RQ2: In what phases of experimentation does JVA have an impact on learning success?

IV. METHODS

A. Sample
A total of 40 university students (21 female, 19 male) with a mean age of 22.3 years (SD = 3.2) participated in the study. All participants were studying health sciences at a technical university in Germany: 28 first-year, 1 second-year, 7 third-year, 1 fourth-year and 2 fifth-year students (one missing data).

B. Study
The study was conducted as part of the two-week obligatory laboratory course on physical principles of physiological processes. Participants experimented in dyads, to which they had been randomly assigned beforehand. The purpose of the experiment was to investigate the physiological features of the visual process using an optical bench (a rail for rapid assembly of optical elements such as lenses, sensors and light sources) with optical components designed to imitate parts of the human eye. Throughout the experimental activity, the students wore mobile eye-trackers (Tobii Pro Glasses 2) with a sampling rate of 50 Hz. Immediately before and after the experiment, a performance test was conducted to determine the learning gain from the experiment.

Experiment setup
Students interacted with an optical bench setup called the functional model of the human eye (fig. 2). This model is designed to simulate the optical functions of the eye, such as the imaging of an object on the retina. It can also be used to simulate the accommodation of the eye (change in lens curvature), short-sightedness and long-sightedness. The model includes a half eye cup with an adjustable iris aperture, a lens holder, two convex lenses (f = 65 mm and 80 mm), a half eye cup with a retina (transparent screen), one concave and one convex correction lens, a second lens holder and the experimental instructions.
FIG. 2. The experimental setup: optical bench (slider with ruler markings and holders for optical parts) with the following optical elements: one half of the eye with a frosted glass pane (as a retina) (1), one half of the eye with a pupil attachment (2) and the adjustable iris aperture as the anterior segment of the eye (3), as well as a candle as the light source (4).

Conducting the experiment
Initially, the participants put on the eye-tracking glasses, which were then calibrated. All participants received the same instructions on how to handle the lenses beforehand. Before the experiment started, the experimental materials were covered, which allowed both participants to start experimenting simultaneously as soon as the study leader removed the cover. In order not to influence the collaboration between the experimenters in any way, the study leader then left the room for the duration of the experiment. The students received instructions on how to carry out and evaluate the experiment. The average duration of the experiment was 24.4 minutes (SD = 4.1).
The experimentation process comprised three phases:
• Phase 1: Setting up the experiment and adjusting the distance between the light source and the pupil so that an image is clearly visible on the simulated retina.
• Phase 2: Installing a second lens and adjusting the distance between the light source and the pupil.
• Phase 3: Simulating a short-sighted eye with the model and looking for solutions to optimise the beam path by using a diverging lens as glasses.
C. Data collection and analysis

Learning gain
Learning gain was measured with a pre-post performance test consisting of six items in single-choice format and was calculated by subtracting the pre-test score from the post-test score. The items were constructed based on known student conceptions from the literature [23], and content validity was confirmed by a survey of experienced teachers and experts in the field of geometrical optics. Three of the items had already been used in a larger-scale study with N = 256 students (Cronbach's alpha = .80); the other three items were added to match the experiment conducted. One of the test questions, for example, was the following: "A luminous object is imaged sharply onto a screen using a converging lens. Then the screen is moved away from the lens. What happens to the image on the screen?" The response options were as follows: "A: The image on the screen becomes smaller", "B: The image on the screen remains the same size and becomes unfocused", "C: The image on the screen becomes larger and unfocused", "D: The image on the screen remains exactly the same, except that the screen is now further away".
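The scoring rule described above can be illustrated with a minimal sketch. The answer key and the responses below are purely hypothetical and are not taken from the actual test; the function names `score_test` and `learning_gain` are likewise introduced only for illustration.

```python
from typing import Sequence


def score_test(answers: Sequence[str], key: Sequence[str]) -> int:
    """Count the single-choice items that were answered correctly."""
    if len(answers) != len(key):
        raise ValueError("answers and key must have the same length")
    return sum(a == k for a, k in zip(answers, key))


def learning_gain(pre: Sequence[str], post: Sequence[str], key: Sequence[str]) -> int:
    """Learning gain = post-test score minus pre-test score."""
    return score_test(post, key) - score_test(pre, key)


# Hypothetical answer key and responses for a six-item test (not the real items).
key = ["C", "A", "B", "D", "A", "C"]
pre = ["A", "A", "B", "B", "C", "C"]   # 3 items correct
post = ["C", "A", "B", "D", "C", "C"]  # 5 items correct

gain = learning_gain(pre, post, key)
print(gain)  # 2
```

With six items, the gain defined in this way can range from -6 to +6.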

Gaze data

a. Synchronization
To determine the JVA from the gaze data of the experimental partners, the gaze data were manually synchronized in post-processing. Visual and acoustic triggers, such as revealing the experiment setup at the beginning of the experiment, were used to synchronize the gaze data on the one hand and to divide the entire experiment into individual experiment phases on the other by placing time markers in the recordings. The phases of the experiment were the same for both participants, as they worked together, and were determined by the need to first assemble the model and then to work sequentially with the three worksheets.
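The synchronization in this study was carried out manually. Purely as an illustration of the underlying idea, the following sketch shifts the timestamps of two recordings so that a shared trigger event (e.g., the removal of the cover) defines t = 0 and then assigns aligned timestamps to phases via time markers. All timestamps, marker positions and function names are hypothetical.

```python
from bisect import bisect_right
from typing import List, Sequence


def align_to_trigger(timestamps_ms: Sequence[int], trigger_ms: int) -> List[int]:
    """Shift timestamps so that the shared trigger event occurs at t = 0."""
    return [t - trigger_ms for t in timestamps_ms]


def phase_of(t_ms: int, phase_starts_ms: Sequence[int]) -> int:
    """Return the 1-based phase index of an aligned timestamp.

    phase_starts_ms must be sorted in ascending order; 0 is returned for
    timestamps that lie before the first phase marker.
    """
    return bisect_right(phase_starts_ms, t_ms)


# Hypothetical recordings: each device has its own clock, 50 Hz -> 20 ms steps.
recording_a = [1000, 1020, 1040]
recording_b = [5400, 5420, 5440]

# The trigger event was identified at 1000 ms (A) and 5400 ms (B).
aligned_a = align_to_trigger(recording_a, 1000)
aligned_b = align_to_trigger(recording_b, 5400)

# Hypothetical phase markers on the common time axis (in ms).
phase_starts = [0, 30]
phases_a = [phase_of(t, phase_starts) for t in aligned_a]
print(aligned_a, aligned_b, phases_a)  # [0, 20, 40] [0, 20, 40] [1, 1, 2]
```

After alignment, samples with the same timestamp in both recordings refer to the same moment of the experiment, which is the precondition for comparing the partners' gaze.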
b. Mapping
In a first step, the raw gaze data from each participant were passed through an I-VT (Identification by Velocity Threshold) gaze filter included in the Tobii Pro Lab software to detect the eye movement types "fixations" and "saccades". Fixations are those times when the eyes essentially stop scanning the scene and hold the central foveal vision in place so that the visual system can take in detailed information about what is being looked at. Saccades are rapid, ballistic movements of the eyes that abruptly change the point of fixation. The velocity threshold was set to 100°/s, as recommended by the manufacturer for gaze data acquisition with eye-tracking glasses. In the next step, the gaze data were overlaid on the first-person video footage to match fixations with AOIs (Areas of Interest: objects observed by the participant across the timeline of the experiment). The AOIs were chosen so that they included all of the individual parts of the model the participants interacted with. This process is called mapping and can be done in Tobii Pro Lab using built-in tools. The basis for this is a reference image of the object (snapshot), onto which the participant's gaze is mapped. In this way, we mapped the recordings of all 40 participants, with a total duration of more than 17 hours. Since the built-in automatic mapping can be inaccurate in some cases, all mapping results were checked and corrected manually. Using the standard data export function of the Tobii software, we obtained the initial data for the analysis of JVA.
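The principle of a velocity-threshold classification can be sketched as follows. This is a simplified illustration and not the filter implemented in Tobii Pro Lab, which includes further processing steps. It assumes that gaze direction is available as horizontal and vertical angles in degrees, approximates angular velocity by the Euclidean distance between consecutive samples divided by the sampling interval, and uses the 50 Hz sampling rate and the 100°/s threshold mentioned above. The sample data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def ivt_classify(
    x_deg: Sequence[float],
    y_deg: Sequence[float],
    sampling_rate_hz: float = 50.0,
    threshold_deg_per_s: float = 100.0,
) -> List[str]:
    """Label every gaze sample as 'fixation' or 'saccade' by its velocity."""
    n = len(x_deg)
    if n != len(y_deg):
        raise ValueError("x_deg and y_deg must have the same length")
    if n == 0:
        return []
    dt = 1.0 / sampling_rate_hz
    labels = []
    for i in range(1, n):
        # Simplifying assumption: small-angle Euclidean distance in degrees.
        dist = math.hypot(x_deg[i] - x_deg[i - 1], y_deg[i] - y_deg[i - 1])
        velocity = dist / dt
        labels.append("fixation" if velocity < threshold_deg_per_s else "saccade")
    # The first sample has no predecessor; it inherits the next sample's label.
    first = labels[0] if labels else "fixation"
    return [first] + labels


def group_fixations(labels: Sequence[str]) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) sample indices of consecutive fixation runs."""
    spans, start = [], None
    for i, label in enumerate(labels):
        if label == "fixation" and start is None:
            start = i
        elif label != "fixation" and start is not None:
            spans.append((start, i - 1))
            start = None
    if start is not None:
        spans.append((start, len(labels) - 1))
    return spans


# Hypothetical gaze trace: slow drift, a fast jump of 10 degrees, slow drift.
x = [0.0, 0.5, 1.0, 6.0, 11.0, 11.2, 11.4]
y = [0.0] * len(x)

labels = ivt_classify(x, y)
print(group_fixations(labels))  # [(0, 2), (5, 6)]
```

In this toy trace, the two samples that move 5° within 20 ms (250°/s) exceed the threshold and are labeled as a saccade, which separates the two fixations.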
c. Extraction of JVA
In the next step, the synchronized and mapped gaze data of the experimental partners were used to determine the time during which both participants fixated the same AOI, for specific time intervals (the separate phases and the entire experiment), using cross recurrence graphs (see fig. 3). The JVA value was calculated as the proportion of this time relative to the duration of the time interval for which JVA was to be determined. In a cross recurrence graph, the x-axis (y-axis) represents the time for the first (second) participant, and colored pixels mark moments of joint attention. Pixels along the main diagonal indicate that both experimental partners focused their attention on the same area at the same time, in our case either on the experimental setup (red pixels) or on the task sheet (black pixels).
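The calculation described above can be sketched under the assumption that the synchronized gaze data of both partners have been resampled into equally long time bins, each holding the fixated AOI (or `None` if no AOI was fixated). The AOI labels "setup" and "sheet", the sample sequences and the function names are hypothetical and serve only as illustration.

```python
from typing import List, Optional, Sequence

Aoi = Optional[str]


def jva_proportion(aoi_a: Sequence[Aoi], aoi_b: Sequence[Aoi], aoi: Aoi = None) -> float:
    """Share of time bins in which both partners fixate the same AOI.

    If `aoi` is given, only joint attention on that AOI is counted.
    """
    if len(aoi_a) != len(aoi_b):
        raise ValueError("both sequences must cover the same time bins")
    if not aoi_a:
        return 0.0
    joint = sum(
        1
        for a, b in zip(aoi_a, aoi_b)
        if a is not None and a == b and (aoi is None or a == aoi)
    )
    return joint / len(aoi_a)


def cross_recurrence(aoi_a: Sequence[Aoi], aoi_b: Sequence[Aoi]) -> List[List[Aoi]]:
    """Matrix R[j][i]: shared AOI if A at time i and B at time j match, else None.

    Columns (x-axis) index participant A, rows (y-axis) index participant B;
    the main diagonal R[i][i] holds simultaneous joint attention.
    """
    return [[a if (a is not None and a == b) else None for a in aoi_a] for b in aoi_b]


# Hypothetical AOI sequences for one dyad over eight time bins.
a = ["setup", "setup", "sheet", None, "setup", "sheet", "sheet", "setup"]
b = ["setup", "sheet", "sheet", None, "setup", "setup", "sheet", None]

jva_total = jva_proportion(a, b)           # 4 of 8 bins
jva_setup = jva_proportion(a, b, "setup")  # 2 of 8 bins
jva_sheet = jva_proportion(a, b, "sheet")  # 2 of 8 bins
diagonal = [cross_recurrence(a, b)[i][i] for i in range(len(a))]
print(jva_total, jva_setup, jva_sheet)  # 0.5 0.25 0.25
```

Applying the same function to the bins of a single phase yields the phase-specific JVA values, and off-diagonal entries of the matrix correspond to one partner looking at an AOI that the other partner fixated earlier or later.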

V. RESULTS AND DISCUSSION
In order to investigate the influence of JVA on the learning gains of the dyads, the first step was to divide the dyads into two groups: those in which both experimental partners showed a learning gain from the pre-test to the post-test, and those in which no or a negative learning gain was found (see tab. I). To make the results as unambiguously interpretable as possible, dyads in which only one partner showed a positive learning gain were not included in the analysis, which reduced the sample size to N = 22.
When JVA is considered globally, i.e., undifferentiated over the entire experimentation process, no correlation with the learning gain of the dyads can be found. However, when JVA is differentiated with respect to visual attention to the experiment setup and to the task sheet, there is a significant correlation between JVA on the experiment setup and the experimenters' learning gains. This is an indication that attending to the same components of the experiment setup at the same time positively influences the experimental learning process. Differentiating JVA by experimentation phase shows that the setup phase at the beginning of the experimentation is especially important in this respect: for this phase, a significant correlation between JVA and learning gain can be demonstrated.
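The grouping rule described above can be written down as a short sketch. The learning gains used here are hypothetical and do not reproduce the study data; the group labels and the function name are introduced only for illustration.

```python
from typing import Dict, List, Tuple


def classify_dyad(gain_a: int, gain_b: int) -> str:
    """Assign a dyad to an analysis group based on both partners' learning gains."""
    if gain_a > 0 and gain_b > 0:
        return "positive"      # both partners improved
    if gain_a <= 0 and gain_b <= 0:
        return "non-positive"  # no or negative gain for both partners
    return "excluded"          # only one partner improved


# Hypothetical (gain of partner A, gain of partner B) pairs.
dyad_gains: List[Tuple[int, int]] = [(2, 1), (0, -1), (1, 0), (3, 2), (-1, 0)]

groups: Dict[str, List[int]] = {"positive": [], "non-positive": [], "excluded": []}
for dyad_id, (gain_a, gain_b) in enumerate(dyad_gains):
    groups[classify_dyad(gain_a, gain_b)].append(dyad_id)

print(groups)  # {'positive': [0, 3], 'non-positive': [1, 4], 'excluded': [2]}
```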

VI. CONCLUSION AND OUTLOOK
This first study aimed to capture JVA in a collaborative experimental setting using mobile eye trackers and to investigate its influence on learning gains from the experiment. On the one hand, we succeeded in transferring the research methodology to the investigation of experimentation, a learning activity essential in the natural sciences. On the other hand, we also succeeded in gaining initial insights into the influence of JVA on learning gain. Although no significant correlation could be demonstrated between JVA for the overall experimentation process and learning gain, such a correlation could be demonstrated for JVA on the experiment setup, especially in the setup phase. This points to possible causes for a lack of learning success in collaborative experimentation and thus opens up possibilities for targeted support of the experimenters, for example, by directing the visual attention of the experimenting partners to the experiment setup in order to improve collaboration during experimentation.
The collection of process-based data such as JVA offers a great advantage over product-based data such as performance tests in that it makes it possible to support the experimentation process even before the experiment is completed. In many cases, experiential learning sessions of this kind are conducted without the help and guidance of an instructor, and the success of the experimental work depends entirely on the success of the collaboration between the participants involved in the learning process. Therefore, identifying the quality of the collaborative process as it unfolds, comparing experiments of this kind conducted with and without guidance, and understanding which instructions can improve the cooperation are promising directions for the development of this research. In perspective, the research methodology thus enables the quality of collaboration to be quantified during experimentation itself. The data obtained in this way could in turn be used either by the teacher or by a cognitive system for adaptive support of the experimentation process.
It should be noted, nevertheless, that this study is the first to use a new research methodology to investigate the collaborative experimentation process and that the data analysis is based on a small sample. The results of this study thus provide only first indications of a positive influence of JVA on learning success in collaborative experimentation. This possible influence should first be statistically verified in follow-up studies with a larger sample size before a corresponding conclusion can be drawn. JVA can also be used to look in detail at role-taking within small groups, as well as to estimate the cognitive effort spent on transaction costs. As these parameters are fundamental to the consideration of collaborative cognitive load [24][25][26], such studies may provide an opportunity to establish the relationship between JVA and students' cognitive load specifically in the case of collaborative learning activities involving experimentation. Another possible line of research is the additional collection of audio data to provide insights into the communication process during collaborative experimentation, which would allow triangulation of the JVA data with respect to the quality of the collaboration.