Multimodality and inclusion: Educator perceptions of physics simulation auditory display

We surveyed educators and college students on their perceptions of a set of physics simulations with and without non-speech auditory display. In this work, we analyzed responses to a single open-ended text prompt from the surveys and found themes related to multimodality and inclusion. In their consideration of the auditory display, some educators and students noted the complementary interplay of the auditory and visual displays, while others noted that the auditory display can serve to augment the primary modality of the visual display. Educators also identified specific groups of learners that could potentially benefit or be negatively impacted by the auditory display, including older learners, younger learners, those with certain "learning styles", and learners with sensory disabilities. This work is part of a larger effort to expand the auditory display of physics simulations to advance inclusive learning tools, and to investigate educator and student use and perceptions of multimodal physics simulations.


I. INTRODUCTION
The design of digital learning tools involves many decisions regarding choice of topic, representations, and physics model behavior. These decisions are informed by and made in concert with the capabilities of the available computing devices and associated technological infrastructure. For educational simulations, capabilities have evolved and advanced significantly over the past two decades, allowing for complex and highly performant visual displays with broad device compatibility, operating in a web browser for direct access nearly anywhere in the world.
Other display capabilities have also advanced, in particular auditory displays. The presence of modern Web Audio and Web Speech APIs [1,2] allows for robust inclusion of sound and speech into web-based applications, such as interactive physics simulations. To date, auditory displays are not often considered as a significant pedagogical feature in the creation of educational learning tools.
The growing possibilities of auditory display allow designers to expand beyond the visual display more efficiently than ever before, and create opportunities to complement, augment, or replace visual displays. These possibilities invite designers and educators to expand their modal palette and reimagine traditional graphics-centric learning tools, to incorporate and express musical and linguistic skills, and to increase the capacity of digital learning tools to support inclusive learning opportunities. Inclusive learning tools include multimodal displays that can be adapted [3] to meet the needs of the educator and learner(s) in the moment, decreasing the need to create alternative materials for learners with diverse needs and increasing opportunities for learning and collaboration among learners with diverse needs. Here we share findings from an investigation of teacher and student perceptions related to multimodality and inclusion after using interactive physics simulations with non-speech auditory display (sonifications and sound effects).

II. AUDITORY DISPLAY AND PHET INTERACTIVE SIMULATIONS
There are currently eleven physics simulations with sonifications and sound effects within the widely used PhET Interactive Simulations collection [4]. These auditory displays were designed by an interdisciplinary team with expertise in music and composition, physics, linguistics, education research, simulation and inclusive design, software development, and web accessibility. The iterative design process for each auditory display included feedback from physicists and physics teachers, and user interviews with youth, college students, and adults, including those with and without visual impairments [5-9]. Designs were also informed by authentic use of these simulations observed within formal and informal science classroom settings with middle and high school youth, including students with learning disabilities [10,11], those with visual impairments [12], and bilingual learners (primarily Spanish/English) [13].
We also conducted surveys of educators and students. The surveys allowed participants to experience a set of simulations with and without non-speech sound. Survey details are described in Section III. Findings from these surveys indicated a strong preference for the presence of sound by educators and students [14]. For example, the vast majority (77.5%) of the 2,471 educators who completed the survey believed that simulations should be designed with auditory display and educators consistently rated the with-sound variants of the simulations as more helpful, easy to understand, and enjoyable than the without-sound versions of the same simulations. Interestingly, musical sophistication was not a significant predictor of responses.
The surveys included one simulation ranking question; participants were asked to rank the four simulations in the survey (two simulations, each experienced with and without sound) in order of preference. This ranking question was followed immediately by an open-ended text response prompt to write the reasoning for their chosen rankings. In this paper, we present a portion of the results of a qualitative analysis of the text responses, focusing our analysis on the themes of multimodality and inclusion.

III. SURVEY DESIGN AND SIMULATIONS
Educator Participants. Educator participants were users of the PhET Interactive Simulations project website (http://phet.colorado.edu). Visitors to the PhET website can create a user account and opt in to receiving email announcements. During account creation, they can provide information such as role (teacher, pre-service teacher, student, etc.), STEM subject specialty, and grade level. An email invitation to complete a research survey was sent to the subset of users who selected at least one category of Teacher, Pre-service Teacher, Teacher Educator, or Other. Additionally, an initial survey question asked participants to select their role; only those selecting educator roles were able to proceed with the survey. The survey was estimated to take 15 minutes or less to complete. No compensation was provided. The total number of invited participants was 202,429; 4,658 responded to the survey beyond the role selection question; 2,471 users completed the survey.
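The participation figures above imply the following response and completion rates. This is a small arithmetic sketch in Python; the variable and function names are ours, and only the three counts come from the text.

```python
# Counts reported for the educator survey.
invited = 202_429    # users who received the email invitation
started = 4_658      # responded beyond the role selection question
completed = 2_471    # completed the survey


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


started_of_invited = percent(started, invited)
completed_of_invited = percent(completed, invited)
completed_of_started = percent(completed, started)

print(f"started / invited:     {started_of_invited:.2f}%")
print(f"completed / invited:   {completed_of_invited:.2f}%")
print(f"completed / started:   {completed_of_started:.1f}%")
```

Roughly 2.3% of invited users began the survey, and about half of those who began it completed it.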
Student Participants. Student participants were enrolled in Psychology courses at Georgia Tech. This survey was completed by 261 college students, who chose the survey from a pool of research studies available for course credit.
Simulations. The simulations (sims) included in the survey (Figure 1) were selected from eight sims published with sonification at the time of survey creation (April 2020), representing the least complex sims in that set.
In John Travoltage [15,16], Figure 1A, the auditory display includes the sound of the foot rubbing on the rug, a "pop" sound as negative charges transfer onto John's body, a low continuous hum representing the charges on John's body, a ratchet-like sound when John's arm is moved, and an electrical "zap" sound as charges are discharged from John's body. In Friction [17], Figure 1B, the auditory display includes a rubbing sound when the books are rubbed together and a sound representing the "molecules" jiggling, which changes as temperature changes. In Ohm's Law [18], Figure 1C, when voltage or resistance is changed, a repeating 2-second sound clip plays, with changes in pitch and tempo mapped to, and representing, changes to the value of current. In Resistance in a Wire [19], Figure 1D, when resistivity, length, or area is changed, a short marimba tone is played, with changes in pitch mapped to, and representing, changes to the value of resistance.
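To illustrate the kind of value-to-pitch mapping described for Ohm's Law, the following minimal Python sketch computes the current from Ohm's law and maps it to a frequency. It is not the PhET implementation: the function names, the 220-880 Hz range, the example value ranges, and the exponential mapping are all assumptions chosen for illustration, and the tempo mapping and audio playback are omitted.

```python
def ohms_law_current(voltage_v: float, resistance_ohm: float) -> float:
    """Current in amperes from Ohm's law, I = V / R."""
    if resistance_ohm <= 0:
        raise ValueError("resistance must be positive")
    return voltage_v / resistance_ohm


def value_to_frequency(value: float, vmin: float, vmax: float,
                       f_low: float = 220.0, f_high: float = 880.0) -> float:
    """Map a value in [vmin, vmax] to a frequency in [f_low, f_high] Hz.

    The mapping is exponential, so equal steps in the value correspond
    to equal musical intervals. Values outside the range are clamped.
    The default frequency range is a hypothetical choice (A3 to A5).
    """
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    clamped = min(max(value, vmin), vmax)
    t = (clamped - vmin) / (vmax - vmin)  # normalized position, 0..1
    return f_low * (f_high / f_low) ** t


# Hypothetical example: 4.5 V across 500 ohms gives about 0.009 A (9 mA).
current = ohms_law_current(4.5, 500.0)
# Assumed current range for the mapping: 0 A to 0.018 A.
pitch_hz = value_to_frequency(current, 0.0, 0.018)
```

With these assumed ranges, a current at the midpoint of the range maps to the pitch one octave above the lowest tone, and raising the voltage or lowering the resistance raises the pitch.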
Survey Design. In this paper we focus on responses collected from a single text-response item. We describe here the structure of the survey to provide context for this text-response item. The full survey is available [14].
The survey was structured for participants to experience one sim with and without sound, and then to experience a second sim with and without sound, for a total of four sim experiences. Sims were randomly selected from a pool of four simulations (John Travoltage, Friction, Ohm's Law, and Resistance in a Wire); the ordering of the with-sound and without-sound variants was also random. After using each sim, educators were asked to rate seven Likert scale statements about their perceived performance, usability, and affect during interaction with the sim. If the sim had sound, they were asked to rate an additional seven Likert scale statements about the perceived performance, usability, and affect regarding the sound in the sim.
After interacting with all four sim experiences included in their survey, educators were asked to rank the four sims (two sims with sound, two sims without sound). After ranking the four sims, they were provided with an open response text field with the prompt "Please explain why you ranked the simulations in the order above." Responses to this question are the focus of this paper. This prompt was followed by a prompt to rate their agreement with a statement regarding preference for inclusion of sound features across all PhET sims.
A variant of the survey was developed for college students at Georgia Tech. This version moved questions regarding role to the demographics section, and constrained the random selection of sims to disallow the specific pairings of Resistance in a Wire/Ohm's Law and Friction/John Travoltage. Otherwise, the survey content was unchanged. The constraint on sim pairings was added to allow for further investigation of emerging trends identified in the educator survey [14].
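The random assignment described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the survey platform's actual logic: we assume sim pairs were drawn uniformly from the allowed pairs, and the names `allowed_pairs` and `assign_survey` are hypothetical.

```python
import random
from itertools import combinations

SIMS = ["John Travoltage", "Friction", "Ohm's Law", "Resistance in a Wire"]

# Pairings disallowed in the college student variant of the survey.
DISALLOWED_STUDENT_PAIRS = {
    frozenset({"Resistance in a Wire", "Ohm's Law"}),
    frozenset({"Friction", "John Travoltage"}),
}


def allowed_pairs(student_variant: bool = False) -> list:
    """List the unordered sim pairs a participant may be assigned."""
    pairs = [frozenset(p) for p in combinations(SIMS, 2)]
    if student_variant:
        pairs = [p for p in pairs if p not in DISALLOWED_STUDENT_PAIRS]
    return pairs


def assign_survey(rng: random.Random, student_variant: bool = False) -> list:
    """Return four (sim, variant) experiences in presentation order.

    Assumption: the pair, the sim order, and the order of the with-sound
    and without-sound variants within each sim are all chosen uniformly.
    """
    pair = sorted(rng.choice(allowed_pairs(student_variant)))
    rng.shuffle(pair)  # which sim is experienced first
    experiences = []
    for sim in pair:
        variants = ["with sound", "without sound"]
        rng.shuffle(variants)  # which variant of this sim comes first
        experiences.extend((sim, variant) for variant in variants)
    return experiences


plan = assign_survey(random.Random(2020), student_variant=True)
```

Under this sketch the educator survey draws from six possible sim pairs and the student variant from four.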

IV. METHODS
There were 2,186 educator responses to the open response prompt; 2,135 were written in English, 34 in Spanish, and 17 in neither English nor Spanish. The authors are fluent in English and interpreted the English responses directly. The Spanish responses were translated to English by a native Spanish-speaking colleague with extensive experience translating from Spanish to English for the PhET community; the translated text was used in our analysis. The remaining 17 responses were not translated and were not included in our analysis.
Coding was conducted at the sentence level (sentence as the unit of analysis). None, one, or more codes could be assigned to each sentence. Multiple sentences could receive one code when one theme spanned multiple sentences in sequence. We limited analysis to sentences related to sound in the sims. After removing responses with no sentences related to sound, 1,829 educator responses remained.
The codebook was developed by authors BF and TS. Each independently coded the first 50 responses (ordered by time of submission) and developed a codebook; the two codebooks were compared, and a consensus codebook was created from them. BF and TS then independently applied the consensus codebook to the next 50 responses (#51-100). The resulting analyses for responses #51-100 were discussed, and final revisions were made to the codebook. This final codebook was applied in the analysis of all responses by TS. Responses #101-200 were also coded by BF, with inter-rater agreement of 92%.
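One simple way to compute an inter-rater agreement figure of this kind is percent agreement over coded units, sketched below in Python. The text does not specify how the 92% was calculated, so the exact-match criterion here (two coders agree on a sentence only if they assigned the same set of codes) is an assumption, and the example data are invented.

```python
def percent_agreement(coder_a: list, coder_b: list) -> float:
    """Percentage of units to which both coders assigned identical code sets.

    Each argument is a list with one entry per unit (e.g., per sentence);
    each entry is a collection of code names, possibly empty.
    """
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must code the same units")
    if not coder_a:
        raise ValueError("at least one unit is required")
    matches = sum(1 for a, b in zip(coder_a, coder_b) if set(a) == set(b))
    return 100.0 * matches / len(coder_a)


# Invented example: four sentences coded by two coders.
coder_1 = [{"Multimodal"}, {"Inclusion"}, set(), {"Age", "Inclusion"}]
coder_2 = [{"Multimodal"}, {"Inclusion"}, set(), {"Age"}]
agreement = percent_agreement(coder_1, coder_2)  # 3 of 4 units match
```

Percent agreement does not correct for chance agreement; a chance-corrected statistic would be a different calculation.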
There were 3,516 codes assigned to the 1,829 sound-related educator responses in total. In this paper we focus on data coded with the Multimodal, Modal Comparison, Inclusion, and Age codes. The codes, descriptions, frequency, and an example quote for each of these codes can be found in Table I. The codebook used for the full dataset is available online [14]. Two extra codes, Conceptual Understanding (Positive) and Silly (Negative), are included in Table I for comparison as the highest and lowest frequency codes, respectively. For some codes, including these two codes, we additionally conducted sentiment analysis, indicating a positive, negative, or neutral sentiment. After analysis of the PhET educator survey data set was complete, the codebook was applied to the college student data set.

TABLE I. The full codebook can be found at [14].

Code: Inclusion
Description: Indicates idea about sound relative to some group or type of person or learner.
Frequency: 39 (1)
Example quote: "For an [sic] personal use its fine without sound, but thinking about blind students, than sound can be a great help."

Code: Age
Description: Compares appropriateness or predicts experience between age/generational groups.
Frequency: (0)
Example quote: "I worry that the sounds would be a distraction for younger students"

Code: Silly (Negative)
Description: Sounds were indicated to be silly or humorous in nature such that they were harmful to the simulation experience.
Frequency: (2)
Example quote: "The sounds for the resistance in a wire were just bizarre and I would imagine that if I didn't turn the sounds off, my students would just laugh."

V. THEMES FROM EDUCATOR PERCEPTIONS OF SOUND IN PHYSICS SIMULATIONS
Here, we describe themes identified from responses that were coded with Multimodal, Modal Comparison, Inclusion, and Age.
Multimodal. The Multimodal code was applied to 128 educator and 7 student responses that indicated symbiosis between the auditory and visual displays in a sim; neither modality is clearly dominant. We noticed two themes in the data coded with the Multimodal code: 1) the visual and sound displays were complementary, and 2) sound served to augment the visual display. In addition to the example quote shown in Table I, noting the "visual and aural effects were matched well", a representative example from an educator who experienced Resistance in a Wire and John Travoltage notes an increase in usability and positive affect due to the multimodal displays: "The multisensory input made it much easier and more fun to process that something was increasing or decreasing." Here, the educator considers the multimodal displays as working together (complementary) to enhance the overall experience.
A different perspective within the Multimodal coded data set indicated that sound served to augment the visual display. For example, an educator reflecting on Ohm's Law wrote: "Sound helped with visualizing the concept-you could hear the increase as the pitch increased in Ohm's law." We interpret this comment and those like it as indicating that the educator perceives the sound as a secondary modality that enhances the visual display experience. Consistent with this interpretation, the Multimodal code and the Conceptual Understanding (Positive) code (shown in Table I) overlapped in 57 coded segments, representing 45% of the Multimodal coded segments and the second highest code overlap for the entire dataset (highest = 68). We consider this to indicate that some educators, when presented with pedagogically informed multimodal displays, can readily identify specific merits of multimodal displays that could enhance conceptual understanding.
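The overlap share reported above follows from 57 of 128 Multimodal coded segments. The Python sketch below checks that arithmetic and shows one way code co-occurrences could be counted; the segment data are invented, the function names are ours, and this is not the analysis software used in the study.

```python
from collections import Counter
from itertools import combinations


def code_overlaps(segments: list) -> Counter:
    """Count how often each pair of codes is assigned to the same segment.

    Each segment is a collection of code names. Pairs are stored as
    alphabetically sorted tuples so (A, B) and (B, A) are the same key.
    """
    counts = Counter()
    for codes in segments:
        for pair in combinations(sorted(set(codes)), 2):
            counts[pair] += 1
    return counts


def overlap_share(overlap_count: int, code_count: int) -> float:
    """Overlap as a percentage of all segments carrying a given code."""
    if code_count <= 0:
        raise ValueError("code_count must be positive")
    return 100.0 * overlap_count / code_count


# Invented example segments, for illustration only.
segments = [
    {"Multimodal", "Conceptual Understanding (Positive)"},
    {"Multimodal"},
    {"Multimodal", "Conceptual Understanding (Positive)", "Inclusion"},
    {"Age"},
]
overlaps = code_overlaps(segments)

# Reported figures: 57 overlaps out of 128 Multimodal coded segments.
reported_share = overlap_share(57, 128)
```

The reported share evaluates to about 44.5%, which rounds to the 45% stated in the text.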
Modal Comparison. The Modal Comparison code was applied to 29 educator and 6 student responses that included comparisons between auditory and visual displays; modalities were considered on different levels. The Modal Comparison code was included as a contrasting code to Multimodal, as multiple responses explicitly contrasted or ranked the visual and auditory displays.
The dominant theme found within responses coded with Modal Comparison is a qualitative ranking between modalities, primarily underscoring participants' perception of the importance of the visual display relative to the auditory display. This representative quote is from a student who interacted with Resistance in a Wire and Friction: "The explanation of the concept was effectively done through visual effects rather than the sound effects. Hence, the sound effects were not necessary." An educator wrote in their comparison of modalities: "I overall feel the noise distracts from kids noticing the visual details".
Inclusion. The Inclusion code was applied to 39 educator responses and 1 student response that indicated an idea about sound relative to some group or type of person/learner. Responses coded with Inclusion indicated that participants found the auditory display to be useful for learners with different "learning styles", or with sensory disabilities. Here is an example from an educator of a reference to "learning styles": "The friction sound help with the visual, and the ohm's law sound was a great addition, but it felt more like an added feature that would be good for my auditroy [sic] learners." Of the 39 responses coded with Inclusion, 15 (38%) made specific references to learners with sensory disabilities; for example: "I couldn't hear any difference in the sounds for the different variables in the Resistance in a Wire simulation. I think the sounds could be useful for visually impaared [sic] students but they would need to be distinct enough to discriminate between the variables." This educator acknowledges the usefulness of sound in Resistance in a Wire for visually impaired learners, though they also indicate a misinterpretation of the auditory display. This educator expected there to be distinct sonifications for each slider (presumably indicating increase and decrease in value for each slider individually); instead, each slider is associated with a single sonification of the resulting change to resistance. Though not included as part of the survey, each PhET sim published with sonification features has an accompanying brief Sound Features video on the sim's webpage. This video is typically not required for understanding the sonifications, but can be helpful if there are questions about the sound design.
Age. The Age code was applied to 31 educator responses that compared appropriateness or predicted experience between age or generational groups. The majority of responses with this code conveyed the theme that sound display could be detrimental to some groups of learners based on age or generation. Perspectives varied on which age group of learners would benefit most or potentially be harmed. In addition to the quote in Table I, one educator wrote: "I teach older students and I do not think the sound is necessary....and might interfer [sic] with thinking." A few educators did note the possible benefits of non-speech auditory display for all youth. One educator remarked on possible generational differences between students and educators: "Our students are so used to video games with sound. They will prefer the sound enhanced simulation." No responses from the college student data set mentioned the use of sound for specific age groups.

VI. DISCUSSION
We found it interesting that participants noted the role of multimodal features and their potential effects for different learner groups in response to a prompt to justify their sim rankings, and we wanted to explore these responses further. Most of the responses were positive regarding participants' experience of the visual and auditory displays, though some clearly aimed to emphasize the primacy of the visual display or to express concern for the potential negative impact of the auditory display for some learners. Historically, the visual display in physics simulations has been primary, to the extent that access to the visual display is mandatory to experience the conceptual and contextual information in the sim. In contrast, non-speech auditory display in many games and simulations is designed with less careful attention to conceptual implications than is given to the visual design. Consequently, auditory displays can often be turned off from the device or through controls within the tool without loss of conceptual information from the sim. In this study, many educators indicated that the addition of non-speech auditory feedback to an interactive learning tool can be helpful, complementary, and supportive of learners with and without disabilities.

VII. FUTURE WORK
Auditory display can include speech sounds as well as nonspeech sounds. In other work, we have developed Interactive Description accessible through screen reader software (assistive technology commonly used by people with significant visual impairments) for the sims included in this study, and others. When using Interactive Description, the verbal description replaces the visual display as the primary modality, with the non-speech sounds serving as a complement. The result is an entirely non-visual physics sim experience. Building upon those efforts, we have recently begun developing sims with a new speech sound display, Voicing. With Voicing enabled, learners can hear, directly through their web browser, verbalized text descriptions as they interact with the sim. The first PhET sims with the Voicing feature will be released mid-Summer 2021.

VIII. CONCLUSIONS
In this work, educators and students were surveyed regarding their perceptions of a set of physics simulations with and without non-speech auditory display. We analyzed responses to an open-ended text response prompt within the surveys, finding themes related to multimodality and inclusion. We found that educators and students considered the respective roles of the visual and auditory display in these physics sims, some noting them as complementary and others noting the auditory display as secondary. Some educators and students also identified specific groups of learners that could potentially benefit or be negatively impacted by the auditory display.
These efforts contribute to work to extend the use of auditory display for supporting physics learning, and to produce learning tools that are inclusive of those with and without disabilities. We hope that this and further investigations highlight the importance of considering auditory features, both for designers of physics education tools and for increasing the ways educators can connect physics concepts with their diverse classrooms.