Applying machine learning models in multi-institutional studies can generate bias
There is increasing interest in deploying machine learning models at scale for multi-institutional studies in physics education research. Here we investigate the efficacy of applying machine learning models to institutions outside of their training set, using natural language processing to code open-ended survey responses. We find that, in general, changing institutional contexts can affect machine learning estimates of code frequencies: either previously documented sources of uncertainty increase in magnitude, new unknown sources of uncertainty emerge, or both. We also find an example where uncertainties do not change between the institution used in the training data and an institution not in the training data. Results suggest that attention to uncertainty is critical, especially when making measurements of student writing across multi-institutional data sets.
Physics Education Research Conference 2024
Part of the PER Conference series. Boston, MA: July 10-11, 2024. Pages 144-149.
Rebeckah K. Fussell, Meagan Sundstrom, Sabrina McDowell, and Natasha G. Holmes. "Applying machine learning models in multi-institutional studies can generate bias." Paper presented at the Physics Education Research Conference 2024, Boston, MA, July 10-11, 2024. https://www.compadre.org/portal/items/detail.cfm?ID=16886