Dr. Xiaoming Zhai is an Assistant Professor in Science Education at the University of Georgia. He obtained his Ph.D. in curriculum and instruction (physics) at Beijing Normal University. His research focuses on developing and applying innovative assessments to support science teaching and learning. He employs cutting-edge technologies such as artificial intelligence, machine learning, and mobile learning to enhance classroom assessment practices. He is the PI on one NSF-funded project and a Co-PI on another. He has published 39 journal articles, one book, and several book chapters. He recently edited a special issue of the Journal of Science Education and Technology titled Applying Machine Learning in Science Assessments. His article, Zhai et al. (2020), From substitution to redefinition: A framework of machine learning-based science assessment, Journal of Research in Science Teaching, 57(9), 1430-1459, was selected to be featured in Wiley’s Research Headlines as the most newsworthy research across Wiley’s 1,600 journals. This research was also picked up by major academic media outlets such as ScienceDaily, the American Institute of Physics, and the American Association for the Advancement of Science for dissemination to a broad audience. He received the 2021 AERA TACTL Early Career Scholar Award and the 2021 Jhumki Basu Scholar Award from the National Association for Research in Science Teaching.

Does AI Have a Bias? A Critical Examination of Scoring Bias of Machine Algorithms on Students from Underrepresented Groups in STEM

This project will answer two research questions: (a) Are Artificial Intelligence (AI) algorithms more biased than humans when scoring the drawn models and written explanations that students from underrepresented groups in STEM (SURSs) produce in scientific modeling practice? (b) Are AI algorithms more sensitive than human experts to the linguistic and cultural features of the assessments? I will develop two sets of assessments that are aligned with the Next Generation Science Standards and that vary in their critical cultural features. I will collect middle-school student responses from a school district where almost half of the students are SURSs, and I will recruit expert scorers from both underrepresented groups and other groups to score the responses. I will use 500 scored responses to develop multiple AI models for each item and then use the models to score new testing data. I will compare machine severity in scoring SURSs’ responses against a standard (e.g., human consensus scores) and examine how the cultural features of items interact with machine scoring capacity, as compared to human raters. The findings will shed light on the potential bias introduced by using AI algorithms. Using the knowledge gained in this project, educators can identify strategies to improve culturally responsive assessments and justify the use of AI to develop more inclusive and equitable science learning.
