Abstract
The past decade has seen an exponential increase in AI engagement across scientific disciplines, notably in the life sciences, where it is used for complex, data-intensive tasks including the identification of disease genes, the prediction of protein folding and protein-drug interactions, and the design of molecular tools. Originally a niche field of computer science, AI has diffused into the wider research landscape and is reshaping the way in which science is conducted. For AI to have a positive impact on science and society, the development of machine intelligence must be informed by cross-disciplinary insights from different research fields, as well as by other facets of human knowledge and experience: AI that is led by humans and that serves real-world science needs.

This project aims to explore the roles of humans in AI for life sciences research, defining how AI model training and development is guided by researchers, how humans interpret and validate knowledge generated by computers, and how computer scientists interact with other scientific researchers to transfer knowledge between disciplines. Gathering evidence and developing understanding of these issues is critical both to the design of new AI techniques that serve the needs of science and to the establishment of trust in AI among researchers and the general public.

AI methodology is commonly underpinned by machine learning algorithms that are trained on datasets; in life sciences research these can include quantitative data such as DNA sequences and RNA and protein expression patterns, as well as images and text. AI can be used to identify patterns in unlabelled data, termed 'unsupervised' learning, or be trained on human- or machine-labelled datasets, termed 'supervised' learning (the two paradigms are contrasted in the first sketch following this abstract). AI can also be trained using reinforcement methods, in which a machine makes initially random decisions that are progressively rewarded or corrected, often through human feedback, to improve task performance. Interactions between humans and machine learning algorithms enable the transfer of human knowledge to computer models: not only hard data, but also the finer nuances of human experience. For AI to be both demonstrably free from bias and trustworthy, its knowledge outputs must also be 'explainable' to the human researcher: the route by which they are generated must be decipherable and transparent (one common approach is illustrated in the second sketch following this abstract). Maximising the potential of AI to tackle complex research questions is critically dependent on cross-disciplinary collaboration between computer scientists, who typically build AI models, and other scientific researchers, who can provide domain-specific expertise to guide model training and harness model outputs to accelerate scientific discovery.

This project will carry out quantitative reviews of relevant academic and grey literature, together with in-person and questionnaire-based interviews with life sciences researchers who employ AI techniques in their work. I aim to answer the following questions: (1) In what ways do human researchers interact with machine learning algorithms to train AI models? (2) What steps are taken to ensure that the outputs of AI models are explainable? (3) What motivates life sciences researchers' choices about AI methodology, and how do they work together with computer scientists? By answering these questions, I will generate the first comprehensive analysis of the ways in which humans and AI models co-create knowledge for life sciences research.
These findings can help both to shape the evolution of useful and trustworthy, human-inspired AI that addresses the research questions of the future, and to inform the evaluation of transparency and explainability required for AI assurance regulatory frameworks.
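The training paradigms described above can be illustrated with a short sketch. The following Python example is not part of the proposed study: it uses synthetic values standing in for a hypothetical gene-expression matrix, together with standard scikit-learn estimators, purely to contrast unsupervised learning (finding structure in unlabelled data) with supervised learning (fitting human-provided labels).

```python
# Illustrative sketch only: supervised vs. unsupervised learning on synthetic
# data standing in for a gene-expression matrix. All data, labels and
# parameter choices are assumptions made for the example.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))            # 200 samples x 50 "genes" (synthetic expression values)
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # hypothetical human-assigned labels, e.g. disease vs. control

# Unsupervised learning: find groupings in the data without using any labels.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Supervised learning: learn a mapping from the data to the human-provided labels.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)

print("cluster sizes:", np.bincount(clusters))
print("held-out accuracy:", clf.score(X_test, y_test))
```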
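'Explainability' can likewise be made concrete with a small example. One simple, widely used approach is to report which input features most influence a trained model's predictions; the sketch below uses permutation importance from scikit-learn on synthetic data with hypothetical gene names, and shows one possible technique rather than a method prescribed by this project.

```python
# Illustrative sketch only: permutation importance as one way to make a
# model's outputs more explainable. Data and feature names are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
y = (2 * X[:, 0] - X[:, 2] > 0).astype(int)   # outcome driven mainly by features 0 and 2
feature_names = ["geneA", "geneB", "geneC", "geneD", "geneE"]  # hypothetical gene names

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
model = RandomForestClassifier(random_state=1).fit(X_train, y_train)

# Permutation importance: how much does held-out performance drop when one
# feature's values are shuffled? Larger drops indicate more influential features.
result = permutation_importance(model, X_test, y_test, n_repeats=20, random_state=1)
for name, score in sorted(zip(feature_names, result.importances_mean), key=lambda p: -p[1]):
    print(f"{name}: {score:.3f}")
```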