LLM-Driven Testing: Assessing Large Language Models in Cancer Registry Applications

This research project seeks to transform cancer registry testing by harnessing the power of Large Language Models (LLMs) like ChatGPT, offering automated, generative testing methods to detect anomalies, create test cases, and enhance data quality.

This research project proposes to use Large Language Models (LLMs) to improve testing methodologies within cancer registries. With their natural language processing and generation abilities, LLMs offer the potential to automate and improve the testing process: they can assist in identifying data anomalies, generating test cases, and generating synthetic data for testing purposes. They may also surface data quality issues that are difficult to detect with traditional methods. The research is applicable to cancer registries worldwide, with a specific focus on the Cancer Registry of Norway (CRN), where the project explores LLM-driven automation for advanced testing. The key challenge addressed is the need for more efficient and comprehensive testing methodologies. By developing automated and generative testing approaches, the project aims to improve the quality of cancer registry data, with outcomes that could in turn benefit cancer research.

Goal

The primary goal of this project is to investigate and leverage various Large Language Models (e.g., GPT, Llama2, Falcon, PaLM2, Orca) to enhance testing strategies within cancer registries. By exploring the capabilities of these LLMs, we aim to develop advanced automated and generative testing methods that contribute to improved data quality and testing efficiency.
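
As a rough illustration of what generative testing could look like, the sketch below asks a model for test cases targeting a single validation rule and then runs them. Everything in it is an assumption made for illustration: the rule, the record fields, the prompt wording, and the `stub_llm` function (a stand-in that returns a canned reply so the example runs without any model). In practice `stub_llm` would be replaced by a call to whichever LLM is under study, and the rules would come from the registry itself.

```python
import json
from typing import Callable, Dict, List


def validate_record(record: Dict[str, str]) -> bool:
    """Hypothetical rule (not an actual CRN rule): diagnosis cannot precede birth.

    Dates are ISO 8601 strings (YYYY-MM-DD), so string comparison orders them correctly.
    """
    return record["birth_date"] <= record["diagnosis_date"]


def build_prompt(rule_description: str, n_cases: int) -> str:
    """Assumed prompt format asking the model for JSON test cases."""
    return (
        f"Generate {n_cases} test cases for this validation rule: {rule_description}\n"
        'Answer with a JSON list of objects with keys "record" and "expected_valid".'
    )


def generate_test_cases(
    llm: Callable[[str], str], rule_description: str, n_cases: int = 2
) -> List[Dict]:
    """Query the model and keep only well-formed test cases from its reply."""
    cases = json.loads(llm(build_prompt(rule_description, n_cases)))
    required = {"record", "expected_valid"}
    return [c for c in cases if isinstance(c, dict) and required <= c.keys()]


def run_cases(cases: List[Dict]) -> List[bool]:
    """Return, per case, whether the rule's verdict matches the expected verdict."""
    return [validate_record(c["record"]) == c["expected_valid"] for c in cases]


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model: returns a fixed reply with synthetic records."""
    return json.dumps([
        {"record": {"birth_date": "1950-03-01", "diagnosis_date": "2010-06-15"},
         "expected_valid": True},
        {"record": {"birth_date": "1980-01-01", "diagnosis_date": "1975-05-20"},
         "expected_valid": False},
    ])


if __name__ == "__main__":
    cases = generate_test_cases(stub_llm, "diagnosis date must not precede birth date")
    print(run_cases(cases))
```

Passing the model in as a plain function keeps the harness independent of any particular LLM, which is one way the different models named above could be compared on the same rules.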

Learning outcome

  • Deep understanding of LLMs
  • Hands-on experience with real-world application
  • Innovative approach development

Qualifications

  • Programming (e.g., Python)
  • Familiarity with ML and NLP concepts

Supervisors

  • Erblin Isaku
  • Shaukat Ali

Collaboration partners

  • Cancer Registry of Norway
  • Supervisor: Jan F. Nygård

Associated contacts

Erblin Isaku

PhD student

Shaukat Ali

Chief Research Scientist/Research Professor, Head of Department