Supporting scientific research
LUMC
In short, Danielle Cohen's research entails the following:
Diagnosing salivary gland tumors (SGTs) is one of the most challenging tasks for pathologists. Salivary gland tumors are extremely rare and diverse. Both malignant and benign tumors comprise dozens of subtypes, and the number keeps growing as molecular insights reveal new types. Pathologists may encounter some subtypes only a few times in their careers, making it difficult, if not impossible, to build expertise. For patients, the consequences are significant: salivary gland diagnostics are associated with delays and misdiagnoses, which can stand in the way of timely and appropriate treatment.
In this project, we are developing SalvIdentify, a platform inspired by the successful biodiversity classification tool 'ObsIdentify' from the Naturalis Center for Biodiversity. SalvIdentify aims to empower pathologists worldwide to upload anonymized digital images of SGTs and receive a differential diagnosis based on SALV-AI, a national digital histopathological database of SGTs.
The research is conducted by the Dutch Salivary Gland Tumor Consortium (DSGTC), a collaboration of head and neck pathologists and clinicians from head and neck oncology centers in the Netherlands and beyond. The goal is for SalvIdentify to enable pathologists everywhere (in the Netherlands and worldwide) to tap into decades of salivary gland pathology 'memory.' This would make it easier to identify rare subtypes and would reveal patterns in the biological behavior of subgroups more quickly.
Our research plan includes the following objectives:
• generating SALV-AI (the world's largest open-source histopathological SGT database);
• developing the SalvIdentify model;
• launching the SalvIdentify platform;
• validating the added value of SalvIdentify in clinical practice.
SalvIdentify is a collaborative request on behalf of the 'Dutch Salivary Gland Tumor Consortium'.
During the first year of the SalvIdentify project, we established a strong foundation for achieving our research objectives. A key milestone was the expansion of the SALV-AI dataset, now comprising 3,012 digitized salivary gland tumor cases from six Dutch pathology departments (LUMC, EMC, UMCU, HMC, MUMC, AVL). An additional ~2,000 cases are currently being processed from UMCG, AUMC, and HagaHospital.
To ensure a high-quality gold standard, we implemented a structured and largely automated revision pipeline that enhances consistency and minimizes human error. This system standardizes data organization, ensures traceability, and facilitates scoring via Slide Score and in biweekly revision sessions with a dedicated team of five to eight experienced Dutch head and neck pathologists.
A notable secondary benefit of our large dataset and team of collaborating pathologists has been the ability to form larger series of extremely rare subtypes, opening up new research opportunities. This has already resulted in the submission of a first paper, which describes a distinctive pleomorphic adenoma subtype (HMGA2 alterations and MDM2 co-amplification) and is currently under review in the Journal of Head and Neck Pathology. A second manuscript, now in preparation, details the methodology behind the SALVDataset, which serves as the foundation for AI development. Collectively, these efforts contribute to the largest and most comprehensive salivary gland tumor dataset to date, with the aim of improving diagnostics and enabling AI integration in clinical practice.
We have also made significant progress in integrating clinical data, with patient information recorded in CASTOR and additional data collected via CTCue. To support real-world adoption of our AI model, we have initiated a collaboration with patient advocacy groups.
Since joining the project in December 2024, AI PhD candidate Jurre Weijer has focused on model architecture, data preprocessing, and algorithm development. His work includes building an image retrieval model that allows pathologists to search for diagnostically similar cases, supported by pathology foundation models capable of generating robust whole slide image representations.
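For readers curious how retrieval of similar cases can work in principle, here is a minimal sketch. It is not the project's implementation. It assumes that each whole slide image has already been reduced to a numeric vector (an "embedding") by some foundation model, and that similarity is measured with cosine similarity. Both are illustrative assumptions, as are the function names, case identifiers, and toy three-number embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector has no direction; treat it as dissimilar
    return dot / (norm_a * norm_b)


def retrieve_similar(query, archive, top_k=3):
    """Return the top_k (case_id, similarity) pairs, most similar first."""
    scored = [
        (case_id, cosine_similarity(query, embedding))
        for case_id, embedding in archive.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy data: real slide embeddings have hundreds of dimensions.
archive = {
    "case_A": [0.9, 0.1, 0.0],
    "case_B": [0.1, 0.9, 0.1],
    "case_C": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]  # embedding of the newly uploaded slide (assumed)

results = retrieve_similar(query, archive, top_k=2)
for case_id, score in results:
    print(f"{case_id}: {score:.3f}")
```

In this toy example the two archive cases whose embeddings point in nearly the same direction as the query ("case_A" and "case_C") are returned first. A pathologist could then review those cases and their confirmed diagnoses alongside the new slide.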
With a solid foundation in place, the next phase of the project will focus on advancing AI models and validating them for clinical use.