Amsterdam University Medical Centers - location VUmc
Co-data random forest learning for rare tumors
Rare cancers pose a major problem for machine learning algorithms: most genomic studies on rare cancers contain data on a relatively small number of patients and a large number of genomic features (e.g. genes). Such a setting is challenging for machine learners, because they may overfit or fail to find relevant signal. Our aim is to steer the machine learners in the right direction. To do so, we make use of the vast amounts of complementary data (co-data) on the features that are available in online repositories. We propose to build a well-interpretable tree-based learner that unites strong elements of machine learning, statistics and biology: it accounts for complex molecular interactions, while improving predictive performance by estimating feature weights from biological co-data and incorporating these weights in the learner. We focus on prognosis for three rare disease entities of lymphoma, using a variety of genomics data. The project is a collaboration between prof. Mark van de Wiel (PI), dr. Thomas Klausch and prof. Daphne de Jong, all at Amsterdam UMC.
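The idea of letting co-data steer a tree-based learner can be illustrated with a small sketch. This is not the project's method, only one plausible reading of it: per-feature co-data scores are turned into sampling probabilities, and each tree then draws its candidate features according to those probabilities instead of uniformly. The p-values, the softmax mapping, and the names `codata_to_weights` and `sample_features` are all assumptions made for illustration.

```python
import math
import random


def codata_to_weights(codata_scores, temperature=1.0):
    """Map per-feature co-data scores to sampling probabilities (softmax).

    The softmax is an illustrative choice, not taken from the project.
    """
    scaled = [s / temperature for s in codata_scores]
    top = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_features(weights, k, rng):
    """Draw k distinct feature indices, each draw proportional to its weight."""
    remaining = list(range(len(weights)))
    chosen = []
    for _ in range(k):
        picked = rng.choices(
            remaining, weights=[weights[i] for i in remaining], k=1
        )[0]
        chosen.append(picked)
        remaining.remove(picked)
    return chosen


# Hypothetical co-data: p-values for six genes from an external study.
external_p_values = [0.001, 0.5, 0.8, 0.04, 0.9, 0.3]
scores = [-math.log10(p) for p in external_p_values]
weights = codata_to_weights(scores)

# Count how often each feature becomes a split candidate across 1000 trees.
rng = random.Random(0)
counts = [0] * len(weights)
for _ in range(1000):
    for feature in sample_features(weights, k=2, rng=rng):
        counts[feature] += 1

print("sampling weights:", [round(w, 3) for w in weights])
print("times offered as candidate:", counts)
```

Features with strong external evidence are offered to the trees far more often, which is how prior knowledge can guide the learner when patients are few. In a real analysis the weights would be estimated from the data and the co-data together rather than fixed by hand as they are here.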
Hanarth Fonds, Wolga 5, 2491 BK Den Haag. Telephone: +31 70 20 60 177. Email: