AI methods for digital pathology have grown into an active field of research in recent years, yet implementing them remains cumbersome, especially for inexperienced users.

Our group has developed HistoMIL, an end-to-end framework for preprocessing, training and testing multiple instance learning (MIL) models on tasks that combine H&E-stained whole-slide images with molecular measurements.

To showcase its utility, we trained more than 8,000 models to predict the activity of more than 2,000 cancer hallmarks from breast cancer H&E slides. We observed AUROCs of up to 0.85, with cell-cycle-related pathways (notably E2F targets) among the most accurately predicted.