The generation of large-scale biomedical data is creating unprecedented opportunities for basic and translational science. Typically, the data producers perform the initial analyses, but the most informative methods often reside with other groups. Crowdsourcing the analysis of complex and massive data has emerged as a framework for finding robust methodologies. When the crowdsourcing is done in the form of scientific competitions, known as Challenges, the validation of the methods is automatically addressed. Challenges also encourage open innovation, create collaborative communities to solve diverse and important biomedical problems, and foster the creation and dissemination of well-curated data repositories.
In this talk I will discuss the scientific, methodological and social lessons learnt in the close to 50 DREAM Challenges (www.dreamchallenges.org) run to date, and in particular, I will highlight the recent Digital Mammography DREAM Challenge. In that Challenge, we asked more than 1,200 registered participants to determine the cancer status of each breast of a subject, given a screening exam, a panel of clinical/demographic information, and, if available, previous screening exams. The Challenge leveraged close to 1,300,000 de-identified digital mammography images, corresponding to 300,000 mammography exams of 150,000 women, together with demographic, clinical and longitudinal data. The community achieved excellent results, reaching specificities and sensitivities that are becoming competitive with the accuracy of radiologists in clinical practice. Interestingly, the integration of the best algorithms with radiologist assessments yielded a 1.5% improvement over the specificity of radiologists alone, which represents a decrease of more than half a million false positives in the US alone. Our results strongly suggest that AI algorithms and radiologists can enhance each other to improve screening results for the general population.
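The false-positive figure above follows from simple arithmetic: an absolute specificity gain spares that fraction of cancer-free screens from being incorrectly flagged. A minimal sketch, assuming roughly 39 million screening mammograms performed annually in the US (an assumed volume, not a figure from this abstract):

```python
# Back-of-the-envelope check of the false-positive reduction claim.
annual_us_screens = 39_000_000  # assumed annual US screening volume
specificity_gain = 0.015        # 1.5% absolute improvement, per the abstract

# Each cancer-free screen spared by the specificity gain is one
# avoided false positive (nearly all screens are cancer-free).
avoided_false_positives = annual_us_screens * specificity_gain
print(f"{avoided_false_positives:,.0f} fewer false positives per year")
```

Under that assumed volume the gain works out to several hundred thousand avoided false positives per year, consistent with the "more than half a million" estimate.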