Computational Toolbox for Discovery of Prognostic Biomarkers for Survival Analysis

Research activity

Code	Science	Field	Subfield
2.07.00	Engineering sciences and technologies	Computer science and informatics

Code	Science	Field
1.02	Natural Sciences	Computer and information sciences

Keywords

biomarkers, survival analysis, gene expression, interactive visualisations, explorative data analysis

Evaluation (methodology)

Evaluation of bibliographic research performance indicators according to ARIS methodology

Citations for bibliographic records in COBIB.SI that are linked to records in citation databases

Organisations (1), Researchers (8)

1539 University of Ljubljana, Faculty of Computer and Information Science

No.	Code	Name and surname	Research area	Role	Period	No. of publications
1.	16324	PhD Janez Demšar	Computer science and informatics	Researcher	2021 - 2024	347
2.	32930	Aleš Erjavec		Technical associate	2021 - 2024	12
3.	56629	Pavlin Gregor Poličar	Computer science and informatics	Researcher	2022 - 2024	15
4.	57109	Ela Praznik	Computer science and informatics	Researcher	2022 - 2023	0
5.	38461	PhD Ajda Pretnar Žagar	Computer science and informatics	Researcher	2021 - 2024	61
6.	30142	PhD Marko Toplak	Computer science and informatics	Researcher	2021 - 2024	37
7.	12536	PhD Blaž Zupan	Computer science and informatics	Head	2021 - 2024	571
8.	30921	PhD Lan Žagar	Computer science and informatics	Researcher	2021 - 2024	17

Abstract

The proposed project will build a powerful yet intuitive toolbox to support and automate the discovery of complex prognostic biomarkers from transcriptomic data. Multi-gene biomarkers of disease can capture a patient's physiological state at the point of diagnosis or treatment decision and are a cornerstone of precision medicine. To advance this modern paradigm of medicine in our aging society, we propose to employ methods from data science, machine learning, and data visualization to design and implement a system for computer-aided discovery (CAD) of biomarkers. Current solutions for the discovery of prognostic biomarkers from clinical survival data consist of disparate code fragments in R or Python. Integrating them with knowledge bases requires substantial effort, and exploring the results interactively and visually requires more work still. There is a great need for comprehensive tools that help domain experts uncover hidden patterns and communicate the results to other stakeholders in the process, e.g., clinicians and health regulators. We propose to develop a set of computational methods and interactive model exploration techniques to democratize the science of biomarker discovery. We will focus on complex multi-gene expression (transcriptomic) biomarkers that can predict cancer patients' clinical outcomes, including overall survival and progression-free survival. Our methods will be included in a tool that finds and ranks potential biomarker gene sets and visualizes the relationships between genes in a gene set using ontologies, controlled vocabularies, and other forms of curated public knowledge. This knowledge-supported discovery will open up the black box of data-driven biomarker discovery and make the results interpretable in the broader context of biomedical science. The project's deliverables will help newcomers and experts from academia and industry discover biomarkers faster.
We will achieve that by focusing on three aspects:
Robust data science. Developing and open-sourcing a coherent Python library of computational approaches, including survival-based biomarker interaction analysis, biomarker maps, and heuristic search for groups of biomarkers through the integration of data and knowledge bases.
Ease of use. Implementing the tools within Orange, our established framework for visual programming and data science, with interactive visualizations that seamlessly integrate with public repositories of data and knowledge.
Communication. The guiding principle for software design will be improved communication between stakeholders in R&D. We also aim to empower newcomers to the field with a range of training materials.
The project is ambitious and challenging but builds on our previous work. The project leader has published approaches in feature interaction discovery, feature construction, data projection and mapping techniques, knowledge-based search, and intelligent data visualizations. We have been developing Orange (http://orangedatamining.com), which has a vast user base in industry and education, and to which we will add a new add-on for biomarker discovery and survival analysis. We are partnering with Genialis, an SME specializing in data science research for precision medicine. They are leaders in the area of complex transcriptomic biomarkers and are currently registering the first-ever transcriptomic biomarker for clinical use. The tools developed in this project will primarily speed up their discovery process and simplify communication with their partners. We also envision a joint patent as a direct result of this work. Ultimately, the tools developed in the project will be made available to a broad scientific community and will have real potential to advance the field of precision medicine globally.
