We are increasingly accumulating molecular data about a cell. The challenge is how to integrate them within a unified conceptual and computational framework enabling new discoveries. Hence, we propose a novel, data-driven concept of an integrated cell, iCell. Also, we introduce a computational prototype of an iCell, which integrates three omics, tissue-specific molecular interaction network types. We construct iCells of four cancers and the corresponding tissue controls and identify the most rewired genes in cancer. Many of them are of unknown function and cannot be identified as different in cancer in any specific molecular network. We biologically validate that they have a role in cancer by knockdown experiments followed by cell viability assays. We find additional support through Kaplan-Meier survival curves of thousands of patients. Finally, we extend this analysis to uncover pan-cancer genes. Our methodology is universal and enables integrative comparisons of diverse omics data over cells and tissues.
COBISS.SI-ID: 16484379
Diseases involve complex modifications to the cellular machinery. The gene expression profile of the affected cells contains characteristic patterns linked to a disease. Hence, new biological knowledge about a disease can be extracted from these profiles, improving our ability to diagnose and assess disease risks. This knowledge can be used for drug re-purposing, or by physicians to evaluate a patient's condition and co-morbidity risk. Here, we consider differential gene expressions obtained by microarray technology for patients diagnosed with various diseases. Based on these data and cellular multi-scale organization, we aim at uncovering disease-disease, disease-gene and disease-pathway associations. We propose a neural network with structure based on the multi-scale organization of proteins in a cell into biological pathways. We show that this model is able to correctly predict the diagnosis for the majority of patients. Through the analysis of the trained model, we predict disease-disease, disease-pathway, and disease-gene associations and validate the predictions by comparisons to known interactions and literature search, proposing putative explanations for the predictions.
COBISS.SI-ID: 57930243
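The pathway-structured network in the abstract above can be illustrated with a minimal sketch of a single sparsely connected layer, in which each pathway unit receives input only from the genes annotated to that pathway. The gene and pathway names, the weights, and the tanh activation are all hypothetical choices made for illustration; the abstract does not specify the architecture, the activation function, or the training procedure.

```python
import math

def pathway_layer(expression, pathways, weights, bias):
    """Forward pass of one pathway-structured layer.

    expression: dict gene -> differential expression value
    pathways:   dict pathway -> list of member genes (defines the allowed connections)
    weights:    dict (gene, pathway) -> weight, defined only for member genes
    bias:       dict pathway -> bias term
    """
    activations = {}
    for name, genes in pathways.items():
        # Only genes belonging to the pathway contribute to its unit.
        z = bias[name] + sum(weights[(gene, name)] * expression[gene] for gene in genes)
        activations[name] = math.tanh(z)  # assumed activation
    return activations

# Toy data (hypothetical genes g1..g3 and pathways P1, P2).
expression = {"g1": 1.0, "g2": -1.0, "g3": 2.0}
pathways = {"P1": ["g1", "g2"], "P2": ["g3"]}
weights = {("g1", "P1"): 0.5, ("g2", "P1"): 0.5, ("g3", "P2"): 0.5}
bias = {"P1": 0.0, "P2": 0.0}

out = pathway_layer(expression, pathways, weights, bias)
```

Because every weight is tied to one gene-pathway pair, the weights of a trained layer of this kind can be read as candidate gene-pathway contributions. This is the sort of inspection the abstract refers to when it mentions analysing the trained model.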
The k-assignment problem (also called the k-matching problem) on k-partite graphs is NP-hard for k ≥ 3. In this paper we introduce five new heuristics. Two algorithms, Bm and Cm, arise as natural improvements of Algorithm Am from (He et al., in: Graph Algorithms And Applications 2, World Scientific, 2004). The other three algorithms, Dm, Em, and Fm, incorporate randomization. Algorithm Dm can be considered a greedy version of Bm, whereas Em and Fm are versions of a local search algorithm specialized for the k-matching problem. The algorithms are implemented in Python and run on three datasets. On the available datasets, all the new algorithms clearly outperform Algorithm Am in terms of solution quality. On the first dataset with known optimal values, the average relative error ranges from 1.47% over the optimum (Algorithm Am) to 0.08% over the optimum (Algorithm Em). On the second dataset with known optimal values, the average relative error ranges from 4.41% over the optimum (Algorithm Am) to 0.45% over the optimum (Algorithm Fm). Higher solution quality comes at the cost of longer computation times, so the new algorithms offer a good compromise between solution quality and computation time.
COBISS.SI-ID: 38799875
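To make the k-assignment problem from the abstract above concrete, the following is a minimal sketch of a generic sequential heuristic: cliques are grown one part at a time, and each new part is attached by solving a two-part assignment. It is not claimed to reproduce any of Algorithms Am to Fm. The data layout (`weights[(i, j)][u][v]` for parts i < j) and the function names are assumptions, and the brute-force search over permutations is suitable only for very small parts.

```python
from itertools import permutations

def clique_cost(weights, clique):
    """Sum of edge weights inside one clique; clique[i] is the vertex taken from part i."""
    k = len(clique)
    return sum(weights[(i, j)][clique[i]][clique[j]]
               for i in range(k) for j in range(i + 1, k))

def sequential_k_assignment(weights, k, n):
    """Grow n cliques part by part; returns (cliques, total weight)."""
    cliques = [[v] for v in range(n)]  # start from the vertices of part 0
    for j in range(1, k):
        def attach_cost(c, v):
            # Cost of adding vertex v of part j to the partial clique c.
            return sum(weights[(i, j)][cliques[c][i]][v] for i in range(j))
        # Exhaustive two-part assignment (illustration only, O(n!)).
        best = min(permutations(range(n)),
                   key=lambda perm: sum(attach_cost(c, perm[c]) for c in range(n)))
        for c in range(n):
            cliques[c].append(best[c])
    return cliques, sum(clique_cost(weights, c) for c in cliques)

# Toy instance: k = 3 parts with n = 2 vertices each (hypothetical weights).
weights = {
    (0, 1): [[1, 5], [5, 1]],
    (0, 2): [[2, 9], [9, 2]],
    (1, 2): [[3, 7], [7, 3]],
}
cliques, total = sequential_k_assignment(weights, k=3, n=2)
```

For larger parts, the exhaustive step would normally be replaced by a polynomial-time assignment solver. The sequential structure still only yields a heuristic solution, since the problem is NP-hard for k ≥ 3.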
This book chapter extends the conference paper. In the described memetic algorithm, we use a population-based global search technique to locate good parts of the search space. By comparing the deterministic naive approach with two versions of the memetic algorithm that differ in their degree of inheritance, we show that the evolutionary operators do not significantly improve capacity.
COBISS.SI-ID: 55493379
Motivation: Molecular interactions have been successfully modeled and analyzed as networks, where nodes represent molecules and edges represent the interactions between them. These networks revealed that molecules with similar local network structure also have similar biological functions. The most sensitive measures of network structure are based on graphlets. However, graphlet-based methods thus far are only applicable to unweighted networks, whereas real-world molecular networks may have weighted edges that can represent the probability of an interaction occurring in the cell. This information is commonly discarded when applying thresholds to generate unweighted networks, which may lead to information loss. Results: We introduce probabilistic graphlets as a tool for analyzing the local wiring patterns of probabilistic networks. To assess their performance compared to unweighted graphlets, we generate synthetic networks based on different well-known random network models and edge probability distributions and demonstrate that probabilistic graphlets outperform their unweighted counterparts in distinguishing network structures. Then we model different real-world molecular interaction networks as weighted graphs with probabilities as weights on edges and we analyze them with our new weighted graphlets-based methods. We show that due to their probabilistic nature, probabilistic graphlet-based methods more robustly capture biological information in these data, while...
COBISS.SI-ID: 57826051
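The information loss from thresholding mentioned in the abstract above can be illustrated with a small sketch that computes expected counts of the two connected 3-node graphlets (open path and triangle), under the assumption that edges occur independently with the given probabilities. This is one natural reading of a probabilistic graphlet count and is not claimed to be the definition used in the paper; the function name and data layout are hypothetical.

```python
from itertools import combinations

def expected_three_node_graphlets(nodes, prob):
    """Expected numbers of induced 3-node paths and triangles.

    prob: dict (u, v) -> edge probability; missing pairs have probability 0.
    Assumes edges are present independently of each other.
    """
    def p(u, v):
        return prob.get((u, v), prob.get((v, u), 0.0))

    exp_path = exp_triangle = 0.0
    for a, b, c in combinations(nodes, 3):
        pab, pac, pbc = p(a, b), p(a, c), p(b, c)
        # Triangle: all three edges present.
        exp_triangle += pab * pac * pbc
        # Induced path: exactly two of the three edges present.
        exp_path += (pab * pac * (1 - pbc)
                     + pab * pbc * (1 - pac)
                     + pac * pbc * (1 - pab))
    return exp_path, exp_triangle

# Toy network: one certain edge and one edge with probability 0.5.
nodes = ["a", "b", "c"]
prob = {("a", "b"): 1.0, ("b", "c"): 0.5}
exp_path, exp_triangle = expected_three_node_graphlets(nodes, prob)
```

In this toy network the expected number of paths is 0.5, whereas an unweighted network obtained by thresholding would report either 0 or 1 path, depending on whether the cutoff keeps the 0.5 edge.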