Using high-throughput sequencing data, we have shown that the longest introns in some genes active in the brain are removed in two recursive splicing steps, a mechanism previously seen only in Drosophila. Sequence analysis showed that recursive splicing requires a recursive splice site (RS) signal. We found that such a site also requires a cryptic exon, which is normally removed without a trace. However, when two cryptic exons are present, recursive splicing is inefficient; the cryptic exons are then included, producing an aberrant transcript that is recognized by machinery that degrades it. We postulate that this may serve as a binary switch for the quality control of new, cryptic isoforms, and as a checkpoint for the evolution of new transcripts. Many of the identified genes are linked to autism and other neurological disorders.
COBISS.SI-ID: 1536358339
We have designed, theoretically justified and implemented an original method for visualizing approximation sets, the solutions to multiobjective optimization problems. The method reduces the dimensionality of the approximation set and visualizes it in the form of a prosection matrix. It also allows differences between various approximation sets for a given problem to be visualized. This gives a decision-maker better insight into solution quality and problem properties when selecting a final solution.
COBISS.SI-ID: 27961383
We introduced a novel paradigm of semi-artificial data generators. The approach is based on RBF networks: it takes a portion of the training data set, uses it to train a generative RBF model, and turns that model into a data generator. This approach can substantially improve learning performance for certain problems, including high-dimensional ones such as text data. The generated data can be used for parameter tuning, in simulations, in the development of new algorithms, and with imbalanced data.
COBISS.SI-ID: 1536875203
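The semi-artificial generator idea above can be illustrated with a minimal sketch. It is not the published method: instead of training an RBF network, it simply treats a random sample of training instances as Gaussian kernel centres and uses a scaled per-feature standard deviation as the kernel width. The names `make_generator`, `n_kernels` and `width_scale` are hypothetical and chosen only for this illustration.

```python
import random
import statistics


def make_generator(train_x, train_y, n_kernels, width_scale=0.5, seed=0):
    """Build a toy semi-artificial data generator from labelled numeric data.

    Assumption: kernel centres are sampled training instances and kernel
    widths are a fraction of each feature's standard deviation. A real RBF
    network would learn both from the data.
    """
    rng = random.Random(seed)
    chosen = rng.sample(range(len(train_x)), n_kernels)
    centres = [(train_x[i], train_y[i]) for i in chosen]

    n_features = len(train_x[0])
    spreads = [
        width_scale * statistics.pstdev([row[j] for row in train_x])
        for j in range(n_features)
    ]

    def generate(n):
        """Draw n new labelled points around randomly chosen centres."""
        samples = []
        for _ in range(n):
            centre, label = rng.choice(centres)
            point = [rng.gauss(mu, sd) for mu, sd in zip(centre, spreads)]
            samples.append((point, label))
        return samples

    return generate


train_x = [[0.0, 0.0], [1.0, 1.0], [10.0, 10.0], [11.0, 11.0]]
train_y = ["a", "a", "b", "b"]
generate = make_generator(train_x, train_y, n_kernels=4)
for point, label in generate(3):
    print(label, [round(v, 2) for v in point])
```

Each generated point inherits the class of the kernel it was drawn from, so the output can be fed to a learner in the same form as the original training data.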
Qualitative modelling is traditionally concerned with the abstraction of numerical data. In numerical domains, partial derivatives describe the relation between an independent variable and the dependent variable; qualitatively, they tell us the trend of the dependent variable. In this paper, we address the problem of extracting qualitative relations in categorical domains. We generalize the notion of partial derivative by defining the probabilistic discrete qualitative partial derivative (PDQ PD). PDQ PD is a qualitative relation between the target class and a discrete attribute; the derivative corresponds to ordering the attribute's values a_i by P(c|a_i) in a local neighbourhood of the reference point, respecting the ceteris paribus principle.
COBISS.SI-ID: 1537020611
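The ordering of attribute values by P(c|a_i) described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: the local neighbourhood is taken to be all records that agree with the reference point on every other attribute (a strict reading of ceteris paribus), and probabilities are plain relative frequencies. The function name and the `class_key` parameter are hypothetical.

```python
from collections import defaultdict


def order_values_by_class_probability(records, attribute, target_class,
                                      reference, class_key="cls"):
    """Order the values a_i of `attribute` by the estimated P(c | a_i).

    Assumption: the neighbourhood of `reference` consists of records that
    match it exactly on all attributes other than `attribute`.
    """
    counts = defaultdict(lambda: [0, 0])  # value -> [target hits, total]
    for rec in records:
        same_context = all(
            rec[key] == value
            for key, value in reference.items()
            if key != attribute
        )
        if not same_context:
            continue
        tally = counts[rec[attribute]]
        tally[1] += 1
        if rec[class_key] == target_class:
            tally[0] += 1

    probabilities = {
        value: hits / total for value, (hits, total) in counts.items()
    }
    # Ascending order: later values make the target class more likely.
    return sorted(probabilities.items(), key=lambda item: item[1])


records = [
    {"dose": "low", "sex": "f", "cls": "no"},
    {"dose": "low", "sex": "f", "cls": "no"},
    {"dose": "low", "sex": "f", "cls": "yes"},
    {"dose": "mid", "sex": "f", "cls": "yes"},
    {"dose": "mid", "sex": "f", "cls": "no"},
    {"dose": "high", "sex": "f", "cls": "yes"},
    {"dose": "high", "sex": "f", "cls": "yes"},
    {"dose": "high", "sex": "m", "cls": "no"},
    {"dose": "low", "sex": "m", "cls": "yes"},
]
reference = {"dose": "low", "sex": "f"}
print(order_values_by_class_probability(records, "dose", "yes", reference))
```

On sparse data an exact-match neighbourhood quickly becomes empty, so a practical version would need a looser notion of locality and some form of probability smoothing.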
Identifying the patterns of RNA-protein interaction is key to understanding the role RNA binding proteins play in the post-transcriptional regulation of gene expression. We have developed an integrative orthogonality-regularised nonnegative matrix factorisation (iONMF) method to integrate multiple data sources and discover non-overlapping, class-specific patterns of varying abundances. Applying it to the largest compendium to date, comprising 31 experimental data sets on 19 RNA binding proteins, showed that integrating multiple data sources improves the ability to predict interaction sites. We also identified the key predictive factors of protein-RNA interaction: RNA structure, sequence motifs and co-binding of other RNA binding proteins.
COBISS.SI-ID: 1537001923