Projects / Programmes source: ARIS

Research collaboration prediction using literature-based discovery approach

Research activity

Code  Science  Field  Subfield
5.13.00  Social sciences  Information science and librarianship   

Code  Science  Field
5.08  Social Sciences  Media and communications 
Keywords: information sciences, literature-based discovery, scientometrics, complex networks analysis
Evaluation (rules)
source: COBISS
Researchers (10)
No.  Code  Name and surname  Research area  Role  Period  No. of publications
1.  54468  Tomaž Bratanič    Technical associate  2020 - 2022 
2.  39138  Rok Hribar  Computer science and informatics  Researcher  2022  21 
3.  11373  PhD Dimitar Hristovski  Computer science and informatics  Researcher  2020 - 2022  146 
4.  29102  Irena Janjić    Technical associate  2023 
5.  26484  PhD Andrej Kastrin  Medical sciences  Head  2020 - 2023  146 
6.  12725  PhD Leon Kos  Mechanical design  Researcher  2020 - 2023  246 
7.  51959  PhD Damjan Manevski  Public health (occupational safety)  Researcher  2023  41 
8.  22649  PhD Janez Povh  Computer intensive methods and applications  Researcher  2020 - 2023  341 
9.  57182  Tim Prezelj  Biology  Technical associate  2022 - 2023  90 
10.  08992  PhD Janez Stare  Public health (occupational safety)  Researcher  2020 - 2023  277 
Organisations (2)
No.  Code  Research organisation  City  Registration number  No. of publications
1.  0381  University of Ljubljana, Faculty of Medicine  Ljubljana  1627066  47,522 
2.  0782  University of Ljubljana, Faculty of Mechanical Engineering  Ljubljana  1627031  29,116 
Literature-based discovery (LBD) is a text mining technology for automatically generating research hypotheses. The aim of LBD is to uncover hidden, previously unknown relationships in existing knowledge. The LBD approach assumes that there exist two non-intersecting scientific domains: knowledge in one domain may be related to knowledge in the other without the relationship being known. The LBD methodology relies on three literature concepts: X, Y, and Z. For example, suppose one researcher has found a relationship between disease X and gene Y, and a different researcher has studied the effect of substance Z on gene Y. LBD may then suggest an X-Z relationship, indicating that substance Z may potentially treat disease X.
To find new collaborators, researchers often rely on manual exploration of the metadata of scientific papers (e.g., reference lists), although such procedures are known to be highly biased and ineffective. There is a large gap between studies that examine the determinants of effective research collaboration and methods designed to help researchers find new collaborators. In this project, we propose a novel approach for recommending cross-domain collaboration among researchers, based on the LBD paradigm and a heterogeneous network approach. Our approach not only recommends pairs of authors but also predicts novel topics for collaboration and explains why the collaboration makes sense.
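The X, Y, Z example above can be illustrated with a minimal sketch of open discovery over a set of known pairwise relations. The concept names, the `known_relations` data, and the `open_discovery` function are hypothetical and serve only as illustration; a real LBD system would extract relations from the literature and rank the resulting hypotheses.

```python
from collections import defaultdict

# Toy relations between concepts (hypothetical data, for illustration only).
known_relations = {
    ("disease_X", "gene_Y"),      # reported in one body of literature
    ("substance_Z", "gene_Y"),    # reported in another body of literature
    ("substance_W", "gene_Q"),    # unrelated to disease_X
}


def build_neighbors(relations):
    """Map each concept to the set of concepts it is directly related to."""
    neighbors = defaultdict(set)
    for a, b in relations:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


def open_discovery(start, relations):
    """Return {target: linking concepts} for concepts that are two steps
    away from `start` and have no known direct relation to it."""
    neighbors = build_neighbors(relations)
    direct = set(neighbors[start])
    hypotheses = defaultdict(set)
    for middle in direct:
        for target in neighbors[middle]:
            if target != start and target not in direct:
                hypotheses[target].add(middle)
    return dict(hypotheses)


hypotheses = open_discovery("disease_X", known_relations)
print(hypotheses)
```

Here the only hypothesis for `disease_X` is `substance_Z`, linked through `gene_Y`; the linking concept doubles as a simple explanation of why the hypothesis was generated.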
The problem addressed in this project proposal comprises the following elements: (1) generalize the LBD paradigm to the cross-domain collaboration recommendation problem and develop a framework to recommend novel and potentially fruitful collaborations; (2) develop a heterogeneous network embedding methodology directed by discovery patterns (i.e., meta-paths) to uncover the structural and semantic information in collaboration networks; (3) develop programming tools for semantic path-based recommendation in large-scale heterogeneous networks; (4) develop an open-source Web application for cross-domain research collaboration recommendation; (5) apply the developed methodology to two large-scale bibliographic databases (MEDLINE and the national COBISS bibliographic database) and to Stack Overflow, a large question-answering platform.
The collaboration network is the basis on which the algorithm for recommending research collaboration operates. For a given input author, we first compile the author's concept (i.e., topic) profile, which represents both the author’s interests and expertise. The concepts from the author’s profile are the input to the LBD step, which is run once for each input concept. LBD outputs target concepts as novel collaboration topics that are not yet published in the literature. For all target concepts output by LBD, we extract the authors who have these concepts in their profiles and eliminate those who are already co-authors of the starting author. The output is a list of the remaining authors as potential research collaborators, together with the suggested topics for collaboration.
We will also develop a new method for representation learning that can adequately summarize the structural properties of the collaboration network. We will describe each node with a low-dimensional vector (i.e., an embedding). Such embeddings are compact and well suited for downstream machine learning tasks.
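The recommendation steps described above (author profile, LBD per concept, candidate authors, co-author filter) can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the author names, profiles, co-author pairs, concept relations, and the helper functions `lbd_targets` and `recommend` are all hypothetical, and the LBD step is reduced to a two-step lookup.

```python
from collections import defaultdict

# Hypothetical toy data, for illustration only.
author_profiles = {
    "alice": {"disease_X"},
    "bob": {"substance_Z"},
    "carol": {"substance_Z", "gene_Y"},
    "dave": {"gene_Q"},
}
coauthor_pairs = {("alice", "carol")}
concept_relations = {("disease_X", "gene_Y"), ("substance_Z", "gene_Y")}


def lbd_targets(concept, relations):
    """Simplified LBD step: concepts two steps away from `concept`
    that are not directly related to it."""
    neighbors = defaultdict(set)
    for a, b in relations:
        neighbors[a].add(b)
        neighbors[b].add(a)
    direct = set(neighbors[concept])
    targets = set()
    for middle in direct:
        targets |= neighbors[middle]
    return targets - direct - {concept}


def recommend(author, profiles, pairs, relations):
    """Return {candidate author: suggested topics} for `author`."""
    # Existing co-authors are excluded from the recommendations.
    existing = {b for a, b in pairs if a == author}
    existing |= {a for a, b in pairs if b == author}
    suggestions = defaultdict(set)
    for concept in profiles[author]:
        for target in lbd_targets(concept, relations):
            for other, profile in profiles.items():
                if other == author or other in existing:
                    continue
                if target in profile:
                    suggestions[other].add(target)
    return dict(suggestions)


recommendations = recommend(
    "alice", author_profiles, coauthor_pairs, concept_relations
)
print(recommendations)
```

In this toy data, `carol` also works on `substance_Z` but is filtered out as an existing co-author of `alice`, so only `bob` is suggested, with `substance_Z` as the proposed topic.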
In our approach, we will first employ a random walk technique driven by meta-paths to create node sequences and, in the next step, employ link prediction to infer novel relations between authors.
The project leader and the members of the proposed team have strong scientific credentials in the fields of LBD, scientometrics, network analysis, machine learning, and knowledge technologies. The successful completion of the proposed project will contribute significantly to a breakthrough by Slovenian scientists in the field of LBD.
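The meta-path-driven random walk mentioned above can be sketched on a tiny heterogeneous network of authors, papers, and concepts. The graph, the node names, the chosen meta-path (author, paper, concept, paper, author), and the function `metapath_walk` are assumptions made for illustration. The final scoring step simply counts how often two authors appear in the same walk; it is a simplified stand-in for link prediction, whereas the project text describes learned embeddings.

```python
import random
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical heterogeneous network: authors (a*), papers (p*), concepts (c*).
node_type = {
    "a1": "author", "a2": "author", "a3": "author",
    "p1": "paper", "p2": "paper",
    "c1": "concept",
}
edges = [("a1", "p1"), ("a2", "p2"), ("a3", "p2"), ("p1", "c1"), ("p2", "c1")]

adjacency = defaultdict(list)
for u, v in edges:
    adjacency[u].append(v)
    adjacency[v].append(u)


def metapath_walk(start, metapath, adjacency, node_type, walk_length, rng):
    """Random walk whose node types follow `metapath` cyclically.
    The meta-path must start and end with the same node type."""
    if node_type[start] != metapath[0] or metapath[0] != metapath[-1]:
        raise ValueError("start node or meta-path has an unexpected type")
    period = len(metapath) - 1
    walk = [start]
    step = 0
    while len(walk) < walk_length:
        next_type = metapath[(step % period) + 1]
        candidates = [n for n in adjacency[walk[-1]]
                      if node_type[n] == next_type]
        if not candidates:
            break  # dead end: no neighbor of the required type
        walk.append(rng.choice(candidates))
        step += 1
    return walk


metapath = ["author", "paper", "concept", "paper", "author"]
rng = random.Random(0)  # fixed seed so the sketch is reproducible
walks = [metapath_walk("a1", metapath, adjacency, node_type, 9, rng)
         for _ in range(200)]

# Stand-in for link prediction: count author pairs sharing a walk.
pair_scores = Counter()
for walk in walks:
    authors = sorted({n for n in walk if node_type[n] == "author"})
    pair_scores.update(combinations(authors, 2))

print(pair_scores.most_common())
```

In a fuller pipeline, the walks would typically be fed to a sequence-embedding model, and candidate author pairs without an existing co-authorship link would be ranked by the similarity of their embeddings.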