Research collaboration prediction using literature-based discovery approach
Code | Science | Field | Subfield |
5.13.00 | Social sciences | Information science and librarianship | |
Code | Science | Field |
5.08 | Social Sciences | Media and communications |
information sciences, literature-based discovery, scientometrics, complex networks analysis
Researchers (10)
no. | Code | Name and surname | Research area | Role | Period | No. of publications |
1. | 54468 | Tomaž Bratanič | | Technical associate | 2020 - 2022 | 0 |
2. | 39138 | Rok Hribar | Computer science and informatics | Researcher | 2022 | 24 |
3. | 11373 | PhD Dimitar Hristovski | Computer science and informatics | Researcher | 2020 - 2022 | 153 |
4. | 29102 | Irena Janjić | | Technical associate | 2023 | 0 |
5. | 26484 | PhD Andrej Kastrin | Medical sciences | Head | 2020 - 2023 | 154 |
6. | 12725 | PhD Leon Kos | Mechanical design | Researcher | 2020 - 2023 | 254 |
7. | 51959 | PhD Damjan Manevski | Public health (occupational safety) | Researcher | 2023 | 45 |
8. | 22649 | PhD Janez Povh | Computer intensive methods and applications | Researcher | 2020 - 2023 | 346 |
9. | 57182 | Tim Prezelj | Biology | Technical associate | 2022 - 2023 | 119 |
10. | 08992 | PhD Janez Stare | Public health (occupational safety) | Researcher | 2020 - 2023 | 280 |
Organisations (2)
Abstract
Literature-based discovery (LBD) is a text mining technology for automatically generating research hypotheses. The aim of LBD is to uncover hidden, previously unknown relationships in existing knowledge. The LBD approach assumes that there are two non-intersecting scientific domains: knowledge in one domain may be related to knowledge in the other without the relationship being known. The LBD methodology relies on three literature concepts: X, Y, and Z. For example, suppose one researcher has found a relationship between disease X and gene Y, and a different researcher has studied the effect of substance Z on gene Y. LBD may then suggest an XZ relationship, indicating that substance Z may potentially treat disease X. To find new collaborators, researchers often rely on manual exploration of the metadata of scientific papers (e.g., reference lists), although such procedures are known to be highly biased and ineffective. There is a large gap between studies that examine the determinants of effective research collaboration and methods designed to help researchers find new collaborators. In this project, we propose a novel approach for recommending cross-domain collaboration among researchers, based on the LBD paradigm and a heterogeneous network approach. Our approach not only recommends pairs of authors but also predicts novel topics for collaboration and explains why the collaboration makes sense.
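The X, Y, Z reasoning described above can be illustrated with a minimal sketch. It assumes that known relationships are available as simple concept pairs; the function name `abc_candidates` and the toy data are hypothetical and are not part of the project's software.

```python
from collections import defaultdict

def abc_candidates(relations, x):
    """Return concepts Z that are linked to x only via an intermediate concept Y.

    relations: iterable of (concept, concept) pairs already reported in the
    literature (assumed undirected for this illustration).
    """
    neighbours = defaultdict(set)
    for a, b in relations:
        neighbours[a].add(b)
        neighbours[b].add(a)

    direct = neighbours[x]               # concepts already related to x
    candidates = defaultdict(set)        # z -> intermediate concepts y
    for y in direct:
        for z in neighbours[y]:
            # keep z only if no direct x-z relation is known yet
            if z != x and z not in direct:
                candidates[z].add(y)
    return dict(candidates)

# Toy data mirroring the example in the abstract
relations = [
    ("disease X", "gene Y"),
    ("substance Z", "gene Y"),
]
print(abc_candidates(relations, "disease X"))
# {'substance Z': {'gene Y'}}
```

The returned intermediate concepts (here, gene Y) are what would let a system explain why a suggested XZ relationship is plausible.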
This project addresses the following objectives: (1) generalize the LBD paradigm to the cross-domain collaboration recommendation problem and develop a framework for recommending novel and potentially fruitful collaborations; (2) develop a heterogeneous network embedding methodology directed by discovery patterns (i.e., meta-paths) to uncover the structural and semantic information in collaboration networks; (3) develop programming tools for semantic path-based recommendation in large-scale heterogeneous networks; (4) develop an open-source Web application for cross-domain research collaboration recommendation; (5) apply the developed methodology to two large-scale bibliographic databases (MEDLINE and the national COBISS bibliographic database) and to Stack Overflow, a large question-answering system. The collaboration network is the basis on which the algorithm for recommending research collaboration works. For a given input author, we first compile the author's concept (i.e., topic) profile, which represents both the author's interests and expertise. The concepts from the author's profile are the input to the LBD step. For each input concept, we perform an LBD discovery. LBD outputs target concepts as novel collaboration topics that are not yet published in the literature. For all target concepts output by LBD, we extract the authors who have these concepts in their profiles and eliminate those who are already co-authors of the starting author. The output is a list of the remaining authors as potential research collaborators, together with topics for collaboration. We will also develop a new representation learning method that can adequately summarize the structural properties of the collaboration network. We will describe each node with a low-dimensional vector (i.e., an embedding). Such embeddings are compact and well suited to downstream machine learning tasks.
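The recommendation steps described above (profile, LBD per concept, author lookup, co-author filtering) can be sketched as follows. This is only an illustration under stated assumptions: author profiles and co-author lists are plain dictionaries, the LBD step is passed in as a function, and all names (`recommend_collaborators`, `toy_lbd`, the sample authors) are hypothetical.

```python
def recommend_collaborators(author, profiles, coauthors, lbd):
    """Suggest new collaborators and topics for `author`.

    profiles:  dict mapping author -> set of concepts (the concept profile)
    coauthors: dict mapping author -> set of existing co-authors
    lbd:       callable mapping one concept to a set of target concepts
    """
    # Run the LBD step once for every concept in the author's profile
    targets = set()
    for concept in profiles[author]:
        targets |= lbd(concept)

    known = coauthors.get(author, set())
    suggestions = {}
    for other, concepts in profiles.items():
        if other == author or other in known:
            continue                      # skip self and existing co-authors
        shared = concepts & targets       # candidate topics for collaboration
        if shared:
            suggestions[other] = shared
    return suggestions

# Toy data (hypothetical)
profiles = {
    "alice": {"disease X"},
    "bob": {"substance Z"},
    "carol": {"substance Z", "gene Y"},
}
coauthors = {"alice": {"carol"}}
toy_lbd = {"disease X": {"substance Z"}}

result = recommend_collaborators(
    "alice", profiles, coauthors, lambda c: toy_lbd.get(c, set())
)
print(result)
# {'bob': {'substance Z'}}
```

Carol is excluded because she is already a co-author of the starting author, which leaves Bob as the suggested collaborator and substance Z as the suggested topic.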
In our approach, we will first employ a random walk technique driven by meta-paths to create node sequences and then employ link prediction to infer novel relations between authors. The project leader and the members of the proposed team have strong scientific track records in the fields of LBD, scientometrics, network analysis, machine learning, and knowledge technologies. The successful completion of the proposed project will significantly help Slovenian scientists achieve a breakthrough in the field of LBD.
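A random walk driven by a meta-path can be sketched as below. The sketch assumes a small heterogeneous network stored as an adjacency dictionary with a type label per node, and a symmetric meta-path such as author, concept, author; the function `metapath_walk` and the toy graph are hypothetical and do not represent the project's actual implementation.

```python
import random

def metapath_walk(adj, node_type, start, metapath, length, rng):
    """Generate one node sequence of at most `length` nodes.

    adj:       dict mapping node -> list of neighbouring nodes
    node_type: dict mapping node -> type label
    metapath:  list of type labels whose first and last entries match,
               e.g. ["author", "concept", "author"]; it is followed cyclically
    """
    if node_type[start] != metapath[0] or metapath[0] != metapath[-1]:
        raise ValueError("start node and meta-path end types must match")

    pattern = metapath[:-1]               # repeating part of the meta-path
    walk = [start]
    while len(walk) < length:
        wanted = pattern[len(walk) % len(pattern)]
        options = [n for n in adj[walk[-1]] if node_type[n] == wanted]
        if not options:                   # dead end: stop the walk early
            break
        walk.append(rng.choice(options))
    return walk

# Toy heterogeneous network (hypothetical): two authors sharing one concept
adj = {"a1": ["c1"], "a2": ["c1"], "c1": ["a1", "a2"]}
node_type = {"a1": "author", "a2": "author", "c1": "concept"}

walk = metapath_walk(adj, node_type, "a1",
                     ["author", "concept", "author"], 5, random.Random(0))
print(walk)
```

Sequences of this kind are the input from which node embeddings could then be learned, and pairs of author embeddings could be scored in a later link prediction step.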