Research collaboration prediction using literature-based discovery approach

Research activity

Code	Science	Field	Subfield
5.13.00	Social sciences	Information science and librarianship

Code	Science	Field
5.08	Social Sciences	Media and communications

Keywords

information sciences, literature-based discovery, scientometrics, complex networks analysis

Evaluation (methodology)

Evaluation of bibliographic research performance indicators according to ARIS methodology

Citations for bibliographic records in COBIB.SI that are linked to records in citation databases

Organisations (2), Researchers (10)

0381 University of Ljubljana, Faculty of Medicine

no.	Code	Name and surname	Research area	Role	Period	No. of publications
1.	54468	Tomaž Bratanič		Technical associate	2020 - 2022	0
2.	39138	Rok Hribar	Computer science and informatics	Researcher	2022	28
3.	11373	PhD Dimitar Hristovski	Computer science and informatics	Researcher	2020 - 2022	158
4.	29102	Irena Janjić		Technical associate	2023	0
5.	26484	PhD Andrej Kastrin	Medical sciences	Head	2020 - 2023	175
6.	51959	PhD Damjan Manevski	Public health (occupational safety)	Researcher	2023	53
7.	57182	Tim Prezelj	Biology	Technical associate	2022 - 2023	138
8.	08992	PhD Janez Stare	Public health (occupational safety)	Researcher	2020 - 2023	283

0782 University of Ljubljana, Faculty of Mechanical Engineering

no.	Code	Name and surname	Research area	Role	Period	No. of publications
1.	12725	PhD Leon Kos	Mechanical design	Researcher	2020 - 2023	298
2.	22649	PhD Janez Povh	Computer intensive methods and applications	Researcher	2020 - 2023	370

Abstract

Literature-based discovery (LBD) is a text mining technology for automatically generating research hypotheses. The aim of LBD is to uncover hidden, previously unknown relationships in existing knowledge. The LBD approach is based on the assumption that there exist two non-intersecting scientific domains: knowledge in one domain may be related to knowledge in the other without the relationship being known. The LBD methodology relies on three literature concepts: X, Y, and Z. For example, suppose a researcher has found a relationship between disease X and gene Y, and a different researcher has studied the effect of substance Z on gene Y. LBD may then suggest an XZ relationship, indicating that substance Z may potentially treat disease X.
To find new collaborators, researchers often rely on manual exploration of the metadata of scientific papers (e.g., reference lists), although such procedures are known to be highly biased and ineffective. There is a large gap between studies that examine the determinants of effective research collaboration and methods designed to help researchers find new collaborators. In this project, we propose a novel approach for recommending cross-domain collaboration among researchers, based on the LBD paradigm and a heterogeneous network approach. Our approach not only recommends pairs of authors but also predicts novel topics for collaboration and explains why the collaboration makes sense.
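The X, Y, Z pattern described in the abstract can be illustrated with a minimal sketch. It assumes that known relationships are available as simple concept pairs; the function names and the toy data are hypothetical and are not taken from the project's software.

```python
from collections import defaultdict


def build_concept_graph(pairs):
    """Build an undirected concept graph from (concept, concept) pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def open_discovery(graph, x):
    """Find Z concepts connected to x only through intermediate Y concepts.

    Returns a dict mapping each candidate Z to the sorted list of
    Y concepts that bridge X and Z.
    """
    direct = graph.get(x, set())
    candidates = defaultdict(set)
    for y in direct:
        for z in graph.get(y, set()):
            # Keep Z only if it is not X itself and not already linked to X.
            if z != x and z not in direct:
                candidates[z].add(y)
    return {z: sorted(ys) for z, ys in candidates.items()}


# Toy data (hypothetical): relations reported in two separate literatures.
known_relations = [
    ("disease_X", "gene_Y"),    # found by one researcher
    ("substance_Z", "gene_Y"),  # studied by a different researcher
    ("disease_X", "symptom_S"),
]
graph = build_concept_graph(known_relations)
hypotheses = open_discovery(graph, "disease_X")
print(hypotheses)  # {'substance_Z': ['gene_Y']}
```

The bridging Y concepts are kept alongside each candidate because they are what allows a suggested XZ relationship to be explained rather than merely listed.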
The problem addressed in this project proposal comprises the following elements: (1) generalize the LBD paradigm to the cross-domain collaboration recommendation problem and develop a framework to recommend novel and potentially fruitful collaborations; (2) develop a heterogeneous network embedding methodology directed by discovery patterns (i.e., meta-paths) to uncover the structural and semantic information in collaboration networks; (3) develop programming tools for semantic path-based recommendation in large-scale heterogeneous networks; (4) develop an open-source Web application for cross-domain research collaboration recommendation; (5) apply the developed methodology to two large-scale bibliographic databases (MEDLINE and the national COBISS bibliographic database) and to Stack Overflow, a large question-answering system.
The collaboration network is the basis on which the algorithm for recommending research collaboration works. For a given input author, we first compile the author's concept (i.e., topic) profile, which represents both the author's interests and expertise. The concepts from the author's profile are the input to the LBD step. For each input concept, we perform an LBD discovery. LBD outputs target concepts as novel collaboration topics that are not yet published in the literature. For all target concepts output by LBD, we extract the authors who have these concepts in their profiles and eliminate those who are already co-authors of the starting author. The output is a list of the remaining authors as potential research collaborators, together with topics for collaboration.
We will also develop a new representation learning method that adequately summarizes the structural properties of the collaboration network. We will describe each node with a low-dimensional vector (i.e., an embedding). Such embeddings are compact and well suited to downstream machine learning tasks.
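The recommendation steps described above (profile, LBD per concept, candidate authors, removal of existing co-authors) can be sketched as follows. This is only an illustration under stated assumptions: the LBD step is passed in as a plain callable, profiles are sets of concept labels, and all names and data are hypothetical.

```python
def recommend_collaborators(author, profiles, coauthors, discover):
    """Suggest candidate collaborators and topics for `author`.

    profiles:  dict author -> set of concepts (interests and expertise)
    coauthors: dict author -> set of authors already co-authored with
    discover:  callable concept -> set of target concepts (stand-in for LBD)
    Returns a dict candidate author -> sorted list of suggested topics.
    """
    own = profiles[author]

    # Run the LBD step once per concept in the author's profile.
    targets = set()
    for concept in own:
        targets |= discover(concept)
    # Assumption: a concept already in the author's own profile is not novel.
    targets -= own

    existing = coauthors.get(author, set())
    suggestions = {}
    for other, concepts in profiles.items():
        # Skip the starting author and anyone they have already published with.
        if other == author or other in existing:
            continue
        shared = targets & concepts
        if shared:
            suggestions[other] = sorted(shared)
    return suggestions


# Toy data (hypothetical authors and concepts).
profiles = {
    "author_1": {"concept_a"},
    "author_2": {"concept_b"},
    "author_3": {"concept_a", "concept_b"},
    "author_4": {"concept_c"},
}
coauthors = {"author_1": {"author_3"}}
lbd_output = {"concept_a": {"concept_b"}}  # pretend LBD result


def discover(concept):
    return lbd_output.get(concept, set())


result = recommend_collaborators("author_1", profiles, coauthors, discover)
print(result)  # {'author_2': ['concept_b']}
```

Here author_3 also works on the target concept but is dropped as an existing co-author, which leaves author_2 as the only suggestion together with the proposed topic.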
In our approach, we will first employ a random walk technique driven by meta-paths to create node sequences and, in the next step, employ link prediction to infer novel relations between authors.
The project leader and members of the proposed team have strong scientific references in the fields of LBD, scientometrics, network analysis, machine learning, and knowledge technologies. The successful completion of the proposed project will contribute significantly to a breakthrough of Slovenian scientists in the field of LBD.
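The meta-path-driven random walk mentioned in the abstract can be sketched as below. The sketch assumes a small heterogeneous graph with two node types, authors ("A") and papers ("P"), and a cyclic author-paper-author meta-path; the graph, the type labels, and the function name are hypothetical and do not describe the project's actual implementation.

```python
import random


def metapath_walk(adj, node_type, start, metapath, length, rng):
    """Random walk that only follows neighbours matching the meta-path.

    adj:       dict node -> list of neighbour nodes
    node_type: dict node -> type label, e.g. "A" (author) or "P" (paper)
    metapath:  cyclic type pattern such as ["A", "P", "A"]; its first and
               last types must match so that the pattern can repeat
    length:    maximum number of nodes in the walk
    """
    if metapath[0] != metapath[-1]:
        raise ValueError("meta-path must start and end with the same type")
    if node_type[start] != metapath[0]:
        raise ValueError("start node type must match the first meta-path type")

    cycle = metapath[:-1]  # e.g. ["A", "P"] for the pattern A-P-A
    walk = [start]
    for step in range(1, length):
        wanted = cycle[step % len(cycle)]
        options = [n for n in adj.get(walk[-1], []) if node_type[n] == wanted]
        if not options:
            break  # dead end: no neighbour of the required type
        walk.append(rng.choice(options))
    return walk


# Toy heterogeneous network (hypothetical): three authors, two papers.
adj = {
    "a1": ["p1"],
    "a2": ["p1", "p2"],
    "a3": ["p2"],
    "p1": ["a1", "a2"],
    "p2": ["a2", "a3"],
}
node_type = {"a1": "A", "a2": "A", "a3": "A", "p1": "P", "p2": "P"}

rng = random.Random(0)  # seeded only to make the sketch repeatable
walk = metapath_walk(adj, node_type, "a1", ["A", "P", "A"], 5, rng)
print(walk)
```

Node sequences of this kind are commonly passed to a skip-gram-style model to obtain node embeddings, on which a link predictor for author pairs can then be trained; whether the project uses that exact combination is an assumption here.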
