Database anonymization

Code  Science  Field  Subfield
1.01.03  Natural sciences and mathematics  Mathematics  Numerical and computer mathematics 

Code  Science  Field
P110  Natural sciences and mathematics  Mathematical logic, set theory, combinatorics 
P120  Natural sciences and mathematics  Number theory, field theory, algebraic geometry, algebra, group theory 
P170  Natural sciences and mathematics  Computer science, numerical analysis, systems, control 
Keywords: data anonymization, privacy, security, information protection
Researchers (4)
no.  Code  Name and surname  Research area  Role  Period  No. of publications
1.  18021  MSc Janja Jakončič  Computer science and informatics  Researcher  2009  27 
2.  08724  PhD Aleksandar Jurišić  Mathematics  Head  2007 - 2009  209 
3.  28222  Maruša Stanek  Mathematics  Researcher  2007 - 2008 
4.  14273  PhD Arjana Žitnik  Mathematics  Researcher  2007 - 2009  103 
Organisations (2)
no.  Code  Research organisation  City  Registration number  No. of publications
1.  0101  Institute of Mathematics, Physics and Mechanics  Ljubljana  5055598000  19,658 
2.  1539  University of Ljubljana, Faculty of Computer and Information Science  Ljubljana  1627023  16,041 
We study models for protecting privacy in databases, in particular k-anonymity and ℓ-diversity. We analyse possible attacks and search for appropriate security improvements.
Significance for science
Database security became an active research area within cryptography and computer security in the 1980s. Although its problems remain unsolved, new medical privacy regulations are bringing about a resurgence of interest. The novelty of our approach is the scientific formulation of the database anonymization problem at IVZ, considered within a broader framework that spans data collection through to its use in medical and statistical research. We proposed a practical model that uses efficient cryptographic and probabilistic methods to improve the current situation, employing the concepts of k-anonymity and ℓ-diversity. We studied new cryptographic schemes and specific optimization methods to make the algorithms more efficient. We investigated new methods for database anonymization and tested them on actual data received from IVZ. Our aim was to provide a practical application of known anonymization techniques. The algorithms proved effective for largely static databases. The investigation needs to be extended to dynamic databases, since in many cases new data are constantly being added. Databases of this kind require completely new algorithms, which is beyond the scope of this project.
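The two concepts named above can be illustrated with a minimal sketch. It is not the project's algorithm: it only checks the textbook definitions on a toy table. A table is k-anonymous if every combination of quasi-identifier values is shared by at least k records, and (in the simplest "distinct" variant) ℓ-diverse if each such group contains at least ℓ different sensitive values. The attribute names, the sample records, and the function names are all hypothetical.

```python
from collections import defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    classes = defaultdict(list)
    for record in records:
        key = tuple(record[attr] for attr in quasi_identifiers)
        classes[key].append(record)
    return classes


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every equivalence class holds at least k records."""
    classes = equivalence_classes(records, quasi_identifiers)
    return all(len(group) >= k for group in classes.values())


def is_l_diverse(records, quasi_identifiers, sensitive, l):
    """True if every class has at least l distinct sensitive values
    (the 'distinct' variant of l-diversity)."""
    classes = equivalence_classes(records, quasi_identifiers)
    return all(
        len({record[sensitive] for record in group}) >= l
        for group in classes.values()
    )


# Hypothetical, already generalized toy data (not IVZ data).
records = [
    {"age": "30-39", "postcode": "10**", "diagnosis": "flu"},
    {"age": "30-39", "postcode": "10**", "diagnosis": "asthma"},
    {"age": "40-49", "postcode": "20**", "diagnosis": "diabetes"},
    {"age": "40-49", "postcode": "20**", "diagnosis": "diabetes"},
]
quasi_identifiers = ["age", "postcode"]

print(is_k_anonymous(records, quasi_identifiers, 2))
print(is_l_diverse(records, quasi_identifiers, "diagnosis", 2))
```

In this toy table each group has two records, so the table is 2-anonymous, yet the second group carries a single diagnosis, so it is not 2-diverse: anyone known to fall in that group has their diagnosis revealed. This is the kind of attack that motivates going beyond k-anonymity alone.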
Significance for the country
The project addresses a concrete and urgent problem, and the relevance and potential impact of its results are immediate: the project is not only of theoretical value but also comprises fully problem-oriented research and a pilot solution as a concrete answer to the questions posed. The ethical problem of using personal medical data is enormous, and so is the responsibility to guarantee the anonymity of individuals, as mandated by European and Slovene legislation. In the case of the National Health Institute of Slovenia (IVZ), the number of attributes per entry in the databases, the sheer number of databases, and the possible links among them make ensuring anonymity a daunting task. So far, the only methods of protecting personal data used at IVZ have been to heavily restrict access to the databases and their use. Tighter restrictions led to a narrower scope of use for these databases, defeating their main purpose.
