Logic List Mailing Archive

Postdoctoral position in AI-driven data linking, Toulouse (France)

** Post-doctoral position at IRIT: Data Linking **

* Context: ANR project DACE-DL (DAta-CEntric AI-driven Data Linking)  *

Data linking is the scientific challenge of automatically establishing 
typed links between the entities of two or more structured datasets. A 
variety of complex data linking systems exist and have been evaluated on 
public benchmarks. While they have enabled the generation of vast amounts 
of linked data in various dedicated projects, generic data linking systems 
often have limited applicability in real-world scenarios, where data are 
highly heterogeneous and domain-specific. DACE-DL targets a paradigm shift 
in the data linking field with a data-centric, bottom-up methodology 
relying on machine learning and representation learning models. We 
hypothesize that there exists a finite number of identifiable and 
generalisable linking problem types (LPTs), which we need to categorize 
and analyse to provide better linking results.

  * Topic: Data collection, consolidation, and modularization of data
  linking systems *

This research is organized into two main tasks. The first task consists of 
(i) carrying out an in-depth analysis of the quality of existing data 
linking datasets, identifying erroneous statements and providing a 
high-quality set of datasets by correcting those statements; and (ii) 
generating additional links using existing high-precision linking systems 
on the chosen datasets. Data quality metrics such as accuracy, consistency 
and conciseness will be considered. The second task has three aims: (1) to 
provide an inventory of publicly available and functional linking tools 
that can deal with a large spectrum of data linking problems; (2) to 
propose a theoretical approach for modularizing these tools into atomic 
modules that are easy to combine in order to build more complex solutions 
in a linking ecosystem; (3) to make the produced modules available to the 
data linking community. To carry out the modularization at scale, we plan 
to use unsupervised ML algorithms, enhanced by a human-in-the-loop 
approach. The objective is to provide a set of correspondences between the 
modules and the LPTs.

Starting period: January 2022; duration: 24 months

  * Work environment and Salary *

Location: Institut de Recherche en Informatique de Toulouse (IRIT), 
Universite Toulouse - Jean Jaures / Maison de la Recherche, 5, allees 
Antonio Machado, 31058 Toulouse. Gross monthly salary between 2200 € and 
2700 €, depending on qualifications and situation.

* How to apply *

Applicants are required to have a PhD in Computer Science and a strong 
background in semantic web technologies, ontology matching and data 
linking. Fluency in written and spoken English is also required. A good 
publication record and strong programming skills will be a plus. 
Applications will be accepted until the position is filled. Applicants 
should send a full CV including a complete list of publications, a cover 
letter indicating their research interests, achievements to date and 
vision for the future, and either support letters or the names of two 
people who have worked with them.

Contact: Cassia Trojahn (cassia.trojahn@irit.fr) and Olivier Teste 
(olivier.teste@irit.fr)
--
[LOGIC] mailing list
http://www.dvmlg.de/mailingliste.html
Archive: http://www.illc.uva.nl/LogicList/
