Logic List Mailing Archive
Postdoctoral position in data linking, Toulouse (France)
** Post-doctoral position at IRIT: Data Linking **
* Context: ANR project DACE-DL (DAta-CEntric AI-driven Data Linking) *
Data linking is the scientific challenge of automatically establishing
typed links between the entities of two or more structured datasets. A
variety of complex data linking systems exists, evaluated on public
benchmarks. While they have allowed for the generation of vast amounts of
linked data in the context of various dedicated projects, generic
systems often have limited applicability in many real-world scenarios,
where data are highly heterogeneous and domain-specific. DACE-DL targets a
paradigm shift in the data linking field with a data-centric bottom-up
methodology relying on machine learning and representation learning
models. We hypothesize that there exists a finite number of identifiable and
generalisable linking problem types (LPTs) that we need to categorize and
analyse to provide better linking results.
* Topic: Data collection, consolidation, and data linking systems modularization *
This research is organized into two main tasks. The first task consists of
(1) carrying out an in-depth analysis of the quality of the existing data
linking datasets, identifying erroneous statements and providing a
high-quality set of datasets by correcting those statements; and (2)
generating additional links using existing high-precision linking systems
on the chosen datasets. Data quality metrics such as accuracy, consistency
and conciseness will be considered.
The aim of the second task is threefold: (1) to provide an inventory of
publicly available and functional linking tools that are able to deal with
a wide spectrum of data linking problems; (2) to propose a theoretical
approach for the modularization of these tools into atomic modules easy to
combine in order to build more complex solutions in a linking ecosystem;
(3) to make the produced modules available to the data linking community.
To do the modularization at scale, we plan to call upon unsupervised ML
algorithms, enhanced by a human-in-the-loop approach. The objective is to
provide a set of correspondences between the modules and the LPTs.
Starting period: January 2022, for a duration of 24 months
* Work environment and Salary *
Location: Institut de Recherche en Informatique de Toulouse (IRIT),
Université Toulouse - Jean Jaurès / Maison de la Recherche, 5, allées
Antonio Machado, 31058 Toulouse.
Salary between 2200€ and 2700€ gross monthly depending on qualifications
and situation.
* How to apply *
Applicants are required to have a PhD in Computer Science and a strong
background in semantic web technologies, ontology matching and data
linking. Fluency in written and spoken English is also required. A good
publication record and strong programming skills will be a plus.
Applications will be accepted until the position is closed. Applicants
should send a full CV including a complete list of publications, a cover
letter indicating their research interests, achievements to date and
vision for the future, as well as either support letters or the names of two
people who have worked with them.
Contact: Cassia Trojahn (cassia.trojahn@irit.fr) and Olivier Teste (olivier.teste@irit.fr)
--
[LOGIC] mailing list
http://www.dvmlg.de/mailingliste.html
Archive: http://www.illc.uva.nl/LogicList/
provided by a collaboration of the DVMLG, the Maths Departments in Bonn and Hamburg, and the ILLC at the Universiteit van Amsterdam