Logic List Mailing Archive

Postdoc in CS in connection with natural language, Caen (France)

Postdoctoral Position in Computer Science, Linguistics and Natural
Language Processing: Using text resources for data mining

Research Unit: Groupe de REcherche en Informatique, Image, Automatique et
Instrumentation de Caen (GREYC) http://www.greyc.unicaen.fr/

Location: Caen, Normandy, France

This post-doc position is linked to the Bingo project, which brings
together three computer science teams (EURISE, EA 3721, Université de
St-Etienne; GREYC - CNRS UMR 6072, Université de Caen; and LIRIS - CNRS
UMR 5205, INSA de Lyon) and a team of biologists (CGMC - CNRS UMR 5534,
Université de Lyon 1).

The Bingo project (Bases de données INductives et GénOmique in French -
Genomics and Inductive Databases in English, see
http://www.info.unicaen.fr/~bruno/bingo/) focuses on several open 
problems, one of which is the use of text resources during the pattern 
post-processing stage, in order to make better use of domain knowledge 
during the knowledge discovery stage.  This problem requires a close 
cooperation between linguistic knowledge and methods from knowledge 
discovery in databases.

The aim of this post-doc position is to use texts and ontologies to
support the knowledge discovery phase (i.e., when post-processing
patterns), so that the knowledge presented is relevant to the experts'
needs. Indeed, KDD processes tend to produce a large number of patterns
that are - a priori - interesting. Validating the extracted information
is a hard task that requires background knowledge of the domain at hand.
This background knowledge is partially embedded in the literature. The
key idea is to help the validation step by using ontologies (cf.
http://www.geneontology.org/) and textual resources (e.g., Medline). For
instance, in the context of genomic data, starting from a pattern which
may be a synexpression group, a biologist would like to retrieve the
texts that deal with this particular topic, find out which biological
situations are concerned, and so on. Several work directions are proposed
(e.g., text-reader profiling, text analysis, defining constraints derived
from text resources), see
http://www.info.unicaen.fr/~bruno/bingo/pages/menu_evenements.php

This post-doctoral position is supported by the CNRS, see also
http://www.k-projects.com/cnrs_postdocs_2005/public/departement.php?Dep=INT&IdDpt=12

Sought profile of the candidate

PhD in Computer Science with an interest in linguistics or natural
language processing. Significant experience in knowledge discovery in
databases or linguistics would be highly appreciated. Speaking French is
not required.


Duration of the fellowship (months): 12 (starting from September 1st, 
2005)

Gross salary: 25,800 euros per annum

Application deadline: May 16th, 2005

Contact:
Bruno Crémilleux  +33 2 31 56 74 35   Bruno.Cremilleux@info.unicaen.fr
Nadine Lucas      +33 2 31 56 73 36   Nadine.Lucas@info.unicaen.fr

GREYC - CNRS UMR 6072, Université de Caen, Campus Côte de Nacre
F-14032 Caen Cedex - France