Logic List Mailing Archive
Postdoc in CS in connection with natural language, Caen (France)
Postdoctoral Position in Computer Science, Linguistics and Natural Language
Processing: Using text resources for data mining
Research Unit: Groupe de REcherche en Informatique, Image, Automatique et
Instrumentation de Caen (GREYC) http://www.greyc.unicaen.fr/
Location: Caen, Normandy, France
This post-doc position is linked to the Bingo project, which brings together
three computer science teams (EURISE, EA 3721, Université de St-Etienne;
GREYC - CNRS UMR 6072, Université de Caen; and LIRIS - CNRS UMR 5205, INSA
de Lyon) and a team of biologists (CGMC - CNRS UMR 5534, Université de
Lyon 1).
The Bingo project (Bases de données INductives et GénOmique in French -
Genomics and Inductive Database in English, see
http://www.info.unicaen.fr/~bruno/bingo/) focuses on several open
problems, one of which is the use of text resources during the pattern
post-processing stage, in order to make better use of domain knowledge
during the knowledge discovery stage. Addressing this problem requires close
cooperation between linguistics and methods from knowledge discovery in
databases (KDD).
The aim of this post-doc position is to use texts and ontologies to
support the knowledge discovery phase (i.e., when post-processing
patterns), so that knowledge relevant to the experts' needs is presented.
KDD processes tend to produce many patterns that are, a priori,
interesting. Validating the extracted information is a hard task and
requires background knowledge of the domain at hand. This background
knowledge is partially embedded in the literature. The key idea is to
support the validation step by using
ontologies (cf. http://www.geneontology.org/) and textual resources (e.g.,
Medline). For instance, in the context of genomic data, starting from
a pattern that may be a synexpression group, the biologist would like to
retrieve the texts that deal with this particular topic, find out which
biological situations are concerned, and so on. Several work directions
are proposed (e.g., text-reader profiling, text analysis, defining
constraints derived from text resources); see
http://www.info.unicaen.fr/~bruno/bingo/pages/menu_evenements.php
This post-doctoral position is supported by the CNRS, see also
http://www.k-projects.com/cnrs_postdocs_2005/public/departement.php?Dep=INT&IdDpt=12
Sought profile of the candidate
PhD in Computer Science with an interest in linguistics or natural language
processing. Significant experience in knowledge discovery in databases
or linguistics would be highly appreciated. Speaking French is not
required.
Duration of the fellowship (months): 12 (starting from September 1st,
2005)
Gross salary: 25,800 euros per annum
Deadline for application: May 16th, 2005
Contact:
Bruno Crémilleux +33 2 31 56 74 35 Bruno.Cremilleux@info.unicaen.fr
Nadine Lucas +33 2 31 56 73 36 Nadine.Lucas@info.unicaen.fr
GREYC - CNRS UMR 6072, Université de Caen, Campus Côte de Nacre
F-14032 Caen Cedex - France