Announcement of the seminar on "Text Mining Biodiversity Literature"
Topic: Text Mining Biodiversity Literature

Speaker: Dr. Nguyễn Thị Hồng Nhung (Department of Knowledge Engineering, Faculty of Information Technology, and National Centre for Text Mining, University of Manchester, United Kingdom).

Time: 14:00 to 16:00, Wednesday, 29/03/2017

Venue: Room B11A

Language of presentation: Vietnamese.

Text Mining Biodiversity Literature

Biodiversity, a contraction of "biological diversity", is concerned with the study of the various levels of living entities on Earth, from genes to ecosystems. It plays a central role in our daily lives, given its implications for ecological resilience, food security, species and subspecies endangerment, and natural sustainability. To support the advancement of biodiversity research, several efforts aimed at storing and sharing biodiversity knowledge have been undertaken over the past few years, resulting in the creation of digital resources such as the Biodiversity Heritage Library (BHL), the Catalogue of Life, the Encyclopedia of Life, and the Global Biodiversity Information Facility. BHL is an open-access repository containing millions of digitised pages of legacy literature on biodiversity. Currently, BHL holds nearly 100,000 titles and over 170,000 volumes in many languages, amounting to a huge body of textual content with over 150 million species mentions. The English subset alone, for instance, comprises more than 24 million pages of text.

In this talk, I will present recent text mining results in the domain of biodiversity, obtained at the National Centre for Text Mining, University of Manchester, United Kingdom. First, I will describe the automatic construction of a biodiversity terminological inventory. This inventory was created by applying distributional semantic models to the English subset of BHL. It contains over 288,000 species names. For each species name in the inventory, the 20 most semantically related names are provided, together with their corresponding similarity scores. To evaluate the inventory from a more practical point of view, we implemented a visual search interface that incorporates our term inventory to enable automatic query expansion. Second, I will present our approach to constructing a knowledge repository on Philippine biodiversity. The repository will combine different types of information (e.g., taxonomic, occurrence, ecological, biomolecular, and biochemical), thus providing users with a comprehensive view of species of interest that will allow them to (1) carry out predictive analysis of species distributions, and (2) investigate potential medicinal applications of natural products derived from Philippine species.
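To make the idea of a term inventory with ranked related names more concrete, here is a minimal sketch in Python. It is not the speaker's implementation: the abstract does not say which distributional semantic model was used, so this example assumes the simplest possible one (co-occurrence counts within a small window, compared by cosine similarity). The toy corpus, the underscore-joined species tokens, and the function names `context_vectors`, `top_related`, and `expand_query` are all hypothetical.

```python
import math
from collections import Counter


def context_vectors(sentences, window=2):
    """Count, for every token, the tokens seen within `window` positions of it."""
    vectors = {}
    for tokens in sentences:
        for i, term in enumerate(tokens):
            counts = vectors.setdefault(term, Counter())
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    counts[tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_related(term, vectors, candidates, k=20):
    """Return the k candidates most similar to `term`, with their scores."""
    scored = [
        (other, cosine(vectors[term], vectors[other]))
        for other in candidates
        if other != term and other in vectors
    ]
    # Highest score first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


def expand_query(term, vectors, candidates, k=3):
    """Query expansion: the original term followed by its k nearest names."""
    return [term] + [name for name, _ in top_related(term, vectors, candidates, k)]


# Hypothetical toy corpus; species names are pre-joined into single tokens.
sentences = [
    ["quercus_robur", "grows", "in", "temperate", "forests"],
    ["quercus_alba", "grows", "in", "temperate", "forests"],
    ["panthera_leo", "hunts", "on", "open", "savanna"],
]
species = ["quercus_robur", "quercus_alba", "panthera_leo"]

vectors = context_vectors(sentences)
print(top_related("quercus_robur", vectors, species, k=2))
print(expand_query("quercus_robur", vectors, species, k=1))
```

In this toy setting, the two oak names share the same contexts and therefore rank as each other's nearest neighbours, which is the behaviour a search interface would exploit when it expands a query for one name with its related names. A real inventory over millions of pages would need a scalable model and approximate nearest-neighbour search rather than this exhaustive comparison.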


Dr. Nhung Nguyen is currently a research associate at the National Centre for Text Mining, University of Manchester, United Kingdom. She obtained her PhD in Information Science from the Japan Advanced Institute of Science and Technology in 2014. Her main research topic is the extraction of relations between entities in biomedical and biodiversity literature based on predicate-argument structure patterns. She has also worked on automatically constructing a terminological inventory using distributional semantic models.

Updated (21/03/2017)