Speaker: Jiong Yang, University of Illinois, Urbana-Champaign
Mining Biological Data
Abstract: Bio-informatics has been an active research field in recent years. Sequence analysis and micro-array analysis are two important research areas in bio-informatics. I will present some recent research results in these two areas. In the first part of my talk, I will present a model for discovering important sequential patterns. Frequent pattern mining has been studied for several years, and most of the research focuses on finding exact pattern matches. However, in many applications a pattern may be corrupted, so that each occurrence in the database is only an approximate match of the original pattern. This phenomenon becomes more evident for long patterns. An example of such corruption is mutation between amino acids. To address this issue, a new model has been proposed to characterize possible corruptions and to quantify their impact on pattern matching. This model includes the traditional frequent-pattern model as a special case. In the second part of my talk, I will present a novel generic model called the coherent cluster, which captures the coherence (rather than the closeness) exhibited among objects in a high-dimensional numerical space. The coherent cluster model includes existing clustering models as special cases and has wide applications, especially in E-Commerce and bio-informatics, where each object or attribute may naturally bear some degree of bias and different biases may apply in different scopes.
Short Bio: Jiong Yang's research interests include data mining, database systems, and bioinformatics. He received his Ph.D. degree from UCLA in 1999. After graduation, he joined the IBM T. J. Watson Research Center as a research staff member. Since September 2002, he has been a visiting assistant professor in the computer science department at UIUC. Dr. Yang has published more than thirty refereed articles in various conferences and journals.
Hosts: Jon Doyle and Munindar Singh