Invitation to a Talk in the Kolloquium Technische Kybernetik (Engineering Cybernetics Colloquium)
Stochastic Modelling Over a Finite Alphabet and Algorithms for Finding Genes from Genomes
Prof. Dr. Mathukumalli Vidyasagar
Executive Vice President, Tata Consultancy Services (TCS)
Hyderabad, India
Time: Monday · 24 April 2006 · 14:00
Location: Seminar room 3.241 · Pfaffenwaldring 9 · Campus Stuttgart-Vaihingen
Abstract
In this paper, we study the problem of constructing models for a stationary
stochastic process {Y_t} assuming values in a finite set
M := {1, ..., m}, based on observing only a finite-length sample path of the process.
It is shown that a well-known approach of
modelling the given process as a multi-step Markov process is in fact
the only possible solution that satisfies certain nonnegativity conditions.
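To make the multi-step Markov idea concrete, the following is a minimal sketch of how a k-step Markov model might be estimated from a single finite sample path using plain frequency counts. The function name `estimate_markov`, the toy path, and the choice k = 2 are illustrative assumptions and are not taken from the talk.

```python
from collections import Counter, defaultdict


def estimate_markov(path, k):
    """Estimate k-step Markov transition probabilities from one sample path.

    Returns a dict mapping each observed length-k context (a tuple of
    symbols) to a dict of next-symbol probabilities. Probabilities are
    plain relative frequencies, so they are nonnegative and sum to one
    for every observed context.
    """
    counts = defaultdict(Counter)
    for i in range(k, len(path)):
        context = tuple(path[i - k:i])   # the k symbols preceding position i
        counts[context][path[i]] += 1    # count the symbol that followed them
    model = {}
    for context, followers in counts.items():
        total = sum(followers.values())
        model[context] = {sym: c / total for sym, c in followers.items()}
    return model


# Hypothetical toy sample path over the alphabet {A, C, G, T}.
model = estimate_markov("ACGTACGTACGA", k=2)
```

In this toy path the context (A, C) is always followed by G, while (C, G) is followed by T twice and by A once, so the estimated probabilities for (C, G) are 2/3 and 1/3.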
Then we study the problem of classification.
It is assumed that two distinct sets of
sample paths of two separate stochastic processes are available; call them
u_1, ..., u_r and v_1, ..., v_s.
The objective here is to develop not one but two models, called
C and NC respectively, such that the strings u_i have much larger
likelihoods under the model C than under the model NC, and the opposite
is true for the strings v_j.
Then a new string w is classified into the set C or NC according to whether
its likelihood is larger under the model C or under the model NC.
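The classification rule just described can be sketched as follows. This is not the 4M algorithm: it assumes a plain fixed-order Markov model with add-one smoothing for each class, and the helper names (`train`, `log_likelihood`, `classify`) and the toy strings are hypothetical, chosen only to illustrate the likelihood comparison.

```python
import math
from collections import Counter, defaultdict


def train(strings, k, alphabet):
    """Collect k-step transition counts from a set of training strings."""
    counts = defaultdict(Counter)
    for s in strings:
        for i in range(k, len(s)):
            counts[s[i - k:i]][s[i]] += 1
    return {"k": k, "alphabet": alphabet, "counts": counts}


def log_likelihood(model, w):
    """Log-likelihood of w given its first k symbols, with add-one smoothing.

    Smoothing is an assumption of this sketch; it keeps unseen
    transitions from producing log(0).
    """
    k, m, counts = model["k"], len(model["alphabet"]), model["counts"]
    total = 0.0
    for i in range(k, len(w)):
        followers = counts.get(w[i - k:i], Counter())
        total += math.log((followers[w[i]] + 1) / (sum(followers.values()) + m))
    return total


def classify(w, model_c, model_nc):
    """Assign w to whichever model gives it the larger likelihood."""
    return "C" if log_likelihood(model_c, w) > log_likelihood(model_nc, w) else "NC"


# Hypothetical toy training sets standing in for u_1, ..., u_r and v_1, ..., v_s.
u = ["ATGATGATGATG", "ATGATGATG"]
v = ["CCCCGGGGCCCC", "GGGGCCCCGGGG"]
model_c = train(u, k=1, alphabet="ACGT")
model_nc = train(v, k=1, alphabet="ACGT")

label_1 = classify("ATGATG", model_c, model_nc)
label_2 = classify("CCGGCC", model_c, model_nc)
```

Working in log-likelihoods rather than raw likelihoods avoids numerical underflow on long strings and leaves the comparison unchanged, since the logarithm is monotone.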
For this problem, we develop a new algorithm called the 4M
(Mixed Memory Markov Model) algorithm,
which is an improvement over variable length Markov models.
We then apply the 4M algorithm to the problem of finding genes
from the genome.
The performance of the 4M algorithm is compared against that of
the popular Glimmer algorithm.
In most of the test cases studied, the 4M algorithm correctly classifies
both coding and non-coding regions more than 90% of the time.
Moreover, the accuracy of the 4M algorithm compares well with that of
Glimmer.
At the same time, the 4M algorithm is amenable to statistical analysis.
Further work is under way to extend the 4M algorithm to gene finding in
eukaryotic genomes.
Biographical Information
Dr. Mathukumalli Vidyasagar was born in Guntur, Andhra Pradesh, on 29 September 1947. He received the B.S., M.S., and Ph.D. degrees, all in Electrical Engineering, from the University of Wisconsin, in 1965, 1967, and 1969, respectively. Between 1969 and 1989, he worked as a Professor of Electrical Engineering at various universities in the USA and Canada. His last overseas position was at the University of Waterloo, Canada, from 1980 to 1989.
In 1989 he returned to India as the Director of the newly-created Centre for Artificial Intelligence and Robotics (CAIR), under the auspices of the Defence Research and Development Organisation (DRDO), Ministry of Defence, Government of India. In that capacity he built up CAIR into a leading research laboratory consisting of about 40 scientists working on various cutting-edge areas such as aircraft control, robotics, neural networks, and image processing.
In 2000 he joined Tata Consultancy Services (TCS), India's largest IT firm, as an Executive Vice President in charge of Advanced Technology. In this capacity he created the Advanced Technology Centre (ATC), which currently consists of about 60 engineers and scientists working on e-security, advanced encryption methods, and bioinformatics.
In addition to his academic positions, he has held visiting positions at several universities including MIT; the University of California, Berkeley; the University of California, Los Angeles; CNRS Toulouse, France; the Indian Institute of Science; the University of Minnesota; and Tokyo Institute of Technology.
He is the author or coauthor of nine books and more than one hundred and thirty papers in archival journals. He has received several honours in recognition of his research activities, including the Distinguished Service Citation from his alma mater, the University of Wisconsin at Madison. He is a Fellow of the IEEE, the Indian Academy of Sciences, the Indian National Science Academy, the Indian National Academy of Engineering, and the Third World Academy of Sciences.