Title : Learning from Big Data: Using Statistics to tame the Complexity
Speaker : Chiranjib Bhattacharyya
Date and Time : Friday, May 23, 2014, 4:00 PM
Venue : Faculty Hall, Indian Institute of Science
Abstract
The problem of learning statistical models from data can be posed as
optimization programs. These programs often become unwieldy in the Big
Data setting, as the number of variables and constraints grows with the
number of data points. Distributed optimization, which requires expensive
parallel hardware, is the current state-of-the-art remedy for such problems.
However, in statistics, growth in the number of data points is often
welcomed, as it yields more understanding. This raises the question: Are
there alternatives to distributed processing in which statistical
understanding, gleaned from large volumes of data, can be used to tame the
computational complexity of optimization programs? Following this paradigm,
we present two ideas for solving classification problems: the first
involving resampling constraints and the second involving
chance-constrained programming. Time permitting, we will show how these
ideas can be leveraged to build large-scale focused crawlers.
Speaker Bio
Chiranjib Bhattacharyya is an Associate Professor in the Department of
Computer Science and Automation, Indian Institute of Science. He is
interested in Robust Optimization and Machine Learning. Prior to joining
the Department, he was a postdoctoral fellow at UC Berkeley. He holds BE
and ME degrees, both in Electrical Engineering, from Jadavpur University
and the Indian Institute of Science, respectively, and completed his PhD
in the Department of Computer Science and Automation, Indian Institute
of Science.