2019 IEEE North American School of Information Theory

(Link to YouTube Playlist)

Prof. Alexander Barg, University of Maryland, College Park
Erasure codes for distributed storage and related problem     (PDF Presentation Slides   Part 1    Part 2)

Abstract: The task of node repair in distributed storage gives rise to a range of new, previously unrecognized problems in coding theory and related areas of computer science and discrete mathematics. These problems have been actively studied for the past decade and led to the emergence of new methods and ideas in these areas. The goal of this tutorial is to introduce these methods and the associated results as well as to point out new research directions. We will discuss the problems of repair bandwidth, repair degree, their limits and optimal algebraic solutions, repair of RS codes, extensions of the basic questions to the repair of multiple nodes and various repair schedules, open problems, and extensions of the main ideas beyond the field of coding theory.

Prof. Tara Javidi, University of California, San Diego
Sequential Acquisition of Information: From Active Hypothesis Testing to Active Learning

Abstract: This lecture explores an often overlooked connection between the role of feedback in information theory and that of active hypothesis testing in statistics. This connection, we argue, has significant implications for the next generation machine learning algorithms where data is collected actively and/or by cooperative yet local agents. Consider a decision maker who is responsible to actively and dynamically collect data/samples so as to enhance the information about an hidden parameter of interest while accounting for the cost of communication, sensing, or data collection. The decision maker must rely on his current information state to constantly (re-)evaluate the trade-off between the precision and the cost of various actions. An important example of this problem is that of channel coding with feedback whose solution, in terms of Extrinsic Jensen-Shannon divergence and posterior matching, provides critical insights for the design of the generation machine learning algorithms.

In the first part of the talk, we discuss the history of the problem and seminal contributions by Blackwell, Chernoff, De Groot, and Stein. Considering the problem of measurement-based noisy search as a special case, we discuss the information theoretic notions of acquisition rate and reliability (and their fundamental trade-off), Extrinsic Jensen-Shannon divergence, and posterior matching. In the second part of the talk, we will use these notions and insights for an array of important problems in machine learning: 1) active Bayesian learning, 2) empirical function optimization, and 3) multi-agent learning.

Prof. Maxim Raginsky, University of Illinois, Urbana-Champaign
Information, Concentration, and Learning     (PDF Presentation Slides)

Abstract: During the last two decades, concentration of measure has been a subject of various exciting developments in convex geometry, functional analysis, statistical physics, high-dimensional statistics, probability theory, information theory, communications and coding theory, computer science, and learning theory. One common theme that emerges in these fields is probabilistic stability: complicated, nonlinear functions of a large number of independent or weakly dependent random variables often tend to concentrate sharply around their expected values. Information theory plays a key role in the derivation of concentration inequalities. Indeed, both the entropy method and the approach based on transportation-cost inequalities are two major information-theoretic paths toward proving concentration.
Machine learning algorithms can be viewed as stochastic transformations (or channels, in information-theoretic parlance) that map training data to hypotheses. Following the classic paper of Bousquet and Elisseeff, we say that such an algorithm is stable if its output does not depend too much on any individual training example. Since stability is closely connected to generalization capabilities of learning algorithms, it is of theoretical and practical interest to obtain sharp quantitative estimates on the generalization bias of machine learning algorithms in terms of their stability properties. In this tutorial, I will survey a recent line of work aimed at deriving stability and/or generalization guarantees for learning algorithms based on mutual information, erasure mutual information, and related information-theoretic quantities.

Prof. Kannan Ramchandran, Padovani Lecturer, University of California, Berkeley
On duality, encryption, sampling and learning: the power of codes     (PDF Presentation Slides)

Abstract: We will provide short research vignettes that draw inspiration from Shannon's ground-breaking impact on modern information and communication systems. These reflect personal research stories, and include: (i) the uncovering of a functional duality between source coding and channel coding, that is of particular interest in the presence of side-information; (ii) exchangeability between compression and encryption in a secure and efficient system; (iii) a surprising connection between sampling theory and sparse-graph coding theory uncovered in the setting of multi-band sub-Nyquist sampling; and (iv) the powerful role of codes in learning sparse structure and speeding up distributed computation in modern cloud platforms.

Prof. Adam Smith, Boston University
Privacy, Stability, and Generalization     (PDF Presentation Slides)

Abstract: This course will cover differential privacy — a restriction on data analysis algorithms that offers meaningful confidentiality guarantees to individuals’ data against attackers with arbitrary side information. The first part of the course will explain why privacy can be difficult to ensure, discussing a range of attacks based on statistical releases. We will then introduce differential privacy and some of the basic techniques used to design differentially private algorithms. Finally, we will see applications of differential privacy to maintaining statistical validity of adaptive data analysis (in which data are re-used across a sequence of studies). Throughout, we will highlight connections to information-theoretic notions.

Dr. Kalyan Veeramachaneni, Massachusetts Institute of Technology
Hands-on Machine Learning Workshop: The Machine Learning Bazaar     (GitHub Link to Demos)