ENG EC517 A1 Spring 2008

Introduction to Information Theory

Project presentations:


Presentation time: solo projects 15 mins (hard limit) + 5 mins for questions and to change to the next presentation; group project: 20 mins + 5 mins.

Part-I: 5:00 – 8:00 pm Thursday, 1 May, 2008, PHO 404-428

1

Davide Zennaro

A Comparison between Network Coding and ARQ in a Converge-Casting Scenario

2

Binbin Li

The Capacity of Wireless Networks from order sqrt{n/log n} to n^{1-epsilon}

3

Nagaraja Shivashankar

Distributive Video Coding

BREAK (10 mins)

4

Limor Eger and Cristina Esteve

Image-adaptive Data Hiding

5

Brian Mearns

Least-Significant Bit Steganography and Steganalysis

6

Kunjan Rana

CAPTCHA as a Communications Problem

 

Part-II: 4:30 – 7:30 pm Friday, 2 May, 2008, PHO 404-428

 

7

Emily Hostage

An Information-Theoretic Structure for Genetics

8

Richard Williams

Stabilizable Data Rates for Discrete Time Linear Systems

9

Dhananjay Raghunathan

An Investigation into the use of Relative Motion between Robots in 2-D as a means for Signaling

BREAK (10 mins)

10

Andrew Huss

Probabilistic Proof and Graph Properties

11

Kirill Trapeznikov

Compressed Sensing: An Overview

12

Satyavrata

Information-Theoretic Feature Selection for Clustering

 

Speakers, titles, and abstracts

 
 
 
 
 
 
 
 
 


 
 
 
 
 
 
 
 
 
Davide Zennaro
 
A Comparison between Network Coding and ARQ in a Converge-Casting Scenario

 
 
 
 
Abstract
 

This paper presents a comparison between Network Coding (NC) and K-ARQ

(ARQ with a maximum of (K-1)retransmissions for each packet). Our key

objective is to assess the performance of NC when used as an error

control scheme rather than an efficient multicasting technique. We

focus here on a converge-casting sensor network, where nodes

transmit packets to a final destination (or sink) via a few relay nodes that

communicate with the sink using either NC or K-ARQ. We aim at

understanding if NC could actually be chosen as a valid alternative, in

terms of number of transmitted packets, to K-ARQ in some relevant

scenarios.

 
[Talk viewgraphs]
 
 
 
 
 
 
 
 


 
 
 
 
 
 
 
 
 
Binbin Li
 
The Capacity of Wireless Networks: from order sqrt{n/log n} to n^{1-epsilon}
 
Abstract
 
Since the pioneering work of Gupta and Kumar, the capacity of large-scale 
wireless networks has received considerable attention in the research 
community. Recent refinements to these results have shown that under a
multihop architecture, the total network throughput cannot scale better
than order sqrt{n} as the number of nodes n grows, for a network with
randomly deployed nodes. However, a more recent work shows that for certain
types of networks, a near-linear growth in network capacity can be achieved
through intelligent node cooperation and distributed MIMO communication under
a hierarchical digital architecture. In this project, we will investigate how
and why these new results emerge and improve the capacity of wireless networks.
 
[Talk viewgraphs]
 
 
 
 
 
 
 
 
 


 
 
 
 
 
 
 
 
 
Nagaraja Shivashankar
 
Distributive Video Coding
 
Abstract
 

Digital Video domain has experienced much advancement in the last few years

and has achieved great success in different applications. Examples are video

streaming and many others. All current video coding systems are based on the

very well-known motion compensation prediction prototype where most of the

computational burden of the encoding/decoding process is put at the encoder

with simpler decoders. The Slepian-Wolf and Wyner-Ziv theorems have received

a great deal of interest recently by showing that it is feasible to shift part

of this computational burden to the decoder, allowing the existence of simpler

encoders. Distributed Video Coding (DVC) aims at exploiting these theorems to

build encoders that are more suitable for applications where low complexity

encoders are a must. This project explores the theoretical background of the

Slepian-Wolf and Wyner-Ziv coding theory in DVC, reviews the most relevant

DVC solutions available in the literature and comparison of these DVC with

conventional video codecs.

 
[Talk viewgraphs]
 
 
 
 
 
 
 
 
 


 
 
 
 
 
 
 
 
 
Limor Eger and Cristina Esteve
 
Image-adaptive Data Hiding
 
Abstract
 

Data hiding embeds data into digital media such as image, video, audio

or text, known as the host signal,  for the purpose of copyright

protection, content authentication, steganography, etc. Some of the

technical issues involved are the quantity of information bits to be

hidden, the ability of the embedding algorithm to survive common

signal processing operations (e.g. compression, noise addition) and

the ability of the embedding algorithm not to cause perceptual

degradation of the host signal. In this project we will focus on

applying the information-theoretic concepts to practical applications

of data hiding in images. In particular, we will explore image-adaptive

methods based on quantization techniques. These methods use local

criteria to determine where to hide data and therefore allow embedding

large volumes of data with low perceptual degradation.

 
[Talk viewgraphs]
 
 
 
 
 
 
 
 
 


 
 
 
 
 
 
 
 
 
Brian Mearns
 
Least-Significant Bit Steganography and Steganalysis
 
Abstract
 
The least significant bits of an image's pixels are often used by
steganographers to embed hidden messages because it causes only a slight
variation in the image which is (ideally) unnoticeable. We will explore
a few variations on this technique, including "dirty-paper" and
"wet-paper" coding. Further, we will investigate some techniques for
steganalysis (the detection of hidden messages) and see how well they
work for the codes discussed.
 
[Talk viewgraphs]
 
 
 
 
 
 
 
 
 


 
 
 
 
 
 
 
 
 
Kunjan Rana
 
CAPTCHA as a Communications Problem
 
Abstract
 
CAPTCHAs (Completely Automated Public Turing tests to tell Computers and Humans
Apart) are widely used in various web applications to prevent automated form
submission reducing spam and server load caused by malicious bots.  Most
commonly, they consist of an image of a series of letters that has been
distorted to prevent OCR (optical character recognition) systems from detecting
their contents.  However, many CAPTCHAs have been cracked through newer
detection schemes.  So, the role of the developer of the CAPTCHA algorithm is
to ensure that the image can be decoded easily by a human but would be very
difficult for current OCR methods.  This problem can be thought of as a
communication problem with the encoder encoding the alphabet of letters to
fixed images, the channel introducing distortions into the image, and the
human and various OCR tools decoding the message. For purposes of simplicity, 
I will focus on encoding a single letter and applying random and
nonrandom geometric distortions to explore the limits of detection.
 
[Talk viewgraphs]
 
 
 
 
 
 
 
 
 


 
 
 
 
 
 
 
 
 
Richard Williams
 
Stabilizable Data Rates for Discrete Time Linear Systems
 
Abstract
 
We will examine the problem of stabilizing an unstable linear time-
invariant discrete time system through a noiseless feedback loop of
limited bandwidth.  We will prove the data rate theorem, which states
that the set of stabilizable data rates is the set of all data rates
greater than the intrinsic entropy of the plant.  Simulation results
will also be presented, demonstrating the performance of a coder-
controllers constructed along the lines of the proof argument.
 
[Talk viewgraphs]
 
 
 
 
 
 
 
 
 


 
 
 
 
 
 
 
 
 
Dhananjay Raghunathan
 
An Investigation into the use of Relative Motion between Robots in 2-D 
as a means for Signaling
 
Abstract
 
We present our investigation into the use of relative motion
between robots as a means for signaling between the robots. 
The idea is formulated in the framework of information
theory. Continuity in the motion profiles of the robots with 
respect to their path and velocity is a necessary constraint 
for this problem. We present one particular block coding scheme
which involves the transmission of straight-line slopes in the 
path as symbols. We determine a sequence of optimal codes for 
this coding scheme by minimizing the deviation from initial
relative position at the end of a transmission, maximizing the
separation of code-words based on a distance metric between 
code-words, and minimizing the average energy cost per symbol
transmitted per code. Assuming an additive Gaussian noise model, 
we develop a maximum likelihood estimator for the decoder. 
Simulations are presented illustrating the ideas developed.
Some open questions are shown.
 
[Talk viewgraphs]
 
 
 
 
 
 
 
 
 


 
 
 
 
 
 
 
 
 
Andrew Huss
 
Probabilistic Proof and Graph Properties
 
Abstract
 
The probabilistic method is a powerful approach for proving the existence of
objects without providing an explicit construction, which may be difficult.  
For example, the method can be applied in the proof of Shannon's
Channel Coding Theorem to prove the existence of channel codes which achieve
near-optimal rates.  This project explores the application of the
probabilistic method to generalized graphs.  It is possible to prove
interesting relationships between easy-to-measure properties of graphs such
as cardinality and harder-to-measure properties such as clique size and
chromatic number.  These results can be applied algorithmically to shorten
some computationally expensive calculations.
 
[Talk viewgraphs]
 
 
 
 
 
 
 
 
 


 
 
 
 
 
 
 
 
 
Kirill Trapeznikov
 
Compressed Sensing: An Overview
 
Abstract
 
Modern acquisition systems capture enormous amounts of data and then
significantly compress it to make storage and communication feasible. A
consumer digital camera with a three mega pixel sensor only stores a
compressed image with a size of just few hundred kilobytes. So why do we
record so much data only to throw away most of it during compression? Can we
record only a small fraction of the data and still have perfect or near
perfect reconstruction? The answer is yes, and the method is called
Compressed Sensing or Compressive Sampling (CS).
 
The Nyquist Sampling Theorem dictates us to sample at least at twice the
maximum frequency of a band limited signal to preserve perfect signal
reconstruction. However, many signals, such as images are not band limited
and may have very few non-zero components in the frequency or some other
domain. (Our signal is sparse in the domain of choice.) If we can somehow
capture only these very few non-zero coefficients, we will have perfect
signal reconstruction. This would be a rough sketch of a typical image
compression algorithm only if we had the original signal to determine the
exact locations of the non-zero coefficients. But we do not have this
information. CS solves this problem by having a measurement process
independent of the signal being measured. CS measurements are random linear
combinations of the signal. Reconstruction is done by solving an
optimization problem. An algorithm searches for the sparsest signal out of
all possible signals that could yield our recorded measurement. If the
measurement process and the signal satisfy certain conditions, the
reconstruction process is exact with high probability while sampling was at
a rate much less than Nyquist suggests.
 
My presentation/project will explore the interesting field of CS. I will
explain the theory, provide examples and draw connections to Information
Theory.
 
[Talk viewgraphs]
 
 
 
 
 
 
 
 
 


 
 
 
 
 
 
 
 
 
Satyavrata
 
Information Theoretic Feature Selection for Clustering
 
Abstract
 
Appropriate feature selection and weighting is crucial for clustering 
algorithms to successfully handle multi-dimensional data. A feature 
is relevant when it is correlated with the classification, mutually 
independent of other features, but possibly correlated with other 
features. For any feature, these characteristics, and hence the 
weighting, can be determined using information theoretic quantities,
e.g., mutual information with other features and the veridical cluster
assignment available from training data. An application of the technique
to feature weighting in a speech separation task is presented.
 
[Talk viewgraphs]
 
 
 
 
 
 
 
 
 


 
 
 
 
 
 
 
 
 
Emily Hostage
 
An Information-Theoretic Structure for Genetics
 
Abstract
 
Since Claude Shannon introduced it in 1948, information theory has been the
inspiration for a variety of new approaches to existing research, ranging
from image processing and pattern recognition to finance and statistical
analysis. One such application receiving growing attention is genetics. 
Before drawing viable conclusions from such an examination, the findings of 
information theory must be translated into a language useful in a biological 
context.  In this project we will examine research leading to a properly
defined structure for measuring entropy, compressing data, searching for
patterns, and comparing data sets using sequences of DNA.  We will observe
the analogy of genetic distance to informational distance, and see how this
structure facilitates a quantitative study of biological processes such as
phylogeny, genetic regulation and function, and binding site manipulation.
 
[Talk viewgraphs]