Deepcode: Feedback Codes via Deep Learning

The design of codes for communicating reliably over a statistically well-defined channel is an important endeavor involving deep mathematical research and wide-ranging practical applications. In this work, we present the first family of codes obtained via deep learning, which significantly outperforms state-of-the-art codes designed over several decades of research.

DeepJSCC-f: Deep Joint Source-Channel Coding of Images With Feedback

We consider wireless transmission of images in the presence of channel output feedback. From a Shannon-theoretic perspective, feedback does not improve the asymptotic end-to-end performance, and separate source coding followed by capacity-achieving channel coding, which ignores the feedback signal, achieves the optimal performance.
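
The Shannon-theoretic remark rests on the classical fact that, for a memoryless channel, output feedback does not increase capacity; a minimal statement of that fact is given below (standard notation, not taken from the paper).

```latex
% Background fact behind the remark above: for a memoryless channel,
% feedback does not increase capacity, so an asymptotically optimal
% scheme can ignore the feedback link.
\[
  C_{\mathrm{FB}} \;=\; C \;=\; \max_{p(x)} I(X;Y).
\]
```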

LEARN Codes: Inventing Low-Latency Codes via Recurrent Neural Networks

Designing channel codes under low-latency constraints is one of the most demanding requirements in 5G standards. However, a sharp characterization of the performance of traditional codes is available only in the large block-length limit. Guided by such asymptotic analysis, code designs require large block lengths, and hence high latency, to achieve the desired error rate.

Tightening Mutual Information-Based Bounds on Generalization Error

An information-theoretic upper bound on the generalization error of supervised learning algorithms is derived. The bound is constructed in terms of the mutual information between each individual training sample and the output of the learning algorithm. The bound is derived under more general conditions on the loss function than in existing studies; nevertheless, it provides a tighter characterization of the generalization error.
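
For reference, one representative bound of this type is sketched below, stated under the simplifying assumption that the loss is σ-subgaussian (the paper works under more general conditions); the notation is standard rather than taken verbatim from the paper.

```latex
% Individual-sample mutual information bound (stated for a sigma-subgaussian
% loss; the paper's conditions on the loss are more general).
\[
  \bigl|\,\mathbb{E}\bigl[\mathrm{gen}(\mu, P_{W \mid S})\bigr]\bigr|
  \;\le\; \frac{1}{n} \sum_{i=1}^{n} \sqrt{2\sigma^{2}\, I(W; Z_i)},
\]
where $S = (Z_1, \dots, Z_n)$ are the i.i.d.\ training samples drawn from
$\mu$, $W$ is the output of the learning algorithm, and $I(W; Z_i)$ is the
mutual information between $W$ and the $i$-th sample.
```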

Toward Moderate Overparameterization: Global Convergence Guarantees for Training Shallow Neural Networks

Many modern neural network architectures are trained in an overparameterized regime where the parameters of the model exceed the size of the training dataset. Sufficiently overparameterized neural network architectures in principle have the capacity to fit any set of labels, including random noise. However, given the highly nonconvex nature of the training landscape, it is not clear what level and kind of overparameterization is required for first-order methods to converge to a global optimum that perfectly interpolates the labels.
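
The capacity claim can be illustrated with a toy experiment, sketched below: gradient descent on a shallow ReLU network whose width far exceeds the number of samples typically drives the training loss on random labels toward zero. The width, step size, and iteration count are arbitrary illustrative choices, not values from the paper.

```python
# Illustrative sketch (not from the paper): gradient descent on an
# overparameterized one-hidden-layer ReLU network fitting random labels.
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 10, 5, 2000                      # 10 samples, width 2000 >> n
X = rng.standard_normal((n, d))
y = rng.choice([-1.0, 1.0], size=n)        # random (noise) labels

W = rng.standard_normal((k, d)) / np.sqrt(d)      # trained hidden weights
v = rng.choice([-1.0, 1.0], size=k) / np.sqrt(k)  # fixed output weights

def forward(W):
    H = np.maximum(X @ W.T, 0.0)           # ReLU features, shape (n, k)
    return H @ v, H

lr = 0.5
for t in range(3000):
    pred, H = forward(W)
    err = pred - y
    # gradient of (1/2n) * ||pred - y||^2 with respect to W
    G = ((err[:, None] * (H > 0) * v[None, :]).T @ X) / n
    W -= lr * G

print("final training loss:", 0.5 * np.mean((forward(W)[0] - y) ** 2))
```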

Stochastic Gradient Coding for Straggler Mitigation in Distributed Learning

We consider distributed gradient descent in the presence of stragglers. Recent work on gradient coding and approximate gradient coding has shown how to add redundancy in distributed gradient descent to guarantee convergence even if some workers are stragglers, that is, slow or non-responsive. In this work we propose an approximate gradient coding scheme called Stochastic Gradient Coding (SGC), which works when the stragglers are random.
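
The sketch below conveys the redundancy-plus-rescaling idea in a simulated setting: each sample is replicated to several workers, random stragglers drop out, and surviving contributions are rescaled so the aggregate remains an unbiased gradient estimate. The assignment pattern, scaling, and hyperparameters here are illustrative, not the paper's exact construction.

```python
# Minimal sketch of approximate gradient coding with random stragglers
# (illustrative; not the SGC paper's exact assignment or analysis).
import numpy as np

rng = np.random.default_rng(1)
n, d, m = 120, 10, 12      # samples, dimension, workers (illustrative)
c, p = 3, 0.3              # replication factor, straggler probability
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.1 * rng.standard_normal(n)

# assign each sample to c distinct workers chosen uniformly at random
assignment = [rng.choice(m, size=c, replace=False) for _ in range(n)]

def sgc_gradient(w):
    """One round: stragglers drop out at random; survivors are rescaled."""
    responsive = rng.random(m) > p            # which workers respond
    grad = np.zeros(d)
    for i in range(n):
        g_i = X[i] * (X[i] @ w - y[i])        # per-sample gradient
        for worker in assignment[i]:
            if responsive[worker]:
                grad += g_i / (c * (1 - p))   # unbiased rescaling
    return grad / n

w = np.zeros(d)
for t in range(200):
    w -= 0.1 * sgc_gradient(w)
print("distance to w_true:", np.linalg.norm(w - w_true))
```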

Understanding GANs in the LQG Setting: Formulation, Generalization and Stability

Generative Adversarial Networks (GANs) have become a popular method to learn a probability model from data. In this paper, we provide an understanding of basic issues surrounding GANs, including their formulation, generalization, and stability, on a simple LQG benchmark where the generator is Linear, the discriminator is Quadratic, and the data have a high-dimensional Gaussian distribution.
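
Concretely, the benchmark can be written as below; the symbols A, Q, b, c, and Σ_X are notation introduced here for illustration, not taken from the paper.

```latex
% LQG benchmark as described above (illustrative notation):
% Gaussian data, linear generator, quadratic discriminator.
\[
  X \sim \mathcal{N}(0, \Sigma_X), \qquad
  G(Z) = A Z, \ \ Z \sim \mathcal{N}(0, I_k), \qquad
  D(x) = x^{\top} Q\, x + b^{\top} x + c .
\]
```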

The Information Bottleneck Problem and its Applications in Machine Learning

Inference capabilities of machine learning (ML) systems have skyrocketed in recent years, and they now play a pivotal role in various aspects of society. The goal in statistical learning is to use data to obtain simple algorithms for predicting a random variable Y from a correlated observation X. Since the dimension of X is typically huge, computationally feasible solutions should summarize it into a lower-dimensional feature vector T, from which Y is predicted.
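
The classical information bottleneck objective formalizing this summarization step is recalled below; T is obtained from X alone, so Y, X, T form a Markov chain.

```latex
% Information bottleneck objective: compress X into T while keeping T
% informative about Y, with the Markov chain Y -- X -- T.
\[
  \min_{P_{T \mid X}} \; I(X;T) \;-\; \beta\, I(T;Y),
\]
where $\beta > 0$ trades off the complexity of the representation $T$
(measured by $I(X;T)$) against the information it retains about $Y$.
```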

MaxiMin Active Learning in Overparameterized Model Classes

Generating labeled training datasets has become a major bottleneck in Machine Learning (ML) pipelines. Active ML aims to address this issue by designing learning algorithms that automatically and adaptively select the most informative examples for labeling so that human time is not wasted labeling irrelevant, redundant, or trivial examples. This paper proposes a new approach to active ML with nonparametric or overparameterized models such as kernel methods and neural networks.
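
A rough sketch of one maximin-style selection rule with a kernel minimum-norm interpolator appears below: pick the pool point whose worst-case label (over the two possibilities) forces the largest interpolator norm. This is an assumed reading of the idea, not a verbatim implementation of the paper's algorithm; the kernel and toy data are arbitrary.

```python
# Maximin-style active selection with a kernel min-norm interpolator
# (illustrative sketch; not the paper's exact criterion or algorithm).
import numpy as np

rng = np.random.default_rng(0)

def rbf_kernel(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def min_norm_sq(X, y, reg=1e-8):
    """Squared RKHS norm of the minimum-norm interpolator of (X, y)."""
    K = rbf_kernel(X, X) + reg * np.eye(len(X))
    return float(y @ np.linalg.solve(K, y))

# a few labeled points and a pool of unlabeled candidates (toy data)
X_lab = rng.standard_normal((5, 2))
y_lab = np.sign(X_lab[:, 0])
pool = rng.standard_normal((50, 2))

def maximin_pick(X_lab, y_lab, pool):
    scores = []
    for x in pool:
        X_aug = np.vstack([X_lab, x])
        # worst case over the two possible labels of the candidate point
        worst = min(min_norm_sq(X_aug, np.append(y_lab, s)) for s in (-1.0, 1.0))
        scores.append(worst)
    return int(np.argmax(scores))

print("index of selected pool point:", maximin_pick(X_lab, y_lab, pool))
```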

Expression of Fractals Through Neural Network Functions

To help understand the underlying mechanisms of neural networks (NNs), several groups have studied the number of linear regions ℓ of the piecewise linear (PwL) functions generated by deep neural networks (DNNs). In particular, they showed that ℓ can grow exponentially with the number of network parameters p, a property often used to explain the advantages of deep over shallow NNs.
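
The quantity ℓ can be probed empirically, as in the small sketch below, by counting distinct ReLU activation patterns of a random network along a fine 1-D grid, a standard proxy for the number of linear regions encountered on that interval. The architecture sizes and interval are arbitrary illustrative choices.

```python
# Illustrative sketch: estimate the number of linear pieces of a small
# random ReLU network on a 1-D input by counting distinct activation
# patterns along a fine grid (a common proxy for linear regions).
import numpy as np

rng = np.random.default_rng(0)
widths = [1, 8, 8, 8, 1]   # 1-D input, three hidden layers, scalar output
weights = [rng.standard_normal((widths[i + 1], widths[i]))
           for i in range(len(widths) - 1)]
biases = [rng.standard_normal(widths[i + 1]) for i in range(len(widths) - 1)]

def activation_pattern(x):
    """Return the on/off pattern of every ReLU for scalar input x."""
    h = np.array([x])
    pattern = []
    for W, b in zip(weights[:-1], biases[:-1]):   # skip the final linear layer
        h = W @ h + b
        pattern.extend(h > 0)
        h = np.maximum(h, 0.0)
    return tuple(pattern)

grid = np.linspace(-5.0, 5.0, 20000)
patterns = {activation_pattern(x) for x in grid}
print("distinct activation regions found on [-5, 5]:", len(patterns))
```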