The Life and Times of AdMob Engineering

Posts tagged ‘CTR prediction’

Recently, one of my papers (primarily drawn from the work in my PhD thesis) was accepted for publication in the IEEE Transactions of Information Theory. The paper is titled ‘Universal Denoising of Discrete-time Continuous-Amplitude Signals’, an abstract of which (reproduced from the manuscript) is as follows.

We consider the problem of reconstructing a discrete-time signal (sequence) with continuous-valued components corrupted by a known memoryless channel. When performance is measured using a per-symbol loss function satisfying mild regularity conditions, we develop a sequence of denoisers that, although independent of the distribution of the underlying `clean’ sequence, is universally optimal in the limit of large sequence length. This sequence of denoisers is universal in the sense of performing as well as any sliding window denoising scheme which may be optimized for the underlying clean signal. Our results are initially developed in a “semi-stochastic” setting, where the noiseless signal is an unknown individual sequence, and the only source of randomness is due to the channel noise. It is subsequently shown that in the fully stochastic setting, where the noiseless sequence is a stationary stochastic process, our schemes universally attain optimum performance. The proposed schemes draw from nonparametric density estimation techniques and are practically implementable. We demonstrate efficacy of the proposed schemes in denoising gray-scale images in the conventional additive white Gaussian noise setting, with additional promising results for less conventional noise distributions.

In other words, suppose we are presented with some vital data (most often result of measurements made on a system) that is unfortunately corrupted by noise and we are interested in the underlying clean data. The work in this paper involves a new denoising (or noise removal) algorithm that retrieves the underlying clean data from its corresponding noise corrupted version. The corruption source is modeled as a memoryless channel that can be described by its probability distribution function that is (assumed to be) known . The algorithm performs denoising by learning statistics of the noise corrupted data using nonparametric density estimation (smoothing). It is then “psuedo-inverted” to estimate the statistics of the underlying clean data followed by Bayesian estimation of the “actual” (clean) data values itself. The main contribution of the work is the algorithm analysis that establishes the asymptotic optimality of the scheme. A central ideas used in the analysis are, concentration inequalities (in probability) that measures the deviation of the loss incurred from applying the algorithm on the noisy data to a “genie-aided” scheme that has knowledge of the statistics of the underlying clean data. The genie-aided scheme has the luxury of making Bayesian optimal decision and we show that we asymptotically approach its performance by applying the proposed algorithm under mild regularity assumptions on the channel.

Information Theory, in its essence, is a very academic discipline with the publications therein being heavy in mathematical analysis of algorithms for problems such as data compression, prediction, filtering etc. However the problems being analyzed and studied have significant practical implications in building engineering systems. On the face of it, it appears to be a whole different universe from mobile advertising. But many of the analytical tools used in design and analysis of information processing algorithms also find application in optimization of mobile advertising. For example, universal prediction schemes have potential applications in CTR prediction which is vital in revenue estimation and hence traffic allocation problems in an advertisement network

- Kamakshi Sivaramakrishnan, Optimization Team

Copyright © The Life and Times of AdMob Engineering. All rights reserved.