Linear sketching is a data compression technique based on computing a small number of linear functions over the data. With a careful choice of such functions, one can later perform various computations of interest by looking only at their values and not at the data itself. In recent years linear sketching has emerged as a powerful tool for approximate computing in settings with limited resources, including distributed computation and streaming. It has been used in breakthrough results on graph and matrix processing, dimensionality reduction, etc. Strikingly, linear sketches have been shown to be optimal for dynamic stream processing under fairly mild assumptions.
In this talk I will describe a new study of linear sketching that focuses on understanding the power of linear sketches based on parities (i.e., over GF_2, the field of two elements, in contrast to previous work that uses real arithmetic). I will illustrate various properties of such sketches using Fourier-analytic methods and tools from communication complexity. In particular, linear sketching over GF_2 turns out to be closely related to Fourier sparsity with respect to L_p-norms. Moreover, it can be shown to be optimal in streaming and distributed settings for data generated according to the uniform distribution.
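As a rough illustration of what a parity-based sketch looks like, the following minimal Python sketch stores k parities of random subsets of the input bits and checks that the sketch is linear over GF_2, so sketches of two inputs can be combined without access to the inputs themselves. The function names, the parameters n and k, and the use of uniformly random subsets are assumptions made for this example; they are not taken from the talk.

```python
import random


def make_parity_masks(n, k, seed=0):
    """Draw k random subsets of the n input bits, each encoded as an n-bit mask.

    Uniformly random subsets are an assumption for this illustration.
    """
    rng = random.Random(seed)
    return [rng.getrandbits(n) for _ in range(k)]


def parity_sketch(masks, x):
    """Return the k parities of x (an n-bit integer) restricted to each mask.

    Each entry is the inner product <mask, x> over GF_2.
    """
    return [bin(m & x).count("1") % 2 for m in masks]


def combine(sketch_a, sketch_b):
    """Add two sketches over GF_2, i.e. take the entrywise XOR."""
    return [a ^ b for a, b in zip(sketch_a, sketch_b)]


n, k = 64, 16
masks = make_parity_masks(n, k)

x = 0b1011 << 20
y = 0b0110 << 5

# Linearity: the sketch of x XOR y equals the XOR of the two sketches.
combined = combine(parity_sketch(masks, x), parity_sketch(masks, y))
direct = parity_sketch(masks, x ^ y)
```

Because of this linearity, a stream of bit flips or inputs held by different parties can be summarized by XOR-ing the individual sketches; for example, two parties can compare inputs by checking whether the combined sketch is all zeros, which for random masks errs with probability 2^-k when the inputs differ.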
Joint work with Sampath Kannan (UPenn), Elchanan Mossel (MIT), and Swagato Sanyal (TIFR).