Technical Program

Paper Detail

Paper Title Gradient Coding Based on Block Designs for Mitigating Adversarial Stragglers
Paper IdentifierFR3.R2.5
Authors Swanand Kadhe, O. Ozan Koyluoglu, Kannan Ramchandran, University of California, Berkeley, United States
Session Coding for Stragglers
Location Saint Germain, Level 3
Session Time Friday, 12 July, 14:30 - 16:10
Presentation Time Friday, 12 July, 15:50 - 16:10
Manuscript  Click here to download the manuscript
Abstract Distributed implementations of gradient-based methods, wherein a server distributes gradient computations across worker machines, suffer from slow running machines, called 'stragglers'. Gradient coding is a coding-theoretic framework to mitigate stragglers by enabling the server to recover the gradient sum in the presence of stragglers. Approximate gradient codes are variants of gradient codes that reduce computation and storage overhead per worker by allowing the server to approximately reconstruct the gradient sum. In this work, our goal is to construct approximate gradient codes that are resilient to stragglers selected by a computationally unbounded adversary. Our motivation for constructing codes to mitigate adversarial stragglers stems from the challenge of tackling stragglers in massive-scale elastic and serverless systems, wherein it is difficult to statistically model stragglers. Towards this end, we propose a class of approximate gradient codes based on balanced incomplete block designs (BIBDs). We show that the 'approximation error' for these codes depends only on the number of stragglers, and thus, adversarial straggler selection has no advantage over random selection. In addition, the proposed codes admit computationally efficient decoding at the server. Next, to characterize fundamental limits of adversarial straggling, we consider the notion of 'adversarial threshold' -- the smallest number of workers that an adversary must straggle to inflict certain approximation error. We compute a lower bound on the adversarial threshold, and show that codes based on symmetric BIBDs maximize this lower bound among a wide class of codes, making them excellent candidates for mitigating adversarial stragglers.