Paper Title Coded Distributed Computing over Packet Erasure Channels
Authors Dong-Jun Han, Jy-yong Sohn, Jaekyun Moon, Korea Advanced Institute of Science and Technology (KAIST), Korea (South)
Abstract Coded computation is a framework that provides redundancy in distributed computing systems to speed up large-scale tasks. Although most existing works assume error-free scenarios, link failures are common in current wired and wireless networks. In this paper, we consider the straggler problem in distributed computing systems with link failures, modeling the links between the master node and the worker nodes as packet erasure channels. We first analyze the latency in this setting using an (n, k) maximum distance separable (MDS) code. Then, we consider a setup where the number of retransmissions is limited due to a bandwidth constraint. By formulating practical optimization problems related to latency, bandwidth, and the probability of successful computation, we obtain achievable performance curves as a function of the packet erasure probability.
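To give a feel for the latency question the abstract raises, the following is a minimal Monte Carlo sketch, not the paper's analysis. It assumes that each of the n workers has an exponentially distributed computation time, that each transmission attempt takes a fixed time and is erased independently with probability p, that retransmissions are unlimited, and that only the worker-to-master link is modeled. With an (n, k) MDS code, the master can finish as soon as the k fastest results arrive. The function name `simulate_latency` and the parameters `tx_time` and `rate` are hypothetical, introduced only for illustration.

```python
import random


def simulate_latency(n, k, p, trials=10000, tx_time=1.0, rate=1.0, seed=0):
    """Estimate the mean time until the master holds k of n worker results.

    Assumptions (not taken from the paper): exponential computation time
    with the given rate, a fixed time per transmission attempt, independent
    erasures with probability p, and unlimited retransmissions.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("erasure probability p must satisfy 0 <= p < 1")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")

    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        finish_times = []
        for _ in range(n):
            compute = rng.expovariate(rate)  # assumed computation-time model
            attempts = 1
            while rng.random() < p:  # packet erased, so the worker retransmits
                attempts += 1
            finish_times.append(compute + attempts * tx_time)
        finish_times.sort()
        total += finish_times[k - 1]  # arrival time of the k-th fastest result
    return total / trials


if __name__ == "__main__":
    for p in (0.0, 0.1, 0.3, 0.5):
        print(f"p={p:.1f}  estimated latency={simulate_latency(10, 5, p):.3f}")
```

Under these assumptions, the estimated latency grows with the erasure probability p and with k for a fixed n, which is the trade-off the abstract's optimization problems concern.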