We consider constructing capacity-achieving linear codes with minimum message size for private information retrieval (PIR) from N non-colluding databases, where each message is coded using maximum distance separable (MDS) codes, such that it can be recovered from accessing the contents of any T databases. It is shown that the minimum message size (sometimes also referred to as the sub-packetization factor) is significantly, in fact exponentially, lower than previously believed. More precisely, when K > T/ gcd(N, T) where K is the total number of messages in the system and gcd(·, ·) means the greatest common divisor, we establish, by providing both novel code constructions and a matching converse, the minimum message size as lcm(N - T, T), where lcm(·, ·) means the least common multiple. On the other hand, when K is small, we show that it is in fact possible to design codes with a message size even smaller than lcm(N - T, T).
Read full abstract