In a 1985 paper, the computer scientist Andrew Yao, who would go on to win the A.M. Turing Award, asserted that among hash tables with a specific set of properties, the best way to find an individual element or an empty spot is to just go through potential spots randomly, an approach known as uniform probing. He also stated that, in the worst-case scenario, where you’re searching for the last remaining open spot, you can never do better than x. For 40 years, most computer scientists assumed that Yao’s conjecture was true.
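To make the idea of uniform probing concrete, here is a minimal toy sketch in Python. It is not taken from Yao's paper or from the new work. The names `probe_sequence` and `UniformProbingTable` are hypothetical, and it assumes a fixed-size table with no deletions. Each key gets its own pseudo-random ordering of the slots, and the table inserts greedily into the first free slot in that ordering.

```python
import random


def probe_sequence(key, size, seed=0):
    """Hypothetical helper: a pseudo-random permutation of all slots for this key.

    Seeding the generator with the key makes the order repeatable, so a
    lookup retraces exactly the slots that the insertion visited.
    """
    rng = random.Random(f"{seed}:{key}")
    order = list(range(size))
    rng.shuffle(order)
    return order


class UniformProbingTable:
    """Toy open-addressing table (assumption: keys are never deleted)."""

    def __init__(self, size):
        self.slots = [None] * size

    def insert(self, key):
        """Greedy insert: take the first free slot in the key's probe order.

        Returns the number of slots examined.
        """
        order = probe_sequence(key, len(self.slots))
        for probes, slot in enumerate(order, start=1):
            if self.slots[slot] is None:
                self.slots[slot] = key
                return probes
        raise RuntimeError("table is full")

    def contains(self, key):
        """Follow the same probe order; an empty slot means the key is absent."""
        for slot in probe_sequence(key, len(self.slots)):
            if self.slots[slot] is None:
                return False
            if self.slots[slot] == key:
                return True
        return False


if __name__ == "__main__":
    table = UniformProbingTable(1000)
    costs = [table.insert(k) for k in range(1000)]
    # Insertions into a nearly full table tend to examine many more slots.
    print("first 10 inserts:", costs[:10])
    print("last 10 inserts: ", costs[-10:])
```

Running the script shows the pattern the article describes: early insertions usually succeed after one or two probes, while the last few, hunting for the remaining open spots, typically examine a large share of the table.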
Krapivin was not held back by the conventional wisdom for the simple reason that he was unaware of it. “I did this without knowing about Yao’s conjecture,” he said. His explorations with tiny pointers led to a new kind of hash table, one that did not rely on uniform probing. And for this new hash table, the time required for worst-case queries and insertions is proportional to (log x)², which is far faster than x. This result directly contradicted Yao’s conjecture. Farach-Colton and Kuszmaul helped Krapivin show that (log x)² is the optimal, unbeatable bound for the popular class of hash tables Yao had written about.
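The gap between x and (log x)² is easy to underestimate, so a rough numerical comparison may help. The sketch below makes two assumptions that the article does not state: it uses a base-2 logarithm, and it ignores the constant factors hidden in "proportional to". The function name `worst_case_growth` is made up for illustration.

```python
import math


def worst_case_growth(x):
    """Return (x, (log2 x)^2): the two growth rates, constants ignored.

    Assumption: base-2 logarithm. A different base only rescales the
    second value by a constant factor.
    """
    return x, math.log2(x) ** 2


if __name__ == "__main__":
    for x in (16, 1024, 10**6, 10**9):
        linear, polylog = worst_case_growth(x)
        print(f"x = {x:>10}   x -> {linear:>10}   (log x)^2 -> {polylog:8.1f}")
```

For x = 1024, for instance, the second quantity is exactly 100, roughly a tenth of x, and the ratio keeps widening as x grows.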
“This result is beautiful in that it addresses and solves such a classic problem,” said Guy Blelloch of Carnegie Mellon.
“It’s not just that they disproved [Yao’s conjecture], they also found the best possible answer to his question,” said Sepehr Assadi of the University of Waterloo. “We could have gone another 40 years before we knew the right answer.”
In addition to refuting Yao’s conjecture, the new paper also contains what many consider an even more astonishing result. It pertains to a related, though slightly different, situation: In 1985, Yao looked not only at the worst-case times for queries, but also at the average time taken across all possible queries. He proved that hash tables with certain properties (including those that are labeled “greedy,” which means that new elements must be placed in the first available spot) could never achieve an average time better than log x.
Farach-Colton, Krapivin, and Kuszmaul wanted to see if that same limit also applied to non-greedy hash tables. They showed that it did not by providing a counterexample, a non-greedy hash table with an average query time that’s much, much better than log x. In fact, it doesn’t depend on x at all. “You get a number,” Farach-Colton said, “something that is just a constant and doesn’t depend on how full the hash table is.” The fact that you can achieve a constant average query time, regardless of the hash table’s fullness, was wholly unexpected, even to the authors themselves.
The team’s results may not lead to any immediate applications, but that’s not all that matters, Conway said. “It’s important to understand these kinds of data structures better. You don’t know when a result like this will unlock something that lets you do better in practice.”
Original story reprinted with permission from Quanta Magazine, an editorially independent publication of the Simons Foundation whose mission is to enhance public understanding of science by covering research developments and trends in mathematics and the physical and life sciences.