Advisory: Mistakenly collected proxy churn measurements on the Snowflake broker have been deleted

This note, published for transparency, reports that the Snowflake team accidentally recorded some information derived from Snowflake proxy IP addresses in a way that protected privacy less than we intended. The recorded information was never published and has now been deleted.

Background: the Snowflake team wanted to know how often Snowflake proxy IP addresses change (the rate of “proxy churn”). To avoid storing proxy IP addresses, even temporarily, we designed an experiment that uses Bloom filters to measure the overlap of two sets of IP addresses without storing the addresses themselves. We intended to blind each IP address by hashing it with a secret random string, so that the original IP addresses would not be recoverable, even probabilistically, from our published data.
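
To illustrate the general idea, here is a minimal sketch of a Bloom filter whose bit positions come from a keyed hash. It is not the broker's actual code: the class name, the filter size, the number of hashes, the use of HMAC-SHA256 as the blinding function, and the overlap estimator (cardinality estimated from the count of set bits) are all assumptions made for this example.

```python
import hashlib
import hmac
import math
import secrets


class BlindedBloomFilter:
    """Hypothetical Bloom filter; bit positions depend on a secret key."""

    def __init__(self, key: bytes, num_bits: int = 1 << 16, num_hashes: int = 4):
        if num_bits % 8 != 0:
            raise ValueError("num_bits must be a multiple of 8")
        self.key = key
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, ip: str):
        # Blinding step: HMAC the address with the secret key. Without the
        # key, the bit positions for a given address cannot be computed.
        for i in range(self.num_hashes):
            digest = hmac.new(self.key, bytes([i]) + ip.encode(), hashlib.sha256).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, ip: str) -> None:
        for pos in self._positions(ip):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, ip: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(ip))

    def estimate_size(self) -> float:
        # Standard cardinality estimate from the number of set bits.
        set_bits = sum(bin(b).count("1") for b in self.bits)
        return -(self.num_bits / self.num_hashes) * math.log(1 - set_bits / self.num_bits)


def estimate_overlap(a: BlindedBloomFilter, b: BlindedBloomFilter) -> float:
    """Estimate |A ∩ B| as |A| + |B| - |A ∪ B|; both filters must share a key."""
    union = BlindedBloomFilter(a.key, a.num_bits, a.num_hashes)
    union.bits = bytearray(x | y for x, y in zip(a.bits, b.bits))
    return a.estimate_size() + b.estimate_size() - union.estimate_size()


# The key must come from a cryptographically secure source and stay secret.
secret_key = secrets.token_bytes(32)
first_period = BlindedBloomFilter(secret_key)
second_period = BlindedBloomFilter(secret_key)
for i in range(200):
    first_period.add(f"192.0.2.{i}")        # documentation-range addresses
for i in range(100, 300):
    second_period.add(f"192.0.2.{i}")
overlap = estimate_overlap(first_period, second_period)  # true overlap is 100
```

Only the two bit arrays are kept in this sketch; the addresses themselves are discarded as soon as they have been hashed.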

The mistake was that our first deployment did not use a properly random string for this blinding operation. If the measurements had been published, it would have been possible to probabilistically check whether an IP address had been a Snowflake proxy during the time interval covered by the measurements. The file containing the measurements was only ever stored locally on the broker and was never published before being deleted.
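
As a rough guide to what “probabilistically” means here, the textbook approximation for a Bloom filter's false-positive rate is given below. The symbols m (bits in the filter), k (hash functions per element), and n (addresses inserted) are generic; the parameters of the actual deployment are not stated in this advisory.

```latex
p \approx \left(1 - e^{-kn/m}\right)^{k}
```

Someone who could reproduce the blinding could compute the k bit positions for a candidate IP address. If any of those bits were unset, the address was definitely not recorded; if all were set, the address was recorded, except with probability about p.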

If you want to read about the results of the proxy churn experiment (using properly blinded proxy IP addresses), you can find the text here and a graph here.
