Roadmap

This Roadmap describes OpenDP’s core initiatives. The list is not comprehensive: some lines of work don’t fit clearly into an initiative and may not be represented here. The initiatives fall into three categories: In Progress, Planning / Design and Future.


In Progress

What the development team and community are working on right now:

Algorithmic Primitives

We are broadening the suite of supported mechanisms and combinators, both to support use-case partners and to improve the usability and utility of the OpenDP Library. Near-term primitives include:

  • Variations of above threshold and sparse vector
  • Variations of private selection from private candidates
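
To make the first item concrete, here is a minimal sketch of the textbook above threshold algorithm, which is the building block of sparse vector. It is written in plain Python for illustration only and does not use or describe the OpenDP Library API. The function names, the `sensitivity` parameter and the floating-point Laplace sampler are assumptions of this sketch, and a naive floating-point sampler like this one is not suitable for production use.

```python
import random


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials.

    Illustration only: naive floating-point sampling is not safe for
    real differentially private releases.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def above_threshold(query_answers, threshold, epsilon, sensitivity=1.0, rng=None):
    """Return the index of the first query whose noisy answer clears a
    noisy threshold, or None if no query does.

    Hypothetical helper, not an OpenDP function. Each query is assumed
    to change by at most `sensitivity` between neighboring datasets.
    """
    if rng is None:
        rng = random.Random()
    # The threshold is perturbed once, using half of the privacy budget.
    noisy_threshold = threshold + laplace(2.0 * sensitivity / epsilon, rng)
    for index, answer in enumerate(query_answers):
        # Each query answer receives fresh noise from the other half.
        noisy_answer = answer + laplace(4.0 * sensitivity / epsilon, rng)
        if noisy_answer >= noisy_threshold:
            return index  # halt at the first answer above the threshold
    return None
```

The algorithm halts after the first query that clears the threshold, which is why the privacy cost does not grow with the number of queries examined. The variations named in the list above change details such as how many above-threshold answers may be released.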

More Dataframe Functionality

The OpenDP Library integrates with the Polars dataframe library. As of OpenDP 0.11, the Polars integration supports counts, sums, means, medians and quantiles, as well as grouping with protected group keys. The library operates under either the add/remove or the change-one neighboring relation, and uses the Laplace mechanism, the Gaussian mechanism or a variation of the Exponential mechanism as appropriate. We plan to extend this functionality:

  • Automatic rewriting of queries to satisfy differential privacy. This would improve library usability, allow us to privatize standard SQL queries, and ultimately allow a tighter integration with SmartNoise SQL.
  • Support for joins.
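
The choice of neighboring relation mentioned above affects how much noise a query needs. The following sketch shows this for a clamped sum released with Laplace noise. It is plain Python written for illustration, not the OpenDP Polars API. The function names and the string labels for the two relations are assumptions of this sketch, and the floating-point sampler is not safe for real releases.

```python
import random


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials.

    Illustration only: not a secure sampler.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def sum_sensitivity(lower, upper, relation):
    """Sensitivity of a sum of values clamped to [lower, upper].

    Hypothetical helper; the relation labels are assumptions.
    """
    if relation == "add/remove":
        # Adding or removing one row shifts the sum by at most the
        # largest magnitude a clamped value can take.
        return max(abs(lower), abs(upper))
    if relation == "change-one":
        # Replacing one row shifts the sum by at most the clamp width.
        return upper - lower
    raise ValueError(f"unknown neighboring relation: {relation}")


def noisy_sum(values, lower, upper, epsilon, relation, rng=None):
    """Release a clamped sum with Laplace noise scaled to its sensitivity."""
    if rng is None:
        rng = random.Random()
    clamped = [min(max(v, lower), upper) for v in values]
    scale = sum_sensitivity(lower, upper, relation) / epsilon
    return sum(clamped) + laplace(scale, rng)
```

For values clamped to [-5, 10], for example, the sensitivity is 10 under add/remove and 15 under change-one, so the same epsilon yields different noise scales under the two relations.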

More R Package Functionality

While the low-level Framework API is supported in both Python and R, the higher-level interfaces that improve usability (Context API, Polars and Plugins) are not yet supported in R.

DP Wizard

We plan to add support for linear regression and synthetic data in DP Wizard.

Differential Privacy Deployments Registry

  • Develop governance structures to guide the development of http://registry.opendp.org/
  • Document our plans in a whitepaper
  • Continue to improve the deployment descriptions and the UI

Planning / Design

Strategic items that we’ve prioritized and are now designing and testing.

Alternate Dataset Types

Most of the algorithms in the OpenDP Library operate on row-oriented multisets (i.e., tabular data). However, the library was designed to accommodate any type of dataset. We plan to extend it to support other types of datasets, such as graphs and streams.

External Compute

OpenDP functions are currently limited to running on a single CPU. To support larger datasets and more computationally intensive operations, we need to extend the library to support multiple machines and external compute resources. The plan is to provide combinators that execute Polars logical plans representing private computations on other compute backends. This approach allows OpenDP to reuse the privacy calculus already implemented over the Polars Logical Plan DSL.


Future

What we’d like to work on but haven’t yet prioritized.

  • Federated Machine Learning
  • Uncertainty Estimates and Utility Framework
  • Additional Models of Privacy