Release Announcement: Bayeslite 0.1

The MIT Probabilistic Computing Project is pleased to announce a generally available pre-release of bayeslite, a new implementation of BayesDB. BayesDB is a probabilistic programming platform that aims to let users directly query the probable implications of data. This is the first release on a new codeline, a fresh implementation of BayesDB built on top of SQLite and integrated with Python data analysis infrastructure.

For installation instructions and documentation, or to send us feedback, please visit http://probcomp.csail.mit.edu/bayesdb.

Bayeslite 0.1 provides a bare-bones implementation of the Bayesian Query Language (BQL), including:

  • BQL Query Primitives
    • SELECT
    • INFER
    • ESTIMATE
    • SIMULATE
  • BQL Model-based Estimators and Predictors
    • PROBABILITY DENSITY OF <col>
    • CORRELATION WITH <col>
    • SIMILARITY TO <row>
    • PROBABILITY OF DEPENDENCE WITH <col>
    • MUTUAL INFORMATION WITH <col>
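To give a feel for the kind of quantity an estimator such as MUTUAL INFORMATION WITH <col> reports, here is a minimal sketch that computes a plain empirical (plug-in) mutual information between two categorical columns of an SQLite table. It uses only the Python standard library and is not bayeslite's API or BQL syntax. BayesDB's estimators are model-based, while this sketch simply counts rows. The table name `toy`, its columns, and its values are all made up for illustration.

```python
import math
import sqlite3
from collections import Counter


def empirical_mutual_information(pairs):
    """Plug-in estimate of I(X;Y), in nats, from a list of (x, y) pairs."""
    n = len(pairs)
    joint = Counter(pairs)              # counts of each (x, y) combination
    px = Counter(x for x, _ in pairs)   # marginal counts for x
    py = Counter(y for _, y in pairs)   # marginal counts for y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log(p(x,y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


# Hypothetical toy table; these rows are invented, not taken from satellites.bdb.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE toy (orbit TEXT, purpose TEXT)")
conn.executemany(
    "INSERT INTO toy VALUES (?, ?)",
    [("LEO", "imaging"), ("LEO", "imaging"), ("GEO", "comms"), ("GEO", "comms")],
)

rows = conn.execute("SELECT orbit, purpose FROM toy").fetchall()
mi = empirical_mutual_information(rows)
print("empirical mutual information (nats):", mi)
```

In this toy table each column fully determines the other, so the estimate equals log 2; for two columns that vary independently it would be 0. A model-based estimate, as in BQL, can differ from this raw count-based figure, especially on small or incomplete data.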

Bayeslite 0.1 does NOT include support for critical BayesDB features, such as:

  • Diagnostics for model and inference quality
  • Machine-assisted modeling via the Meta-Modeling Language (MML)
  • Integration of user-specified models

As such, it is NOT appropriate for analyzing novel datasets. These features are under active development and will be released as soon as we responsibly can.

Bayeslite 0.1 is designed to integrate with Python data analysis infrastructure, including IPython/Jupyter notebooks, pandas, and Matplotlib.

The 0.1 release is accompanied by an analyzed dataset to explore, satellites.bdb, based on the Union of Concerned Scientists Satellite Data. You are invited to explore this data, help improve its quality, and develop your own applications, queries, and visualization techniques. We welcome submissions of IPython notebooks or database traces that showcase bugs, feature requests, or unique accomplishments.

BayesDB is research software. It is not (yet) recommended for unsupervised use on novel data or in production settings (even if restricted to the data we’ve provided). We are requesting community help in assessing model and inference quality, as well as characterizing the quality of the Meta-Modeling Language (MML) and its associated machine-assisted modeling engine.



Would you like to apply for private alpha?

BayesDB is open source under the Apache License version 2.0. BayesDB research and development is supported by grants from DARPA (under the PPAML program), IARPA, the Office of Naval Research, and the Army Research Laboratory, with additional support from Google and from the Bill and Melinda Gates Foundation. BayesDB is a part of the Venture platform, also supported by PPAML, and includes components from CrossCat, supported by DARPA (under the XDATA program).

First Post

Welcome to the research blog for the MIT Probabilistic Computing Project. We’ll be using this blog to:

  1. Announce research projects, software releases, data analyses, and academic publications.
  2. Present mini tutorials on technical topics, especially probabilistic programming.
  3. Describe and discuss preliminary research that is currently underway, including new ideas and proposals that we believe are of interest to the broader community and would benefit from early feedback.
  4. Illustrate connections between probabilistic computing and work in other fields, especially the social sciences and artificial intelligence.
  5. Share open problems, opportunities for collaboration, and requests for help.