Cellarium AI Platform

Cellarium AI Platform is a cloud-native DataOps and MLOps infrastructure for single-cell biology, inspired by modern AI practices.

What is Cellarium AI Platform?

Cellarium AI Platform is a comprehensive, open-source, cloud-native infrastructure engineered from the ground up to scale single-cell workflows to billion-scale datasets. It streamlines both data operations (“DataOps”) and machine learning operations (“MLOps”) for single-cell omics data, covering model design, rapid prototyping, production-scale execution, and deployment of ML-based single-cell data analysis workflows and applications.

Cellarium DataStore

The Cellarium DataStore, backed by Google BigQuery, a serverless cloud-native data warehouse, enables the ingestion, storage, partitioning, and querying of exabytes of cellular feature matrix data.

Cellarium ML Library

The Cellarium ML library supports the development and training of complex ML models in distributed computing environments. The library currently implements a wide range of transformations and models, including scalable parallel-scan and probabilistic implementations of Principal Component Analysis (PCA), logistic regression, flexible single-cell variational autoencoders, and a family of Transformer-based models. ML pipelines, specified using our flexible domain-specific language (DSL), configure data preprocessing, model training, and model inference tasks. These containerized pipelines can be deployed on horizontally and vertically scaling Kubernetes-based platforms such as Google Vertex AI Pipelines to meet the analysis demands of virtually all data scales.
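To illustrate why PCA can be fitted on data that never sits in memory at once, here is a minimal NumPy sketch. It is not the Cellarium ML implementation: the class name `StreamingPCA` and its methods are hypothetical, and the approach (accumulating a count, a feature sum, and a Gram matrix per mini-batch) is a simple stand-in chosen for clarity. Because these statistics add up, partial results from separate workers can be merged in any order.

```python
import numpy as np


class StreamingPCA:
    """Hypothetical sketch: PCA from additive sufficient statistics."""

    def __init__(self, n_features: int):
        self.n = 0                                       # samples seen so far
        self.total = np.zeros(n_features)                # running sum of rows
        self.gram = np.zeros((n_features, n_features))   # running sum of x x^T

    def partial_fit(self, batch):
        """Fold one mini-batch (rows = cells, columns = features) into the statistics."""
        batch = np.asarray(batch, dtype=np.float64)
        self.n += batch.shape[0]
        self.total += batch.sum(axis=0)
        self.gram += batch.T @ batch
        return self

    def merge(self, other):
        """Combine statistics computed independently, e.g. on another worker."""
        self.n += other.n
        self.total += other.total
        self.gram += other.gram
        return self

    def components(self, k: int):
        """Return (mean, top-k principal axes as rows, their explained variances)."""
        mean = self.total / self.n
        # Sample covariance recovered from the accumulated sums.
        cov = (self.gram - self.n * np.outer(mean, mean)) / (self.n - 1)
        eigvals, eigvecs = np.linalg.eigh(cov)           # ascending eigenvalues
        order = np.argsort(eigvals)[::-1][:k]            # keep the k largest
        return mean, eigvecs[:, order].T, eigvals[order]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(size=(1000, 6))
    worker_a, worker_b = StreamingPCA(6), StreamingPCA(6)
    for chunk in np.array_split(data[:500], 5):
        worker_a.partial_fit(chunk)
    for chunk in np.array_split(data[500:], 5):
        worker_b.partial_fit(chunk)
    mean, axes, variances = worker_a.merge(worker_b).components(k=2)
    print(axes.shape, variances)
```

The Gram matrix grows quadratically with the number of features, so this sketch only suits modest feature counts. Handling tens of thousands of genes would call for a different formulation.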

Cellarium Microservices

Cellarium AI Platform additionally supports various auto-scaling microservices, including approximate nearest neighbor (ANN) search for swift similarity-based querying of embedded cellular features and model serving for hosting large models.
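As a point of reference for what an ANN service approximates, the following is a minimal sketch of exact top-k cosine similarity search over embedded cells. The function name `top_k_cosine` is hypothetical and this is not the platform's API. An ANN index returns approximately the same neighbors while avoiding the full comparison against every stored vector.

```python
import numpy as np


def top_k_cosine(query, index, k):
    """Exact search: for each query row, return the k most similar index rows.

    Returns (neighbor indices, cosine similarities), both of shape
    (n_queries, k), sorted from most to least similar.
    """
    query = np.asarray(query, dtype=np.float64)
    index = np.asarray(index, dtype=np.float64)
    # Normalize rows so that a dot product equals cosine similarity.
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    x = index / np.linalg.norm(index, axis=1, keepdims=True)
    sims = q @ x.T
    # argpartition selects the k best candidates without a full sort.
    part = np.argpartition(-sims, k - 1, axis=1)[:, :k]
    part_sims = np.take_along_axis(sims, part, axis=1)
    # Sort only those k candidates by descending similarity.
    order = np.argsort(-part_sims, axis=1)
    return (np.take_along_axis(part, order, axis=1),
            np.take_along_axis(part_sims, order, axis=1))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    reference = rng.normal(size=(10_000, 32))   # stand-in for stored cell embeddings
    queries = rng.normal(size=(3, 32))          # stand-in for newly embedded cells
    neighbors, similarities = top_k_cosine(queries, reference, k=5)
    print(neighbors.shape, similarities.shape)
```

The cost of this exact search grows linearly with the number of stored embeddings per query, which is the motivation for approximate indexes at large scale.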

Cellarium Applications

Cellarium AI Platform components culminate in Cellarium Applications: robust and secure scientific web services that accelerate the pace of cell biology research using the power of modern AI and vast amounts of high-throughput cell biology data. The first Cellarium AI Platform application is Cell Annotation Service (CAS), a single-cell transcriptomics inverse search engine.

Where to go next?

Cellarium AI Platform is open-source and is actively developed on GitHub under the 3-Clause BSD license. The platform is currently used internally by Broad researchers and application developers. We are actively working on improving the platform’s documentation and user experience toward the goal of making the platform available as a deployable Infrastructure-as-a-Service (IaaS). Meanwhile, we plan to make individual applications built using the platform, such as Cell Annotation Service (CAS), publicly available to the community. Please stay tuned for more!

Affiliated labs

Broad Institute

Funding

Broad Institute

Team

Mehrtash Babadi

Stephen Fleming

Yerdos Ordabayev

Fedor Grab

Kevin Lydon

Bingxing Huo

Kristian Cibulskis

Timothy Tickle

Brice Wang

Yang Xu

Nicolas “Muscles” Malfroy-Camine
