Standardizing Benchmarks in Spatial Omics with OpenProblems

Author

Sai Nirmayi Yasa

Published

November 4, 2024

Spatial omics is an emerging field that lets researchers unravel the complexities of cellular organization and function within the native tissue context. Because the field spans a wide range of technologies and techniques, no single data analysis method fits every case. Formalized, standardized benchmarks are therefore needed to guide researchers in selecting the right methods for their analyses. This is where OpenProblems comes in. Its success in benchmarking computational tasks in single-cell analysis is attracting the attention of the spatial omics community, and a growing number of researchers are contributing their own benchmarks to the platform.

New Spatial Omics Tasks in OpenProblems

Several new benchmarking tasks related to spatial omics are being introduced:

  • Spatial Decomposition: Carried over from the first version of OpenProblems, this task focuses on methods that estimate the composition of cell types/states that are present at each capture location.

  • Spatially Variable Genes: This recently released task focuses on methods that detect genes whose expression levels vary across spatial regions.

  • Spatial Simulators: This task, currently in development, assesses methods that simulate spatial transcriptomics data. These simulators are crucial for validating analytical methods when gold-standard datasets with ground-truth labels are scarce or unavailable.

  • Image-Based Spatial Transcriptomics Preprocessing: A benchmark is being developed to evaluate approaches for preprocessing image-based spatial transcriptomics data. The other tasks use AnnData as the primary format for standardising datasets; for this task, efforts are also being made to support the SpatialData format, improving interoperability across methods in the spatial omics field.
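To make the Spatially Variable Genes task more concrete, here is a minimal sketch of one classic spatial autocorrelation statistic, Moran's I, computed on a toy grid of capture locations. This is an illustration only: Moran's I is not necessarily among the methods in the benchmark, and the grid layout, the four-neighbour adjacency, and both example "genes" are assumptions made up for this sketch. It uses only the Python standard library.

```python
# Toy illustration only: Moran's I on a regular grid of capture locations.
# The grid, the adjacency rule, and the example genes are assumptions.
import random


def grid_neighbors(n_rows, n_cols):
    """Map each spot index to its up/down/left/right neighbours on the grid."""
    neighbors = {}
    for r in range(n_rows):
        for c in range(n_cols):
            adjacent = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    adjacent.append(rr * n_cols + cc)
            neighbors[r * n_cols + c] = adjacent
    return neighbors


def morans_i(values, neighbors):
    """Moran's I with binary weights (1 for adjacent spots, 0 otherwise)."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    # Sum of products of centered values over all (directed) neighbour pairs.
    cross = sum(centered[i] * centered[j]
                for i, adj in neighbors.items() for j in adj)
    total_weight = sum(len(adj) for adj in neighbors.values())
    variance_sum = sum(z * z for z in centered)
    return (n / total_weight) * cross / variance_sum


N_ROWS, N_COLS = 10, 10
neighbors = grid_neighbors(N_ROWS, N_COLS)

# A gene whose expression increases smoothly from left to right.
gradient_gene = [float(c) for r in range(N_ROWS) for c in range(N_COLS)]

# The same values with their spatial arrangement destroyed.
shuffled_gene = gradient_gene[:]
random.Random(0).shuffle(shuffled_gene)

score_gradient = morans_i(gradient_gene, neighbors)
score_shuffled = morans_i(shuffled_gene, neighbors)
print(f"gradient gene: {score_gradient:.3f}")
print(f"shuffled gene: {score_shuffled:.3f}")
```

The spatially structured gene scores close to 1, while the shuffled copy scores much lower, even though both contain exactly the same expression values. Methods in this task differ in how they define and test for such structure.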

Looking Ahead

OpenProblems aims to provide a platform for developing comprehensive and continuously updated benchmarks for various spatial omics analysis tasks. With a unified and extensible framework, benchmarking results become not only reproducible but also reusable, serving as foundational resources for developing other tasks.

A key example of this is the development of the spatial simulators task. The current lack of comprehensive gold-standard datasets with well-defined ground truth has led each spatial omics task to rely on its own distinct datasets for evaluation, causing inconsistencies in testing across tasks. Once the spatial simulator benchmark is fully developed, the best simulator could generate a set of standardised, high-quality synthetic datasets, providing a common resource that could be used across multiple tasks in the absence of real-world gold standards.

Similarly, the best approach from the image-based spatial transcriptomics preprocessing task could set the stage for more standardised workflows in spatial omics, allowing researchers to follow more rigorous, reproducible pipelines and ultimately accelerating discovery in this field.
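The value of simulated data with known ground truth, as described for the spatial simulators task, can be illustrated with a deliberately tiny sketch. It simulates capture locations as mixtures of two hypothetical cell types, estimates the mixing proportion at each location, and scores the estimate against the known truth. Everything here is an assumption made for illustration: the two expression signatures, the Gaussian noise model, the least-squares estimator, and the RMSE score are not taken from OpenProblems or from any particular simulator.

```python
# Toy illustration only: simulated spots with known cell-type proportions.
# Signatures, noise model, and estimator are assumptions for this sketch.
import math
import random

# Hypothetical mean expression of three genes in two cell types.
SIGNATURE_A = [10.0, 1.0, 5.0]
SIGNATURE_B = [1.0, 10.0, 5.0]


def simulate_spots(n_spots, noise_sd, rng):
    """Return (true proportions of type A, simulated expression per spot)."""
    true_props, expression = [], []
    for _ in range(n_spots):
        p = rng.random()  # ground-truth fraction of cell type A in this spot
        spot = [p * a + (1.0 - p) * b + rng.gauss(0.0, noise_sd)
                for a, b in zip(SIGNATURE_A, SIGNATURE_B)]
        true_props.append(p)
        expression.append(spot)
    return true_props, expression


def estimate_proportion(spot):
    """Least-squares estimate of the type A fraction, clipped to [0, 1]."""
    diff = [a - b for a, b in zip(SIGNATURE_A, SIGNATURE_B)]
    numerator = sum((x - b) * d for x, b, d in zip(spot, SIGNATURE_B, diff))
    denominator = sum(d * d for d in diff)
    return min(1.0, max(0.0, numerator / denominator))


def rmse(truth, estimate):
    """Root mean squared error between two equally long sequences."""
    squared = [(t - e) ** 2 for t, e in zip(truth, estimate)]
    return math.sqrt(sum(squared) / len(squared))


rng = random.Random(42)
true_props, expression = simulate_spots(n_spots=50, noise_sd=0.5, rng=rng)
estimated_props = [estimate_proportion(spot) for spot in expression]
rmse_value = rmse(true_props, estimated_props)
print(f"RMSE against simulated ground truth: {rmse_value:.3f}")
```

Because the true proportions are known by construction, the estimate can be scored directly, which is not possible on most real datasets. A shared set of simulated datasets would let different tasks be evaluated in this way against the same ground truth.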
