Background
Neuron instance segmentation is an essential step in connectomics, the reconstruction of brain wiring diagrams at synapse-level resolution. Manual correction of automated segmentations is expensive: with currently available methods, proofreading the connectome of a single mouse brain is estimated to cost billions of dollars, so automated methods must improve. However, progress is hard to measure because established large-scale benchmark datasets are lacking, which potentially slows the development of better methods.
Existing benchmarks are limited:
- Size Constraints: Benchmarks like CREMI or SNEMI3D are saturated: they provide only small image volumes, making it difficult to reliably estimate false merge rates for modern methods, which already achieve low error rates.
- Resource Intensity: Larger datasets, such as the songbird volume ("j0126") used to evaluate advanced methods like FFN and LSD, require significant computational and development resources to segment, limiting accessibility for computationally constrained research groups.
- Limited Ground Truth: These large datasets still provide only limited ground truth for training (dense cubes) and evaluation (skeletons), covering a small subset of the data because high-quality manual annotation is expensive. Additionally, human-generated ground truth suffers from label noise.
Synthetic Benchmark
We remedy this by providing the first large-scale synthetic benchmark datasets for neuron instance segmentation. We generate the data by first using procedural generation to create segmentations (random walks with branches for the "neurons", plus some post-processing), and then using novel 3D diffusion models conditioned on the segmentation to generate realistic corresponding images. Since the segmentations are generated before the images, our datasets come with noise-free labels, in contrast to error-prone human annotations. The synthetic data generation is cost-effective (<200 USD per 27 µm cube at 9×9×20 nm voxel size, using rented GPUs).
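To make the procedural step concrete, here is a minimal sketch of a branching 3D random walk rasterized into a label volume. All parameter names and values are illustrative assumptions, not the benchmark's actual settings, and the post-processing (e.g. dilating skeletons to realistic neurite radii) is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_walk_skeleton(shape, n_points=500, branch_prob=0.01, step=1.0):
    """Grow one 'neuron' skeleton as a branching 3D random walk.

    Returns a list of float positions visited. Parameters are
    illustrative placeholders.
    """
    points = []
    # stack of (position, direction) walk heads; start at the volume centre
    heads = [(np.array(shape, float) / 2, rng.normal(size=3))]
    while heads and len(points) < n_points:
        pos, direction = heads.pop()
        # smooth trajectory: perturb the previous direction slightly
        direction = direction + rng.normal(scale=0.3, size=3)
        direction /= np.linalg.norm(direction)
        pos = pos + direction * step
        if not all(0 <= p < s for p, s in zip(pos, shape)):
            continue  # walk left the volume; drop this head
        points.append(pos)
        heads.append((pos, direction))
        if rng.random() < branch_prob:
            heads.append((pos, rng.normal(size=3)))  # start a side branch
    return points

def rasterize(shape, walks):
    """Paint each walk's points into an instance label volume."""
    seg = np.zeros(shape, dtype=np.uint32)
    for label, pts in enumerate(walks, start=1):
        for p in pts:
            seg[tuple(np.floor(p).astype(int))] = label
    return seg

shape = (64, 64, 64)
walks = [random_walk_skeleton(shape) for _ in range(3)]
seg = rasterize(shape, walks)
```

In the actual pipeline, a segmentation volume produced this way would then condition the 3D diffusion model that synthesizes the matching EM-like image.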
Advantages
- Expanded Training Data: We essentially eliminate training data limitations for our benchmark, providing more than 100x the voxel-wise segmentation ground truth used in recent trainings in a single dataset. This allows for fair model comparison in a largely data-unconstrained training setting.
- Extensive Evaluation: Our test cubes are large enough to provide ~100 mm of path length each, enabling meaningful comparisons.
- Computational Accessibility: At the same time, the cubes are small enough to be processed with reasonable resources (~2 hours on 1 GPU for our baseline), without requiring complex distributed inference setups.
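The path-length figure above refers to summed skeleton length at the dataset's anisotropic voxel size. The benchmark's exact skeleton format is not specified here; the following is a generic sketch of how path length is typically computed from skeleton nodes (in voxel coordinates) and edges.

```python
import numpy as np

# anisotropic voxel size from the dataset description: 9 x 9 x 20 nm
VOXEL_SIZE_NM = np.array([9.0, 9.0, 20.0])

def path_length_nm(nodes, edges):
    """Total path length of one skeleton: the sum of Euclidean edge
    lengths after scaling voxel coordinates to nanometres."""
    nodes = np.asarray(nodes, dtype=float) * VOXEL_SIZE_NM
    return sum(np.linalg.norm(nodes[a] - nodes[b]) for a, b in edges)

# toy skeleton: three nodes in a straight line along the z axis
nodes = [(0, 0, 0), (0, 0, 1), (0, 0, 2)]
edges = [(0, 1), (1, 2)]
total = path_length_nm(nodes, edges)  # 2 edges x 20 nm = 40 nm
```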
Team
Franz Rieger, Ana-Maria Lăcătușu, Zuzana Urbanová, Andrei Mancu, Hashir Ahmad, Martin Bucella, and Joergen Kornfeld @ Max Planck Institute for Biological Intelligence. Data generously hosted by the Max Planck Computing and Data Facility. We are grateful to Alexandra Rother and Jonas Hemesath for their help with rendering.
Details on data generation and diffusion models will be published soon. Until then, please use the following BibTeX to cite this benchmark:
@misc{https://doi.org/10.17617/1.r2mm-1h33,
  doi       = {10.17617/1.R2MM-1H33},
  url       = {https://structuralneurobiologylab.github.io/nisb/},
  author    = {Rieger, Franz and Lăcătușu, Ana-Maria and Urbanová, Zuzana and Mancu, Andrei and Ahmad, Hashir and Bucella, Martin and Kornfeld, Joergen},
  title     = {NISB: Neuron Instance Segmentation Benchmark},
  publisher = {Max Planck Institute for Biological Intelligence},
  year      = {2024}
}
This work is licensed under a CC BY-SA 4.0 license.