SuperFlex

Abstract

While recent advances like SuperDec have revitalized superquadrics for compact 3D scene understanding, they remain limited by rigid geometric constraints and lack robustness to partial observations. We present SuperFlex, an enhanced framework expanding the expressive power and applicability of superquadric decompositions. First, we include bending and tapering parameters, enabling high-fidelity representation of complex geometries. Second, a novel loss formulation and optimization scheme refines initial decompositions, significantly improving accuracy. These high-fidelity primitives enable training a robust model for direct prediction from noisy, partial real-world point clouds, a task previously infeasible for methods requiring full point clouds.

Method Overview

Overview of SuperFlex. Given an input point cloud, SuperFlex decomposes the object into a set of superquadric primitives, each defined by pose, shape, and deformation parameters. The model is trained with self-supervised joint volumetric and surface losses. An optional, object-specific optimization stage can then refine the decomposition further using the same losses.
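To make the primitive parameterization concrete, the following is a minimal sketch of the standard superquadric inside-outside function combined with a linear taper along the z-axis. It assumes the classical formulation from the superquadric literature; the exact parameterization used by SuperFlex is not specified here, bending is omitted for brevity, and the names `superquadric_implicit`, `scale`, `exponents`, and `taper` are illustrative.

```python
import numpy as np

def superquadric_implicit(points, scale, exponents, taper=(0.0, 0.0)):
    """Inside-outside value F of a z-tapered superquadric in its canonical frame.

    F < 1 inside, F == 1 on the surface, F > 1 outside.
    points:    array of shape (..., 3), already expressed in the primitive's frame
    scale:     (a1, a2, a3) semi-axis lengths
    exponents: (e1, e2) shape exponents
    taper:     (kx, ky) linear tapering factors, assumed to satisfy |k| < 1
    """
    points = np.asarray(points, dtype=float)
    a1, a2, a3 = scale
    e1, e2 = exponents
    kx, ky = taper
    x, y, z = points[..., 0], points[..., 1], points[..., 2]

    # Undo the taper: the deformed cross-section at height z is scaled by
    # (k * z / a3 + 1). The clamp keeps the division finite for points far
    # beyond the primitive's extent along z.
    eps = 1e-6
    x = x / np.maximum(kx * z / a3 + 1.0, eps)
    y = y / np.maximum(ky * z / a3 + 1.0, eps)

    # Standard superquadric inside-outside function on the undeformed point.
    xy = (np.abs(x / a1) ** (2.0 / e2) + np.abs(y / a2) ** (2.0 / e2)) ** (e2 / e1)
    return xy + np.abs(z / a3) ** (2.0 / e1)

# With e1 = e2 = 1 and unit scale the primitive is a unit sphere.
query = [1.0, 0.0, 0.5]
f_rigid = superquadric_implicit(query, (1, 1, 1), (1, 1))
f_tapered = superquadric_implicit(query, (1, 1, 1), (1, 1), taper=(0.5, 0.5))
```

In this example the query point lies outside the rigid sphere (`f_rigid > 1`) but inside the tapered primitive (`f_tapered < 1`), because a positive taper widens the cross-section at positive z.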

Results

Object-Level Results

We evaluate how well SuperFlex decomposes individual objects on ShapeNet, a standard object dataset.

Our SuperFlex model significantly outperforms all learning- and optimization-based methods, particularly in terms of IoU. The only method that achieves an IoU similar to that of our base model is Marching Primitives [4], which, however, uses more than 4x as many primitives and is 10,000x slower.

We compare with five baselines: SQ [1], CSA [2], EMS [3], Marching Primitives [4], SuperDec [5].
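For readers unfamiliar with the metric, here is a minimal sketch of how a volumetric IoU can be estimated from occupancy samples. The evaluation protocol behind the reported numbers is not described on this page, so the sampling scheme is an assumption, and the helper names `occupancy_iou` and `union_occupancy` are hypothetical.

```python
import numpy as np

def union_occupancy(per_primitive_values):
    """Occupancy of a decomposition at a set of query points.

    per_primitive_values: array of shape (num_primitives, num_points) holding
    the inside-outside value of every primitive at every query point.
    A point counts as occupied if at least one primitive has value <= 1.
    """
    return (np.asarray(per_primitive_values, dtype=float) <= 1.0).any(axis=0)

def occupancy_iou(occ_pred, occ_gt):
    """Intersection over Union between two boolean occupancy arrays."""
    occ_pred = np.asarray(occ_pred, dtype=bool)
    occ_gt = np.asarray(occ_gt, dtype=bool)
    union = np.logical_or(occ_pred, occ_gt).sum()
    if union == 0:
        return 1.0  # both shapes are empty at the samples: treat as agreement
    intersection = np.logical_and(occ_pred, occ_gt).sum()
    return float(intersection) / float(union)

# Toy example: two primitives evaluated at three query points.
values = [[0.5, 2.0, 3.0],
          [2.0, 0.9, 3.0]]
pred = union_occupancy(values)       # occupied at the first two points
gt = np.array([True, False, False])  # ground truth occupied at the first only
iou = occupancy_iou(pred, gt)
```

The query points would typically be sampled on a regular grid or uniformly inside the object's bounding box; denser sampling gives a tighter estimate of the true volumetric IoU.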

shapenet_quantitative — **Quantitative Results on ShapeNet**. We evaluate the accuracy of the reconstruction in terms of Intersection over Union (IoU) and the compactness of the representation in terms of number of primitives.

shapenet_quantitative_scatter — **Quantitative Results on ShapeNet**. We evaluate the accuracy of the reconstruction in terms of Intersection over Union (IoU) and the compactness of the representation in terms of number of primitives.

shapenet_qualitative — **Qualitative Results on ShapeNet**. We show results on test samples for in-category classes *(first four columns)* and out-of-category classes *(last two columns)*. The latter were not seen during training and illustrate how well the models generalize to novel classes.


We also show that our refinement stage, which is much faster than prior optimization methods, further enhances the predictions of our feed-forward model, yielding a 22% improvement over the unrefined predictions.

optimization_qualitative — **Qualitative Results on ShapeNet**. We illustrate how the refinement step improves the quality of the fitted shapes.

optimization_quantitative — **Quantitative Results on ShapeNet**. We evaluate the accuracy of the reconstruction in terms of Intersection over Union (IoU) and the speed of the optimization in terms of runtime.

optimization_quantitative_scatter — **Quantitative Results on ShapeNet**. We evaluate the accuracy of the reconstruction in terms of Intersection over Union (IoU) and the speed of the optimization in terms of runtime.


Partial Point Clouds

We evaluate reconstruction under realistic sensing conditions, both quantitatively and qualitatively. For the quantitative evaluation we use the Aria Synthetic Environments (ASE) dataset, since access to the ground-truth 3D models makes it possible to compute all relevant metrics. ASE contains scans of complex indoor scenes composed of objects from the Amazon Berkeley Objects (ABO) dataset. We use the provided depth maps together with the instance segmentation masks to extract object point clouds that exhibit viewpoint-dependent incompleteness, including self-occlusion, partial visibility, and occlusion by surrounding objects. These point clouds are the input to our model, and we evaluate its predictions against the ground-truth 3D models from ABO.
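The extraction of a partial object point cloud from a depth map and an instance mask can be sketched as follows. This is an illustration under stated assumptions, not the actual preprocessing code: it assumes a pinhole camera with known intrinsics, depth stored as distance along the optical axis, and a boolean mask for a single instance. The function name `backproject_masked_depth` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical.

```python
import numpy as np

def backproject_masked_depth(depth, mask, fx, fy, cx, cy):
    """Lift the masked pixels of a depth map to 3D points in the camera frame.

    depth: (H, W) array of z-depth values (assumption: pinhole model)
    mask:  (H, W) boolean array selecting the pixels of one object instance
    fx, fy, cx, cy: pinhole intrinsics in pixels
    Returns an (N, 3) array with one point per valid masked pixel.
    """
    depth = np.asarray(depth, dtype=float)
    mask = np.asarray(mask, dtype=bool)

    # Keep only masked pixels with a usable depth measurement.
    valid = mask & np.isfinite(depth) & (depth > 0)
    v, u = np.nonzero(valid)  # v: row index, u: column index

    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)

# Toy example: a 2x2 depth map in which one masked pixel has no depth.
depth = np.array([[2.0, 2.0],
                  [0.0, 2.0]])
mask = np.array([[False, True],
                 [True, False]])
points = backproject_masked_depth(depth, mask, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
```

Because only surfaces visible from the camera produce depth values, the resulting cloud covers just the observed side of the object, which is the kind of viewpoint-dependent incompleteness described above.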

occlusions_qualitative — **Qualitative Results on ASE.** Top row: input point clouds. Below: the outputs of SuperFlex and our robust method.

occlusions_quantitative — **Quantitative Results on ASE.** We evaluate reconstruction performance under partial observations across all metrics.


We also visualize the results of our robust method on some scenes from ScanNet++.

**Qualitative results on ScanNet++ scenes.** Top row shows renderings of the original point clouds, bottom row shows the superquadric representation obtained with SuperFlex.

Applications

We envision SuperFlex as an interpretable and scalable framework for efficient scene understanding in complex environments, combining non-rigid deformable primitives with a robust model for partial point clouds.

3D Scene Decomposition from RGB Frames

Inspired by how SAM3D reconstructs multi-object scenes from a single RGB frame, we explore how SuperFlex can be used in the same monocular setting. We propose a modular pipeline that uses SAM3 for object segmentation and MoGE to lift the 2D masks into partial 3D point clouds. Although single-view observations suffer from inherent self-occlusion, our robust model, fine-tuned with occlusion augmentations, can still infer complete object geometries as sets of coherent deformable primitives.

**SuperFlex from a single RGB Image.** From an input RGB frame, we first compute instance point maps using SAM3 and MoGE. Then, we apply SuperFlex on top of each object’s point map.

These results suggest that SuperFlex has the potential to serve as a robust, compact, and interpretable alternative to traditional full-scene reconstructions even from monocular observations.

References

[1] Despoina Paschalidou, Ali Osman Ulusoy, and Andreas Geiger. Superquadrics revisited: Learning 3D shape parsing beyond cuboids. In Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
[2] Kaizhi Yang and Xuejin Chen. Unsupervised learning for cuboid shape abstraction via joint segmentation from point clouds. In ACM Transactions on Graphics (TOG), 2021.
[3] Weixiao Liu, Yuwei Wu, Sipu Ruan, and Gregory S. Chirikjian. Robust and accurate superquadric recovery: A probabilistic approach. In Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
[4] Weixiao Liu, Yuwei Wu, Sipu Ruan, and Gregory S. Chirikjian. Marching-primitives: Shape abstraction from signed distance function. In Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
[5] Elisabetta Fedele, Boyang Sun, Leonidas Guibas, Marc Pollefeys, and Francis Engelmann. SuperDec: 3D scene decomposition with superquadric primitives. In International Conference on Computer Vision (ICCV), 2025.

BibTeX

@inproceedings{superflex,
  title={{SuperFlex: Deformable Superquadrics for Point Cloud Decomposition}},
  author={Tavernini, Gabriel and Fedele, Elisabetta and Novello, Tiago 
        and Guibas, Leonidas and Pollefeys, Marc and Engelmann, Francis},
  booktitle={European Conference on Computer Vision (ECCV)},
  year={2026}
}