Why Plenoxels?
Several projects reduce the training time of neural field optimization, including DVGO-v1, DVGO-v2, Plenoxels, Instant Neural Graphics Primitives (INGP), TensoRF, and PointNeRF. Here, we compare these recent fast-training NeRF models and explain why Plenoxels is a suitable representation for perception datasets.
Reason 1: Plenoxels is a fully explicit representation
According to the DVGO-v2 paper, Improved Direct Voxel Grid Optimization, Plenoxels is the only representation that uses explicit features only. In other words, Plenoxels directly stores a density volume and represents view-dependent colors with spherical harmonics coefficients, with no neural network involved in decoding either quantity.
Method | Data structure | Density | Color | Training Time |
---|---|---|---|---|
DVGO | Dense Grid | Explicit | Hybrid | < 30min |
DVGO-v2 | Dense Grid | Explicit | Hybrid | < 20min |
Plenoxels | Sparse Grid | Explicit | Explicit | < 30min |
INGP | Multi-level Hash | Hybrid | Hybrid | < 5min |
TensoRF | Dense Grid | Explicit | Hybrid | < 30min |
PointNeRF | Point Cloud | Explicit | Explicit | > 1 day |
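
To make "explicit" concrete, the following minimal sketch shows how a view-dependent color could be read from stored spherical harmonics coefficients without any network. It assumes degree-2 real spherical harmonics (9 coefficients per color channel); the names `sh_basis` and `voxel_color`, the example voxel, and the final `+ 0.5` shift with clamping are illustrative assumptions, not the project's actual code.

```python
import math

# Standard constants of the real spherical harmonics basis, degrees 0 to 2.
C0 = 0.28209479177387814
C1 = 0.4886025119029199
C2 = (1.0925484305920792, -1.0925484305920792, 0.31539156525252005,
      -1.0925484305920792, 0.5462742152960396)


def sh_basis(direction):
    """Evaluate the 9 degree-2 SH basis functions for a viewing direction."""
    x, y, z = direction
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm  # the basis expects a unit vector
    return [
        C0,
        -C1 * y, C1 * z, -C1 * x,
        C2[0] * x * y,
        C2[1] * y * z,
        C2[2] * (2.0 * z * z - x * x - y * y),
        C2[3] * x * z,
        C2[4] * (x * x - y * y),
    ]


def voxel_color(sh_coeffs, direction):
    """Decode RGB from per-channel SH coefficients (3 lists of 9 floats).

    The color is a plain weighted sum of basis functions. The +0.5 shift
    and the clamp to [0, 1] are assumptions made for this sketch.
    """
    basis = sh_basis(direction)
    rgb = []
    for channel in sh_coeffs:
        raw = sum(c * b for c, b in zip(channel, basis))
        rgb.append(min(max(raw + 0.5, 0.0), 1.0))
    return rgb


# Hypothetical voxel: one density value plus 3 x 9 SH coefficients.
voxel = {
    "density": 12.0,
    "sh": [
        [1.0, 0.0, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],   # red
        [0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],   # green
        [-0.5, 0.0, -0.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],  # blue
    ],
}

color_from_above = voxel_color(voxel["sh"], (0.0, 0.0, 1.0))
color_from_side = voxel_color(voxel["sh"], (1.0, 0.0, 0.0))
```

Because every quantity is a stored number, the color for any viewing direction is a dot product followed by a clamp. The hybrid entries in the table instead pass stored features through a small network before producing a color.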
Reason 2: Great reconstruction quality
Plenoxels reconstructs scenes well compared to the other methods in both indoor and outdoor scenarios. We randomly pick five sequences each from CO3D and ScanNet and report the rendering quality and training time for each method. We compare Plenoxels with DVGO-v2 because DVGO-v2 has shown comparable performance on outdoor scenes. We could not use the other methods as our data format because 1) INGP encodes geometry implicitly and does not cover unbounded scenes, 2) TensoRF and DVGO-v1 have no representation for backgrounds, and 3) PointNeRF takes a long time to optimize. For DVGO-v2, we follow the Tanks and Temples setup.
We will soon add experiments on the reconstruction abilities of recent methods.