Most NeRF methods assume that training and test-time cameras capture scene content from a roughly constant distance.
They degrade and render blurry views in less constrained settings.
This is because NeRF is scale-unaware: it reasons about point samples instead of volumes. We address this by training a pyramid of NeRFs that divide the scene at different resolutions. We use "coarse" NeRFs for far-away samples and finer NeRFs for close-up samples.
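
As a rough illustration of the coarse-versus-fine assignment, the sketch below picks a pyramid level from the approximate footprint of a sample, taken here as distance times the angular width of a pixel. The function name `select_level`, the parameters `pixel_angle` and `finest_cell`, and the power-of-two spacing between levels are all assumptions made for this example; the exact selection rule used by the method may differ.

```python
import math


def select_level(distance, pixel_angle, finest_cell, num_levels):
    """Pick a pyramid level for one sample (hypothetical rule, for illustration).

    Level 0 is the finest NeRF; each higher level is assumed to cover cells
    twice as large as the one below it.
    """
    # Approximate width of the pixel's viewing cone at this distance.
    footprint = distance * pixel_angle
    # Number of doublings of the finest cell size needed to reach the footprint.
    level = math.log2(max(footprint, 1e-12) / finest_cell)
    # Clamp to the levels that actually exist in the pyramid.
    return min(max(math.floor(level), 0), num_levels - 1)


# Close-up samples map to fine levels, far-away samples to coarse ones.
for d in (5.0, 50.0, 1000.0):
    print(d, select_level(d, pixel_angle=0.001, finest_cell=0.01, num_levels=4))
```

Under these assumptions, a sample whose footprint is smaller than the finest cell always falls back to level 0, and anything larger than the coarsest cell is handled by the last level.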