We present SharpDepth, a diffusion-based model that refines the predictions of metric depth estimators such as UniDepth without relying on ground-truth depth data. Our method recovers sharp details in thin structures and improves overall point-cloud quality.
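Since the comparison centers on point-cloud quality, the sketch below shows how a metric depth map is back-projected into a 3D point cloud with a standard pinhole camera model; the intrinsics values and function name are illustrative, not part of the released code.

```python
import numpy as np


def depth_to_point_cloud(depth: np.ndarray, fx: float, fy: float,
                         cx: float, cy: float) -> np.ndarray:
    """Back-project a metric depth map (meters) into an N x 3 point cloud
    using the pinhole camera model."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # drop invalid (zero-depth) pixels


if __name__ == "__main__":
    # Illustrative intrinsics for a 640 x 480 image; real values come from
    # camera calibration or from the depth model's intrinsics prediction.
    depth = np.random.uniform(0.5, 10.0, size=(480, 640)).astype(np.float32)
    cloud = depth_to_point_cloud(depth, fx=525.0, fy=525.0, cx=319.5, cy=239.5)
    print(cloud.shape)  # (307200, 3)
```

Sharper, better-localized depth discontinuities translate directly into cleaner point clouds, since smeared edges become "flying pixels" stretched between foreground and background surfaces.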
The gallery below presents several in-the-wild images from the internet and compares SharpDepth with previous state-of-the-art metric depth estimators such as UniDepth. Use the slider and gestures to reveal details on either side.
Quantitative comparison of SharpDepth with state-of-the-art metric depth estimators on several zero-shot benchmarks. Our method achieves accuracy comparable to dedicated metric depth models. Further evaluation on synthetic datasets (Sintel, UnrealStereo, and Spring) and a real dataset (iBims) shows that our method significantly outperforms UniDepth in both edge accuracy and edge completeness. By leveraging priors from the pre-trained diffusion model, our approach produces sharper depth discontinuities while maintaining high accuracy across datasets. In contrast, discriminative methods often produce overly smooth edges, leading to higher completeness errors.
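To make the edge metrics concrete, here is a minimal sketch in the spirit of depth boundary accuracy and completeness errors (as used, e.g., on iBims): truncated chamfer-style distances between binary edge maps. The function name, the truncation threshold, and the exact edge-extraction step are assumptions; the benchmark's official protocol may differ in detail.

```python
import numpy as np
from scipy import ndimage


def depth_boundary_errors(pred_edges: np.ndarray, gt_edges: np.ndarray,
                          max_dist: float = 10.0) -> tuple[float, float]:
    """Edge accuracy / completeness as truncated chamfer distances (in pixels)
    between binary edge maps; a sketch, not the official benchmark code."""
    # Distance from every pixel to the nearest GT edge, and vice versa.
    dist_to_gt = ndimage.distance_transform_edt(~gt_edges)
    dist_to_pred = ndimage.distance_transform_edt(~pred_edges)
    # Accuracy: how far predicted edges lie from the true edges.
    acc = np.minimum(dist_to_gt[pred_edges], max_dist).mean() if pred_edges.any() else max_dist
    # Completeness: how much of the true edge set the prediction recovers.
    comp = np.minimum(dist_to_pred[gt_edges], max_dist).mean() if gt_edges.any() else max_dist
    return acc, comp


if __name__ == "__main__":
    gt = np.zeros((480, 640), dtype=bool); gt[:, 320] = True
    pred = np.zeros_like(gt); pred[:, 323] = True   # edge displaced by 3 px
    print(depth_boundary_errors(pred, gt))          # ~ (3.0, 3.0)
```

Under this kind of metric, an over-smoothed prediction places few pixels near true depth discontinuities, so its completeness error grows even when its per-pixel depth accuracy remains competitive.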