Recent works on 3D reconstruction from posed images have demonstrated that direct inference of scene-level 3D geometry without iterative optimization is feasible using a deep neural network, showing remarkable promise and high efficiency. However, the reconstructed geometries, typically represented as a 3D truncated signed distance function (TSDF), are often coarse and lack fine geometric details. To address this problem, we propose three effective solutions for improving the fidelity of inference-based 3D reconstructions. We first present a resolution-agnostic TSDF supervision strategy to provide the network with a more accurate learning signal during training, avoiding the pitfalls of TSDF interpolation seen in previous work. We then introduce a depth guidance strategy using multi-view depth estimates to enhance the scene representation and recover more accurate surfaces. Finally, we develop a novel architecture for the final layers of the network, conditioning the output TSDF prediction on high-resolution image features in addition to coarse voxel features, enabling sharper reconstruction of fine details. Our method produces smooth and highly accurate reconstructions, showing significant improvements across multiple depth and 3D reconstruction metrics.