This article argues that trustworthy digital twins in mixed reality depend on a tight chain from sensing to persistence: robust pose estimation, environment reconstruction, and repeatable world-locked anchoring that survives time, tracking loss, and multi-device use. It outlines the sensor stack underpinning metric correctness, comparing LiDAR, structured light, time-of-flight, and stereo depth, then shows why depth alone is insufficient without anchor lifecycles, permissions, and recovery behaviours embedded at platform level. It maps the post-cloud landscape following the retirement of Azure Spatial Anchors towards vendor-native systems and emerging portable semantics in OpenXR spatial entities, with implications for cross-platform test matrices and maintenance burden [2], [3]. For richer twins, it positions volumetric capture and point-cloud compression (including V-PCC) alongside neural scene representations such as Gaussian Splatting, stressing the need to register visual assets to metric frames and anchor graphs to avoid drift and false physical interactions [9], [10]. The piece closes with design rules and a layered reference architecture that prioritise textured localisation features, anchor redundancy, privacy-aware depth access, and governance models that measure stability as infrastructure rather than novelty [1].
[1] Meta, “Spatial anchors, overview and tutorials,” May–Aug. 2024. [Online]. Available: developers.meta.com
[2] The Khronos Group, “Khronos releases OpenXR 1.1,” Apr. 15, 2024; “OpenXR Spatial Entities Extensions,” Jun. 10–14, 2025. [Online]. Available: khronos.org
[3] Microsoft Azure, “Azure Spatial Anchors retirement,” Dec. 2, 2023; GitHub CLI deprecation follow-up, Feb. 20, 2025. [Online]. Available: azure.microsoft.com; github.com
[4] Apple, “RoomPlan framework and samples; ARKit tracking guidance,” 2022–2024. [Online]. Available: developer.apple.com
[5] Google, “ARCore Geospatial API and anchors,” Oct. 31, 2024. [Online]. Available: developers.google.com
[6] Apple, “Capturing depth using the LiDAR camera,” sample code and WWDC session, 2022; “TrueDepth camera overview,” 2017. [Online]. Available: developer.apple.com
[7] The Verge, “Microsoft kills Kinect again,” Aug. 21, 2023; Orbbec, “Products based on Microsoft iToF depth technology,” Aug. 17, 2023; PR Newswire, “Orbbec showcases Azure Kinect DK replacement,” Mar. 20, 2024. [Online].
[8] Intel, “Product Change Notice 807360-01; RealSense LiDAR and tracking EOL,” Apr. 4, 2024; support notes updated May 28, 2024. [Online]. Available: intel.com; support.realsenseai.com
[9] ISO/IEC, “23090-5, Visual volumetric video-based coding and V-PCC,” 2023–2025 editions; DVB Study Mission on Volumetric Video, Feb. 2024. [Online].
[10] B. Kerbl et al., “3D Gaussian Splatting for real-time radiance field rendering,” 2023; and follow-on works on depth-aware or anchored splatting, 2024–2025. [Online]. Available: arXiv; Inria repo.
[11] Apple, “Apple Vision Pro privacy overview,” 2024. [Online]. Available: apple.com/privacy
[12] Epic Games, The Virtual Production Field Guide, 2019.
[13] E. Malakhatka et al., “XR Experience Design and Evaluation Framework,” in Human-Technology Interaction, Springer, 2025.
[14] BoF and McKinsey, The State of Fashion 2025, 2024.
If a digital twin is going to be trusted, it must respect the physics and geometry of its real counterpart. That reliability starts with how we sense space and how we anchor content. In mixed reality, LiDAR, depth cameras and modern spatial anchoring pipelines convert raw photons into coordinates that remain stable across time and devices, which is why virtual objects appear to “stick” to walls, tables and people with convincing precision.
Mixed reality systems have to solve three problems at once, and at interactive rates. First, they estimate pose, the six-degree-of-freedom position and orientation of the device. Second, they reconstruct geometry and semantics of the environment, for example planes, edges and surfaces. Third, they persist world-locked references so that content reappears in the same place later, or on another device in the same space. These steps map to visual-inertial odometry and SLAM, scene understanding, and anchoring or persistence. On modern platforms, anchoring is no longer an application-specific trick; it is a first-class feature with lifecycle, permissions and sharing models. Meta documents anchors as world-locked frames that persist between sessions, with explicit guidance on creation, updates and sharing, reflecting how core this has become to MR reliability [1].
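That anchor lifecycle can be sketched as a small state machine. The class, state and method names below are illustrative, not any vendor's API; the point is that an anchor carries explicit states for tracking loss, recovery and persistence rather than being a bare pose:

```python
from dataclasses import dataclass
from enum import Enum, auto

class AnchorState(Enum):
    CREATED = auto()      # pose assigned, not yet persisted
    TRACKED = auto()      # world tracking is actively resolving the anchor
    NOT_TRACKED = auto()  # tracking lost; pose is stale, hide attached content
    PERSISTED = auto()    # saved for a later session or another device

@dataclass
class Anchor:
    anchor_id: str
    pose: tuple  # (x, y, z, qx, qy, qz, qw) in the world frame
    state: AnchorState = AnchorState.CREATED

    def on_tracking_lost(self):
        # Do not delete: keep the anchor and wait for relocalisation.
        self.state = AnchorState.NOT_TRACKED

    def on_tracking_recovered(self):
        # Resume rendering against the refreshed world-locked pose.
        self.state = AnchorState.TRACKED

    def persist(self, store: dict):
        # A real store would be the platform's anchor service, not a dict.
        store[self.anchor_id] = self.pose
        self.state = AnchorState.PERSISTED
```

Treating "not tracked" as a first-class state, rather than an error, is what lets content reappear in place instead of drifting while tracking is degraded.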
Depth imaging on phones and head-mounted displays typically uses one of three approaches. Structured light projects a known infrared pattern and triangulates disparities, used by Apple’s TrueDepth camera on the front of several devices. Time-of-flight measures phase or time delay of modulated light, used by Apple’s rear LiDAR Scanner for room-scale mapping. Passive stereo uses two RGB sensors and triangulation. Apple’s documentation distinguishes structured light on TrueDepth from LiDAR’s time-of-flight capture pipelines, and exposes sample code for streaming LiDAR depth for measurement and occlusion [6].
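The geometry behind these approaches reduces to two formulas: continuous-wave time-of-flight recovers depth from the phase shift of modulated light, while stereo and structured light triangulate from disparity. A minimal sketch, with illustrative focal-length, baseline and modulation values:

```python
import math

C = 299_792_458.0  # speed of light, m/s

def tof_depth(phase_shift_rad: float, mod_freq_hz: float) -> float:
    """Continuous-wave time-of-flight: the modulated signal travels to the
    surface and back, so depth = c * phase / (4 * pi * f_mod)."""
    return C * phase_shift_rad / (4 * math.pi * mod_freq_hz)

def stereo_depth(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Two-view triangulation (also the core of structured light):
    depth = f * B / d, with focal length f in pixels and baseline B in metres."""
    return focal_px * baseline_m / disparity_px
```

At a 20 MHz modulation frequency the unambiguous ToF range is c / (2·f_mod), roughly 7.5 m, which is one reason these sensors target room scale; and because disparity appears in the denominator, stereo depth error grows roughly with the square of distance.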
Industrial depth cameras remain central to professional capture stages and robotics. Microsoft’s Azure Kinect DK, widely used in volumetric capture rigs, was discontinued in 2023, with Microsoft directing developers to partner devices; Orbbec’s Femto series now fills that role using Microsoft iToF depth technology [7]. Intel also formally ended its LiDAR and tracking product lines, noting last-time-buy status and a focus on stereo depth [8]. This matters because production teams planning multi-sensor capture or mobile rovers for site scanning need to consider supply stability, driver support and calibration options across mixed fleets.
Depth alone does not guarantee stability. Anchoring depends on robust frames of reference that can be recovered after tracking loss and across sessions. In 2024, Microsoft retired Azure Spatial Anchors, a cloud service that provided cross-device persistence. Developers have moved to platform-native alternatives or to standards-based stacks. The deprecation and retirement notices are explicit, and several developer ecosystems have since updated their anchor guidance [3].
Today, vendors expose anchor and scene APIs with clearer semantics. Meta’s documentation covers shared anchors, depth permissions and scene models that unify planes and meshes under a privacy-gated permission [1]. Apple provides world tracking, room scanning and RoomPlan for parametric models, with examples for multi-room capture and recovery when tracking quality drops. The WWDC material notes that when the system falls back to orientation-only tracking, anchors can be temporarily “not tracked,” then resume when world tracking recovers, which is essential for user trust [4]. Google’s ARCore complements device-local anchors with the Geospatial API that uses VPS imagery for outdoor scale, an approach suited to city-scale digital twins [5].
At the standardisation layer, OpenXR 1.1 consolidates common extensions and signals improved handling of spatial entities, while the 2025 Spatial Entities Extensions move plane, marker and persistent anchors toward portable semantics across runtimes [2]. For teams building multi-device digital twins, that shift reduces bespoke per-platform code and makes test matrices manageable.
A faithful digital twin often needs people, garments or machinery recorded volumetrically. Traditional volumetric video uses multi-camera arrays of RGB or RGB-D sensors, then compresses dynamic point clouds using standards like MPEG’s V-PCC under ISO/IEC 23090-5, which reached updated editions through 2024–2025 and is under active work [9]. Rig architects should plan for V3C or V-PCC in their storage and delivery pipelines to avoid lock-in and to meet bitrate constraints.
Real-time neural scene representations have expanded the toolkit. Gaussian Splatting achieves fast, photoreal novel view synthesis from camera images, which enables quick capture of small spaces and props for MR previews. The technique is already widely implemented in open source and commercial tools. For MR anchoring, two cautions apply. First, many radiance field methods optimise visual quality rather than metric accuracy, so splat clouds must be aligned to device tracking frames or fused with metric depth if content must interact physically with site geometry. Second, long-term persistence still relies on anchors that can be resolved reliably, so splat assets should be registered to anchor sets or room meshes. Recent research explores depth-aware or anchored splatting that reduces drift, which is directly relevant to MR stability [10].
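Registering a visually optimised reconstruction to a metric frame comes down to estimating a similarity transform from corresponding points. A minimal closed-form sketch in 2D, assuming clean correspondences; real pipelines solve the 3D case, usually with robust estimation such as RANSAC, and all names here are illustrative:

```python
import math

def align_similarity_2d(src, dst):
    """Estimate scale s, rotation theta and translation t such that
    s * R(theta) @ p + t matches q for corresponding 2D points src[i] -> dst[i],
    e.g. a splat cloud's floor plan registered to surveyed anchor positions."""
    n = len(src)
    cx_s = sum(p[0] for p in src) / n; cy_s = sum(p[1] for p in src) / n
    cx_d = sum(q[0] for q in dst) / n; cy_d = sum(q[1] for q in dst) / n
    s_dot = s_cross = var_s = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        px, py = px - cx_s, py - cy_s  # centre both clouds
        qx, qy = qx - cx_d, qy - cy_d
        s_dot += px * qx + py * qy     # aligned component
        s_cross += px * qy - py * qx   # rotated component
        var_s += px * px + py * py
    theta = math.atan2(s_cross, s_dot)
    scale = math.hypot(s_dot, s_cross) / var_s
    tx = cx_d - scale * (math.cos(theta) * cx_s - math.sin(theta) * cy_s)
    ty = cy_d - scale * (math.sin(theta) * cx_s + math.cos(theta) * cy_s)
    return scale, theta, (tx, ty)
```

Three well-spread fiducials or surveyed points are enough to fix scale, heading and position in the plane, which is exactly why the capture guidance below insists on them.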
For production teams in creative industries, virtual production guides outline how depth capture and real-time engines shorten iteration cycles and bring VFX decisions earlier into the shoot, which is one reason volumetric capture is finding a home on LED stages and XR volumes [12].
From shipping SDKs and production practice, several design rules have emerged. Create anchors against textured, feature-rich surfaces rather than blank walls, since relocalisation depends on distinctive visual features. Use redundant anchors per zone so content survives the loss of any single reference. Gate depth and scene access behind explicit permissions, treating spatial maps as sensitive data. And measure stability continuously, treating drift budgets as infrastructure metrics rather than demo-day novelty.
A robust digital twin for a venue, factory or store can follow this layered architecture.
Capture layer: combine device-mounted LiDAR scans for base geometry, photogrammetry or Gaussian Splatting for visual look, and volumetric rigs for people and dynamic assets, selected per scene. Maintain consistent scale with fiducials or surveyed points during shoots to ensure metric alignment later.
Spatial data model: store meshes and planes with semantic tags, then manage anchor graphs per room or zone. Use platform SDKs for short-term anchors, and Geospatial or room maps for long-term persistence [5]. Track provenance and versioning, since twins change as spaces change.
Compression and delivery: for volumetric sequences, consider V-PCC for efficient streaming to headsets that lack native point cloud decoders [9]. For splat assets, package alongside alignment transforms to platform anchors so they resolve to the same world coordinates as meshes.
Runtime: target OpenXR 1.1 where possible to reduce platform divergences, and monitor the progress of the Spatial Entities Extensions for portable anchors and surfaces [2]. This will make multi-vendor deployments, for example iPad capture to headset playback, simpler to maintain.
Governance: align with emerging XR experience design frameworks that measure usability, safety and effectiveness of MR interactions [13], particularly for enterprise rollouts where repeatability and comfort are as important as spectacle.
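The spatial data model layer can be sketched as a per-zone anchor store with versioned, provenance-tagged records. The schema below is illustrative rather than any SDK's:

```python
import time
from dataclasses import dataclass, field

@dataclass
class AnchorRecord:
    anchor_id: str
    zone: str                   # room or zone the anchor belongs to
    pose: tuple                 # position/orientation in the zone's local frame
    version: int = 1
    source: str = "lidar-scan"  # provenance: which capture produced this pose
    updated_at: float = field(default_factory=time.time)

class AnchorGraph:
    """Minimal per-zone anchor store with versioned updates, since twins
    change as spaces change and stale poses must be distinguishable."""

    def __init__(self):
        self.zones = {}  # zone name -> {anchor_id -> AnchorRecord}

    def upsert(self, rec: AnchorRecord):
        zone = self.zones.setdefault(rec.zone, {})
        old = zone.get(rec.anchor_id)
        if old is not None:
            rec.version = old.version + 1  # keep version history explicit
        zone[rec.anchor_id] = rec

    def anchors_in(self, zone: str):
        return list(self.zones.get(zone, {}).values())
```

A real deployment would back this with the platform's anchor service and a database, but the shape — zone scoping, versioning, provenance — is the part that survives any particular SDK.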
Across recent deployments, realistic “stickiness” presents as sub-centimetre drift for tabletop content and single-digit centimetres at room scale, measured after users walk out and return. This level of stability is achievable on commodity hardware when anchors are created against reliable planes, when depth permissions are granted, and when scene relocalisation is allowed a short settling period. Vendor guidance and field notes echo these thresholds and workflows [1].
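Those thresholds only mean something if drift is actually measured. A minimal leave-and-return check, with budget values mirroring the figures above and hypothetical function names:

```python
import math

def anchor_drift_m(before: dict, after: dict) -> dict:
    """Euclidean drift per anchor (metres) between positions logged before
    leaving the space and after returning and relocalising."""
    return {a: math.dist(before[a], after[a]) for a in before.keys() & after.keys()}

def within_budget(drift: dict, budget_m: float) -> bool:
    """Check a stability budget: e.g. 0.01 m for tabletop content,
    0.09 m (single-digit centimetres) at room scale."""
    return all(d <= budget_m for d in drift.values())
```

Logging these numbers per zone over time is what turns "it feels stable" into an infrastructure metric a governance process can track.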
For fashion and retail, the same foundations matter. As the industry explores store-scale digital twins for merchandising and assisted selling, the difference between a product model that “floats” and one that remains aligned to a fixture is the difference between novelty and utility. Broader sector reports forecast a cautious 2025 with consumer price sensitivity [14]; if MR is to justify its cost in such a climate, it must deliver accuracy and repeatability first.
Two threads will shape the next cycle. First, standardised spatial entities in OpenXR are likely to reduce fragmentation around anchors, markers and planes, which will help cross-platform MR twins mature [2]. Second, neural capture, including depth-aware Gaussian Splatting and related techniques, will continue to speed up look-dev and rapid surveys, provided teams tie them back to metric anchors and scene meshes for physical correctness [10].
Bottom line: LiDAR and depth cameras provide the metric ground truth, volumetric pipelines bring people and motion into the scene, and anchors turn all of that into a stable coordinate system you can rely on. Get those layers right, and your digital twin stops being a demo; it becomes infrastructure.