Channel state information (CSI) measured on a live 5G New Radio (NR) network can drive both user positioning and device identification, with device classification accuracy above 95% on real-world data. Authentic data from functioning 5G networks has long been scarce, but a team from ETH Zurich and NVIDIA (Reinhard Wiesmayr, Frederik Zumegen, Sueda Taner, Chris Dick, and Christoph Studer) has changed that by releasing three extensive CSI datasets captured from a live 5G NR system. The data were gathered with a software-defined 5G testbed at ETH Zurich in both indoor and outdoor environments, and a third dataset is dedicated to distinguishing between devices. The release opens new avenues for research into neural user positioning, real-world channel charting, and robust device classification. The reported results are striking: positioning errors as low as 0.6 cm and device classification accuracy above 95%. The publicly available datasets and the accompanying data-processing tools are a significant step toward practical CSI-based sensing in next-generation wireless networks.
Overview of the CAEZ Datasets and Research
This section describes the CAEZ (Channel Awareness for Efficient Zero-effort) datasets and the related research, which covers three tasks: neural user equipment (UE) positioning, channel charting, and device classification. The aim is to provide openly accessible datasets and tools that advance research in these areas, particularly the application of machine learning to wireless systems. The key contributions are publicly available indoor and outdoor datasets, collected with a distributed massive MIMO system and with mobility provided by robots, that contain Channel State Information (CSI), robot trajectories, and ground-truth positioning data. The reported results are a mean absolute error (MAE) of 0.7 cm outdoors for neural UE positioning, an MAE of 73 cm for channel charting, and device classification accuracy of 99% on the same day and 95% on the next day. Datasets and simulation code are publicly accessible at https://caez.ch. The three tasks are:
- Neural UE Positioning: A neural network learns the mapping from CSI to UE location using position-labeled measurements, delivering centimeter-level accuracy in indoor and outdoor settings.
- Channel Charting: Learns a low-dimensional embedding of CSI (the channel chart) in which measurements taken at nearby UE positions lie close together. With triplet-based learning, the chart expressed in real-world coordinates reaches an MAE of 73 cm.
- Device Classification: Identifies devices by their unique RF fingerprints, reaching 99% accuracy on same-day data and 95% on next-day data.
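The triplet-based learning mentioned for channel charting can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `t_close` threshold, and the rule that samples close in time serve as positives are all assumptions made for this example.

```python
import numpy as np


def sample_triplets(num_samples, t_close, num_triplets, rng):
    """Draw (anchor, positive, negative) index triplets from a time-ordered CSI sequence.

    Assumption: samples within `t_close` steps of the anchor were recorded at
    nearby UE positions (positives); samples further away in time are negatives.
    Requires t_close >= 1 and num_samples >= 2 * t_close + 2 so that both
    candidate sets are non-empty for every anchor.
    """
    idx = np.arange(num_samples)
    triplets = []
    for _ in range(num_triplets):
        a = int(rng.integers(num_samples))
        gap = np.abs(idx - a)
        pos = rng.choice(idx[(gap > 0) & (gap <= t_close)])
        neg = rng.choice(idx[gap > t_close])
        triplets.append((a, int(pos), int(neg)))
    return np.array(triplets)


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on chart coordinates of shape (batch, dim)."""
    d_pos = np.linalg.norm(anchor - positive, axis=-1)
    d_neg = np.linalg.norm(anchor - negative, axis=-1)
    # The loss is zero once the negative is at least `margin` further from
    # the anchor than the positive is.
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))
```

In a training loop, the chart coordinates passed to `triplet_loss` would be the outputs of a neural network applied to the CSI features at the sampled indices; minimizing the loss pulls temporally close samples together in the chart and pushes distant ones apart.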
The measurement setup combines a distributed massive MIMO system, iRobot Create robots for mobility, and standard-compliant 5G NR equipment. The release also includes simulation code and tools for data processing and machine learning. Future work will broaden the datasets to cover more diverse scenarios (mixed LOS/NLOS, larger areas, and 3D trajectories) and validate both model-based and NN-based receivers in real-world deployments. In short, the CAEZ project gives researchers a common platform for exploring machine-learning-based wireless localization, channel charting, and device identification.
Real 5G Channel Data for Sensing
Three real-world CSI datasets collected from a 5G NR system have been released to fill a critical gap: researchers developing and validating advanced sensing algorithms have had little measured 5G data to work with. A software-defined 5G NR testbed built from commercial off-the-shelf hardware captured uplink CSI from real network traffic, yielding a representative real-world dataset for work on future wireless systems. Experiments demonstrate the datasets' usefulness across three CSI-based sensing tasks.
Neural UE positioning on the collected CSI achieves a mean absolute error (MAE) of 0.7 cm in outdoor environments. Channel charting, which embeds CSI into a low-dimensional chart that preserves the local geometry of UE positions, yields an MAE of 73 cm outdoors when the chart is expressed in real-world coordinates. Device classification reaches 99% accuracy on the same day and 95% on the following day, indicating that the approach remains robust from one day to the next. The datasets include ground-truth UE position labels, CSI features, and simulation code, and are publicly available so that researchers can develop and validate new algorithms without relying on synthetic data or bespoke testbeds. The release supports progress in off-device neural positioning, device classification, and real-world channel charting, all of which are relevant to next-generation wireless systems.
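To make the positioning metric concrete, the sketch below computes an MAE from predicted and ground-truth positions. It assumes that the MAE is the Euclidean position error averaged over all samples, which is a common convention but is not spelled out in this article; the function name `positioning_mae` is hypothetical.

```python
import numpy as np


def positioning_mae(pred_xy, true_xy):
    """Mean Euclidean distance between predicted and ground-truth positions.

    Assumption: "MAE" means the per-sample Euclidean error averaged over all
    samples. Both inputs have shape (num_samples, dim) and use the same unit.
    """
    pred_xy = np.asarray(pred_xy, dtype=float)
    true_xy = np.asarray(true_xy, dtype=float)
    errors = np.linalg.norm(pred_xy - true_xy, axis=-1)
    return float(errors.mean())
```

The same function applies to channel charting once the chart has been mapped into real-world coordinates: `pred_xy` would then hold chart positions instead of neural-network position estimates.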
Real 5G Channel Data for Positioning and Charting
This study presents the first publicly available datasets of real-world 5G NR CSI, complemented by positioning data, to advance key areas in wireless research. Three datasets (one indoor, one outdoor, and one dedicated to device classification) were collected with a software-defined 5G NR testbed. They enable the development and validation of algorithms for neural UE positioning, channel charting in real-world coordinates, and device classification. Experimental results show strong performance across all tasks: neural-network-based positioning achieves an MAE of 0.7 cm outdoors, channel charting reaches 73 cm in real-world coordinates, and device classification achieves 99% accuracy on the same day and 95% on the next day. The datasets and accompanying simulation code let other researchers build on these findings and explore new possibilities in 5G and future wireless systems. The authors note that the current datasets focus on specific scenarios and plan to add more diverse environments, such as mixed LOS/NLOS conditions, larger measurement areas, and three-dimensional user trajectories. Future work will also validate model-based and neural-network-based receivers in real-world deployments.
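The same-day versus next-day figures suggest an evaluation in which a classifier is trained on one day and then tested both on held-out data from that day and on data recorded the following day. The sketch below illustrates that protocol only. The nearest-centroid classifier is a stand-in rather than the authors' model, and the toy feature vectors are hand-made for this example and do not come from the released datasets.

```python
import numpy as np


def fit_centroids(features, labels):
    """Return the sorted class labels and one mean feature vector per class."""
    classes = np.unique(labels)
    centroids = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, centroids


def predict(features, classes, centroids):
    """Assign each feature vector to the class with the nearest centroid."""
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=-1)
    return classes[np.argmin(dists, axis=1)]


def accuracy(pred, true):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(pred == true))


# Toy features for two devices (illustrative values, not measured data).
day1_train_x = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
day1_train_y = np.array([0, 0, 1, 1])
day1_test_x = np.array([[0.5, 0.5], [9.5, 10.5]])
day1_test_y = np.array([0, 1])
# Day-two features drift, so one sample of device 0 lands nearer to device 1.
day2_test_x = np.array([[1.0, 1.0], [9.0, 9.0], [6.0, 6.0]])
day2_test_y = np.array([0, 1, 0])

classes, centroids = fit_centroids(day1_train_x, day1_train_y)
same_day_acc = accuracy(predict(day1_test_x, classes, centroids), day1_test_y)
next_day_acc = accuracy(predict(day2_test_x, classes, centroids), day2_test_y)
```

Reporting the two accuracies separately shows how much of the classifier's performance survives day-to-day changes in the hardware and the environment.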
👉 More information
🗞 CSI-Based User Positioning, Channel Charting, and Device Classification with an NVIDIA 5G Testbed
🧠 ArXiv: https://arxiv.org/abs/2512.10809