Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz, Iran
Abstract
Augmented reality (AR), an emerging technology for human-environment interaction, integrates virtual information with the real world, and doing so requires accurate and reliable positioning. Visual-inertial odometry (VIO) techniques, which combine camera and inertial motion sensor data, have attracted widespread attention because of their high performance, especially in indoor environments where GPS is unavailable. This paper systematically reviews the positioning methods used in ARCore, one of the most advanced and widely used development frameworks for augmented reality. The initial sections introduce basic concepts such as visual tracking, pose estimation, feature recognition, and environment mapping. The internal mechanisms of ARCore are then analyzed, in particular its SLAM, depth sensing, and Kalman filter subsystems. Common VIO algorithms, including MSCKF, OKVIS, VINS-Mono, and ROVIO, are also introduced and compared with ARCore in terms of accuracy, computational complexity, and adaptability to mobile data. The main contribution of the paper is a comparative analysis of academic models and practical frameworks that clarifies the advantages and limitations of each. The findings indicate that the field is moving towards lightweight, learning-based models that remain robust under unstable environmental conditions, which can significantly improve the quality and efficiency of augmented reality.