Precise real-time localization and robustness are critical challenges in multi-user mobile AR applications. Leveraging collaborative information to improve tracking accuracy on lightweight devices and to strengthen overall system robustness is therefore essential. In this paper, we propose a robust centralized collaborative multi-agent VI-SLAM system for mobile AR interaction and efficient, consistent server-side mapping. The system deploys a lightweight VIO frontend on mobile devices for real-time tracking, and a backend running on a remote server to update multiple submaps. When overlapping areas between submaps across agents are detected, the system performs submap fusion to establish a globally consistent map. Additionally, we propose a map registration and fusion strategy based on covisibility areas for online registration and fusion in multi-agent scenarios. To improve the tracking accuracy of the frontend on each agent, we introduce a strategy that updates the local map from the global map at a moderate frequency, between the camera-rate pose estimation of the frontend VIO and the low-frequency global map optimization, and uses a tightly coupled formulation to keep the frontend pose estimates of all agents consistent within the global map. We confirm the effectiveness of the proposed method by executing backend mapping on the server and deploying VIO frontends on multiple mobile devices for an AR demonstration. Finally, we discuss the scalability of the proposed system by analyzing network traffic, synchronization frequency, and other factors at both the agent and server ends.
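To make the idea of registering one submap to another concrete, the following minimal sketch estimates a rigid transform between two submaps from landmarks that both observe. It is an illustration under strong assumptions, not the registration strategy proposed in this paper: the landmarks are planar (2D), the correspondences are already known and outlier-free, and both submaps share the same metric scale. It uses the standard closed-form least-squares alignment for 2D point sets. The function names `register_submaps_2d` and `apply_transform` are hypothetical.

```python
import math


def register_submaps_2d(src, dst):
    """Least-squares rigid transform (theta, tx, ty) mapping src onto dst.

    src, dst: equal-length lists of (x, y) landmark positions, where
    src[i] and dst[i] are assumed to be the same physical landmark
    observed in two different submaps (hypothetical input format).
    """
    if len(src) != len(dst) or len(src) < 2:
        raise ValueError("need at least two matched landmark pairs")

    n = len(src)
    # Centroids of both landmark sets.
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n

    # Accumulate dot and cross products of the centered point pairs.
    # The optimal rotation angle is atan2(sum of crosses, sum of dots).
    sum_dot = 0.0
    sum_cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy
        bx, by = qx - dx, qy - dy
        sum_dot += ax * bx + ay * by
        sum_cross += ax * by - ay * bx

    theta = math.atan2(sum_cross, sum_dot)
    c, s = math.cos(theta), math.sin(theta)

    # Translation maps the rotated source centroid onto the target centroid.
    tx = dx - (c * sx - s * sy)
    ty = dy - (s * sx + c * sy)
    return theta, tx, ty


def apply_transform(theta, tx, ty, point):
    """Map a point from the source submap frame into the target frame."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = point
    return (c * x - s * y + tx, s * x + c * y + ty)


if __name__ == "__main__":
    # Landmarks seen by agent A, and the same landmarks in agent B's frame
    # (rotated by 90 degrees and shifted by (2, 3)).
    submap_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    submap_b = [(2.0, 3.0), (2.0, 4.0), (1.0, 3.0)]
    theta, tx, ty = register_submaps_2d(submap_a, submap_b)
    print(theta, tx, ty)
```

Once such a transform is available, every landmark and keyframe pose of one submap can be expressed in the frame of the other with `apply_transform`, which is the precondition for fusing the two into a single map. A practical system would work in 3D, reject outlier matches, and refine the result in a joint optimization.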