As a representative mode of shared mobility, bike-sharing serves not only as a convenient way to make short-distance trips in urban areas but also as a feeder mode to public transit, forming the Bike and Ride (BnR) system. Managing such a hybrid multi-modal system is challenging because of the complex interactions between bike-sharing and other modes, highly dynamic passenger demand, and the difficulty of obtaining direct transfer data. To overcome these difficulties, this study proposes a framework for assessing the dependency between the usage of the two modes. First, a Dynamic Time Warping (DTW)-based method is used to determine the catchment area (CA) between the two modes, which allows the similarity of BnR-related demand tendencies at a given time scale to be taken into account. Then, the patterns of probabilistic dependence between the travel demand of the two modes are obtained with a copula-based approach, which separates the correlation at specific usage levels from the single-mode demands. A case study on the multi-modal system formed by docked bike-sharing and the subway in New York is conducted to validate the proposed framework. On average, the tendency similarity is most pronounced within 500 m under a 4-hour interval. For each station group (SG) formed, the best-fitting copula type is selected, capturing strong tail correlations that are present only at specific usage levels. The results show a variety of correlation patterns across SGs, even among those that are geographically close. Areas of potential transfer resistance between the two modes are identified; these are more evident in first-mile-related (FMR) activities. In contrast, the two modes display weaker connections in last-mile-related (LMR) activities. The results can be used by bike-sharing service providers to analyze demand distributions and conduct efficient station-level rebalancing.
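To make the DTW step concrete, the following is a minimal sketch of how the tendency similarity between two demand series could be measured. It uses the textbook DTW recurrence with an absolute-difference local cost and z-score normalization; the exact DTW variant, normalization, and distance thresholds used in the study are not stated here, so these choices are assumptions. The function names and the demand counts (six 4-hour bins for one day) are hypothetical and serve only as an illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW: minimal cumulative |a_i - b_j| over monotone alignments."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def zscore(series):
    """Standardize a series so that DTW compares shape rather than volume."""
    mean = sum(series) / len(series)
    std = (sum((x - mean) ** 2 for x in series) / len(series)) ** 0.5
    if std == 0:
        return [0.0] * len(series)
    return [(x - mean) / std for x in series]


# Hypothetical counts per 4-hour interval over one day (illustrative only).
subway_demand = [120, 900, 400, 350, 800, 200]
bike_dock_near = [10, 80, 35, 30, 70, 15]   # dock assumed close to the station
bike_dock_far = [40, 20, 60, 25, 30, 55]    # dock assumed farther away

d_near = dtw_distance(zscore(subway_demand), zscore(bike_dock_near))
d_far = dtw_distance(zscore(subway_demand), zscore(bike_dock_far))
print(f"DTW to near dock: {d_near:.3f}, to far dock: {d_far:.3f}")
```

Under these assumptions, a smaller DTW distance indicates a more similar demand tendency, so docks could be ranked by distance band to see where the similarity stops being pronounced.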
Compared with previous methods, the proposed framework is computationally inexpensive because it requires neither direct transfer data nor a complex inference network. It incorporates statistically significant spatial–temporal information, allowing a more accurate determination of the bi-modal assessment range. Moreover, because single-mode influences are mathematically removed, the resulting correlation in principle reflects the strength of the connection between the two modes. It can therefore serve as an indicator of the reliability of the multi-modal system.
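The sense in which single-mode influences are removed can be stated through Sklar's theorem, which is the standard foundation of copula models. The notation below is an assumption introduced for illustration: $X$ and $Y$ denote the bike-sharing and subway demand of one station group, $F_X$ and $F_Y$ their continuous marginal distributions, and $C$ the copula. The specific copula families fitted in the study are not listed here.

```latex
% Sklar's theorem: the joint distribution factors into marginals and a copula.
\[
  H(x, y) = \Pr(X \le x,\; Y \le y) = C\bigl(F_X(x),\, F_Y(y)\bigr),
  \qquad U = F_X(X),\quad V = F_Y(Y).
\]

% Upper and lower tail-dependence coefficients depend on C only,
% not on the single-mode marginals F_X and F_Y.
\[
  \lambda_U = \lim_{u \to 1^-} \frac{1 - 2u + C(u, u)}{1 - u},
  \qquad
  \lambda_L = \lim_{u \to 0^+} \frac{C(u, u)}{u}.
\]
```

Because $\lambda_U$ and $\lambda_L$ are functions of $C$ alone, a nonzero value describes co-movement of the two modes at high or low usage levels regardless of how each mode's demand is distributed on its own.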