Bridge detection methods based on deep learning have many parameters and high computational cost, and they suffer from frequent false and missed detections on multiscale bridges. To address these problems, a depth-wise separable multiscale feature fusion network (DSMFFNet) is proposed for efficient and accurate bridge detection in very high resolution (VHR) satellite images. First, depth-wise separable convolution is used to build the backbone feature extraction network that extracts the bridge features. Second, to better match bridges of different scales, multiscale receptive fields are obtained by applying multibranch parallel dilated convolution to the last feature layer. Then, to make full use of the detail and semantic information of bridges at different depths, three effective feature layers at different levels are fused by a multiscale feature pyramid. The experimental results show that the mean average precision (mAP) and frames per second (FPS) of the proposed method reach 94.26% and 60.04, respectively, which exceeds most mainstream object detection networks in both accuracy and speed. The network can therefore be deployed on mobile devices for fast, high-precision bridge detection.
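
The two building blocks named above can be illustrated with simple arithmetic. The following is a minimal sketch, not the DSMFFNet implementation: it counts the weights of a standard convolution versus a depth-wise separable one, and computes the extent covered by a dilated kernel. The function names, the channel counts (256), and the dilation rates (1, 2, 4) are hypothetical values chosen for illustration and are not taken from the paper.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depth-wise k x k conv followed by a 1 x 1 point-wise conv."""
    return k * k * c_in + c_in * c_out


def effective_kernel_size(k, dilation):
    """Spatial extent covered by a k x k kernel at the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    # Hypothetical layer sizes, for illustration only.
    c_in, c_out, k = 256, 256, 3

    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.4f}")

    # Hypothetical dilation rates for three parallel branches.
    for d in (1, 2, 4):
        print(f"dilation {d}: effective kernel {effective_kernel_size(k, d)}")
```

Under these assumed sizes the separable layer needs roughly 11.5% of the weights of the standard layer (the ratio is 1/c_out + 1/k²), which is the kind of saving that motivates a depth-wise separable backbone. Parallel branches with different dilation rates cover different spatial extents with the same 3 x 3 kernel, which is how a multibranch dilated module can obtain multiscale receptive fields without adding weights per branch.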