Abstract

To navigate and dock safely in urban waters, an autonomous ferry must assess safe navigation paths by identifying navigable free space and potential obstacles. A deep understanding of its surroundings is crucial for situational awareness. Knowing which objects are static and which are dynamic is an indispensable ingredient for path planning algorithms and collision avoidance systems, and it is especially useful for docking operations, as moving obstacles can interfere with the optimal path to the dock. To handle all types of obstacles, we present a novel dynamic scene representation for docking in urban waters using a short-baseline stereo camera. Obstacles protruding from the water surface are represented by vertically oriented rectangles, known as stixels. The stixels are tracked over consecutive frames with dense optical flow, and their 3D motion is estimated from these tracks together with stereo correspondences. Stixels belonging to the same object are grouped into line segments, which are classified as either dynamic or static based on the motion of their associated stixels. Our dynamic representation presents objects in the 3D camera coordinate system as a bird's eye view map. Additionally, stereo depth uncertainty can significantly degrade the estimated positions; we account for it with a Kalman filter, which provides smooth position and velocity estimates. We test our dynamic scene representation on a real docking scenario recorded with an autonomous ferry using a stereo camera, and we evaluate the accuracy of the reconstructed 3D scene points and the velocity estimates of the dynamic objects using LiDAR and GNSS.
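To illustrate the filtering step described above, the following is a minimal sketch of a 1-D constant-velocity Kalman filter that smooths noisy stereo depth measurements into position and velocity estimates. All parameter values (time step, measurement and process noise variances) are hypothetical placeholders, not the ones used in the paper, and the paper's actual filter operates on 3D object positions rather than a single depth coordinate.

```python
class DepthKalmanFilter:
    """Sketch: 1-D constant-velocity Kalman filter over stereo depth.

    State x = [depth (m), depth rate (m/s)]. Hypothetical noise parameters.
    """

    def __init__(self, z0, dt=0.1, meas_var=0.5, accel_var=2.0):
        self.dt = dt
        self.x = [z0, 0.0]              # initial state: measured depth, zero velocity
        self.P = [[meas_var, 0.0],      # initial state covariance
                  [0.0, 1.0]]
        self.R = meas_var               # measurement noise variance (stereo depth)
        self.q = accel_var              # process noise (unmodeled acceleration)

    def predict(self):
        dt = self.dt
        z, v = self.x
        # x = F x, with F = [[1, dt], [0, 1]] (constant-velocity model)
        self.x = [z + v * dt, v]
        p00, p01 = self.P[0]
        p10, p11 = self.P[1]
        # P = F P F^T + Q, Q from continuous white-noise acceleration
        n00 = p00 + dt * (p10 + p01) + dt * dt * p11 + self.q * dt**4 / 4
        n01 = p01 + dt * p11 + self.q * dt**3 / 2
        n10 = p10 + dt * p11 + self.q * dt**3 / 2
        n11 = p11 + self.q * dt**2
        self.P = [[n00, n01], [n10, n11]]

    def update(self, z_meas):
        # Measurement model H = [1, 0]: the stereo camera observes depth only.
        y = z_meas - self.x[0]          # innovation
        s = self.P[0][0] + self.R       # innovation variance
        k0 = self.P[0][0] / s           # Kalman gain (position)
        k1 = self.P[1][0] / s           # Kalman gain (velocity)
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        p00, p01 = self.P[0]
        p10, p11 = self.P[1]
        # P = (I - K H) P
        self.P = [[(1 - k0) * p00, (1 - k0) * p01],
                  [p10 - k1 * p00, p11 - k1 * p01]]
        return self.x
```

Fed per-frame depth measurements of a tracked object (e.g. an approaching vessel), the filter converges to a smooth depth and an approach-velocity estimate, which is the kind of output the static/dynamic classification and the bird's eye view map can consume.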