With the continuous development of microfluidic technology, continuous-flow microfluidic biochips (CFMBs) are increasingly used in the Internet of Things, and their design automation has received widespread attention. The architecture design of CFMBs is divided into a high-level synthesis stage and a physical design stage, of which the physical design stage is particularly complex. In this stage, the chip architecture is generated from a device library and a set of flow paths, taking the actual fluid manipulations into account while minimizing the cost of the chip, such as the number of ports, the total length of flow channels, and the number of flow channel intersections. As fabrication technology advances, the number of devices integrated into CFMBs keeps increasing, and existing physical design algorithms can no longer meet the design requirements of CFMBs in terms of running time. We therefore propose a three-stage rapid physical design algorithm for CFMBs that considers the actual fluid manipulations. The algorithm consists of a port-driven preprocessing stage, a force-directed quadratic placement stage, and a negotiation-based routing stage. In the port-driven preprocessing stage, connection matrices between ports and devices are generated so as to reduce the number of ports introduced. In the force-directed quadratic placement stage, we model the placement problem as an extremum problem of a quadratic cost function, which significantly reduces the search space and thereby shortens the running time. In the negotiation-based routing stage, we propose a heuristic negotiation-based routing algorithm and a flow channel construction strategy that prioritizes parallel execution, reducing the running time while keeping the number of crossings in the routing solution close to that of the optimal solution.
Experimental results confirm that the proposed method generates high-quality solutions quickly. On general-scale problems, compared with the existing method based on integer linear programming (ILP), the proposed method achieves a speedup ratio of 23,171 in CPU time and improves the number of ports and port reuse by 3.18% and 6.52%, respectively. These improvements come at the cost of only a slight increase in the number of intersections, the flow channel length, and the number of flow valves. In addition, the proposed method can effectively solve large-scale problems that the existing ILP-based method cannot solve.
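To illustrate what treating placement as the extremum of a quadratic cost function can look like, the following is a minimal sketch and not the algorithm proposed in this work. It assumes that ports have fixed coordinates, that each connection carries a positive weight, and that the cost is the weighted sum of squared distances between connected elements. Setting the gradient of that cost to zero places every device at the weighted average of its neighbors, which the sketch solves by Gauss-Seidel sweeps. The function name `quadratic_placement`, its parameters, and the choice of solver are all hypothetical, and device overlap removal is ignored.

```python
def quadratic_placement(num_devices, fixed_ports, edges, sweeps=200):
    """Illustrative quadratic placement (assumed formulation, not the paper's).

    Minimizes sum over edges of w * ((xa - xb)^2 + (ya - yb)^2).
    num_devices: movable devices, indexed 0 .. num_devices - 1
    fixed_ports: dict mapping a port name (str) to its fixed (x, y)
    edges: list of (a, b, w); a and b are device indices or port names
    """
    # Every movable device starts at the origin; ports keep their coordinates.
    pos = {i: (0.0, 0.0) for i in range(num_devices)}
    pos.update(fixed_ports)

    # Neighbor lists are only needed for movable devices.
    neighbors = {i: [] for i in range(num_devices)}
    for a, b, w in edges:
        if a in neighbors:
            neighbors[a].append((b, w))
        if b in neighbors:
            neighbors[b].append((a, w))

    # Gauss-Seidel sweeps on the zero-gradient condition:
    # each device moves to the weighted average of its neighbors.
    for _ in range(sweeps):
        for i in range(num_devices):
            total = sum(w for _, w in neighbors[i])
            if total == 0:
                continue  # an unconnected device stays where it is
            x = sum(w * pos[j][0] for j, w in neighbors[i]) / total
            y = sum(w * pos[j][1] for j, w in neighbors[i]) / total
            pos[i] = (x, y)

    return [pos[i] for i in range(num_devices)]


# Two devices chained between two fixed ports, all with unit weights.
positions = quadratic_placement(
    num_devices=2,
    fixed_ports={"p_in": (0.0, 0.0), "p_out": (9.0, 3.0)},
    edges=[("p_in", 0, 1.0), (0, 1, 1.0), (1, "p_out", 1.0)],
)
```

Because the cost is quadratic, its extremum is the solution of a linear system rather than the result of a combinatorial search, which is the sense in which such a formulation narrows the search space. In this toy chain the two devices settle at evenly spaced points on the line between the two ports.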