Recently, the multi-cloud environment (MCE) has increasingly become the preferred choice of users. As in a single-cloud environment, efficient workflow scheduling in an MCE remains crucial for determining the cost efficiency and overall performance of the MCE. In an MCE, resources are heterogeneous, complex, and dynamic. At the same time, the intricate dependencies among workflow tasks, the diverse Quality of Service (QoS) metrics of users, and the differing billing mechanisms of multiple cloud service providers (CSPs) significantly amplify the workflow scheduling challenge. Motivated by the application of reinforcement learning (RL) to workflow scheduling in cloud environments, this paper proposes a scheduling algorithm that uses the asynchronous advantage actor–critic (A3C) algorithm to balance cost, makespan, and resource utilization in workflow scheduling in an MCE. By analyzing the elements of the MCE, we design and define multiple agents, with one agent per CSP that records the state and updates the local parameters. For each workflow task submitted by a user, an action is selected according to the initialization policy and executed as a scheduling action that allocates the task to a designated virtual machine in the MCE, so that each agent can perceive environmental changes more clearly and adapt to the MCE. In contrast to the traditional A3C algorithm, we design a new critic network based on the data characteristics of real-world scientific workflows, making each agent better suited to such workflow data. Through multiple sets of simulation experiments, the proposed A3C-based workflow scheduling algorithm for the MCE (MCWS-A3C) was compared with three benchmark methods. The experimental results show that the proposed method outperforms the benchmark methods in terms of cost, makespan, and resource utilization.
Specifically, on the Montage_100 dataset, the average cost was reduced by 55.12% compared to the other methods. The pioneering introduction of the A3C algorithm, which adapts to dynamic environments, into the MCE opens up more possibilities for addressing workflow scheduling in the MCE.
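
For readers unfamiliar with A3C, the following minimal sketch illustrates the standard n-step advantage estimate that an actor-critic worker typically computes before updating its local parameters. It is a generic illustration of the textbook technique, not the MCWS-A3C implementation: the function names, the discount factor `gamma`, and the idea that each reward is a scalar signal derived from cost, makespan, and resource utilization are all assumptions made here for illustration.

```python
from typing import List


def n_step_returns(rewards: List[float], bootstrap_value: float,
                   gamma: float = 0.99) -> List[float]:
    """Discounted returns for one rollout, computed backwards in time.

    `bootstrap_value` is the critic's estimate for the state reached after
    the last step (0.0 if the workflow finished). Hypothetical helper.
    """
    returns: List[float] = []
    running = bootstrap_value
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(rewards: List[float], values: List[float],
               bootstrap_value: float, gamma: float = 0.99) -> List[float]:
    """Advantage A_t = R_t - V(s_t), where V(s_t) comes from the critic."""
    returns = n_step_returns(rewards, bootstrap_value, gamma)
    return [ret - val for ret, val in zip(returns, values)]
```

In such a setup, each per-CSP agent would collect `rewards` and critic `values` over a short rollout of scheduling actions, then use the resulting advantages to weight the policy gradient; how MCWS-A3C actually defines its reward and critic is specified in the body of the paper, not here.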