Exchanges in culture, the economy, and politics are becoming increasingly close around the world. As the relationship between national music and cultural development draws more attention, the development of national music has become an inevitable trend of the present era. This paper performs virtual reconstruction of the sound image of folk music by modeling the virtual sound image as a multi-objective optimization problem. Based on research into sound image localization, the relevant noise factors are removed so that ethnic music can be played back with an enhanced surround effect and visual enjoyment. The evolutionary algorithm in this paper is built mainly on multi-objective optimized sound image localization technology. For the virtual sound image of music, an improved FCM algorithm and an evolutionary multi-objective optimization algorithm that combine local and non-local information of the sound image are proposed. Building on an analysis of traditional sound image algorithms, the approach effectively improves the accuracy of sound image localization on the horizontal plane. After the conversion, listeners perceive a better stereo image and a stronger sense of surround in the folk music. Experimental analysis shows that the system can not only perform virtual conversion of audio-visual signals of different frequencies but also provide data for different audio playback systems. The feature point registration error is low and the reconstruction effect is good, especially for random delay processing in the range of 0∼20m, and the performance is better than that of the traditional method. Finally, virtual left and right surround sound image signals are obtained, which effectively improve the three-dimensional surround feeling of folk music.
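For readers unfamiliar with the clustering step named above, the following is a minimal sketch of the standard algorithm, assuming that FCM denotes fuzzy c-means. It is offered as a point of reference only and is not the improved variant proposed in this paper: it contains none of the local or non-local sound image terms. The function name `fuzzy_c_means`, the fuzzifier `m = 2`, the stopping tolerance, and the one-dimensional toy data are all illustrative assumptions.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means (hypothetical helper, not the paper's variant).

    points     : list of equal-length feature tuples
    n_clusters : number of cluster centres
    m          : fuzzifier (> 1); m = 2 is a common default, assumed here
    Returns (centers, memberships), where memberships[i][k] is the degree
    to which point i belongs to cluster k.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Centre update: mean of the points weighted by u_ik ** m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )

        # Membership update: u_ik is proportional to d_ik ** (-2 / (m - 1)).
        max_change = 0.0
        for i in range(n):
            dist = [
                max(sum((points[i][d] - centers[k][d]) ** 2
                        for d in range(dim)) ** 0.5, 1e-12)
                for k in range(n_clusters)
            ]
            inv = [dk ** (-2.0 / (m - 1.0)) for dk in dist]
            total = sum(inv)
            new_row = [v / total for v in inv]
            max_change = max(
                max_change,
                max(abs(a - b) for a, b in zip(new_row, u[i])),
            )
            u[i] = new_row

        # Stop once no membership moved by more than the tolerance.
        if max_change < tol:
            break

    return centers, u


# Toy usage on two well-separated 1-D groups (illustrative data only).
toy_points = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
toy_centers, toy_memberships = fuzzy_c_means(toy_points, n_clusters=2)
print(sorted(c[0] for c in toy_centers))
```

Unlike hard clustering, each sample keeps a graded membership in every cluster. A spatially aware variant would typically add neighbourhood terms to the membership update, but the form those terms take in this work is not specified here.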