Exascale computing requires accelerators with ultrahigh power efficiency. Digital signal processors (DSPs), the most important embedded processors widely known for high power efficiency, are rarely explored in the HPC community. We propose a 64-bit general purpose DSP architecture, FT-Matrix2000, which not only integrates the main features of DSPs but also presents several novel enhancements for scientific computing. The FT-Matrix2000 architecture comprises multiple FT-Matrix2 cores and optional RISC CPU cores. The FT-Matrix2 core utilizes a VLIW+SIMD architecture, provides support for double precision operations, and optimizes both the data and control path for scientific computing. Our evaluations show that the performance and efficiency of FT-Matrix2000 are 1107GFLOPS and 92.25%. Compared with the MIC and a 40nm process GPU, FT-Matrix2000 improves the GEMM power efficiency with a factor of 1.49 and 2.68, respectively. We build up a prototype supercomputer with FT-Matrix2000/12. Its HPL efficiency achieves 62.2%, and the performance power ratio is 5.33 GFLOPS/W, which can rank the fourth in the latest Green500 list. These results validate that the FT-Matrix2000 architecture is suitable for scientific computing while maintaining the efficiency of signal processing well. Moreover, the enhancement of FT-Matrix2000 in vector and matrix related computations also enable it to efficiently support deep learning related applications. We have implemented some typical DCNN models on FT-Matrx2000, NVIDIA GPUs, and Vision P6 DSP. The experiments demonstrate that the average computation efficiency of the proposed architecture based on Matrix2000 is about 20 ~ 35% and 8% higher respectively than GPUs and Cadence Vision P6 DSP.