Missions under consideration, both near-Earth and deep-space, will require data recorder capacities to double approximately every three years. The same demand for ever-increasing mass storage exists in other applications, such as unmanned aerial vehicles (UAVs) and echo recording for phased array radar (PAR). All these scenarios call for storage devices with larger capacity, higher I/O bandwidth, lower latency, and smaller size. In this paper, we combine efficient Field Programmable Gate Array (FPGA)-based cores for the emerging Non-Volatile Memory Express (NVMe) protocol with Flash storage to improve I/O bandwidth and latency relative to the operating system (OS) storage I/O software stack. We provide an alternating operation scheme to keep the I/O bandwidth consistent. The device has two independent optical fiber channels to ensure the reliability of its interconnections, and four NVMe flash storage units that record data simultaneously, which increases its integration and scalability. The prototype has a capacity of 8 TB, a volume of only 990 cubic centimeters, and a weight of only 2.2 pounds. Experimental results demonstrate that the continuous I/O bandwidth of each channel is above 1 GBps, with variation of no more than 7% over the device's total capacity, and that the NVMe host logic core achieves up to 88% lower latency than the OS-based system.