In robot grasp detection, uncertainties such as differing shapes, colors, materials, and poses make robotic grasping highly challenging. This article introduces an integrated robotic system designed to grasp numerous unknown objects in a scene from a set of α-channel images. We propose a lightweight, object-independent, pixel-level generative adaptive residual depthwise separable convolutional neural network (GARDSCN) with an inference time of around 28 ms, making it suitable for real-time grasp detection. It effectively handles grasp detection for unknown objects of different shapes and poses across various scenes and overcomes limitations of current robot grasping technology. The proposed network achieves a grasp detection accuracy of 98.88% on the Cornell dataset and 95.23% on the Jacquard dataset. To further verify its effectiveness, grasping experiments were conducted on a physical Kinova Gen2 robot, achieving a grasp success rate of 96.67% in single-object scenes and 94.10% in multi-object cluttered scenes.
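The abstract does not spell out how a pixel-level generative network's output is turned into a grasp, but networks of this family typically predict per-pixel quality, angle, and width maps, from which the best grasp is decoded. Below is a minimal NumPy sketch of that common decoding step; the function name and map layout are illustrative assumptions, not the paper's actual API.

```python
import numpy as np

def decode_grasp(quality, angle, width):
    """Pick the best grasp from per-pixel prediction maps (H x W each).

    quality: predicted grasp success confidence per pixel
    angle:   predicted gripper rotation (radians) per pixel
    width:   predicted gripper opening per pixel
    Returns (row, col, angle, width) at the highest-quality pixel.
    NOTE: illustrative sketch, not the GARDSCN implementation.
    """
    r, c = np.unravel_index(np.argmax(quality), quality.shape)
    return int(r), int(c), float(angle[r, c]), float(width[r, c])

# Toy maps with the quality peak at pixel (1, 2)
q = np.zeros((3, 4)); q[1, 2] = 0.9
a = np.full((3, 4), 0.5)
w = np.full((3, 4), 40.0)
print(decode_grasp(q, a, w))  # → (1, 2, 0.5, 40.0)
```

In a full pipeline, the selected pixel is back-projected through the camera intrinsics (and the depth value at that pixel) to obtain a 3-D grasp pose for the robot.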