Abstract
Smart healthcare systems already exist in a variety of architectures, yet the search for better ones continues. The Internet of Things (IoT) and related technological developments enable improved smart healthcare systems built on sensor-based Body Area Networks (BANs). With these, a patient's sensor data can be collected, stored, and analyzed, and suitable treatment can be offered over the network anytime, anywhere. The hardest part of such systems is the physician's analysis of the large volume of patient data needed to prepare a suitable diagnosis and treatment. This article presents a Deep Reinforcement Learning methodology for smart healthcare decisions in an IoT-interfaced smart healthcare intelligent monitoring system. The system comprises four layers: patient data collection, edge computing, patient data transmission, and cloud computing. IoT is used to collect patient data automatically and to transmit it to data centers. Artificial intelligence techniques analyze these data to provide suitable decisions, diagnoses, and treatments for patients, with Deep Reinforcement Learning serving as the platform for these decisions. The methodology was evaluated on synthetic, simulated data from various BAN sensors. We developed a data set of size 286 containing 21 different health parameters. After pre-processing, these data were stored on an Amazon Web Services (AWS) cloud server using the Message Queuing Telemetry Transport (MQTT) IoT protocol. A Deep Q-Network (DQN) was used as the initial training algorithm. The methodology was implemented in PyTorch and run on a single GTX 1080 Ti X GPU with training data sizes from 27 to 1536. Training for 500 epochs took roughly 10,000 to 90,000 s.
In the high-dimensional action space, the algorithm was slow to analyze, explore, and determine effective healthcare strategies. The convergence of the estimated hidden health state (g') toward the actual health state (g) was evaluated for the 21 health parameters, whose values range from 0 to 1. The system produced decisive interventions that were close to a physician's decisions. The proposed methodology is a promising solution for smart and economical telemedicine.
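As a rough illustration of the kind of update a DQN relies on, the sketch below computes the standard one-step temporal-difference target, y = r + γ · max Q(s', a'). It is written in plain Python rather than PyTorch to stay self-contained. The function names, the discount factor, and the example numbers are assumptions made for illustration; they are not taken from this study, which does not state its reward design or hyperparameters here.

```python
from typing import Sequence


def dqn_target(reward: float,
               next_q_values: Sequence[float],
               gamma: float = 0.99,
               done: bool = False) -> float:
    """One-step TD target used by DQN-style methods.

    reward        -- immediate reward for the chosen intervention (hypothetical)
    next_q_values -- Q-value estimates for every action in the next state
    gamma         -- discount factor (assumed value, not from the paper)
    done          -- True if the episode ended, so there is no future return
    """
    if done:
        return reward
    # Bootstrap from the best action available in the next state.
    return reward + gamma * max(next_q_values)


def td_error(current_q: float, target: float) -> float:
    """Difference between the target and the current Q estimate."""
    return target - current_q


if __name__ == "__main__":
    # Hypothetical numbers: two candidate interventions in the next state.
    y = dqn_target(reward=1.0, next_q_values=[0.2, 0.5], gamma=0.9)
    print("target:", y)
    print("TD error:", td_error(current_q=1.0, target=y))
```

In a full DQN, a neural network would produce the Q-values and the squared TD error would be minimized by gradient descent; the arithmetic of the target stays the same.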