Quality variables often cannot be measured automatically at all, or can be measured only at high cost, infrequently, or with long delays, for example through laboratory analysis or online analysers. Data-driven soft sensors address this problem: they are inferential models that use readily available online sensors, such as temperature, pressure, and flow rate, to predict the quality variables. Soft sensors are normally built from historical process data recorded by supervisory control and data acquisition (SCADA) systems connected to PLC or distributed control systems (DCS), such as the daily reports on oil refinery processes, together with the corresponding laboratory observations and measurements. The main issues in soft sensor development are the treatment of missing data, outlier detection, selection of input variables, model training, validation, and soft sensor maintenance, all of which must be addressed before the sensor can be adopted in heavy-duty oil refineries to improve crude oil products and increase yield. In this article, a hybrid soft computing method that combines a fuzzy logic system (FLS) and a neural network (NN) into an adaptive neuro-fuzzy inference system (ANFIS) is employed to construct an improved soft sensor (virtual sensor) model. Rough set theory (RST) is used to reduce the number of fuzzy rules, and a discretisation method is used to handle the large volume of continuous data. Together, these two methods were found to resolve the complexity and nonlinearity of the soft sensor model. The model was applied to refining process measurement data from an oil refinery handling two different crude oil sources; the measurement and process databases were combined to improve data quality and to discover the knowledge stored in the data patterns. The results indicate that the ANFIS model handles the complex data better than a simple regression model when predicting two important properties of light naphtha, API gravity (API) and Reid vapour pressure (RVP).
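The discretisation step mentioned above can be illustrated with a minimal sketch. The abstract does not state which discretisation scheme was used, so the example assumes simple equal-width binning; the function name `equal_width_bins` and the temperature readings are hypothetical and serve only as an illustration.

```python
def equal_width_bins(values, n_bins):
    """Map each continuous value to a bin index in [0, n_bins - 1].

    Assumption: equal-width binning; the study may have used another scheme.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so everything falls into a single bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # Clamp the index so the maximum value lands in the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical temperature readings (degrees Celsius), not data from the study.
temperatures = [80.0, 95.5, 101.2, 110.0, 87.3, 120.0]
labels = equal_width_bins(temperatures, 3)
print(labels)
```

Replacing each continuous reading with a small set of interval labels is what allows rough set methods, which operate on discrete attribute values, to be applied to process data.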
Additionally, process control and monitoring are crucial to realising the fourth industrial revolution and the Internet of Things (IoT). This study helps to break down the data privacy barriers between oil companies and demonstrates the applicability of soft sensor modelling when the data source changes. The RVP results show the effectiveness of ANFIS compared with linear regression in terms of generalisation and resistance to overfitting.
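One common way to examine generalisation and overfitting is to compare the error on the training data with the error on held-out data. The sketch below does this for the linear regression baseline only, using a one-variable least-squares fit. The numbers are synthetic, and the helper names `fit_line` and `rmse` are hypothetical; an ANFIS model would be assessed with the same train/test comparison.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(xs, ys, slope, intercept):
    """Root-mean-square error of the fitted line on (xs, ys)."""
    squared = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic data for illustration only, not measurements from the study.
train_x, train_y = [1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8]
test_x, test_y = [5.0, 6.0], [10.1, 11.8]

slope, intercept = fit_line(train_x, train_y)
train_rmse = rmse(train_x, train_y, slope, intercept)
test_rmse = rmse(test_x, test_y, slope, intercept)

# A test error far above the training error suggests poor generalisation.
print(f"train RMSE = {train_rmse:.3f}, test RMSE = {test_rmse:.3f}")
```

A small gap between the two errors indicates that the model generalises; a large gap indicates overfitting to the training data.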