Machine Learning-Driven Customer Segmentation: A Behavior-Based Approach for F&B Providers
This study explores behavior-based customer segmentation by integrating Recency, Frequency, and Monetary value (RFM) analysis with the K-Means++ clustering algorithm. Using one year of invoice-level transactional data from a Romanian Food and Beverage (F&B) provider serving restaurants and coffee shops, the research aims to deliver actionable insights to enhance marketing and sales strategies. After standardizing the dataset to address scale differences, the Elbow Method was applied to determine the optimal number of clusters, resulting in five distinct customer groups: Champions, Loyal Customers, Promising, Hibernating Customers, and Lost Customers. Notably, the Champion segment, consisting of a single customer, accounts for 15% of total sales, highlighting both profitability and dependence risks. Loyal and Promising customers were identified as the most strategically valuable segments for targeted retention and growth initiatives. The clustering results were validated through visualization techniques and internal metrics, confirming the effectiveness of the segmentation. By relying exclusively on transactional data, this approach ensures GDPR compliance and offers a scalable framework for continuous monitoring and dynamic strategy adaptation. The findings provide immediate financial implications for the company, illustrating the potential of machine learning-driven behavior-based segmentation in B2B markets with frequent, recurring transactions.
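The pipeline described above (standardize RFM values, cluster with K-Means++, read the elbow curve) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `standardize`, `kmeans`, and `elbow_curve` are hypothetical, each input row is assumed to be a `(recency, frequency, monetary)` tuple, and a production analysis would more likely rely on an established clustering library.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so R, F and M share a common scale."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [tuple((v - mu) / sd for v, mu, sd in zip(row, mus, sds)) for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, rng):
    """K-Means++ seeding: sample each new center with probability ~ D^2."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(d2) == 0:  # every point already coincides with a center
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j])) for p in points]


def kmeans(points, k, seed=0, iters=100):
    """Lloyd's algorithm with K-Means++ seeding; returns labels, centers, inertia."""
    rng = random.Random(seed)
    centers = kmeans_pp_init(points, k, rng)
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # keep the old center if a cluster ends up empty
            new_centers.append(tuple(mean(d) for d in zip(*members)) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    labels = assign(points, centers)
    inertia = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, inertia


def elbow_curve(points, k_max, seed=0):
    """Within-cluster sum of squares for k = 1..k_max."""
    return {k: kmeans(points, k, seed)[2] for k in range(1, k_max + 1)}
```

Calling `elbow_curve(standardize(rfm_rows), 10)` gives the inertia per candidate `k`; the "elbow" is the value of `k` after which the curve flattens, which is a visual judgment rather than a fixed rule.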
- Research Article
2
- 10.36713/epra17685
- Jul 10, 2024
- EPRA International Journal of Economic and Business Review
This research presents a comprehensive approach to customer segmentation using Recency, Frequency, and Monetary (RFM) analysis, combining statistical insights, data visualization, and machine learning techniques. The study utilizes a real-world dataset obtained from a retail environment, aiming to categorize customers based on their recent purchasing behavior, visit frequency, and monetary contributions to the store. The code begins with data preparation and exploration, ensuring data integrity by addressing issues such as negative quantities and missing customer identifiers. Following this, the Recency, Frequency, and Monetary metrics are computed, providing a holistic view of customer engagement and spending patterns. Visualizations, including violin plots, histograms, and box plots, are employed to intuitively convey the distribution of these metrics. The research then delves into the quantile-based segmentation of customers, allowing for a more granular classification. Quantiles are calculated to divide customers into four segments for each RFM metric. The resulting quantile labels are applied to the dataset, enabling the creation of a compound RFM quantile that combines recency, frequency, and monetary information. This combined quantile facilitates the definition of distinct customer segments. To further enhance the interpretability of customer segments, the study introduces a set of rules for labeling customers based on their RFM quantiles. These rules yield segments such as "Best Customer," "Loyal Customer," "Big Spender," "Dead Beats," and "Lost Customer." The resulting customer segmentation is presented visually through histograms and a pie chart, providing a clear and concise representation of the distribution of customers across different segments. 
Moreover, the research integrates machine learning models, including XGBClassifier and CatBoostClassifier, to explore the potential of automating the segmentation process and predicting customer segments based on historical data. However, the machine learning aspect is introduced with commented-out sections, leaving room for further exploration and experimentation. In conclusion, this research contributes a comprehensive and detailed code implementation for RFM-based customer segmentation. The integration of visualization techniques aids in the interpretation of customer behavior, while the inclusion of machine learning models opens avenues for predictive analytics in customer segmentation. The presented approach provides valuable insights for businesses seeking to tailor marketing and customer relationship strategies based on individualized customer segments. Keywords: Customer Segmentation, Frequency, Monetary Value, Recency, RFM Analysis.
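The quantile-based scoring described in this abstract can be illustrated with a short sketch. The abstract does not give the labelling rules, so the rules in `label_segment` below are hypothetical, as are the function names and the convention that a score of 4 is best; only the segment names come from the abstract.

```python
from statistics import quantiles


def quartile_score(value, cuts, higher_is_better=True):
    """Map a value to a 1-4 score using three quartile cut points."""
    rank = 1 + sum(value > c for c in cuts)  # 1 = lowest quartile, 4 = highest
    return rank if higher_is_better else 5 - rank


def rfm_quartiles(customers):
    """customers: {customer_id: (recency_days, frequency, monetary)}."""
    cuts = [quantiles(col, n=4) for col in zip(*customers.values())]
    scores = {}
    for cid, (r, f, m) in customers.items():
        scores[cid] = (
            quartile_score(r, cuts[0], higher_is_better=False),  # recent buyers score high
            quartile_score(f, cuts[1]),
            quartile_score(m, cuts[2]),
        )
    return scores


def label_segment(score):
    """Hypothetical rules; the paper's exact rules are not stated in the abstract."""
    r, f, m = score
    if score == (4, 4, 4):
        return "Best Customer"
    if score == (1, 1, 1):
        return "Dead Beats"
    if r == 1:
        return "Lost Customer"
    if f == 4:
        return "Loyal Customer"
    if m == 4:
        return "Big Spender"
    return "Other"
```

The compound RFM quantile mentioned in the abstract corresponds to the `(R, F, M)` tuple here; joining the three digits into a string such as `"444"` would be an equivalent representation.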
- Research Article
113
- 10.1016/j.tmp.2016.03.001
- Mar 12, 2016
- Tourism Management Perspectives
Using data mining techniques for profiling profitable hotel customers: An application of RFM analysis
- Research Article
- 10.31539/budgeting.v5i2.8994
- Apr 4, 2024
- BUDGETING : Journal of Business, Management and Accounting
The pharmaceutical company studied here imports raw materials on a fairly large scale and provides many benefits to society and to institutions such as hospitals. Pharmaceutical companies play an important role in improving quality of life in modern times; in marketing, they face the challenge of increasing sales performance and profits while maintaining customer loyalty. Pharmacy retail customers usually make drug purchases influenced by the selling price and by suitability factors (suggestions) for certain drug brands. Under these conditions, drug purchasing patterns among Indonesian consumers become unpredictable, and it is difficult to increase sales and profits. One effort that pharmaceutical businesses can make is to run sales promotions based on customer segmentation. Customer segmentation in pharmaceutical companies can be done using clustering-based data mining methods, such as modified Recency Frequency Monetary (RFM) analysis. This method allows companies to group customers by their purchasing patterns for pharmaceutical products, and thus to prioritize energy and resources across different segments. After scoring and data processing, the number of customers for each RFM score is obtained, and the Monetary group is then segmented into four parts: Best Customers (36), Loyal Customers (188), Potential Customers (34), and Lost Customers (61). These are then mapped into only three parts, namely Best Customers, Loyal Customers, and Potential Customers, using shades of blue to indicate the score range. Among the three segments, the Loyal Customer segment is the largest (188), so its blue is darker than the others, which the authors associate with higher customer spending.
All three customer segments are placed in the Best Customer category, because they have introduced new products or products they have not purchased. By using RFM analysis, companies can quickly identify the customer targets to prioritize for marketing, campaigns, promotions, and rewards through digital channels and direct customer relations. Keywords: Pharmaceutical Company, Group Segmentation, Recency Frequency Monetary (RFM).
- Research Article
19
- 10.30880/ijie.2019.11.03.018
- Sep 1, 2019
- International Journal of Integrated Engineering
The CLV model is a measure of customer profit for a company that can be used to evaluate the future value of a customer. This study aims to obtain the Customer Lifetime Value (CLV) of each customer segment. Grouping uses the K-Means clustering method based on the LRFM model (Length, Recency, Frequency, Monetary). The cluster formation process uses the Elbow Method and SSE, with the best number of clusters being 2. CLV values are generated by multiplying the normalized LRFM values by the LRFM weights and summing the results, and this is carried out for each cluster that has been formed. Of the 2 clusters, the second ranks highest, with a CLV value far above the other cluster's average of 0.362. Based on the LRFM matrix, this cluster has a high loyalty value, with the LRFM symbol L ↑ R ↑ F ↑ M ↑, which denotes loyal customers (the best segment, with high customer loyalty value). Based on the LRFM symbol, the company can devise a strategy to retain customers and to turn acquired customers into loyal customers with high profitability.
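The CLV calculation described above (normalized LRFM values multiplied by weights, summed, then averaged per cluster) can be sketched as follows. The abstract does not state the weights or the normalization, so min-max scaling, the equal weights in the usage note, and the function names are all assumptions; the sketch also assumes every attribute, including recency, is already coded so that a higher value is better.

```python
def min_max(values):
    """Scale a column to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def clv_scores(rows, weights):
    """rows: list of (L, R, F, M) tuples; weights: four numbers (hypothetical)."""
    normalized = [min_max(col) for col in zip(*rows)]
    return [
        sum(w * col[i] for w, col in zip(weights, normalized))
        for i in range(len(rows))
    ]


def mean_clv_by_cluster(scores, labels):
    """Average the per-customer CLV score within each cluster label."""
    grouped = {}
    for score, lab in zip(scores, labels):
        grouped.setdefault(lab, []).append(score)
    return {lab: sum(vals) / len(vals) for lab, vals in grouped.items()}
```

For example, `mean_clv_by_cluster(clv_scores(rows, (0.25, 0.25, 0.25, 0.25)), labels)` ranks clusters by average score; with weights that sum to 1 the score stays in the range 0 to 1.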
- Conference Article
10
- 10.1109/is3c.2016.126
- Jul 1, 2016
Due to fierce competition, veterinary hospitals have to maintain good relationships with their existing customers and attract new customers. In order to identify critical customers, data mining techniques, particularly cluster analysis, are viewed as a vital tool to facilitate customer relationship management. This study uses a veterinary hospital located in Taichung City, Taiwan as an example, analyzing its 2014 transaction data for 4,472 customers and focusing solely on dogs. Recency, frequency, and monetary are the three input variables for cluster analysis. A combination of self-organizing maps and the K-means method is used for cluster analysis. The results show that seven out of twelve clusters are found to be the best or loyal customers, while three clusters are uncertain or lost customers. Two clusters with relatively higher recency values than average can be viewed as new customers. Once customers are classified, this veterinary hospital can provide different marketing strategies to meet different customer needs.
- Research Article
3
- 10.22619/ijcsa.2017.100116
- Nov 9, 2017
- International Journal on Cyber Situational Awareness
Loyalty of customers to a supermarket can be measured in a variety of ways. If a customer tends to buy from certain categories of products, it is likely that the customer is loyal to the supermarket. Another indication of loyalty is based on the tendency of customers to visit the supermarket over a number of weeks. Regular visitors and spenders are more likely to be loyal to the supermarket. Neither one of these two criteria can provide a complete picture of customers’ loyalty. The decision regarding the loyalty of a customer will have to take into account the visiting pattern as well as the categories of products purchased. This paper describes results of experiments that attempted to identify customer loyalty using these two sets of criteria separately. The experiments were based on transactional data obtained from a supermarket data collection program. Comparisons of results from these parallel sets of experiments were useful in fine-tuning both schemes for estimating the degree of loyalty of a customer. The project also provides useful insights for the development of more sophisticated measures for studying customer loyalty. It is hoped that the understanding of loyal customers will be helpful in identifying better marketing strategies. 1. Introduction. Loyalty is an important component of marketing analysis in a supermarket. The loyalty of a customer may be apparent through the products bought by the customer. Certain product categories such as bread and eggs may have a higher ability to distinguish between loyal and disloyal customers. Other product categories such as coffee/tea and ketchup may not be deterministic of a customer’s loyalty but may simply enhance their degree of loyalty. Establishing a scoring system based on such key product categories is one possible way of determining customer loyalty. However, the dietary habits of some loyal customers may lead to lower loyalty scores if they are based solely on product categories.
Studying patterns in transactional records can also provide important clues about the loyal patrons of the supermarket. It is important to conduct parallel analyses of products purchased and transaction patterns for identifying loyal customers. The two separate analyses can also be used for fine-tuning each other. This paper reports the results of experiments that studied various characteristics of loyal customers based on the products purchased and visiting patterns. The experiments were based on data obtained from a large national supermarket chain, gathered over a thirteen-week period in 2000. The project was divided into two parallel streams: product-based and transaction-pattern-based analyses. The product-based analysis started with a preliminary definition of loyal customers, based on spending levels. The authors would like to thank NSERC Canada, the Nova Scotia Cooperative Employment Program, and the Senate Research Grant Committee of Saint Mary’s University for the financial support. The authors are also grateful to the supermarket chain and its management for allowing us the use of the data.
- Research Article
- 10.22034/gahr.2021.294817.1581
- Sep 23, 2021
Today, public spaces such as cafes and restaurants play an important role in people's daily lifestyle. The use of coffee shops has increased significantly compared to the past, and consequently so has the number of coffee shops. For this reason, attracting customers and turning them into loyal customers is of considerable importance for coffee shop owners. Since the sense of belonging to a place creates an emotional interaction and connection between person and place, and leads to a strong desire to be present in, stay in, and return to that place, the physical parameters that promote a sense of belonging should be a main concern. The purpose of this study is to explain the concept of environment, to recognize the physical factors affecting it in the coffee shop, and to examine how these factors work in a case study (Ivy Coffee Shop). The present research is applied and qualitative, and descriptive-analytical in nature; the first part was carried out through documentary and library studies and the second part through survey and field studies. The results show that the factors affecting the creation and promotion of a sense of belonging to place in the coffee shop can be divided into physical, functional, social, cultural, personal, semantic, and temporal factors. The physical component includes elements of light, color, materials, nature, furniture, texture and decoration, and size and scale. The quality and desirability of these characteristics play an important role in increasing the sense of belonging to place among coffee shop users. The most important physical characteristics for creating a sense of belonging in the case model (Ivy Coffee Shop) are indigenous materials, connection with nature through the use of natural elements, and special attention to artificial lighting.
- Research Article
61
- 10.1007/s11747-016-0491-8
- Aug 8, 2016
- Journal of the Academy of Marketing Science
Service firms are encouraged by historic evidence that loyal customers are less price sensitive. Yet, some research has challenged the assertion while others have demonstrated considerable heterogeneity within loyal segments. Aiming to reconcile this debate, we investigate the relationship between customers’ behavioral loyalty and the importance they place on price relative to two managerially relevant service attributes: rewards and convenience. We also assess the moderating role of attitudinal loyalty resulting from superior service experience. Results from a longitudinal survey and transaction data from an airline carrier show that as customers’ behavioral loyalty increases, they place more importance on price and less importance on rewards and convenience, revealing that behavioral loyalty causes a shift in emphasis toward price. As a result, behaviorally loyal customers spend less and revenue decreases. However, by improving attitudinal loyalty, firms achieve the desired outcome of reducing price sensitivity and increasing revenue. Specifically, after experiencing better service, behaviorally loyal customers focus less on price and instead shift their focus toward rewards and convenience, and this results in revenue gains for the firm. Overall, attitudinal loyalty from better service experience acts as a key mitigator of the positive link between behavioral loyalty and price sensitivity.
- Research Article
- 10.26798/jiko.v9i2.1737
- Jun 20, 2025
- JIKO (Jurnal Informatika dan Komputer)
A competitive retail industry requires a deep understanding of customer needs in order to design relevant and effective marketing strategies. This study aims to segment customers at Toko Mitra 10 Cirebon using Recency, Frequency, and Monetary (RFM) analysis combined with the K-Means algorithm. The segmentation is intended to support more targeted marketing strategies and to increase customer loyalty. The data come from customer transaction records over a certain period. RFM values are computed for each customer based on Recency (time since the last transaction), Frequency (number of transactions), and Monetary (total transaction value). The K-Means method is used to group customers into several segments, with the optimal number of clusters determined by the elbow method. The analysis yields three main segments: Lost Customers, with high Recency, low Frequency, and low Monetary; Potential Loyalists, with moderate Frequency and varying Monetary; and Loyal Customers, with high Frequency and a significant Monetary contribution. The segmentation results support distinct marketing strategies for each cluster: reactivation campaigns for Lost Customers, loyalty programs for Potential Loyalists, and exclusive services for Loyal Customers. This data-driven approach improves marketing effectiveness, customer loyalty, and revenue contribution, while underscoring the importance of data analysis in making relevant and personalized marketing decisions.
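The three RFM definitions given in this abstract (time since the last transaction, number of transactions, total transaction value) translate directly into a small aggregation. The sketch below is illustrative only: the tuple layout `(customer_id, date, amount)`, the `snapshot` reference date, and the function name `compute_rfm` are assumptions, and each tuple is assumed to represent one whole transaction rather than one invoice line.

```python
from collections import defaultdict
from datetime import date


def compute_rfm(transactions, snapshot):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    transactions: iterable of (customer_id, transaction_date, amount).
    snapshot: reference date from which recency is measured.
    """
    last_seen = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for cid, day, amount in transactions:
        last_seen[cid] = max(last_seen.get(cid, day), day)  # most recent purchase
        frequency[cid] += 1                                  # one row = one transaction
        monetary[cid] += amount                              # running total spend
    return {
        cid: ((snapshot - last_seen[cid]).days, frequency[cid], monetary[cid])
        for cid in last_seen
    }


# Example with made-up records
sample = [
    ("a", date(2024, 1, 1), 10.0),
    ("a", date(2024, 1, 11), 5.0),
    ("b", date(2024, 1, 5), 7.5),
]
rfm = compute_rfm(sample, snapshot=date(2024, 1, 21))
# rfm["a"] is (10, 2, 15.0); rfm["b"] is (16, 1, 7.5)
```

A high recency value here means a long time since the last purchase, which matches the abstract's description of Lost Customers as having high Recency.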
- Research Article
5
- 10.1080/09720510.2019.1565445
- Mar 26, 2019
- Journal of Statistics and Management Systems
This study adopts a two-stage clustering technique, combining self-organizing maps and the K-means method, together with the RFM (recency, frequency, and monetary) model to categorize customers of a veterinary hospital in Taichung City, Taiwan, in order to devise effective marketing strategies in this competitive market. Based on 1,784 customers in 2014, focusing solely on cats, twelve clusters are formed. Specifically, six out of twelve clusters are classified as the best and loyal customers. Three clusters are uncertain and lost customers. One cluster is viewed as the best but lost customers. Finally, based on the RFM model, two clusters that have relatively higher recency values than the average value belong to uncertain but new customers.
- Research Article
3
- 10.1051/e3sconf/202448402008
- Jan 1, 2024
- E3S Web of Conferences
The company’s approach to customers is important to maintain the company’s profits. Understanding the differences of each customer is very important so that we can understand customer needs based on customer data. Customer Relationship Management (CRM) is considered as a solution to bridge the company and customers. Customer segmentation needs to be done to make it easier for companies to meet customer needs. Data mining and RFM modelling are used for customer segmentation in online retail companies using K-Means and K-Medoids methods. This research compares the performance of both algorithms using Davies Bouldin Index (DBI) and execution time. The results show K-Means is better in cluster validation and execution time. The average DBI value of K-Means is 0.2962 with an execution time of 0.0960 with k=3, while K-Medoids produces a DBI of 0.8942 and an execution time of 2.4295 with k=5. K-Means RFM customer tiers 1-3: Potential Customer/Golden Customer, Lost Customer/ Dormant Customer, and Superstar/Core Customer, 1-5: Champion and Lost. K-Medoids RFM 1-5: Lost, Loyal Customer, Champion, At Risk, and Hibernating, 1-3: Lost Customer/Dormant Customer, Potential Customer/Golden Customer, Superstar/Core Customer, Potential Customer/Golden Customer, and At Risk Customers/Occasional customer.
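The Davies-Bouldin Index (DBI) used above to compare K-Means and K-Medoids can be computed from any labelled clustering. The sketch below follows the standard definition with Euclidean distances and centroid-based scatter; the function name is hypothetical, and for K-Medoids one might prefer medoids over centroids as cluster representatives, which this sketch does not do.

```python
from math import dist
from statistics import mean


def davies_bouldin(points, labels):
    """Davies-Bouldin index: lower values indicate tighter, better-separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    # Cluster representative (centroid) and average distance of members to it
    centroids = {lab: tuple(mean(d) for d in zip(*pts)) for lab, pts in clusters.items()}
    scatter = {
        lab: mean(dist(p, centroids[lab]) for p in pts)
        for lab, pts in clusters.items()
    }

    # For each cluster, take its worst (largest) similarity ratio to any other cluster
    worst = []
    for i in clusters:
        worst.append(max(
            (scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
            for j in clusters if j != i
        ))
    return mean(worst)
```

The function assumes no two clusters share the same centroid, since that would make the separation term zero.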
- Conference Article
- 10.1109/icsai57119.2022.10005470
- Dec 10, 2022
With the increasing intensity of competition in the current metrology testing market, building customer portraits is an effective way for metrology institutions to improve their service to customers. This paper is based on the basic business data of a certain metrology institution. First, the recency, frequency, monetary value model (RFM model), which is widely applied in customer relationship management, is improved. It is then combined with the business features of the metrology institution and used for feature engineering that is closely tied to the institution's business data and reflects the state of the data. Next, the data are analyzed through a correlation test, standardized by Z-score, and clustered with three clustering algorithms, namely K-Means, DBSCAN, and AGNES, as provided by the scikit-learn (sklearn) library for Python. After that, the clustering results are compared. In the clustering process, the elbow method and a traversal of the silhouette coefficient are used to determine the optimal parameter value for the clustering algorithm. Finally, through analysis of the clustering results, the features of the metrology institution's customers are labeled and the customer portrait is built, which provides data analysis methods, tools, and a decision basis for the metrology institution to offer better services.
- Research Article
- 10.26877/asset.v6i2.18506
- Apr 19, 2024
- Advance Sustainable Science, Engineering and Technology
Sales transaction data contains rich information that can potentially be used to support company competitiveness. However, interpreting and utilizing transaction data in developing marketing strategies remains a challenge, even for big companies. Therefore, this research aims to develop marketing strategies using data mining techniques. A medium-sized company focusing on producing and selling traditional motif clothes (batik) is used as a case study. The negative sales trend is the biggest issue currently faced by the company. Hypothetically, this problem is caused by imported products sold at lower prices or by changing consumer behavior after the COVID-19 pandemic. Currently, the company only performs simple analysis of its transaction data. The analysis of transaction data, conducted through five data mining stages, revealed a shift from purchasing small quantities to larger quantities, increased purchases during the final week of each month, and increased purchases on religious occasions. Furthermore, the analysis revealed that 31.29% of all transactions were attributed to loyal consumers, and 192 customers fell in Cluster 1 (high transaction quantities and high transaction values). Further investigation also revealed that customers categorized as loyal customers and those in Cluster 1 have different behaviors that can be used to develop further customer relationship programs. Future research can employ data mining techniques to study the organization's assortment of products. Management discussions reveal that changes in consumer buying behavior extend to the selection of items and batik themes.
- Research Article
- 10.52436/1.jutif.2024.5.2.1497
- Apr 15, 2024
- Jurnal Teknik Informatika (Jutif)
The rapid development of online business in recent years has driven Store X to embark on a digital transformation. By the end of 2020, Store X had moved its conventional business online. The greatest obstacle and key to success for online business operators, such as Store X, is gaining and retaining consumer loyalty in the face of an increasing number of competitors. Therefore, the company must be able to identify the character (behavior) of its clients to provide appropriate treatment. Each customer's behavior is unique, which means they must all be treated differently. However, until now, Online Store X has given the same treatment (the same discount) to all its customers due to a lack of information about their characteristics. Therefore, in this study, customers of Online Store X were segmented based on their transactional behavior using online transaction history data from March 2021 to March 2023. Two customer analysis models, LRFM and MLRFM, are combined with RM K-Means to find the best combination through Silhouette Coefficient values. The optimal number of clusters (k) is then determined using the Elbow Method. The results indicate that the optimal number of clusters for both combinations is K=3, with the combination of MLRFM and RM K-Means being the best. This combination has a silhouette coefficient value of 0.8609. Based on this combination, 2,053 customers in cluster 3 are loyal customers, while 2,339 customers in clusters 1 and 2 are lost customers. The results of this study were also implemented on a website built for Store X using the Python programming language and a MySQL database, making it easier for the company to view data visualizations.
- Research Article
3
- 10.35200/explore.v12i1.548
- Jan 10, 2022
- EXPLORE
XYZ online bookstore is a company in the online book sales industry located in Jakarta, Indonesia. The marketing strategy it offers customers has not been fully developed, so it has not been able to increase book purchase transactions. A customer-centered marketing strategy is therefore needed, implemented through Customer Relationship Management. One applicable method is customer segmentation, which can be carried out through a data mining process using the K-means clustering algorithm based on the RFM (Recency, Frequency, Monetary) model. The number of clusters is determined using the elbow method. Cluster performance is tested using the silhouette method and the Calinski-Harabasz index. The cluster analysis based on customer value, using the RFM Combination and Customer Value Matrix methods, shows that the RFM Combination method produces 3 types of customer characteristics, namely loyal customers, new customers, and lost customers. Meanwhile, the Customer Value Matrix method produces 2 types of customer characteristics, namely best customer and uncertain customer.
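The two internal validation measures named in this abstract, the silhouette coefficient and the Calinski-Harabasz index, can be sketched from their standard definitions. This is an illustration with Euclidean distances and hypothetical function names, not the study's own code; it assumes at least two clusters and some within-cluster spread.

```python
from math import dist
from statistics import mean


def _group(points, labels):
    """Collect points by cluster label."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return clusters


def silhouette_score(points, labels):
    """Mean silhouette over all points; ranges from -1 to 1, higher is better."""
    clusters = _group(points, labels)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster
        b = min(mean(dist(p, q) for q in pts)
                for other, pts in clusters.items() if other != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return mean(scores)


def calinski_harabasz(points, labels):
    """Ratio of between-cluster to within-cluster dispersion; higher is better."""
    clusters = _group(points, labels)
    n, k = len(points), len(clusters)
    overall = tuple(mean(d) for d in zip(*points))
    between = within = 0.0
    for pts in clusters.values():
        centroid = tuple(mean(d) for d in zip(*pts))
        between += len(pts) * dist(centroid, overall) ** 2
        within += sum(dist(p, centroid) ** 2 for p in pts)
    return (between / (k - 1)) / (within / (n - k))
```

Both scores are typically compared across candidate values of `k` alongside the elbow curve, since the elbow alone can be ambiguous.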