Significant research has been conducted on business process (BP) mining, leading to mature solutions for discovering process knowledge. These solutions have generally been limited to the analysis of structured event logs generated by BP management systems (BPMS). Given the recent spread of digital workplaces, there have been several initiatives to extend the scope of these analyses to other information systems (IS) that support BP execution informally. In particular, emailing systems have attracted much attention because they are widely used as alternative tools for collaboratively performing BP activities. However, due to the unstructured nature of email log data, traditional process mining techniques cannot be applied, or at least not directly. Existing approaches that discover BP from emails are usually supervised or at least require significant human intervention. They focus on discovering BP with respect to their behavioral perspective (i.e., the conditions for activity execution) while neglecting their data perspective (i.e., the informational entities manipulated by BP activities). In addition, they do not study how emailing systems are used in the context of BP execution; they assume that employees use emailing systems in the same way they use ordinary BPMS. However, employees actually use emails to perform poorly structured BP fragments (i.e., parts) rather than complete and well-structured ones. These BP fragments are not necessarily defined in advance, as they are in a BPMS. This creates the need to discover the BP functional perspective (i.e., what a BP performs and what its activities are). Furthermore, employees use emails for different purposes when discussing BP activities (e.g., informing about activity execution, or requesting or planning activity execution).
This results in new event types that refer to the purpose for which activities are mentioned in emails, rather than events that refer only to their execution. In this paper, we propose to discover BP from email logs with respect to their functional, data and behavioral perspectives. The paper first formalizes these perspectives. It then introduces a fully unsupervised approach for discovering them based on: (i) speech act detection to recognize the purposes for which activities are mentioned in emails, (ii) overlapping clustering of activities to discover the artifacts (i.e., informational entities) they manipulate, (iii) overlapping clustering of BP elements (i.e., activities, artifacts and activity actors) to discover BP fragments, and (iv) mining sequencing constraints between event types deduced from activities and speech acts to discover the behavioral perspective. Finally, our approach is validated using the public Enron email dataset.