Abstract
Event logs are the main input for business process mining techniques. However, not all information systems produce a standard event log. Furthermore, a log may reflect only part of a process that spans multiple systems. We suggest using network traffic data to fill these gaps. However, traffic data is interleaved and noisy, and there is a conceptual gap between this data and event logs at the business level. This paper proposes a method for producing event logs from network traffic data. The specific challenges addressed are (a) abstracting the low-level data to business-meaningful activities, (b) overcoming the interleaving of low-level events caused by the concurrency of activities and processes, and (c) associating the abstracted events with cases. The method uses two trained sequence models based on conditional random fields (CRFs), applied to data reflecting interleaved activities. We use simulated traffic data generated by a predefined business process. The data is annotated for sequence learning to produce models that identify concurrently performed activities and cases, from which an event log is produced. Conformance checking of this event log against the process models yields high fitness and precision scores.
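To make the final step concrete, the following is a minimal sketch of how labeled low-level packets might be collapsed into an event log. It is not the method of the paper: the `Packet` fields, the `build_event_log` helper, and the sample data are hypothetical, and the activity and case labels are assumed to have already been produced by the two sequence models. The sketch handles interleaving across cases only; it does not separate concurrent activities within a single case.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Packet:
    timestamp: float
    activity: str  # assumed output of an activity-labeling model (hypothetical)
    case_id: str   # assumed output of a case-labeling model (hypothetical)


def build_event_log(packets: List[Packet]) -> List[dict]:
    """Merge consecutive packets of the same case and activity into one event."""
    open_events: Dict[str, dict] = {}  # most recent event per case
    log: List[dict] = []
    for p in sorted(packets, key=lambda pkt: pkt.timestamp):
        current = open_events.get(p.case_id)
        if current is not None and current["activity"] == p.activity:
            # Same case, same activity: extend the event that is already open.
            current["end"] = p.timestamp
        else:
            # New activity for this case: start a new event.
            event = {
                "case_id": p.case_id,
                "activity": p.activity,
                "start": p.timestamp,
                "end": p.timestamp,
            }
            open_events[p.case_id] = event
            log.append(event)
    return log


# Illustrative data: packets of two cases interleaved in time.
sample = [
    Packet(0.0, "A", "c1"),
    Packet(1.0, "A", "c2"),
    Packet(2.0, "A", "c1"),
    Packet(3.0, "B", "c1"),
    Packet(4.0, "A", "c2"),
    Packet(5.0, "B", "c2"),
]
event_log = build_event_log(sample)
```

Here the six interleaved packets collapse into four events, two per case, ordered by start time.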