Background/Overview

In this assignment, you are required to demonstrate your understanding of process mining by using different tools and techniques to analyse execution data. In answering specific questions about the data, you will apply various process mining techniques in order to interpret event log(s) and draw meaningful evidence-based insights from the data.

Deliverable

Written Report

You are required to submit a written report that answers the assignment questions together with screenshots and explanations to illustrate the process mining results. Key findings/insights should be discussed in depth.

Your report should be a maximum of 15 pages of content for Parts A and B together with a maximum of 5 pages of content for Part C for each member.

Part A: Disco Analysis (15%)

Use the event log ‘BPI_Challege_2017.xes’, a real-life event log that is publicly available for process mining analysis. Unless otherwise specified, the complete, unfiltered, original log should be used to answer each of the three questions using the Disco software.

Analyse and interpret the following process models generated by Disco.
Compare and contrast two process models (maps) – one generated using the setting 100% activities and 100% paths, and the other generated using the setting 50% activities and 50% paths. Explain why we should not use a model with 0% paths to understand process behaviour.
Investigate the case variants detected from the log. Overall, how many case variants are present in the log? Report on the top five (5) most frequent case variants and their respective frequencies. How much of the log do these five (5) case variants cover? Investigate the variants with low frequencies. How many variants have fewer than 10 cases? Explain the implications of a high number of case variants when you try to generate a representative process model.
Compare the process behaviour (i.e., are the two process models quite similar or very different? Observe the activities, paths, rework loops, etc.) and the performance (i.e., throughput times, number of cases, bottlenecks) of the following two groups of cases. Describe your observations.
Group A: Cases for Business loans (i.e., attribute LoanGoal with value Business goal).
Group B: Cases for Home Improvements.
Sequentially apply the following filters to the original log.
Only keep cases whose behaviour is shared by at least 50 cases;
Filter out (discard) all cases that were not started and completed between 1 April 2016 and 31 Dec 2016;
Only keep the cases with “W_call after offers” being the last event.
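
As a rough illustration of what the three sequential filters do, the sketch below applies them in plain Python to a made-up five-case log. The case ids, the activity names other than "W_call after offers", the timestamps, and the helper function names are all assumptions for illustration, and the variant threshold is lowered from 50 to 2 so the toy log is not emptied. It is not how Disco implements its filters, only a way to reason about the expected result.

```python
from collections import Counter
from datetime import datetime


def ts(text):
    """Parse an ISO date string into a datetime (toy helper)."""
    return datetime.fromisoformat(text)


# Toy event log: case id -> time-ordered list of (activity, timestamp).
# All values are invented; only "W_call after offers" comes from the brief.
log = {
    "c1": [("Start", ts("2016-04-02")), ("W_call after offers", ts("2016-04-10"))],
    "c2": [("Start", ts("2016-05-01")), ("W_call after offers", ts("2016-05-03"))],
    "c3": [("Start", ts("2016-03-30")), ("W_call after offers", ts("2016-04-05"))],
    "c4": [("Start", ts("2016-06-01")), ("Other end", ts("2016-06-02"))],
    "c5": [("Start", ts("2016-12-30")), ("W_call after offers", ts("2017-01-02"))],
}


def keep_frequent_variants(log, min_cases):
    """Keep cases whose activity sequence is shared by at least min_cases cases."""
    variant_of = {cid: tuple(a for a, _ in events) for cid, events in log.items()}
    counts = Counter(variant_of.values())
    return {cid: ev for cid, ev in log.items() if counts[variant_of[cid]] >= min_cases}


def keep_contained(log, start, end_exclusive):
    """Keep cases that both start and complete inside [start, end_exclusive)."""
    return {
        cid: ev
        for cid, ev in log.items()
        if start <= ev[0][1] and ev[-1][1] < end_exclusive
    }


def keep_ending_with(log, activity):
    """Keep cases whose last event is the given activity."""
    return {cid: ev for cid, ev in log.items() if ev[-1][0] == activity}


# Apply the filters in the order given in the brief.
step1 = keep_frequent_variants(log, min_cases=2)  # the brief uses 50
step2 = keep_contained(step1, ts("2016-04-01"), ts("2017-01-01"))
filtered = keep_ending_with(step2, "W_call after offers")

print(sorted(filtered))
```

Because the filters are applied one after another, the variant threshold is evaluated on the original log, before the timeframe and endpoint filters shrink it. Applying the same filters in a different order can give a different set of cases.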

Show the overview screen with the statistics of the filtered log.

Using the filtered log, answer the questions:

How many cases are there in the log?
What is their mean duration?
How many variants are there in the log?
How significant is the problem of rework for this process?
Are there any bottlenecks detected? If so, which activities/paths are involved?
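
The first four questions above can be cross-checked outside Disco. The following is a minimal sketch on an invented three-case log; the activity names A, B, C, the dates, and the definition of rework as "an activity occurs more than once in the same case" are assumptions made for illustration, not definitions taken from Disco.

```python
from datetime import datetime, timedelta


def day(n):
    """Toy helper: the n-th day of April 2016."""
    return datetime(2016, 4, n)


# Invented log: case id -> time-ordered list of (activity, timestamp).
log = {
    "c1": [("A", day(1)), ("B", day(2)), ("C", day(3))],
    "c2": [("A", day(1)), ("B", day(2)), ("B", day(4)), ("C", day(5))],
    "c3": [("A", day(2)), ("B", day(3)), ("C", day(5))],
}

# Number of cases.
num_cases = len(log)

# Case duration = last event timestamp minus first event timestamp.
durations = [events[-1][1] - events[0][1] for events in log.values()]
mean_duration = sum(durations, timedelta()) / len(durations)

# A variant is a distinct sequence of activities.
variants = {tuple(activity for activity, _ in events) for events in log.values()}


def has_rework(events):
    """Assumed definition: some activity is repeated within the case."""
    activities = [activity for activity, _ in events]
    return len(activities) != len(set(activities))


rework_cases = [cid for cid, events in log.items() if has_rework(events)]
rework_share = len(rework_cases) / num_cases

print(num_cases, mean_duration, len(variants), rework_cases, rework_share)
```

Bottlenecks are not covered by this sketch; they depend on waiting times between specific activities, which are easier to inspect in the performance view of the process map.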

Part B: ProM (20%)

Use the event log ‘RequestForPayment.xes’ for Part B. This log is provided as part of BPI Challenge 2020 [https://doi.org/10.4121/uuid:52fb97d4-4588-43c9-9d04-3604d4613b51]. The data comes from the travel reimbursement process at TU/e. This log contains Requests for Payment (which should not be travel related): 6,886 cases, 36,796 events.

Please undertake an exploratory analysis of the log first. You can filter the original log if you wish when answering each of the four questions using the ProM framework (all required plug-ins are available in ProM Lite).

Discover the process models using the Alpha Miner and the Inductive Miner (Petri Net) algorithms (90% paths for the Inductive Miner). Compare the similarities and differences between the two Petri nets discovered.
Discuss the different visualisations (process tree, BPMN) available from the Inductive Miner algorithm (process tree). Write down a brief process description based on the discovered BPMN model.
Using the Petri net models discovered by the Inductive Miner algorithm (at 90% paths), replay the log (‘Replay a Log on Petri Net for Conformance Analysis’ plug-in). Explain how well the models describe the process behaviour seen in the log, including a discussion of the following trace fitness metrics.
Does the model completely fit the log?
If not, how many cases fit the models and how many do not?
Where are the problems for the non-fitting process cases?
Analyse the process using the Inductive visual Miner and identify potential bottlenecks and deviations. Note: Inductive visual Miner is also made available as part of QuickVisualiser: http://leemans.ch/leemansCH/quickvisualiser/
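
To build intuition for the trace fitness questions in the replay task above, the sketch below computes a fitness value per trace with token-based replay. This is a simpler technique than the alignment-based replay performed by the ProM plug-in, so its numbers will not match ProM's output. The three-transition net, the place names, and the traces are invented, and the sketch assumes every transition has a unique visible label (no silent transitions, which Inductive Miner models normally contain).

```python
from collections import Counter

# Invented sequential Petri net: label -> (input places, output places).
NET = {
    "A": ({"start"}, {"p1"}),
    "B": ({"p1"}, {"p2"}),
    "C": ({"p2"}, {"end"}),
}


def token_replay_fitness(trace, net, initial="start", final="end"):
    """Token-based replay fitness: 0.5*(1 - m/c) + 0.5*(1 - r/p)."""
    marking = Counter({initial: 1})
    produced = 1  # the initial token counts as produced
    consumed = 0
    missing = 0
    for activity in trace:
        inputs, outputs = net[activity]
        for place in inputs:
            if marking[place] == 0:
                missing += 1  # token was needed but absent
            else:
                marking[place] -= 1
            consumed += 1
        for place in outputs:
            marking[place] += 1
            produced += 1
    # At the end, the token in the final place is consumed.
    if marking[final] == 0:
        missing += 1
    else:
        marking[final] -= 1
    consumed += 1
    remaining = sum(marking.values())  # tokens left behind in the net
    return 0.5 * (1 - missing / consumed) + 0.5 * (1 - remaining / produced)


traces = [("A", "B", "C"), ("A", "C"), ("A", "B", "C")]
fitness = [token_replay_fitness(t, NET) for t in traces]
fitting = sum(1 for f in fitness if f == 1.0)

print(fitness, fitting, len(traces) - fitting)
```

In the non-fitting trace, activity B is skipped: C needs a token in p2 that is missing, and the token left in p1 remains. Locating such missing and remaining tokens is the token-replay counterpart of inspecting the log moves and model moves that the ProM plug-in reports.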

Part C: BPI Challenge 2011 Log Exploratory Analysis (15%) [Individual Task]

Use ‘Hospital_log.xes’ for Part C. This is a real-life log, taken from a Dutch Academic Hospital. This log contains 46560 events in 824 cases. Apart from some anonymisation, the log contains all data as it came from the Hospital’s systems. Each case is a patient of a Gynaecology department. The log contains information about when certain activities took place, which group performed the activity and so on. Many attributes have been recorded that are relevant to the process. Some attributes are repeated more than once for a patient, indicating that this patient went through different (maybe overlapping) phases, where a phase consists of the combination of Diagnosis & Treatment.

van Dongen, Boudewijn (2011): Real-life event logs – Hospital log. Version 1. 4TU.ResearchData. dataset. https://doi.org/10.4121/uuid:d9769f3d-0ab0-4fb8-803b-0d1120ffcf54

[Individual Task] Imagine that you are asked to present one data-informed process improvement recommendation to stakeholders. Your task is to extensively analyse this process (guided by the stakeholder questions) and provide one improvement recommendation based on the insights from this analysis. Please note that this is quite a complex process. You can filter the log as you see fit. You must explain your assumptions and filtering rules, and justify your recommendation using the process mining results (including screenshots).

You are encouraged to investigate multiple tools (e.g., Disco, Celonis, ProM Lite, Quick Visualiser, Apromore, etc.) to derive your insights.
