Institute of Computer Science
Business Process Mining 2023/24 spring

Homework 4: Predictive Process Monitoring

In this homework you will use the preprocessed BPIC12W event log (https://owncloud.ut.ee/owncloud/s/pMRrERiTnpE9srJ) to train and apply predictive process monitoring techniques that predict the outcome of a process. To do so, modify the frameworks reviewed in class as needed.

Tasks:

  1. (2 points) As part of log preprocessing, label each process trace as either deviant or regular. A case is deviant if its total duration exceeds the mean duration of all cases. Add a new column to the log holding a case attribute called 'label', with the value 1 for deviant cases and 0 for regular cases.
  2. (2 points) To improve the model's predictive performance, extract new contextual time features from the timestamps. These features can tell the model about seasonal influences on process behavior. For instance, from a start timestamp like "2024-04-18 13:00:00", we can extract the month of the year (e.g., 04), the day of the week (e.g., Thursday as 3, where Monday is 0 and Sunday is 6), and the relative time in seconds since midnight (e.g., 46800 for 1:00 PM, as midnight is 0). To implement this, add six new columns to the log, containing these three contextual features for both the start and the complete timestamps. Python's weekday() method, available on datetime objects, is recommended for the day of the week.
  3. (3 points) Train an XGBoost classifier using single bucketing and last-state encoding, with the 'label' column as the target and the contextual feature columns included in the training log.
  4. (3 points) Repeat Task 3, but this time exclude the contextual features. Compare the accuracy of the resulting model with that of the model from Task 3 and explain any differences. Discuss whether the contextual features affected accuracy, and why such an effect may or may not have occurred.
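
The preprocessing in Tasks 1 and 2 can be sketched as follows. This is only a minimal illustration using the Python standard library, not the course framework: the field names (case_id, start_time, complete_time), the helper functions, and the three toy events are all assumptions made for the example, and the real BPIC12W log may use different column names. Case duration is assumed here to be the latest complete timestamp minus the earliest start timestamp of the case.

```python
from datetime import datetime
from statistics import mean

# Hypothetical field names; the real log may name its columns differently.
CASE, START, COMPLETE = "case_id", "start_time", "complete_time"
FMT = "%Y-%m-%d %H:%M:%S"


def time_features(ts, prefix):
    """Month, weekday (Monday=0 ... Sunday=6) and seconds since midnight."""
    return {
        f"{prefix}_month": ts.month,
        f"{prefix}_weekday": ts.weekday(),
        f"{prefix}_seconds": ts.hour * 3600 + ts.minute * 60 + ts.second,
    }


def case_durations(events):
    """Duration per case in seconds (assumed: latest complete - earliest start)."""
    first, last = {}, {}
    for e in events:
        c = e[CASE]
        first[c] = min(first.get(c, e[START]), e[START])
        last[c] = max(last.get(c, e[COMPLETE]), e[COMPLETE])
    return {c: (last[c] - first[c]).total_seconds() for c in first}


def enrich(events):
    """Return copies of the events with 'label' and six time features added."""
    durations = case_durations(events)
    threshold = mean(durations.values())
    rows = []
    for e in events:
        row = dict(e)
        # 1 = deviant (duration strictly above the mean), 0 = regular
        row["label"] = int(durations[e[CASE]] > threshold)
        row.update(time_features(e[START], "start"))
        row.update(time_features(e[COMPLETE], "complete"))
        rows.append(row)
    return rows


def make_event(case, start, complete):
    """Build one toy event from timestamp strings."""
    return {
        CASE: case,
        START: datetime.strptime(start, FMT),
        COMPLETE: datetime.strptime(complete, FMT),
    }


# Toy data, invented for illustration only.
events = [
    make_event("A", "2024-04-18 13:00:00", "2024-04-18 13:30:00"),
    make_event("A", "2024-04-18 14:00:00", "2024-04-18 15:00:00"),
    make_event("B", "2024-04-19 09:00:00", "2024-04-22 09:00:00"),
]
rows = enrich(events)
print(rows[0]["label"], rows[0]["start_weekday"], rows[0]["start_seconds"])
```

With a pandas DataFrame the same idea would likely be expressed with a group-by on the case identifier and the datetime accessors of the timestamp columns. Note that the label is a case attribute, so every event of a case carries the same value.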

What do you need to submit? Submit a report in PDF format that includes a comprehensive explanation of the modifications you made to the approach, an evaluation of the changes, and a link to the repository containing those modifications. Submit the report through the 'Submit' link on the course website.
