
How to select your case ID? We can’t take them all… or can we?

When you want to use Process Mining to analyze your process, you need to determine the appropriate case ID for that process. But there are many potential case IDs available, so how do you select the one that suits your analysis best?

If we look at the Procure to Pay process in Oracle EBS, there are many potential case IDs. We see a number of different documents, such as:

  • Requisition
  • Purchase Order
  • Receipt
  • Invoice

Each of these documents has its own “number”. A requisition, for example, has a requisition number. This is the number you see in the application, and you can use it to query your requisition.

Use the document number or the internal ID?

These documents also have an internal ID, for example the requisition header ID. The advantage of the ID over the number is that the ID is unique across all organizations in your Oracle EBS instance, whereas the number is not. This makes the ID a better candidate for the case ID.

The disadvantage is that you cannot use the ID in the normal Oracle EBS forms to query your document. You therefore also need the number in your event log as a reference, so that you can look up the document in Oracle EBS.

An alternative is to concatenate the org ID and the document number to ensure uniqueness of the case ID. The advantage of this approach is that you already have a reference in the case ID to the number that you can query in Oracle EBS.
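As a minimal sketch of this concatenation, assuming hypothetical field names (`org_id`, `number`) and made-up sample values rather than actual Oracle EBS columns:

```python
def make_case_id(org_id: int, document_number: str) -> str:
    """Combine the org ID and the document number into one case ID.

    The separator keeps the two parts readable and avoids ambiguity
    (org 1 + number "23" would otherwise equal org 12 + number "3").
    """
    return f"{org_id}-{document_number}"


# Hypothetical sample: the same requisition number exists in two organizations.
requisitions = [
    {"org_id": 101, "number": "5001"},
    {"org_id": 102, "number": "5001"},
]

case_ids = [make_case_id(r["org_id"], r["number"]) for r in requisitions]
print(case_ids)
```

Both requisitions share the number 5001, but the resulting case IDs differ, and the part after the separator is still the number you can query in the application.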

What level to use?

The documents in Oracle EBS consist of multiple levels. The requisition has a header and one or more lines, and each line has one or more distributions. Each of these levels has its own ID, so you have a requisition header ID, one or more requisition line IDs and one or more requisition distribution IDs.

Which one to pick? Selecting the lowest level of a document will in general give the cleanest view of your process. But sometimes you want to check the process on a higher level. For example, the approval of a requisition is done on header level, not on line or distribution level. So if you are most interested in the approval of requisitions, the requisition header ID would be more appropriate. If you selected the distribution ID in this case and the requisition contains 10 lines with one distribution each, you would see the approval of the requisition 10 times, once for every single distribution.

If you are more interested in seeing how each requisition line is processed, from requisition to purchase order to receipt and finally to invoice, the line or distribution level would be more appropriate to avoid unnecessary loops and complexity in your process.

And if you want to do both, it is probably better to use two event logs: one with the requisition header ID as case ID and one with the requisition distribution ID as case ID.
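The effect of the chosen level on a header-level activity such as the approval can be illustrated with a small sketch. The data structure, the key names (`header_id`, `line_id`, `distribution_id`) and the sample IDs are assumptions for illustration, not the Oracle EBS data model:

```python
# Hypothetical requisition: one header, two lines, three distributions in total.
distributions = [
    {"header_id": 1, "line_id": 11, "distribution_id": 111},
    {"header_id": 1, "line_id": 11, "distribution_id": 112},
    {"header_id": 1, "line_id": 12, "distribution_id": 121},
]

# The approval happens once, on header level.
header_events = [{"header_id": 1, "activity": "Approve requisition"}]


def build_log(level_key: str) -> list[tuple[int, str]]:
    """Attach each header-level event to every case at the chosen level."""
    log = []
    for event in header_events:
        cases = sorted(
            {d[level_key] for d in distributions
             if d["header_id"] == event["header_id"]}
        )
        for case_id in cases:
            log.append((case_id, event["activity"]))
    return log


header_log = build_log("header_id")              # one event log per level
distribution_log = build_log("distribution_id")

print(len(header_log), len(distribution_log))
```

With the header ID as case ID the approval appears once. With the distribution ID as case ID the same approval is repeated for each of the three distributions, which is why two separate event logs can be the more practical choice.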

What document to select?

If you select a specific document as case ID, you will retrieve all activities related to this case ID. So when you select the requisition header ID, you will see everything related to this requisition header ID.

That also means that everything that is not related to a requisition will not appear in your event log and will be excluded from your analysis. For example, a purchase order can be entered in Oracle EBS without a requisition. So if you have the requisition as case ID, all purchase orders without a requisition will be excluded from your analysis.

The same applies to invoices. If your invoice is not matched to a purchase order, it will never be part of your analysis if you select the purchase order or the requisition as case ID.
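A small sketch makes the exclusion visible. The records, key names and IDs below are hypothetical sample data, chosen only to show which documents survive for each choice of case document:

```python
# Hypothetical sample data; None means "no link to the upstream document".
purchase_orders = [
    {"po_header_id": 9001, "requisition_header_id": 1},
    {"po_header_id": 9002, "requisition_header_id": None},  # no requisition
]
invoices = [
    {"invoice_id": 7001, "po_header_id": 9001},
    {"invoice_id": 7002, "po_header_id": 9002},
    {"invoice_id": 7003, "po_header_id": None},  # not matched to a PO
]

# Requisition as case: only purchase orders linked to a requisition remain.
pos_in_req_log = [
    po["po_header_id"] for po in purchase_orders
    if po["requisition_header_id"] is not None
]

# Purchase order as case: only invoices matched to a purchase order remain.
invoices_in_po_log = [
    inv["invoice_id"] for inv in invoices
    if inv["po_header_id"] is not None
]

# Requisition as case: the invoice must be matched to a purchase order
# that is itself linked to a requisition.
invoices_in_req_log = [
    inv["invoice_id"] for inv in invoices
    if inv["po_header_id"] in pos_in_req_log
]

# Invoice as case: every invoice is included.
invoices_in_invoice_log = [inv["invoice_id"] for inv in invoices]

print(pos_in_req_log, invoices_in_req_log, invoices_in_po_log)
```

In this sample, the requisition-based log loses purchase order 9002 and invoices 7002 and 7003, the purchase-order-based log loses only invoice 7003, and only the invoice-based log contains all three invoices.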

Which document is the correct document for your analysis? It depends… And in most cases you will end up using multiple event logs: one with the requisition as case, one with the purchase order as case and one with the invoice as case.


When selecting your case ID, you will in many situations come to the conclusion that there is no single best case ID for your analysis. But is this really a problem? Why not take them all… one for every specific analysis? The data is there, the tools are there, so why not use them in the best possible way?

Marcel Koolwijk
