Thursday, May 8, 2025

Generative AI and Process Mining - A world of possibilities - Part 2

In my previous article, we explored how a Copilot agent, integrated with Power Automate and Azure OpenAI's large language models, can efficiently process both structured and unstructured event logs from diverse storage systems. This integration enables the generation of tabular data in CSV format, facilitating the creation of Process Mining data models. However, the automation did not extend to generating the process model itself. In this article, we delve into a new yet closely related challenge, as detailed below.

Problem Statement

Now that I have my event logs captured in CSV format and stored in an Azure Data Lake Gen2 container, I am ready to connect them to Microsoft's Power Automate Process Mining to create a process model. (If you have not yet tried your hand at creating process models from CSV files stored in Azure Data Lake Gen2 accounts, please visit the Microsoft documentation here.)

In Part 2 of this series of articles, we shall explore ways to automate the creation of the process model in a Power Platform environment under Power Automate Studio.

Solution


Fig. 1: High Level Architecture Diagram

  1. The Orchestrator: The Power Automate flow (referred to as "The Orchestrator" in my previous article) is shown in Fig. 1 only as a reference to the caller of the Azure Function (see the "Azure Function" step below).
  2. Azure Function (Role - Data Modeler): In this part of the solution, the Azure Function plays one more critical role, that of Data Modeler, by executing Power Platform CLI commands and calling the Dataverse Web API to create four resources in Microsoft Dataverse, as follows (see the sketch after this list):
    1. A Power Platform Solution.
    2. A connector that will connect to the Azure Data Lake Gen2 Storage Account.
    3. A connection reference for the connector created in the previous step.
    4. Finally, a PM Inferred Task (or the Process Model).
  3. Power Automate Studio Process Mining: Finally, the model is ready for:
    1. Creating visuals in Microsoft Power BI dashboards, or 
    2. Linking to Microsoft Fabric Lakehouse using DirectLake (see "Useful Resources" section below), or 
    3. Power Automate's desktop client for Process Mining (a.k.a. Power Automate Process Mining). See "Useful Resources" section below to know more.
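For illustration, below is a minimal Python sketch of what the Data Modeler role could look like when creating two of these resources (the solution and the connection reference) through the Dataverse Web API. The environment URL, publisher GUID, and connector/connection identifiers are placeholders, and the PM Inferred Task creation is intentionally omitted, since its table name is not covered in this article.

```python
# Hedged sketch of the "Data Modeler" role: creating a Power Platform solution and a
# connection reference in Dataverse via the Web API. All names and IDs are placeholders.
import requests
from azure.identity import DefaultAzureCredential  # assumes the Function's identity has Dataverse access

DATAVERSE_URL = "https://yourorg.crm.dynamics.com"   # placeholder environment URL
API = f"{DATAVERSE_URL}/api/data/v9.2"


def _headers() -> dict:
    token = DefaultAzureCredential().get_token(f"{DATAVERSE_URL}/.default").token
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "OData-MaxVersion": "4.0",
        "OData-Version": "4.0",
    }


def create_solution(unique_name: str, friendly_name: str, publisher_id: str) -> str:
    """Create an unmanaged solution record; returns the URI of the created record."""
    payload = {
        "uniquename": unique_name,
        "friendlyname": friendly_name,
        "publisherid@odata.bind": f"/publishers({publisher_id})",
    }
    resp = requests.post(f"{API}/solutions", headers=_headers(), json=payload)
    resp.raise_for_status()
    return resp.headers["OData-EntityId"]  # Dataverse returns the new record's URI here


def create_connection_reference(logical_name: str, connector_id: str, connection_id: str) -> str:
    """Create a connection reference that points at the Data Lake Gen2 connection."""
    payload = {
        "connectionreferencelogicalname": logical_name,
        "connectionreferencedisplayname": logical_name,
        "connectorid": connector_id,     # e.g. "/providers/Microsoft.PowerApps/apis/shared_azureblob"
        "connectionid": connection_id,   # ID of the connection created for the storage account
    }
    resp = requests.post(f"{API}/connectionreferences", headers=_headers(), json=payload)
    resp.raise_for_status()
    return resp.headers["OData-EntityId"]
```

The same pattern (an authenticated POST to the relevant entity set) would apply to the process model itself once its table name is known, while the Power Platform CLI (pac) can be used for the steps that are easier to script than to call through the API.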

Business Outcomes

This makes the Copilot agent an apt assistant to a process mining engineer: an end-to-end, one-stop shop for generating process maps, variants, loops, root cause analysis, key metrics, and more from structured and unstructured event logs alike.

Conclusion

This short article is still a concept, something yet to be verified. Stay tuned for more updates once this architecture is tested. The high-level architecture will potentially be broken down in more detail in upcoming articles. Consider following me on LinkedIn so you do not miss the latest updates on this blog series.

Useful Resources

How to bring Azure Data Lake Gen2 Storage Container as data source for process mining data model? 

Link Process Model to Microsoft Fabric Lakehouse using DirectLake - 

Process Mining Tutorial from Microsoft -

Disclaimer: The ideas and concepts presented in this blog post are based on personal opinions and are yet to be proven true. They are intended for informational purposes only and should not be considered as professional advice.




Sunday, May 4, 2025

Generative AI and Process Mining - A world of possibilities - Part 1

Generative Artificial Intelligence (GenAI) has rapidly become a transformative force in the technology sector, prompting major cloud service providers to make substantial investments in their infrastructure and GenAI capabilities. Companies like Amazon, Microsoft, Google, and Alibaba are allocating billions of dollars to enhance their GenAI offerings, aiming to meet the growing demand for AI-driven solutions. These investments include expanding data center capacities, developing specialized AI hardware, and acquiring pioneering AI firms. This strategic focus underscores the critical role of GenAI in shaping the future of cloud computing and the broader technological landscape.

Process mining has emerged as a pivotal tool for organizations across various industries seeking to enhance operational efficiency by identifying and addressing process delays, anomalies, and duplications. By analyzing event logs from information systems, businesses can gain accurate insights into their actual workflows, uncovering inefficiencies and opportunities for improvement. This data-driven approach enables the strategic implementation of automation, streamlining processes and boosting productivity. For example, in manufacturing, process mining can optimize production lines by pinpointing bottlenecks and facilitating better scheduling. In the financial sector, it accelerates loan processing and enhances fraud detection. By leveraging process mining, organizations can transform their operations, achieving greater transparency and effectiveness.

Infusing GenAI in process mining offers a transformative approach for various stakeholders—including end clients, consultants, architects, developers, and technology enthusiasts—to optimize business processes and harness the full potential of both technologies. Let us see how. 

Problem Statement

The other day, during a client workshop, my team was demonstrating how Microsoft's Process Mining capabilities can help streamline process redundancies, fallacies, anomalies, loops, variants, delays, etc., thereby significantly enhancing the efficiency of the process lifecycle. Process mining software depends heavily on event logs documented in a structured manner. CSV is the universally accepted format for almost every process mining tool, regardless of technology. However, on most occasions, this event log data is quite large in volume.

One of the clients asked me a genuinely relevant question of immense significance, one I did not have a ready answer to. He said, "Almost every process mining tool has similar capabilities that can help streamline processes. The biggest challenge lies elsewhere. How do we accumulate the data and give it a structure in CSV format? This usually becomes a monumental task." I fumbled for a bit and responded with a submissive grin, "This is something I will have to circle back to you on. Upfront, I can think of GenAI as a potential way to address it, but I will do my research and come back."

Since then, I have had this thought swirling in my head, and I finally arrived at an answer that this article attempts to discuss.

Solution

The solution to the above problem statement is twofold, as illustrated below:
  • Prepare CSV from structured or unstructured event log data.
  • Prepare the Process Mining data model (discussed in the next article in this series).



Fig. 1 Architecture Diagram

The idea is to generate CSV output from captured event logs, which could be in any format and stored in any storage. The section below explains the diagram in Fig. 1 (above):
  1. Event Loggers: Event logs can be generated by a wide range of sources: RPA bots, applications, data entry operators working manually, etc.
  2. Event Log Storage: Event logs can be captured in any format (structured or unstructured, i.e., tabular or non-tabular) in any on-premises or cloud data storage.
  3. Application Service Layer: These are the main components of the solution.
    1. Copilot Studio Agent: An agent initiates an interactive conversation with the process mining engineer. Over the course of the chat thread, the agent captures key information such as the process name and the storage (a SharePoint list, Dataverse table, etc.) where the event logs are kept. Finally, the agent provides a hyperlink to the output CSV for the user to download and verify the data it generated.
    2. Power Automate (The Orchestrator): Flows in Power Automate constitute the backbone of this solution. They:
      1. Receive user input (see the "Copilot Studio Agent" step above).
      2. Read event log data from the data storage.
      3. Call an Azure Function for the first-time, bulk processing of event log data (see the "Azure Function" step below); event data is passed as JSON.
      4. Append newly created events to the output CSV file (see the "Azure Function" step below).
    3. Azure Function: The Azure Function has two roles to play, as follows:
      1. CSV Author: An HTTP-triggered, Flex Consumption Azure Function utilizes Azure OpenAI LLMs (as the NLP engine) to give structure to the unstructured or semi-structured event log data read from storage and convert it to CSV format. Finally, the output CSV is stored in a storage container (Azure Blob Storage, SharePoint, Dataverse, etc.). A minimal sketch of this function follows this list.
      2. Data Modeler (discussed in my next article)
  4. Security Layer and Responsible AI: The solution is conceived on Microsoft's cloud service stack, so all services and data are secured by Microsoft Entra ID. Also, Azure OpenAI and Copilot Studio comply with Microsoft's Responsible AI principles.
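To make the CSV Author role concrete, here is a minimal sketch of such an HTTP-triggered Azure Function, assuming the Python v2 programming model and an Azure OpenAI chat deployment. The deployment name, environment variables, and the four-column schema are illustrative assumptions rather than part of the original design.

```python
# Hedged sketch of the "CSV Author" Azure Function: the orchestrator flow posts raw
# event log text as JSON, and Azure OpenAI is asked to return it as CSV rows.
import os

import azure.functions as func
from openai import AzureOpenAI

app = func.FunctionApp()

client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
)

SYSTEM_PROMPT = (
    "You convert raw event log text into CSV with the columns "
    "CaseId,Activity,Timestamp,Resource. Output only CSV rows, no commentary."
)


@app.route(route="csv-author", methods=["POST"])
def csv_author(req: func.HttpRequest) -> func.HttpResponse:
    # The orchestrator passes the (possibly unstructured) event log data as JSON.
    raw_events = req.get_json().get("events", "")

    completion = client.chat.completions.create(
        model="gpt-4o",  # placeholder: the name of your Azure OpenAI deployment
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": str(raw_events)},
        ],
        temperature=0,
    )
    csv_text = completion.choices[0].message.content

    # In the full solution the CSV would be written to the chosen storage container
    # (Blob Storage, SharePoint, Dataverse, etc.); here it is simply returned to the flow.
    return func.HttpResponse(csv_text, mimetype="text/csv")
```

In practice, the column schema would be derived from the conversation with the Copilot Studio agent rather than hard-coded as it is in this sketch.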

Conclusion

The integration of Generative AI with process mining represents a significant leap forward in operational efficiency and data-driven decision-making. By automating the transformation of diverse event logs into structured formats, organizations can unlock deeper insights and streamline their workflows. This synergy not only addresses longstanding challenges in data preparation but also paves the way for more agile and responsive business processes.

Stay tuned for future updates where we'll delve deeper into real-world applications, share success stories, and explore advanced techniques in this evolving landscape. If you're passionate about the convergence of AI and process optimization, consider following me on LinkedIn. Your feedback and insights are invaluable—feel free to share your thoughts in the comments below.


What next?

The integration of GenAI and process mining is a wide area of discussion and will take more than a single article. This is the very first article of a series that will keep growing. You can visit the next article in this series here.


Saturday, August 21, 2021

Power Platform Licensing - A Jigsaw puzzle?




One of the most common impressions gathered so far of Power Platform as a rapid application development (RAD) framework is that it comes free with Microsoft 365 (a.k.a. M365, formerly Office 365) plans. At least the developers, architects, and business users (power users in the world of Power Platform) of M365 got that flavor, because Microsoft let us enjoy its easy availability with the M365 plans. We were able to connect with both Microsoft and non-Microsoft data sources, in cloud and on-premises environments alike, using a range of built-in connectors, HTTP connectors, and custom connectors.

But with the inception of the premium plans, initially P1 and P2, a green box-like icon started appearing beside many connectors. We then came to know from Microsoft's release notes that many connectors which used to come free with the M365 plans had become paid, meaning organizations had to buy additional licenses to use them.

Then, as the technology kept evolving, more restrictions were introduced and users and developers started getting "API limit" errors. This is because Microsoft also introduced API request limits: operations that use connectors, call APIs, initialize variables, call other flows, etc. all count against these limits. So even if you buy licenses, there is more to it. How, then, do you configure scheduled flows which run unattended and keep monitoring your sites and environments? Such jobs are among the most prevalent requirements these days, as we keep modernizing and automating.

Isn't that all a jigsaw puzzle? Well, it is, if your knowledge of Power Platform licensing does not cover every nook and cranny of it. I have seen clients walk out of signed contracts calling Power Platform a costly affair, either because of their ignorance of how it is priced or because of inadequate consulting by the vendors who proposed to modernize their workspace using Power Platform. In most cases it is the latter.

This is why you need a rock-solid understanding of how licensing works to provide the optimal solution around Power Platform and to make the fullest use of this RAD framework from Microsoft. Only then will you be able to convince your client that Power Platform indeed helps cut down budgets and is one of the best frameworks of its kind in the industry from every aspect.

This article aims to cover most of it with the help of some apt case studies that summarize and simplify Power Platform licensing for you.

Out Of Scope 

This article does not cover UI flows, Power BI, or Power Virtual Agents. Stay tuned for future articles covering these topics.

Prerequisites

First you need to take a look at the following pages from Microsoft documentation to understand the pricing slabs, restrictions, add-ons, etc.

Decision Making

First of all, there are a few questions you need to answer to be able to choose the right plan for your client:
  1. How many flows will be running in your organization, approximately?
  2. What type of flows are they? Are they automated or scheduled or instant?
  3. How many scheduled flows do you need to run, approximately? How often will they run?
  4. How many automated and instant cloud flows do you need? How many users will be using them?
  5. Which data sources do you want to connect with? Are they all native M365 services, or would you need premium connectors?
  6. Do they all have built-in connectors available in Power Platform? 
  7. What type of users will be accessing your environment? Are they internal or external to your organization or is it a combination of both?
  8. Do you need to build a portal?
  9. Do we need to worry about a large number of API calls?

Plan Selection Matrix

Sl No | Requirement | Plan Suggested
1 | Native data sources only? | M365 E3/E5 Plan
2 | PowerApps only - at least one premium connector used? | PowerApps Per User/Per App Plan
3 | Large no. of users and limited no. of apps? | PowerApps Per App Plan
4 | Large no. of apps, but limited no. of API calls? | PowerApps Per User Plan
5 | Estimated high no. of API requests/flow calls from apps? | App Passes under Capacity Add-ons
7 | Run scheduled flows only? No limit on API requests. | Power Automate Per User Plan assigned to a service account
9 | Run instant/automated flows with potentially high no. of API requests? | Power Automate Per User Plan
10 | Run limited no. of shared flows with no limit on users consuming them and no limit on API requests? | Power Automate Per Flow Plan

Case Studies

1. I have 5000 internal users who will potentially use apps to be developed such that data operations happen on M365 workloads only, like SharePoint Online, OneDrive for Business, Microsoft (MS) Teams, and Planner.

Solution
This case typically falls in the bracket of question # 5 (see "Decision Making" above). The answer maps to Sl. # 1 in the Plan Selection Matrix (see above).

"M365 workloads only" means M365 native data sources being used. We do not need any premium connector. 

The answer to the puzzle is -
  • 1 M365 enterprise plan like E3 or E5 for every user 

2. I have a hybrid deployment that uses M365 for workspace modernization. I have 5000 internal users who will potentially use apps to be developed such that data operations happen on M365 workloads like SharePoint Online, OneDrive for Business, Microsoft (MS) Teams, and Planner. In addition, there will be 2 jobs that will import data from an on-prem SQL Server instance to SharePoint Online, which will eventually be visualized in Power BI dashboards and charts. My on-prem SQL Server database size is ever-growing.

Solution
Key excerpts from the above use case are -
  • Data import happens in M365 from on-prem SQL Server. So we need to configure data gateway, which is only available with premium plans of PowerApps/Power Automate.
  • Data import to SharePoint Online is via jobs. This means it is a scheduled sync and NOT real-time. Therefore, 2 scheduled flows should suffice.
  • 5000 internal users will be using an app that reads data from native M365 workloads, so they can very well be licensed under the standard M365 enterprise plans (see Sl. # 1 in the Plan Selection Matrix).
  • The scheduled flows' calls to the SQL Server connector will never exceed the 5000-request mark stipulated for a PowerApps Per User Plan (see the "Prerequisites" section above).
  • Ever-increasing database size is also not a matter of concern as we are bringing in data to SharePoint Online, an integral component with M365 plans.
So the answer to the puzzle is -
  1. 5000 M365 E3/E5 plans
  2. 1 PowerApps Per User Plan for a service account for running the 2 flows

3. I need to develop an app that will use Microsoft Azure cognitive services like Conversational AI built on top of a curated knowledge base. The app needs to be deployed to MS Teams as a channel app for members of the channel to use.

Solution
This one is pretty straightforward. We shall use the Azure QnA Maker connector in a Power Automate flow and call the flow from a PowerApps canvas app. The app will then be added as a tab in an MS Teams channel.

This corresponds to question # 5 under "Decision Making" and Sl. # 2 under section "Plan Selection Matrix", where your QnA Maker connector is a premium connector. The app will run in user context. 

So the simple answer to the licensing puzzle is -
  • PowerApps Per User Plan for all the members of the channel.

4. I am an admin who wants to loop through all SharePoint Online sites in my tenant and generate a permission report.

Solution

This is a typical case where we will either need to develop a custom connector to enumerate all SharePoint Online sites (check my post https://microsoftcloudautomation.blogspot.com/2021/08/powerapps-custom-connector-to-enumerate.html) or post HTTP requests from a scheduled flow to generate the permission extract, and that too as a repetitive task. This has the potential to exceed the 5000 API requests per day limit if it is a daily extract and the number of sites in the tenant is large.

So this corresponds to Sl. # 7 under "Plan Selection Matrix" and questions # 3, 6 and 9 under "Decision Making" (see above).

The answer to the puzzle is -
  • A Power Automate Per User Plan assigned to a service account that will run the scheduled flow.
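To see why the request volume in this scenario adds up, here is a rough Python illustration (not the custom connector from the linked post) of enumerating every site via Microsoft Graph; the tenant and app credentials are placeholders. Each page of results is one request, and a per-site permission query on top of that, repeated daily, is what can push past the 5000 requests per day limit.

```python
# Rough illustration: enumerating all SharePoint Online sites via Microsoft Graph.
# Every page is a separate request; a follow-up permission query per site (not shown)
# multiplies the count, which is why a daily extract on a large tenant adds up quickly.
import requests
from azure.identity import ClientSecretCredential


def list_all_sites(tenant_id: str, client_id: str, client_secret: str) -> list[dict]:
    token = ClientSecretCredential(tenant_id, client_id, client_secret).get_token(
        "https://graph.microsoft.com/.default"
    ).token
    headers = {"Authorization": f"Bearer {token}"}

    sites: list[dict] = []
    url = "https://graph.microsoft.com/v1.0/sites?search=*"
    while url:                              # one HTTP request per page of sites
        resp = requests.get(url, headers=headers)
        resp.raise_for_status()
        body = resp.json()
        sites.extend(body.get("value", []))
        url = body.get("@odata.nextLink")   # keep paging until the tenant is exhausted
    return sites
```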

5. I am a site admin who, for security reasons, wants to be notified whenever documents matching some predefined criteria or pattern are uploaded to or modified in a document library. I have 3,000+ employees in my organization who frequently upload and change files in this library.

Solution
Here, the admin wants to monitor activities on files in a SharePoint library. As the library is open to all users in the organization and the user count is 3,000+, chances are high that there will be more than 5000 operations a day. Hence, if we create a real-time notification and extract the activity log, that will lead to 5000+ API calls. On the flip side, if we design a scheduled flow (which we can afford to do here, as it is a background process, and extracts downloaded three or four times a day generally suffice for monitoring purposes), we can restrict API calls to only 3-4 a day. That makes a whole lot of difference when pricing is taken into consideration.

Also, there is no built-in connector for activity logging and monitoring; we need to consume the Microsoft 365 Management Activity APIs. That requires us to make HTTP calls or build a custom connector. Whichever path you tread, you need a premium connector.
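For context, the scheduled pull described above would look roughly like this if written in Python against the Management Activity API (in the actual solution it would be HTTP actions inside the flow). The tenant ID, app credentials, and six-hour window are placeholder assumptions, and an audit subscription is assumed to have been started already.

```python
# Hedged sketch of one scheduled pull (run 3-4 times a day) against the Office 365
# Management Activity API. An Audit.SharePoint subscription must already be started.
from datetime import datetime, timedelta, timezone

import requests
from azure.identity import ClientSecretCredential

TENANT_ID = "<your-tenant-id>"          # placeholder
BASE = f"https://manage.office.com/api/v1.0/{TENANT_ID}/activity/feed"


def pull_sharepoint_audit(window_hours: int = 6) -> list[dict]:
    """List the audit content blobs for the window, then download each one."""
    token = ClientSecretCredential(
        TENANT_ID, "<app-client-id>", "<app-client-secret>"   # placeholders
    ).get_token("https://manage.office.com/.default").token
    headers = {"Authorization": f"Bearer {token}"}

    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=window_hours)
    listing = requests.get(
        f"{BASE}/subscriptions/content",
        params={
            "contentType": "Audit.SharePoint",
            "startTime": start.strftime("%Y-%m-%dT%H:%M"),
            "endTime": end.strftime("%Y-%m-%dT%H:%M"),
        },
        headers=headers,
    )
    listing.raise_for_status()

    records: list[dict] = []
    for blob in listing.json():
        records.extend(requests.get(blob["contentUri"], headers=headers).json())
    return records
```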

So this corresponds to our questions # 2, 5, and 9 under the "Decision Making" section. Then, under the "Plan Selection Matrix", Sl. # 7 also helps us come to a decision.

So the answer to the puzzle is -
  • 1 Power Automate Per User Plan assigned to a service account, or to the admin's account, will suffice.

6. I want to create a portal that will be accessed by 5,000+ external users who are authenticated by Azure AD. The portal will have embedded canvas apps that will also be externally facing.

This is a typical case of PowerApps portal being accessed by authenticated external users. 

The answer to the puzzle is -
  • The Portals login capacity add-on is the required plan here. Each unit gives 100 login sessions per month, so an estimated 51 units (5,100 logins) should suffice for this requirement, assuming roughly one session per user per month. (Visit the PowerApps pricing page referred to under the "Prerequisites" section.)

Conclusion

I really hope the above was helpful, as this indeed puzzles most of us who struggle to choose the right plan for our customers. I have tried to collate a few of the trending use cases above. Comments with more use cases are welcome; I shall try my best to answer them.

Stay tuned for more case studies! 😊
