Governance perspective: Managing an AI-driven organization
Managing, optimizing, and scaling the organizational AI initiative is at the core of the governance perspective. Incorporating AI governance into an organization’s AI strategy is instrumental in building trust, enabling the deployment of AI technologies at scale, and overcoming challenges to drive business transformation and growth. By driving consistency, AI governance enables alignment with organizational goals and ensures that AI technologies are used ethically and managed effectively. To that end, AI governance frameworks create consistent practices in the organization to address organizational risks, ethical deployment, data quality and usage, regulatory compliance, and the different cost patterns of AI workloads. Creating scalable processes and standards for AI deployments allows organizations to expand initiatives across business units and create long-term business value.
Building an AI governance practice requires close alignment with the organization's AI strategy. The first step is to identify all the key stakeholders and bring together a team with representation from multiple business units. This team will be responsible for:
- Defining governance goals, including compliance and ethical goals, and identifying areas of potential risk.
- Developing policies and guidelines that cover data, transparency, responsible AI, and compliance.
- Defining mechanisms to monitor AI systems for performance, compliance, and bias, and determining actions based on predefined thresholds.
- Continuously reviewing results and existing policies to ensure alignment with business goals and AI safety.
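The monitoring responsibility above (determining actions based on predefined thresholds) can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the `Threshold` class, the metric names, the limits, and the action strings are all hypothetical and would be defined by your governance team.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """A governance rule: which metric to watch and what to do on a breach."""
    metric: str
    limit: float
    higher_is_better: bool
    action: str


def evaluate(metrics: dict[str, float], thresholds: list[Threshold]) -> list[str]:
    """Return the actions triggered by the current metric values."""
    actions = []
    for t in thresholds:
        value = metrics.get(t.metric)
        if value is None:
            # A metric that is not reported is itself a governance finding.
            actions.append(f"investigate missing metric: {t.metric}")
            continue
        breached = value < t.limit if t.higher_is_better else value > t.limit
        if breached:
            actions.append(t.action)
    return actions


# Hypothetical thresholds; real values come from the governance team.
THRESHOLDS = [
    Threshold("accuracy", 0.90, True, "open retraining review"),
    Threshold("drift_score", 0.20, False, "alert model owner"),
    Threshold("approval_rate_gap", 0.05, False, "escalate to governance board"),
]

current = {"accuracy": 0.93, "drift_score": 0.27, "approval_rate_gap": 0.02}
triggered = evaluate(current, THRESHOLDS)
print(triggered)  # prints ['alert model owner']
```

Keeping the thresholds as data rather than code makes them easy to review and revise alongside the policies they enforce.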
In this perspective, we describe some solutions to governance challenges and introduce a new capability: The Responsible use of AI, a decisive element for future competitive advantage in the AI space.
Foundational Capability | Explanation |
---|---|
Cloud Financial Management (CFM) | Plan, measure, and optimize the cost of AI in the cloud. |
Data Curation | Create value from data catalogs and products. |
Risk Management | Leverage the cloud to mitigate and manage the risks inherent to AI. |
Responsible use of AI | Foster continual AI innovation through responsible use. |
Program and Project Management | This capability is not enriched for AI; refer to the AWS CAF. |
Data Governance | This capability is not enriched for AI; refer to the AWS CAF. |
Benefits Management | This capability is not enriched for AI; refer to the AWS CAF. |
Application Portfolio Management | This capability is not enriched for AI; refer to the AWS CAF. |
Cloud Financial Management (CFM)
Plan, measure, and optimize the cost of AI in the cloud.
Managing AI projects in the cloud involves planning for the cost structure of training and inference. Consider this in advance when budgeting for individual projects as well as for the overall funding of AI initiatives. One example of such a cost structure over the AI lifecycle is zig-zag costs, that is, alternating phases of low and high cost:
- You might start off with a high initial cost to establish or increase the quality of the data that is needed to build your solution. However, if the data is ready, this initial cost may be very low. This is followed by a potentially volatile proof-of-concept phase.
- While most AI proof-of-concept (POC) initiatives may be relatively low-cost compute-wise, a few technical approaches can quickly become costly, such as training larger models (in the context of generative AI) or constantly retraining domain-specific ML models. In such cases, you can leverage purpose-built AI hardware like Amazon Elastic Compute Cloud (Amazon EC2) Trn1 instances powered by AWS Trainium or Amazon EC2 Inf2 instances powered by AWS Inferentia2 to help keep costs low. If you have access to the right talent, AI services, and AWS Partners, leverage their expertise to estimate the resources needed for different phases of your use cases and overall AI strategy. If feasible, calculate what an incremental improvement of an ML metric is worth to decide how to optimize your investment.
- After the first iteration of the system is built, the next phase of building a minimum viable product (MVP) may have a relatively high cost; for example, to generalize the system’s capability or acquire edge-case and long-tail data that is crucial for user adoption. If you are working on a use case that requires generative AI capabilities, you can evaluate using or fine-tuning foundation models, which can significantly reduce cost because the initial training costs have already been absorbed by your supplier or vendor (for example, Amazon Bedrock Titan Foundation Model).
- After AI models are deployed, the inference cost depends largely on the volume of requests, and in many cases it is relatively low. If not, you can leverage the purpose-built AWS Inferentia architecture. At this stage, monitoring model metrics and flagging drift alerts you to changes and the potential need to retrain your algorithms. You can leverage the low costs of scaling in the cloud. Throughout the AI lifecycle, it is important to track costs and tag all resources and ML workloads.
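To illustrate why tagging matters for cost visibility, the following sketch groups cost records by lifecycle phase for one project and reports spend that cannot be attributed. It is a simplified example under stated assumptions: the record layout, the tag keys `project` and `phase`, and all amounts are hypothetical, standing in for whatever your billing export provides.

```python
from collections import defaultdict

# Hypothetical cost records, for example exported from a billing report.
# The tag keys "project" and "phase" are assumptions made for this sketch.
RECORDS = [
    {"cost_usd": 1200.0, "tags": {"project": "churn", "phase": "data-preparation"}},
    {"cost_usd": 300.0, "tags": {"project": "churn", "phase": "poc"}},
    {"cost_usd": 2500.0, "tags": {"project": "churn", "phase": "mvp"}},
    {"cost_usd": 400.0, "tags": {"project": "churn", "phase": "inference"}},
    {"cost_usd": 150.0, "tags": {}},  # untagged resource
]


def cost_by_phase(records, project):
    """Sum the cost of one project per lifecycle phase."""
    totals = defaultdict(float)
    for record in records:
        tags = record["tags"]
        if tags.get("project") != project:
            continue
        totals[tags.get("phase", "untagged")] += record["cost_usd"]
    return dict(totals)


def untagged_cost(records):
    """Sum the cost that cannot be attributed to a project and a phase."""
    return sum(
        record["cost_usd"]
        for record in records
        if "project" not in record["tags"] or "phase" not in record["tags"]
    )


phase_costs = cost_by_phase(RECORDS, "churn")
print(phase_costs)
print(untagged_cost(RECORDS))  # prints 150.0
```

With these sample numbers, the per-phase totals show the high/low/high/low pattern described above, and the untagged amount is the share of spend that no budget owner can currently explain.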
Once you have cost-visibility measures in place, it is critical to analyze the data, training, and inference costs over time. Many problem types (text, forecasting, document processing) cost little in their infancy, but their costs grow linearly with data size. Other AI problems that rely on audio and voice data have a much higher start-up cost and need well-defined goals even in the POC phase to avoid unexpected charges. Aligning your AI vision with the business goals should inform how you scope the work, and establishing mechanisms to calculate the tradeoffs between model costs and model performance is critical for maintaining a positive ROI.
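One way to make the tradeoff between model cost and model performance explicit is to put a monetary value on each point of metric improvement and compare it with the running cost. The sketch below does this for three candidate models. Everything in it is an assumption for illustration: the candidates, the baseline accuracy of 0.80, and the value of 1,500 USD per accuracy point are invented numbers, and a linear value per point is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    accuracy: float          # evaluation metric in [0, 1]
    monthly_cost_usd: float  # training and inference, amortized per month


def net_monthly_value(candidate, baseline_accuracy, usd_per_point):
    """Value of the accuracy gain over the baseline minus the running cost."""
    gain_points = (candidate.accuracy - baseline_accuracy) * 100
    return gain_points * usd_per_point - candidate.monthly_cost_usd


# Hypothetical candidates and business assumptions.
CANDIDATES = [
    Candidate("small", 0.88, 1_000.0),
    Candidate("medium", 0.91, 4_000.0),
    Candidate("large", 0.92, 12_000.0),
]
BASELINE_ACCURACY = 0.80
USD_PER_POINT = 1_500.0

best = max(
    CANDIDATES,
    key=lambda c: net_monthly_value(c, BASELINE_ACCURACY, USD_PER_POINT),
)
print(best.name)  # prints medium
```

In this example the largest model has the best accuracy but not the best return: its extra point of accuracy is worth less than the additional cost of running it.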
Additionally, the cost of data acquisition is strongly influenced by the mechanisms that organizations establish around their data process. A standard process for acquiring new data and master data is key to keeping costs down, as is keeping data in formats that can be used for AI directly (with reduced copy/read/copy or ETL needs). The cloud helps with all of these challenges through governed data services and zero-ETL patterns.
Beyond this, always connect your AI initiative to an underlying business goal. If it relates to a new revenue stream, estimate how much revenue is likely to be associated with which success criteria, and translate business value into your AI metrics. Factor in the often-underestimated cost of not recognizing the need for the responsible use of AI. Due to its importance, we have added the Responsible use of AI as a capability in this perspective.
Data curation
Create value from data catalogs and products.
Your ability to acquire, label, clean, process, and interact with data will increase your speed, decrease time-to-value, and boost your model’s performance (such as accuracy). When model accuracy stalls, consider going back and enriching, growing, or improving the data you are feeding the algorithm. Doing so is often much easier than rearchitecting or squeezing out that next percent of performance with modeling alone.
Collecting data
Data quality assessments
Easy-to-use, human-readable data repositories, catalogs, and dictionaries can provide a centralized and organized repository of data and metadata about the organization’s data assets. This empowers teams of all skill levels to discover, understand, and collaborate on data, and to start using your data to create business value. It also considerably speeds up decisions about the additional investment needed for other use cases. There are many ways to go about increasing your data’s potential, such as buying external data sources.
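A data catalog is most useful when each entry carries basic quality information. As a rough sketch of what a data quality assessment could compute before a dataset is published for AI use, the example below measures completeness per required column and counts exact duplicate rows. The function name, the column names, and the sample rows are hypothetical, and real assessments typically cover more dimensions (such as freshness and validity).

```python
def assess_quality(rows, required_columns):
    """Report row count, per-column completeness, and exact duplicate rows."""
    total = len(rows)
    completeness = {}
    for column in required_columns:
        # None and the empty string count as missing; 0 and False do not.
        filled = sum(1 for row in rows if row.get(column) not in (None, ""))
        completeness[column] = filled / total if total else 0.0

    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(sorted(row.items()))  # values must be hashable
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)

    return {"rows": total, "completeness": completeness, "duplicate_rows": duplicates}


# Hypothetical sample of a customer dataset.
ROWS = [
    {"customer_id": 1, "country": "DE", "label": "churn"},
    {"customer_id": 2, "country": "", "label": "stay"},
    {"customer_id": 3, "country": "FR", "label": None},
    {"customer_id": 1, "country": "DE", "label": "churn"},
]

report = assess_quality(ROWS, ["customer_id", "country", "label"])
print(report)
```

A report like this can be stored next to the catalog entry so that teams can judge at a glance whether a dataset is ready for a new use case.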
Risk management
Use the cloud to mitigate and manage the risks inherent to AI.
Every new technology comes with a new set of risks. Managing the risks of AI, both in the design and development of AI systems and in their deployment, long-term operation, and application, is challenging because AI models are non-deterministic. Some risks are financial. Start by factoring the risk of sunk cost into the development process, because the outcome of an AI development initiative is hard to guarantee upfront (a consequence of optimizing a system toward an output rather than explicitly building it to produce that output). Establish solid practices, such as model cards and adversarial inputs, and mechanisms such as POCs, minimum loveable products (MLPs), and MVPs, to mitigate and control risks.
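As an illustration of the model card practice mentioned above, a model card can be kept as a small structured record that is checked for completeness before a model is released. The sketch below is one possible shape, not a standard schema: the field names and the example values are assumptions chosen for this example.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ModelCard:
    """A minimal model card; the field names are illustrative, not a standard."""
    name: str
    version: str
    intended_use: str
    out_of_scope_uses: list[str] = field(default_factory=list)
    training_data: str = ""
    evaluation_metrics: dict[str, float] = field(default_factory=dict)
    known_limitations: list[str] = field(default_factory=list)
    owner: str = ""

    def missing_sections(self) -> list[str]:
        """Return the names of sections that are still empty."""
        return [key for key, value in asdict(self).items() if not value]


# Hypothetical card for a model that is still in the POC phase.
card = ModelCard(
    name="churn-classifier",
    version="0.1",
    intended_use="Rank existing customers by churn risk for retention offers.",
)

print(card.missing_sections())
print(json.dumps(asdict(card), indent=2))
```

A release process could refuse to promote a model while `missing_sections()` is non-empty, which turns documentation into a lightweight risk control.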
Other risks are legal and ethical in nature. This includes risks as classified by your local legislature, for example, in the European Union.
Developing and adopting safeguards and architectures that constrain the system when necessary, not just in safety-critical environments, is a priority. Make sure that subsystem failures don’t propagate and compound.
Responsible use of AI
Foster continuous AI innovation through responsible AI practices.
Until recently, the responsible use of
this powerful new technology
Establish an AI governance board with representation from multiple business units (like research, human resources, diversity and inclusion, legal, government and regulatory affairs, procurement, and communications) to work closely with, or as part of, AI leadership teams to ensure AI solutions are safe and cause no harm to employees, customers, and society at large. This board should be responsible for overseeing and guiding the ethical and responsible development, deployment, and use of AI technologies, and for driving alignment with industry regulations and compliance with AI-focused legislation. Scale how Responsible AI impacts your design, development, and operations over time.
Embed explainability by design into your AI lifecycle where possible, and establish practices to recognize and discover both intended and unintended biases. Consider using the right tools to help you monitor the status quo and inform risk. Use best practices.
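One simple, widely used check for unintended bias is the demographic parity difference: the gap between the highest and lowest rate of positive predictions across groups. The sketch below computes it with the standard library only. It is a starting point rather than a complete fairness assessment; the predictions and group labels are invented, and which metric and which tolerance are appropriate depends on the use case.

```python
from collections import defaultdict


def selection_rates(predictions, groups):
    """Return the share of positive predictions (value 1) for each group."""
    if len(predictions) != len(groups):
        raise ValueError("predictions and groups must have the same length")
    positives = defaultdict(int)
    counts = defaultdict(int)
    for prediction, group in zip(predictions, groups):
        counts[group] += 1
        if prediction == 1:
            positives[group] += 1
    return {group: positives[group] / counts[group] for group in counts}


def demographic_parity_difference(predictions, groups):
    """Return the gap between the highest and lowest group selection rate."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical binary predictions for two groups, "a" and "b".
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

print(selection_rates(preds, groups))                # prints {'a': 0.75, 'b': 0.25}
print(demographic_parity_difference(preds, groups))  # prints 0.5
```

A value of 0 means every group receives positive predictions at the same rate; tracking this number over time is one way to monitor the status quo and surface a need for deeper review.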
Note
The AWS Responsible Use of AI team has written a whitepaper