EqualAI Algorithmic Impact Assessment (AIA)

Assess your AI

Artificial Intelligence is an increasingly important and pervasive part of our daily lives and critical functions. As our reliance on this technology grows, so does the need to ensure it is trustworthy, meaning that the AI system is safe, inclusive, and thoroughly tested for unconscious biases and other potential harms.

There is increasing alignment on best practices for AI development and deployment, which are important elements of responsible AI governance. A key resource for understanding responsible AI governance is the congressionally mandated AI Risk Management Framework (AI RMF), released on January 26, 2023, by the U.S. Department of Commerce’s National Institute of Standards and Technology (NIST), together with its companion guide, the draft AI RMF Playbook (Playbook), which helps users navigate and apply the AI RMF.

Algorithmic Impact Assessments (AIAs) have gained increasing acceptance as a useful approach for organizations to identify potential risks and proactively avoid harms and liability stemming from their AI systems. The EqualAI Algorithmic Impact Assessment (“EqualAI AIA”) Tool was created to give organizations a user-friendly way to operationalize the best practices outlined in these recent NIST publications.


Example EqualAI AIA

Our example EqualAI AIA is for demonstration purposes only and does not collect information. To adapt the example assessment for your use, either download our templates (PDF or Jekyll) or contact us to schedule an AIA consultation.


Description

Provide an initial description of the system, a project overview, and the points of contact who will be responsible for the audit or AI system. Note: You may want to include a unique ID, tags, or other project identifiers to facilitate identification and record keeping downstream.
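As a loose illustration of that note, the project identifiers and contacts could be captured in a small structured record; the field names and sample values below are assumptions for demonstration only, not fields prescribed by the EqualAI AIA.

```python
# Hedged sketch of a system/project record for the Description step.
# Field names and values are illustrative assumptions, not required fields.

from dataclasses import dataclass, field

@dataclass
class AIASystemRecord:
    system_id: str                   # unique ID used downstream for record keeping
    name: str
    overview: str
    points_of_contact: list[str]     # owners responsible for the audit / AI system
    tags: list[str] = field(default_factory=list)

record = AIASystemRecord(
    system_id="aia-2024-0001",
    name="Resume screening assistant",
    overview="Ranks incoming applications for recruiter review.",
    points_of_contact=["ml-lead@example.com", "compliance@example.com"],
    tags=["hiring", "nlp", "high-impact"],
)
print(record.system_id, record.tags)
```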

Context

Section 1.1. Intended purpose(s) and limitations

It is important to provide context about the specific, envisioned use cases for which the AI system is designed to be deployed. In the absence of advance knowledge about all potential settings in which a system will be deployed, examining the bounds of acceptable deployment is instructive for identifying unanticipated or untested downstream uses.

Section 1.2. Interdisciplinary Collaborations

A team of AI actors with a diversity of experience, expertise, abilities, and backgrounds, and with the resources and independence to engage in critical inquiry, is essential to building a robust and comprehensive mapping of the system.

  1. What interdisciplinary expert teams are you engaging to identify and manage risks in all stages of the AI life cycle? Have you included individuals with expertise in:

Section 1.3. The business value or context of business use

There are instances in which an AI solution is not the appropriate choice. The most significant examples are situations where the AI solution would cause more harm than good. Another situation where the decision to terminate development could be appropriate is when an AI system does not present a business benefit beyond the status quo. Inherent risks and implicit or explicit costs should be weighed when evaluating whether an AI solution should be developed or deployed. Defining and documenting the specific business purpose of an AI system in the broader context of societal values helps teams evaluate risks and increases the clarity of “go/no-go” deployment decisions.

  1. Have you reviewed the system's documented purpose from a socio-technical perspective, i.e., considering characteristics such as explainability, interpretability, privacy, safety, and bias management, and in light of societal values?

Section 1.4. Organization's mission and goals

By establishing a comprehensive and explicit enumeration of the AI system's purpose and expectations, organizations can identify and manage the socio-technical risks and benefits that could be supported or jeopardized by the AI system.

Section 1.5. Organizational risk tolerances

Risk tolerance reflects the level and type of risk the organization will accept while conducting its mission and functions. Deployment decisions should be the outcome of a clearly defined process that is reflective of an organization’s values, including its risk tolerances. Go/no-go decisions should be incorporated throughout the AI system’s lifecycle. For systems deemed “higher risk,” such decisions should include approval from sufficiently senior technical or otherwise specialized or empowered executives (for more information on risk tolerance, see NIST AI RMF document Section 3.2.2. Risk Tolerance).

Classification

Section 2.1. Learning Task

AI actors should define the technical learning or decision-making task that an AI system is designed to accomplish.

  1. Which category of learning tasks does the AI system support?

2.2. Operational Context

Once deployed and in use, AI systems can at times perform poorly, manifest unanticipated negative impacts, or violate legal or ethical norms. Human oversight and stakeholder engagement can provide important contextual awareness.

  1. Does the AI system have connections to external networks (including the internet), financial markets, or critical infrastructure that have the potential for negative externalities?

2.3. Data Collection and Selection

Many AI system risks and vulnerabilities can be traced to insufficient testing and evaluation processes, as well as to inadequate oversight of data collection and curation.

  1. [After extended use(s)] Is the training and testing data still representative of the current operational environment(s)?
  2. If the dataset relates to, or derives from, individuals (e.g., their attributes), were they informed about the data collection?
  3. Is the training and testing data representative of the demographic population with which the AI system will be used (e.g., age, gender, race, etc.)? (A minimal sketch of such a representativeness check appears after this list.)
  4. Do the data collection processes adhere to organization policies related to bias, privacy and security for AI systems?
  5. Do the data collection processes comply with relevant legal or regulatory requirements applicable to data or AI systems? (See Section 6 on Legal)
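To illustrate question 3 above, the following is a minimal sketch of a representativeness check that compares the demographic make-up of a training set against the population the system is expected to serve. The group labels, counts, and 10-point tolerance are illustrative assumptions, not thresholds prescribed by the EqualAI AIA or NIST.

```python
# Hedged sketch: flag demographic groups whose share of the training data
# differs materially from their expected share of the operational population.
# Group labels, counts, and the tolerance are illustrative assumptions.

from collections import Counter

def representation_gaps(sample_labels, population_shares, tolerance=0.10):
    """Return groups whose observed share deviates from the expected
    population share by more than `tolerance` (absolute difference)."""
    counts = Counter(sample_labels)
    total = sum(counts.values())
    gaps = {}
    for group, expected in population_shares.items():
        observed = counts.get(group, 0) / total
        if abs(observed - expected) > tolerance:
            gaps[group] = {"observed": round(observed, 3), "expected": expected}
    return gaps

# Toy example: age bands in the training data vs. the intended user population.
training_ages = ["18-29"] * 700 + ["30-49"] * 250 + ["50+"] * 50
expected_shares = {"18-29": 0.35, "30-49": 0.40, "50+": 0.25}

# Prints every band whose observed share is more than 10 points off.
print(representation_gaps(training_ages, expected_shares))
```

A check like this is only a starting point; question 1 above still applies after extended use, when the operational population may have drifted.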

Bias

Trustworthy and responsible AI is AI that has been evaluated to determine whether a given system is biased, whether it is fair, and whether it does what it claims to accomplish. Processes to ensure trustworthy and responsible AI often focus on computational factors such as representativeness of datasets and fairness of machine learning algorithms, which are indeed vital for mitigating bias. However, a robust process must also consider the human, systemic, institutional, and societal factors that can be significant sources of AI bias. An effective governance system requires expanding the perspective beyond the machine learning pipeline to recognize and investigate how this technology is both created within and impacts our society.

Understanding AI as a socio-technical system acknowledges that the processes used to develop technology are more than their mathematical and computational constructs. A socio-technical approach to AI takes into account the values and behavior embedded in the datasets, the humans who interact with the AI systems, and numerous other factors that go into the design, development, and deployment of that system.

The importance of an evaluation process that considers transparency, datasets, and test, evaluation, validation, and verification (TEVV) cannot be overstated. Participatory design techniques, multi-stakeholder approaches, and human-in-the-loop processes are also important steps to help mitigate risks related to AI bias. However, none of these practices is a panacea against the introduction or scaling of bias in AI systems, and each can introduce potential pitfalls. It is critical to recognize that it is not possible to achieve zero risk of bias in an AI system. Rather, we create and adopt AI governance to enable better identification, understanding, management, and reduction of bias, as well as other potential harms.

This section incorporates elements of the NIST SP 1270 taxonomy of bias in AI to help organizations probe for common types of bias that are likely to occur in an AI system's outcomes. Fairness evaluation and remediation is a fast-evolving, interdisciplinary field of research that requires a variety of perspectives from different fields. Engaging a social scientist, an AI fairness specialist, or a similar individual with expertise in this area can help facilitate responses to the questions in this section.

The NIST Special Publication on Bias groups AI biases into three categories: systemic, statistical/computational, and human-cognitive.

3.1. Pre-deployment check

  1. Are the users of the AI system properly trained to interpret AI model output and decisions? Are the staff developing the AI system trained to detect and manage bias in data?
  2. Statistical and computational biases stem from errors that result when a sample is not representative of the population. Consider whether your AI system should be tested for one or more of the following statistical/computational biases (one such check is sketched after this list):
  3. Systemic biases affect how organizations and teams are structured and who controls the decision-making processes, which can result in certain social groups being advantaged or favored over others. Which types of systemic biases should your system be tested for?
  4. Human cognitive biases reflect systematic errors in human thought that arise when people rely on a limited number of heuristic principles to reduce complex judgments to simpler operations. These biases are often implicit and tend to relate to how an individual or group perceives information (such as automated AI output) to make a decision or fill in missing or unknown information. Consider which types of human bias your system should be tested for, including:
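As one example of the statistical/computational checks referenced in item 2, the sketch below computes a disparate-impact-style ratio of favorable outcome rates between two groups. The group labels, toy predictions, and the commonly cited ~0.8 "four-fifths" threshold are illustrative assumptions, not requirements of the EqualAI AIA or NIST SP 1270.

```python
# Hedged sketch: a disparate-impact-style ratio of favorable outcome rates
# between a protected group and a reference group. Labels, predictions, and
# the ~0.8 rule-of-thumb threshold are illustrative assumptions only.

def selection_rate(predictions, groups, group):
    """Share of favorable (1) outcomes the model assigns to one group."""
    outcomes = [p for p, g in zip(predictions, groups) if g == group]
    return sum(outcomes) / len(outcomes)

def disparate_impact_ratio(predictions, groups, protected, reference):
    """Ratio of selection rates; values well below ~0.8 often prompt review."""
    return (selection_rate(predictions, groups, protected)
            / selection_rate(predictions, groups, reference))

# Toy example: 1 = favorable model decision, 0 = unfavorable.
preds  = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
groups = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]

ratio = disparate_impact_ratio(preds, groups, protected="B", reference="A")
print(f"Disparate impact ratio (B vs. A): {ratio:.2f}")  # 0.67 here
```

A single metric like this does not establish or rule out bias; it is one signal that, alongside the systemic and human-cognitive checks above, can prompt deeper review.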

3.2. Post-deployment check: Are additional or newly established procedures necessary to continue mitigating bias or inequity?

  1. Have rechecks continued within stated time frames?

Cost/Benefits Evaluations

4.1. System Benefits

AI systems should be checked periodically to ensure benefits outweigh inherent risks as well as implicit and explicit related costs. To identify system benefits, organizations should define and document system purpose and utility, along with foreseeable costs, risks, and negative impacts.

  1. Have the benefits of the AI system been communicated to users? How?
  2. Have any employees or users raised concerns about the system's safety or inclusivity, or other potentially negative impacts arising from the AI system?

4.2. Potential Costs

Negative impacts can be due to many factors, such as poor system performance, and may range from minor annoyance to serious injury, financial losses, or regulatory enforcement actions.

  1. Can users or parties affected by the outputs of the AI system test the AI system and provide feedback?

4.3. Application Scope

Systems that function in a narrow scope tend to enable better mapping, measurement, and management of risks in the learning or decision-making tasks and the system context. Areas that help narrow contexts for system deployment include:

  1. Are there areas where the narrowing of application scope could help reduce risks exhibited by the AI system?
  2. Have you consulted with experts (e.g., legal and procurement experts) to identify whether the application scope needs to be further narrowed or refined?

Third-party Technologies

5.1. Third-party risks

Technologies, such as pre-trained models, and personnel from third parties are another source of risk to consider during AI risk management activities.

  1. Did you acquire datasets from a third party for this AI system?
  2. Did you assess and manage the risks of using third-party datasets consistent with the process above?
  3. Did you acquire third-party material (open-source software, pre-trained models, open-source datasets, etc.) for developing the AI system?

5.2. Controls for third-party risks

AI actors often utilize open-source software, freely available datasets, or third-party technologies—some of which have been reported to have privacy, bias, and security risks.

  1. Have you applied controls—such as procurement, security, and data privacy controls—to all acquired third-party technologies, for example, when procuring a third-party AI model?
  2. Have you reviewed any audit reports, testing results, product roadmaps, warranties, terms of service, end-user license agreements, contracts, model/system cards, or other documentation available for third-party resources used in your AI system? For example, if you are using a model such as DALL-E 2, have you reviewed its system card?

Final Decision

7.1. Identifying Impacts

Risk assessment enables organizations to create a baseline for system monitoring and to increase opportunities for detecting emergent risks.

  1. Are there systems or mechanisms in place to ensure continuous monitoring for impacts and emergent risks?
  2. If the AI system relates to subjects protected by international standards or bodies, have appropriate obligations been met (e.g., medical data might include information collected from animals)?

7.2. Likelihood and Magnitude of Impact

If an organization decides to proceed with deploying the system, the ‘likelihood estimate’ can be used to triage the system's impacts and to assign oversight resources appropriate to the risk level. A ‘likelihood estimate’ includes:
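However an organization chooses to compose its likelihood estimate, a minimal sketch of turning a likelihood rating and an impact-magnitude rating into a risk tier for triaging oversight might look like the following; the 1-5 scales and tier cut-offs are illustrative assumptions, not EqualAI or NIST guidance.

```python
# Hedged sketch: combine likelihood and impact-magnitude ratings into a coarse
# risk tier used to triage oversight resources. The 1-5 scales and the tier
# boundaries are illustrative assumptions only.

def risk_tier(likelihood: int, magnitude: int) -> str:
    """Map 1-5 likelihood and magnitude ratings to a coarse risk tier."""
    score = likelihood * magnitude           # 1 (lowest) .. 25 (highest)
    if score >= 15:
        return "high - senior or specialized executive sign-off required"
    if score >= 6:
        return "medium - enhanced monitoring and periodic review"
    return "low - standard monitoring"

print(risk_tier(likelihood=4, magnitude=5))  # high
print(risk_tier(likelihood=2, magnitude=2))  # low
```

The "high" tier here mirrors the guidance in Section 1.5 that higher-risk systems warrant approval from sufficiently senior or specialized executives.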

7.3. Final Decision

The final go/no-go decision should take into account the risks mapped in the previous steps and the organization's capacity to manage them.