Abstract
Algorithmic Decision-Making with Stakeholder Participation
Vijay Keswani
2023
The development of trustworthy systems for applications of machine learning and
artificial intelligence faces a variety of challenges. These challenges range from the
investigation of methods to effectively detect algorithmic biases to methodological
and practical hurdles encountered when incorporating notions of representation,
equality, and domain expertise in automated decisions. Such questions make the
task of building reliable automated decision-making frameworks quite complex;
nevertheless, addressing them in a comprehensive manner is an important step
toward building automated tools whose impact is equitable. This dissertation fo-
cuses on tackling such practical issues faced during the implementation of auto-
mated decision-making frameworks. It contributes to the growing literature on
algorithmic fairness and human-computer interaction by suggesting methods to
develop frameworks that account for algorithmic biases and that encourage stake-
holder participation in a principled manner.
I start with the problem of representation bias audit, i.e., determining how well
a given data collection represents the underlying population demographics. For
data collection from real-world sources, individual-level demographics are often
unavailable, noisy, or restricted for automated use. This dissertation proposes a
cost-effective algorithm that approximates the representation disparity of any
unlabeled data collection using a small set of user-specified representative
examples. By eliciting these examples from the users, the method incorporates
the users’ notions of diversity and informs them of the extent to which the given
data collection under- or over-represents socially salient groups. User-defined rep-
resentative examples are further used to improve the diversity of automatically-
generated summaries for text and image data collections, ensuring that the gener-
ated summaries appropriately represent all relevant groups.
The latter part of the dissertation studies the paradigm of human-in-the-loop
deferral learning. In this setting, the decision-making framework is trained to ei-
ther make an accurate prediction or defer to a domain expert in cases where the
algorithm has low confidence in its inference. Our work proposes methods for
training a deferral framework when multiple domain experts are available to as-
sist with decision-making. Using appropriate statistical fairness mechanisms, the
framework ensures that the final decisions maintain performance parity across de-
mographic groups.
By focusing on stakeholder participation, in the form of user feedback or domain
expert involvement, this dissertation advances methods to build trustworthy
decision-making systems that can be readily deployed in practice.
Algorithmic Decision-Making with Stakeholder Participation
A Dissertation
Presented to the Faculty of the Graduate School
of
Yale University
in Candidacy for the Degree of
Doctor of Philosophy
by
Vijay Keswani
Dissertation Director: L. Elisa Celis
May 2023
Copyright © 2023 by Vijay Keswani
All rights reserved.
Contents
Acknowledgements v
1 Introduction 1
2 Background 10
2.1 Study of Stereotypes, Biases, and Their Impact . . . . . . . . . . . . . 11
2.2 Automated Decision-Making . . . . . . . . . . . . . . . . . . . . . . . 12
2.3 Social Biases in Automated Decision-Making . . . . . . . . . . . . . . 15
2.4 Algorithmic Fairness . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3 Auditing for Diversity Using Representative Examples 23
3.1 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.2 Notations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.3 Model and Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.4 Empirical Evaluation Using Random Control Sets . . . . . . . . . . . 36
3.5 Adaptive Control Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.6 Empirical Evaluation using Adaptive Control Sets . . . . . . . . . . . 45
3.7 Discussion, Limitations, and Future Work . . . . . . . . . . . . . . . . 50
4 Implicit Diversity in Image Summarization 53
4.1 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.2 Model and Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.3 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.4 Empirical Setup and Observations . . . . . . . . . . . . . . . . . . . . 77
4.5 Discussion, Limitations and Future Work . . . . . . . . . . . . . . . . 90
5 Dialect Diversity in Text Summarization on Twitter 99
5.1 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
5.2 Dialect Diversity of Standard Summarization Approaches . . . . . . 105
5.3 Model to Mitigate Dialect Bias . . . . . . . . . . . . . . . . . . . . . . . 111
5.4 Empirical Analysis of Our Model . . . . . . . . . . . . . . . . . . . . . 114
5.5 Discussion, Limitations, and Future work . . . . . . . . . . . . . . . . 121
6 Towards Unbiased and Accurate Deferral to Multiple Experts 124
6.1 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
6.2 Model and Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . 129
6.3 Synthetic Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
6.4 Simulations Using a Real-world Offensive Language Dataset . . . . . 155
6.5 Discussion, Limitations, and Future Work . . . . . . . . . . . . . . . . 158
7 Conclusion 163
A Appendices 210
A.1 Appendix for Chapter 3 . . . . . . . . . . . . . . . . . . . . . . . . . . 210
A.2 Appendix for Chapter 4 . . . . . . . . . . . . . . . . . . . . . . . . . . 215
A.3 Appendix for Chapter 5 . . . . . . . . . . . . . . . . . . . . . . . . . . 244
A.4 Appendix for Chapter 6 . . . . . . . . . . . . . . . . . . . . . . . . . . 265
Acknowledgements
To Elisa Celis, who has been the best advisor I could have asked for. Your pas-
sion for the subject of data ethics and your engagement with this field beyond
computer and data sciences have been deeply inspiring. You taught me the im-
portance of clarity and curiosity in creating meaningful research and encouraged
me to explore topics beyond my comfort zone. Even when this led to me pursuing
unconventional projects, you still always supported me and worked with me to
refine my ideas. I am truly grateful for the faith you have shown in me.
To Nisheeth K. Vishnoi, who has been a mentor and a guide to me throughout
my Ph.D. Working with you helped me understand the importance of rigor and
perseverance in research. You have always emphasized the significance of asking
the right questions, developing a better theoretical understanding of my research,
and communicating my work in a clear and concise manner. Your lessons have
been paramount in my research.
To Matthew Lease and Krishnaram Kenthapadi, thank you for being my men-
tors and collaborators. I appreciate your patience, encouragement to continuously
pursue difficult research problems, and advice whenever I needed it. It has been a
privilege working with you both over the last few years.
Thank you to Jas Sekhon and Matthew Lease for being the readers of this dis-
sertation and to Karen Kavanaugh, Jay Emerson, and Andrew Barron for all the
administrative and academic help.
To Chinmayi Arun at the Information Society Project and Demar Lewis and
Chloe Sariego at the Institute of Social and Policy Studies, thank you for having me
as a fellow at your institutes. Being a part of these communities helped me develop
a better appreciation for the complexity that comes along with questions of ethics
and I am grateful to these institutes and the affiliated scholars for providing me
with opportunities to grow beyond the field of data science.
To my fellow Ph.D. colleagues, Colleen, Anay, Curtis, Alex, Shinpei, Megan,
and Sky, who kept me academically afloat, helped me troubleshoot my problems,
proofread this thesis, gave me amazing feedback on my research and writing, and
always patiently listened to my rants about research and beyond. I consider myself
incredibly lucky to have colleagues and friends as supportive as you.
To Stacey, Sarah, Doug, Halley, Topaz, Garth, Prabaha, Vasudha, Kabish, Ak-
shay, and Fiona, friends and housemates whose company kept me sane through
difficult days. My time in New Haven has been lovely, enriching, and full of pleas-
ant surprises thanks to all of you and I will cherish the moments we spent together.
Last but not least, I thank my family for their never-ending love and humor. I
am sure my decisions don’t always make sense to you but I will forever be grateful
for your endless support. I am who I am thanks to all of you.
Chapter 1
Introduction
A rational decision-making process incorporates a variety of values and prefer-
ences of the decision-maker. Using information from prior decisions, evaluation
of counterfactuals, and ranking available actions by priority, decision-making in-
volves a complex mechanism that we, as humans, execute habitually. Our deci-
sions express our personal and social preferences and embody the values we deem
important. Yet, our decision-making processes are not perfect and we all face mo-
ments where our decisions are incorrect. These failures can stem from inadequate
prior information, lack of experience, or from other internal and external factors.
Considering the impossibility of any one human possessing all the knowledge and
experience in the world, we rely on each other to make correct decisions. We defer,
we ask for help, and we learn from others to improve our decision-making. We de-
velop automated tools like computers to assist our decision-making by using them
for routine tasks like arithmetic computations or by employing them for complex
tasks that involve advanced algorithmic systems like map navigation.
A crucial aspect of any decision-making process that involves two or more par-
ties is trust. The exercise of trust building involves beliefs of shared values and
interests among the decision-makers and facilitates the acceptance of one party’s
decision by the other. With the involvement of automated tools in our decision-
making process, the question of trust comes up time and again. Do we trust au-
tomated tools to make decisions that embody our values? Do we trust automated tools
to account for our preferences in an objective manner? Do we trust automated tools to
make decisions in a way that would be the most beneficial to us? We trust other humans
to assist our decisions when they have demonstrated, through intent and action,
that they share interests similar to ours. Can we place the same trust in automated tools
that we didn’t develop and in algorithms that we didn’t design which are, nevertheless,
parts of our daily lives? This dissertation explores this question by investigating the
decisions made by Artificial Intelligence (AI) and Machine Learning (ML) tools
through the perspectives of users and stakeholders. I demonstrate how flawed al-
gorithmic mechanisms can lead to harmful automated decisions and design meth-
ods to counter algorithmic harms, whenever possible, through a judicious process
by which stakeholders are a part of the algorithmic decision-making process.
The availability of large datasets, massive computing power, and progress in
machine learning methods has led to a surge in the use of automated decision-
making frameworks in a variety of domains. Technological and monetary invest-
ments have facilitated significant improvements in the performance of algorithmic
tools and a number of applications of these tools lie in fields that make decisions
affecting humans and society in general. They are employed in numerous criti-
cal applications, including healthcare [152,286,302], advertising [245,257], online
search and recommendation feeds [40,180], lending [231,324], content modera-
tion [80,220,313], recruitment [106], criminal risk assessment [1,89,128,233], and
policing [134,149]. All these applications involve actively processing information
related to people and making decisions that affect society at an individual and in-
stitutional level. The impact of such automated frameworks in shaping our current
and future socio-technical landscape cannot be overstated.
A technical taxonomy of applications of artificial intelligence considers the dif-
ferent kinds of learning methodologies these applications employ. In this context,
two popular learning approaches that cover a large number of applications are
unsupervised and supervised learning¹. Unsupervised learning corre-
sponds to processing large amounts of unlabelled data to extract useful structural
and semantic information about the data [48,82]. For instance, the task of clus-
tering or ranking a large set of images to generate a small subset of representative
images (e.g., ranking in search engines or recommendation feeds) is a prominent
use case of unsupervised learning. Supervised learning, on the other hand, is used
to develop labeling or prediction algorithms that can predict task-relevant out-
comes for given data points [47,140]. Supervised learning techniques are used
to train decision-making frameworks on outcome-labeled datasets, with the goal
of accurately predicting the outcomes for future data. For example, past human
hiring decisions can be used to train an automated recruitment pipeline that then
makes hiring decisions for future applicants. Both learning paradigms are widely
employed in a variety of domains. Unsupervised learning tools, such as recom-
mendation and search, fulfill important informational gaps between data and the
underlying population structure and are now an integral part of our interaction
with the digital world. Similarly, supervised learning algorithms, such as classifi-
cation and regression, are trained to simulate past decisions and deployed to assist
future decision-making.
¹There are other learning paradigms as well, including reinforcement learning and a spectrum
of semi-supervised learning methods that combine techniques from supervised and unsupervised
learning. For this thesis, the focus on supervised and unsupervised learning arises from the interest
in specific applications where algorithmic harms are commonly encountered. See Section 2.2 for
further discussion.

Trust in these automated tools is usually established by testing them on real-
world scenarios and quantifying their performance using statistical measures. While
some errors are expected (as is the case even with human decision-making), we nev-
ertheless demand that these automated tools demonstrate that their decisions align
with the users’ preferences. For instance, search engines sometimes return results
that do not provide us with the information we are looking for. Random errors
might be excusable if they occur infrequently and the overall decision accuracy
is sufficiently high; however, systemic errors that reflect problematic decision-
making patterns reveal deeper issues with the use of automation. Investigating
the pattern of decisions made by certain automated tools indeed paints a grim pic-
ture: real-world algorithmic decisions often encode problematic social biases and
disparately favor some demographic groups over others. Furthermore, this dispar-
ity mirrors the divide in our society as algorithms employed in real-world practice
exhibit and even propagate societal inequalities and negative stereotypes against
groups that have been historically disadvantaged. In the case of unsupervised
learning, real-world applications of summarization or retrieval algorithms have
been shown to exhibit gender and racial biases, leading to a stereotypically-biased
representation of underlying populations [166,232]. Similarly, supervised algo-
rithms deployed in practice often have disparate performance for different demo-
graphic groups, such as in the settings of criminal recidivism [1,64,94,182], pre-
dictive policing [103,149,259,272], recruitment [78,264], and healthcare [92,235].
Clearly, the presence of these biases undermines the trust we can place in the deci-
sions of automated tools. Correspondingly, it is important to study methods that
can (a) evaluate the biases in automated tools, and (b) if possible, modify these
tools so that they do not inherit and propagate social biases of the data or the de-
velopers. Chapter 2 presents an overview of the research on social biases, popular
techniques for automated decision-making, and prior studies demonstrating social
biases in automated decisions.
Addressing social biases in algorithmic tools requires overcoming many differ-
ent kinds of challenges. From a practical viewpoint, the definition of what one
considers to be unbiased or fair is highly context-dependent and relies crucially
on the stakeholders involved in the design of the framework. From a technical
viewpoint, ensuring that the output of an algorithm is fair with respect to socially
salient attributes, such as gender, race, skintone, or dialect, often requires incor-
porating additional constraints or posthoc adjustments into the learning process,
making the task of learning the final framework quite complex [173]. These prac-
tical and technical challenges manifest themselves in different ways in different
applications, making the process of bias mitigation a highly involved exercise that
requires the participation of both users and designers of the framework to con-
verge to an accurate and equitable decision-making framework. This dissertation
discusses both challenges using the methodological frameworks of popular ap-
plications of AI, such as Google Image Search, Twitter recommendation feeds, and
human-AI teams for content moderation. In all of these applications, the presented
research studies the impact of social biases, suggests methods to audit them effi-
ciently, and, in most cases, proposes solutions that can function as unbiased alter-
natives in these applications. The proposed solutions take into account the hurdles
one can encounter while implementing these frameworks in real-world settings
and aim to address biases despite such hurdles.
The first step towards addressing biases in any algorithmic application is to de-
velop methods to efficiently detect or audit them. The statistical question of bias
audit essentially boils down to employing hypothesis testing frameworks to deter-
mine if there are disparities in the representation of different groups in any given
data collection. However, this simple process of bias audit becomes difficult to im-
plement when the group memberships or socially salient attributes (e.g., gender or
skintone) of individual samples are unknown².

²I will use the terms socially salient attributes and protected attributes interchangeably throughout
the dissertation. While protected attributes usually correspond to group identities that are pro-
tected by anti-discrimination laws, I will use this term to also denote attributes that we wish to
protect against algorithmic harm. See Section 2.4 for further discussion on this point.

For example, suppose we wish to check the disparity in gender representation of
Google Image Search results for any
given occupation. Executing this task automatically is difficult since the presented
gender of the people in the images is quite often unavailable. Auditing these re-
sults, in this case, would then involve manual labeling or crowdsourced labeling
of the perceived gender, which can be expensive and time-consuming. Chapter 3
presents an alternative: an efficient algorithm for auditing representational bi-
ases in the absence of socially salient feature information. The proposed algorithm
uses a small set of labeled representative examples (which can be user-specified)
to measure representation disparity in any given unlabeled dataset, under certain
domain assumptions. To measure representation disparity with respect to any so-
cially salient attribute (i.e., the difference in the fraction of elements with one at-
tribute value vs. another), this algorithm calculates the average similarity between
the elements in the unlabeled dataset and the elements in the labeled set of repre-
sentative examples. Using these similarity scores, we can approximate the repre-
sentation disparity by taking the difference between group-wise similarity scores.
Theoretical analysis using standard concentration inequalities demonstrates that
the proposed algorithm produces a good approximation of the actual representa-
tion disparity of the dataset even when the number of labeled examples is loga-
rithmic in the size of the unlabeled dataset. To further reduce the approximation
error, we also propose an algorithm that can construct an appropriate set of labeled
examples for auditing purposes. Empirical evaluations on multiple image and text
datasets demonstrate that the proposed audit algorithm effectively approximates
the representation disparity in any random or topic-specific data collection.
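To make the core computation concrete, the following is a minimal sketch of this similarity-based estimate for a binary attribute. It assumes that elements are represented by feature embeddings, that cosine similarity is the similarity measure, and that two small labeled control sets (one per group) are available; the function and variable names are illustrative, and the sketch omits the calibration and theoretical analysis developed in Chapter 3.

```python
import numpy as np

def estimate_disparity(unlabeled, control_a, control_b):
    """Approximate the representation disparity of `unlabeled` with respect to a
    binary socially salient attribute, using small labeled control sets for the
    two groups. All inputs are (n_samples, d) arrays of feature embeddings.
    Illustrative sketch only, not the exact algorithm of Chapter 3."""
    def normalize(X):
        return X / np.linalg.norm(X, axis=1, keepdims=True)

    U, A, B = normalize(unlabeled), normalize(control_a), normalize(control_b)

    # Average cosine similarity between unlabeled elements and each control set.
    sim_a = (U @ A.T).mean()
    sim_b = (U @ B.T).mean()

    # The gap between group-wise similarity scores acts as a proxy for the
    # difference in the fractions of the two groups in the unlabeled data.
    return sim_a - sim_b
```

A positive value suggests that the first group is over-represented relative to the second; Chapter 3 analyzes when and how well such a similarity gap approximates the true difference in group proportions.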
The primary contribution of the above bias audit algorithm is the use of rep-
resentative examples. These user-defined representative samples incorporate the
user’s notion of diversity and side-step the issue of unavailable group attributes.
We extend the use of such representative samples to debias automatically-generated
summaries. Chapters 4 and 5 cover the field of fair summarization and present
post-processing algorithms for generating diverse summaries using a small set of
representative examples. Both chapters first highlight the presence of social bi-
ases in the outputs of popular image and text summarization algorithms and then
suggest methods to improve group representations in automatically-generated
summaries using user-defined representative examples. Chapter 4 focuses on im-
age summarization, where we first evaluate the diversity in Google Image Search
results. To do so, we collect top image search results using 96 occupations as
search queries (extending the methodology of Kay et al. [166]). We observe that
the search results consistently favor and over-represent gender-stereotypical and
skintone-stereotypical images. Given this issue of misrepresentation, we next pro-
pose efficient methods to incorporate visible diversity in summary results using
user-defined representative examples. Once again, note that these data collections
can be at scales where collecting socially salient attributes or group labels is infeasi-
ble (e.g., search engine results for any possible query) and the use of representative
examples can side-step the issue of unavailable attributes. We propose two post-
processing algorithms, inspired by the well-known Maximal Marginal Relevance
(MMR) algorithm [46], to debias image summaries. Our algorithms take as input
a black-box image summarization algorithm and the unlabeled dataset to be
summarized, and overlay the black-box algorithm with a post-processing step
that diversifies its results using the given repre-
sentative examples. We demonstrate the efficacy of these algorithms over multiple
image datasets, including the Google Image Search dataset we collected. For these
datasets, we observe an improvement in demographic representation in generated
summaries while ensuring that the summaries are visibly diverse in a similar man-
ner as the user-defined representative examples.
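As a rough illustration of this post-processing step, the sketch below greedily re-ranks candidate summary items by trading off the black-box relevance score against similarity to the user-provided representative examples while penalizing redundancy. The weights `lam` and `mu` and the scoring rule are illustrative assumptions; the two algorithms analyzed in Chapter 4 differ in their exact formulations.

```python
import numpy as np

def diversify_summary(candidates, blackbox_scores, representatives, k,
                      lam=0.5, mu=0.3):
    """Greedily select k summary items, balancing (i) the black-box relevance
    score, (ii) similarity to user-provided representative examples, and
    (iii) redundancy with already selected items. MMR-style sketch under
    illustrative weights; not the exact algorithms of Chapter 4."""
    def normalize(X):
        return X / np.linalg.norm(X, axis=1, keepdims=True)

    C = normalize(np.asarray(candidates, dtype=float))
    R = normalize(np.asarray(representatives, dtype=float))
    rep_sim = (C @ R.T).max(axis=1)   # affinity to the representative set
    pair_sim = C @ C.T                # candidate-candidate similarity

    selected, remaining = [], list(range(len(C)))
    for _ in range(min(k, len(C))):
        def score(i):
            redundancy = max((pair_sim[i, j] for j in selected), default=0.0)
            return lam * blackbox_scores[i] + (1 - lam) * rep_sim[i] - mu * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because such a step consumes only the black-box scores and item embeddings, it can be layered on top of any summarization algorithm without retraining it.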
Chapter 5 extends the use of our post-processing algorithm to the domain of
extractive text summarization, i.e., the task of generating a short summary from a
large collection of sentences. Again, we first demonstrate the lack of diversity in
the summaries generated by popular extractive text summarization algorithms. In
particular, our analysis considers diversity with respect to various dialects (e.g.,
Standard English and African-American English dialects) in datasets containing
Twitter posts. We evaluate the dialect diversity in the summaries generated by
frequency-based summarization algorithms (e.g., TF-IDF [203] and Hybrid TF-IDF
[150]), graph-based algorithms (LexRank [104] and TextRank [209]), non-redundancy
based algorithms (MMR [122] and Centroid-Word2Vec [262]), and pre-trained su-
pervised approaches (SummaRuNNer [224]). We observe that, for random and
topic-specific collections from these datasets, most algorithms return summaries
that under-represent certain dialects. To address this dialect bias, we employ the
post-processing algorithm from Chapter 4. As mentioned earlier, this approach re-
quires a small set of labeled representative examples, which in this case is a
dialect-diverse set of Twitter posts given as part of the input. Using these
dialect-diverse sentences as the representative examples, the post-processing
algorithm efficiently increases the dialect diversity of the summary of any given
set of Twitter posts, demonstrating the applicability of this approach for debiasing
social media recommendation feeds.
Chapter 6 considers the supervised learning problem of training a decision-
making framework given human assistance. In applications like risk assessment
[127] and maltreatment hotline screening [65], multiple human experts are avail-
able to assist an automated decision-making framework, so as to share the load
and to cover different kinds of input samples [129]. This chapter studies the setting
where an automated decision-making framework can either make a prediction for
a given input or defer the decision to a human expert when it has low confidence
in its prediction. Since different human experts can have different domains of ex-
pertise and various social prejudices, choosing the appropriate unbiased expert
when deferring the decision is crucial to ensure high accuracy of the final predic-
tions. Hence, in this setting, there is an additional challenge of determining which
decision-maker (among the available humans and the machine) should make the
final decision. Chapter 6 presents a training framework that simultaneously learns
an automated classifier and a deferral model, such that the classifier is the primary
decision-maker but it defers the decision to an appropriate human for input sub-
domains where it lacks sufficient information. Theoretically, we show that this de-
ferral framework can be trained efficiently using gradient descent-based methods
and provide mechanisms to incorporate popular statistical fairness metrics with
the deferral training. The efficacy of the framework is also demonstrated via syn-
thetic experiments and real-world experiments, the latter conducted over a dataset
we curate by asking a large number of crowd-annotators to label the toxicity of a
collection of social media posts.
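To sketch what such joint training can look like, the simplified model below pairs a classifier head with a deferral head that scores the option of keeping the machine prediction versus deferring to one of the available experts, and trains both with a cross-entropy-style surrogate loss. The architecture, loss, and tensor layout here are illustrative assumptions, not the exact formulation or fairness mechanisms of Chapter 6.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeferralModel(nn.Module):
    """Joint classifier + deferral sketch: the deferral head scores the option of
    predicting automatically versus deferring to one of `n_experts` humans.
    A simplified surrogate of the setup studied in Chapter 6."""
    def __init__(self, d, n_classes, n_experts):
        super().__init__()
        self.classifier = nn.Linear(d, n_classes)
        self.deferral = nn.Linear(d, n_experts + 1)  # option 0 = keep the machine prediction

    def forward(self, x):
        return self.classifier(x), self.deferral(x)

def deferral_loss(class_logits, defer_logits, y, expert_preds):
    """Surrogate loss: option 0 is penalized by the classifier's loss on the
    example, option j > 0 by whether expert j's prediction is wrong on it."""
    machine_loss = F.cross_entropy(class_logits, y, reduction="none")
    expert_loss = (expert_preds != y.unsqueeze(1)).float()   # (batch, n_experts)
    costs = torch.cat([machine_loss.unsqueeze(1), expert_loss], dim=1)
    weights = F.softmax(defer_logits, dim=1)                  # who should decide
    return (weights * costs).sum(dim=1).mean()
```

Statistical fairness constraints can be layered onto such a surrogate, for instance by penalizing differences in the group-wise values of the loss; the specific mechanisms used in this dissertation are detailed in Chapter 6.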
The methodologies presented in this dissertation focus on stakeholder partici-
pation. Chapters 3, 4, and 5 present algorithms that address biases using a user-
specified representative set of examples. By utilizing these examples, we ensure
that the final output of the framework aligns with the user’s idea of diversity and
create a participatory process to address representation biases. Similarly, Chap-
ter 6 proposes methods to create decision-making frameworks that employ the
available human experts in a manner that improves the overall predictive accu-
racy. Such a framework is most effective when the human experts are as diverse as
the targeted user population. The inclusion of human feedback helps incorporate
shared values, preferences, and expertise of the stakeholders. In this manner, the
research in this dissertation aims to address crucial faults in the final decisions of
automated decision-making frameworks using stakeholder participation, allowing
us to steadily build trust in the decisions of these frameworks.
Chapter 2
Background
There has been significant interest in the field of fair machine learning and AI ethics
in the last decade. Early investigations by journalists and academic scholars em-
pirically demonstrated the presence of gender and racial biases in the outcomes of
algorithmic frameworks [13,105,232,211]. Seminal works by computer and data
sciences researchers correspondingly studied methods to mathematically model
these automated biases [22,98,136,312]. Following in the footsteps of these works
and inspired by decades of research on decision-making biases in fields like so-
ciology, law, philosophy, and economics, data science and computer science re-
searchers have started critically assessing the biases present in different algorith-
mic applications. In this chapter, I present an overview of the research on social
biases in automated decision-making and situate the work presented in this thesis
within the larger fields of algorithmic fairness and human-computer interaction.
Literature that is directly related to the research presented in this dissertation is
relegated to the individual chapters. The discussion below starts with a brief intro-
duction to the research on stereotypes and biases in human decision-making and
then covers the relevant paradigms of automated decision-making and algorithmic
fairness methods for machine learning and artificial intelligence applications.
2.1 Study of Stereotypes, Biases, and Their Impact
The study of cultivation and the impact of stereotypes has drawn serious interest
in the age of digital media [237,247], primarily due to the increased ease of infor-
mation access and the possibility of stereotype propagation via sources like images
on social media or search results. To define briefly, stereotyping is the process of
inferring common characteristics of individuals in a group. When used accurately,
stereotypes associated with a group are helpful in deducing information about in-
dividuals from the group in the absence of additional information [33,207] and
also function as tools to characterize group action [41,139,287]. However, inaccu-
rate or exaggerated stereotypes can be quite harmful and can inadvertently cause
biases against the individuals from the stereotyped group [116]. Prior studies have
shown that the association of a negative stereotype with a group for a given task
can affect the performance of the stereotyped individuals on the task [281,306];
using the performance on such a task for any kind of future decision-making will
lead to the propagation of such stereotypes and bias the results against one group.
Furthermore, inaccurate stereotypes also lead to an incorrect perception of real-
ity, especially with respect to sub-population demographics [117,166,275]. For
example, stereotypical images of Black women as matriarchs or mammies, which
are further disseminated via digital media, can lead to the normalization of such
stereotypes [68,138]. Given the existence of such negative social stereotypes and
the possibility of their propagation via digital sources, it is important to explore
methods to prevent their exacerbation through the use of automation.
The role of biases has seen similar investigation across social science disci-
plines. Decision-making biases often arise due to the decision-maker’s prejudices
against certain groups or due to a lack of information about individuals from cer-
tain groups (leading to a reliance on stereotypes) [24,107]. These biases manifest
themselves in the form of reduced access to resources or diminished performance
of decision-making systems for individuals from disadvantaged groups. Contin-
uous audits of human and institutional decision-making settings have re-
vealed the presence of biases with respect to race, gender, and other demographic
and socially salient attributes. These include biases in
socially critical applications like mortgage approval [8], the criminal justice system
and policing [217], healthcare [92], recruitment [142], and social welfare access
[277].
Frequent and extensive audits of these decision-making settings are crucial to
ensure the accountability of the associated institutions. In particular, third-party
audits of biases have been shown to be impactful in the past, often resulting in sig-
nificant oversight and modification of harmful decision-making processes [4,39].
It is important to subject automated decision-making to a similar level of continu-
ous scrutiny, and methods to efficiently audit or mitigate social biases can be useful
in developing accountable and transparent technologies.
2.2 Automated Decision-Making
Automated decision-making can take a variety of forms and can be studied in the
context of any application that involves machine support. For the purposes of this
dissertation, I focus on automated frameworks that are designed to make decisions
by processing large amounts of prior and current data and decisions.
As mentioned before, unsupervised learning algorithms learn mathematical
(and potentially interpretable) patterns within a large data collection [48,82]. Given
a large number of samples from a particular domain, unsupervised learning algo-
rithms aim to deduce the underlying representation of the samples, which can then
be used for future decision-making. Clustering, summarization, and outlier detec-
tion are all instances of the unsupervised learning approach that allow for
a structured analysis of a large amount of data.
Supervised learning aims to learn the mathematical relationship between task-
related features and the associated outcomes (usually characterized by class labels)
through data [47,140]. Given task-related features for samples observed in the
past and the decisions made or true outcomes for these samples, supervised learn-
ing algorithms are used to infer a mathematical function that maps the features
to the decisions/outcomes; this function can then be used to make decisions for
future samples. The feature-decision pairs used to learn the function are called the
training data for the learning algorithm. For example, in healthcare, this training
data could correspond to health and demographic data of patients and whether
they were afflicted with a particular disease. The supervised learning algorithm
trained on this data can then be potentially used to predict the likelihood of any
future patient suffering from the same disease using their health and demographic
information.
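As a minimal illustration of this workflow, the sketch below fits a simple classifier on synthetic (feature, outcome) pairs standing in for past patient records and then estimates disease likelihoods for new patients; the data, features, and model choice are hypothetical and only meant to show the fit-then-predict pattern.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical training data: rows are patients, columns are health and
# demographic features; y records whether the patient had the disease.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))          # e.g., age, blood pressure, BMI, glucose
y = (X[:, 1] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=500) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Fit a mapping from features to outcomes on past (feature, outcome) pairs...
model = LogisticRegression().fit(X_train, y_train)

# ...and use it to estimate the disease likelihood for future patients.
future_risk = model.predict_proba(X_test)[:, 1]
```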
The primary difference between supervised and unsupervised learning is that
in unsupervised learning there are no “decisions” or labels associated with the
available data. For example, clustering simply involves finding subsets within a
given dataset such that elements within a subset are more similar to each other
than to the elements outside the subset [178]. The learned cluster identities can
then be used for downstream labeling or decision-making, but these identities
wouldn’t be known beforehand.
Finally, semi-supervised learning combines the paradigms of supervised and
unsupervised learning and is applied in situations where a small amount of la-
beled data is available along with a large amount of unlabelled data. In this case,
combining the function learned using the labeled data with representations learned
using unlabelled data is important to build an overall robust decision-making sys-
tem. Chapters 4 and 5 demonstrate the use of unsupervised and semi-supervised
learning paradigms for the task of summarization.
Note that automated decision-making is traditionally associated with just su-
pervised or semi-supervised learning. This is because the notion of decision-making
is clear in the applications of these paradigms: given data about past decisions,
they learn to simulate these decisions in the future. For unsupervised learning, prior
decisions are not available. Nevertheless, the representations learned using un-
supervised learning algorithms are still used for decision-making. Clustering al-
gorithms are often used to identify the appropriate cluster for future samples so
that cluster-specific processing techniques can be employed appropriately. Sum-
marization algorithms are used to decide which samples best represent a given
large collection. Recommendation systems similarly decide the content that is
most likely to be relevant to a given user. Considering that the applications of
unsupervised learning involve making automated decisions, I will use the term
automated decision-making for unsupervised learning applications as well.
While the goal of supervised learning is to simulate (and potentially replace)
human decision-making, in practice, automated decision-making tools are often
deployed side-by-side with expert humans [84,133]. For example, machine learn-
ing models in healthcare assist doctors and medical practitioners with accurate
diagnosis [38,171]. Criminal risk assessment tools operate with judges to provide
an empirical estimate of recidivism risk [96,127]. Human experts are also involved
in auditing the outputs from automated models to detect errors for input samples
where the automated system has insufficient experience, as observed in the case of
child maltreatment hotline screening [65]. Many other examples of similar hybrid
human-machine decision-making frameworks exist in the literature [236,282,315].
For such human-in-the-loop frameworks, the approaches used for learning a clas-
sifier can often be different from those used in traditional supervised learning algo-
rithms. Assuming one or more human experts are available to assist a classifier in
decision-making, an ideal training process should ensure that the capabilities and
expertise of the humans are appropriately utilized to improve prediction accuracy
or performance. However, since humans can have additional costs associated with
their decisions (corresponding to time or resources invested to make predictions),
the classifier will be expected to bear the primary decision-making load and hu-
mans should only be consulted when the classifier has low confidence in its deci-
sion. One can see that training human-in-the-loop frameworks can be more com-
plex than traditional supervised learning; along with training an accurate classifier,
the framework should also decipher the domains of expertise of different human
experts so that they can be consulted appropriately. This field of research has seen
a lot of recent interest due to the applicability of such frameworks in a variety of
real-world settings. Algorithms to learn accurate human-in-the-loop frameworks
have been put forward by a number of recent studies [204,218,219,253]. Chapter 6
proposes a novel learning algorithm for human-in-the-loop deferral frameworks,
where the goal is to train a classifier that can either make an accurate decision or
defer the decision to an appropriate human expert when the classifier has
low confidence in its decision. Considering that a number of applications are cur-
rently adopting automated decision-making systems, human-in-the-loop frame-
works can allow such applications to smoothly and steadily transition from human
decision-making to automated decision-making.
2.3 Social Biases in Automated Decision-Making
Either due to inappropriate data or due to imperfect model designs, automated
decision-making frameworks currently display problematic social biases in their
output. Applications where decision-making institutions have historically denied
opportunities to underprivileged groups of the population, e.g., credit lend-
ing [258], will still suffer from the impact of such historical biases when automa-
tion is incorporated into the decision-making framework. Years of discrimina-
tory decision-making can corrupt the training datasets used to learn automated
decision-making models. Corrupted datasets are indeed currently employed for
creating models in many real-world applications, such as recruitment [78,264],
healthcare [235,302], facial analysis [39,269], risk assessment [13,94], and predic-
tive policing [272]. Furthermore, inappropriate practices in past and current data
collection, aggregation, and processing have compounded biases
against minority groups. For example, survey instruments for data collection often
use oversimplified race categorizations, which ignore the historical and political
background that led to popular racial classifications [135]. Similarly, measurement
errors in data collection can be disparately larger for groups that have historically
been denied equal opportunities, leading to diminished information about indi-
viduals from the group [285]. Misrepresentation or under-representation of certain
demographic groups in the data used to develop the decision-making model will
affect the performance of the model for these marginalized groups. Inappropri-
ate representation limits the amount of information that a trained model learns
about the affected group and correspondingly results in larger errors when used
for decision-making over this group [293].
Biases in data used for learning automated models can affect the outcome in
many different problematic ways. When the model is used for resource allocation,
as in the case of loan applications, admissions, risk assessment, or any other super-
vised learning application, biases in outcomes can result in disparate resource al-
location across demographic groups, leading to a denial of equal opportunity [20].
Representational biases can also affect public perceptions associated with misrep-
resented or under-represented groups. The negative portrayal of minority groups
in the input data or the resulting decisions of automated frameworks propagates,
and sometimes even exacerbates, the negative stereotypes associated with these
groups [166]. Beyond data biases, inappropriate model designs that do not ac-
count for the heterogeneity in the underlying population demographics can also
result in disparate performance across groups [103].
In particular, inappropriate representation in a data collection can imply two
different kinds of mismatches between the data collection and the underlying pop-
ulation. From a technical viewpoint,