Our Ph.D. student, Ruoyuan Gao, has successfully defended her dissertation titled “Toward a Fairer Information Retrieval System”. The committee included Chirag Shah (University of Washington, Chair), Yongfeng Zhang (Rutgers University), Gerard de Melo (Rutgers University), and Fernando Diaz (Microsoft).
Ruoyuan investigated the bias present in search engine results to understand the relationship between relevance and fairness in those results. She developed frameworks that could effectively identify the fairness and relevance bounds of a data set. She also proposed an evaluation metric for ranking results that encoded fairness, diversity, novelty, and relevance. With this metric, she developed algorithms that optimized for both diversity fairness and relevance in search results.
With the increasing popularity and social influence of information retrieval (IR) systems, various studies have raised concerns about the presence of bias in IR and the social responsibilities of IR systems. Techniques for addressing these issues can be classified into pre-processing, in-processing, and post-processing approaches. Pre-processing reduces bias in the data that is fed into the machine learning models. In-processing encodes the fairness constraints as part of the objective function or learning process. Post-processing operates as a top layer over the trained model to reduce the presentation bias exposed to users. This dissertation explored ways to bring the pre-processing and post-processing approaches, together with fairness-aware evaluation metrics, into a unified framework as an attempt to break the vicious cycle of bias.
First, we investigated the bias present in search engine results. Specifically, we focused on top-k fair ranking under the statistical parity and disparate impact definitions of fairness. With Google search and a general-purpose text cluster as a lens, we explored several topical diversity fairness ranking strategies to understand the relationship between relevance and fairness in search results. Our experimental results showed that different fairness ranking strategies result in distinct utility scores and may perform differently on different datasets. Second, to further investigate the relationship between data and fairness algorithms, we developed a statistical framework to facilitate various kinds of analysis and decision making. Our framework could effectively and efficiently estimate the domain of the data and the solution space. We derived theoretical expressions to identify the fairness and relevance bounds for data of different distributions, and applied them to both synthetic and real-world datasets. We presented a series of use cases to demonstrate how our framework could be applied to associate data and provide insights into fairness optimization problems. Third, we proposed an evaluation metric for ranking results that encoded fairness, diversity, novelty, and relevance. This metric offered a new perspective on evaluating fairness-aware ranking results. Based on this metric, we developed effective ranking algorithms that optimized for diversity fairness and relevance at the same time. Our experiments showed that our algorithms were able to capture multiple aspects of the ranking and optimize the proposed fairness-aware metric.
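To give a concrete sense of the statistical parity and disparate impact notions mentioned above, here is a minimal Python sketch for a top-k ranking with two groups. It is an illustration only, not the dissertation's formulation: the function names, the two-group setup, and the choice to define a group's selection rate as the share of its items that reach the top k are all assumptions made for this example.

```python
from typing import Sequence, Tuple


def selection_rates(groups: Sequence[str], k: int, protected: str) -> Tuple[float, float]:
    """Return (protected_rate, other_rate) for the top-k of a ranking.

    `groups` lists the group label of each ranked item, best first.
    A group's rate is the fraction of its items that appear in the top k
    (an assumed definition, chosen for illustration).
    """
    n_protected = sum(1 for g in groups if g == protected)
    n_other = len(groups) - n_protected
    if n_protected == 0 or n_other == 0:
        raise ValueError("both groups must be present in the ranking")
    top = groups[:k]
    top_protected = sum(1 for g in top if g == protected)
    top_other = len(top) - top_protected
    return top_protected / n_protected, top_other / n_other


def statistical_parity_difference(groups: Sequence[str], k: int, protected: str) -> float:
    """Difference in selection rates; 0 means parity under this definition."""
    p_rate, o_rate = selection_rates(groups, k, protected)
    return p_rate - o_rate


def disparate_impact_ratio(groups: Sequence[str], k: int, protected: str) -> float:
    """Ratio of selection rates; 1 means parity under this definition."""
    p_rate, o_rate = selection_rates(groups, k, protected)
    if o_rate == 0:
        return float("inf")  # no non-protected item reached the top k
    return p_rate / o_rate


# Hypothetical ranking of eight items from two groups, "a" and "b".
ranking = ["a", "a", "b", "a", "b", "b", "a", "b"]
spd = statistical_parity_difference(ranking, k=4, protected="b")
di = disparate_impact_ratio(ranking, k=4, protected="b")
print(spd, di)
```

In this toy ranking, one of the four "b" items and three of the four "a" items reach the top 4, so group "b" is under-represented by both measures. A fairness-aware ranking strategy would trade some relevance to move these values toward 0 and 1 respectively.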