Tag: fairness

Challenging the status quo in search engine ranking algorithms

How can we bring more fairness to search result ranking? This was the question tackled by our FATE (Fairness Accountability Transparency Ethics) group in the 2020 Text REtrieval Conference’s (TREC) Fairness Ranking Track. In the context of searching for academic papers, the goal of the track was to develop an algorithm that provides fair exposure to different groups of authors while ensuring that the papers are relevant to the search queries.

The Approach

To achieve that goal, the group decided to use “gender” and “country” as the key attributes because they were general enough to apply to all author groups. From there, the group created a fairness-aware algorithm that was used to run two specific tasks:

  1. An information retrieval task, where the goal was to return a ranked list of papers to serve as the candidate papers for re-ranking
  2. A re-ranking task, where the goal was to rank the candidate papers by relevance to a given query while accounting for fair author group exposure (a simplified sketch of this step appears below)
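
To make the re-ranking idea concrete, here is a minimal sketch of a greedy, exposure-aware re-ranker in Python. It is an illustration only, not the group's actual submission: the function name `fair_rerank`, the position-discount model, and the target-share inputs are all assumptions made for this example.

```python
import math
from collections import defaultdict

def fair_rerank(candidates, target_share, k=10):
    """Greedy exposure-aware re-ranking sketch (illustrative only).

    candidates:   list of (doc_id, relevance, group), sorted by relevance
    target_share: desired share of exposure per group, e.g. {"F": 0.5, "M": 0.5}
    """
    exposure = defaultdict(float)  # exposure accumulated per group so far
    total = 0.0                    # total exposure handed out so far
    ranking, remaining = [], list(candidates)
    for rank in range(min(k, len(remaining))):
        def deficit(group):
            # How far a group currently sits below its target exposure share.
            share = exposure[group] / total if total else 0.0
            return target_share.get(group, 0.0) - share
        # Prefer the group furthest below its target; break ties by relevance.
        best = max(remaining, key=lambda d: (deficit(d[2]), d[1]))
        remaining.remove(best)
        ranking.append(best)
        gain = 1.0 / math.log2(rank + 2)  # DCG-style position discount
        exposure[best[2]] += gain
        total += gain
    return ranking

docs = [("p1", 0.9, "M"), ("p2", 0.8, "M"), ("p3", 0.7, "F"), ("p4", 0.6, "M")]
print([d[0] for d in fair_rerank(docs, {"F": 0.5, "M": 0.5}, k=4)])
```

Under this toy setup, the lone "F" paper is promoted to the second position even though two "M" papers outrank it on relevance, which is the basic trade the re-ranking task asks for.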

To evaluate the relevance of the academic papers, the group relied on BM25, a ranking function widely used by search engines to score how well a document matches a query.
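
For readers unfamiliar with it, BM25 scores a document against a query term by term, weighting rare terms more heavily (via IDF) while saturating repeated occurrences and normalizing for document length. Below is a small self-contained version using the common defaults k1 = 1.5 and b = 0.75; the track setup may well have used a tuned library implementation instead.

```python
import math
from collections import Counter

def bm25_score(query_terms, doc_terms, corpus, k1=1.5, b=0.75):
    """Score one tokenized document against a query with Okapi BM25.
    `corpus` is a list of tokenized documents (lists of terms)."""
    n_docs = len(corpus)
    avgdl = sum(len(d) for d in corpus) / n_docs  # average document length
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        # Document frequency: how many documents contain the term.
        df = sum(1 for d in corpus if term in d)
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
        freq = tf[term]
        # Term-frequency saturation plus document-length normalization.
        norm = freq * (k1 + 1) / (freq + k1 * (1 - b + b * len(doc_terms) / avgdl))
        score += idf * norm
    return score

corpus = [["fair", "ranking", "search"],
          ["neural", "ranking", "models"],
          ["fair", "exposure", "in", "rankings"]]
print(bm25_score(["fair", "ranking"], corpus[0], corpus))
```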

The Findings

Randomly shuffling the academic papers produced high levels of fairness when only the gender of the authors was considered; when only their country was considered, fairness was noticeably lower. With the proposed algorithm, data can be re-ranked based on an arbitrary number of group definitions. However, to fully provide fair and relevant results, more attributes need to be explored.
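
The gap between the two attributes is easy to reproduce in a toy simulation: with only two broad gender groups, a random shuffle spreads exposure almost evenly, while many small country groups fluctuate far more relative to their targets. The group sizes and the parity measure below are assumptions made purely for illustration, not the track's evaluation metric.

```python
import math
import random
from collections import defaultdict

def exposure_shares(ranking):
    """Share of position-discounted exposure (1/log2(rank+2)) per group."""
    exp = defaultdict(float)
    for rank, group in enumerate(ranking):
        exp[group] += 1.0 / math.log2(rank + 2)
    total = sum(exp.values())
    return {g: e / total for g, e in exp.items()}

def relative_parity_gap(ranking, n_groups):
    """Worst relative deviation from an equal 1/n_groups exposure share."""
    target = 1.0 / n_groups
    shares = exposure_shares(ranking)
    return max(abs(shares.get(g, 0.0) - target) / target for g in set(ranking))

random.seed(0)
genders = ["F"] * 50 + ["M"] * 50                      # 2 broad groups (hypothetical)
countries = [c for c in range(20) for _ in range(5)]   # 20 small groups (hypothetical)
for name, pool, n in (("gender", genders, 2), ("country", countries, 20)):
    random.shuffle(pool)
    print(name, round(relative_parity_gap(pool, n), 3))
```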

Why is fairness in search rankings important?

We use search engines every day to find information and answers for almost everything in our lives, and the ranking of the search results determines what content we are likely to consume. This poses a risk because ranking algorithms often leave out underrepresented groups, whether a small business or a research lab that is not yet established. At the same time, the results tend to show only information we like to see or agree with, which can lack diversity and contribute to bias.

Interested in learning more? Check out the full research paper here: https://arxiv.org/pdf/2011.02066.pdf 

InfoSeeking at The 14th RecSys Conference

RecSys is the premier international forum for the presentation of new research results, systems, and techniques in the broad field of recommender systems. 

We are thrilled to be involved in one of the most important annual conferences for the presentation and discussion of recommender systems research. This year, our Lab Director, Chirag Shah, collaborated with Spotify on a paper – Investigating Listeners’ Responses to Divergent Recommendations – that is being presented at the conference.

Moreover, two of our InfoSeekers, Ruoyuan Gao and Chirag Shah, are also giving a tutorial on “Counteracting Bias and Increasing Fairness in Search and Recommender Systems.”

About the Tutorial

Search and recommender systems have unprecedented influence on how and what information people access. These gateways to information on the one hand create easy and universal access to online information, and on the other hand create biases that have been shown to cause knowledge disparity and ill-informed decisions for information seekers. Most of the algorithms for indexing, retrieval, ranking, and recommendation are heavily driven by the underlying data, which is itself biased. In addition, the ordering of search and recommendation results creates position bias and exposure bias due to the considerable focus on relevance and user satisfaction. These and other forms of bias that are implicitly, and sometimes explicitly, woven into search and recommender systems are becoming increasingly serious threats to information seeking and sense-making processes. In this tutorial, we will introduce the issues of biases in search and recommendation and show how we could think about and create systems that are fairer, with increased diversity and transparency. Specifically, the tutorial will present several fundamental concepts such as relevance, novelty, diversity, bias, and fairness using socio-technical terminologies taken from various communities, and dive deeper into metrics and frameworks that allow us to understand, extract, and materialize them. The tutorial will cover some of the most recent works in this area and show how this interdisciplinary research has opened up new challenges and opportunities for communities such as RecSys.
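
To give one concrete sense of the position bias the abstract refers to: under a standard logarithmic position-discount model (an assumption borrowed from DCG, not something specific to this tutorial), the first few ranks capture a disproportionate share of all the exposure a result page has to give.

```python
import math

def exposure_per_rank(n_results):
    """Share of total exposure at each rank under a DCG-style
    1/log2(rank+2) position-discount model (a common assumption)."""
    weights = [1.0 / math.log2(r + 2) for r in range(n_results)]
    total = sum(weights)
    return [w / total for w in weights]

shares = exposure_per_rank(10)
print(f"rank 1: {shares[0]:.1%} of total exposure")  # roughly 22% with these weights
print(f"top 3:  {sum(shares[:3]):.1%}")              # roughly 47%
```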

DATE

Session A on Sep 25, 10:00 – 11:00
Session B on Sep 25, 21:00 – 22:00

FATE Research Group: From Why to What and How

When the public started getting access to the internet, search engines became part of daily life. Services such as Yahoo, AltaVista, and Google were used to satisfy people’s curiosity. Although using them was not always comfortable, since users had to go back and forth between different search engines, it seemed like magic that they could get so much information in a very short time. Notably, users started using search engines without any previous training. Before search engines became popular, the public generally found information in libraries by reading the library catalog or asking a librarian for help. In contrast, typing a few keywords is enough to find answers on the internet. Not only that, but search engines have continually developed their own algorithms and introduced great features, such as knowledge bases that enhance search results with information gathered from various sources.

Soon enough, Google became the first choice for many people due to its accuracy and high-quality results, and it came to dominate the other search engines. However, while Google’s results are high-quality, they are also biased. According to a recent study, the top web search results are typically biased: some of the results on the first page are there mainly to capture users’ attention, and users tend to click mostly on results that appear on the first page. The study gives an example with an everyday topic, coffee and health: among the first 20 results, 17 were about health benefits, while only 3 mentioned the harms.

This problem led our team at the InfoSeeking Lab to start a new project known as Fairness, Accountability, Transparency, Ethics (FATE). In this project, we have been exploring ways to counteract the inherent bias found in search engines and provide fairer representation while effectively maintaining a high degree of utility.

We started this experiment with one big goal: to improve fairness. For that, we designed a new system that shows two sets of results, both presented in a layout very similar to Google’s. We collected 100 queries and the top 100 results per query from Google on general topics such as sports, food, and travel. One of the two sets is obtained directly from Google; the other is generated by an algorithm that reduces bias. The system runs for 20 rounds, and in each round the user has 30 seconds to choose the set they prefer.

For this experiment, we recruited around 300 participants. The goal is to see whether participants can notice a difference between our algorithm and Google. Early results show that participants preferred our algorithm over Google; we will discuss this in more detail as soon as we finish the analysis. Furthermore, we are in the process of writing a technical paper and an academic article.

We have also designed a game that looks very similar to our system. The game tests your ability to notice bad results, gives you a score and some advice, and lets you challenge your friends or family members. To try the game, click here: http://fate.infoseeking.org/googleornot.php

For many years, the InfoSeeking Lab has worked on issues related to information retrieval, information behavior, data science, social media, and human-computer interaction. Visit the InfoSeeking Lab website to learn more about our projects: https://www.infoseeking.org

For more information about the experiment, visit the FATE project website: http://fate.infoseeking.org

Creating a fairer search engine

It is becoming increasingly important to understand, evaluate, and perhaps rethink our search results, as they continue to show bias of various kinds. Given that so much of our decision-making relies on search engine results, this is a problem that touches almost all aspects of our lives. Read about some of our new work in an article by InfoSeekers Ruoyuan Gao and Chirag Shah:

Gao, R. & Shah, C. (2020). Toward Creating a Fairer Ranking in Search Engine Results. Journal of Information Processing and Management (IP&M), 57 (1).

With the increasing popularity and social influence of search engines in IR, various studies have raised concerns about the presence of bias in search engines and the social responsibilities of IR systems. As an essential component of a search engine, ranking is a crucial mechanism for presenting search results or recommending items in a fair fashion. In this article, we focus on top-k diversity fairness ranking in terms of statistical parity fairness and disparate impact fairness. The former fairness definition provides a balanced overview of search results, where the number of documents from different groups is equal; the latter enables a realistic overview, where the proportion of documents from different groups reflects the overall proportion. Using 100 queries and the top 100 results per query from Google as the data, we first demonstrate how topical diversity bias is present in the top web search results. Then, with our proposed entropy-based metrics for measuring the degree of bias, we reveal that the top search results are unbalanced and disproportionate to their overall diversity distribution. We explore several fairness ranking strategies to investigate the relationship between fairness, diversity, novelty, and relevance. Our experimental results show that using a variant of the fair ε-greedy strategy, we could bring more fairness and enhance diversity in search results without a cost to relevance. In fact, we can improve relevance and diversity by introducing diversity fairness. Additional experiments with TREC datasets containing 50 queries demonstrate the robustness of our proposed strategies and our findings on the impact of fairness. We present a series of correlation analyses on the amount of fairness and diversity, showing that statistical parity fairness highly correlates with diversity while disparate impact fairness does not. This provides clear and tangible implications for future works where one would want to balance fairness, diversity, and relevance in search results.
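
The fair ε-greedy strategy and the entropy-based metrics are specified precisely in the paper itself; as a rough, generic illustration of the two ingredients (not the paper's exact formulations), an ε-greedy ranking policy trades off exploiting relevance against exploring other documents, and Shannon entropy over a result list's topic distribution quantifies how evenly the list covers topics:

```python
import math
import random

def epsilon_greedy_rank(candidates, epsilon=0.2, seed=None):
    """Generic epsilon-greedy ranking sketch: at each rank, take the most
    relevant remaining document with probability 1 - epsilon, otherwise a
    random remaining one. The paper's fair variant differs in detail.

    candidates: list of (doc_id, relevance, group), sorted by relevance.
    """
    rng = random.Random(seed)
    remaining, ranking = list(candidates), []
    while remaining:
        pick = rng.choice(remaining) if rng.random() < epsilon else remaining[0]
        remaining.remove(pick)
        ranking.append(pick)
    return ranking

def topic_entropy(topic_counts):
    """Shannon entropy (bits) of a result list's topic distribution; higher
    means topics are covered more evenly. A stand-in illustration for the
    paper's entropy-based bias metrics, not their exact definition."""
    total = sum(topic_counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in topic_counts.values() if c)

# The coffee-and-health example from above: 17 "benefits" vs 3 "harms" results.
print(topic_entropy({"benefits": 17, "harms": 3}))   # skewed list, ~0.61 bits
print(topic_entropy({"benefits": 10, "harms": 10}))  # balanced list, 1.0 bit
```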