
FATE Research Group: From Why to What and How

When the public first gained access to the internet, search engines quickly became part of daily life. Services such as Yahoo, AltaVista, and Google were used to satisfy people's curiosity. Although it was inconvenient to switch back and forth between different search engines, it seemed like magic that users could find so much information in such a short time, and they did so without any prior training. Before search engines became popular, the public generally found information in libraries by reading the library catalog or asking a librarian for help. In contrast, typing a few keywords is enough to find answers on the internet. Moreover, search engines have continually refined their algorithms and added powerful features, such as knowledge bases that enhance search results with information gathered from various sources.

Soon enough, Google became the first choice for many people due to the accuracy and quality of its results, and it came to dominate the other search engines. However, high-quality results are not necessarily unbiased. According to a recent study, the top results from web search engines tend to be skewed: some results appear on the first page mainly to capture users' attention, and users tend to click mostly on the results shown there. The study gives an example with an everyday topic, coffee and health: among the first 20 results, 17 discussed health benefits, while only 3 mentioned harms.
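The post does not say how this imbalance was measured; as a rough, hypothetical illustration, the skew in such a result list could be quantified with a few lines of Python (the stance labels and the 50/50 baseline below are assumptions for illustration only):

```python
from collections import Counter

# Hypothetical stance labels for the top 20 results on "coffee and health",
# matching the 17-benefits / 3-harms split mentioned above.
stances = ["benefit"] * 17 + ["harm"] * 3

counts = Counter(stances)
total = sum(counts.values())

# A simple skew measure: how far each stance's share deviates from an
# assumed balanced (50/50) representation.
for stance, count in counts.items():
    share = count / total
    print(f"{stance}: {count}/{total} = {share:.0%} "
          f"(deviation from 50%: {share - 0.5:+.0%})")
```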

This problem led our team at the InfoSeeking Lab to start a new project known as Fairness, Accountability, Transparency, Ethics (FATE). In this project, we have been exploring ways to counterbalance the inherent bias found in search engines and provide fairer representation while maintaining a high degree of utility.

We started this experiment with one big goal: improving fairness. To that end, we designed a system that shows two sets of results side by side, both formatted to look very similar to Google's results page (as illustrated in the picture below). We collected 100 queries on general topics such as sports, food, and travel, along with the top 100 results per query from Google. One of the two sets shown to the user comes directly from Google; the other is generated by an algorithm that reduces bias. The experiment runs for 20 rounds, and in each round the user has 30 seconds to choose the set they prefer.
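The de-biasing algorithm itself is not detailed in this post. As a minimal sketch of one way such a re-ranker could work, assuming each result carries a hypothetical viewpoint label, the following Python snippet round-robins across viewpoints so that no single perspective dominates the top of the list:

```python
from itertools import zip_longest

def rerank_balanced(results):
    """Interleave results from each viewpoint so no single perspective
    dominates the top of the list, while preserving the original
    (relevance) order within each viewpoint.

    `results` is a list of dicts like {"title": ..., "category": ...};
    the field names are hypothetical, not the FATE system's actual schema.
    """
    # Group results by category, keeping their original order.
    groups = {}
    for r in results:
        groups.setdefault(r["category"], []).append(r)

    # Round-robin across categories: one result per category per pass.
    reranked = []
    for round_of_results in zip_longest(*groups.values()):
        reranked.extend(r for r in round_of_results if r is not None)
    return reranked

# Example: a skewed list (four "benefit" results before any "harm" result)
original = [
    {"title": "Benefit A", "category": "benefit"},
    {"title": "Benefit B", "category": "benefit"},
    {"title": "Benefit C", "category": "benefit"},
    {"title": "Benefit D", "category": "benefit"},
    {"title": "Harm A", "category": "harm"},
]
for r in rerank_balanced(original):
    print(r["title"])
```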

For this experiment, we recruited around 300 participants. The goal is to see whether participants can notice a difference between our algorithm's results and Google's. Early results show that participants preferred our algorithm over Google, and we will report the details as soon as the analysis is complete. We are also in the process of writing a technical paper and an academic article.
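As an illustration of how such preference data might be analyzed (the counts below are hypothetical, and this is not necessarily the analysis the team is running), a simple exact binomial test in Python can check whether participants' choices deviate from chance:

```python
from math import comb

def binomial_p_value(k, n, p=0.5):
    """Two-sided exact binomial test: probability of an outcome at least as
    unlikely as k successes out of n trials under chance preference p."""
    def prob(i):
        return comb(n, i) * p**i * (1 - p)**(n - i)
    observed = prob(k)
    # Sum probabilities of all outcomes no more likely than the observed one.
    return sum(prob(i) for i in range(n + 1) if prob(i) <= observed + 1e-12)

# Hypothetical counts: out of 1000 round-level choices, 550 favored the
# de-biased result set over Google's.
k, n = 550, 1000
print(f"preference rate = {k/n:.1%}, p = {binomial_p_value(k, n):.3g}")
```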

We have also designed a game that looks very similar to our experimental system. The game tests your ability to notice bad results, gives you a score and some advice, and lets you challenge friends or family members. To try the game, visit http://fate.infoseeking.org/googleornot.php

For many years, the InfoSeeking Lab has worked on issues related to information retrieval, information behavior, data science, social media, and human-computer interaction. Visit the InfoSeeking Lab website to learn more about our projects: https://www.infoseeking.org

For more information about the experiment, visit the FATE project website: http://fate.infoseeking.org

Manasa Rath successfully defends her dissertation

Manasa Rath, Ph.D. student

Our Ph.D. student, Manasa Rath, has successfully defended her dissertation, titled "Assessing the quality of user-generated content in the presence of automated quality scores." The committee included Chirag Shah (University of Washington, Chair), Vivek Singh (Rutgers University), Kaitlin Costello (Rutgers University), and Sanda Erdelez (Simmons University).

Students turn to online crowdsourced content to fulfill their academic course requirements

Manasa investigated the quality of such user-generated content, namely whether it is correct, credible, clear, and complete, using a framework she developed and validated with multiple experts in the field. She then built an automated method to score content along these dimensions and conducted a user study with 45 undergraduate students to see how, and to what extent, users consider quality while completing an assigned task.
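As a rough sketch of how ratings on the four constructs could be combined into a single automated quality score (the 0-1 scale and the equal weighting are assumptions for illustration, not the scoring method used in the dissertation):

```python
from dataclasses import dataclass

@dataclass
class QualityRatings:
    """Ratings on a 0-1 scale for the four constructs in the framework.
    The scale and the default equal weights are illustrative assumptions."""
    correct: float
    credible: float
    clear: float
    complete: float

    def overall_score(self, weights=None):
        values = {
            "correct": self.correct,
            "credible": self.credible,
            "clear": self.clear,
            "complete": self.complete,
        }
        weights = weights or {name: 0.25 for name in values}
        # Weighted average across the four constructs.
        return sum(weights[name] * value for name, value in values.items())

answer = QualityRatings(correct=0.9, credible=0.8, clear=0.7, complete=0.6)
print(f"overall quality score: {answer.overall_score():.2f}")
```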

Abstract

With the proliferation of participatory web culture, individuals not only create but also consume content in crowdsourced environments such as blogs and question-answering systems. Studies have revealed that users often rely on this content to make a range of choices, from everyday issues to important life decisions, without paying much attention to its quality. In recent years, studies have also demonstrated K-12 students' over-reliance on these crowdsourced sites to fulfill their academic course requirements. It is therefore important to evaluate users' cognizance of quality when they encounter content in these venues. But before identifying to what extent users attend to quality when evaluating user-generated content, it is important to establish what constitutes quality. To address these issues, this dissertation proceeds in three steps. First, it develops a conceptual framework for evaluating the quality of user-generated content, consisting of the constructs correct, credible, clear, and complete. Second, it validates the framework with the help of twelve experts (librarians) and uses this validation to build automated methods that evaluate content quality and produce quality scores. Third, it delves deeper into users' reliance on these automated quality scores through a user study. Forty-five undergraduate students were recruited to investigate their use of automated quality scores while completing tasks under three conditions: users provided with genuine quality scores, users provided with manipulated quality scores, and users provided with no quality scores (control). As prior research has indicated that task completion depends on the type of task, the study provided users with different task types, such as ranking and synthesis, in the presence of quality scores. To further understand users' use of the quality scores while completing the tasks, the author collected eye-tracking metrics such as the total number of gazes, gaze duration, and the number of gazes on the quality scores. Analyses were performed on the fixation data, users' responses to pre- and post-task questionnaires, task scenarios, and interview transcripts. ANOVA and other statistical analyses found no significant differences in users' use of quality scores across task types. The qualitative data also showed that users primarily considered the constructs correct and credible from the charts, and that they relied on the quality scores mainly when they had little familiarity with the topic. The study provides insight into how, and to what extent, users consider quality while completing an assigned task. The contribution of this human information behavior study is twofold: it examines users' reliance on an automated score provided by an intelligent tool, and it studies users' confidence in considering quality when evaluating user-generated content.
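As an illustration of the kind of analysis described above (the gaze counts below are hypothetical and do not reflect the study's data), a one-way ANOVA comparing use of the quality scores across task types can be run in Python with SciPy:

```python
from scipy.stats import f_oneway

# Hypothetical numbers of gazes on the quality-score charts, grouped by
# task type, for participants who were shown scores. All values are made up.
ranking_task = [12, 15, 9, 14, 11, 13]
synthesis_task = [10, 14, 12, 9, 13, 11]

# One-way ANOVA (with two groups this is equivalent to a t-test); a large
# p-value would mirror the dissertation's finding of no significant
# difference across task types.
f_stat, p_value = f_oneway(ranking_task, synthesis_task)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```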