Our Ph.D. student, Yiwei Wang, has successfully defended her dissertation titled “Authentic vs. Synthetic: A Comparison of Different Methods for Studying Task-based Information Seeking”. The committee included Chirag Shah (University of Washington, Chair), Nick Belkin (Rutgers University), Kaitlin Costello (Rutgers University), and Diane Kelly (University of Tennessee Knoxville).
In task-based information seeking research, researchers often collect data about users’ online behaviors to predict task characteristics and personalize information for users. User behavior may be directly influenced by the environment in which a study is conducted, and the tasks used. This dissertation investigates the impact of study setting and task authenticity on users’ searching behaviors, perceived task characteristics, and search experiences. Thirty-six undergraduate participants finished one lab session and one remote session in which they completed one authentic and one simulated task. The findings demonstrate that the synthetic lab setting and simulated tasks had significant influences mostly on behaviors related to content pages, such as page dwell time and number of pages visited per task. Meanwhile, first-query behaviors were less affected than whole-session behaviors, indicating the reliability of using first-query behaviors in task prediction. Subjective task characteristics—such as task motivation and importance—also varied in different settings and tasks. Qualitative interviews reveal why users were influenced. This dissertation addresses methodological limitations in existing research and provides new insights and implications for researchers who collect online user search behavioral data.
Souvick Ghosh successfully defends his dissertation
Our Ph.D. student, Souvick Ghosh, has successfully defended his dissertation titled “Exploring Intelligent Functionalities of Spoken Conversational Search Systems”. The committee included Chirag Shah (University of Washington, Chair), Nick Belkin (Rutgers University), Katya Ognyanova (Rutgers University), and Vanessa Murdock (Amazon).
Conversational search systems often fail to recognize the information need of the user, especially for exploratory and complex tasks where the question is non-factoid in nature. In any conversational search environment, as prior literature in intelligent systems suggests, spoken dialogues by the user communicate the search intent and the information need of the user to the system. In response, the system performs specific, expected search actions. This is a domain-specific natural language understanding problem where the agent must understand the user’s utterances and act accordingly. The meaning of these spoken utterances can be deciphered by accurately identifying the speech or dialogue acts associated with them. However, only a few studies in the information retrieval community have explored automatic classification of speech acts in conversational search systems, and this creates a research gap. Also, during spoken search, the user rarely has control over the search process as the actions of the system are hidden from the user. This eliminates the possibility of correcting the course of search (from the user’s perspective) and raises concerns about the quality of the search and the reliability of the results presented. Previous research in human-computer interaction suggests that the system should facilitate user-system communication by explaining its understanding of the user’s information problem and the search context (referred to as the system’s model of the user). Such explanations could include the system’s understanding of the search on an abstract level and the description of the search process undertaken (queries and information sources used) on a functional level.
While these interactions could potentially help the user and the agent to understand each other better, it is essential to evaluate if explicit clarifications are necessary and desired by the user.
We conducted a within-subjects Wizard-of-Oz user study to evaluate user satisfaction and preferences in systems with and without explicit clarifications. The results of the Wilcoxon Signed Rank Test showed that the use of explicit system-level clarifications produced no positive effect on the user’s search experience. We also built a simple but effective Multi-channel Deep Speech Classifier (MDSC) to predict speech acts and search actions in an information-seeking dialogue. The results highlight that the best performing model predicts speech acts with 90.2% and 73.2% accuracy for the CONVEX and SCS datasets, respectively. For search actions, the highest reported accuracy was 63.7% and 63.3% for the CONVEX and SCS datasets, respectively. Overall, for speech act prediction, MDSC outperforms all the traditional classification models by a large margin, showing improvements over the nearest baseline of 54.4% for CONVEX and 18.3% for SCS. For search actions, the improvements were 32.3% and 2.2% over the closest machine learning baselines. The results of the ablation analysis indicate that the best performance is achieved using all three channels for speech act prediction and using only metadata features for search action prediction. Individually, metadata features were most important, followed by lexical and syntactic features.
In this dissertation, we provide insights on two intelligent functionalities which are expected of conversational search systems: (i) how to better understand the natural language utterances of the user, in an information-seeking conversation; and (ii) if explicit clarifications or explanations from the system will improve the user-agent interaction during the search session. The observations and recommendations from this study will inform the future design and development of spoken conversational systems.
Prof. Chirag Shah is receiving the KSJ Award 2019 and giving a keynote at ECIR 2020.
Our lab director, Prof. Chirag Shah, is receiving the Microsoft BCS/BCS IRSG Karen Spärck Jones Award (KSJ Award) 2019 and he is giving a keynote this Wednesday at the 42nd European Conference on Information Retrieval (ECIR 2020).
About the KSJ Award
The KSJ Award was created by The British Computer Society Information Retrieval Specialist Group (BCS IRSG) in conjunction with the BCS and has been running since 2008. The award is also sponsored by Microsoft Research. See more details at https://irsg.bcs.org/ksjaward.php
About the keynote
“Task-Based Intelligent Retrieval and Recommendation”
While the act of looking for information happens within the context of a task from the user side, most search and recommendation systems focus on user actions (‘what’), ignoring the nature of the task that covers the process (‘how’) and user intent (‘why’). For a long time, scholars have argued that IR systems should help users accomplish their tasks and not just fulfill a search request. But just as keywords have been good enough approximators for information need, satisfying a set of search requests has been deemed to be good enough to address the task. However, with changing user behaviors and search modalities, specifically found in conversational interfaces, the challenge and opportunity to focus on task have become critically important and central to IR. In this talk, I will discuss some of the key ideas and recent works, both theoretical and empirical, to study and support aspects of task. I will show how we could derive a user’s search path or strategy and intentions, and how they could be instrumental in not only creating more personalized search and recommendation solutions, but also solving problems not possible otherwise. Finally, I will extend this to the realm of intelligent assistants with our recent work in a new area called Information Fostering, where our knowledge of the user and the task can help us address another classical problem in IR: people don’t know what they don’t know.
Our Ph.D. student, Jiqun Liu, has successfully defended his dissertation titled “A State-Based Approach to Supporting Users in Complex Search Tasks”. The committee included Chirag Shah (University of Washington, Chair), Nick Belkin (Rutgers University), Kaitlin Costello (Rutgers University), and Dan Russell (Google).
Liu’s study focuses on understanding the multi-round search processes of complex search tasks by using computational models of interactive IR and on developing personalized recommendations to support task completion and search satisfaction. From the study, the team built a search recommendation model based on the Q-learning algorithm. The results demonstrated that the simulated search episodes can improve search efficiency to varying extents in different task scenarios.
Previous work on task-based interactive information retrieval (IR) has mainly focused on what users found along the search process and the predefined, static aspects of complex search tasks (e.g., task goal, task product, cognitive task complexity), rather than how complex search tasks of different types can be better understood, examined, and disambiguated within the associated multi-round search processes. Also, it is believed that the knowledge about users’ cognitive variations in task-based search process can help tailor search paths and experiences to support task completion and search satisfaction. To adaptively support users engaging in complex search tasks, it is critical to connect theoretical, descriptive frameworks of search process with computational models of interactive IR and develop personalized recommendations for users according to their task states. Based on the data collected from two laboratory user studies, in this dissertation we sought to understand the states and state transition patterns in complex search tasks of different types and predict the identified task states using Machine Learning (ML) classifiers built upon observable search behavioral features. Moreover, through running Q-learning-based simulation of adaptive search recommendations, we also explored how the state-based framework could be applied in building computational models and supporting users with timely recommendations.
Based on the results from the dissertation study, we identified four intention-based task states and six problem-help-based task states, which depict the active, planned dimension and the situational, unanticipated dimension of search tasks, respectively. We also found that 1) task state transition patterns, as features extracted from the interaction process, could be useful for disambiguating different types of search tasks; 2) the implicit task states can be inferred and predicted using behavioral-feature-based ML classifiers. With respect to application, we built a search recommendation model based on the Q-learning algorithm and the knowledge we learned about task states. We then applied the model in simulating search sessions consisting of potentially useful query segments with high rewards from different users. Our results demonstrated that the simulated search episodes can improve search efficiency to varying extents in different task scenarios. However, in many task contexts, this improvement comes at the price of hurting the diversity and fairness of information coverage.
This dissertation presents a comprehensive study on a state-based approach to understanding and supporting complex search tasks: from task state and state transition pattern identification, through task state prediction, all the way to the application of a computational state-based model in simulating dynamic search recommendations. Our process-oriented, state-based framework can be further extended with studies in a variety of contexts (e.g., multi-session search, collaborative search, conversational search) and deeper knowledge about users’ cognitive limits and search decision-making.
Hands-On Introduction to Data Science: a new book by our lab director, Dr. Shah
If you are looking to get started in data science, or are at an entry to intermediate level, this book is just the right fit for you. “Hands-On Introduction to Data Science”, the newly published book by our lab director, Dr. Shah, is filled with hands-on examples, a wide range of practice exercises, and real-life applications that will help you develop a solid understanding of the subject. No prior technical background or computing knowledge is needed for this book.
If you are an instructor looking for a good textbook for your class, the book also provides end-to-end support for teaching a data science course. It comes with curriculum suggestions, slides for each chapter, datasets, program scripts, and solutions to each exercise, as well as sample exams and projects.
Reviews & Endorsements
‘Dr. Shah has written a fabulous introduction to data science for a broad audience. His book offers many learning opportunities, including explanations of core principles, thought-provoking conceptual questions, and hands-on examples and exercises. It will help readers gain proficiency in this important area and quickly start deriving insights from data.’ Ryen W. White, Microsoft Research AI.
Book Summary: This book introduces the field of data science in a practical and accessible manner, using a hands-on approach that assumes no prior knowledge of the subject. The foundational ideas and techniques of data science are provided independently from technology, allowing students to easily develop a firm understanding of the subject without a strong technical background, as well as being presented with material that will have continual relevance even after tools and technologies change. Using popular data science tools such as Python and R, the book offers many examples of real-life applications, with practice ranging from small to big data. A suite of online material for both instructors and students provides a strong supplement to the book, including datasets, chapter slides, solutions, sample exams and curriculum suggestions. This entry-level textbook is ideally suited to readers from a range of disciplines wishing to build a practical, working knowledge of data science.
Almost everything in the book is accompanied by examples and practice, both in-chapter and end-of-chapter, so students stay engaged and can use hands-on experience to see how theories relate to solving practical problems.
Assumes no prior technical background or computing knowledge, lowering the barrier to entering the field of data science so that students from a range of disciplines can benefit from a more accessible introduction.
Supplemented by a generous set of material for instructors, including curriculum suggestions and syllabi, slides for each chapter, datasets, program scripts, answers and solutions to each exercise, and sample exams and projects, giving instructors end-to-end support for teaching a data science course.