Project Title

Speech Emotion Recognition

Value Sensitive Design Research: To increase empathy by helping users understand the "hidden" messages of a conversation that go beyond semantics

Keywords: Recognition of Emotion; Natural Language Understanding; Online Communication; Human-Computer Interaction; Value Sensitive Design

In the digital world, it is not easy for people to read the emotions of others. In face-to-face communication, people can recognize others' emotions through facial expressions, body language, and tone of voice; tracking and analyzing these cues through online interactions alone is challenging. The main goal of this project is to provide a comprehensive way of recognizing the emotional aspect of speech, and to increase empathy by helping users understand the "hidden" messages of a conversation that go beyond semantics.

This study presents a series of conceptual, empirical, and technical investigations conducted using the Value Sensitive Design approach. The findings suggest that the biggest limitation of speech emotion recognition technology is the shortage of databases and classification models needed to support its functionality and accuracy. This novel technology still requires extensive research, investigation, and testing going forward.

My Responsibilities: Define and communicate the research vision; Develop and maintain the product roadmap and timeline; Conduct academic literature review and primary research; Monitor performance and make strategic decisions

(*This case study will guide you through the Value Sensitive Design (VSD) approach. I will walk you through each step of what I did.)

Type / 

Course Project

Role / 

Project Manager, Researcher

Year / 

2022

Duration/ 

10 Weeks

Tools/ 

Figma, Miro, Value Sensitive Design

Value Sensitive Design (VSD)

What is Value Sensitive Design?

Value Sensitive Design (VSD) is an approach to design that takes into account ethical values, social implications, and human factors to ensure that technology is designed and used in a way that aligns with human values and social well-being. 

(Learn more about VSD)

My VSD Design Process ⬇️

Conceptual Investigation

Empirical Investigation

Technical Investigation

Conceptual Investigation

Conceptual investigation refers to the process of exploring and analyzing ideas, concepts, and theories in order to gain a deeper understanding and develop new insights.

One of my main focuses was the conceptual investigation, which included a stakeholder analysis, a value analysis, and a literature review of scholarly articles and research papers on the psychology of speech and voice, as well as findings on speech emotion recognition technology.

Refined How Might We Statement:​

How might we improve the depth and accuracy of voice messaging to strengthen interpersonal communication?

Screenshot of Some Research Findings

Refined Research Goals:​
  • Enhance people’s experience of sending and receiving voice messages

  • Lower the chance of misinterpretation of people’s emotions in online communication

Stakeholder Analysis & Value Analysis:

Screenshot of Some Research Findings

To begin the project, I first identified the stakeholders involved. The direct stakeholders include both the users and companies providing the voice message service, while indirect stakeholders include those who do not use or dislike voice messages, as well as social media companies that do not offer this feature. An analysis of stakeholder values was conducted and visualized in a diagram.

In addition to stakeholder values, the project values cultural diversity, as the target audience may speak different languages or with different accents. Thus, it is essential to develop a solution that can recognize and analyze various languages.

As a designer, I also prioritize user privacy and keeping user data secure. Therefore, the project includes measures to protect user data and provides the option to disable speech recognition if desired.

In researching a technology that detects human emotions in speech, I recognized that the task of emotion recognition is challenging...


Screenshot of Some Research Findings

Empirical Investigation

Empirical investigation is a research approach that involves the collection and analysis of data based on direct observation or experience, in order to support or challenge a hypothesis or answer a research question.

 

To gather concrete, verifiable evidence to support and revise the findings of the conceptual investigation, I conducted semi-structured interviews for the empirical investigation.

Interviews

The purpose of the interviews was to learn how users understand emotions when sending and receiving voice messages. A semi-structured format was used because it allowed for exploration of participants' thoughts, feelings, and beliefs about the topic, and provided an opportunity to delve deeply into personal and sensitive issues.

Two college students from the University of Washington (aged 19-20), who identified themselves as frequent voice message users, were interviewed in person at their location to provide a comfortable and secure environment for sharing their thoughts. Each interview lasted about 20 minutes, and none exceeded 30 minutes.

Interview Findings:

Prior to conducting the interviews, I had a narrow understanding of voice messages as only spoken messages sent on social media platforms with a voice bar. However, I came to realize that receiving a voice message is similar to being in a phone call or virtual meeting without a camera. Hence, I revised the definition of "voice message" to include all centralized electronic systems that store vocal messages, which encompasses all methods of conveying voice messages online.

The participants affirmed that observing facial expressions and gestures is helpful in understanding other people's emotions, which aligns with findings from the conceptual investigation.

Additionally, the participants expressed a preference for voice messages over texting in messaging apps, as they believe voice messages convey their mood and emotional state more accurately.


Technical Investigation

Technical investigation refers to the process of examining the technical feasibility, constraints, and opportunities of a design solution. Technical investigation helps designers understand the limitations and possibilities of the technology available and guides them in making informed decisions about design choices that align with technical constraints and requirements.

For the technical investigation, we focused on proactive design, which involves designing systems to support the values identified in a conceptual investigation (Friedman, 1996). Our prototype is designed around two applications: Zoom for professional use and Instagram for non-professional use. Our plugin is currently designed for situations in which users communicate in English, and the speakers' emotions are recognized sentence by sentence.


We used Hidden Markov Models for speech recognition and emotion classification. Models were trained for each individual speaker, with a separate model trained for each archetypal emotion. 60% of each speaker's utterance samples were used to train the models and the remaining 40% were used for testing. The utterances were classified into 6 primary archetypal emotions, each assigned a specific pictogram. The pictograms supplement the facial expressions that are diminished in online communication settings.
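
The setup described above can be sketched roughly as follows. This is only a minimal illustration, not the project's actual implementation: the observation symbols (0 for a low-pitch frame, 1 for a high-pitch frame), the two emotion models, and all probabilities are hypothetical and hand-set rather than trained, and only two of the six emotions are shown. The sketch splits one speaker's utterances 60/40, scores an utterance against one discrete HMM per emotion with the forward algorithm, and picks the emotion whose model gives the highest likelihood.

```python
import math
import random


def split_utterances(utterances, train_fraction=0.6, seed=0):
    """Shuffle one speaker's utterances and split them into train and test sets."""
    shuffled = list(utterances)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    total = sum(alpha)
    log_prob = math.log(total)
    alpha = [a / total for a in alpha]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][symbol]
            for s in range(n_states)
        ]
        total = sum(alpha)
        log_prob += math.log(total)
        alpha = [a / total for a in alpha]
    return log_prob


# Hypothetical, hand-set models (assumption): one two-state HMM per emotion.
# emit[state][symbol] is the probability of symbol 0 (low pitch) or 1 (high pitch).
MODELS = {
    "happy": dict(start=[0.5, 0.5],
                  trans=[[0.7, 0.3], [0.3, 0.7]],
                  emit=[[0.1, 0.9], [0.3, 0.7]]),
    "sad": dict(start=[0.5, 0.5],
                trans=[[0.7, 0.3], [0.3, 0.7]],
                emit=[[0.9, 0.1], [0.7, 0.3]]),
}


def classify(obs, models=MODELS):
    """Return the emotion whose HMM assigns the observation sequence the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, **models[name]))


train_set, test_set = split_utterances(range(10))  # 6 for training, 4 for testing
print(len(train_set), len(test_set))
print(classify([1, 1, 0, 1, 1]))  # mostly high-pitch frames
```

In a real system the model parameters would be estimated from the training utterances (for example with Baum-Welch) and the observations would be acoustic feature vectors rather than two symbols.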

Three broad types of speech variables have been identified as related to the expression of emotional states: the fundamental frequency (F0) contour, continuous acoustic variables, and voice quality.
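
To make the first of these variables more concrete, here is a minimal sketch of estimating F0 for a single short frame of audio. The source does not say how F0 was measured, so the autocorrelation method, the function name `estimate_f0`, the 80-400 Hz search range, and the synthetic 200 Hz test tone are all assumptions made for illustration.

```python
import math


def estimate_f0(frame, sample_rate, f_min=80.0, f_max=400.0):
    """Estimate the fundamental frequency of one frame via autocorrelation."""
    min_lag = int(sample_rate / f_max)  # shortest period considered
    max_lag = int(sample_rate / f_min)  # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Similarity between the frame and a copy of itself shifted by `lag` samples.
        score = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Synthetic stand-in for a voiced speech frame: a 200 Hz sine, 0.1 s at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 200.0 * n / SAMPLE_RATE) for n in range(800)]
f0 = estimate_f0(tone, SAMPLE_RATE)
print(round(f0))
```

Repeating this estimate over consecutive frames of an utterance would give an F0 contour, that is, how pitch rises and falls over time.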


Discussion

10.png
11.png

© 2023 by Alyson
