This post is about my Master’s thesis project (University of Washington school of design). The focus is interaction design for extended reality (XR), and this project specifically researches how conversations are initiated and terminated in social virtual reality (VR). The article discusses the secondary research (literature review) and my own case studies in VR, with the goal of finding possible design interventions for VR conversations.
The seemingly inevitable emergence of the metaverse has sparked heated discussions in the design community. A glance at the extended reality R&D market suggests that the virtual world is here to stay and will most likely keep growing fast. Technological advancements in hardware and software give designers a wide variety of opportunities to start prototyping novel interactions and experiences that were deemed impossible a decade ago. This is especially true of social virtual reality (VR): because of its delicate and humanistic nature, the possibilities for manipulation and enhancement are vast. Among these opportunities are conversations. People join these virtual spaces with their customized/personalized avatars and usually have conversations with each other. In some senses, it is very similar to any other social setting: a bar, a concert, a party, etc. However, the “virtualness” of the interactions brings numerous pros and cons that greatly affect the experience. Because the medium is new, few experiments have been done specifically in VR. There are, however, numerous theories and well-grounded studies on self-representation and interpersonal conversation that form the basis for my research.
Due to the lack of data, I needed to examine this new social VR experience (the Facebook Horizon beta at the time) from an interaction design standpoint. So, naturally, I spent tens of hours in VR, engaging in conversations or simply observing other people’s (avatars’) conversations.
In these observations, I was looking for pressure points in the flow of interactions: personal space and orientation (how people stand and form conversation circles); conflict resolution, such as how people negotiate who claims the speaker role when several people talk at once; how people greet strangers and acquaintances in VR; leave-taking interactions; rejection of bad behavior; and how all of this differs from the real world. This initial research phase helped me form some general ideas for the secondary research phase. First, I immersed myself in the literature on self-representation and avatar theory. In general, I can sort my findings in this area into three categories.
1. How do we create/choose an avatar?
2. How does our avatar affect our own and others’ behavior?
3. The technicalities of building and interacting with avatars.
The first two categories consist of twenty peer-reviewed papers and cover many interesting ideas that I will use later in the discussion.
The first point of interest was “How do we recreate ourselves in an avatar?” Here the emphasis is on the relation between self-awareness and self-representation. The idea is that people generally like to present some degree of “realness” when creating/choosing their digital avatar. This realness could be an aspect of their character (being an artist, or adventurous), a level of physical likeness (hairstyle, skin color, etc.), or a combination of both (Kafaie et al. 2010). The VR environment (context), however, can amplify or dampen the urge for realness. It seems that the more self-aware the VR environment lets you be, the closer to your real self you want to be represented (Vasalou et al. 2005).
The second point of interest was how tangible virtual factors like relative size, overall shape, voice and tone, distance, etc. affect others’ perceptions, as well as our own behavior inside and outside the VR environment. Here, grounded ideas like proxemics (the social rules of proximity between people) govern our virtual interactions just as they do in the real world (Slater M. 2010). Also, quantifying social/psychological arousal (using GVS, CVS methods) and comparing it between virtual and real scenarios revealed how a virtual social scenario feels “real” to the perceiver (Karnath et al. 2019). The Proteus effect is also discussed: Nick Yee (2009) showed that stereotypical traits on an avatar boost individuals’ engagement in the related stereotype-conforming behaviors. In another interesting set of papers (2005–2009), Professor Jeremy Bailenson at Stanford University ran a series of experiments on eye movement and gaze. He proposed the idea of multiple points of view for multiple viewers, a novel interaction that cannot be replicated outside the VR medium.
All these ideas and experiments gave me a new perspective from which to go back and observe interactions in social VR. This time I focused on the interactions in Facebook’s Horizon and Microsoft’s Altspace at a more granular level. Since most of what happens in social VR today is conversation, I did a thorough literature review on this topic too. On the topic of approaching others to start a conversation, I found the work of Professor Genta Yoshioka (2015–2019) fascinating. He and his team at Shizuoka University used mannequins and a given scenario to record and analyze the paths people take when approaching them. The finely tuned details of their study reveal a lot about the physical/environmental properties of conversation initiation, which I think can very well be adapted to any VR environment too. On the matter of terminating conversations, and how termination is an agreement on (usually) temporary inaccessibility by one or both parties, I found the studies by Mark Knapp (1973) and C. Fichten (1992) very useful. Basically, they categorize the interactions involved in two people saying goodbye (leave-taking) and compare them across different scenarios. Their work revealed how much the duration of the conversation and the perceived length of unavailability (after the goodbye) shape the way leave-taking takes place.
Another important source for learning about conversations and VR is pop culture. I used the Clips.io and Fandango services to search for movie clips containing conversation starters in settings like bars, train cars, etc. Additionally, I looked at Neuromancer (Gibson, 1984) and Ghost in the Shell (Shirow 1989). On the surface, fictional work about VR has a different tone. It is usually a dystopian portrayal of our world in which all the current issues (pollution, inequality, military, etc.) are amplified. However, it is useful for seeing how experience design in VR can evoke primed negative perceptions of the virtual world.
The synthesis and analysis of the research phase is still ongoing at the time of writing. However, I have gathered the results so far to form a general scope and, finally, some principles for the design phase. The scope consists of several guidelines, each looking at the conversation from a different perspective.
Anatomy of a conversation: initiation. To start a conversation, one needs to evaluate the other party’s willingness/need to participate in it (Rutter D. 1977). Information about the other person’s interest can be gathered from multiple sources: the environment, the context, and body language/gestures. Research shows that in real scenarios people can make an accurate enough guess, or signal their interest/disinterest in a very conscious and clear way (C. Fichten, 1992). Virtual scenarios, however, start out with different affordances.
- The context in VR: information about the context can easily be conveyed. For example, when two people in a bar setting are talking about their romantic night, it sends a clear message to a third person not to start a conversation. Places/situations, conversation topics, and the number of people involved in a conversation can be used just as in the real world, and can also be amplified in VR. This helps the conversation starter evaluate the other side’s need for a conversation and decide whether to get involved.
- Body Language/Gestures: VR provides two types of gestures, direct and indirect (automatic). Direct gestures are read directly from the positions of the hands, fingers, and head, which are tracked by the physical device and mapped onto a virtual body. Indirect gestures are generated by algorithms that translate the direct gesture data into more fine-tuned eye, shoulder, and waist movements. Because of the difference between people’s actual posture (probably sitting on a couch, wearing VR goggles) and their virtual posture (usually standing up, moving around), it is hard to translate postures correctly or to read them for information about how much someone wants a conversation.
- Paralinguistics in VR: VR uses people’s own voices in conversations, so an “I’m really enjoying this!” in a deeply sarcastic voice can be heard and understood by others. However, the lack of real context (precise facial and gestural expressions) might cause problems in some situations, which calls for a clear signifier on top of the subtle paralinguistic cues.
- VR-specific affordances: despite its many disadvantages compared to real-world scenarios, the VR environment has some specific qualities that could open up design solutions for conversations:
  - User-specific perspective: avatars can be hidden from or revealed to specific users. Also, an avatar can always appear with its back to, or facing, another avatar or group of avatars, regardless of the angle of approach.
  - Time: elapsed and current time can be used in the virtual environment. The duration of an ongoing conversation can be used to gauge its importance to the participants.
  - Spatial Proxemics: the distance between avatars or the size of a conversation group can serve as a signifier in starting and ending conversations.
  - Spatial Audio: users can turn their voice on/off for specific avatars, or limit/amplify their voice within an area.
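To make the proxemics and spatial audio affordances a bit more concrete, here is a minimal sketch of how a prototype might turn avatar distance into a signifier and a voice volume. Everything in it is an assumption for illustration: the zone limits are approximate values borrowed from Edward Hall’s classic proxemic zones, the attenuation curve is a generic inverse-distance model, and the function names (`proxemic_zone`, `voice_gain`) are hypothetical. None of this comes from Horizon or Altspace.

```python
import math

# Approximate upper limits in metres, after Hall's proxemic zones (assumed values).
ZONES = [(0.45, "intimate"), (1.2, "personal"), (3.6, "social")]


def distance(a, b):
    """Straight-line distance between two (x, y, z) avatar positions."""
    return math.dist(a, b)


def proxemic_zone(d):
    """Name the proxemic zone for a distance d; anything beyond 'social' is 'public'."""
    for limit, name in ZONES:
        if d < limit:
            return name
    return "public"


def voice_gain(d, ref=1.0, rolloff=1.0, muted=False):
    """Inverse-distance attenuation: full volume within `ref` metres, fading beyond.

    `muted` models a user turning their voice off for one specific listener.
    """
    if muted:
        return 0.0
    return ref / (ref + rolloff * (max(d, ref) - ref))


speaker, listener = (0.0, 0.0, 0.0), (3.0, 0.0, 4.0)
d = distance(speaker, listener)                   # 5.0 metres apart
print(proxemic_zone(d), round(voice_gain(d), 2))  # prints: public 0.2
```

A design prototype could, for example, dim or highlight a conversation circle depending on the zone an approaching avatar is in, or raise `rolloff` to keep a private conversation from carrying across the room.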
So far, these are the tools and knowledge I have gathered for my master’s thesis project. The next step for me is to take these synthesized materials into an ideation and brainstorming process and generate ideas. With some fresh ideas, I will be ready for the build phase (next quarter), in which I will prototype high-fidelity, interactive VR environments where people can put themselves in conversation-starting/ending scenarios and challenge the ideas. Finally, by gathering those experiences and synthesizing them, I will refine the initial ideas into design solutions and publish them as my final thesis project paper.