Jiaying Li

Master's Student in Music Technology
Georgia Institute of Technology
jli3269@gatech.edu

I am Jiaying Li, a second-year master's student in music technology at Georgia Tech. I work as a researcher in the Computational and Cognitive Musicology Lab (CCML) at the Georgia Tech Center for Music Technology (GTCMT) under the guidance of Dr. Nat Condit-Schultz and Dr. Claire Arthur. During my first year, my research focused on music cognition & perception and audio signal processing; in my second year, my interests shifted to human-computer interaction (HCI) and visualization.

I completed my Bachelor of Engineering in Electronic Information Engineering, with a minor in Philosophy, at the Chinese University of Hong Kong, Shenzhen. During my undergraduate studies, I worked at the Shenzhen Institute of Artificial Intelligence and Robotics for Society (AIRS), where I developed an interactive automatic music generation system and worked on facial beauty prediction.

Outside of academics, I enjoy playing the piano and writing novels in my spare time. I am also a professional speedcuber.

Publications

Conference Paper

K. Xue, Z. Liu, J. Li, X. Ji and H. Qian, "SongBot: An Interactive Music Generation Robotic System for Non-musicians Learning from A Song," 2021 IEEE International Conference on Real-time Computing and Robotics (RCAR), Xining, China, 2021, pp. 1300-1305, doi: 10.1109/RCAR52367.2021.9517454. [PDF][Video]

Conference Poster

J. Li and N. Condit-Schultz, “Four Chords Go a Long Way: Measuring Chord Progression Similarity in Chinese Popular Music”, 2022 Society for Music Perception and Cognition (SMPC). [Poster]

Research

Active Projects

Mid-Air Text Interaction with Hand Tracking

Collaborator: Dr. Yalong Yang
Georgia Tech Immersive Visualization & Interaction Lab

Our goal is to develop a ten-finger text input prototype for virtual reality based on hand tracking. The prototype supports text input, text selection, and basic editing operations such as copy, cut, and paste.


Past Projects

WHOI ALVIN Submersible

[Video]

Collaborators: Dr. Bruce Walker, Angela Dai
Georgia Tech Sonification Lab

This project creates an immersive orientation and safety-training workflow for the WHOI ALVIN submersible in virtual reality (VR). During the orientation, users follow guided instructions while the interior panels and objects are highlighted, and they can use the controllers to interact with interior objects such as multimeters, buttons, and switches. The training workflow teaches safety procedures, for example how to use the fire extinguisher, what to do if the submersible becomes stuck, and how to dispose of trash. The simulation is built in Unity and tested on the Meta Quest 2 and Meta Quest 3.



Audio Technology II Interaction Website

[Website] [Code]

Collaborator: Dr. Claire Arthur
Georgia Tech Center for Music Technology

This research project addresses the challenges that students without an engineering background face when learning the fundamentals of digital signal processing (DSP). Concepts such as convolution, autocorrelation, modulation, and the discrete Fourier transform (DFT) are difficult to grasp without visualization and hands-on practice. To aid comprehension, we developed a website of interactive modules that animate and simulate these DSP concepts, along with coding exercises designed to help beginners acquire essential audio processing skills. By engaging with these tools and exercises, students build a deeper understanding of DSP principles. The website is implemented in HTML, CSS, and JavaScript and is scheduled for integration into the Audio Technology II curriculum in Fall 2023.
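
As a flavor of what the modules illustrate, here is a minimal sketch of discrete convolution, written in Python rather than the site's JavaScript and purely for illustration: it implements the textbook definition y[n] = sum_k x[k] * h[n - k] and checks the result against NumPy's library routine.

    import numpy as np

    # Discrete convolution by definition: y[n] = sum_k x[k] * h[n - k]
    def convolve_naive(x, h):
        y = np.zeros(len(x) + len(h) - 1)
        for n in range(len(y)):
            for k in range(len(x)):
                if 0 <= n - k < len(h):
                    y[n] += x[k] * h[n - k]
        return y

    x = np.array([1.0, 2.0, 3.0])
    h = np.array([0.5, 0.5])     # a simple two-tap smoothing filter
    print(convolve_naive(x, h))  # [0.5 1.5 2.5 1.5]
    print(np.convolve(x, h))     # the library routine agrees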



Accessible Learning Material User Interface Prototype for Disabled Students

[PDF] [Code]

Advisor: Dr. Michael Helms

Our project aims to help designers create more accessible course materials for visually impaired students. We sought advice from developers of existing apps and websites and from professional therapists to understand how these students learn and internalize new knowledge. The data gathered from these interviews can be used to improve existing computational cognitive models: based on our findings, we outline a design schema for our computational cognitive model that incorporates key considerations and strategies for creating accessible learning tools for visually impaired students.



Singing Voice Conversion based on Target Waveform Mapping

[Poster] [Code]

Collaborator: Dr. Nat Condit-Schultz
Georgia Tech Center for Music Technology

Singing voice conversion (SVC) converts the voice of singer A into the voice of singer B without changing the lyrics. We implemented and validated a baseline MelGAN model inspired by work on voice conversion (VC), training it on the existing NUS-48 database collected by the National University of Singapore. We found that vocal range has a large impact on training results, so we collected our own database and proposed a more intuitive approach, the waveform mapping method. Comparing results across training data, we found that a database with a larger frequency range yields better conversions, and that waveform mapping works best for specific pairs of source and target timbres. The main contributions of this project are the new dataset and the waveform mapping algorithm, which offer new insights into the singing voice conversion task.
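
One practical upshot of the vocal-range finding is that it is worth checking the pitch range of the source and target singers before training. The sketch below is not part of the project code; it assumes librosa's pyin pitch tracker as one way to estimate the sung fundamental-frequency range of a recording.

    import numpy as np
    import librosa

    def pitch_range(path):
        """Estimate the sung fundamental-frequency range of one recording."""
        y, sr = librosa.load(path, sr=22050, mono=True)
        f0, voiced, _ = librosa.pyin(y, fmin=librosa.note_to_hz("C2"),
                                     fmax=librosa.note_to_hz("C6"), sr=sr)
        f0 = f0[voiced]  # keep only voiced frames (unvoiced frames are NaN)
        return float(np.nanmin(f0)), float(np.nanmax(f0))

    # Compare source and target singers before training, e.g.:
    # print(pitch_range("singer_a.wav"), pitch_range("singer_b.wav"))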



A Chord Progression Similarity Calculation Method Adapted to Human Ear Perception

[PDF] [Code]

Collaborator: Dr. Nat Condit-Schultz
Georgia Tech Center for Music Technology

This research presents a novel method, the Chord Progression Similarity Index (CPSI), for quantifying the similarity between chord progressions. The CPSI measures the sum of first-order transitions between pitch classes in consecutive chords. The algorithm allows for customization by incorporating weights for important metric positions, chord degrees, and other factors. The effectiveness of the CPSI is evaluated through two experiments. In a perceptual experiment, 34 participants rated the perceived similarity of 40 diatonic chord progressions. Various CPSI calculations were compared to the participants' responses, revealing a weak correlation. Notably, the participants' responses were most accurately predicted by considering only root progression. In a separate computational experiment, the CPSI was employed to assess chord progression diversity within a novel database comprising 200 Chinese songs, representing the most popular selections from each year between 2012 and 2021. The findings provide evidence of increasing similarity in Chinese pop music over recent years, aligning with numerous existing reports. This research showcases the utility of the CPSI in quantifying chord progression similarities and contributes insights into the evolving characteristics of Chinese pop music.
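
To make the idea concrete, here is a toy Python sketch of first-order pitch-class transitions. It scores similarity as an unweighted Jaccard overlap between transition sets, whereas the actual CPSI adds configurable weights for metric position, chord degree, and other factors; see the linked code for the real implementation.

    from itertools import product

    # Chords as sets of pitch classes (0 = C, ..., 11 = B)
    C, G, Am, F = {0, 4, 7}, {7, 11, 2}, {9, 0, 4}, {5, 9, 0}

    def transitions(progression):
        """All first-order pitch-class transitions between consecutive chords."""
        pairs = set()
        for a, b in zip(progression, progression[1:]):
            pairs.update(product(a, b))
        return pairs

    def similarity(p1, p2):
        """Unweighted Jaccard overlap of two progressions' transition sets."""
        t1, t2 = transitions(p1), transitions(p2)
        return len(t1 & t2) / len(t1 | t2)

    print(similarity([C, G, Am, F], [C, F, Am, G]))  # a value in [0, 1]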



An Interactive Music Generation Software Through Painting

[Code]

Georgia Tech Center for Music Technology

This musical painting software lets users draw on a canvas and generates the corresponding music in real time according to built-in rules: users select brushes, paint, and hear their strokes translated into music. The project source code was written in Python.
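
The built-in rules themselves are not detailed here; as a purely hypothetical illustration, one such rule might map the vertical position of each stroke point to pitch, as in the sketch below.

    # Hypothetical rule for illustration: higher on the canvas -> higher pitch.
    def stroke_to_midi(points, canvas_height=400, low=48, high=84):
        """Map a brush stroke (a list of (x, y) points) to MIDI note numbers."""
        notes = []
        for _, y in points:
            # Invert y (canvas origin is top-left) and scale into the pitch range.
            pitch = low + round((1 - y / canvas_height) * (high - low))
            notes.append(pitch)
        return notes

    print(stroke_to_midi([(0, 350), (10, 200), (20, 50)]))  # a rising melody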



An Innovative Musical Instrument Prototype Based on Rubik's Cube: Musical Cube

[PDF] [Code]

Georgia Tech Center for Music Technology

The Musical Cube is a prototype that combines a traditional Rubik's Cube with sound-generating software. It produces chords and melodies from the cube's rotations, creating music that corresponds to its scrambled state. The instrument serves both as a learning aid for Rubik's Cube beginners studying solution algorithms and as an improvisation tool for musicians. Evaluations with performers and audiences highlight its potential and future prospects. The Musical Cube offers a unique fusion of technology, music, and puzzle-solving, providing an engaging and interactive musical experience for enthusiasts and musicians alike. The project source code was written in Python.
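
The prototype's actual note-generation rules are described in the linked PDF; the sketch below is only a hypothetical illustration of the general idea, assigning an assumed chord to each face of the cube and translating a move sequence into a chord sequence.

    # Hypothetical face-to-chord mapping; the prototype's real rules differ.
    FACE_CHORDS = {
        "U": ("C",  [60, 64, 67]),
        "D": ("G",  [55, 59, 62]),
        "L": ("Am", [57, 60, 64]),
        "R": ("F",  [53, 57, 60]),
        "F": ("Dm", [50, 53, 57]),
        "B": ("Em", [52, 55, 59]),
    }

    def scramble_to_chords(moves):
        """Translate a move sequence such as "R U R' U'" into chord names."""
        return [FACE_CHORDS[m[0]][0] for m in moves.split()]

    print(scramble_to_chords("R U R' U' F2"))  # ['F', 'C', 'F', 'C', 'Dm']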



DJI Goggles 2

[Product] [Summary of VR Specs]

Collaborators: DJI Goggles 2 Group
SZ DJI Technology Co., Ltd.

I worked as a User Experience Research Scientist at DJI, where I contributed to the design of DJI Goggles 2. I conducted competitor analyses of AR and VR products to inform our design decisions, and ran perceptual experiments to improve the user experience of DJI Avata and DJI Goggles 2. I also designed ergonomics experiments to refine the shape and weight of the DJI Goggles mask and analyzed optical data to optimize the qualifying range for our products. The outcomes of this work informed the final design of DJI Goggles 2.



Facial Beauty Scoring Recommendations Based on VGG16 and XGBoost

Collaborators: David Zhang, Lynn Liao
Shenzhen Institute of Artificial Intelligence and Robotics for Society

Facial attractiveness significantly impacts many aspects of social interaction, including entertainment, career prospects, and everyday life. To investigate facial beauty and its influence, we constructed two large databases: a celebrity faces database of more than 4,900 well-known personalities and a human face database of over 6,000 samples obtained from Hefei University. Using the VGG16 model, we extracted contour features of the eyes and eyebrows, then built a system that uses XGBoost to predict facial beauty scores. This research contributes to the understanding of facial aesthetics by providing a robust framework for evaluating attractiveness from facial contour features.
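
As a rough sketch of this kind of two-stage pipeline: the real system extracted eye and eyebrow contour features, whereas this illustration simply uses pooled VGG16 activations as features, and the data below is a random placeholder rather than the actual databases.

    import numpy as np
    import xgboost as xgb
    from tensorflow.keras.applications import VGG16
    from tensorflow.keras.applications.vgg16 import preprocess_input

    # VGG16 without its classification head serves as a fixed feature extractor.
    extractor = VGG16(weights="imagenet", include_top=False, pooling="avg")

    def extract_features(images):
        """images: float array of shape (n, 224, 224, 3) with raw pixel values."""
        return extractor.predict(preprocess_input(images.copy()), verbose=0)

    # Random placeholder data standing in for aligned face crops and ratings.
    faces = np.random.rand(32, 224, 224, 3) * 255.0
    scores = np.random.rand(32) * 5.0

    # Gradient-boosted trees regress beauty scores from the deep features.
    model = xgb.XGBRegressor(n_estimators=200, max_depth=4)
    model.fit(extract_features(faces), scores)
    print(model.predict(extract_features(faces[:3])))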