D̴̼̥̉a̸͂͜ͅd̵͂͑ͅą̶̀͐s̵̯̈́e̶̜͔̍͝t̶̘͎̃s̶̫͙͂

D̴̼̥̉a̸͂͜ͅd̵͂͑ͅą̶̀͐s̵̯̈́e̶̜͔̍͝t̶̘͎̃s̶̫͙͂ is an ongoing interdisciplinary music project and a response to the cultural and economic ecosystem of voice AI and voice data that is rapidly terraforming the meaning, value and function of voice. The project researches open digital music tools, bespoke voice datasets and artistic approaches to dataset creation that challenge the popular narratives and agendas around voice AI – narratives that focus on the spectacle of and fears around the technological results, such as digital clones that reproduce the voice of a famous narrator or singer, rather than on the labor and values of the vocal artists and laypersons who create these datasets. D̴̼̥̉a̸͂͜ͅd̵͂͑ͅą̶̀͐s̵̯̈́e̶̜͔̍͝t̶̘͎̃s̶̫͙͂ aims to create work that playfully, intimately and artistically foregrounds this less visible labor and the often extractive and exploitative relationships around voice AI, treating datasets and dataset making as cultural and artistic processes imbued with context, and to build conversations and draw attention to care and ethics around human voices under the concept of “vocal values”.

While the dominant narrative of Generative AI focuses on the uncanny realism and shock and awe of its audible and visible results, and on the gold rush to capitalize on them, the goal of D̴̼̥̉a̸͂͜ͅd̵͂͑ͅą̶̀͐s̵̯̈́e̶̜͔̍͝t̶̘͎̃s̶̫͙͂ is to decenter the spectacle and instead foreground AI’s total dependence on a complex ecosystem of training data and the people who create and handle that data. This often involves huge amounts of hidden and easily exploited vocal labor by artists, as well as data labelling and categorisation – data work that dwarfs, by an order of magnitude, the engineering work invested in creating the AI technologies themselves. With D̴̼̥̉a̸͂͜ͅd̵͂͑ͅą̶̀͐s̵̯̈́e̶̜͔̍͝t̶̘͎̃s̶̫͙͂ we unearth this ecosystem and inject the messy and productive friction of human bodies, improvisation and reciprocity.

The project began in 2023 and has developed, octopus-like, through 2024 as a series of public workshops on vocal values, interviews and presentations within different communities across research and industry, technological R&D into open and accessible voice synthesis tools, and artistic experiments in voice data collection and the creation of “poems” for dataset performance – combining inspiration from Dadaist typographic and graphic scores, glitch text, and the long history in voice computing of creating phonetically balanced reading scripts for dataset creation (such as the Harvard Sentences and The Rainbow Passage for English); a small sketch of what such phonetic balance involves follows below.
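This is not the project’s own tooling, just a minimal sketch of what “phonetically balanced” means in practice: a candidate reading script is scored by how evenly it covers a language’s phonemes. The tiny ARPAbet-style lexicon and the single example sentence below are toy placeholders; real scripts are built against full pronunciation dictionaries such as CMUdict.

```python
from collections import Counter

# Hypothetical mini pronunciation lexicon (real work uses full
# dictionaries such as CMUdict); maps words to ARPAbet-style phonemes.
LEXICON = {
    "the": ["DH", "AH"],
    "rainbow": ["R", "EY", "N", "B", "OW"],
    "is": ["IH", "Z"],
    "a": ["AH"],
    "division": ["D", "IH", "V", "IH", "ZH", "AH", "N"],
    "of": ["AH", "V"],
    "white": ["W", "AY", "T"],
    "light": ["L", "AY", "T"],
}

def phoneme_coverage(sentences):
    """Count how often each phoneme occurs across a candidate reading script."""
    counts = Counter()
    for sentence in sentences:
        for word in sentence.lower().split():
            counts.update(LEXICON.get(word.strip(".,!?"), []))
    return counts

script = ["The rainbow is a division of white light."]
counts = phoneme_coverage(script)
total = sum(counts.values())
# A "balanced" script keeps these proportions close to those of natural
# speech (or deliberately skews them); here we simply print what we got.
for phoneme, n in counts.most_common():
    print(f"{phoneme:>3}: {n} ({n / total:.0%})")
```

Scripts like the Harvard Sentences were assembled with exactly this kind of bookkeeping in mind, so that a relatively short recording session still exercises the full sound inventory of the language.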

The project involves a series of public workshops and expert meetings to map out the many (and often contradictory) values around human voice data, and to create a proof-of-concept public dataset as an archetype for future diverse, open (but with conditions for care), and ethically sourced voice data. The work has a special focus on vocal artists who fall outside the mainstream economies of voice data and music AI. These voice datasets are carefully crafted through reciprocal and value-aligned collaborations with experimental vocal artists, sound poets, tradition-bearing singers, polyphonic choirs, and others who have developed a particularly unique craft or carry a cultural tradition whose values do not align with the dominant extractivist model of voice data collection and distribution. The datasets are planned for release under a speculative fair-use license once fair and value-aligned voice data transmission is legally, technologically and socially possible. This is not yet the case at the time of this writing, but many groups are working on this thorny problem with different approaches and solutions.

The history of voice technologies such as the vocoder and Auto-Tune tells us that the most interesting part of these technologies is usually how they are “misused”. On the technological and scientific side of the project, D̴̼̥̉a̸͂͜ͅd̵͂͑ͅą̶̀͐s̵̯̈́e̶̜͔̍͝t̶̘͎̃s̶̫͙͂ involves the creation of a new AI voice synthesis instrument, “Tungnaa” (named after the Icelandic “River of Tongues”): a software instrument for live performance and real-time experimentation with the unique aesthetics of deep-neural-network generation of voice-like audio, with the goal of being an open (hackable, free, cross-platform), accessible (low CPU and energy consumption) and playful musical instrument. Tungnaa will be able to run on a modest laptop, without high-end GPU computing resources and without its underlying technology being kept behind a web-based gatekeeping portal or paid services running on energy-hungry GPUs. Inspired in part by the free-form flow of text in musical live coding and by the typographical-vocal experiments of the Dada art movement, Tungnaa also invites artists to invent their own text-based vocal notation systems and to explore notation for all the possible things a human voice can do – and those it cannot – without a technological focus on conventional language or singing styles; a hypothetical sketch of such a notation follows below.
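Tungnaa’s actual notation and interface are its own; purely as a hypothetical illustration of what a performer-invented, text-based vocal notation could look like, the sketch below maps an invented set of characters (letters for rough vocal gestures, “.” for silence, uppercase for louder delivery) to a list of control events that some synthesis backend might consume. Every symbol, class and parameter name here is made up for the example.

```python
from dataclasses import dataclass

@dataclass
class VocalEvent:
    """One control event for an imagined voice-synthesis backend."""
    symbol: str        # the notation character that produced the event
    duration_s: float  # how long to hold the gesture, in seconds
    intensity: float   # 0.0 (whisper) .. 1.0 (shout)

# Invented mapping from notation characters to rough vocal gestures.
GESTURES = {"a": "open vowel", "m": "hum", "r": "trill", "!": "glottal stop"}

def parse_notation(text, base_duration=0.25):
    """Read a free-form notation string as a sequence of VocalEvents.

    Uppercase letters become louder gestures, '.' becomes silence, and
    unknown characters are skipped -- the point being that the notation
    is whatever the performer decides it is.
    """
    events = []
    for ch in text:
        if ch == ".":
            events.append(VocalEvent(".", base_duration, 0.0))
        elif ch.lower() in GESTURES:
            intensity = 1.0 if ch.isupper() else 0.5
            events.append(VocalEvent(ch, base_duration, intensity))
    return events

for event in parse_notation("aaMM..r!A"):
    print(event)
```

The design point is that the mapping itself is part of the artwork: each performer can define their own symbols and their own vocal meanings rather than being locked to conventional lyrics or phoneme strings.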

Early outcomes of this work were presented at IRCAM’s 30th anniversary festival in Paris, The Interdisciplinary Voice symposium at Cambridge University and The Values of Voice public workshop at Mozfest House Amsterdam, and published at the New Interfaces for Musical Expression (NIME) conference in 2024. An experimental performance cycle, BLA BLAVATAR vs. JAAP BLONK, the result of a close collaboration with the sound poet Jaap Blonk, premiered on October 4th, 2024 at the IZIS festival in Koper – an artistic experiment around the tension between freedom and improvisation on the one hand and dataset creation as a live artistic form on the other, using a set of bespoke, phonetically balanced “machine-friendly dataset poems” developed collaboratively with Blonk and reflecting his unique vocal practice. The principles developed within the project and through its workshops have been distilled into a collaborative document on the ethics of voice/data/AI, the Vocal Values Principles – a living document on voice/data that will undergo continued revision and inform the public release of D̴̼̥̉a̸͂͜ͅd̵͂͑ͅą̶̀͐s̵̯̈́e̶̜͔̍͝t̶̘͎̃s̶̫͙͂.

This page will be updated as the project continues!

Credits and Collaborations

This work is supported by S+T+ARTS AIR, an artist residency program of the European Commission that supports interdisciplinary work at the intersection of art, science and technology, with a focus on holistic and human-centered ways of thinking. It is supported primarily through the core AIR hub PiNA in Koper, and by composer Mauricio Valdes, who runs PiNA’s art+science lab and spatial sound studio HEKA.

Core S+T+ARTS AIR Partners

PiNA – Association for Culture and Education, Koper (primary hub)

In4Art

Barcelona Supercomputing Center

High-Performance Computing Center Stuttgart

External Partners and Supporters

S+T+ARTS AIR, “Making the Invisible Visible”, is funded by the European Union under call CNECT/2022/3482066 – Art and the digital: Unleashing creativity for European industry, regions, and society, grant agreement LC-01984767. It is part of the S+T+ARTS programme.

Categories: Collaboration, Community, Music, Performance, Research, Workshop