voicebox: an interactive sound and video installation
Sarah Drury
sdrury@temple.edu


VOICEBOX provides an intimate interaction within an enclosed "box" and an amplification of that interaction outside the box. Inside VOICEBOX, the Participant has privacy in which to utter a peep, a word, a scream or a song into a microphone. These vocalizations are sensed and matched in pitch by audio voice fragments, which are stored in a database and played back on interaction, piecing together a narrative. Simultaneously, the Participant watches a silent video on a small monitor inside the box. On voice interaction, this video switches to a parallel video sequence. In this way, the Participant "practices" the VOICEBOX, evoking hidden layers of sound and image.


Outside VOICEBOX, the Participant appears to viewers as a shadow image juxtaposed with a large screen projection of the video narrative. When the Participant sustains a note, their shadow is keyed into the narrative as a third video channel. Like early silent film accompanists, the Participant generates live accompaniment to the moving picture. The narrative takes shape iteratively, in interactive response to the unique vocal patterns and verbal elaborations of each Participant.


A story is told in this human-computer duet, but by whom? The narrative is based on a Chinese myth about a woman who becomes two people and is later reunified, and fragmentation and integration run parallel here. VOICEBOX invites participants into the paradoxical intersection of several story lines and vocal presences. This paradigm manifests the dislocated "self" of digital and communications culture: disembodied, distributed, transient, potentially disintegrated. VOICEBOX posits an intimate experience of this "disintegration," reflecting both the technological moment and a philosophical inquiry into the transience of identity.


Input: Max/MSP analyzes the voice signal spectrum
Output:
— audio samples sequenced to generate a live narrative with both random and linear elements;
— computer-controlled video switching between a default video channel and a secondary channel, with a third channel that appears on a sustained vocal note, keying video through a live shadow mask.
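
The input/output logic above can be illustrated with a minimal sketch. The installation itself runs in Max/MSP; the Python below is not the actual patch, only one plausible reading of the behavior described. All names (nearest_fragment, ChannelSelector), the channel labels, the sustain threshold and the pitch tolerance are hypothetical, and the sketch assumes a detected pitch in Hz is already available from the spectrum analysis.

```python
import math

def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to a (fractional) MIDI note number."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)

def nearest_fragment(freq_hz, fragments):
    """Pick the stored voice fragment whose pitch is closest to the input.

    fragments is a hypothetical mapping of fragment name -> pitch in Hz.
    Distance is measured in semitones so octaves are treated evenly.
    """
    target = hz_to_midi(freq_hz)
    return min(fragments, key=lambda name: abs(hz_to_midi(fragments[name]) - target))

class ChannelSelector:
    """Choose a video channel from the voice input (assumed behavior).

    silence        -> "default"
    any voice      -> "secondary"
    sustained note -> "shadow_key" (third channel keyed through the shadow)
    """

    def __init__(self, sustain_seconds=1.5, tolerance_semitones=0.5):
        # Both thresholds are assumptions; the source gives no values.
        self.sustain_seconds = sustain_seconds
        self.tolerance = tolerance_semitones
        self._anchor_midi = None  # pitch of the note currently being held
        self._held = 0.0          # seconds the note has stayed near the anchor

    def update(self, freq_hz, dt):
        """Advance by dt seconds; freq_hz is None when no voice is detected."""
        if freq_hz is None:
            self._anchor_midi = None
            self._held = 0.0
            return "default"
        midi = hz_to_midi(freq_hz)
        if self._anchor_midi is not None and abs(midi - self._anchor_midi) <= self.tolerance:
            self._held += dt
        else:
            # A new or shifted pitch restarts the sustain timer.
            self._anchor_midi = midi
            self._held = 0.0
        return "shadow_key" if self._held >= self.sustain_seconds else "secondary"
```

In this reading, each analysis frame would call update() with the current pitch estimate, and a voiced frame would also trigger nearest_fragment() to select the audio sample to play. How the actual piece combines its random and linear sequencing of samples is not specified in the text and is left out here.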