04 September 2009

Sheffield researchers rebuild a voice

Bernadette Chapman

A new technique designed to reconstruct the voices of people who have had their vocal cords removed has been applied by researchers at the University of Sheffield.

Students and academics in the Departments of Computer Science and Human Communication Sciences have helped reconstruct the voice of patient Bernadette Chapman, who had a laryngectomy operation to remove her vocal cords after developing cancer.

Researchers recorded the patient’s voice before the operation. In collaboration with the University of Edinburgh's Centre for Speech Technology Research, which developed the theory and supplied the software, they used a speech synthesis technique based on statistical models of speech sounds. The technique uses the patient’s recordings to adapt an 'average voice model' so that it sounds like the person concerned.

Once a voice is built, it is possible to synthesise any sentence by supplying the word sequence. The voice was built using around seven minutes of speech from the client, which amounted to 100 sentences. This method is therefore much more practical than established ‘Voice Banking’ technologies, which require two or three hours of recording to build a voice.

The client’s regenerated voice was developed by University of Sheffield Master’s student Zahoor Khan as part of his dissertation, with guidance from research student Sarah Creer, whose doctoral work uses the same technique to improve the voices of people with speech disorders. Their work forms part of the research done within the CAST (Clinical Application of Speech Technology) group, which is a multidisciplinary research group interested in applying speech technology in clinical areas such as assistive technology, speech and language therapy and electronic control systems.

Researchers have since assessed the quality of the recordings by asking listeners to judge how similar the simulated voice is to the original, and by asking Mrs Chapman and her family what they think of the voice. All listeners thought the regenerated voice sounded very similar to the original.

Researchers in CAST hope to use these personalised synthetic voices in communication aids for people whose speech has become unintelligible, speaking for them like a human interpreter.

Professor Phil Green, from the Speech and Hearing Research Group in the Department of Computer Science at the University of Sheffield, said: “Your voice is part of your identity and if this technique can help you to recover it and communicate in a natural way your quality of life could be much improved.

“The technique is still evolving and not yet ready to be installed on a hand-held device, but that is coming, maybe as soon as a couple of years' time.”

Bernadette Chapman said: “For many years the Servox machine, or artificial larynx, has been the main means of communication for patients following laryngectomy or for those who have had severe speech impairment. The machine tends to sound very like a Dalek and can be very embarrassing to use, especially in public places.

“I was a Nursing Sister before losing my voice and have always been aware of how difficult patients find the Servox machine, some refusing to use it and so becoming very reclusive. To have a new technique that sounds more like a human voice, indeed more like the patient's own voice prior to operation, would be a great breakthrough and welcomed by all of us who rely on technology to communicate.

“We all take our voices for granted, but it is not until we lose them that we realise what a marvellous gift the voice is. So I thank the researchers on behalf of myself and all who will benefit from this new appliance, for all the work they have undertaken.”
