Saturday, 18 May 2013 12:51
Automatic speech recognition
Recognizing all languages: new technology that allows computers to recognize any language without first having to learn each one could revolutionize automatic speech recognition.
If machines become better at recognizing what we say, we can dictate to the computer instead of using the keyboard.

The technology can also be used to search sound archives, a growing need as the use of audio and video on the internet increases.

Speech recognition is difficult because we express ourselves differently orally than in writing.

In addition, there may be considerable variation from person to person, partly because of different dialects.

Scientists have been working on automatic speech recognition (ASR) for fifty years.

Machines make more mistakes than people

"There has been tremendous development in speech recognition during this period, primarily because we have more speech data to train the machines with, and more powerful machines," says Professor Torbjørn Svendsen at NTNU.

Svendsen points to the iPhone app Siri, which lets us use our voice to ask our phone questions as we would ask a human being, without relying on strict syntax and style.

For example, the question "The weather tomorrow?" returns information on tomorrow's weather where you are. A "dumb" system would have required a question such as "What is the weather forecast for Trondheim tomorrow?".

What makes Siri so easy to use, according to Svendsen, is that there is a lot of intelligent programming behind it.

"Now we see that the improvements are starting to level off, and virtually across the board machines make ten times as many mistakes as human beings. Therefore, we looked for alternative ways to solve the problem," says Svendsen.

Sound is produced the same way

Together with colleagues, Svendsen has tested an entirely new approach to developing next-generation speech recognition technology, in a project supported by the research program VERDIKT.

They have shown that the fundamental way of producing speech is the same for all languages. Therefore, their technology could be used for all languages without the speech recognizer having to be trained with speech data from each language, as is the case today.

The researchers have concentrated on phonetics, i.e. the study of how speech sounds are produced.

In addition, they have given the system more knowledge about speech and language: the relationship between the frequency of the vibrations and words, and how we put words together into sentences.

When we speak, what we hear is the speech organ producing the sound. The way we use our lips, tongue, jaw and vocal cords determines which sounds we make. By identifying which production traits are present, we can recognize what is said.

"We get the computer to find out which parts of the speech organ are active by analyzing the acoustic pressure wave captured by the microphone," says Svendsen.

Two previous approaches

Until now, it has been usual to build speech recognition systems using one of two approaches. Both rely on a wide range of speech data and text to teach the computer to recognize different languages.

In the first approach, people observe words and sounds and extract rules that they put into the computer. Whether a sound is voiced or not depends, for example, on whether the vocal cords vibrate.

"If, for example, we analyze a small section of speech and find that it is voiced and that the speech has resonance peaks at 750 and 1200 hertz (Hz), it is likely that the sound is an a. If the resonance peaks are located at 350 and 800 Hz, it is likely that the sound is a u," explains Svendsen.
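
A rule of this kind can be sketched in a few lines of Python. This is only an illustration, not the researchers' system: the function name `classify_vowel`, the nearest-match comparison and the 200 Hz tolerance are assumptions made for the example; only the two pairs of resonance peaks come from the quote above.

```python
import math

# Reference resonance peaks in Hz, taken from the two examples in the quote.
# A real rule set would cover many more sounds.
REFERENCE_PEAKS = {
    "a": (750.0, 1200.0),
    "u": (350.0, 800.0),
}

# Hypothetical tolerance: how far (in Hz) a measurement may lie from a
# reference pair before the rule declines to guess.
MAX_DISTANCE_HZ = 200.0


def classify_vowel(voiced, peak1_hz, peak2_hz):
    """Return the closest reference vowel, or None if no rule applies."""
    if not voiced:
        # The rule in the quote only applies to voiced sections of speech.
        return None
    best_vowel, best_distance = None, MAX_DISTANCE_HZ
    for vowel, (ref1, ref2) in REFERENCE_PEAKS.items():
        distance = math.hypot(peak1_hz - ref1, peak2_hz - ref2)
        if distance <= best_distance:
            best_vowel, best_distance = vowel, distance
    return best_vowel


print(classify_vowel(True, 740, 1180))  # a
print(classify_vowel(True, 360, 820))   # u
```

Every new sound requires another hand-written entry, which illustrates why this approach is limited by what people are able to observe and write down.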

The second approach is to let the computer itself learn from a huge number of examples.

"Such a statistical approach starts by treating all events as equally likely. As machine learning progresses, frequently occurring events get an increased probability, while the probability of rare events is reduced," says Svendsen.
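
The idea of starting from equal probabilities and letting the data shift them can be illustrated with simple counting. The sketch below uses additive smoothing, a standard textbook technique chosen here for illustration; the article does not say which estimation method the researchers use, and the function name and the `prior_count` parameter are assumptions.

```python
from collections import Counter


def estimate_probabilities(events, observations, prior_count=1.0):
    """Estimate event probabilities, starting from a uniform guess.

    Every event begins with the same pseudo-count (prior_count), so with
    no observations all events are equally likely. Each observation adds
    one to the count of the event that occurred.
    """
    counts = Counter(o for o in observations if o in events)
    total = sum(counts.values()) + prior_count * len(events)
    return {event: (counts[event] + prior_count) / total for event in events}


sounds = ["a", "u", "i"]

# Before any training data: every sound gets probability 1/3.
print(estimate_probabilities(sounds, []))

# After seeing "a" three times and "u" once, "a" becomes more likely
# and the unseen "i" less likely.
print(estimate_probabilities(sounds, ["a", "a", "a", "u"]))
```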

"With such an approach, much more speech data can be used than when relying on human observations; there are limits to what humans can interpret," says Svendsen.

Classify sounds

In the future you may also be able to use your voice to find what you are looking for. Svendsen and his colleagues have chosen a position somewhere between these two approaches.

"We believe in the statistical approach. However, there is a certain regularity in how we talk in real life," says Svendsen.

They use knowledge of this regularity to build rules into the machine learning.

Much of the variation in speech is natural, because we differ in, among other things, physiology, accent, education and health. All this affects our voice and how we build sentences. For the machine to understand speech, it must handle the most common variations of normal speech and language.

"We create a computer program that finds the likelihood that the various production traits, such as vibrating vocal cords, are present or not. In this way we classify sounds," he explains.
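
As a rough illustration of classifying sounds from trait likelihoods, the sketch below scores each candidate sound by how well its expected traits match the estimated probabilities. The trait set, the four example sounds, the function name and the assumption that traits can be treated independently are all simplifications made for this example, not details from the project.

```python
import math

# Hypothetical trait signatures: which production traits are expected to
# be present for each sound. Only a handful are listed for illustration.
TRAIT_SIGNATURES = {
    "a": {"voiced": True, "rounded": False, "fricative": False},
    "u": {"voiced": True, "rounded": True, "fricative": False},
    "s": {"voiced": False, "rounded": False, "fricative": True},
    "z": {"voiced": True, "rounded": False, "fricative": True},
}


def classify_sound(trait_probabilities):
    """Pick the sound whose trait signature best fits the probabilities.

    trait_probabilities maps each trait name to the estimated probability
    that the trait is present. Traits are treated as independent, which
    is a simplifying assumption.
    """
    def score(signature):
        return math.prod(
            trait_probabilities[trait] if present
            else 1.0 - trait_probabilities[trait]
            for trait, present in signature.items()
        )

    return max(TRAIT_SIGNATURES, key=lambda s: score(TRAIT_SIGNATURES[s]))


# Vocal cords probably vibrating, lips probably rounded, little friction.
print(classify_sound({"voiced": 0.9, "rounded": 0.8, "fricative": 0.1}))  # u
```

Because the traits describe how the sound is produced rather than which language is spoken, the same table of traits could in principle be shared across languages, which is the property the article highlights.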

Identifying the language in seconds

Svendsen is now continuing to work with international partners to develop a language-independent model that can be used to create competitive speech recognition products.

"This will be both time- and cost-effective, especially for small languages like our own. In this country we can afford to buy solutions that cost a bit, but there are many other languages with only a few million users that will benefit from this technology," says Svendsen.

The technology can also be used in cases where languages are mixed, because it needs only three to thirty seconds to determine which language is being spoken.

In Norway we do not mix in other languages as much; it is worse in Denmark. But the technology can also be used where quotes in the original language appear in between. In addition, it may be useful in intelligence gathering to determine what language a person speaks.
This article is provided by VERDIKT (Core Competence and Growth in ICT), the Research Council's major program for ICT research.
 
© 2013 TECH NEWS: IT, TECHNOLOGY