An analysis of mainstream microphone array technology on the market

From wind rippling across wheat fields to insects chirping at night, the human ear can distinguish roughly 400,000 sounds at frequencies between 20 Hz and 20,000 Hz. With the help of the brain, humans can also tell speech from noise and filter out interference. So what about machines?

What is a microphone array?

A microphone array is, literally, an arrangement of microphones: a system made up of a number of acoustic sensors (usually microphones) that samples and processes the spatial characteristics of a sound field.

Microphone arrays were already being used in speech signal processing research in the 1970s and 1980s. Since the 1990s, speech signal processing algorithms based on microphone arrays have gradually become a research hotspot. In the era of voice control, the technology matters more than ever.

What can a microphone array do?

1. Speech Enhancement

Speech enhancement is the process of extracting clean speech from a noisy signal when the speech is corrupted, or even drowned out, by various kinds of noise (including other speech). This is what allows DingDong to recognize voice commands accurately in noisy environments.

Schematic diagram of speech enhancement by microphone array beamforming

Since the 1960s, researchers such as Boll have proposed speech enhancement techniques that use a single microphone, known as single-channel speech enhancement. These methods need the fewest microphones and take full advantage of the spectral characteristics of speech and noise, so they suppress noise well in some scenarios. Because they are simple and easy to implement, they are widely used in existing voice communication systems and consumer electronics.
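Boll is best known for spectral subtraction, so that method is used here to illustrate the single-channel idea. The following is only a minimal sketch under stated assumptions: a 440 Hz tone stands in for speech, the noise is white, and a separate noise-only recording is available for estimating the noise spectrum. The function names, frame length, and spectral floor are illustrative choices, not taken from any product described in this article.

```python
import numpy as np

def split_frames(x, frame_len, hop):
    """Cut x into overlapping frames of frame_len samples, hop samples apart."""
    count = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(count)])

def spectral_subtraction(noisy, noise_only, frame_len=512, floor=0.05):
    """Subtract an average noise magnitude spectrum from every frame of noisy."""
    hop = frame_len // 2
    # Periodic Hann window: copies shifted by half a frame sum to one,
    # so plain overlap-add rebuilds the waveform.
    window = np.hanning(frame_len + 1)[:-1]
    noise_frames = split_frames(noise_only, frame_len, hop) * window
    noise_mag = np.abs(np.fft.rfft(noise_frames, axis=1)).mean(axis=0)

    out = np.zeros(len(noisy))
    for i, frame in enumerate(split_frames(noisy, frame_len, hop)):
        spec = np.fft.rfft(frame * window)
        # Keep a small spectral floor instead of letting magnitudes go negative.
        mag = np.maximum(np.abs(spec) - noise_mag, floor * np.abs(spec))
        cleaned = mag * np.exp(1j * np.angle(spec))  # reuse the noisy phase
        out[i * hop : i * hop + frame_len] += np.fft.irfft(cleaned, n=frame_len)
    return out

def snr_db(estimate, reference):
    """Signal-to-noise ratio of estimate against a known clean reference."""
    error = estimate - reference
    return 10 * np.log10(np.sum(reference ** 2) / np.sum(error ** 2))

# Assumed test signal: a tone as a stand-in for speech, plus white noise.
rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * 440 * t)
noisy = clean + rng.normal(0, 0.5, fs)
noise_only = rng.normal(0, 0.5, fs // 2)

enhanced = spectral_subtraction(noisy, noise_only)
inner = slice(512, -1024)  # ignore the edges, where frames do not fully overlap
snr_before = snr_db(noisy[inner], clean[inner])
snr_after = snr_db(enhanced[inner], clean[inner])
print(f"SNR before: {snr_before:.1f} dB, after: {snr_after:.1f} dB")
```

The sketch works only because the noise is stationary and spectrally different from the tone. When noise overlaps the speech in both time and frequency, a single channel has much less to work with.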

In a complex acoustic environment, however, noise arrives from all directions and often overlaps the speech signal in both time and frequency. Add the effects of echo and reverberation, and capturing relatively clean speech with a single microphone becomes very difficult. A microphone array combines the spatial and temporal information in the speech signal, so it can extract the sound source and suppress noise at the same time.
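The simplest way to use that spatial information is a delay-and-sum beamformer. The sketch below assumes that the target's arrival delay at each microphone is already known as a whole number of samples, and it uses a circular shift (`np.roll`) for brevity. The delays, the tone, and the noise level are hypothetical values chosen for illustration.

```python
import numpy as np

def delay_and_sum(signals, delays):
    """Align each channel by its integer sample delay, then average.

    signals: array of shape (num_mics, num_samples)
    delays:  arrival delay of the target at each microphone, in samples
    """
    num_mics, num_samples = signals.shape
    out = np.zeros(num_samples)
    for channel, delay in zip(signals, delays):
        # Advance the channel so the target lines up across microphones.
        out += np.roll(channel, -int(delay))
    return out / num_mics

def snr_db(estimate, reference):
    """Signal-to-noise ratio of estimate against a known clean reference."""
    error = estimate - reference
    return 10 * np.log10(np.sum(reference ** 2) / np.sum(error ** 2))

rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs) / fs
target = np.sin(2 * np.pi * 440 * t)
delays = [0, 3, 6, 9]  # hypothetical delays for a four-microphone line array

# Each microphone hears a delayed copy of the target plus its own noise.
noisy = np.stack([np.roll(target, d) + rng.normal(0, 1.0, fs) for d in delays])

enhanced = delay_and_sum(noisy, delays)
snr_single = snr_db(noisy[0], target)
snr_array = snr_db(enhanced, target)
print(f"one mic: {snr_single:.1f} dB, four mics: {snr_array:.1f} dB")
```

After alignment the target adds coherently while the independent noise averages out, so four microphones give a gain of roughly 6 dB in this idealized setup. Real arrays face correlated noise, fractional delays, and reverberation, and gain less.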

At present, beamforming and noise reduction technologies based on linear arrays, planar arrays, and three-dimensional arrays have reached an industry-leading level.

Keda Xunfei in-vehicle noise reduction products versus international competitors, 2013

2. Source Localization

In practice, the position of a sound source keeps changing, which makes it hard for a fixed microphone to pick up the sound. A microphone array can perform sound source localization: it computes the angle and distance of the target speaker, so that the speaker can be tracked and the speech can then be picked up directionally. This is an important pre-processing technique in areas such as human-computer interaction and audio and video conferencing. As a result, a microphone array does not restrict the speaker's movement and does not have to be physically moved to change its receiving direction. With flexible beam steering, high spatial resolution, high signal gain, and strong interference rejection, it has become an important means of capturing the speaker's voice in intelligent speech processing systems.
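One common way to obtain the angle is to estimate the time difference of arrival (TDOA) between two microphones and convert it to a direction. The sketch below uses GCC-PHAT for the TDOA estimate; this is a widely used technique, not necessarily the one used in any product named in this article. It assumes a far-field source, two microphones 0.1 m apart, a speed of sound of 343 m/s, and a white-noise source. All of these values, and the function name, are illustrative assumptions.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau):
    """Estimate how many seconds sig lags ref, using GCC-PHAT."""
    n = sig.size + ref.size  # zero-pad so the correlation is linear, not circular
    cross = np.fft.rfft(sig, n=n) * np.conj(np.fft.rfft(ref, n=n))
    cross /= np.abs(cross) + 1e-12  # PHAT weighting: keep phase, drop magnitude
    cc = np.fft.irfft(cross, n=n)
    max_shift = min(int(fs * max_tau), n // 2)
    # Reorder so index 0 is lag -max_shift and the last index is lag +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[: max_shift + 1]))
    return (np.argmax(np.abs(cc)) - max_shift) / fs

fs = 16000               # sample rate in Hz (assumed)
spacing = 0.1            # distance between the two microphones in metres (assumed)
speed_of_sound = 343.0   # metres per second
true_delay = 3           # the source reaches mic_b three samples after mic_a

rng = np.random.default_rng(0)
source = rng.normal(size=fs)
mic_a = source + 0.1 * rng.normal(size=fs)
mic_b = (np.concatenate((np.zeros(true_delay), source[:-true_delay]))
         + 0.1 * rng.normal(size=fs))

tau = gcc_phat(mic_b, mic_a, fs, max_tau=spacing / speed_of_sound)

# Far-field geometry: tau = spacing * cos(angle) / speed_of_sound,
# where angle is measured from the line joining the two microphones.
cos_theta = np.clip(speed_of_sound * tau / spacing, -1.0, 1.0)
angle_deg = np.degrees(np.arccos(cos_theta))
print(f"delay: {tau * fs:.0f} samples, angle: {angle_deg:.1f} degrees")
```

A single microphone pair yields only an angle, and with a front-back ambiguity. Estimating distance as well, as the text describes, needs more microphones or a near-field model.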

Schematic diagram of reverberation

3. Dereverberation

When we listen to music, we usually want some reverberation: a suitable amount makes the sound mellow and pleasing to the ear. Reverberation arises when sound waves travelling indoors are reflected by walls, ceilings, floors, and other obstacles, and the reflections are superimposed on the direct sound.

Reverberation is bad for recognition, however. It superimposes speech from different moments on top of one another, producing a phoneme overlap effect that seriously degrades speech recognition.

The part that harms speech recognition is generally the late reverberation, so dereverberation focuses mainly on removing it. Dereverberation has been both a hotspot and a hard problem in the industry for many years. The main ways of using a microphone array for dereverberation are as follows:

(1) Blind signal enhancement approach: the reverberation is treated as ordinary additive noise, and a speech enhancement algorithm is applied to it.

(2) Beamforming based approach: the signals collected by multiple microphones are weighted and summed to form a pickup beam in the direction of the target signal, while reflected sound from other directions is attenuated.

(3) Inverse filtering approach: the room impulse response (RIR) is estimated with the microphone array, and a reconstruction filter is designed to compensate for it and cancel the reverberation.
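The inverse filtering idea in (3) can be sketched in a few lines. This is an idealized illustration, not a practical dereverberator. It assumes the RIR is already known exactly, whereas a real system has to estimate it from the array signals. It also uses a synthetic RIR (a direct path plus an exponentially decaying random tail) and white noise in place of speech. The function name and the regularization constant are illustrative choices.

```python
import numpy as np

def inverse_filter(observed, rir, reg=1e-6):
    """Undo a known room impulse response with a regularized inverse filter."""
    n = observed.size
    room = np.fft.rfft(rir, n=n)
    # Regularization keeps the filter bounded where the room response is weak.
    reconstruction = np.conj(room) / (np.abs(room) ** 2 + reg)
    return np.fft.irfft(np.fft.rfft(observed, n=n) * reconstruction, n=n)

def rel_error(estimate, reference):
    """Norm of the error relative to the norm of the reference."""
    return np.linalg.norm(estimate - reference) / np.linalg.norm(reference)

rng = np.random.default_rng(1)
clean = rng.normal(size=8000)  # white noise as a stand-in for dry speech

# Synthetic RIR (assumed): direct path followed by a decaying reflection tail.
rir_len = 2000
rir = 0.3 * rng.normal(size=rir_len) * np.exp(-np.arange(rir_len) / 400.0)
rir[0] = 1.0

reverberant = np.convolve(clean, rir)  # what the microphone records
restored = inverse_filter(reverberant, rir)[: clean.size]

err_before = rel_error(reverberant[: clean.size], clean)
err_after = rel_error(restored, clean)
print(f"relative error before: {err_before:.2f}, after: {err_after:.4f}")
```

The near-perfect recovery here comes entirely from knowing the RIR. With an estimated RIR, or with a room response that has deep spectral nulls, the regularization term has to trade residual reverberation against amplified noise.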

The microphone array dereverberation technology developed by Keda Xunfei can now adaptively estimate the reverberation of a room and restore the clean signal well, which significantly improves both perceived speech quality and recognition accuracy. In comparative tests, recognition accuracy across a range of reverberation times is close to that of close-talking speech on a mobile phone.

Reverberant speech signal spectrum

Voice signal spectrum after dereverberation

4. Sound source signal extraction (separation)

When the whole family is talking at once, whom should DingDong listen to? It has to work out which voice is giving the command. A microphone array makes sound source signal extraction possible. Extraction pulls the target signal out of several sound signals, whereas sound source separation has to recover every source in the mixture.

Speech extraction and separation through microphone array beamforming

There are several ways to extract and separate signals using a microphone array:

(1) Beamforming based methods: a separate pickup beam is formed toward each sound source in a different direction while sounds from other directions are suppressed, so that each source can be extracted or separated.

(2) Traditional blind source separation (BSS) methods, mainly principal component analysis (PCA) and independent component analysis (ICA).
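To show how PCA and ICA fit together in blind source separation, the toy sketch below separates two mixed channels. PCA whitening first decorrelates the channels. A brute-force search over rotation angles then picks the rotation whose outputs are least Gaussian, measured by squared excess kurtosis. That search is a simple stand-in for ICA algorithms such as FastICA and works only for two channels. The sources, the mixing matrix, and all function names are assumptions made for illustration.

```python
import numpy as np

def whiten(x):
    """PCA step: remove the mean, decorrelate, and scale to unit variance."""
    x = x - x.mean(axis=1, keepdims=True)
    eigvals, eigvecs = np.linalg.eigh(np.cov(x))
    return np.diag(eigvals ** -0.5) @ eigvecs.T @ x

def excess_kurtosis(y):
    """Zero for a Gaussian; nonzero values indicate non-Gaussian signals."""
    y = y - y.mean()
    return np.mean(y ** 4) / np.mean(y ** 2) ** 2 - 3.0

def separate_two_channels(mixed, num_angles=180):
    """ICA step for two channels: find the most non-Gaussian rotation."""
    z = whiten(mixed)
    best_score, best_y = -np.inf, z
    # Rotations beyond 90 degrees only swap or flip the outputs.
    for theta in np.linspace(0.0, np.pi / 2, num_angles, endpoint=False):
        c, s = np.cos(theta), np.sin(theta)
        y = np.array([[c, -s], [s, c]]) @ z
        score = sum(excess_kurtosis(row) ** 2 for row in y)
        if score > best_score:
            best_score, best_y = score, y
    return best_y

rng = np.random.default_rng(2)
n = 20000
# Two independent, non-Gaussian sources (assumed stand-ins for two talkers).
sources = np.vstack([rng.uniform(-1, 1, n), rng.laplace(0, 1, n)])
mixing = np.array([[1.0, 0.6], [0.4, 1.0]])  # hypothetical mixing matrix
mixed = mixing @ sources                     # what the two microphones record

recovered = separate_two_channels(mixed)

# Rows: true sources. Columns: recovered channels. Order and sign may differ.
corr = np.abs(np.corrcoef(np.vstack([sources, recovered]))[:2, 2:])
print(np.round(corr, 3))
```

The model here is an instantaneous mixture with no delays or reverberation. Real rooms produce convolutive mixtures, for which separation is usually carried out per frequency band and is considerably harder.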

Current state of microphone arrays

Although microphone array technology has reached a considerable level, some general problems remain. For example, when the microphones are too far from the signal source (10 m or 20 m away, say), the signal-to-noise ratio of the recording is low and algorithmic processing becomes very difficult. On portable devices, size and power constraints limit both the number of microphones and the size of the array. Distributed microphone array technology is one possible way to solve these problems. In a distributed array, array elements or sub-arrays are spread over a larger area and exchange and share data with one another over wired or wireless links. On that basis, generalized sound source localization, beamforming, and other techniques are applied to process the signals.

Compared with today's centralized microphone arrays, distributed arrays have significant advantages. First, a distributed microphone array, especially one with wireless links, is free of the size constraint. Second, the nodes of the array can cover a large area, so there will always be a node close to the sound source. The recording signal-to-noise ratio then rises substantially, the algorithms have an easier job, and the overall signal processing result improves markedly. Distributed arrays may therefore become the mainstream solution in future smart home and conferencing systems. Keda Xunfei has already begun planning research on the related technologies.

Today, in the age of the Internet of Everything, microphone array technology has become a deep part of daily life. In applications such as smart cars, smart homes, robots, and wearable devices, voice has become the preferred mode of human-computer interaction because of its convenience, and microphone arrays have naturally become a very important front-end technology.

On September 13, 2016, the organizing committee of the international multi-channel speech separation and recognition challenge (CHiME) announced the results of its fourth edition, CHiME-4, in San Francisco, USA. Before turning to the results, what is CHiME? CHiME (Computational Hearing in Multisource Environments) was founded in 2011 by well-known research institutions including Inria (the French Institute for Research in Computer Science and Automation), the University of Sheffield, and Mitsubishi Electric Research Laboratories. The challenge invites academia and industry to propose new speech recognition solutions for real-world scenes with high noise and reverberation, with the goal of making speech recognition more practical and more widely applicable. It is regarded as one of the most difficult international speech recognition evaluations.

This year, Keda Xunfei took part in the challenge for the first time. Working closely with well-known experts in China and abroad, including Professor Du Jun of the University of Science and Technology of China, Professor Chen Jingdong of Northwestern Polytechnical University, and Professor Li Jinhui of the Georgia Institute of Technology, it took first place in all three tracks and substantially improved on the best previous result in each. The three tracks are speech separation and English recognition in six-microphone, dual-microphone, and single-microphone scenarios. Although the language of the challenge is English, the underlying speech technology carries over between Chinese and English.
