git.basschouten.com Git - openhab-addons.git/commitdiff
[rustpotterks] Upgrade to version 2 (#14615)
author GiviMAD <GiviMAD@users.noreply.github.com>
Tue, 21 Mar 2023 21:55:40 +0000 (14:55 -0700)
committer GitHub <noreply@github.com>
Tue, 21 Mar 2023 21:55:40 +0000 (22:55 +0100)
* [rustpotter] Use version 2

Signed-off-by: Miguel Álvarez <miguelwork92@gmail.com>
bundles/org.openhab.voice.rustpotterks/README.md
bundles/org.openhab.voice.rustpotterks/pom.xml
bundles/org.openhab.voice.rustpotterks/src/main/java/org/openhab/voice/rustpotterks/internal/RustpotterKSConfiguration.java
bundles/org.openhab.voice.rustpotterks/src/main/java/org/openhab/voice/rustpotterks/internal/RustpotterKSService.java
bundles/org.openhab.voice.rustpotterks/src/main/resources/OH-INF/config/config.xml
bundles/org.openhab.voice.rustpotterks/src/main/resources/OH-INF/i18n/rustpotterks.properties

index f4eed6e7c1e4c68954b3aa3d43d7040a2fe378ce..1b42db3e936ba205d07891b0e1393cdaafd30980 100644 (file)
@@ -5,6 +5,11 @@ This voice service allows you to use the open source library Rustpotter as your
 
 Rustpotter provides personal on-device wake word detection. You need to generate a model for your keyword using audio samples.
 
+You can test the library in your browser using these web pages:
+
+- [The spot demo](https://givimad.github.io/rustpotter-worklet-demo/), which includes some example wakewords (though using your own is recommended).
+- [The model creation demo](https://givimad.github.io/rustpotter-create-model-demo/), which lets you record compatible WAV files and generate a wakeword file that you can test on the previous page.
+
 Important: No voice data listened by this service will be uploaded to the Cloud.
 The voice data is processed offline, locally on your openHAB server by Rustpotter.
 
@@ -12,17 +17,19 @@ The voice data is processed offline, locally on your openHAB server by Rustpotte
 
 After installing, you will be able to access the service options through the openHAB configuration page in UI (**Settings / Other Services - Rustpotter Keyword Spotter**) to edit them:
 
-* **Threshold** - Configures the detector threshold, is the min score (in range 0. to 1.) that some wake word template should obtain to trigger a detection. Defaults to 0.5.
-* **Averaged Threshold** - Configures the detector averaged threshold, is the min score (in range 0. to 1.) that the audio should obtain against a combination of the wake word templates, the detection will be aborted if this is not the case. This way it can prevent to run the comparison of the current frame against each of the wake word templates which saves cpu. If set to 0 this functionality is disabled.
-* **Eager mode** - Enables eager mode. End detection as soon as a result is over the score, instead of waiting to see if the next frame has a higher score.
-* **Noise Detection Mode** - Use build-in noise detection to reduce computation on absence of noise. Configures the difficulty to consider a frame as noise (the required noise level).
-* **Noise Detection Sensitivity** - Noise/silence ratio in the last second to consider noise is detected. Defaults to 0.5.
-* **VAD Mode** - Use a voice activity detector to reduce computation in the absence of vocal sound.
-* **VAD Sensitivity** - Voice/silence ratio in the last second to consider voice is detected.
-* **VAD Delay** - Seconds to disable the vad detector after voice is detected. Defaults to 3.
-* **Comparator Ref** - Configures the reference for the comparator used to match the samples.
-* **Comparator Band Size** - Configures the band-size for the comparator used to match the samples.
-
+- **Threshold** - Configures the detector threshold, is the min score (in range 0. to 1.) that some wake word template should obtain to trigger a detection. Defaults to 0.5.
+- **Averaged Threshold** - Configures the detector averaged threshold, is the min score (in range 0. to 1.) that the audio should obtain against a combination of the wake word templates, the detection will be aborted if this is not the case. This way it can prevent to run the comparison of the current frame against each of the wake word templates which saves cpu. If set to 0 this functionality is disabled.
+- **Score Mode** - Indicates how to calculate the final score.
+- **Min Scores** - Minimum number of positive scores to consider a partial detection as a detection.
+- **Comparator Ref** - Configures the reference for the comparator used to match the samples.
+- **Comparator Band Size** - Configures the band-size for the comparator used to match the samples.
+- **Gain Normalizer** - Enables an audio filter that attempts to bring the volume of the stream close to a reference level.
+- **Min Gain** - Min gain applied by the gain normalizer filter.
+- **Max Gain** - Max gain applied by the gain normalizer filter.
+- **Gain Ref** - The RMS reference used by the gain-normalizer to calculate the gain applied. If unset an estimation of the wakeword level is used.
+- **Band Pass** - Enables an audio filter that attenuates frequencies outside the low cutoff and high cutoff range.
+- **Low Cutoff** - Low cutoff for the band-pass filter.
+- **High Cutoff** - High cutoff for the band-pass filter.
 
 In case you would like to setup the service via a text file, create a new file in `$OPENHAB_ROOT/conf/services` named `rustpotterks.cfg`
 
@@ -31,21 +38,24 @@ Its contents should look similar to:
 ```
 org.openhab.voice.rustpotterks:threshold=0.5
 org.openhab.voice.rustpotterks:averagedthreshold=0.2
+org.openhab.voice.rustpotterks:scoreMode=max
+org.openhab.voice.rustpotterks:minScores=5
 org.openhab.voice.rustpotterks:comparatorRef=0.22
-org.openhab.voice.rustpotterks:comparatorBandSize=6
-org.openhab.voice.rustpotterks:eagerMode=true
-org.openhab.voice.rustpotterks:noiseDetectionMode=hard
-org.openhab.voice.rustpotterks:noiseDetectionSensitivity=0.5
-org.openhab.voice.rustpotterks:vadMode=aggressive
-org.openhab.voice.rustpotterks:vadSensitivity=0.5
-org.openhab.voice.rustpotterks:vadDelay=3
+org.openhab.voice.rustpotterks:comparatorBandSize=5
+org.openhab.voice.rustpotterks:gainNormalizer=true
+org.openhab.voice.rustpotterks:minGain=0.5
+org.openhab.voice.rustpotterks:maxGain=1
+org.openhab.voice.rustpotterks:gainRef=
+org.openhab.voice.rustpotterks:bandPass=true
+org.openhab.voice.rustpotterks:lowCutoff=80
+org.openhab.voice.rustpotterks:highCutoff=400
 ```
 
 ## Magic Word Configuration
 
 The magic word to spot is gathered from your 'Voice' configuration. 
 
-You can generate your own wake word model by using the [Rustpotter CLI](https://github.com/GiviMAD/rustpotter-cli).
+You can generate your own wakeword files using the [Rustpotter CLI](https://github.com/GiviMAD/rustpotter-cli).
 
 You can also download the models used as examples on the [rustpotter web demo](https://givimad.github.io/rustpotter-worklet-demo/) from [this folder](https://github.com/GiviMAD/rustpotter-worklet-demo/tree/main/static).
 
@@ -59,11 +69,11 @@ The service will only work if it's able to find the correct rpw for your magic w
 
 You can setup your preferred default keyword spotter and default magic word in the UI:
 
-* Go to **Settings**.
-* Edit **System Services - Voice**.
-* Set **Rustpotter Keyword Spotter** as **Default Keyword Spotter**.
-* Choose your preferred **Magic Word** for your setup.
-* Choose optionally your **Listening Switch** item that will be switch ON during the period when the dialog processor has spotted the keyword and is listening for commands.
+- Go to **Settings**.
+- Edit **System Services - Voice**.
+- Set **Rustpotter Keyword Spotter** as **Default Keyword Spotter**.
+- Choose your preferred **Magic Word** for your setup.
+- Optionally choose a **Listening Switch** item that will be switched ON during the period when the dialog processor has spotted the keyword and is listening for commands.
 
 In case you would like to setup these settings via a text file, you can edit the file `runtime.cfg` in `$OPENHAB_ROOT/conf/services` and set the following entries:
 
index 20576c6fdab6df6414acf7a0bc70dfa05d60b301..a9c9d00893dffd6cbe0403449208006985b1a564 100644 (file)
@@ -18,7 +18,7 @@
     <dependency>
       <groupId>io.github.givimad</groupId>
       <artifactId>rustpotter-java</artifactId>
-      <version>1.0.0</version>
+      <version>2.0.0</version>
     </dependency>
   </dependencies>
 </project>
index 71e53a5e6fcf80c05f157d61c8897f1eb08bc338..6f0406f147a69e36c0e235436382007ee47e9dbb 100644 (file)
@@ -13,6 +13,7 @@
 package org.openhab.voice.rustpotterks.internal;
 
 import org.eclipse.jdt.annotation.NonNullByDefault;
+import org.eclipse.jdt.annotation.Nullable;
 
 /**
  * The {@link RustpotterKSConfiguration} class contains fields mapping thing configuration parameters.
@@ -36,37 +37,49 @@ public class RustpotterKSConfiguration {
      */
     public float averagedThreshold = 0.2f;
     /**
-     * Terminate the detection as son as one result is above the score,
-     * instead of wait to see if the next frame has a higher score.
+     * Indicates how to calculate the final score.
      */
-    public boolean eagerMode = true;
+    public String scoreMode = "max";
     /**
-     * Use build-in noise detection to reduce computation on absence of noise.
-     * Configures the difficulty to consider a frame as noise (the required noise level).
+     * Minimum number of positive scores to consider a partial detection as a detection.
      */
-    public String noiseDetectionMode = "disabled";
+    public int minScores = 5;
     /**
-     * Noise/silence ratio in the last second to consider noise is detected. Defaults to 0.5.
+     * Configures the reference for the comparator used to match the samples.
      */
-    public float noiseSensitivity = 0.5f;
+    public float comparatorRef = 0.22f;
     /**
-     * Seconds to disable the vad detector after voice is detected. Defaults to 3.
+     * Configures the band-size for the comparator used to match the samples.
      */
-    public int vadDelay = 3;
+    public int comparatorBandSize = 5;
     /**
-     * Voice/silence ratio in the last second to consider voice is detected.
+     * Enables an audio filter that attempts to bring the volume of the stream close to a reference level (RMS of the
+     * samples is used as volume measure).
      */
-    public float vadSensitivity = 0.5f;
+    public boolean gainNormalizer = false;
     /**
-     * Use a voice activity detector to reduce computation in the absence of vocal sound.
+     * Min gain applied by the gain normalizer filter.
      */
-    public String vadMode = "disabled";
+    public float minGain = 0.5f;
     /**
-     * Configures the reference for the comparator used to match the samples.
+     * Max gain applied by the gain normalizer filter.
      */
-    public float comparatorRef = 0.22f;
+    public float maxGain = 1f;
     /**
-     * Configures the band-size for the comparator used to match the samples.
+     * Set the RMS reference used by the gain-normalizer to calculate the gain applied. If unset an estimation of the
+     * wakeword level is used.
+     */
+    public @Nullable Float gainRef = null;
+    /**
+     * Enables an audio filter that attenuates frequencies outside the low cutoff and high cutoff range.
+     */
+    public boolean bandPass = false;
+    /**
+     * Low cutoff for the band-pass filter.
+     */
+    public float lowCutoff = 80f;
+    /**
+     * High cutoff for the band-pass filter.
      */
-    public int comparatorBandSize = 6;
+    public float highCutoff = 400f;
 }
index 8b3c28628ce9b3490e43a5cdc506ecb64971f188..c31982e508bb401895f66fafcb43d423d2883eda 100644 (file)
@@ -17,6 +17,7 @@ import static org.openhab.voice.rustpotterks.internal.RustpotterKSConstants.*;
 import java.io.File;
 import java.io.IOException;
 import java.nio.file.Path;
+import java.util.ArrayList;
 import java.util.Locale;
 import java.util.Map;
 import java.util.Set;
@@ -38,7 +39,6 @@ import org.openhab.core.voice.KSService;
 import org.openhab.core.voice.KSServiceHandle;
 import org.openhab.core.voice.KSpottedEvent;
 import org.osgi.framework.Constants;
-import org.osgi.service.component.ComponentContext;
 import org.osgi.service.component.annotations.Activate;
 import org.osgi.service.component.annotations.Component;
 import org.osgi.service.component.annotations.Modified;
@@ -46,10 +46,10 @@ import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
 import io.github.givimad.rustpotter_java.Endianness;
-import io.github.givimad.rustpotter_java.NoiseDetectionMode;
-import io.github.givimad.rustpotter_java.RustpotterJava;
-import io.github.givimad.rustpotter_java.RustpotterJavaBuilder;
-import io.github.givimad.rustpotter_java.VadMode;
+import io.github.givimad.rustpotter_java.Rustpotter;
+import io.github.givimad.rustpotter_java.RustpotterBuilder;
+import io.github.givimad.rustpotter_java.SampleFormat;
+import io.github.givimad.rustpotter_java.ScoreMode;
 
 /**
  * The {@link RustpotterKSService} is a keyword spotting implementation based on rustpotter.
@@ -76,7 +76,7 @@ public class RustpotterKSService implements KSService {
     }
 
     @Activate
-    protected void activate(ComponentContext componentContext, Map<String, Object> config) {
+    protected void activate(Map<String, Object> config) {
         modified(config);
     }
 
@@ -111,7 +111,7 @@ public class RustpotterKSService implements KSService {
             throws KSException {
         logger.debug("Loading library");
         try {
-            RustpotterJava.loadLibrary();
+            Rustpotter.loadLibrary();
         } catch (IOException e) {
             throw new KSException("Unable to load rustpotter lib: " + e.getMessage());
         }
@@ -126,8 +126,13 @@ public class RustpotterKSService implements KSService {
         }
         var endianness = isBigEndian ? Endianness.BIG : Endianness.LITTLE;
         logger.debug("Audio wav spec: frequency '{}', bit depth '{}', channels '{}', '{}'", frequency, bitDepth,
-                channels, audioFormat.isBigEndian() ? "big-endian" : "little-endian");
-        RustpotterJava rustpotter = initRustpotter(frequency, bitDepth, channels, endianness);
+                channels, isBigEndian ? "big-endian" : "little-endian");
+        Rustpotter rustpotter;
+        try {
+            rustpotter = initRustpotter(frequency, bitDepth, channels, endianness);
+        } catch (Exception e) {
+            throw new KSException("Unable to configure rustpotter: " + e.getMessage(), e);
+        }
         var modelName = keyword.replaceAll("\\s", "_") + ".rpw";
         var modelPath = Path.of(RUSTPOTTER_FOLDER, modelName);
         if (!modelPath.toFile().exists()) {
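Review note: the hunk above derives the wakeword file name from the configured magic word by replacing each whitespace character with an underscore and appending `.rpw`. A minimal sketch of that naming rule follows. The class name and the `MODEL_FOLDER` value are hypothetical; the binding itself uses its `RUSTPOTTER_FOLDER` constant, whose value is not shown in this diff.

```java
import java.nio.file.Path;

class ModelPathSketch {
    // Hypothetical folder name, standing in for the binding's RUSTPOTTER_FOLDER constant.
    static final String MODEL_FOLDER = "rustpotter";

    // Same rule as in the diff: every whitespace character becomes '_', then ".rpw" is appended.
    static String modelFileName(String keyword) {
        return keyword.replaceAll("\\s", "_") + ".rpw";
    }

    // Resolves the file name against the (assumed) model folder.
    static Path modelPath(String keyword) {
        return Path.of(MODEL_FOLDER, modelFileName(keyword));
    }

    public static void main(String[] args) {
        System.out.println(modelPath("ok casa"));
    }
}
```

Because the replacement is per character, a magic word containing two consecutive spaces produces two underscores, so the file on disk has to match the magic word exactly.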
@@ -141,48 +146,43 @@ public class RustpotterKSService implements KSService {
         logger.debug("Model '{}' loaded", modelPath);
         AtomicBoolean aborted = new AtomicBoolean(false);
         executor.submit(() -> processAudioStream(rustpotter, ksListener, audioStream, aborted));
-        return new KSServiceHandle() {
-            @Override
-            public void abort() {
-                logger.debug("Stopping service");
-                aborted.set(true);
-            }
+        return () -> {
+            logger.debug("Stopping service");
+            aborted.set(true);
         };
     }
 
-    private RustpotterJava initRustpotter(long frequency, int bitDepth, int channels, Endianness endianness) {
-        var rustpotterBuilder = new RustpotterJavaBuilder();
+    private Rustpotter initRustpotter(long frequency, int bitDepth, int channels, Endianness endianness)
+            throws Exception {
+        var rustpotterBuilder = new RustpotterBuilder();
         // audio configs
         rustpotterBuilder.setBitsPerSample(bitDepth);
         rustpotterBuilder.setSampleRate(frequency);
         rustpotterBuilder.setChannels(channels);
+        rustpotterBuilder.setSampleFormat(SampleFormat.INT);
         rustpotterBuilder.setEndianness(endianness);
         // detector configs
         rustpotterBuilder.setThreshold(config.threshold);
         rustpotterBuilder.setAveragedThreshold(config.averagedThreshold);
+        rustpotterBuilder.setScoreMode(getScoreMode(config.scoreMode));
+        rustpotterBuilder.setMinScores(config.minScores);
         rustpotterBuilder.setComparatorRef(config.comparatorRef);
         rustpotterBuilder.setComparatorBandSize(config.comparatorBandSize);
-        @Nullable
-        VadMode vadMode = getVADMode(config.vadMode);
-        if (vadMode != null) {
-            rustpotterBuilder.setVADMode(vadMode);
-            rustpotterBuilder.setVADSensitivity(config.vadSensitivity);
-            rustpotterBuilder.setVADDelay(config.vadDelay);
-        }
-        @Nullable
-        NoiseDetectionMode noiseDetectionMode = getNoiseMode(config.noiseDetectionMode);
-        if (noiseDetectionMode != null) {
-            rustpotterBuilder.setNoiseMode(noiseDetectionMode);
-            rustpotterBuilder.setNoiseSensitivity(config.noiseSensitivity);
-        }
-        rustpotterBuilder.setEagerMode(config.eagerMode);
+        // filter configs
+        rustpotterBuilder.setGainNormalizerEnabled(config.gainNormalizer);
+        rustpotterBuilder.setMinGain(config.minGain);
+        rustpotterBuilder.setMaxGain(config.maxGain);
+        rustpotterBuilder.setGainRef(config.gainRef);
+        rustpotterBuilder.setBandPassFilterEnabled(config.bandPass);
+        rustpotterBuilder.setBandPassLowCutoff(config.lowCutoff);
+        rustpotterBuilder.setBandPassHighCutoff(config.highCutoff);
         // init the detector
         var rustpotter = rustpotterBuilder.build();
         rustpotterBuilder.delete();
         return rustpotter;
     }
 
-    private void processAudioStream(RustpotterJava rustpotter, KSListener ksListener, AudioStream audioStream,
+    private void processAudioStream(Rustpotter rustpotter, KSListener ksListener, AudioStream audioStream,
             AtomicBoolean aborted) {
         int numBytesRead;
         var bufferSize = (int) rustpotter.getBytesPerFrame();
@@ -200,10 +200,20 @@ public class RustpotterKSService implements KSService {
                     continue;
                 }
                 remaining = bufferSize;
-                var result = rustpotter.processBuffer(audioBuffer);
+                var result = rustpotter.processBytes(audioBuffer);
                 if (result.isPresent()) {
                     var detection = result.get();
-                    logger.debug("keyword '{}' detected with score {}!", detection.getName(), detection.getScore());
+                    if (logger.isDebugEnabled()) {
+                        ArrayList<String> scores = new ArrayList<>();
+                        var scoreNames = detection.getScoreNames().split("\\|\\|");
+                        var scoreValues = detection.getScores();
+                        for (var i = 0; i < Integer.min(scoreNames.length, scoreValues.length); i++) {
+                            scores.add("'" + scoreNames[i] + "': " + scoreValues[i]);
+                        }
+                        logger.debug("Detected '{}' with: Score: {}, AvgScore: {}, Count: {}, Gain: {}, Scores: {}",
+                                detection.getName(), detection.getScore(), detection.getAvgScore(),
+                                detection.getCounter(), detection.getGain(), String.join(", ", scores));
+                    }
                     detection.delete();
                     ksListener.ksEventReceived(new KSpottedEvent());
                 }
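Review note: the new debug output pairs the `||`-separated score names with the score values and stops at the shorter of the two. The pairing logic can be sketched on its own as below. The class and method names are hypothetical, and the sketch assumes the scores arrive as a `float[]`; the diff only shows that the value has a `length`.

```java
import java.util.ArrayList;
import java.util.List;

class ScoreLogSketch {
    // Pairs each name with its score, mirroring the loop in the diff.
    // Assumption: scoreValues is a float[]; the real return type of getScores() is not shown here.
    static String formatScores(String scoreNames, float[] scoreValues) {
        String[] names = scoreNames.split("\\|\\|"); // "||" must be escaped in a regex
        List<String> scores = new ArrayList<>();
        for (int i = 0; i < Math.min(names.length, scoreValues.length); i++) {
            scores.add("'" + names[i] + "': " + scoreValues[i]);
        }
        return String.join(", ", scores);
    }

    public static void main(String[] args) {
        System.out.println(formatScores("a.wav||b.wav", new float[] { 0.5f, 0.75f }));
    }
}
```

Taking the minimum of both lengths keeps the loop safe if the names and values ever disagree in count, at the cost of silently dropping the unmatched entries.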
@@ -216,35 +226,27 @@ public class RustpotterKSService implements KSService {
         logger.debug("rustpotter stopped");
     }
 
-    private @Nullable VadMode getVADMode(String mode) {
-        switch (mode) {
-            case "low-bitrate":
-                return VadMode.LOW_BITRATE;
-            case "quality":
-                return VadMode.QUALITY;
-            case "aggressive":
-                return VadMode.AGGRESSIVE;
-            case "very-aggressive":
-                return VadMode.VERY_AGGRESSIVE;
-            default:
-                return null;
-        }
-    }
-
-    private @Nullable NoiseDetectionMode getNoiseMode(String mode) {
+    private ScoreMode getScoreMode(String mode) {
         switch (mode) {
-            case "easiest":
-                return NoiseDetectionMode.EASIEST;
-            case "easy":
-                return NoiseDetectionMode.EASY;
-            case "normal":
-                return NoiseDetectionMode.NORMAL;
-            case "hard":
-                return NoiseDetectionMode.HARD;
-            case "hardest":
-                return NoiseDetectionMode.HARDEST;
+            case "average":
+                return ScoreMode.AVG;
+            case "median":
+                return ScoreMode.MEDIAN;
+            case "p25":
+                return ScoreMode.P25;
+            case "p50":
+                return ScoreMode.P50;
+            case "p75":
+                return ScoreMode.P75;
+            case "p80":
+                return ScoreMode.P80;
+            case "p90":
+                return ScoreMode.P90;
+            case "p95":
+                return ScoreMode.P95;
+            case "max":
             default:
-                return null;
+                return ScoreMode.MAX;
         }
     }
 }
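Review note: `getScoreMode` maps the `scoreMode` option values from `config.xml` to enum constants and falls back to `MAX` for anything it does not recognize. The sketch below reproduces that mapping with a local stand-in enum so it compiles without the rustpotter-java dependency. The stand-in only assumes the constant names visible in the diff and is not the library's actual type.

```java
class ScoreModeSketch {
    // Local stand-in for io.github.givimad.rustpotter_java.ScoreMode (constant names taken from the diff).
    enum ScoreMode { AVG, MAX, MEDIAN, P25, P50, P75, P80, P90, P95 }

    // Same mapping as the diff: unknown or misspelled values silently become MAX.
    static ScoreMode parse(String mode) {
        switch (mode) {
            case "average":
                return ScoreMode.AVG;
            case "median":
                return ScoreMode.MEDIAN;
            case "p25":
                return ScoreMode.P25;
            case "p50":
                return ScoreMode.P50;
            case "p75":
                return ScoreMode.P75;
            case "p80":
                return ScoreMode.P80;
            case "p90":
                return ScoreMode.P90;
            case "p95":
                return ScoreMode.P95;
            case "max":
            default:
                return ScoreMode.MAX;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("p90"));
        System.out.println(parse("Average")); // case-sensitive, so this falls back to MAX
    }
}
```

The match is case-sensitive, so a hand-edited `rustpotterks.cfg` with `scoreMode=Average` would run with `MAX` without any warning. Logging the fallback might be worth considering.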
index 583d4380a7d168158f0784c406a58b6ab1f915db..4c9c0bf82fd2c4a087bed66c88a61fa60599f64c 100644 (file)
@@ -9,13 +9,9 @@
                        <label>Wakeword Detector</label>
                        <description>Wakeword detection options.</description>
                </parameter-group>
-               <parameter-group name="noiseDetector">
-                       <label>Noise Detector</label>
-                       <description>Optional noise detection options.</description>
-               </parameter-group>
-               <parameter-group name="vadDetector">
-                       <label>VAD Detector</label>
-                       <description>Optional voice activity detector options.</description>
+               <parameter-group name="filters">
+                       <label>Audio Filters</label>
+                       <description>Optional audio filter options.</description>
                </parameter-group>
                <parameter name="threshold" type="decimal" min="0" max="1" groupName="wakewordDetector">
                        <label>Threshold</label>
                                cpu. If set to 0 this functionality is disabled.</description>
                        <default>0.2</default>
                </parameter>
+               <parameter name="scoreMode" type="text" groupName="wakewordDetector">
+                       <label>Score Mode</label>
+                       <description>Indicates how to calculate the final score.</description>
+                       <default>max</default>
+                       <options>
+                               <option value="average">Average</option>
+                               <option value="max">Max</option>
+                               <option value="median">Median</option>
+                               <option value="p25">P25</option>
+                               <option value="p50">P50</option>
+                               <option value="p75">P75</option>
+                               <option value="p80">P80</option>
+                               <option value="p90">P90</option>
+                               <option value="p95">P95</option>
+                       </options>
+               </parameter>
+               <parameter name="minScores" type="integer" groupName="wakewordDetector">
+                       <label>Min Scores</label>
+                       <description>Minimum number of positive scores to consider a partial detection as a detection.</description>
+                       <default>5</default>
+               </parameter>
                <parameter name="comparatorRef" type="decimal" min="0" max="1" groupName="wakewordDetector">
                        <label>Comparator Ref</label>
                        <description>Configures the reference for the comparator used to match the samples.</description>
                <parameter name="comparatorBandSize" type="integer" groupName="wakewordDetector">
                        <label>Comparator Band Size</label>
                        <description>Configures the band-size for the comparator used to match the samples.</description>
-                       <default>6</default>
+                       <default>5</default>
                        <advanced>true</advanced>
                </parameter>
-               <parameter name="eagerMode" type="boolean" groupName="wakewordDetector">
-                       <label>Eager Mode</label>
-                       <description>Enables eager mode. End detection as soon as a result is over the score, instead of waiting to
-                               see if the
-                               next frame has a higher score.</description>
-                       <default>true</default>
+               <parameter name="gainNormalizer" type="boolean" groupName="filters">
+                       <label>Gain Normalizer</label>
+                       <description>Enables an audio filter that attempts to bring the volume of the stream close to a reference level (RMS
+                               of the samples is used as volume measure).</description>
+                       <default>false</default>
                </parameter>
-               <parameter name="noiseDetectionMode" type="text" groupName="noiseDetector">
-                       <label>Noise Detection Mode</label>
-                       <description>Use a noise detector to reduce computation in the absence of sound. Configures the difficulty to
-                               consider
-                               a
-                               frame as noise (the required noise level).</description>
-                       <default>disabled</default>
-                       <options>
-                               <option value="disabled">Disabled</option>
-                               <option value="easiest">Easiest</option>
-                               <option value="easy">Easy</option>
-                               <option value="normal">Normal</option>
-                               <option value="hard">Hard</option>
-                               <option value="hardest">Hardest</option>
-                       </options>
-               </parameter>
-               <parameter name="noiseSensitivity" type="decimal" min="0" max="1" groupName="noiseDetector">
-                       <label>Noise Sensitivity</label>
-                       <description>Noise/silence ratio in the last second to consider voice is detected.</description>
+               <parameter name="minGain" type="decimal" min="0.1" max="1" step="0.1" groupName="filters">
+                       <label>Min Gain</label>
+                       <description>Min gain applied by the gain normalizer filter.</description>
                        <default>0.5</default>
                </parameter>
-               <parameter name="vadMode" type="text" groupName="vadDetector">
-                       <label>VAD Mode</label>
-                       <description>Use a vad detector to reduce computation in the absence of vocal sound.</description>
-                       <default>disabled</default>
-                       <options>
-                               <option value="disabled">Disabled</option>
-                               <option value="low-bitrate">Low Bitrate</option>
-                               <option value="quality">Quality</option>
-                               <option value="aggressive">Aggressive</option>
-                               <option value="very-aggressive">Very Aggressive</option>
-                       </options>
+               <parameter name="maxGain" type="decimal" min="0.1" max="1" step="0.1" groupName="filters">
+                       <label>Max Gain</label>
+                       <description>Max gain applied by the gain normalizer filter.</description>
+                       <default>1</default>
                </parameter>
-               <parameter name="vadSensitivity" type="decimal" min="0" max="1" groupName="vadDetector">
-                       <label>VAD Sensitivity</label>
-                       <description>Voice/silence ratio in the last second to consider voice is detected.</description>
-                       <default>0.5</default>
+               <parameter name="gainRef" type="decimal" min="0" max="1" step="0.001" groupName="filters">
+                       <label>Gain Ref</label>
+                       <description>Set the RMS reference used by the gain-normalizer to calculate the gain applied. If unset an estimation
+                               of the wakeword level is used.</description>
+               </parameter>
+               <parameter name="bandPass" type="boolean" groupName="filters">
+                       <label>Band Pass</label>
+                       <description>Enables an audio filter that attenuates frequencies outside the low cutoff and high cutoff range.</description>
+                       <default>false</default>
+               </parameter>
+               <parameter name="lowCutoff" type="decimal" min="0" groupName="filters">
+                       <label>Low Cutoff</label>
+                       <description>Low cutoff for the band-pass filter.</description>
+                       <default>80</default>
                </parameter>
-               <parameter name="vadDelay" type="integer" groupName="vadDetector">
-                       <label>VAD Delay</label>
-                       <description>Seconds to disable the vad detector after voice is detected.</description>
-                       <default>3</default>
+               <parameter name="highCutoff" type="decimal" min="0" groupName="filters">
+                       <label>High Cutoff</label>
+                       <description>High cutoff for the band-pass filter.</description>
+                       <default>400</default>
                </parameter>
        </config-description>
 </config-description:config-descriptions>
index 5cfc65511f3c7e15c0b6a11b683aaad0fc3e5de7..6569a9cf8975ff917c6a55173648633cce881184 100644 (file)
@@ -1,40 +1,42 @@
 voice.config.rustpotterks.averagedThreshold.label = Averaged Threshold
 voice.config.rustpotterks.averagedThreshold.description = Configures the detector averaged threshold, is the min score (in range 0. to 1.) that the audio should obtain against a combination of the wake word templates, the detection will be aborted if this is not the case. This way it can prevent to run the comparison of the current frame against each of the wake word templates which saves cpu. If set to 0 this functionality is disabled.
+voice.config.rustpotterks.bandPass.label = Band Pass
+voice.config.rustpotterks.bandPass.description = Enables an audio filter that attenuates frequencies outside the low cutoff and high cutoff range.
 voice.config.rustpotterks.comparatorBandSize.label = Comparator Band Size
 voice.config.rustpotterks.comparatorBandSize.description = Configures the band-size for the comparator used to match the samples.
 voice.config.rustpotterks.comparatorRef.label = Comparator Ref
 voice.config.rustpotterks.comparatorRef.description = Configures the reference for the comparator used to match the samples.
-voice.config.rustpotterks.eagerMode.label = Eager Mode
-voice.config.rustpotterks.eagerMode.description = Enables eager mode. End detection as soon as a result is over the score, instead of waiting to see if the next frame has a higher score.
-voice.config.rustpotterks.group.noiseDetector.label = Noise Detector
-voice.config.rustpotterks.group.noiseDetector.description = Optional noise detection options.
-voice.config.rustpotterks.group.vadDetector.label = VAD Detector
-voice.config.rustpotterks.group.vadDetector.description = Optional voice activity detector options.
+voice.config.rustpotterks.gainNormalizer.label = Gain Normalizer
+voice.config.rustpotterks.gainNormalizer.description = Enables an audio filter that attempts to bring the volume of the stream close to a reference level (the RMS of the samples is used as the volume measure).
+voice.config.rustpotterks.gainRef.label = Gain Ref
+voice.config.rustpotterks.gainRef.description = Sets the RMS reference used by the gain normalizer to calculate the gain applied. If unset, an estimate of the wakeword level is used.
+voice.config.rustpotterks.group.filters.label = Audio Filters
+voice.config.rustpotterks.group.filters.description = Optional audio filter options.
 voice.config.rustpotterks.group.wakewordDetector.label = Wakeword Detector
 voice.config.rustpotterks.group.wakewordDetector.description = Wakeword detection options.
-voice.config.rustpotterks.noiseDetectionMode.label = Noise Detection Mode
-voice.config.rustpotterks.noiseDetectionMode.description = Use a noise detector to reduce computation in the absence of sound. Configures the difficulty to consider a frame as noise (the required noise level).
-voice.config.rustpotterks.noiseDetectionMode.option.disabled = Disabled
-voice.config.rustpotterks.noiseDetectionMode.option.easiest = Easiest
-voice.config.rustpotterks.noiseDetectionMode.option.easy = Easy
-voice.config.rustpotterks.noiseDetectionMode.option.normal = Normal
-voice.config.rustpotterks.noiseDetectionMode.option.hard = Hard
-voice.config.rustpotterks.noiseDetectionMode.option.hardest = Hardest
-voice.config.rustpotterks.noiseSensitivity.label = Noise Sensitivity
-voice.config.rustpotterks.noiseSensitivity.description = Noise/silence ratio in the last second to consider voice is detected.
+voice.config.rustpotterks.highCutoff.label = High Cutoff
+voice.config.rustpotterks.highCutoff.description = High cutoff for the band-pass filter.
+voice.config.rustpotterks.lowCutoff.label = Low Cutoff
+voice.config.rustpotterks.lowCutoff.description = Low cutoff for the band-pass filter.
+voice.config.rustpotterks.maxGain.label = Max Gain
+voice.config.rustpotterks.maxGain.description = Max gain applied by the gain normalizer filter.
+voice.config.rustpotterks.minGain.label = Min Gain
+voice.config.rustpotterks.minGain.description = Min gain applied by the gain normalizer filter.
+voice.config.rustpotterks.minScores.label = Min Scores
+voice.config.rustpotterks.minScores.description = Minimum number of positive scores to consider a partial detection as a detection.
+voice.config.rustpotterks.scoreMode.label = Score Mode
+voice.config.rustpotterks.scoreMode.description = Indicates how to calculate the final score.
+voice.config.rustpotterks.scoreMode.option.average = Average
+voice.config.rustpotterks.scoreMode.option.max = Max
+voice.config.rustpotterks.scoreMode.option.median = Median
+voice.config.rustpotterks.scoreMode.option.p25 = P25
+voice.config.rustpotterks.scoreMode.option.p50 = P50
+voice.config.rustpotterks.scoreMode.option.p75 = P75
+voice.config.rustpotterks.scoreMode.option.p80 = P80
+voice.config.rustpotterks.scoreMode.option.p90 = P90
+voice.config.rustpotterks.scoreMode.option.p95 = P95
 voice.config.rustpotterks.threshold.label = Threshold
 voice.config.rustpotterks.threshold.description = Configures the detector threshold, is the min score (in range 0. to 1.) that some of the wakeword templates should obtain to trigger a detection. Model defined value takes prevalence if present.
-voice.config.rustpotterks.vadDelay.label = VAD Delay
-voice.config.rustpotterks.vadDelay.description = Seconds to disable the vad detector after voice is detected.
-voice.config.rustpotterks.vadMode.label = VAD Mode
-voice.config.rustpotterks.vadMode.description = Use a vad detector to reduce computation in the absence of vocal sound.
-voice.config.rustpotterks.vadMode.option.disabled = Disabled
-voice.config.rustpotterks.vadMode.option.low-bitrate = Low Bitrate
-voice.config.rustpotterks.vadMode.option.quality = Quality
-voice.config.rustpotterks.vadMode.option.aggressive = Aggressive
-voice.config.rustpotterks.vadMode.option.very-aggressive = Very Aggressive
-voice.config.rustpotterks.vadSensitivity.label = VAD Sensitivity
-voice.config.rustpotterks.vadSensitivity.description = Voice/silence ratio in the last second to consider voice is detected.
 
 # service