Go SDK


This topic describes how to use the Go software development kit (SDK) for the short-sentence speech recognition feature of Intelligent Speech Interaction. It includes installation instructions and code examples.

Prerequisites

  • Review the API reference before you use the SDK. For more information, see API reference.

  • Make sure that Go is installed and its basic configuration is complete.

  • The SDK supports Go 1.16 and later.

Download and install

  1. Download and install the SDK.

    Run the following command.

    go get github.com/aliyun/alibabacloud-nls-go-sdk

  2. Import the SDK.

    Add the following import statement to your code. The package name is nls, so SDK identifiers are referenced as nls.DEFAULT_URL, nls.NewSpeechRecognition, and so on.

    import ("github.com/aliyun/alibabacloud-nls-go-sdk")

SDK constants

| Constant | Description |
| --- | --- |
| SDK_VERSION | The SDK version. |
| PCM | The PCM audio format. |
| WAV | The WAV audio format. |
| OPUS | The OPUS audio format. |
| OPU | The OPU audio format. |
| DEFAULT_DISTRIBUTE | The default region used to obtain a token. The default value is "cn-shanghai". |
| DEFAULT_DOMAIN | The default URL used to obtain a token. The default value is "nls-meta.cn-shanghai.aliyuncs.com". |
| DEFAULT_VERSION | The protocol version used to obtain a token. The default value is "2019-02-28". |
| DEFAULT_URL | The default public cloud URL. The default value is "wss://nls-gateway-cn-shanghai.aliyuncs.com/ws/v1". |

Establish a connection

Important

If you use an Akid and Akkey to obtain a token, you must cache the token and update it before it expires. Do not frequently call the API operation to obtain a token, because the cloud service may throttle your requests.

1. ConnectionConfig

The basic parameters that are required to establish a connection.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| Url | String | The public cloud URL to access. If you are unsure, use the DEFAULT_URL constant. |
| Token | String | The access token. For more information, see Overview of token generation. |
| Akid | String | The AccessKey ID of your Alibaba Cloud account. Required if you do not specify the Token parameter; optional if you specify it. |
| Akkey | String | The AccessKey Secret of your Alibaba Cloud account. Required if you do not specify the Token parameter; optional if you specify it. |
| Appkey | String | The Appkey of the project. To obtain an Appkey, go to the console. |

2. func NewConnectionConfigWithToken(url string, appkey string, token string) *ConnectionConfig

Creates a `ConnectionConfig` object from a URL, an Appkey, and a token.

  • Parameters:

    | Parameter | Type | Description |
    | --- | --- | --- |
    | url | String | The public cloud URL to access. If you are unsure, use the DEFAULT_URL constant. |
    | appkey | String | The Appkey of the project. To obtain an Appkey, go to the console. |
    | token | String | The access token. For more information, see Overview of token generation. |

  • Return value:

    *ConnectionConfig: A pointer to the connection configuration object.

3. func NewConnectionConfigFromJson(jsonStr string) (*ConnectionConfig, error)

Creates a ConnectionConfig object from a JSON string.

  • Parameters:

    | Parameter | Type | Description |
    | --- | --- | --- |
    | jsonStr | String | A JSON string that describes the connection parameters. Valid fields are url, token, akid, akkey, and appkey. The url and appkey fields are required. If you include the token field, you do not need to include the akid and akkey fields. |

  • Return values:

    • *ConnectionConfig: A pointer to the connection configuration object.

    • error: An error.
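As an illustration of the JSON string described above, the following sketch builds one with encoding/json and checks the stated field rules before serializing. The connectionJSON type and buildConnectionJSON helper are hypothetical local names, not part of the SDK; only the field names (url, token, akid, akkey, appkey) come from the description above. The resulting string is what you would pass as jsonStr.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// connectionJSON mirrors the field names listed above. It is a local helper
// type for building the string, not an SDK type.
type connectionJSON struct {
	URL    string `json:"url"`
	Appkey string `json:"appkey"`
	Token  string `json:"token,omitempty"`
	Akid   string `json:"akid,omitempty"`
	Akkey  string `json:"akkey,omitempty"`
}

// buildConnectionJSON checks the required fields and returns the JSON string.
func buildConnectionJSON(c connectionJSON) (string, error) {
	if c.URL == "" || c.Appkey == "" {
		return "", errors.New("url and appkey are required")
	}
	if c.Token == "" && (c.Akid == "" || c.Akkey == "") {
		return "", errors.New("provide either token or both akid and akkey")
	}
	b, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, err := buildConnectionJSON(connectionJSON{
		URL:    "wss://nls-gateway-cn-shanghai.aliyuncs.com/ws/v1",
		Appkey: "your-appkey", // placeholder value
		Token:  "your-token",  // placeholder value
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```

Building the string from a struct rather than by hand avoids quoting mistakes when a key or token contains special characters.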

Short-sentence speech recognition

1. SpeechRecognitionStartParam

The parameters for a short-sentence speech recognition request.

| Parameter | Type | Description |
| --- | --- | --- |
| Format | String | The audio format. Default value: PCM. Valid values: OPUS, OPU, and PCM. If you use OPUS or OPU, you must encode the audio yourself. |
| SampleRate | Integer | The sample rate in Hz. Default value: 16000. |
| EnableIntermediateResult | Boolean | Specifies whether to return intermediate recognition results. true: returns intermediate results. false: does not return intermediate results. |
| EnablePunctuationPrediction | Boolean | Specifies whether to enable punctuation prediction. true: punctuation prediction is enabled. false: punctuation prediction is disabled. |
| EnableInverseTextNormalization | Boolean | Specifies whether to enable inverse text normalization (ITN). If you set this parameter to true, Chinese numerals are converted to Arabic numerals in the output. Default value: false. |

2. func DefaultSpeechRecognitionParam() SpeechRecognitionStartParam

Returns a set of default parameters. By default, the audio format is PCM, the sample rate is 16000 Hz, and the features for intermediate results, punctuation prediction, and Inverse Text Normalization (ITN) are enabled.

  • Parameters: None.

  • Return value:

    SpeechRecognitionStartParam: The default parameters.

3. func NewSpeechRecognition(...) (*SpeechRecognition, error)

Creates a SpeechRecognition instance.

  • Parameters:

    | Parameter | Type | Description |
    | --- | --- | --- |
    | config | *ConnectionConfig | The connection configuration. For more information, see the Establish a connection section. |
    | logger | *NlsLogger | The log object. For more information, see the SDK logs section. |
    | taskfailed | func(string, interface{}) | The callback invoked when an error occurs during recognition. The interface{} argument is the user-defined parameter. |
    | started | func(string, interface{}) | The callback invoked when the connection is established. |
    | resultchanged | func(string, interface{}) | The callback invoked with each intermediate recognition result. |
    | completed | func(string, interface{}) | The callback invoked with the final recognition result. |
    | closed | func(interface{}) | The callback invoked when the connection is closed. |
    | param | interface{} | A user-defined parameter that is passed to the callbacks. |

  • Return values:

    • *SpeechRecognition: A pointer to the short-sentence speech recognition object.

    • error: An error.
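The user-defined parameter is useful for carrying per-instance state into the callbacks. The sketch below shows the pattern with a callback of the func(string, interface{}) shape listed above. The session type and the dispatch function are hypothetical: dispatch only stands in for the SDK handing the registered parameter back to the callback, so the example runs without a network connection.

```go
package main

import (
	"fmt"
	"sync"
)

// session is hypothetical per-instance state registered as the user-defined
// parameter.
type session struct {
	mu      sync.Mutex
	id      string
	results []string
}

// onCompleted has the func(string, interface{}) shape used by the callbacks.
// It recovers the session with a type assertion and records the text.
func onCompleted(text string, param interface{}) {
	s, ok := param.(*session)
	if !ok {
		return // not the state that was registered; ignore the event
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.results = append(s.results, text)
}

// dispatch stands in for the SDK: it passes the registered parameter back to
// the callback together with the event text.
func dispatch(cb func(string, interface{}), text string, param interface{}) {
	cb(text, param)
}

func main() {
	s := &session{id: "ID0"}
	dispatch(onCompleted, "final result text", s)
	fmt.Println(s.id, len(s.results))
}
```

Guarding the state with a mutex is a cautious choice here, on the assumption that callbacks may run on a goroutine other than the one that created the instance.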

4. func (sr *SpeechRecognition) Start(param SpeechRecognitionStartParam, extra map[string]interface{}) (chan bool, error)

Initiates a short-sentence speech recognition request with the specified parameters.

  • Parameters:

    | Parameter | Type | Description |
    | --- | --- | --- |
    | param | SpeechRecognitionStartParam | The parameters for short-sentence speech recognition. |
    | extra | map[string]interface{} | Extra key:value parameters. |

  • Return values:

    • chan bool: A channel that is used to synchronize the start of the speech recognition process. Send audio data only after you receive true from this channel.

    • error: An error.

5. func (sr *SpeechRecognition) Stop() (chan bool, error)

Stops the short-sentence speech recognition process.

  • Parameters: None.

  • Return values:

    • chan bool: A channel that is used to synchronize the end of the speech recognition process.

    • error: An error.

6. func (sr *SpeechRecognition) Shutdown()

Forcibly disconnects the connection.

  • Parameters: None.

  • Return value: None.

7. func (sr *SpeechRecognition) SendAudioData(data []byte) error

Sends audio data. The audio format must match the format that is specified in the parameters.

  • Parameters:

    | Parameter | Type | Description |
    | --- | --- | --- |
    | data | []byte | The audio data. |

  • Return value:

    error: An error.

SDK logs

1. func DefaultNlsLog() *NlsLogger

Creates a global default log object. By default, the log has the prefix "NLS" and writes to standard error.

  • Parameters: None.

  • Return value:

    *NlsLogger: A pointer to the log object.

2. func NewNlsLogger(w io.Writer, tag string, flag int) *NlsLogger

Creates a new log object.

  • Parameters:

    | Parameter | Type | Description |
    | --- | --- | --- |
    | w | io.Writer | Any object that implements the io.Writer interface. |
    | tag | String | The log prefix. It is printed at the beginning of each log line. |
    | flag | Integer | The log flag. For more information, see the official Go log documentation. |

  • Return value:

    *NlsLogger: A pointer to the log object.

3. func (logger *NlsLogger) SetLogSil(sil bool)

Specifies whether to suppress log output to the logger's io.Writer.

  • Parameters:

    | Parameter | Type | Description |
    | --- | --- | --- |
    | sil | Boolean | Specifies whether to disable log output. true: log output is disabled. false: log output is enabled. |

  • Return value: None.

4. func (logger *NlsLogger) SetDebug(debug bool)

Specifies whether to enable debug logging. This setting affects only logs that are created using Debugf or Debugln.

  • Parameters:

    | Parameter | Type | Description |
    | --- | --- | --- |
    | debug | Boolean | Specifies whether to allow debug log output. true: debug logs are written. false: debug logs are not written. |

  • Return value: None.

5. func (logger *NlsLogger) SetOutput(w io.Writer)

Sets the output destination for the log.

  • Parameters:

    | Parameter | Type | Description |
    | --- | --- | --- |
    | w | io.Writer | Any object that implements the io.Writer interface. |

  • Return value: None.

6. func (logger *NlsLogger) SetPrefix(prefix string)

Sets the prefix for each log entry.

  • Parameters:

    | Parameter | Type | Description |
    | --- | --- | --- |
    | prefix | String | The log prefix. It is printed at the beginning of each log line. |

  • Return value: None.

7. func (logger *NlsLogger) SetFlags(flags int)

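Sets the log flags.

  • Parameters:

    | Parameter | Type | Description |
    | --- | --- | --- |
    | flags | Integer | The log flags. For more information, see the official Go log documentation. |

  • Return value: None.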

8. Log printing

Logging methods

| Method name | Description |
| --- | --- |
| func (l *NlsLogger) Print(v ...interface{}) | Standard log output. |
| func (l *NlsLogger) Println(v ...interface{}) | Standard log output that adds a new line. |
| func (l *NlsLogger) Printf(format string, v ...interface{}) | Formatted log output. For more information about the format, see the official Go documentation. |
| func (l *NlsLogger) Debugln(v ...interface{}) | Debug log output that adds a new line. |
| func (l *NlsLogger) Debugf(format string, v ...interface{}) | Formatted debug log output. |
| func (l *NlsLogger) Fatal(v ...interface{}) | Outputs a fatal error log and then exits the process. |
| func (l *NlsLogger) Fatalln(v ...interface{}) | Outputs a fatal error log, adds a new line, and then exits the process. |
| func (l *NlsLogger) Fatalf(format string, v ...interface{}) | Outputs a formatted fatal error log and then exits the process. |
| func (l *NlsLogger) Panic(v ...interface{}) | Outputs a fatal error log, prints crash information, and then exits the process. |
| func (l *NlsLogger) Panicln(v ...interface{}) | Outputs a fatal error log, adds a new line, prints crash information, and then exits the process. |
| func (l *NlsLogger) Panicf(format string, v ...interface{}) | Outputs a formatted fatal error log, prints crash information, and then exits the process. |

Code example

package main

import (
        "errors"
        "flag"
        "fmt"
        "log"
        "os"
        "os/signal"
        "sync"
        "time"

        "github.com/aliyun/alibabacloud-nls-go-sdk"
)

const (
        AKID  = "Your AKID"
        AKKEY = "Your AKKEY"
        //online key
        APPKEY = "Your APPKEY" // To obtain an Appkey, go to the console: https://nls-portal.console.aliyun.com/applist
        TOKEN  = "Your TOKEN"  // For details about how to obtain a token, see https://help.aliyun.com/document_detail/450514.html
)

func onTaskFailed(text string, param interface{}) {
        logger, ok := param.(*nls.NlsLogger)
        if !ok {
                log.Default().Fatal("invalid logger")
                return
        }

        logger.Println("TaskFailed:", text)
}

func onStarted(text string, param interface{}) {
        logger, ok := param.(*nls.NlsLogger)
        if !ok {
                log.Default().Fatal("invalid logger")
                return
        }

        logger.Println("onStarted:", text)
}

func onResultChanged(text string, param interface{}) {
        logger, ok := param.(*nls.NlsLogger)
        if !ok {
                log.Default().Fatal("invalid logger")
                return
        }

        logger.Println("onResultChanged:", text)
}

func onCompleted(text string, param interface{}) {
        logger, ok := param.(*nls.NlsLogger)
        if !ok {
                log.Default().Fatal("invalid logger")
                return
        }

        logger.Println("onCompleted:", text)
}

func onClose(param interface{}) {
        logger, ok := param.(*nls.NlsLogger)
        if !ok {
                log.Default().Fatal("invalid logger")
                return
        }

        logger.Println("onClosed:")
}

func waitReady(ch chan bool, logger *nls.NlsLogger) error {
        select {
        case done := <-ch:
                {
                        if !done {
                                logger.Println("Wait failed")
                                return errors.New("wait failed")
                        }
                        logger.Println("Wait done")
                }
        case <-time.After(20 * time.Second):
                {
                        logger.Println("Wait timeout")
                        return errors.New("wait timeout")
                }
        }
        return nil
}

var lk sync.Mutex
var fail = 0
var reqNum = 0

func testMultiInstance(num int) {
        pcm, err := os.Open("tests/test1.pcm")
        if err != nil {
                log.Default().Fatalln(err)
        }

        buffers := nls.LoadPcmInChunk(pcm, 320)
        param := nls.DefaultSpeechRecognitionParam()
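        config, err := nls.NewConnectionConfigWithAKInfoDefault(nls.DEFAULT_URL, APPKEY, AKID, AKKEY)
        if err != nil {
                // Without a valid configuration no instance can connect.
                log.Default().Fatalln(err)
        }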
        var wg sync.WaitGroup
        for i := 0; i < num; i++ {
                wg.Add(1)
                go func(id int) {
                        defer wg.Done()
                        strId := fmt.Sprintf("ID%d   ", id)
                        logger := nls.NewNlsLogger(os.Stderr, strId, log.LstdFlags|log.Lmicroseconds)
                        logger.SetLogSil(false)
                        logger.SetDebug(true)
                        logger.Printf("Test Normal Case for SpeechRecognition:%s", strId)
                        sr, err := nls.NewSpeechRecognition(config, logger,
                                onTaskFailed, onStarted, onResultChanged,
                                onCompleted, onClose, logger)
                        if err != nil {
                                logger.Fatalln(err)
                                return
                        }

                        test_ex := make(map[string]interface{})
                        test_ex["test"] = "hello"

                        for {
                                lk.Lock()
                                reqNum++
                                lk.Unlock()
                                logger.Println("SR start")
                                ready, err := sr.Start(param, test_ex)
                                if err != nil {
                                        lk.Lock()
                                        fail++
                                        lk.Unlock()
                                        sr.Shutdown()
                                        continue
                                }

                                err = waitReady(ready, logger)
                                if err != nil {
                                        lk.Lock()
                                        fail++
                                        lk.Unlock()
                                        sr.Shutdown()
                                        continue
                                }

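                                for _, data := range buffers.Data {
                                        if data != nil {
                                                // Stop sending as soon as a chunk fails.
                                                if err := sr.SendAudioData(data.Data); err != nil {
                                                        logger.Println("send audio failed:", err)
                                                        break
                                                }
                                                // Pause between chunks to pace the upload.
                                                time.Sleep(10 * time.Millisecond)
                                        }
                                }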

                                logger.Println("send audio done")
                                ready, err = sr.Stop()
                                if err != nil {
                                        lk.Lock()
                                        fail++
                                        lk.Unlock()
                                        sr.Shutdown()
                                        continue
                                }

                                err = waitReady(ready, logger)
                                if err != nil {
                                        lk.Lock()
                                        fail++
                                        lk.Unlock()
                                        sr.Shutdown()
                                        continue
                                }

                                logger.Println("Sr done")
                                sr.Shutdown()
                        }
                }(i)
        }

        wg.Wait()
}

func main() {
        coroutineId := flag.Int("num", 1, "coroutine number")
        flag.Parse()
        log.Default().Printf("start %d coroutines", *coroutineId)

        c := make(chan os.Signal, 1)
        signal.Notify(c, os.Interrupt)
        go func() {
                for range c {
                        lk.Lock()
                        log.Printf(">>>>>>>>REQ NUM: %d>>>>>>>>>FAIL: %d", reqNum, fail)
                        lk.Unlock()
                        os.Exit(0)
                }
        }()
        testMultiInstance(*coroutineId)
}