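In the previous post I bought a Google AIY Voice Kit, assembled it according to the manual, and got as far as running the sample programs. What makes the Voice Kit more interesting than a finished smart speaker such as Google Home or Amazon Echo, though, is the freedom that comes from being built on a Raspberry Pi. So this time, as a first step, I connected LEDs on a breadboard and controlled them by voice.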

Attaching the GPIO pin header

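In the AIY Voice Kit, the Voice HAT Accessory Board is mounted on top of the Raspberry Pi, so the Raspberry Pi's own GPIO header is completely covered. To connect anything over GPIO, you therefore use the GPIO pins on the Voice HAT instead. The Voice HAT GPIO pinout is illustrated with a diagram in the documentation.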

aiyprojects.withgoogle.com

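Below the diagram there is a table describing each pin. I want to connect two LEDs this time, so I will use GPIO02 and GPIO03. These pins can also be used for I2C, which is why they are labeled I2C on the board. In the photo, the pins inside the red frame are, from the right, GPIO02, GPIO03, and GND.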

f:id:akanuma-hiroaki:20180123223632j:plain:w450

The Voice HAT does not come with pins attached, so I got a pin header separately, removed the Voice HAT from the Raspberry Pi, and soldered the header on.

f:id:akanuma-hiroaki:20180123224112j:plain:w450

Once the header is soldered, connect the jumper wires.

f:id:akanuma-hiroaki:20180123224336j:plain:w450

Then reassemble the Voice Kit, leaving the free ends of the jumper wires pulled out of the case.

f:id:akanuma-hiroaki:20180123224503j:plain:w300

Connecting the breadboard and LEDs

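Using the jumper wires pulled out earlier, connect the breadboard and LEDs as shown in the photo below. The green wire is GPIO02 and the yellow wire is GPIO03, each connected to the anode side of an LED, and the black wire is the GND connection.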

f:id:akanuma-hiroaki:20180123225024j:plain:w300
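Before adding voice control, it can help to confirm the wiring on its own. The following is a minimal sketch of a blink test, not part of the original script. The `blink()` helper and the `FakeGPIO` class are hypothetical names introduced here for illustration; `FakeGPIO` only mimics the handful of `RPi.GPIO` names the helper uses, so the sketch can be dry-run on a machine without the hardware.

```python
import time


class FakeGPIO:
    """Hypothetical stand-in that mimics the few RPi.GPIO names used below."""
    BCM, OUT, HIGH, LOW = "BCM", "OUT", 1, 0

    def __init__(self):
        self.calls = []  # records every call so the sequence can be inspected

    def setmode(self, mode):
        self.calls.append(("setmode", mode))

    def setup(self, pin, direction):
        self.calls.append(("setup", pin, direction))

    def output(self, pin, level):
        self.calls.append(("output", pin, level))

    def cleanup(self):
        self.calls.append(("cleanup",))


def blink(gpio, pins, cycles=3, delay=0.5):
    """Blink all `pins` together `cycles` times, then release the pins."""
    gpio.setmode(gpio.BCM)          # use BCM numbering, as the main script does
    for pin in pins:
        gpio.setup(pin, gpio.OUT)
    try:
        for _ in range(cycles):
            for level in (gpio.HIGH, gpio.LOW):
                for pin in pins:
                    gpio.output(pin, level)
                time.sleep(delay)
    finally:
        gpio.cleanup()              # always return the pins to their default state


# Dry run without hardware: record the calls instead of driving real pins.
fake = FakeGPIO()
blink(fake, [2, 3], cycles=1, delay=0)
print(fake.calls)
```

On the Raspberry Pi itself, the real module can be passed in place of the fake, for example `import RPi.GPIO as GPIO` followed by `blink(GPIO, [2, 3])`. If both LEDs blink, GPIO02, GPIO03, and GND are wired as intended.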

Implementing the script

Now for the Python script. It is based on the official sample linked below, with the GPIO handling and a few other things added.

github.com

First, here is the script in full.

#!/usr/bin/env python3

import sys

import aiy.assistant.auth_helpers
import aiy.audio
import aiy.voicehat
from google.assistant.library import Assistant
from google.assistant.library.event import EventType

import RPi.GPIO as GPIO

class MyAssistant:
    GPIO_LED_GREEN = 2
    GPIO_LED_YELLOW = 3

    def __init__(self):
        self.print_and_say('Initializing MyAssistant.')
        self.credentials = aiy.assistant.auth_helpers.get_assistant_credentials()
        self.status_ui = aiy.voicehat.get_status_ui()

        GPIO.setmode(GPIO.BCM)
        GPIO.setup(MyAssistant.GPIO_LED_GREEN, GPIO.OUT)
        GPIO.setup(MyAssistant.GPIO_LED_YELLOW, GPIO.OUT)

    def print_and_say(self, text):
        print(text)
        aiy.audio.say(text)

    def process_event(self, assistant, event):
        print('Processing event. The event is %s.' % event.type)

        if event.type == EventType.ON_START_FINISHED:
            self.status_ui.status('ready')
            if sys.stdout.isatty():
                print('Say "OK, Google" then speak.')
        elif event.type == EventType.ON_CONVERSATION_TURN_STARTED:
            self.status_ui.status('listening')
        elif event.type == EventType.ON_RECOGNIZING_SPEECH_FINISHED and event.args:
            print('You said: %s' % event.args['text'])
            text = event.args['text'].lower()

            if text == 'turn on green led':
                assistant.stop_conversation()
                GPIO.output(MyAssistant.GPIO_LED_GREEN, GPIO.HIGH)
                self.print_and_say('Turned on green LED.')
            elif text == 'turn off green led':
                assistant.stop_conversation()
                GPIO.output(MyAssistant.GPIO_LED_GREEN, GPIO.LOW)
                self.print_and_say('Turned off green LED.')
            elif text == 'turn on yellow led':
                assistant.stop_conversation()
                GPIO.output(MyAssistant.GPIO_LED_YELLOW, GPIO.HIGH)
                self.print_and_say('Turned on yellow LED.')
            elif text == 'turn off yellow led':
                assistant.stop_conversation()
                GPIO.output(MyAssistant.GPIO_LED_YELLOW, GPIO.LOW)
                self.print_and_say('Turned off yellow LED.')
            elif text == 'turn on all led':
                assistant.stop_conversation()
                GPIO.output(MyAssistant.GPIO_LED_GREEN, GPIO.HIGH)
                GPIO.output(MyAssistant.GPIO_LED_YELLOW, GPIO.HIGH)
                self.print_and_say('Turned on all LED.')
            elif text == 'turn off all led':
                assistant.stop_conversation()
                GPIO.output(MyAssistant.GPIO_LED_GREEN, GPIO.LOW)
                GPIO.output(MyAssistant.GPIO_LED_YELLOW, GPIO.LOW)
                self.print_and_say('Turned off all LED.')
            elif text == 'goodbye':
                self.status_ui.status('stopping')
                assistant.stop_conversation()
                aiy.audio.say('Goodbye. See you again.')
                print('Stopping...')
                sys.exit()

        elif event.type == EventType.ON_END_OF_UTTERANCE:
            self.status_ui.status('thinking')
        elif event.type == EventType.ON_CONVERSATION_TURN_FINISHED:
            self.status_ui.status('ready')
        elif event.type == EventType.ON_ASSISTANT_ERROR and event.args and event.args['is_fatal']:
            sys.exit(1)

    def main(self):
        self.status_ui.status('starting')
        self.print_and_say('Starting main method.')
        with Assistant(self.credentials) as assistant:
            for event in assistant.start():
                self.process_event(assistant, event)

if __name__ == '__main__':
    sample = MyAssistant()
    sample.main()

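Calling the Assistant's start() method makes Google Assistant start listening for the hotword. It emits events as startup completes and as each conversation progresses after the hotword is detected, and each event is passed to the process_event() method for handling.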

with Assistant(self.credentials) as assistant:
    for event in assistant.start():
        self.process_event(assistant, event)
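To see the shape of this loop without a Voice Kit, here is a rough simulation. Everything in it is a stand-in: the `Event` tuple and `fake_start()` generator are hypothetical, plain strings replace the library's `EventType` members, and the order of the canned events is only illustrative. The status mapping mirrors the branches of process_event() in the script above.

```python
from collections import namedtuple

# Hypothetical stand-in for the library's event objects: a type name plus args.
Event = namedtuple("Event", ["type", "args"])

# Status shown for each event type, mirroring the branches in process_event().
STATUS_BY_EVENT = {
    "ON_START_FINISHED": "ready",
    "ON_CONVERSATION_TURN_STARTED": "listening",
    "ON_END_OF_UTTERANCE": "thinking",
    "ON_CONVERSATION_TURN_FINISHED": "ready",
}


def fake_start():
    """Yield a canned sequence of events, roughly one conversation turn."""
    yield Event("ON_START_FINISHED", None)
    yield Event("ON_CONVERSATION_TURN_STARTED", None)
    yield Event("ON_END_OF_UTTERANCE", None)
    yield Event("ON_RECOGNIZING_SPEECH_FINISHED", {"text": "Turn on green LED"})
    yield Event("ON_CONVERSATION_TURN_FINISHED", None)


def run(events):
    """Consume events one at a time; return the statuses and utterances seen."""
    statuses, utterances = [], []
    for event in events:
        if event.type in STATUS_BY_EVENT:
            statuses.append(STATUS_BY_EVENT[event.type])
        elif event.type == "ON_RECOGNIZING_SPEECH_FINISHED" and event.args:
            utterances.append(event.args["text"].lower())
    return statuses, utterances


statuses, utterances = run(fake_start())
print(statuses)
print(utterances)
```

The point of the sketch is only that start() behaves like an iterator: the `for` loop blocks until the next event arrives, and all of the application logic lives in the handler that receives it.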

The event types are documented here.

Google Assistant Library | Google Assistant SDK | Google Developers

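Whatever the user says after the hotword is passed as an argument of the ON_RECOGNIZING_SPEECH_FINISHED event, so the script branches on that text and drives the GPIO pins to control the LEDs. The key point is the call to assistant.stop_conversation(): it interrupts Google Assistant's own handling of the utterance so that only the local action runs. Without stop_conversation(), Google Assistant also tries to process the utterance and replies with something along the lines of not being able to help with that command.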

if text == 'turn on green led':
    assistant.stop_conversation()
    GPIO.output(MyAssistant.GPIO_LED_GREEN, GPIO.HIGH)
    self.print_and_say('Turned on green LED.')
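As the number of phrases grows, the if/elif chain gets repetitive. One possible alternative, shown here only as a sketch and not as what the script above does, is a lookup table from phrase to action. The names `COMMANDS` and `handle_command()` are hypothetical, and `HIGH`/`LOW` are plain 1/0 stand-ins for `GPIO.HIGH`/`GPIO.LOW` so the sketch runs without the hardware.

```python
GPIO_LED_GREEN = 2
GPIO_LED_YELLOW = 3
HIGH, LOW = 1, 0  # stand-ins for GPIO.HIGH / GPIO.LOW

ALL_LEDS = (GPIO_LED_GREEN, GPIO_LED_YELLOW)

# phrase -> (pins to drive, level to set, spoken confirmation)
COMMANDS = {
    "turn on green led": ((GPIO_LED_GREEN,), HIGH, "Turned on green LED."),
    "turn off green led": ((GPIO_LED_GREEN,), LOW, "Turned off green LED."),
    "turn on yellow led": ((GPIO_LED_YELLOW,), HIGH, "Turned on yellow LED."),
    "turn off yellow led": ((GPIO_LED_YELLOW,), LOW, "Turned off yellow LED."),
    "turn on all led": (ALL_LEDS, HIGH, "Turned on all LED."),
    "turn off all led": (ALL_LEDS, LOW, "Turned off all LED."),
}


def handle_command(text, set_pin):
    """Run the local action for `text`.

    Returns the confirmation message, or None when the phrase is not a
    local command and should be left to Google Assistant.
    """
    command = COMMANDS.get(text.strip().lower())
    if command is None:
        return None
    pins, level, reply = command
    for pin in pins:
        set_pin(pin, level)  # on the Pi this would be GPIO.output
    return reply


# Dry run: record pin levels in a dict instead of driving real pins.
state = {}
print(handle_command("turn on all led", state.__setitem__))
print(state)
print(handle_command("what time is it", state.__setitem__))
```

In process_event(), this could be used by checking whether the recognized text is a key of `COMMANDS`, calling assistant.stop_conversation() if so, and then passing `GPIO.output` as `set_pin` and the returned message to print_and_say(). Unknown phrases return None and fall through to Google Assistant as before.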