I’ve published a new Alexa Skill (How-to walkthrough)


This article is long in order to provide a step-by-step guide to how I wrote my Alexa Skill.


Never too old to learn, I wanted to understand how to code and publish an Alexa skill.

It’s an interesting mix of web technologies that work together to bring Alexa skills to life, and I wanted to learn and build an end-to-end solution for a purpose I think would be really useful for the locals in my area: the next published opening time (if any) of the Newhaven Swing Bridge.

This particular swing bridge crosses the River Ouse in the centre of Newhaven in East Sussex. It’s a busy bridge because it carries the A259 South Coast Road from Brighton to Eastbourne. If the bridge swings open to allow a ship through to/from Newhaven Port, the whole process takes upwards of 20 minutes. The whole of Newhaven becomes congested and traffic jams snake as far away as Peacehaven and Telscombe Cliffs in the West, up the A26 towards Lewes in the North, and away towards Seaford in the East.

The alternative? A 15 mile rural journey North up the C7 to Lewes, along the A27 highway East towards Eastbourne, then South along the A26 towards Newhaven Ferry.

So planning your journey to avoid Newhaven from 5 minutes before the bridge swings open to – I’d say – an hour later, is an important task!

Both Newhaven Port Authority and Newhaven Town Council publish the next scheduled opening time, if any, on their websites. The bridge opens on average once every 2-3 days. That’s great, but these days just chatting away to a circular tube in your home that houses a voice which obeys every command is far more convenient!

To start with – and beyond the scope of this article – I had to either find or write an API call that would actually get this information. Later on in this article I will talk about a ‘Lambda’ function, which runs code that could itself, for example, go and scrape a website with the information on it. But I felt writing a simple API on my own server (actually the same one serving you this page!) would be the best idea.

I located a source of the information and then used my PHP 7 skills to write a simple API to scrape and verify that information, and present words ready for Alexa to speak in a simple JSON response. The Lambda function I will talk about later is therefore the link between the Alexa Voice Service and calls to my API. This is simply the easiest arrangement, mainly for security reasons: Amazon prefers the Alexa Voice Service to call Lambda functions rather than servers in the outside world. So I wrote the API, installed it on my server, and now this story continues…
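
To make that JSON contract a little more concrete: the Lambda code later in this article reads two fields from the API response, STATUSID (1, 2 or 3) and STATUSTEXT (the words to speak). The sketch below shows one way such a response could be turned into a sentence. It is only an illustration, written for Python 3: the sample payload and the names SAMPLE_RESPONSE, UNAVAILABLE and speech_from_payload are made up, and only the two field names come from my API.

```python
import json

# Hypothetical sample payload; only the field names STATUSID and STATUSTEXT
# are taken from the real API, the values are invented for illustration.
SAMPLE_RESPONSE = '{"STATUSID": "3", "STATUSTEXT": "today at 2 30 pm"}'

UNAVAILABLE = ("I am unable to access the bridge opening time information "
               "at the moment. Please try again shortly.")


def speech_from_payload(raw_json):
    """Turn the API's JSON text into a sentence for Alexa to speak."""
    try:
        payload = json.loads(raw_json)
        status_id = int(payload['STATUSID'])
    except (ValueError, KeyError, TypeError):
        # Unreadable or incomplete responses are treated as status 1.
        return UNAVAILABLE

    if status_id == 2:
        return "There are no scheduled openings of the Newhaven swing bridge today."
    if status_id == 3 and payload.get('STATUSTEXT'):
        return ("The Newhaven swing bridge is due to open next "
                + str(payload['STATUSTEXT']))
    # Status 1, an unknown status, or a status 3 with no text to speak.
    return UNAVAILABLE
```

Keeping the "what to say" decision in one small function like this means it can be checked on a laptop without involving Alexa or Lambda at all.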

To set about building my Alexa Skill, I arrived at the Amazon Developers website and signed in using my standard Amazon Store credentials (not Amazon Web Services (AWS) credentials):

https://developer.amazon.com/alexa

I then clicked on ‘ALEXA’ from the top menu row and arrived at this selection:

I clicked on Alexa Skills Kit’s “Get Started” button, and arrived here:

As you can see, my Newhaven Swing Bridge skill is live, but I can still view it. You can click ‘Add A New Skill’ to see the same screen (Ha! My amateurish red line is simply to hide what might be a useful piece of information for a would-be hacker? Just taking precautions!):

You’ll notice that ‘Skill Information’ is highlighted in orange top left, as this is the Skill Information window.

I entered:

  • For the Skill Name, ‘Newhaven Swing Bridge’.
  • For the language I chose ‘English UK’. At the time of writing there are also ‘English US’ and ‘German’, which represent the three markets in which Amazon is selling Alexa products. Choose ‘English US’ when you are in the UK and your Amazon Echo won’t see your skill when you test it (and vice versa).
  • For the Invocation Name, I typed the words that a person would use to activate the skill. Note how, for ease, I separated Newhaven into ‘new haven’ to make it as easy as possible for Alexa to understand what a user would be saying.
  • For Global Fields, I left them at their default settings of ‘No’, as I am not using audio files, video apps or graphics.

I then clicked the ‘Builder’ button under ‘Interaction Model’ (below ‘Skill Information’), which loads this screen after a few seconds:

So what’s this about? The Builder allows you to teach Alexa what individual commands to listen for so she* knows what code to run to make the skill work. *Yes, personifying a computer voice is probably a bit morally weird but I’m going with it!

These commands are known as Intents. Three Intents are built into every skill: Cancel, Help and Stop. If a user shouts out any of these three words then Alexa will start running the code connected to the appropriate Intent.

So we need to add a new Intent which I’ve called ‘OpeningTimeIntent’. This name is arbitrary but it must match a function in the code you’ll be writing shortly. For now we need to teach Alexa what words to listen out for that would make her decide that this was the Intent the user wanted to run. You can ‘Add an Intent’ in the blue square – I’ll just click on my existing Intent to show you it:

A window called ‘Sample Utterances’ appears. These are phrases that a user would use that would cause Alexa to decide that OpeningTimeIntent was the desired action. As you can see, I thought about what a user would actually say and wrote these down:

  • Alexa, ask Newhaven Swing Bridge when it is next open
  • Alexa, ask Newhaven Swing Bridge if it is opening today
  • Alexa, ask Newhaven Swing Bridge for the next opening time
  • Alexa, ask Newhaven Swing Bridge when it next opens

I could have added a whole load more examples, but these seem reasonable. If I wanted I could ask Alexa to respond with a sentence such as, “You want to hear the next opening time, is that correct?” in which case I could have added those very words to the Intent Confirmation box. This would be useful if you have lots of your own different Intents and there may be a situation where you want to check that Alexa has chosen the correct Intent. The user would reply, saying Yes or No. If Yes, the Intent would then be executed. It’s a bit much for this simple skill, though, so if Alexa thinks you said, “Alexa, ask Newhaven Swing Bridge for the next opening time” then she’s just going to go ahead and run the code to get the info regardless.

For this simple skill we’re done here, so at the top of the window I click ‘Save Model’ followed by ‘Build Model’. Whilst ‘Save Model’ is pretty much instant, ‘Build Model’ takes a good minute during which Alexa servers far, far away take the skill and build the binaries for it.
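
To show how the Intent name ties the Builder to the code, here is a minimal sketch of the dispatch step written as a lookup table. The intent names (OpeningTimeIntent, AMAZON.HelpIntent, AMAZON.CancelIntent, AMAZON.StopIntent) are the ones my Lambda code checks for further down; the three handler functions are placeholders and the INTENT_HANDLERS and route_intent names are hypothetical.

```python
def opening_time():
    return "opening time handler"   # placeholder for the real lookup


def help_text():
    return "help handler"           # placeholder


def goodbye():
    return "session end handler"    # placeholder


# Each key must match an Intent name exactly as typed in the Builder
# (or one of Amazon's built-in names).
INTENT_HANDLERS = {
    "OpeningTimeIntent": opening_time,
    "AMAZON.HelpIntent": help_text,
    "AMAZON.CancelIntent": goodbye,
    "AMAZON.StopIntent": goodbye,
}


def route_intent(intent_request):
    """Pick the handler for the intent named in an IntentRequest."""
    name = intent_request['intent']['name']
    handler = INTENT_HANDLERS.get(name)
    if handler is None:
        raise ValueError("Invalid intent: " + name)
    return handler()
```

If the name in the Builder and the name in the code drift apart, even by a capital letter, the lookup fails and the skill raises an error instead of answering.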

Once that’s done, I leave Interaction Model by clicking the next button at the top right corner of the Builder screen: Configuration.

It takes a few seconds for the web page to get rid of the Builder screen and load the Configuration screen, which looks like this (again, ignore the red line and enter your lambda function ID once you have created it):

So now I need to pause my Alexa Skill development and do some coding. On the screen above, the skill is asking for an ‘AWS Lambda ARN’ (Amazon Resource Name). This is a reference to the program script that will execute the skill, which is hosted in a service in the Amazon Web Services cloud called Lambda.

Amazon Web Services (AWS) describes Lambda:

AWS Lambda is a compute service that lets you run code without provisioning or managing servers. AWS Lambda executes your code only when needed and scales automatically, from a few requests per day to thousands per second. You pay only for the compute time you consume – there is no charge when your code is not running.

Importantly, Alexa skills trust Lambda in security terms so that the skill and the lambda program you write work together to bring your skill to life.

So now I head to Amazon Web Services at:

https://aws.amazon.com/

It just so happens that I have an account at AWS so I simply log in to get to the main logged-in page. If you don’t have an account you will need to create one and register a payment card. Lambda functions have a large ‘free’ tier so you should have few worries about cost. Indeed, AWS and Alexa are trying to make Lambda free even if thousands use your skill every day.

Check for the latest pricing at https://aws.amazon.com/lambda/pricing/

At the time of writing, that page says:

The Lambda free tier includes 1M free requests per month and 400,000 GB-seconds of compute time per month.

You are charged for the total number of requests across all your functions. Lambda counts a request each time it starts executing in response to an event notification or invoke call, including test invokes from the console.

  • First 1 million requests per month are free
  • $0.20 per 1 million requests thereafter ($0.0000002 per request)
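
As a quick worked example of those request prices, the sketch below works out the monthly request charge for a given number of invocations. It covers only the per-request charge quoted above, not the separate compute-time (GB-seconds) charge, and the function and constant names are my own invention.

```python
FREE_REQUESTS_PER_MONTH = 1000000   # first 1 million requests are free
PRICE_PER_MILLION_USD = 0.20        # $0.20 per 1 million requests thereafter


def monthly_request_charge(requests):
    """Return the request charge in US dollars for one month."""
    billable = max(0, requests - FREE_REQUESTS_PER_MONTH)
    return billable / 1000000.0 * PRICE_PER_MILLION_USD
```

Even if the skill were invoked 5,000 times a day (a purely hypothetical figure), that is about 150,000 requests a month, well inside the free tier, so monthly_request_charge(150000) comes to zero.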

Now I’m logged into my AWS account, I see this list of services (partial screen – there are loads!):

I click on ‘Lambda’ under the ‘Compute’ group. If you haven’t used Lambda yet then you’ll get a welcome screen, otherwise a list of Lambda functions will appear. I click on Create Lambda Function. What appears is a screen like this:

These templates are examples of already-written program code that you can use and adjust to your own needs. When I originally started this project, I chose a blueprint using a computer language I am familiar with and that is easy to program – Python.

To get to this blueprint, set the search and filter boxes at the top of the screen to these values: Python 2.7 (drop down list) and ‘alexa’ text:

You will then see “alexa-skills-kit-color-expert-py… ” in the results. Click on that blueprint.

Immediately, you see this screen:

This screen is showing a ‘trigger’ – something that causes the Lambda program you’ll be writing to execute. Here pre-filled-in is the trigger ‘Alexa Skills Kit’. All I need to do is click Next – and I get this screen:

I’m being asked for a function name, but other than that everything else seems to be filled in, including what looks like a text editor with program code in it. This is the code from the blueprint I selected.

If you wish, you are welcome to read through the code – it’s really interesting. However I want to change it to the code that runs my skill.

And here is my code, which I developed in their web editor. I have obfuscated the actual calls to the API running on my own server but I have commented in the code what got returned, so it should be easy to follow. Bear in mind that my simple API has no incoming parameters or properties – it is a simple web address which you could even call with a web browser and get back a JSON data response with the next opening time information. Just about all the rest of the code came from me altering the original blueprint code here and there.


"""
Nick Lansley's Newhaven Swing Bridge skill
"""

from __future__ import print_function
import urllib
import json
import time


# --------------- Helpers that build all of the responses ----------------------

def build_speechlet_response(title, output, reprompt_text, should_end_session):
    return {
        'outputSpeech': {
            'type': 'PlainText',
            'text': output
        },
        'card': {
            'type': 'Simple',
            'title': title,
            'content': output
        },
        'reprompt': {
            'outputSpeech': {
                'type': 'PlainText',
                'text': reprompt_text
            }
        },
        'shouldEndSession': should_end_session
    }


def build_response(session_attributes, speechlet_response):
    return {
        'version': '1.0',
        'sessionAttributes': session_attributes,
        'response': speechlet_response
    }


# --------------- Functions that control the skill's behavior ------------------

def GetHelpInfo():
    helpText = 'To use this skill, simply say, ask newhaven swing bridge for the next opening time. ' \
               + 'I will then go and find the next scheduled opening time of the Swing Bridge that carries the A 2 5 9 road across the river Ouse ' \
               + 'at Newhaven in East Sussex. When the bridge swings open, it closes the main road at the centre of Newhaven ' \
               + 'for about 20 minutes. ' \
               + 'The next swing bridge opening time is published by the Newhaven Port Authority on their website. ' \
               + 'Try it now. ask newhaven swing bridge for the next opening time!'
    return helpText


def get_help_response():
    session_attributes = {}
    card_title = "Help"

    speech_output = GetHelpInfo()
    reprompt_text = "I'm waiting to give you help!"
    should_end_session = True

    return build_response(session_attributes, build_speechlet_response(
        card_title, speech_output, reprompt_text, should_end_session))


def handle_session_end_request():
    card_title = "Session Ended"
    speech_output = ""
    # Setting this to true ends the session and exits the skill.
    should_end_session = True
    return build_response({}, build_speechlet_response(
        card_title, speech_output, None, should_end_session))

def GetOpeningTime():
    ssml = ''
    session_attributes = {}
    url = ''
    try:
        f = urllib.urlopen(url)
        jsonResponse = f.read().decode('utf-8')
        newhavenObject = json.loads(jsonResponse)
        statusId = int(newhavenObject['STATUSID']) #STATUSID is one of the JSON variable returned by my API with values 1, 2, or 3
    except Exception:  # network failure, bad JSON or a missing STATUSID all fall back to status 1
        newhavenObject = ''
        statusId = 1

    if statusId == 1:
        ssml = "I am unable to access the bridge opening time information at the moment. Please try again shortly. "
    elif statusId == 2:
        ssml = "There are no scheduled openings of the Newhaven swing bridge today."
    elif statusId == 3:
        ssml = "The Newhaven swing bridge is due to open next " + newhavenObject['STATUSTEXT'].replace(' ', ' ')
        #STATUSTEXT is another variable returned by my API with the date and time of the next opening to be spoken by Alexa.
    return build_response(session_attributes, build_speechlet_response("Newhaven Swing Bridge Opening Time", ssml, "Shall I repeat that?", True))



# --------------- Events ------------------

def on_session_started(session_started_request, session):
    """ Called when the session starts """

    print("on_session_started requestId=" + session_started_request['requestId']
          + ", sessionId=" + session['sessionId'])


def on_launch(launch_request, session):
    """ Called when the user launches the skill without specifying what they
    want
    """

    print("on_launch requestId=" + launch_request['requestId'] +
          ", sessionId=" + session['sessionId'])
    # Dispatch to your skill's launch
    return GetOpeningTime()


def on_intent(intent_request, session):
    """ Called when the user specifies an intent for this skill """

    print("on_intent requestId=" + intent_request['requestId'] +
          ", sessionId=" + session['sessionId'])

    intent = intent_request['intent']
    intent_name = intent_request['intent']['name']

    # Dispatch to your skill's intent handlers
    if intent_name == "OpeningTimeIntent":
        return GetOpeningTime()
    elif intent_name == "AMAZON.HelpIntent":
        return get_help_response()
    elif intent_name == "AMAZON.CancelIntent" or intent_name == "AMAZON.StopIntent":
        return handle_session_end_request()
    else:
        raise ValueError("Invalid intent")


def on_session_ended(session_ended_request, session):
    """ Called when the user ends the session.

    Is not called when the skill returns should_end_session=true
    """
    print("on_session_ended requestId=" + session_ended_request['requestId'] +
          ", sessionId=" + session['sessionId'])
    # add cleanup logic here


def nick_debug(string_to_show):
    apiParms = "?string=" + json.dumps(string_to_show)
    apiUrl = '<a debug API call to my server for logging info - great during development>'
    uh = urllib.urlopen(apiUrl + apiParms)
    data = uh.read()


# --------------- Main handler ------------------

def lambda_handler(event, context):
    """ Route the incoming request based on type (LaunchRequest, IntentRequest,
    etc.) The JSON body of the request is provided in the event parameter.
    """
    nick_debug('reached here!')

    print("event.session.application.applicationId=" +
          event['session']['application']['applicationId'])

    """
    Uncomment this if statement and populate with your skill's application ID to
    prevent someone else from configuring a skill that sends requests to this
    function.
    """
    # if (event['session']['application']['applicationId'] !=
    #         "amzn1.echo-sdk-ams.app.[unique-value-here]"):
    #     raise ValueError("Invalid Application ID")

    if event['session']['new']:
        on_session_started({'requestId': event['request']['requestId']},
                           event['session'])

    if event['request']['type'] == "LaunchRequest":
        return on_launch(event['request'], event['session'])
    elif event['request']['type'] == "IntentRequest":
        return on_intent(event['request'], event['session'])
    elif event['request']['type'] == "SessionEndedRequest":
        return on_session_ended(event['request'], event['session'])
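
The code above targets the Python 2.7 runtime, where urllib.urlopen exists. Python 2 has since reached end of life, so if you are adapting this to a Python 3 runtime, the fetch inside GetOpeningTime could look something like the sketch below. This is not my deployed code: the fetch_status name, the 5-second timeout and the injectable opener argument (there only so the function can be tried without a network connection) are assumptions for illustration.

```python
import io
import json
import urllib.request


def fetch_status(url, opener=urllib.request.urlopen, timeout=5):
    """Return (status_id, payload) from the opening-time API.

    Falls back to status 1 ("information unavailable") on any network,
    decoding or parsing problem, mirroring GetOpeningTime above.
    """
    try:
        response = opener(url, timeout=timeout)
        try:
            payload = json.loads(response.read().decode('utf-8'))
        finally:
            response.close()
        return int(payload['STATUSID']), payload
    except Exception:
        return 1, {}


# Offline stand-in for urlopen, so the function can be tried without a server.
def fake_opener(url, timeout):
    return io.BytesIO(b'{"STATUSID": "2"}')


status_id, payload = fetch_status('http://example.invalid/', opener=fake_opener)
```

The timeout matters more than it looks: without one, a slow server could leave Alexa waiting in silence until the Lambda function itself times out.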


After all that, it’s a case of testing the Alexa skill, which is linked to any Echo using the same Amazon account, and then submitting it for Amazon’s consideration via the Publishing Information tab of the Alexa Developer portal.

So back to the skill in the Alexa Developer portal. In keeping with the skill’s function, I clicked on the Publishing Information tab and set:
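
Before testing on a real Echo, it can help to feed the handler a hand-built event, for example as a test invoke from the Lambda console. The sketch below builds a cut-down IntentRequest containing only the keys that my lambda_handler actually reads; real events from Alexa carry many more fields. The helper names and placeholder IDs are hypothetical, and is_expected_application is simply the commented-out application ID check from the listing, written as a function.

```python
def make_intent_event(intent_name, application_id, new_session=True):
    """Build a minimal IntentRequest with only the keys lambda_handler reads."""
    return {
        'session': {
            'new': new_session,
            'sessionId': 'test-session-id',      # placeholder value
            'application': {'applicationId': application_id},
        },
        'request': {
            'type': 'IntentRequest',
            'requestId': 'test-request-id',      # placeholder value
            'intent': {'name': intent_name},
        },
    }


def is_expected_application(event, expected_id):
    """True if the request came from the skill with the given application ID."""
    return event['session']['application']['applicationId'] == expected_id


# Placeholder ID in the same shape as the one in the commented-out check.
MY_SKILL_ID = 'amzn1.echo-sdk-ams.app.[unique-value-here]'
event = make_intent_event('OpeningTimeIntent', MY_SKILL_ID)
```

Passing an event like this to lambda_handler should return the same response dictionary that Alexa would receive, which makes wording mistakes easy to spot before submission.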

  • Category to “Travel and Transportation”
  • Sub category: “Travel and Trip Planners”
  • Testing instructions: I just described, using a couple of sentences, what words an Amazon tester could use to engage with the skill, and what to expect.
  • Countries and Regions: Just the UK
  • Short Skill Description: “This skill allows you to find out the next opening time of the swing bridge in Newhaven, East Sussex” (keep it short just to hook those interested!).
  • Full Skill Description: (A full description about what the skill does, why it is useful (avoiding traffic jams!) and how to interact with it).
  • Example utterances: To get this right you MUST include at least two utterances that you used in the Skill Information and Interaction Model, or your skill will be rejected. My example phrases were: “Alexa, ask Newhaven Swing Bridge for the next opening time”, “when it is next open”, and “if it is opening today” – in that order, and with that case and syntax, in the three provided text boxes.
  • Keywords: I used newhaven,east sussex,swing,bridge,a259
  • Small icon (108 x 108 pixel PNG or JPG) and Large Icon (512 x 512 pixel PNG) mean that you need to rustle up your design skills in Windows Paint or better to create a distinctive icon. Bear in mind that these images are seen through a circular ‘quote’ design when published on the Alexa Skills site so keep the detail of your design towards the middle, avoiding the edges. When you upload you can see the result and if anything is hidden.

Finally, I submitted the skill for publication and watched my inbox for good news over the next two to three working days.

Good luck with your own skill!