Real-time reporting using AWS Lambda and DynamoDB: a tutorial to compute the number of players in a game level on the Snowplow event stream (2/2)

This is part 2 of our tutorial Real-time reporting using AWS Lambda and DynamoDB. Part 1 can be found here.

7. Setting up our transitions table

To produce the visualization required, Codecombat need to know not just how many users are active in each level now, but how many have transitioned from one level to another in the last time period.

Our transitions table looks as follows:

It is relatively straightforward to update the table based on our existing SetLevelState.py Lambda, which reads from the PlayerState DynamoDB stream looking for changes to player state and uses that to set the current LevelState table with the appropriate PlayerCount. We extend it to update the Transitions table above in Dynamo using the following function:

def write_transition(old_level, new_level):
    # write to the transitions table, bumping the count for this record
    if old_level != new_level:

        record_key = make_transition_key(old_level, new_level)
        print("transition key = {}".format(record_key))

        if old_level is not None and new_level is not None:
            # from / to a level
            response = transitions_table.update_item(
                Key={'transitionLevels': record_key},
                UpdateExpression="set #total = if_not_exists(#total, :initial) + :val, levelFrom = :from, levelTo = :to",
                ExpressionAttributeValues={':val': 1, ':initial': 0, ':from': old_level, ':to': new_level },
                ExpressionAttributeNames={'#total': 'count'},
                ReturnValues="UPDATED_NEW"
            )
        elif old_level is None and new_level is not None:
            # entered game
            response = transitions_table.update_item(
                Key={'transitionLevels': record_key},
                UpdateExpression="set #total = if_not_exists(#total, :initial) + :val, levelTo = :to", # there's no levelFrom in here (no attribute means it's null here)
                ExpressionAttributeValues={':val': 1, ':initial': 0, ':to': new_level },
                ExpressionAttributeNames={'#total': 'count'},
                ReturnValues="UPDATED_NEW"
            )
        elif new_level is None and old_level is not None:
            # exiting game
            response = transitions_table.update_item(
                Key={'transitionLevels': record_key},
                UpdateExpression="set #total = if_not_exists(#total, :initial) + :val, levelFrom = :from", # there's no levelTo in here
                ExpressionAttributeValues={':val': 1, ':initial': 0, ':from': old_level },
                ExpressionAttributeNames={'#total': 'count'},
                ReturnValues="UPDATED_NEW"
            )
        else:
            raise ValueError("Unexpected error - level change does not meet transition criteria")
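
For context, here is a minimal sketch of how a stream handler might derive the two arguments before calling write_transition. It assumes the PlayerState items keep the level in a string attribute called `level` (a hypothetical name; the real attribute is whatever part 1 defines) and that the stream is configured to deliver both old and new images. The helper name `levels_from_stream_record` is also made up for illustration.

```python
def levels_from_stream_record(record, attribute="level"):
    """Return (old_level, new_level) for one DynamoDB stream record.

    `attribute` is an assumed attribute name, not taken from the tutorial.
    A missing image (INSERT has no OldImage, REMOVE has no NewImage) maps to None.
    """
    images = record.get("dynamodb", {})

    def level_of(image):
        if not image or attribute not in image:
            return None
        # Stream images use DynamoDB's typed JSON, e.g. {"S": "some-level"}
        return image[attribute].get("S")

    return level_of(images.get("OldImage")), level_of(images.get("NewImage"))


def lambda_handler(event, context):
    transitions = [levels_from_stream_record(r) for r in event.get("Records", [])]
    for old_level, new_level in transitions:
        # In the real Lambda, write_transition(old_level, new_level) would be called here
        print("old = {}, new = {}".format(old_level, new_level))
    return transitions
```

A player entering the game then arrives as `(None, new_level)` and a player leaving as `(old_level, None)`, which are the two special cases write_transition handles.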

The write_transition function makes use of the make_transition_key function, defined as follows:

def make_transition_key(from_level, to_level):
    f=from_level
    t=to_level

    if from_level is None:
        f = ""
    if to_level is None:
        t = ""

    return "{}/{}".format(f,t)

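That is, it returns the from and to levels as a single string value, suitable for use as a key in our Transitions table. A missing level (a player entering or exiting the game) becomes an empty string on that side of the separator.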
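
To make the key format concrete, here is a small runnable example. The function body is a condensed copy of the one above so the snippet runs on its own, and the level names are made up for illustration.

```python
def make_transition_key(from_level, to_level):
    # Same logic as the function above, repeated so this snippet is self-contained
    f = "" if from_level is None else from_level
    t = "" if to_level is None else to_level
    return "{}/{}".format(f, t)


# Hypothetical level names, not real Codecombat levels
print(make_transition_key("level-a", "level-b"))  # level-a/level-b
print(make_transition_key(None, "level-a"))       # /level-a   (entered game)
print(make_transition_key("level-a", None))       # level-a/   (exited game)
```

One thing to keep in mind with this scheme: if a level name could itself contain a `/`, the key would no longer split back unambiguously, which is one reason the levelFrom and levelTo attributes are useful to have alongside it.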

The write_transition function simply increments the appropriate record in the Transitions table. If no record exists yet, update_item creates one: if_not_exists supplies an initial count of 0, which is then incremented, so a new record ends up with a count of 1.
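
The increment semantics can be illustrated with an in-memory stand-in. This is only a model of what the update expression does, using a plain dict instead of DynamoDB; the helper name `apply_increment` is hypothetical.

```python
def apply_increment(table, key, increment=1, initial=0):
    """Mimic `set #total = if_not_exists(#total, :initial) + :val` on a plain dict."""
    item = table.setdefault(key, {})      # update_item creates the item if it is absent
    current = item.get("count", initial)  # if_not_exists(#total, :initial)
    item["count"] = current + increment   # ... + :val
    return item["count"]


table = {}
print(apply_increment(table, "level-a/level-b"))  # 1
print(apply_increment(table, "level-a/level-b"))  # 2
```

In DynamoDB the whole expression is applied to the item as a single update, so the Lambda never needs a separate read before the write.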

As with the LevelState table, we need a way to prune this table periodically. In this case, we have a simple Lambda that:

  1. Flushes the data in the table to S3 (so that it is readily available to the app)
  2. Deletes all the data in the table.

This Lambda is set to run every minutes. The full Lambda can be found [here][FlushTransitionState.py].
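
The linked file is the authoritative version. As a rough sketch of the two steps, assuming a boto3 DynamoDB Table resource and a boto3 S3 client are passed in, the flush could look something like the following. The function name, bucket and object key are placeholders, not taken from the real Lambda.

```python
import json
from decimal import Decimal


def _to_json_number(value):
    # boto3 returns DynamoDB numbers as Decimal, which json cannot serialise directly
    if isinstance(value, Decimal):
        return int(value) if value == value.to_integral_value() else float(value)
    raise TypeError("Cannot serialise {!r}".format(value))


def flush_transitions(table, s3_client, bucket, key):
    """Copy every item in `table` to s3://bucket/key as JSON, then delete the items."""
    items = []
    scan_kwargs = {}
    while True:
        page = table.scan(**scan_kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            break
        scan_kwargs["ExclusiveStartKey"] = last_key  # continue from the previous page

    # Step 1: write the snapshot to S3
    body = json.dumps(items, default=_to_json_number).encode("utf-8")
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)

    # Step 2: delete what was just flushed, using the table's partition key
    with table.batch_writer() as batch:
        for item in items:
            batch.delete_item(Key={"transitionLevels": item["transitionLevels"]})

    return len(items)
```

In a Lambda this might be called as `flush_transitions(boto3.resource("dynamodb").Table("Transitions"), boto3.client("s3"), "my-bucket", "transitions.json")`, where the table, bucket and key names are again assumptions. Note that in this sketch an increment landing between the scan and the delete would be lost, which may or may not matter for a visualization of this kind.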

8. Tips on developing your Lambdas

The AWS Lambda UI provides a practical environment for developing your Lambda, including testing it on individual events before deploying it in production.

As you may have noticed looking at the code, we make extensive use of CloudWatch logging to assist in debugging the application. You’ll see print statements throughout the Lambda: the output of these can be read directly from the CloudWatch logs, making it easy to see whether the application is working as intended and to debug it if not.
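
The Lambdas in this tutorial use plain print statements. As an optional variation, a small helper that prints one JSON object per line can make the logs easier to search. The helper below is a hypothetical sketch, not part of the published code; it relies only on the fact that Lambda forwards whatever the function writes to stdout to its CloudWatch log group.

```python
import json


def log_event(stage, **fields):
    """Print one JSON line describing what the Lambda is doing, and return it."""
    line = json.dumps(dict(stage=stage, **fields), sort_keys=True, default=str)
    print(line)
    return line


log_event("write_transition", old_level="level-a", new_level="level-b")
# {"new_level": "level-b", "old_level": "level-a", "stage": "write_transition"}
```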

9. A recap

As the above hopefully shows, Kinesis + Lambda + Dynamo provides a very powerful toolkit for building potentially very gnarly computations (in this case, count distincts on the number of players in each level) simply and efficiently in real time, as an analytics-on-write application. The full application can be found [here][full-application-on-github].

We hope to build out our library of example analytics-on-write applications over the next few weeks and months, and particularly encourage users of our real-time pipeline to contribute their own recipes to the Discourse.

[codecombat-level-map]:
[codecombat-schema-registry]: https://github.com/snowplow-proservices/com.codecombat-schema-registry
[snowplow-python-analytics-sdk]: https://github.com/snowplow/snowplow-python-analytics-sdk
[SetPlayerState.py]: https://github.com/snowplow-proservices/com.codecombat-analytics-on-write/blob/master/SetPlayerState.py
[PrunePlayerLevel.py]: https://github.com/snowplow-proservices/com.codecombat-analytics-on-write/blob/master/PrunePlayerLevel.py
[FlushTransitionState.py]: https://github.com/snowplow-proservices/com.codecombat-analytics-on-write/blob/master/FlushTransitionState.py
[full-application-on-github]: https://github.com/snowplow-proservices/com.codecombat-analytics-on-write
[level-state-table]: level-table-screenshot.png
[transitions-table-screenshot.png]: transitions-table-screenshot.png


Hi @yali, this is a fantastic tutorial and we are implementing a similar pipeline. One question remains: how did you send the data to your frontend in real time?
Did you use WebSockets, or some other subscription/push model to continuously feed data changes to the frontend?
Thank you

It was a while ago @kuangmichael07 and the Codecombat team built that part of the tech so I’m afraid I can’t answer that… I don’t know if other community members have ideas about how best to do that part?