# Unmarshalling DynamoDB Items from Step Functions

AWS Step Functions introduced two new features: variables and support for [JSONata](https://docs.jsonata.org/overview.html). JSONata is a lightweight query and transformation language for JSON data. Anyone who has worked with Step Functions knows that this is a real game-changer! When I heard the news, I immediately saw the potential for many things that previously required a Lambda function but are now achievable “natively” in Step Functions.

A common task in Step Functions workflows is fetching items from DynamoDB. However, the `getItem` and `query` tasks return the data in DynamoDB's raw, “marshalled” format.

Example:

```json
{
  "id": {
    "S": "1"
  },
  "name": {
    "S": "John"
  },
  "address": {
    "M": {
      "street": {
        "S": "123, 5th Avenue"
      },
      "postCode": {
        "S": "5555"
      },
      "city": {
        "S": "New York"
      }
    }
  },
  "age": {
    "N": "32"
  }
}
```

This format is not very practical to work with because you need to know the type of every field and include it in the path (e.g., `$.item.name.S`). Additionally, certain values, such as numbers, are encoded as strings (like `age` above). This makes it harder to perform simple operations like math and comparisons.
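To make the problem concrete, here is a small TypeScript sketch. The `rawItem` object is a trimmed copy of the example above, and the variable names are only illustrative. It shows that every access needs the type key, and that a number stored as a string compares incorrectly until it is converted.

```typescript
// Trimmed copy of the raw DynamoDB item shown above.
const rawItem = {
  name: { S: "John" },
  age: { N: "32" },
};

// Every access has to spell out the DynamoDB type descriptor.
const personName: string = rawItem.name.S;

// Numbers arrive as strings, so a naive comparison is lexicographic:
// "32" > "4" is false because "3" sorts before "4".
const naiveComparison: boolean = rawItem.age.N > "4";

// Converting first gives the expected numeric result: 32 > 4 is true.
const numericComparison: boolean = Number(rawItem.age.N) > 4;

console.log(personName, naiveComparison, numericComparison);
```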

With the arrival of JSONata, I started wondering if we could use the [Function Library](https://docs.jsonata.org/string-functions) to "visit" and decode DynamoDB objects (a.k.a. unmarshall them).

## Step One: A Simple Proof of Concept

Before getting into the nitty-gritty of Step Functions, I first wanted to build a quick proof of concept to see whether JSONata was up to the task. Luckily, JSONata has a handy [playground](https://try.jsonata.org/) to try things out. After some time, I came up with this simple solution:

```javascript
(
  $unmarshall := function ($object) {(
      $type($object) = 'array'
        ? [$map($object, $unmarshall)]
        : $merge($each($object, function ($val, $key) {
            { $key: $convertValue($val) }
        })
      );
  )};
  
  $convertValue := function ($object) {(
    $type := $keys($object)[0];
    $value := $lookup($object, $type);
  
    $type in ['S', 'SS', 'Ss', 'B', 'BS', 'Bs'] ?  $value
        : $type in ['N'] ? $number($value)
        : $type in ['M'] ? $unmarshall($value)
        : $type in ['BOOL', 'Bool'] ? $value = 'true' or $value = true
        : $type in ['L'] ? [$map($value, $convertValue)]
        : $type in ['NS', 'Ns'] ? [$value.$number()]
        : $type in ['NULL', 'Null', 'Nul'] ? null
        : $error('Unsupported type: ' & $type);
  )};
  
  $unmarshall($);
)
```

**What’s going on in there?**

`$unmarshall` is a function that takes an object or an array as input. It visits every attribute and converts each nested type object into a native value with `$convertValue`, which reads the single key of the object (e.g. `S` or `N`) to decide how to decode the value. It does so recursively for nested lists and maps.

The final result is very similar to the [JavaScript version](https://github.com/aws/aws-sdk-js-v3/blob/main/packages/util-dynamodb/src/convertToNative.ts).

[Try it yourself!](https://try.jsonata.org/yG-465u5K)
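For readers more comfortable with TypeScript than JSONata, the same recursion can be sketched roughly as follows. This is a simplified illustration, not the SDK's implementation: the `AttributeValue` and `RawItem` types are assumptions made for the example, it only accepts a single item (not an array of items), and it omits the alternate spellings such as `Ss` or `Bool` that the JSONata version tolerates.

```typescript
// A DynamoDB attribute value: an object with a single type-descriptor key.
type AttributeValue = Record<string, unknown>;
type RawItem = Record<string, AttributeValue>;

// Convert one typed attribute (e.g. { N: "32" }) into a native value.
function convertValue(attribute: AttributeValue): unknown {
  const type = Object.keys(attribute)[0];
  const value = attribute[type];

  switch (type) {
    case "S":
    case "SS":
    case "B":
    case "BS":
      return value; // strings and binary values are passed through
    case "N":
      return Number(value);
    case "NS":
      return (value as string[]).map(Number);
    case "BOOL":
      return value === true || value === "true";
    case "NULL":
      return null;
    case "M":
      return unmarshall(value as RawItem); // recurse into nested maps
    case "L":
      return (value as AttributeValue[]).map(convertValue); // and lists
    default:
      throw new Error(`Unsupported type: ${type}`);
  }
}

// Visit every attribute of an item and convert it.
function unmarshall(item: RawItem): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const [key, attribute] of Object.entries(item)) {
    result[key] = convertValue(attribute);
  }
  return result;
}

const native = unmarshall({
  id: { S: "1" },
  age: { N: "32" },
  address: { M: { city: { S: "New York" } } },
});
console.log(JSON.stringify(native));
// prints {"id":"1","age":32,"address":{"city":"New York"}}
```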

## Step Two: Use it With Step Functions

After proving it’s doable, the second step was to make it work within Step Functions.

My first attempt was to use the new `Assign` property and store the `$unmarshall` and `$convertValue` functions into variables of the same name. Then I tried to call them from the `Output` property.

```json
{
  "Type": "Pass",
  "QueryLanguage": "JSONata",
  "Assign": {
    "unmarshall": "{% function ($object) { ... } %}",
    "convertValue": "{% function ($object) { ... } %}"
  },
  "Output": {
    "result": "{% $unmarshall($states.input.dyanmoDbItem) %}"
  }
}
```

But this did not work, for two reasons:

1. As mentioned in [this article](https://arc.net/l/quote/lvbezixj) and [the doc](https://docs.aws.amazon.com/step-functions/latest/dg/transforming-data.html#querylanguage-field), you can’t use variables in the same state that assigns them.
    

```json
{
    "error": "States.QueryEvaluationError",
    "cause": "The JSONata expression '$unmarshall($states.input.dyanmoDbItem)' specified for the field 'Output/result' threw an error during evaluation. T1006: Attempted to invoke a non-function"
}
```

This is because the `Assign` and `Output` fields are evaluated in parallel, so the variables do not exist yet when `Output` is computed.

2. Assigning functions to variables is not supported.
    

This is not mentioned anywhere in the doc, but you can’t assign a function to a variable (it must be a “real” value). I learned this when I tried to move the `Assign` part into a previous state to work around the first limitation.

```json
{
    "error": "States.QueryEvaluationError",
    "cause": "The JSONata expression 'function ($object) { ... }' specified for the field 'Assign/unmarshall' returned an unsupported result type."
}
```

After some thought and research, I realized that nothing prevented me from putting everything into a single expression. This expression can define the functions **and** return the final result (just like in the JSONata playground).

And because that expression evaluates to a plain value rather than a function, the result can be stored in a variable, ready to be used later.

```json
{
  "Type": "Pass",
  "QueryLanguage": "JSONata",
  "Assign": {
    "unmarshalledItem": "{% (
  $unmarshall := function ($object) {(
    $type($object) = 'array' ?
      [$map($object, $unmarshall)]
      : $merge($each($object, function ($val, $key) {
            { $key: $convertValue($val) }
        })
    );
  )};

  $convertValue := function ($object) {(
    $type := $keys($object)[0];
    $value := $lookup($object, $type);
  
    $type in ['S', 'SS', 'Ss', 'B', 'BS', 'Bs'] ?  $value
      : $type in ['N'] ? $number($value)
      : $type in ['M'] ? $unmarshall($value)
      : $type in ['BOOL', 'Bool'] ? $value = 'true' or $value = true
      : $type in ['L'] ? [$map($value, $convertValue)]
      : $type in ['NS', 'Ns'] ? [$value.$number()]
      : $type in ['NULL', 'Null', 'Nul'] ? null
      : $error('Unsupported type: ' & $type);
  )};
 
  $unmarshall($states.input.dynamoDbItem);
) %}"
  }
}
```

<div data-node-type="callout">
<div data-node-type="callout-emoji">🗒</div>
<div data-node-type="callout-text">Note: I kept the new lines inside <code>Assign</code> for readability, but for it to be valid JSON/ASL, the expression must go on a single line when deployed to Step Functions.</div>
</div>
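If you build the state definition in code rather than by hand, one way to avoid flattening the expression manually is to let a JSON serializer escape the line breaks for you. The sketch below assumes that an escaped `\n` inside the JSON string is acceptable to Step Functions; the short `expression` and the `doubled` variable are hypothetical stand-ins for the full expression above.

```typescript
// Hypothetical stand-in for the full multi-line JSONata expression.
const expression = `{% (
  $double := function ($n) { $n * 2 };
  $double($states.input.value);
) %}`;

const state = {
  Type: "Pass",
  QueryLanguage: "JSONata",
  Assign: { doubled: expression },
};

// JSON.stringify escapes each line break as \n, so the serialized
// definition is valid JSON with no raw newline inside the string value.
const asl = JSON.stringify(state);
console.log(asl);
```

Parsing the serialized definition gives back the original multi-line expression, so nothing is lost by writing it in the readable form.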

Testing it out:

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1734188827533/9fb02611-c03a-436b-bc58-d6c202e88470.png align="center")

Now I can use the `$unmarshalledItem` variable, which contains the result, anywhere in a later state.

```json
{
  "Type": "Pass",
  "QueryLanguage": "JSONata",
  "Output": {
    "unmarshalledItem": "{% $unmarshalledItem %}"
  }
}
```

Alternatively, I could return the result directly in the `Output`:

```json
{
  "Type": "Pass",
  "QueryLanguage": "JSONata",
  "Output": {
    "unmarshalledItem": "{% (
  $unmarshall := function ($object) {(
    $type($object) = 'array' ?
      [$map($object, $unmarshall)]
      : $merge($each($object, function ($val, $key) {
            { $key: $convertValue($val) }
        })
    );
  )};

  $convertValue := function ($object) {(
    $type := $keys($object)[0];
    $value := $lookup($object, $type);
  
    $type in ['S', 'SS', 'Ss', 'B', 'BS', 'Bs'] ?  $value
      : $type in ['N'] ? $number($value)
      : $type in ['M'] ? $unmarshall($value)
      : $type in ['BOOL', 'Bool'] ? $value = 'true' or $value = true
      : $type in ['L'] ? [$map($value, $convertValue)]
      : $type in ['NS', 'Ns'] ? [$value.$number()]
      : $type in ['NULL', 'Null', 'Nul'] ? null
      : $error('Unsupported type: ' & $type);
  )};
 
  $unmarshall($states.input.dynamoDbItem);
) %}"
  }
}
```

## Step Three: Create a CDK Construct

With a working proof of concept in hand, I wanted to wrap it all in a simple, reusable CDK construct:

```typescript
import { CustomState } from "aws-cdk-lib/aws-stepfunctions";
import { Construct } from "constructs";

interface DynamoUnmarshallProps {
  path: string;
  variableName: string;
}

const generateUnmarshall = (path: string) => `{% (
  $unmarshall := function ($object) {(
    $type($object) = 'array' ?
      [$map($object, $unmarshall)]
      : $merge($each($object, function ($val, $key) {
            { $key: $convertValue($val) }
        })
    );
  )};

  $convertValue := function ($object) {(
    $type := $keys($object)[0];
    $value := $lookup($object, $type);
  
    $type in ['S', 'SS', 'Ss', 'B', 'BS', 'Bs'] ?  $value
      : $type in ['N'] ? $number($value)
      : $type in ['M'] ? $unmarshall($value)
      : $type in ['BOOL', 'Bool'] ? $value = 'true' or $value = true
      : $type in ['L'] ? [$map($value, $convertValue)]
      : $type in ['NS', 'Ns'] ? [$value.$number()]
      : $type in ['NULL', 'Null', 'Nul'] ? null
      : $error('Unsupported type: ' & $type);
  )};
 
  $unmarshall(${path});
) %}`;

export class DynamoUnmarshall extends CustomState {
  constructor(scope: Construct, id: string, props: DynamoUnmarshallProps) {
    const { path, variableName } = props;

    super(scope, id, {
      stateJson: {
        Type: "Pass",
        QueryLanguage: "JSONata",
        Assign: {
          [variableName]: generateUnmarshall(path),
        },
      },
    });
  }
}
```

The construct takes two input parameters:

* `path`: the JSONata path of the raw DynamoDB item to unmarshall
    
* `variableName`: the name of the variable in which to store the result
    

```typescript
const unmarshall = new DynamoUnmarshall(this, "Unmarshall", {
  path: "$states.input.dynamoDbItem",
  variableName: "unmarshalledItem",
});
```

You can find a [fully working example on GitHub](https://github.com/bboure/stepfunction-unmarshall-dynamodb/tree/main).

<div data-node-type="callout">
<div data-node-type="callout-emoji">⚠</div>
<div data-node-type="callout-text">Please note that this was for experimental and learning purposes only. I do not guarantee that this code will work in all cases. I did not test this thoroughly and it should <strong>not</strong> be considered production-ready.</div>
</div>

## Conclusion

Unmarshalling DynamoDB items within Step Functions makes the data easier to work with. Previously, developers had to include field types in their paths or fall back to a Lambda function. Embedding the logic in a reusable CDK construct hides that complexity behind a simple interface.
