How to serve files from an AppSync API using S3 signed URLs

Files with AppSync

AppSync, like all serverless solutions, builds on the “small and quick” response model: both the response size and the time available to assemble the response are capped by fixed limits that cannot be raised. This model does not play well with files that can be arbitrarily large, such as books, videos, or archives. To handle them, we need to rely on a different mechanism: signed URLs.

The idea behind signed URLs is rather simple: the files are stored in a dedicated object store, S3 in the case of AWS, and the API only returns a signed link to each file. Since the API never touches the file contents, its responses stay small and quick, and the heavy lifting moves to the object store. The API still needs to implement access control, but it does not need to handle the bytes themselves.
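
To make the signed link more concrete, the sketch below takes apart a presigned URL. The bucket name, object key, credential, and signature are made up for illustration; only the query parameter names follow the Signature Version 4 format used for presigned requests.

```javascript
// A made-up presigned URL; every value is a placeholder, only the
// parameter names follow the SigV4 query-string format.
const exampleUrl = "https://example-bucket.s3.eu-central-1.amazonaws.com/photos/cat.jpg"
	+ "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
	+ "&X-Amz-Credential=AKIAEXAMPLE%2F20240101%2Feu-central-1%2Fs3%2Faws4_request"
	+ "&X-Amz-Date=20240101T120000Z"
	+ "&X-Amz-Expires=900"
	+ "&X-Amz-SignedHeaders=host"
	+ "&X-Amz-Signature=0123abcd";

// Extract the parts that say which object is addressed, when the URL
// was signed, and for how long it is valid
function describeSignedUrl(url) {
	const {pathname, searchParams} = new URL(url);
	return {
		objectKey: decodeURIComponent(pathname.slice(1)),
		signedAt: searchParams.get("X-Amz-Date"),
		validForSeconds: Number(searchParams.get("X-Amz-Expires")),
		hasSignature: searchParams.has("X-Amz-Signature"),
	};
}

console.log(describeSignedUrl(exampleUrl));
```

Anyone holding such a URL can download the object until it expires, which is why the API has to decide who gets one.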

In an example project, let’s build a photo sharing app running on AppSync where users can upload images and then decide which ones to make public!

Here, the first of user1’s photos is set to private:

[Screenshot: user1's photos]

A different user then can’t see it:

[Screenshot: user1's public photos]

Implementation

The schema defines a url field on the Image type:

type Image {
	key: ID!
	url: AWSURL!
	public: Boolean!
}

And images are accessible through a user object:

type User {
	id: ID!
	username: String!
	images(nextToken: String): PaginatedImages!
}

type PaginatedImages {
	images: [Image!]!
	nextToken: String
}
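
For illustration, a client could request a user's images with a query like the one below. The top-level user field and its id argument are assumptions, as the Query type is not shown here; only the User, PaginatedImages, and Image fields come from the schema above.

```javascript
// Hypothetical query: the top-level "user(id:)" field is assumed,
// it is not part of the schema shown in this article
const query = `
query UserImages($id: ID!, $nextToken: String) {
	user(id: $id) {
		username
		images(nextToken: $nextToken) {
			images { key url public }
			nextToken
		}
	}
}`;

// Build the JSON body of a GraphQL POST request
function buildRequestBody(id, nextToken = null) {
	return JSON.stringify({query, variables: {id, nextToken}});
}

console.log(buildRequestBody("user1"));
```

The client then loads each image directly from the returned url, so the bytes never pass through the API.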

Access control is implemented in the resolver for the User.images field. Here, the request fetches only the user's public images:

export function request(ctx) {
	return {
		version: "2018-05-29",
		operation: "Query",
		index: "useridPublic",
		query: {
			expression: "#useridPublic = :useridPublic",
			expressionNames: {
				"#useridPublic": "userid#public",
			},
			expressionValues: {
				":useridPublic": {S: ctx.source.id + "#true"},
			},
		},
	};
}

There are multiple ways to implement this query; the one above builds on a secondary index. This has the benefit of not leaking information about the number of non-public images, as a simple filtering approach would do.
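
To see why filtering leaks, recall that DynamoDB applies a query's Limit before its filter expression. The simulation below is plain JavaScript rather than a DynamoDB call, and the item list and page size are invented for illustration.

```javascript
// Invented data: the user has 6 images, 2 of them private
const items = [
	{key: "a.jpg", public: true},
	{key: "b.jpg", public: false},
	{key: "c.jpg", public: false},
	{key: "d.jpg", public: true},
	{key: "e.jpg", public: true},
	{key: "f.jpg", public: true},
];

// Mimics a Query with a filter: the page is cut first, then filtered
function filteredPage(all, start, limit) {
	const page = all.slice(start, start + limit);
	return page.filter((item) => item.public);
}

// Mimics a Query on an index that contains only the matching items
function indexedPage(all, start, limit) {
	return all.filter((item) => item.public).slice(start, start + limit);
}

const filtered = [0, 3].map((start) => filteredPage(items, start, 3).length);
const indexed = [0, 3].map((start) => indexedPage(items, start, 3).length);

console.log(filtered); // page sizes with filtering: 1, then 3
console.log(indexed); // page sizes with the index: 3, then 1
```

With filtering, a short page followed by further results tells the caller that hidden items exist. With the index, every page is full until the last one.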

Signing the URL is the job of the Image.url resolver, which calls a Lambda function:

export function request(ctx) {
	return {
		version: "2018-05-29",
		operation: "Invoke",
		payload: {
			type: "download",
			imageKey: ctx.source.key,
		}
	};
}
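
A resolver also needs a response function. The following is a minimal sketch, assuming the Lambda returns an object with a data field, as the handler shown in this article does. Error handling is omitted, and in a resolver file the function would be exported.

```javascript
// Response half of the Image.url resolver (a sketch; exported in a real resolver).
// ctx.result holds whatever the Lambda function returned.
function response(ctx) {
	return ctx.result.data;
}

// Local check with a hand-made context object
console.log(response({result: {data: "https://example.com/signed"}}));
```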

The Lambda uses the AWS JS SDK v3 to sign a GetObjectCommand and return the result:

import {S3Client, GetObjectCommand} from "@aws-sdk/client-s3";
import {getSignedUrl} from "@aws-sdk/s3-request-presigner";

export const handler = async ({imageKey}) => {
	const client = new S3Client();

	// Round the signing time down to a 5-minute mark so the URL stays stable
	const roundTo = 5 * 60 * 1000; // 5 minutes
	const signingDate = new Date(Math.floor(new Date().getTime() / roundTo) * roundTo);
	const signedUrl = await getSignedUrl(client, new GetObjectCommand({
		Bucket: process.env.Bucket,
		Key: imageKey,
	}), {signingDate});
	return {
		data: signedUrl,
	};
};

Notice that the above code rounds the signingDate argument down to the most recent 5-minute mark. As a result, the URL does not change with every request, so the browser can cache the image instead of downloading it again and again.
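
The rounding can be checked in isolation. The helper below is a hypothetical extraction of the expression used in the handler, and the timestamps are arbitrary.

```javascript
const ROUND_TO_MS = 5 * 60 * 1000; // 5 minutes

// Round a timestamp down to the start of its 5-minute window
function roundedSigningDate(now) {
	return new Date(Math.floor(now.getTime() / ROUND_TO_MS) * ROUND_TO_MS);
}

const first = roundedSigningDate(new Date("2024-01-01T12:01:10Z"));
const second = roundedSigningDate(new Date("2024-01-01T12:04:59Z"));
const third = roundedSigningDate(new Date("2024-01-01T12:05:00Z"));

console.log(first.toISOString()); // 2024-01-01T12:00:00.000Z
console.log(first.getTime() === second.getTime()); // true, same window
console.log(third.toISOString()); // 2024-01-01T12:05:00.000Z
```

Requests signed within the same window share a signingDate and, as long as the same credentials are used, should produce the same URL. Since the expiration counts from the signing date, a URL handed out late in a window has up to five minutes less of its lifetime left.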