Modern fetch and 3 ways to get Buffer output from aws-sdk v3 s3 GetObjectCommand
Many people have trouble getting Buffer output from S3 with the AWS SDK for JavaScript v3's GetObjectCommand. In this post, I cover why this happens, the solutions, and some related information. This post is strongly related to issue #1877.
TLDR: For those who came here for a version that just works, here it is. This is the working version in nodejs. If you are running in the browser, see the details in the rest of the post.
import {GetObjectCommand, S3Client} from '@aws-sdk/client-s3'
import type {Readable} from 'stream'
const s3Client = new S3Client({
apiVersion: '2006-03-01',
region: 'us-west-2',
credentials: {
accessKeyId: '<access key>',
secretAccessKey: '<access secret>',
}
})
const response = await s3Client
.send(new GetObjectCommand({
Key: '<key>',
Bucket: '<bucket>',
}))
const stream = response.Body as Readable
// if you are using node version < 17.5.0
return new Promise<Buffer>((resolve, reject) => {
const chunks: Buffer[] = []
stream.on('data', chunk => chunks.push(chunk))
stream.once('end', () => resolve(Buffer.concat(chunks)))
stream.once('error', reject)
})
// if you are using node version >= 17.5.0
return Buffer.concat(await stream.toArray())
Javascript version (commonjs)
const {GetObjectCommand, S3Client} = require('@aws-sdk/client-s3')
const s3Client = new S3Client({
apiVersion: '2006-03-01',
region: 'us-west-2',
credentials: {
accessKeyId: '<access key>',
secretAccessKey: '<access secret>',
}
})
const response = await s3Client
.send(new GetObjectCommand({
Key: '<key>',
Bucket: '<bucket>',
}))
const stream = response.Body
// if you are using node version < 17.5.0
return new Promise((resolve, reject) => {
const chunks = []
stream.on('data', chunk => chunks.push(chunk))
stream.once('end', () => resolve(Buffer.concat(chunks)))
stream.once('error', reject)
})
// if you are using node version >= 17.5.0
return Buffer.concat(await stream.toArray())
Why this post?
Recently, I migrated my storage from AWS S3 to the DigitalOcean Spaces service to save data transfer costs, which included upgrading the storage adapter for this blog (s3-ghost). At the time of the upgrade, the AWS SDK Javascript v3 looked mature enough, so I decided to upgrade from v2 as well.
Initially, everything went fine and I released the update. However, 2 days after the release, I realized that my blog was dead (actually, it was dead for 2 days until this post). I checked the server log and saw the following error.
The "data" argument must be of type string or an instance of Buffer, TypedArray, or DataView. Received an instance of IncomingMessage
This error happened in the call to the AWS SDK GetObjectCommand. It turned out that getting Buffer output from the SDK command is not trivial, and there is a lot of interesting information I want to share in this post, also for my future reference.
How?
This is a sample code to send a GetObjectCommand request.
import {GetObjectCommand, S3Client} from "@aws-sdk/client-s3";
const s3Client = new S3Client({
apiVersion: '2006-03-01',
region: 'us-west-2',
credentials: {
accessKeyId: '<access key>',
secretAccessKey: '<access secret>',
}
})
const response = await s3Client
.send(new GetObjectCommand({
Key: '<key>',
Bucket: '<bucket>',
}))
const body = response.Body
From the official docs of GetObjectCommandOutput.Body, the body's type is Readable | ReadableStream | Blob. Why these 3 types?
Let's start digging into the source code of the AWS Javascript v3 sdk.
The @aws-sdk/client-s3 package uses @aws-sdk/node-http-handler (source) and @aws-sdk/fetch-http-handler (source) as its requestHandler.
In the browser environment
Looking at the source code of the SDK's @aws-sdk/fetch-http-handler package, the SDK uses the global fetch to send network requests.
Because the global fetch is used, a polyfill is required if your browser does not support fetch. whatwg-fetch is a common choice of polyfill.
In the browser, response.body is a ReadableStream.
Where and why does the Blob type come into the output?
If we look again at the source code of the SDK: when response.body is not available, the SDK returns a blob as a workaround for old browsers/polyfills.
const hasReadableStream = response.body !== undefined;
// Return the response with buffered body
if (!hasReadableStream) {
return response.blob().then(/*...*/);
}
If your browser is recent, you can simply skip the Blob type and cast the output type to Readable | ReadableStream in Typescript.
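If you need shared code that handles all three possible types at runtime, a small narrowing helper can replace the cast. Here is a minimal sketch (the helper name `classifyBody` is mine, not from the SDK); the typeof guards keep it safe in runtimes where Blob or ReadableStream are not defined:

```typescript
// Hypothetical helper (not part of the SDK): classify GetObjectCommandOutput.Body
// at runtime instead of casting. Falls through to 'node-readable' when the body
// is neither a Blob nor a web ReadableStream.
const classifyBody = (body: unknown): 'blob' | 'web-stream' | 'node-readable' => {
  if (typeof Blob !== 'undefined' && body instanceof Blob) return 'blob'
  if (typeof ReadableStream !== 'undefined' && body instanceof ReadableStream) return 'web-stream'
  return 'node-readable'
}
```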
In the node environment
In the node environment, @aws-sdk/node-http-handler is used to send network requests. From the source code of the package:
When the request does not use SSL, nodejs's http module is used. Following the flow of the source code, GetObjectCommandOutput.Body is assigned the http.IncomingMessage instance passed as the first parameter of the callback for the 'response' event on the ClientRequest class.
IncomingMessage extends stream.Readable, which is why we get the Readable type for GetObjectCommandOutput.Body.
This also explains the Received an instance of IncomingMessage error I described at the beginning of this post.
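This class relationship can be checked directly in node (a quick verification, not SDK code):

```typescript
import { IncomingMessage } from 'http'
import { Readable } from 'stream'

// http.IncomingMessage extends stream.Readable, so a Body received in node
// supports the usual stream APIs (events, pipe, and so on).
const isReadableSubclass = IncomingMessage.prototype instanceof Readable
// isReadableSubclass === true
```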
When the request uses SSL, the https module is used instead. However, the nodejs docs do not describe the response/request types in detail; most likely, they are the same as in the http module.
Conclusion
The GetObjectCommandOutput.Body type is:
- In node: Readable (or more precisely, a subclass of Readable, namely IncomingMessage).
- In the browser:
  - If the fetch API in your browser does not support response.body, the Blob type is returned.
  - Otherwise (this is the most common case), the ReadableStream type is returned.
How to handle the output stream
I will introduce 3 ways: an isomorphic way, a node-only way, and a browser-only way.
The isomorphic method
The trick is to use the Response class.
In the node environment, import it with import {Response} from 'node-fetch'.
In the browser environment, the Response object is available in the global scope. Note: you still need to polyfill fetch if your browser does not support the fetch API natively.
const res = new Response(body)
Response is a very handy class that can convert the stream to many types. For example:
// blob type
const blob = await res.blob()
// json
const json = await res.json()
// string
const text = await res.text()
// buffer
const buffer = await res.arrayBuffer() // note: res.buffer() is deprecated
res.arrayBuffer() returns an ArrayBuffer in both node (node-fetch) and the browser (native fetch); in node, wrap it with Buffer.from(await res.arrayBuffer()) if you need a nodejs Buffer.
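Putting it together, here is a minimal isomorphic sketch. It assumes a global Response (browsers, or node >= 18); on older node, import Response from node-fetch as shown above. The helper name `bodyToBytes` is mine, and a node Readable body would additionally need node-fetch's Response (or Readable.toWeb()) rather than the native one:

```typescript
// Sketch: convert a body to bytes via the Response class.
// Accepts anything Response accepts (Blob, web ReadableStream, string, ...).
const bodyToBytes = async (body: BodyInit): Promise<Uint8Array> => {
  const res = new Response(body)
  return new Uint8Array(await res.arrayBuffer())
}
```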
The node-only way
Use this implementation to convert a Readable to a Buffer in the node environment.
import type { Readable } from "stream"
const streamToBuffer = (stream: Readable) => new Promise<Buffer>((resolve, reject) => {
const chunks: Buffer[] = []
stream.on('data', chunk => chunks.push(chunk))
stream.once('end', () => resolve(Buffer.concat(chunks)))
stream.once('error', reject)
})
If you are using nodejs version >= 17.5.0, Readable.toArray provides a shorter version.
import type { Readable } from "stream"
const streamToBuffer = async (stream: Readable) => Buffer.concat(await stream.toArray())
Note that, at the time of this writing (Feb 13, 2022), Readable.toArray is an experimental feature.
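As a quick sanity check, here is the toArray() variant fed with a hypothetical in-memory Readable standing in for an S3 body (requires node >= 17.5.0):

```typescript
import { Readable } from 'stream'

// Feed an in-memory Readable through the toArray() variant and return the
// concatenated result as a string.
const demo = async (): Promise<string> => {
  const fake = Readable.from([Buffer.from('hello, '), Buffer.from('s3')])
  const buf = Buffer.concat(await fake.toArray())
  return buf.toString() // 'hello, s3'
}
```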
The browser-only way
Use this implementation to convert a ReadableStream to a buffer in the browser environment.
// Buffer is a subclass of Uint8Array, so it can be used as a ReadableStream's source
// https://nodejs.org/api/buffer.html
export const concatBuffers = (buffers: Uint8Array[]) => {
const totalLength = buffers.reduce((sum, buffer) => sum + buffer.byteLength, 0)
const result = new Uint8Array(totalLength)
let offset = 0
for (const buffer of buffers) {
result.set(new Uint8Array(buffer), offset)
offset += buffer.byteLength
}
return result
}
const streamToBuffer = async (stream: ReadableStream<Uint8Array>): Promise<Uint8Array> => {
  const reader = stream.getReader()
  const chunks: Uint8Array[] = []
  try {
    while (true) {
      const { done, value } = await reader.read()
      if (done) break
      chunks.push(value!)
    }
  } finally {
    // some safari versions (iOS and macOS) don't support .releaseLock(),
    // hence the optional call
    // https://developer.mozilla.org/en-US/docs/Web/API/ReadableStreamDefaultReader/releaseLock#browser_compatibility
    reader.releaseLock?.()
  }
  return concatBuffers(chunks)
}
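The same reader loop can be exercised outside the browser on node >= 18, where ReadableStream is also global. A self-contained sketch, with a hypothetical in-memory stream standing in for an S3 body:

```typescript
// Collect all chunks of a web ReadableStream with a reader loop.
const readAllChunks = async (stream: ReadableStream<Uint8Array>): Promise<Uint8Array[]> => {
  const reader = stream.getReader()
  const chunks: Uint8Array[] = []
  for (;;) {
    const { done, value } = await reader.read()
    if (done) break
    if (value) chunks.push(value)
  }
  reader.releaseLock()
  return chunks
}

// Hypothetical in-memory stream used for the demonstration.
const demoStream = new ReadableStream<Uint8Array>({
  start(controller) {
    controller.enqueue(new TextEncoder().encode('hel'))
    controller.enqueue(new TextEncoder().encode('lo'))
    controller.close()
  },
})
```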