
Storage Abstraction


Provides an abstraction layer for interacting with a storage service; the storage can be on a local file system or in the cloud. Supported cloud storage providers are:

  • Amazon S3
  • Google Cloud
  • Azure Blob
  • MinIO (native and S3)
  • Backblaze B2 (native and S3)
  • CloudFlare R2 (S3)
  • Cubbit (S3)

The storage abstraction API only supports basic storage operations (see below) so the API can be cloud agnostic. This means that you can develop your application using local disk storage and then use for instance Google Cloud or Amazon S3 in your production environment without the need to change any code.

Table of contents

Documentation

  1. How it works
  2. Instantiate a storage
     a. Configuration object
     b. Configuration URL
     c. How bucketName is used
  3. Adapters
  4. Adding an adapter
     a. Add your storage type
     b. Define your configuration
     c. Adapter class
     d. Adapter function
     e. Register your adapter
     f. Adding your adapter code to this package
  5. Tests
  6. Example application
  7. Questions and requests

Adapter API

  • listBuckets
  • listFiles
  • bucketIsPublic
  • bucketExists
  • fileExists
  • createBucket
  • clearBucket
  • deleteBucket
  • addFile
  • addFileFromPath
  • addFileFromBuffer
  • addFileFromStream
  • getPresignedUploadURL
  • getPublicURL
  • getSignedURL
  • getFileAsStream
  • removeFile
  • sizeOf

Introspective API

  • getProvider
  • getConfiguration
  • getConfigurationError
  • getServiceClient
  • getSelectedBucket
  • setSelectedBucket

Storage API

  • getAdapter
  • switchAdapter

How it works

A Storage instance is a thin wrapper around one of the available adapters. These adapters are available as separate packages on npm. This way your code base stays as slim as possible because you only have to add the adapter(s) that you need to your project.

The adapters themselves are wrappers around the cloud storage provider specific client SDKs, e.g. the AWS SDK.

The list of available adapters can be found in the Adapters section below.

It is important to know that S3 and Amazon S3 are different adapters; Amazon S3 only supports Amazon S3 and strictly compatible providers, whereas S3 supports Amazon S3 and the partially compatible providers Cubbit, Cloudflare, MinIO S3 and Backblaze S3. The latter adapter exists for historical reasons; originally we tried to create a single adapter for all S3 compatible providers, but unfortunately the implementation of S3 differs quite a bit across 'compatible' providers, which led to a lot of if-else forking in the code, making it hard to read and maintain.

Therefore we decided to write one adapter with a strict Amazon S3 implementation and then write a separate adapter for every S3 compatible provider, extending the Amazon S3 adapter and overriding the methods that had to be implemented differently. So far this has led to 4 new adapters that extend the strict Amazon S3 adapter:

  • AdapterBackblazeS3
  • AdapterCloudflareS3
  • AdapterCubbitS3
  • AdapterMinioS3

Note

The adapter AdapterS3 is based on the code of the former version of AdapterAmazonS3. This means that if you are using a 1.x.x version of AdapterAmazonS3 and you want to upgrade, you need to switch to AdapterS3 or use the S3 adapter that matches your provider.

Note

You might wonder why all adapter names start with sab-adapter; sab is short for StorageABstraction. Sometimes we may refer to this library as sab or sablib.

When you create a Storage instance it creates an instance of an adapter based on the configuration object or url that you provide. Then all API calls to the Storage are forwarded to this adapter instance; below is a code snippet of the Storage class that shows how createBucket is forwarded:

// member function of class Storage
public async createBucket(name: string): Promise<ResultObject> {
  return this.adapter.createBucket(name);
}

The class Storage implements the interface IAdapter and this interface declares the complete API. Because all adapters have to implement this interface as well, either by extending AbstractAdapter or otherwise, all API calls on Storage can be directly forwarded to the adapters.

The adapter subsequently creates an instance of the cloud storage specific client and this instance handles the actual communication with the cloud service. For instance:

// Amazon S3 adapter
private _client = new S3Client();

// Azure Blob Storage adapter
private _client = new BlobServiceClient();

Therefore, depending on which definitions you use, this library could be seen as a wrapper or a shim.

Instantiate a storage

const s = new Storage(config);

When you create a new Storage instance the config argument is used to instantiate the right adapter. You can provide the config argument in 2 forms:

  1. using a configuration object (js: typeof === "object" ts: AdapterConfig)
  2. using a configuration URL (typeof === "string")

Internally the configuration URL will be converted to a configuration object so any rule that applies to a configuration object applies to configuration URLs as well.
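For example, a minimal sketch of both forms; the directory, bucket and credential values are made up and the import of the Provider enum from the package is an assumption:

import { Provider, Storage } from "@tweedegolf/storage-abstraction";

// 1. configuration object; keys besides `provider` depend on the adapter
const s1 = new Storage({
  provider: Provider.LOCAL,
  directory: "./your_working_dir",
  bucketName: "the-buck",
});

// 2. configuration URL; parsed internally into a configuration object
const s2 = new Storage("aws://key:secret@the-buck?region=auto");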

The configuration must at least specify a provider; the provider is used to determine which adapter should be created. Note that the adapters are not included in the Storage Abstraction package so you have to add them to your project's package.json before you can use them.

The value of provider is one of the members of the enum Provider:

export enum Provider {
  NONE = "none",          // initial value for the abstract adapter, don't use this one
  LOCAL = "local",        // adapter for local storage (ideal for testing)
  GCS = "gcs",            // Google Cloud Storage
  GS = "gs",              // Google Cloud Storage
  S3 = "s3",              // Amazon S3 and (partly) S3 compatible providers Cubbit, Cloudflare, Minio and Backblaze
  AWS = "aws",            // Amazon S3 or providers that are fully S3 compatible)
  AZURE = "azure",        // Azure Storage Blob
  B2 = "b2",              // BackBlaze B2 using native API with AdapterBackblazeB2
  BACKBLAZE = "b2",       // BackBlaze B2 using native API with AdapterBackblazeB2
  B2_S3 = "b2-s3",        // Backblaze B2 using S3 API with AdapterAmazonS3
  BACKBLAZE_S3 = "b2-s3", // Backblaze B2 using S3 API with AdapterAmazonS3
  MINIO = "minio",        // Minio using native API with AdapterMinio
  MINIO_S3 = "minio-s3",  // Minio using S3 API with AdapterAmazonS3
  CUBBIT = "cubbit",      // Cubbit uses S3 API with AdapterAmazonS3  
  R2 = "r2",              // Cloudflare R2 uses S3 API with AdapterAmazonS3    
  CLOUDFLARE = "r2",      // Cloudflare R2 uses S3 API with AdapterAmazonS3   
}

The Storage instance is only interested in the provider so it checks if the provider is valid and then passes the rest of the configuration on to the adapter constructor. It is the responsibility of the adapter to perform further checks on the configuration, i.e. whether all mandatory values such as credentials or an endpoint are provided.

Note

Some adapters have two entries, for instance the keys Provider.BACKBLAZE and Provider.B2 both have the value b2 and both use AdapterBackblazeB2.

Note

Although there are 16 keys in the enum, there are only 11 adapters supporting 7 different cloud storage providers. The providers MinIO and Backblaze B2 both have a native API and support for the S3 API.

Configuration object

To enforce that the configuration object contains a provider key, the Storage class expects the configuration object to be of type StorageAdapterConfig:

interface AdapterConfig {
  bucketName?: string;
  [id: string]: any; // any service specific mandatory or optional key
}

interface StorageAdapterConfig extends AdapterConfig {
  provider: Provider;
}

Besides the mandatory key provider, one or more keys may be mandatory or optional depending on the provider; for instance keys for passing credentials such as keyFilename for Google Cloud Storage or accessKeyId and secretAccessKey for Amazon S3, and keys for further configuring the storage service such as StoragePipelineOptions for Azure Blob.

Configuration URL

The general format of configuration urls is:

const u = "protocol://username:password@host:port/path/to/object?region=auto&option2=value2...";

For most storage services username and password are the credentials, such as key id and secret but this is not mandatory; you may use these values for other purposes.

The protocol part of the url defines the storage provider and must match the value of one of the members of the Provider enum:

  • local:// → local storage
  • aws:// → Amazon S3 (and fully S3 compatible providers)
  • gcs:// → Google Cloud
  • gs:// → Google Cloud
  • azure:// → Azure Blob Storage
  • s3:// → Amazon S3 (and partly S3 compatibles: Backblaze, Cloudflare, Cubbit and MinIO)
  • minio:// → MinIO
  • minio-s3:// → MinIO S3 API
  • b2:// → Backblaze B2
  • backblaze:// → Backblaze B2
  • b2-s3:// → Backblaze B2 S3 API
  • backblaze-s3:// → Backblaze B2 S3 API
  • r2:// → Cloudflare R2 Storage
  • cloudflare:// → Cloudflare R2 Storage
  • cubbit:// → Cubbit Storage

Note that some providers can be addressed by multiple protocols, e.g. for Google Cloud Storage you can use both gcs and gs.

Also note that s3 and aws are different adapters; aws only supports Amazon S3 and strictly compatible S3 providers whereas s3 supports Amazon S3, Cubbit, Cloudflare, MinIO S3 and Backblaze S3.

The url parser generates a generic object with generic keys that resembles the standard javascript URL object. This object will be converted to the adapter specific AdapterConfig format in the constructor of the adapter. During this conversion the searchParams object is flattened into the config object, for example:

// bucket in the host part, no port
const u = "aws://key:secret@the-buck/path/to/object?region=auto&option2=value2";

// output parser
const p = {
  protocol: "aws",
  username: "key",
  password: "secret",
  host: "the-buck",
  port: null,
  path: "path/to/object",
  searchParams: {
    region: "auto",
    option2: "value2",
  },
};

// AdapterConfigAmazonS3
const c = {
  type: "aws",
  accessKeyId: "key",
  secretAccessKey: "secret",
  bucketName: "the-buck",
  region: "auto",
  option2: "value2",
};

The components of the url represent config parameters, and because not all adapters require the same parameters, not all components of the url are mandatory. When you leave certain components out, the url may be invalid according to the official specification, but the parser will parse it anyway.

// port and bucket
const u = "aws://part1:part2@bucket:9000/path/to/object?region=auto&option2=value2";
const p = {
  protocol: "aws",
  username: "part1",
  part2: "part2",
  host: "bucket",
  port: "9000",
  path: "path/to/object",
  searchParams: { region: "auto", option2: "value2" },
};

// no bucket but with @
const u = "aws://part1:part2@:9000/path/to/object?region=auto&option2=value2";
const p = {
  protocol: "aws",
  username: "part1",
  password: "part2",
  host: null,
  port: "9000",
  path: "path/to/object",
  searchParams: { region: "auto", option2: "value2" },
};

// no bucket
const u = "aws://part1:part2:9000/path/to/object?region=auto&option2=value2";
const p = {
  protocol: "aws",
  username: "part1",
  password: "part2",
  host: null,
  port: "9000",
  path: "path/to/object",
  searchParams: { region: "auto", option2: "value2" },
};

// no credentials, note: @ is mandatory in order to be able to parse the bucket name
const u = "aws://@bucket/path/to/object?region=auto&option2=value2";
const p = {
  protocol: "aws",
  username: null,
  password: null,
  host: "bucket",
  port: null,
  path: "path/to/object",
  searchParams: { region: "auto", option2: "value2" },
};

// no credentials, no bucket
const u = "aws:///path/to/object?region=auto&option2=value2";
const p = {
  protocol: "aws",
  username: "/path/to/object",
  password: null,
  host: null,
  port: null,
  path: null,
  searchParams: { region: "auto", option2: "value2" },
};

// no credentials, no bucket, no extra options (query string)
const u = "aws:///path/to/object";
const p = {
  protocol: "aws",
  username: "/path/to/object",
  password: null,
  host: null,
  port: null,
  path: null,
  searchParams: null,
};

// only protocol
const u = "aws://";
const p = {
  protocol: "aws",
  username: null,
  password: null,
  host: null,
  port: null,
  path: null,
  searchParams: null,
};

// absolutely bare
const u = "aws";
const p = {
  protocol: "aws",
  username: null,
  password: null,
  host: null,
  port: null,
  path: null,
  searchParams: null,
};

How bucketName is used

If you provide a bucket name it will be stored in the state of the Storage instance. This makes it possible, for instance, to add a file to a bucket without specifying the name of the bucket:

storage.addFile("path/to/your/file"); // the file was automatically added to the selected bucket

Note that if the bucket does not exist it will not be created automatically for you when you create a Storage instance! This was the case in earlier versions but as of version 2.0.0 you have to create the bucket yourself using createBucket.
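A short sketch of this flow; the directory, bucket and file names are made up and storage is assumed to be a Storage instance using the local adapter:

const storage = new Storage({
  provider: Provider.LOCAL,
  directory: "./data",
  bucketName: "the-buck", // becomes the selected bucket
});

// as of 2.0.0 the bucket is not created automatically
await storage.createBucket(); // operates on the selected bucket "the-buck"

await storage.addFileFromPath({
  origPath: "path/to/your/file.jpg",
  targetPath: "file.jpg", // no bucketName: the selected bucket is used
});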

Adapters

The adapters are the key part of this library; where the Storage is merely a thin wrapper, adapters perform the actual actions on the cloud storage by translating generic API method calls to storage provider specific calls. The adapters are not part of the Storage Abstraction package; you need to install them separately. See How it works.

A description of each available adapter, including what the configuration objects and URLs look like and what the default values are, can be found in the README of the adapter packages:

  • Local storage: npm i @tweedegolf/sab-adapter-local
  • Amazon S3: npm i @tweedegolf/sab-adapter-amazon-s3
  • Azure Blob: npm i @tweedegolf/sab-adapter-azure-blob
  • Backblaze B2: npm i @tweedegolf/sab-adapter-backblaze-b2
  • Google Cloud: npm i @tweedegolf/sab-adapter-google-cloud
  • MinIO: npm i @tweedegolf/sab-adapter-minio
  • S3: npm i @tweedegolf/sab-adapter-s3
  • Backblaze B2 S3: npm i @tweedegolf/sab-adapter-backblaze-s3
  • Cubbit: npm i @tweedegolf/sab-adapter-cubbit-s3
  • Cloudflare R2: npm i @tweedegolf/sab-adapter-cloudflare-s3
  • MinIO S3: npm i @tweedegolf/sab-adapter-minio-s3

You can also add more adapters yourself very easily, see below.

Note

Note that the S3 adapter supports Amazon, Cubbit, Cloudflare R2 and the S3 API of Backblaze B2 and MinIO. The Amazon S3 adapter only supports Amazon and fully compatible S3 providers. It is recommended to use the provider specific S3 adapter (e.g. for MinIO S3: sab-adapter-minio-s3) instead of the generic S3 adapter.

Adapter Introspect API

These methods can be used to introspect the adapter. Unlike all other methods, these methods do not return a promise but return a value immediately.

getProvider

getProvider(): Provider;

Returns the cloud storage provider, value is a member of the enum Provider.

Also implemented as getter:

const storage = new Storage(config);
console.log(storage.provider);

getSelectedBucket

getSelectedBucket(): null | string

Returns the name of the bucket that you've provided with the config upon instantiation or that you've set afterwards using setSelectedBucket.

Also implemented as getter:

const storage = new Storage(config);
console.log(storage.bucketName);

setSelectedBucket

setSelectedBucket(bucketName: null | string): void

Sets the name of the bucket that will be stored in the local state of the Adapter instance. This overrides the value that you may have provided with the config upon instantiation. You can also clear this value by passing null as argument.

If you use this method to select a bucket you don't have to provide a bucket name when you call any of these methods:

  • createBucket
  • clearBucket
  • deleteBucket
  • bucketExists
  • bucketIsPublic
  • addFile, addFileFromStream, addFileFromBuffer, addFileFromPath
  • getFileAsStream
  • getPublicURL, getSignedURL
  • fileExists
  • removeFile
  • listFiles
  • sizeOf

Also implemented as setter:

const storage = new Storage(config);
storage.bucketName = "the-buck-2";

getConfiguration

getConfiguration(): AdapterConfig

Returns the typed configuration object as provided when the storage was instantiated. If you have provided the configuration in url form, the function will return it as a configuration object.

Also implemented as getter:

const storage = new Storage(config);
console.log(storage.config);

getConfigurationError

getConfigurationError(): string | null

Returns an error message if something has gone wrong with initialization or authorization. Returns null otherwise.

Also implemented as getter:

const storage = new Storage(config);
console.log(storage.configError);

getServiceClient

getServiceClient(): any

Under the hood some adapters create an instance of a service client that actually makes connection with the cloud storage. If that is the case, this method returns the instance of that service client.

For instance in the adapter for Amazon S3 an instance of the S3Client of the AWS SDK v3 is instantiated; this instance will be returned if you call getServiceClient on a storage instance with an S3 adapter.

// inside the Amazon S3 adapter an instance of the S3Client is created. S3Client is part of the aws-sdk
this._client = new S3Client();

This method is particularly handy if you need to make API calls that are not implemented in this library. The example below shows how the CopyObjectCommand is used directly on the service client of the Amazon S3 adapter. The API of the Storage Abstraction does not (yet) offer a method to copy an object that is already stored in the cloud so this can be a way to circumvent that.

import { CopyObjectCommand } from "@aws-sdk/client-s3";

const storage = new Storage(config);
const client = storage.getServiceClient(); // returns an instance of AWS S3Client

const input = {
  Bucket: "destinationbucket",
  CopySource: "/sourcebucket/HappyFacejpg",
  Key: "HappyFaceCopyjpg",
};
const command = new CopyObjectCommand(input);
const response = await client.send(command);

Also implemented as getter:

const storage = new Storage(config);
console.log(storage.serviceClient);

Adapter API

These methods are actually accessing the underlying cloud storage service. All these methods are async and return a promise that always resolves to a ResultObject type or a variant thereof:

export interface ResultObject {
  value: string | null;
  error: string | null;
}

If the call succeeds the error key will be null and the value key will hold the returned value. This can be a simple string "ok", a status message, a warning or for instance an array of bucket names.

In case the call yields an error, the value key will be null and the error key will hold the error message. Usually this is the error message as sent by the cloud storage service so if necessary you can lookup the error message in the documentation of that service to learn more about the error.
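For example, a typical way to handle a result; the bucket name is made up and storage is assumed to be an instantiated Storage:

const r = await storage.createBucket("the-buck");
if (r.error !== null) {
  console.error(r.error); // e.g. the bucket already exists
} else {
  console.log(r.value); // "ok"
}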

listBuckets

listBuckets(): Promise<ResultObjectBuckets>

return type:

export type ResultObjectBuckets = {
  value: Array<string> | null;
  error: string | null;
};

Returns an array with the names of all buckets in the storage.

Note

Depending on the type of storage and the credentials used, you may need extra access rights for this action. E.g. sometimes a user may only be able to access the contents of one single bucket.


listFiles

listFiles(...args:
  [bucketName?: string, numFiles?: number] |
  [numFiles?: number] |
  [bucketName?: string]
): Promise<ResultObjectFiles>;

return type:

export type ResultObjectFiles = {
  error: string | null;
  value: Array<[string, number]> | null;
};

Returns a list of all files in the bucket; for each file a tuple is returned: the first value is the path and the second value is the size of the file. If the call succeeds the value key will hold an array of tuples.

The bucketName arg is optional; if you don't pass a value the selected bucket will be used. The selected bucket is the bucket that you've passed with the config upon instantiation or that you've set afterwards using setSelectedBucket. If no bucket is selected the value of the error key in the result object will be set to "no bucket selected".
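A small sketch of how the returned tuples could be processed; the bucket name and file limit are made up:

const { value, error } = await storage.listFiles("the-buck", 100);
if (error === null && value !== null) {
  for (const [path, size] of value) {
    console.log(`${path}: ${size} bytes`);
  }
}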


bucketIsPublic

bucketIsPublic(bucketName?: string): Promise<ResultObjectBoolean>;

return type:

export type ResultObjectBoolean = {
  error: string | null;
  value: boolean | null;
};

Check if the bucket is publicly accessible.

The bucketName arg is optional; if you don't pass a value the selected bucket will be used. The selected bucket is the bucket that you've passed with the config upon instantiation or that you've set afterwards using setSelectedBucket. If no bucket is selected the value of the error key in the result object will be set to "no bucket selected".

Note

Both Cloudflare R2 and Cubbit do not provide a way to check if a bucket is public. You have to check this in the respective web consoles. Using this method returns an error: "{Cubbit | Cloudflare} does not support checking if a bucket is public, please use the {Cubbit | Cloudflare} web console".

Note

If you are connected to Azure using a SAS token this method will return an error: "This request is not authorized to perform this operation using this permission." Please use any of the other ways to login to Azure if you want to use this method.


bucketExists

bucketExists(bucketName?: string): Promise<ResultObjectBoolean>;

return type:

export type ResultObjectBoolean = {
  error: string | null;
  value: boolean | null;
};

Check whether a bucket exists or not. If the call succeeds the value key will hold a boolean value.

The bucketName arg is optional; if you don't pass a value the selected bucket will be used. The selected bucket is the bucket that you've passed with the config upon instantiation or that you've set afterwards using setSelectedBucket. If no bucket is selected the value of the error key in the result object will be set to "no bucket selected".


fileExists

fileExists(...args:
  [bucketName: string, fileName: string] |
  [fileName: string]
): Promise<ResultObjectBoolean>;

return type:

export type ResultObjectBoolean = {
  error: string | null;
  value: boolean | null;
};

Check whether a file exists or not. If the call succeeds the value key will hold a boolean value.

The bucketName arg is optional; if you don't pass a value the selected bucket will be used. The selected bucket is the bucket that you've passed with the config upon instantiation or that you've set afterwards using setSelectedBucket. If no bucket is selected the value of the error key in the result object will be set to "no bucket selected".


createBucket

createBucket(...args:
  [bucketName?: string, options?: Options] |
  [options?: Options]
): Promise<ResultObject>;

type Options = {
  public?: boolean,
  [anykey: string]: any,
};

return type:

export interface ResultObject {
  value: string | null;
  error: string | null;
}

Creates a new bucket. If successful, value will hold a string "ok". You can provide extra storage-specific settings such as access rights using the options object.

If you want to create a public bucket add a key public to the options object and set its value to true.

By default a bucket is private and has no versioning.

Fails if the bucket already exists. This is done because bucket names must be globally unique so if the bucket already exists it might have been created by someone else and may therefore not be accessible to you.
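For example, creating a private and a public bucket; the bucket names are made up and storage is assumed to be an instantiated Storage:

// private bucket (the default)
await storage.createBucket("sab-test-private");

// public bucket
await storage.createBucket("sab-test-public", { public: true });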

Note

Setting public to true equals access='blob' in Azure Blob Storage; if you want to set your bucket to another access level you add it to the options object:

// set custom access level
createBucket("test", {access: "container"});

Note

Cloudflare R2 and the S3 API of Backblaze don't support creating a public bucket; please use the web console of these services to make your bucket public.

Note

Cloudflare R2 only supports public buckets if you add a custom domain to your bucket, see the documentation on the Cloudflare site. You can add a custom domain to your bucket in the Cloudflare Console.

Note

Cubbit allows you to create a public bucket but if you want the files stored in the bucket to be public as well you need to add {ACL: "public-read"} or {ACL: "public-read-write"} to the options object of addFileFromPath, addFileFromBuffer and addFileFromStream as well:

addFileFromPath({
  bucketName: "test",
  origPath: "path/to/your/file.ext",
  targetPath: "new-name.ext",
  options: {
    ACL: "public-read",
  }
});

Note that adding {ACL: "public-read"} or {ACL: "public-read-write"} also makes files in a private bucket publicly accessible!

If the bucket was created successfully the value key will hold the string "ok".

If you wanted to create a public bucket and the bucket couldn't be made public because Backblaze with S3 API and Cloudflare R2 don't support it, value will hold Bucket ${bucketName} created successfully but you can only make this bucket public using the ${provider} web console.

If the bucket exists or if creating the bucket fails for another reason the error key will hold the error message.

The bucketName arg is optional; if you don't pass a value the selected bucket will be used. The selected bucket is the bucket that you've passed with the config upon instantiation or that you've set afterwards using setSelectedBucket. If no bucket is selected the value of the error key in the result object will be set to "no bucket selected".

Note

Depending on the type of storage and the credentials used, you may need extra access rights for this action. E.g. sometimes a user may only be able to access the contents of one single bucket and has no rights to create a new bucket. Additionally you may not have the rights to create a public bucket.


clearBucket

clearBucket(bucketName?: string): Promise<ResultObject>;

return type:

export interface ResultObject {
  value: string | null;
  error: string | null;
}

Removes all files in the bucket. If the call succeeds the value key will hold the string "ok". Backblaze B2 uses a form of versioning by default that can't be turned off; clearBucket automatically removes all existing versions of the files.

The bucketName arg is optional; if you don't pass a value the selected bucket will be used. The selected bucket is the bucket that you've passed with the config upon instantiation or that you've set afterwards using setSelectedBucket. If no bucket is selected the value of the error key in the result object will be set to "no bucket selected".

Note: depending on the type of storage and the credentials used, you may need extra access rights for this action.


deleteBucket

deleteBucket(bucketName?: string): Promise<ResultObject>;

return type:

export interface ResultObject {
  value: string | null;
  error: string | null;
}

Deletes the bucket and all files in it. If the call succeeds the value key will hold the string "ok". If the deleted bucket was the selected bucket, selected bucket will be set to null.

Does not fail if the bucket doesn't exist.

The bucketName arg is optional; if you don't pass a value the selected bucket will be used. The selected bucket is the bucket that you've passed with the config upon instantiation or that you've set afterwards using setSelectedBucket. If no bucket is selected the value of the error key in the result object will be set to "no bucket selected".

Note: depending on the type of storage and the credentials used, you may need extra access rights for this action.


addFile

addFile(params: FilePathParams | FileStreamParams | FileBufferParams): Promise<ResultObject>;

A generic method that is called under the hood when you call addFileFromPath, addFileFromStream or addFileFromBuffer. It adds a file to a bucket and accepts the file in 3 different ways: as a path, a stream or a buffer, depending on the type of params.

There is no difference between using this method or one of the 3 specific methods. For details about the params object and the return value see the documentation below.


addFileFromPath

addFileFromPath(params: FilePathParams): Promise<ResultObject>;

param type:

export type FilePathParams = {
  bucketName?: string;
  origPath: string;
  targetPath: string;
  options?: {
    [id: string]: any;
    checkIfBucketExists: boolean;
    ACL?: string; // for Cubbit S3
  };
};

return type:

export interface ResultObject {
  value: string | null;
  error: string | null;
}

Copies a file from a local path origPath to the provided path targetPath in the storage. The value for targetPath needs to include at least a file name. You can provide extra storage-specific settings such as access rights using the options object.

By setting checkIfBucketExists to false you can skip the bucket check. This is useful when you have limited the access rights to the bucket.

The key bucketName is optional; if you don't pass a value the selected bucket will be used. The selected bucket is the bucket that you've passed with the config upon instantiation or that you've set afterwards using setSelectedBucket. If no bucket is selected the value of the error key in the result object will hold "no bucket selected".

If the call is successful value will hold the string "ok".

Note

For Cubbit: if you want the files stored in a public bucket to be public as well you need to add {ACL: "public-read"} or {ACL: "public-read-write"} to the options object.


addFileFromBuffer

addFileFromBuffer(params: FileBufferParams): Promise<ResultObject>;

param type:

export type FileBufferParams = {
  bucketName?: string;
  buffer: Buffer;
  targetPath: string;
  options?: {
    [id: string]: any;
    checkIfBucketExists: boolean;
    ACL?: string; // for Cubbit S3
  };
};

return type:

export interface ResultObject {
  value: string | null;
  error: string | null;
}

Copies a buffer to a file in the storage. The value for targetPath needs to include at least a file name. You can provide extra storage-specific settings such as access rights using the options object.

By setting checkIfBucketExists to false you can skip the bucket check. This is useful when you have limited the access rights to the bucket.

The key bucketName is optional; if you don't pass a value the selected bucket will be used. The selected bucket is the bucket that you've passed with the config upon instantiation or that you've set afterwards using setSelectedBucket. If no bucket is selected the value of the error key in the result object will hold "no bucket selected".

If the call is successful value will hold the string "ok".

This method is particularly handy when you want to move uploaded files directly to the storage, for instance when you use Express with Multer's MemoryStorage.
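A hedged sketch of that scenario using Express and Multer with MemoryStorage; the route, the field name and the response handling are assumptions, and storage is assumed to be an instantiated Storage:

import express from "express";
import multer from "multer";

const app = express();
const upload = multer({ storage: multer.memoryStorage() });

app.post("/upload", upload.single("file"), async (req, res) => {
  if (req.file === undefined) {
    return res.status(400).send("no file uploaded");
  }
  // pass the in-memory buffer straight to the storage
  const r = await storage.addFileFromBuffer({
    buffer: req.file.buffer,
    targetPath: req.file.originalname,
  });
  res.json(r);
});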

Note

For Cubbit: if you want the files stored in a public bucket to be public as well you need to add {ACL: "public-read"} or {ACL: "public-read-write"} to the options object.


addFileFromStream

addFileFromStream(params: FileStreamParams): Promise<ResultObject>;

param type:

export type FileStreamParams = {
  bucketName?: string;
  stream: Readable;
  targetPath: string;
  options?: {
    [id: string]: any;
    checkIfBucketExists: boolean;
    ACL?: string // for Cubbit S3
  };
};

return type:

export interface ResultObject {
  value: string | null;
  error: string | null;
}

Allows you to stream a file directly to the storage. The value for targetPath needs to include at least a file name. You can provide extra storage-specific settings such as access rights using the options object.

By setting checkIfBucketExists to false you can skip the bucket check. This is useful when you have limited the access rights to the bucket.

The key bucketName is optional; if you don't pass a value the selected bucket will be used. The selected bucket is the bucket that you've passed with the config upon instantiation or that you've set afterwards using setSelectedBucket. If no bucket is selected the value of the error key in the result object will be set to "no bucket selected".

If the call is successful value will hold the string "ok".

This method is particularly handy when you want to store files while they are being processed; for instance if a user has uploaded a full-size image and you want to store resized versions of this image in the storage; you can pipe the output stream of the resizing process directly to the storage.
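A sketch of that use case; the resize pipeline with sharp is an assumption, any Readable stream works, and storage is assumed to be an instantiated Storage:

import fs from "fs";
import sharp from "sharp";

// pipe a full-size image through a resize transform straight into the storage
const resized = fs.createReadStream("./uploads/full-size.jpg").pipe(sharp().resize(300));

const r = await storage.addFileFromStream({
  stream: resized,
  targetPath: "thumbnails/full-size-300.jpg",
});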

Note

For Cubbit: if you want the files stored in a public bucket to be public as well you need to add {ACL: "public-read"} or {ACL: "public-read-write"} to the options object.


getPresignedUploadURL

getPresignedUploadURL(...args:
  [bucketName: string, fileName: string, options?: Options] |
  [fileName: string, options?: Options]
): Promise<ResultObjectObject>;

Options:

type Options = {
  expiresIn?: number,
  [id: string]: any,
}

return type:

export interface ResultObject {
  value: { url: string, [id: string]: any, } | null;
  error: string | null;
}

Returns a presigned upload URL that you can use to upload a file without having to log in to the cloud storage service.

The key bucketName is optional; if you don't pass a value the selected bucket will be used. The selected bucket is the bucket that you've passed with the config upon instantiation or that you've set afterwards using setSelectedBucket. If no bucket is selected the value of the error key in the result object will be set to "no bucket selected".

The way presigned upload URLs are implemented in the various cloud storage services differs a lot. Below are code examples for the supported services:

Amazon S3, Backblaze S3, Cloudflare R2, Cubbit and MinIO S3

const fileName = "test.jpg";

const r = await storage.getPresignedUploadURL("the-bucket", fileName, {
  expiresIn: 3600, // seconds, default 300
  conditions: [
    ["starts-with", "$key", fileName], // only upload if the name of the uploaded file matches
    ["content-length-range", 1, 25 * 1024 * 1024], // limit upload to 25MB
    ["starts-with", "$Content-Type", "image/"], // only allow images
    { "x-amz-server-side-encryption": "AES256" },
    { "acl": "private" }, // if using ACLs
    ["starts-with", "$x-amz-meta-user", ""], // force certain metadata fields
  ],
  fields: {
    "x-amz-server-side-encryption": "AES256",
    acl: "bucket-owner-full-control",
  },
});

// Process the result in Node 18+ using Node native fetch and FormData:

const {value: {url, fields}} = r;
const form = new FormData();
const fileBuffer = fs.readFileSync("./tests/data/image1.jpg");

Object.entries(fields).forEach(([field, value]) => {
    form.append(field, value as string);
});
form.append("file", new Blob([fileBuffer]), fileName);

const response = await fetch(url, {
    method: 'POST',
    body: form,
});

Azure Blob

const r = await storage.getPresignedUploadURL("the-bucket", "test.jpg", {
  expiresIn: 3600, // seconds, default 300
  startsAt: -60, // seconds, default -60
  permissions: {
    add: true,
    create: true,
    write: true,
  }
});

// Process the result in Node 18+ using Node native fetch PUT:

const {value: {url}} = r;
const fileBuffer = fs.readFileSync("./tests/data/image1.jpg");

const response = await fetch(url, {
    method: 'PUT',
    body: fileBuffer,
    headers: {
        'x-ms-blob-type': 'BlockBlob',
    }
});

Backblaze B2 (native API)

const r = await storage.getPresignedUploadURL("the-bucket");

// Process the result in Node 18+ using Node native fetch POST:

const {value: {url, authToken}} = r;
const fileBuffer = fs.readFileSync("./tests/data/image1.jpg");

const response = await fetch(url, {
    method: 'POST',
    body: fileBuffer,
    headers: {
        "Authorization": authToken,
        "X-Bz-File-Name": "test.jpg", 
        "Content-Type": "image/jpeg",
        "X-Bz-Content-Sha1": crypto.createHash("sha1").update(fileBuffer).digest("hex"),
        "X-Bz-Info-Author": "sab-test" // anything goes
    }
});

Note

You don't have to specify a filename and there are no options such as expiresIn available. The Backblaze B2 upload url is valid for 24 hours by default and this isn't customizable.

Google Cloud Storage

const r = await storage.getPresignedUploadURL("the-bucket", "test.jpg", {
  expiresIn: 3600,    // seconds, default 300
  version: "v4",    // either "v2" or "v4", defaults to "v4"
  action: "write",  // either "write", "read", "delete" or "resumable", defaults to "write"
  contentType: "application/octet-stream", // set content type to match your file type or use the default "application/octet-stream" that works in any case
});

// Process the result in Node 18+ using Node native fetch PUT:

const {value: {url}} = r;
const fileBuffer = fs.readFileSync("./tests/data/image1.jpg");

const response = await fetch(url, {
    method: 'PUT',
    body: fileBuffer,
    headers: {
        "Content-Type": "application/octet-stream" // content type must match with the value specified above!
    }
});

MinIO

const r = await storage.getPresignedUploadURL("the-bucket", "test.jpg", {
  expiresIn: 3600,    // seconds, default 300
});

// Process the result in Node 18+ using Node native fetch PUT:

const {value: {url}} = r;
const fileBuffer = fs.readFileSync("./tests/data/image1.jpg");

const response = await fetch(url, {
    method: 'PUT',
    body: fileBuffer,
    headers: {
        "Content-Type": "application/octet-stream"
    }
});

getPublicURL

getPublicURL(...args:
  [bucketName: string, fileName: string, options?: Options] |
  [fileName: string, options?: Options]
): Promise<ResultObject>;

param type:

type Options = {
  [id: string]: any;
  noCheck?: boolean;
  withoutDirectory?: boolean; // only for the local adapter
};

return type:

export type ResultObject = {
  value: string | null;
  error: string | null;
};

Returns the public url of the file. Returns an error if the bucket is not public.

The bucketName arg is optional; if you don't pass a value the selected bucket will be used. The selected bucket is the bucket that you've passed with the config upon instantiation or that you've set afterwards using setSelectedBucket. If no bucket is selected the value of the error key in the result object will be set to "no bucket selected".

With the noCheck key in the options object set to true you can bypass the check if the bucket is actually public. With noCheck set to true the method will always return a url. The bypass was put in place because Cubbit and Backblaze S3 don't support checking if a bucket is public using their API; you can only check this using the web console of Cubbit and Backblaze respectively. You should only use this bypass if you are sure the bucket is public, otherwise the url returned will be unreachable.
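For example; the bucket and file names are made up:

// only use noCheck if you know the bucket is public, e.g. a bucket made public in the provider's web console
const r = await storage.getPublicURL("the-buck", "image.png", { noCheck: true });
console.log(r.value); // the public url, even though the bucket could not be checked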

The Amazon S3 SDK doesn't have a method to retrieve a public url; instead the url is composed from known data using a cloud service specific template:

  • Amazon: https://${bucket_name}.s3.${region}.amazonaws.com/${file_name}
  • Backblaze S3: https://${bucket_name}.s3.${region}.backblazeb2.com/${file_name}
  • Cloudflare: N/A, see below
  • Cubbit: https://${bucket_name}.s3.cubbit.eu/${file_name}
  • Minio S3: https://${endpoint}/${bucketName}/${fileName}

Although Cloudflare R2 is an S3 compatible storage, this method cannot return a public url because Cloudflare R2 only supports public buckets if you add a custom domain to your bucket, see the documentation on the Cloudflare site. You can add a custom domain to your bucket in the Cloudflare Console and after that you can simply construct the url of the bucket in your own code. You could enable and use the Public Development URL but that is not meant to be used for production. Alternatively, you could use a pre-signed url instead of a public url.

For the local adapter you can use the key withoutDirectory in the options object:

const s = new Storage({
  provider: Provider.LOCAL,
  directory: "./your_working_dir/sub_dir",
  bucketName: "bucketName",
});

const url1 = await s.getPublicURL("bucketName", "fileName.jpg");
// value: your_working_dir/sub_dir/bucketName/fileName.jpg

const url2 = await s.getPublicURL("bucketName", "fileName.jpg", { withoutDirectory: true });
// value: bucketName/fileName.jpg

getSignedURL

getSignedURL(...args:
  [bucketName: string, fileName: string, options?: Options] |
  [fileName: string, options?: Options]
): Promise<ResultObject>;

param type:

export type Options = {
  expiresIn?: number; // number of seconds the url is valid, defaults to a week (604800)
  [id: string]: any;
};

return type:

export type ResultObject = {
  value: string | null;
  error: string | null;
};

Returns a signed url of the file.

The bucketName arg is optional; if you don't pass a value the selected bucket will be used. The selected bucket is the bucket that you've passed with the config upon instantiation or that you've set afterwards using setSelectedBucket. If no bucket is selected the value of the error key in the result object will be set to "no bucket selected".

Because the local adapter does not support signed urls, this method behaves exactly the same as getPublicURL when using the local adapter, see previous section.
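A small sketch; the bucket and file names are made up and storage is assumed to be an instantiated Storage:

// signed url that is valid for one hour instead of the default week
const r = await storage.getSignedURL("the-buck", "image.png", { expiresIn: 3600 });
if (r.error === null) {
  console.log(r.value); // the signed url
}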

Note

If you are connected to Azure using the passwordless option or with a SAS token, you get an error: "Can only generate the SAS when the client is initialized with a shared key credential". Please use any of the other ways to log in to Azure if you want to use this method.


getFileAsStream

getFileAsStream(...args:
  [bucketName: string, fileName: string, options?: StreamOptions] |
  [fileName: string, options?: StreamOptions]
): Promise<ResultObjectStream>;

param type:

export interface StreamOptions extends Options {
  start?: number;
  end?: number;
}

return type:

export type ResultObjectStream = {
  value: Readable | null;
  error: string | null;
};

Returns a file in the storage as a readable stream. You can pass in extra options. If you use the keys start and/or end only the bytes between start and end of the file will be returned.

The bucketName arg is optional; if you don't pass a value the selected bucket will be used. The selected bucket is the bucket that you've passed with the config upon instantiation or that you've set afterwards using setSelectedBucket. If no bucket is selected the value of the error key in the result object will be set to "no bucket selected".

Some examples:

getFileAsStream("bucket-name", "image.png"); // → reads whole file

getFileAsStream("bucket-name", "image.png", {}); // → reads whole file

getFileAsStream("bucket-name", "image.png", { start: 0 }); // → reads whole file

getFileAsStream("bucket-name", "image.png", { start: 0, end: 1999 }); // → reads first 2000 bytes

getFileAsStream("bucket-name", "image.png", { end: 1999 }); // → reads first 2000 bytes

getFileAsStream("bucket-name", "image.png", { start: 2000 }); // → reads file from byte 2000

removeFile

removeFile(...args:
  [bucketName: string, fileName: string] |
  [fileName: string, options?: Options]
): Promise<ResultObject>;

return type:

export interface ResultObject {
  error: string | null;
  value: string | null;
}

Removes a file from the bucket. Does not fail if the file doesn't exist.

The bucketName arg is optional; if you don't pass a value the selected bucket will be used. The selected bucket is the bucket that you've passed with the config upon instantiation or that you've set afterwards using setSelectedBucket. If no bucket is selected the value of the error key in the result object will be set to "no bucket selected".

If the bucket cannot be found, an error will be returned: No bucket ${bucketname} found.

If the call succeeds the value key will hold the string "ok".

If the file cannot be found, value will be: No file ${filename} found in bucket ${bucketname}.


sizeOf

sizeOf(...args:
  [bucketName: string, fileName: string] |
  [fileName: string]
): Promise<ResultObjectNumber>;

return type:

export type ResultObjectNumber = {
  error: string | null;
  value: number | null;
};

Returns the size of a file.

The bucketName arg is optional; if you don't pass a value the selected bucket will be used. The selected bucket is the bucket that you've passed with the config upon instantiation or that you've set afterwards using setSelectedBucket. If no bucket is selected the value of the error key in the result object will be set to "no bucket selected".

If the call succeeds the value key will hold the size of the file.

Storage API

The Storage class has two extra methods besides all methods of the IAdapter interface.

getAdapter

getAdapter(): IAdapter;

// also implemented as getter
const s = new Storage({ provider: Provider.S3 });
const a = s.adapter;

Returns the instance of the Adapter class that this Storage instance is currently using to access a storage service.


switchAdapter

switchAdapter(config: string | AdapterConfig): void;

This method is used to instantiate the right adapter when you create a Storage instance. The method can also be used to switch to another adapter in an existing Storage instance at runtime.

The config parameter is the same type of object or URL that you use to instantiate a Storage. This method can be handy if your application needs a view on multiple storages.
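A sketch of switching at runtime; both configurations are hypothetical:

const s = new Storage({ provider: Provider.LOCAL, directory: "./data" });
// ... later, point the same instance to Amazon S3 using a configuration URL
s.switchAdapter("aws://key:secret@the-buck?region=auto");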

If your application needs to copy over files from one storage service to another, say for instance from Google Cloud to Amazon S3, then it is more convenient to create 2 separate Storage instances:

import { Storage } from "@tweedegolf/storage-abstraction";

const s1 = new Storage({ provider: "s3" });
const s2 = new Storage({ provider: "gcs" });

const { value: stream } = await s1.getFileAsStream("bucketOnAmazon", "some-image.png");

await s2.addFile({
  bucketName: "bucketOnGoogleCloud",
  stream,
  targetPath: "copy-of-some-image.png",
});

Adding an adapter

It is relatively easy to add an adapter for an unsupported cloud service. Note however that many cloud storage services are compatible with Amazon S3 so if that is the case, please check first if the Amazon S3 adapter does the job; it might work right away. However, sometimes even if a storage service is S3 compatible you have to write a separate adapter. For instance: although MinIO is S3 compliant it was necessary to write a separate adapter for MinIO.

If you want to add an adapter you can choose to make your adapter a class or a function; so if you don't like OOP you can implement your adapter using FP or any other coding style or programming paradigm you like.

Your adapter might have additional dependencies such as a service client library, like for instance the aws-sdk as is used in the Amazon S3 adapter. Add these dependencies to the package.json file in the ./publish/YourAdapter folder.

You may want to add your adapter code to this package; in that case please add your dependencies to the package.json file in the root folder of the Storage Abstraction package as well. Your dependencies will not be added to the published Storage Abstraction package because only the files in the publish folder are published and there is a stripped version of the package.json file in the ./publish/Storage folder.

You may also want to add some tests for your adapter and it would be very much appreciated if you could publish your adapter to npm and add your adapter to this README, see this table.

Follow these steps:

  1. Add a new type to the Provider enum in ./src/types/general.ts
  2. Define a configuration object (and a configuration url if you like)
  3. Write your adapter, make sure it implements all API methods
  4. Register your adapter in ./src/adapters.ts
  5. Publish your adapter on npm.
  6. You may also want to add the newly supported cloud storage provider to the keywords array in the package.json file of the Storage Abstraction package (note: there are 2 package.json files for this package, one in the root folder and another in the publish folder)

Add your storage type

You should add the name of your provider to the enum Provider in ./src/types/general.ts. This is not mandatory but may be very handy.

// add your type to the enum
export enum Provider {
  LOCAL = "local",
  GCS = "gcs",      // Google Cloud Storage
  S3 = "s3",        // Amazon S3
  B2 = "b2",        // BackBlaze B2
  AZURE = "azure",  // Microsoft Azure Blob
  MINIO = "minio",
  ...
  YOUR_PROVIDER = "your-provider",
}

Define your configuration

A configuration object type should at least contain a key provider. To enforce this the Storage class expects the config object to be of type StorageAdapterConfig:

export interface AdapterConfig {
  bucketName?: string;
  [id: string]: any; // eslint-disable-line
}

export interface StorageAdapterConfig extends AdapterConfig {
  provider: Provider;
}

For your custom configuration object you can either choose to extend StorageAdapterConfig or AdapterConfig. If you choose the latter you can use your adapter standalone without having to specify a redundant key provider, which is why the configuration objects of all existing adapters extend AdapterConfig.

export interface YourAdapterConfig extends AdapterConfig {
  additionalKey: string,
  ...
}

const s = new Storage({
  provider: Provider.YOUR_PROVIDER, // mandatory for Storage
  key1: "value1", // other mandatory or optional keys that your adapter needs for instantiation
  key2: "value2",
}); // works!

const a = new YourAdapter({
  key1: "value1",
  key2: "value2",
}); // works because provider is not mandatory

Also your configuration URL should at least contain the provider. The name of the provider is used for the protocol part of the URL. Upon instantiation the Storage class checks if a protocol is present on the provided URL.

example:

// your configuration URL
const u = "your-provider://user:pass@bucket_name?option1=value1&...";

You can format the configuration URL completely as you like as long as your adapter has an appropriate function to parse it into the configuration object that your adapter expects. If your url follows the standard URL format you don't need to write a parse function; you can import the parseUrl function from ./src/util.ts.

For more information about configuration URLs please read this section

Adapter class

It is recommended that your adapter class extends AbstractAdapter. If you look at the code you can see that it implements the complete introspective API. getServiceClient returns an any value and getConfiguration returns a generic AdapterConfig object; you may want to override these methods to make them return your adapter specific types.

Note that all API methods that have an optional bucketName arg are implemented as overloaded methods:

  • clearBucket
  • deleteBucket
  • bucketExists
  • bucketIsPublic
  • getPublicURL
  • getSignedURL
  • getFileAsStream
  • fileExists
  • removeFile
  • listFiles
  • sizeOf

The implementation of these methods in the AbstractAdapter handles the overloading part and performs some general checks that apply to all adapters. Then they call the cloud specific protected 'tandem' function that handles the adapter specific logic. The tandem function has the same name with an underscore prefix.

For instance: the implementation of clearBucket in AbstractAdapter checks for a bucketName arg and if it is not provided it looks if there is a selected bucket set. It also checks for configuration errors. Then it calls _clearBucket which should be implemented in your adapter code to handle your cloud storage specific logic. This saves you a lot of hassle and code in your adapter module.
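A hedged sketch of what such a tandem pair could look like in your adapter; the class and the service-client call are made up, only the naming convention follows the description above:

class AdapterYourService extends AbstractAdapter {
  // called by clearBucket() in AbstractAdapter after the generic checks have passed
  protected async _clearBucket(bucketName: string): Promise<ResultObject> {
    try {
      await this._client.removeAllObjects(bucketName); // hypothetical service-client call
      return { value: "ok", error: null };
    } catch (e) {
      return { value: null, error: (e as Error).message };
    }
  }
}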

One other thing to note is the way addFileFromPath, addFileFromBuffer and addFileFromStream are implemented; these are all forwarded to the API function addFile. This function stores files in the storage using 3 different types of origin: a path, a buffer and a stream. Because these ways of storing have a lot in common they are grouped together in a single method.

If you look at addFile you see that just like the overloaded methods mentioned above, the implementation handles some generic logic and then calls _addFile in your adapter code.

The abstract stub methods need to be implemented and the other IAdapter methods can be overridden in your adapter class if necessary. Note that your adapter should not implement the methods getAdapter and switchAdapter; these are part of the Storage API.

You don't necessarily have to extend AbstractAdapter but if you choose not to your class should implement the IAdapter interface. You'll find some configuration parse functions in the separate file ./src/util.ts so you can easily import these in your own class if these are useful for you.

You can use this template as a starting point for your adapter. The template contains a lot of additional documentation per method.

Adapter function

The only requirement for this type of adapter is that your module exports a function createAdapter that takes a configuration object or URL as parameter and returns an object that has the shape or type of the interface IAdapter.

You may want to check if you can use some of the utility functions defined in ./src/util.ts. Also there is a template file that you can use as a starting point for your module.

Register your adapter

The switchAdapter method of Storage parses the provider from the configuration and then creates the appropriate adapter instance. This is done using a lookup table that maps a provider to a tuple that contains the name of the adapter and the path to the adapter module:

export const adapterClasses = {
  s3: ["AdapterAmazonS3", "@tweedegolf/sab-adapter-amazon-s3"],
  your_type: ["AdapterYourService", "@you/sab-adapter-your-service"],
  ...
};

If switchAdapter fails to find the module at the specified path it tries to find it in the source folder by looking for a file that has the same name as your adapter, so in the example above it looks for ./src/AdapterYourService.ts.

Once the module is found it will be loaded at runtime using require(). An error will be thrown if the provider is not declared in the lookup table or if the module cannot be found.

The lookup table is defined in ./src/adapters.ts.

Adding your adapter code to this package

You can create your own adapter in a separate repository and publish it from there to npm. You may also want to add your adapter code to this package, to do this follow these steps:

  1. Place the adapter in the ./src folder
  2. Create a file that contains all your types in the ./src/types folder
  3. Create an index file in the ./src/indexes folder
  4. Create a folder with the same name as your adapter in the ./publish folder
  5. Add a package.json and a README.md file to this folder
  6. Add your adapter to the copy.ts file in the root folder

Tests

If you want to run the tests you have to check out the repository from GitHub and install all dependencies with npm install or yarn install. There are tests for all storage types; note that you may need to add your credentials to a .env file (see the file .env.default and config_urls.md for more explanation) or provide credentials in another way. Also note that some of these tests require credentials that allow creating, deleting and listing buckets.

To run all Jasmine tests consecutively:

npm run test-all

To run the Jasmine tests per provider you can specify it as commandline argument:

# test local disk
npm run jasmine local

# test Amazon S3
npm run jasmine s3

# test Backblaze B2 S3 API
npm run jasmine b2-s3

# and so on

Note that you can specify the provider using the values in the Provider enum, see the enum above.

Note

For some reason you can't run jasmine with ts-node and the current settings in tsconfig.json. So please don't forget to run npm run tsc after you've changed files in the src folder or in the jasmine.ts file itself. You can also run npm run watch to enable compiling on save.

You can find some additional non-Jasmine tests in the file tests/test_runs.ts. Every test is a function that makes a series of API calls to test certain functionality in isolation. At the bottom of this file you'll find the run function where you can comment out the tests you don't want to run.

You can find the API calls in the file tests/api_calls.ts. Every API call is declared in a function with the same name as the API method it is calling, some additional functionality like logging and checking the result is added to the function.

You can select the provider you want to test by passing it as a commandline parameter:

npm test local
npm test s3
npm test b2-s3

Note that the test testPublicBucket tries to create a public bucket. However, creating a public bucket on Cloudflare R2 and on Backblaze B2 when using the S3 adapter is not possible; even if you add {public: true} the created bucket sab-test-public will be private.

You can make the created bucket public using the web console of Cloudflare and Backblaze. You can also create a public bucket sab-test-public before you run the test.

Example application

Note

not yet updated to API 2.0!

A simple application that shows how you can use the storage abstraction package can be found in this repository. It uses Ts.ED and TypeORM and consists of both a backend and a frontend.

Questions and requests

Please let us know if you have any questions and/or requests by creating an issue.