What is GemDrive?

GemDrive is a protocol and reference implementation for bringing remote filesystem functionality to web browsers and other HTTP clients. It shares similarities with WebDAV, NFS, FTP, Inrupt Solid, remoteStorage, Amazon S3, Google Drive, etc. For detailed comparisons with other tools, see this page.

GemDrive grew out of a combination of needs I had for managing my own data and needs I ran into at my day job working with genomics web apps and datasets.

It is currently targeted at developers and self-hosters, but will hopefully be useful to a wider audience in the long run.

Why does it exist?

The goal of GemDrive is to provide a "hard drive" for the web. That means giving web apps a way to read from and write to a server much as they would a normal filesystem (files and folders), rather than through the database query paradigm normally used on the web. If you've ever used an app that can store your data in Google Drive, GemDrive is the same idea, but much simpler, open source, and with extra features that enable some really cool stuff.

To understand some of the motivations behind GemDrive, consider how you would accomplish the following with tools available today:

I'm certain there are nice solutions to some of these problems (and would appreciate people letting me know about them). GemDrive is pretty good at solving all of them.

How is it implemented?

The central tenet of GemDrive is to be as simple as reasonable.

The protocol seeks to add the minimal necessary layer on top of standard HTTP to do its job.

The following HTTP request types already behave in fairly intuitive ways for filesystem operations:

However, a few pieces are still missing/underspecified:

GemDrive simply provides opinionated answers for these pieces.

What does it look like?

Here's a taste of GemDrive in action. For a complete description, see the protocol page.

Note that, for brevity, the requests below omit the usual auth. When the server starts up, it prints a master key, which must then be included in either the Authorization: Bearer <token> header or the access_token query parameter.
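
As a rough illustration of the two ways to pass the key, here is a minimal Python sketch that builds (but does not send) a request each way. The helper names and the "my-master-key" value are placeholders invented for this example; only the header name, the query parameter name, and the port come from this README.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:3838"  # port taken from the examples in this README


def with_bearer_header(path: str, token: str) -> Request:
    """Build a request that carries the key in the Authorization header."""
    return Request(BASE_URL + path, headers={"Authorization": f"Bearer {token}"})


def with_query_token(path: str, token: str) -> Request:
    """Build a request that carries the key in the access_token query parameter."""
    return Request(f"{BASE_URL}{path}?{urlencode({'access_token': token})}")


# "my-master-key" is a placeholder; use the key the server prints at startup.
header_req = with_bearer_header("/dir1/f1.txt", "my-master-key")
query_req = with_query_token("/dir1/f1.txt", "my-master-key")
print(header_req.get_header("Authorization"))
print(query_req.full_url)
```

Either request object can then be passed to urllib.request.urlopen once a server is running. The header form keeps the key out of URLs, which tend to end up in logs and browser history.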

Alternatively, you can mark a directory or file as public. We'll assume that has already been done for the following requests, although it's something you would almost never want to do, even on localhost.

Preliminaries

Before running the examples, create some dummy data to work with:

mkdir -p dir1/subdir1
echo Hi there > dir1/f1.txt
mkdir dir2
echo Hello there > dir2/f2.txt

You should now have the following directory structure on disk:

dir1/
    f1.txt
    subdir1/
dir2/
    f2.txt

Start a local server

./gemdrive-server -dir dir1 -dir dir2

List the root directory

curl localhost:3838/gemdrive/index/list.json
{
  "children": {
    "dir1/": {},
    "dir2/": {}
  }
}

List a subdirectory

curl localhost:3838/gemdrive/index/dir1/list.json
{
  "children": {
    "f1.txt": {
      "size": 9,
      "modTime": "2021-08-18T03:10:16Z"
    },
    "subdir1/": {
      "size": 4096,
      "modTime": "2021-08-18T03:09:59Z"
    }
  }
}
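
A client will usually want to tell directories from files in a listing. The sketch below assumes, based only on the sample responses above, that directory names end in "/" and that entries live under a "children" key; the function name is made up for this example, and the protocol page remains the authority on the format.

```python
import json

# Sample response copied from the dir1 listing above.
LISTING = """
{
  "children": {
    "f1.txt": {"size": 9, "modTime": "2021-08-18T03:10:16Z"},
    "subdir1/": {"size": 4096, "modTime": "2021-08-18T03:09:59Z"}
  }
}
"""


def split_children(listing_text: str) -> tuple[list[str], list[str]]:
    """Return (directories, files), treating a trailing "/" as a directory marker."""
    children = json.loads(listing_text).get("children", {})
    dirs = sorted(name for name in children if name.endswith("/"))
    files = sorted(name for name in children if not name.endswith("/"))
    return dirs, files


dirs, files = split_children(LISTING)
print(dirs)   # ['subdir1/']
print(files)  # ['f1.txt']
```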

Read a file

curl localhost:3838/dir1/f1.txt
Hi there

Read part of a file

curl -H "Range: bytes=0-3" localhost:3838/dir1/f1.txt
Hi t
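
HTTP byte ranges are inclusive at both ends, which is why bytes=0-3 returns four bytes. The following Python sketch builds such a header and shows the slice a range-honoring server would return; both helper names are hypothetical and nothing here talks to a server.

```python
def range_header(first: int, last: int) -> dict[str, str]:
    """Build a Range header for bytes first..last, both ends inclusive."""
    if first < 0 or last < first:
        raise ValueError("need 0 <= first <= last")
    return {"Range": f"bytes={first}-{last}"}


def expected_slice(content: bytes, first: int, last: int) -> bytes:
    """Bytes a server honoring the range would return (last is inclusive)."""
    return content[first : last + 1]


content = b"Hi there\n"  # what `echo Hi there` wrote to dir1/f1.txt
print(range_header(0, 3))
print(expected_slice(content, 0, 3))
```

The trailing newline added by echo is also why the listing reports a size of 9 for f1.txt rather than 8.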

Upload a file

curl -X PUT 'localhost:3838/dir1/f1.txt?overwrite=true' -d 'New text'
curl localhost:3838/dir1/f1.txt
New text

A few things to note: