"dat://" as a protocol for stable references to data stored in various hypercore based data structures

As food for thought, I wanted to share a rough idea that’s been bouncing around in my head with regard to “dat://”.

Earlier this month…

So recently I checked out gnunet, which meant no less than 1 very late night of configuration drama, but which also meant learning about some very cool ideas.

One thing I’m impressed by is gnunet’s URI scheme. gnunet is composed of lots of different modules with names like “fs” and “cadet” and “vpn” and all kinds of stuff and I honestly don’t really know what they all do. But basically each module provides access to different kinds of data.

And gnunet’s URI scheme gives each module quite a bit of freedom to implement its own API. A gnunet URI looks like this:

gnunet://module/identifier

The “identifier” piece can be just about anything, depending on which module you’re talking to. To access a file from gnunet’s fs module, you might use a URI that looks like this:

gnunet://fs/chk/[file hash].[query hash].[file size in bytes]

Although the fs module provides a few different ways to query for files.
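
To make the shape of that identifier concrete, here is a minimal sketch of pulling apart a `gnunet://fs/chk/...` URI using only the format quoted above. The `GnunetChkUri` class, the `parse_gnunet_chk` function and the sample hashes are made up for illustration; this is not gnunet's own parser.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class GnunetChkUri:
    """Hypothetical container for the three dot-separated fields."""
    file_hash: str
    query_hash: str
    size_bytes: int


def parse_gnunet_chk(uri: str) -> GnunetChkUri:
    parts = urlsplit(uri)
    # The module name sits where a hostname would sit in an http URL.
    if parts.scheme != "gnunet" or parts.netloc != "fs":
        raise ValueError("not a gnunet fs URI")
    # Path looks like /chk/<file hash>.<query hash>.<size in bytes>
    kind, _, identifier = parts.path.lstrip("/").partition("/")
    if kind != "chk":
        raise ValueError("only chk identifiers are handled in this sketch")
    file_hash, query_hash, size = identifier.split(".")
    return GnunetChkUri(file_hash, query_hash, int(size))


# Placeholder hashes, not real gnunet values.
print(parse_gnunet_chk("gnunet://fs/chk/AAAA.BBBB.42"))
```

The point is only that everything after the module name is opaque to the scheme itself; the fs module alone decides what the dots mean.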

What if dat:// did something similar?

So what if there was something like this for linking data stored in various hypercore based data structures?

As a very literal example, we might imagine URIs like:

  • dat://hypercore/[id]
  • dat://kappa-core/[id]
  • dat://hyperdrive/[id]
  • dat://cabal-core/[id]
  • dat://corestore/[id]

Just to be clear, this example takes very literal inspiration from gnunet://module/id, but it wouldn’t necessarily need to look exactly like gnunet.
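
As a rough sketch of what "each module handles its own addresses" could look like, assuming the literal `dat://module/id` shape above: a small registry maps the module name to a handler, and the handler alone interprets the rest. The `register` and `resolve` helpers and the handler bodies are hypothetical; real modules would open feeds or drives rather than return strings.

```python
from urllib.parse import urlsplit

# Hypothetical registry: module name -> function that interprets the identifier.
HANDLERS = {}


def register(module):
    def decorator(fn):
        HANDLERS[module] = fn
        return fn
    return decorator


@register("hypercore")
def open_hypercore(identifier):
    return f"hypercore feed {identifier}"


@register("hyperdrive")
def open_hyperdrive(identifier):
    return f"hyperdrive archive {identifier}"


def resolve(uri):
    parts = urlsplit(uri)
    if parts.scheme != "dat":
        raise ValueError("not a dat URI")
    handler = HANDLERS.get(parts.netloc)
    if handler is None:
        raise LookupError(f"no handler registered for module {parts.netloc!r}")
    # Everything after the module name is passed through untouched.
    return handler(parts.path.lstrip("/"))


print(resolve("dat://hypercore/abc123"))
```

A new hypercore based module would only need to register itself; nothing else in the resolver changes.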

All of these modules have their own ways of dealing with addresses. Some of them will also loop in more hypercores or other hypercore based data structures.

And of course these modules are all evolving. But this would open things up so that each module is responsible for maintaining its own backwards compatibility story.

It’d also open up some room for new hypercore based modules to enter the picture, as time moves into the future (inevitable).

What about everyone just making their own protocols?

During the last comm-comms meeting, @rangermauve helped me realize that it’s actually pretty awesome that people in various hypercore based projects are just going ahead and making their own protocols.

This idea isn’t about discouraging developers from making their own protocols. Rather it’s about attempting to make dat:// URIs both (1) more explicit about the data they point at, (2) more inclusive of the many different hypercore based data structures people in the community are building and (3) more stable.

In current operating systems there’s a very strong coupling of protocols and applications. Protocols end up being pretty strongly suggestive of how to actually use the URI. This is great for linking to a specific way of interacting with data, but maybe it’s not always so great for referencing underlying data in a stable, interaction-agnostic way.

Thoughts?


I like the idea.
In our project (https://twitter.com/datdotorg), one major side goal is to make “dat addresses” available in regular browsers, so people can build apps on dat infrastructure using any web browser they want. That means many of the addresses in use might not even show up in the main address bar.

Instead, an app, whether served via http(s) or in beaker browser via dat:// (or maybe hyper:// ??), is always a mashup of many hypercores using many hypercore based data structures. So the only so-called “main url” would probably be the app itself, which ideally does not include any data, but serves the app (as a function) to view and/or edit other data.

All data urls in our current opinion should just use the well established “file extension” at the end of the url (e.g. .hyper? …or .html or .json, or whatever)

But app urls, which is what websites would become, and which would load one or more data urls to display a mashup of them, might follow different conventions.

In fact, I would love to bookmark websites which are such dat app urls to assign them as the default apps for my data urls similar to how i do that on traditional operating systems (windows, linux, macosx, …)


So what would be cool is if any app url indicated what kinds of data urls it supports, so I know what it can be used for. I don’t know all the details of the unwalled.garden project, but going by what I perceive to be the spirit of that project, it could maybe be used not only to specify “data file extensions” but also be re-used in app urls, so it’s totally clear which kinds of data those apps are guaranteed to support.


update

So if there was a standard “mapping hypercore” as a best practice for how people (e.g. in beaker browser or similar environments) choose which kinds of data URIs they want to open with which default app URI, then that could be crawled. People could of course filter it to include only mappings from people they care about, and then sort it to see which app URIs are most commonly used for certain data URIs :)
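
A tiny sketch of the crawl, filter and sort idea, assuming each crawled mapping is a record with an author, a data type (here a file extension) and a preferred app URI. The record layout, the `popular_apps` function and all the sample values are invented for illustration; no such “mapping hypercore” format exists yet.

```python
from collections import Counter


def popular_apps(mappings, trusted, data_type):
    """Count app choices for one data type, using only trusted authors."""
    votes = Counter(
        m["app"]
        for m in mappings
        if m["author"] in trusted and m["data_type"] == data_type
    )
    # Most commonly chosen app first.
    return votes.most_common()


# Hypothetical crawled mappings.
crawled = [
    {"author": "alice", "data_type": ".json", "app": "dat://app-one"},
    {"author": "bob", "data_type": ".json", "app": "dat://app-one"},
    {"author": "carol", "data_type": ".json", "app": "dat://app-two"},
    {"author": "mallory", "data_type": ".json", "app": "dat://app-three"},
    {"author": "alice", "data_type": ".html", "app": "dat://app-two"},
]

print(popular_apps(crawled, trusted={"alice", "bob", "carol"}, data_type=".json"))
# → [('dat://app-one', 2), ('dat://app-two', 1)]
```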

This goes a bit against how URLs are constructed (at least on the web), and thus has consequences for how web browsers would use these. Under the URL Spec, after the protocol section, there is an optional ‘authority’ section (preceded by two slashes). For HTTP, this is where the hostname goes, and semantically it indicates where to look for the content (i.e. from the web server at the address and port given). On the web, the scheme plus this authority is called the ‘origin’, and a large portion of web security and privacy depends on it to partition browser storage between different origins.

In Dat, the authority or origin is a content address: the archive’s address on the network. This matches up nicely to the web model, allowing each dat site to have its own localStorage, cookie store, etc. This is something that IPFS struggles with, as a lot of content is loaded over gateways, which means that sites share an origin with each other.

This specific proposal would therefore cause issues by making the origin equal to the sub-protocol in use. Also, as you mention, there is usually a 1-to-1 mapping of protocols to applications, meaning that if everything is grouped under dat: then you have to have one application that can handle every dat sub-module. Even if that is desirable and feasible, going the other way is much more flexible and configurable: one application could handle all the different application protocols in the ecosystem.
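
The origin problem can be seen with a generic URL parser. The sketch below treats the origin as just (scheme, authority), which is a simplification of what browsers do, and the keys are placeholders.

```python
from urllib.parse import urlsplit


def origin(uri):
    """Simplified origin: scheme plus the authority section after the //."""
    parts = urlsplit(uri)
    return (parts.scheme, parts.netloc)


# Current style: the archive key is the authority, so each site is its own origin.
site_a = origin("dat://aaaa1111/index.html")
site_b = origin("dat://bbbb2222/index.html")
print(site_a == site_b)  # False

# Proposed style: the module name is the authority, so two unrelated
# archives end up under the same origin.
drive_a = origin("dat://hyperdrive/aaaa1111/index.html")
drive_b = origin("dat://hyperdrive/bbbb2222/index.html")
print(drive_a == drive_b)  # True
```

With a shared origin, anything partitioned per origin (localStorage, cookies) would be visible across every hyperdrive site.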


+1 to Sam’s point about origins in URLs.

On the call I think I mentioned that we could do something like cabal+dat://somekey/whatever or hyper+dat://somekey/whatever

This is kinda how the registerProtocolHandler API supports a web+whatever protocol scheme to allow for “non standard” schemes.
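
A quick sketch of how a compound scheme like that could be read, keeping the key in the authority position so origins stay per-archive. The `parse_compound` helper and its field names are assumptions for illustration, not part of any dat tooling.

```python
from urllib.parse import urlsplit


def parse_compound(uri):
    parts = urlsplit(uri)
    # "cabal+dat" -> data structure "cabal", transport "dat".
    structure, plus, transport = parts.scheme.partition("+")
    if not plus:
        # Plain scheme such as "dat": no data structure prefix given.
        structure, transport = None, parts.scheme
    return {
        "structure": structure,
        "transport": transport,
        "key": parts.netloc,  # still the authority, so it still defines the origin
        "path": parts.path,
    }


print(parse_compound("cabal+dat://somekey/whatever"))
print(parse_compound("hyper+dat://somekey/whatever"))
```

The data structure hint moves into the scheme, and the part after the two slashes remains the archive key.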
