Some thoughts on wallet APIs and payment protocols

Why a standard wallet API?

Right now, Themelio has a mini-ecosystem around on-chain wallets (i.e. software that manages on-chain assets) as follows:

  • melwalletd is a headless daemon that can manage wallets, offering a localhost REST API.
  • melwallet-cli is a command-line interface to melwalletd.
  • Mellis is a GUI wallet, which internally packages melwalletd and talks to it.

However, there’s no clear way for programs other than these three to interact with wallets. This is a problem, because all sorts of apps built on top of Themelio may need access to user funds to e.g. pay gas, interact with on-chain constructs, etc.

To use an existing example, melminter right now sheepishly asks the user to manually transfer funds to a mystery address on first start.

This is pretty bad UX, and it will only get worse for more complex logic. Imagine a user trying to, say, register a TNS name, only to be faced with a prompt like:

  • Transfer exactly 0.129392 MEL to t38r2zs0921ec20t3av47sd6f6p927c58hx06ajdahfer0a9zs6sqg
  • Set additional data to "zr29ejnsdkf"
  • Set covenant to "deadbeefcafebabe1337d00d"
  • Spend custom input "39c51233875c42dcd71a5af9bca0ac3b3c3faaa36929110e34d5c07d47d439f4-2"

That would be a complete disaster.

What do we want instead? We want melminter, the TNS frontend, etc. to automatically pop up Mellis (or whatever other wallet the user might use) and present a graphical confirmation dialogue that the user can interact with.

Some options

There are several ways this can be done:

Option 1: inject a JavaScript API into webpages

This is how MetaMask and most Ethereum-ecosystem wallets work. The MetaMask browser extension injects a JavaScript API into every webpage that calls into the browser extension, similar to how wry IPC calls into native code. On non-browser-extension wallets (like MetaMask mobile), the wallet itself provides a webview-based “dApp browser” that has this API.

Pros:

  • Easy to integrate with webpage dApps
  • Easy to “listen to” from a browser extension; no serialization/deserialization or “protocol” is really required, just JavaScript all the way
  • Similar to Ethereum ecosystem and friendly to existing dApp developers

Cons:

  • Security and privacy can be tricky. The attack surface of the injected API can be fairly large, especially if we consider attacks such as browser engine exploits that dump out JS memory (which may contain secrets, such as authentication tokens, that can bypass API-level checks). Furthermore, any website can fingerprint you as a user by checking whether this API exists.
  • Non-browser-extension wallets are annoying to use.
  • Non-webpage dApps are basically impossible to integrate with wallets.
  • Generally, ossifies a “webpage-centric” ecosystem that’s incompatible with Themelio’s vision of decentralized apps, most of which would not be simple JavaScript frontends over big on-chain constructs.

Option 2: have everybody talk to melwalletd over localhost

I don’t think any other wallet does this, but this is very roughly how things integrate with systems like Tor or IPFS. In this model, a global melwalletd instance runs in the background, and dApps all talk to it in an analogous fashion to how Mellis talks to melwalletd.

Pros:

  • Completely language-agnostic; it can integrate with apps written in any language or environment that has access to localhost, and does not limit accessibility to “webpage dapps”.
  • Easy to use; everybody knows how to speak HTTP.
  • Unifies the wallet integration API with the wallet interface API. dApp developers don’t need to learn different APIs for internally managing wallets (e.g. melminter’s internal wallet) and for invoking user-controlled wallets. APIs that belong to “both” (such as querying the available balance) can just be one endpoint.

Cons:

  • melwalletd needs many API additions, because fundamentally wallet management (which is what melwalletd currently does) is separate from dApps interacting with a wallet. For example, wallet creation / locking / unlocking APIs are unlikely to be used by dApps, while interactive “pop up a prompt in the GUI connected to the daemon” APIs are not going to be used by frontends.
  • These new APIs can be complex to implement; e.g. anything that requires notifying GUI frontends would require a subscription-based API on melwalletd. Preventing e.g. misbehavior if multiple frontends try to subscribe to melwalletd at the same time will also be complex.
  • melwalletd’s security model needs to be revamped, and the API redesigned around it. The current API is not designed to be invoked by untrusted code.
    • For example, a global “locked”/“unlocked” state is horrifically insecure if unlocking a wallet in Mellis also unlocks it for random webpage dApps. CORS whitelisting is also far too coarse-grained a permission scheme.
    • In general, we probably need to build a complex permission system with a significant attack surface.
  • App developers will be hugely tempted to “pollute” the global melwalletd with app-internal wallets (like the first version of melminter did). Good programming guidelines can prevent this, but otherwise it leads to essentially all the problems associated with global variables, and race conditions between different apps stepping on the same melwalletd can be tricky to debug.
  • Extending this model to platforms where running a global, native daemon is difficult or impossible (mobile platforms, browser extensions) would be hard.
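
To make the permission problem concrete, here is a rough sketch (not melwalletd's actual design) contrasting a global unlocked flag with per-client grants. Every name here (GlobalLockDaemon, ScopedDaemon, the client identifiers, the action strings) is hypothetical and only meant to illustrate the difference:

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


class GlobalLockDaemon:
    """Hypothetical daemon with one locked/unlocked flag per wallet."""

    def __init__(self) -> None:
        self.unlocked: Set[str] = set()

    def unlock(self, wallet: str) -> None:
        self.unlocked.add(wallet)

    def may_send(self, client: str, wallet: str) -> bool:
        # The caller's identity is ignored: once anyone unlocks the wallet,
        # every client that can reach the daemon can spend from it.
        return wallet in self.unlocked


@dataclass
class ScopedDaemon:
    """Hypothetical daemon that tracks grants per (client, wallet) pair."""

    grants: Dict[Tuple[str, str], Set[str]] = field(default_factory=dict)

    def grant(self, client: str, wallet: str, action: str) -> None:
        self.grants.setdefault((client, wallet), set()).add(action)

    def may(self, client: str, wallet: str, action: str) -> bool:
        # Only actions explicitly granted to this client are allowed.
        return action in self.grants.get((client, wallet), set())
```

Even this toy version of the scoped model needs a reliable notion of “which client is calling”, plus some UI for granting and revoking, which is where most of the complexity and attack surface would come from.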

Option 3: specify a “wallet protocol” around a custom URL scheme

This was once intended to be the way Bitcoin works (see BIP70), but the proposal did not gain widespread adoption, and manual interaction is still the primary way Bitcoin wallets are used. In this option, we define a custom URL scheme like melwallet://.. that can express any sort of action requiring wallet interaction. dApps simply use OS-standard functionality to “open” these URLs, which leads to a wallet prompt.

(This is the option I’m most leaning towards)
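
As a rough sketch of what “use OS-standard functionality” could look like from a dApp's side, the snippet below hands a URL to the platform's opener. The helper names opener_command and open_wallet_url are made up for illustration, and it assumes a wallet has already registered itself as the handler for the melwallet scheme:

```python
import os
import subprocess
import sys
from typing import List


def opener_command(url: str, platform: str = sys.platform) -> List[str]:
    """Return the command that asks the OS to open `url` (Linux/macOS only)."""
    if platform.startswith("linux"):
        return ["xdg-open", url]
    if platform == "darwin":
        return ["open", url]
    raise NotImplementedError(f"no known URL opener command for {platform!r}")


def open_wallet_url(url: str) -> bool:
    """Try to hand `url` to the OS; return False if that did not work."""
    try:
        if sys.platform == "win32":
            os.startfile(url)  # Windows-only standard library call
        else:
            subprocess.run(opener_command(url), check=True)
        return True
    except (OSError, subprocess.CalledProcessError, NotImplementedError):
        return False
```

A caller that gets False back can simply print the URL and ask the user to paste it into their wallet by hand.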

Pros:

  • Completely language- and environment-agnostic; any environment that can open URLs works. Scanning QR codes, clicking buttons on webpages, or interacting with desktop apps all support opening URLs.
  • Highly secure; the only attack surface is the URL handling logic, which we can tightly specify.
  • Highly private; no way for random websites to profile whether we even use Themelio wallets without explicit user interaction
  • Encourages a decentralized ecosystem rather than centralizing things around one JavaScript API (option 1) or one daemon (option 2)
  • Allows bringing up the wallet even when it’s closed; nothing needs to run in the background
  • Eliminates global state and associated bugs
  • Allows complete freedom for internal implementation of wallets; we can ditch melwalletd, use WASM, use hardware wallets, etc as long as whatever frontend we make supports the URL system.
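
To illustrate what “tightly specified” URL handling might mean on the wallet side, here is a minimal sketch of a strict parser. The URL layout (melwallet://send?to=...&value=...), the action table, and the parameter names are all assumptions of mine, not an existing spec; the point is only that anything not explicitly allowed gets rejected before the user ever sees a prompt:

```python
from typing import Dict
from urllib.parse import parse_qs, urlsplit

# Hypothetical action table: action name -> exact set of required parameters.
ALLOWED_ACTIONS = {
    "send": {"to", "value"},
}


def parse_wallet_url(url: str) -> Dict[str, str]:
    """Parse a hypothetical melwallet:// URL, rejecting anything unexpected."""
    parts = urlsplit(url)
    if parts.scheme != "melwallet":
        raise ValueError("not a melwallet URL")
    action = parts.netloc  # in melwallet://send?..., "send" is the authority part
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action {action!r}")
    if parts.path or parts.fragment:
        raise ValueError("unexpected path or fragment")
    query = parse_qs(parts.query, strict_parsing=True)
    if set(query) != ALLOWED_ACTIONS[action]:
        raise ValueError("unexpected or missing parameters")
    if any(len(values) != 1 for values in query.values()):
        raise ValueError("repeated parameter")
    request = {key: values[0] for key, values in query.items()}
    request["action"] = action
    value = request["value"]
    if action == "send" and not (value.isascii() and value.isdigit()):
        raise ValueError("value must be a non-negative integer")
    return request
```

The wallet would then render the returned fields in its confirmation dialogue; nothing is signed or sent by the parser itself.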

Cons:

  • Complex, bidirectional interactions beyond “send a transaction looking like this” are hard to encode. These are usually rare, but some apps, such as DeFi apps that want to query the wallet balance, might need a lot of them. For those, we can imagine URLs with network callbacks, such as melwallet://streambalance?ws=ws://localhost:12345/deadbeef, which prompts the user to approve a streaming websocket connection that continuously sends balance data to localhost:12345/deadbeef. Wallet frontends / melwalletd would implement the other side of this callback, with some interface for closing these connections. Reopening a connection would require “opening” the URL again and approving it again.
  • Stuffing things into a URL can be annoying and requires error-prone string handling code. This can be mitigated by e.g. well-designed libraries to construct and parse melwallet URLs, or by designing the whole URL scheme to be more machine-readable (e.g. melwallet://<hex string> where <hex string> is a compressed CBOR or JSON document)
  • Opening/handling URLs can be finicky on some platforms (Linux, for example). This can cause poor usability, though fortunately with URLs it’s easy to fall back to “copy this URL manually into your wallet if it doesn’t pop up automatically”
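
For the machine-readable variant, a small library could look roughly like the sketch below: the request is a JSON document, compressed with zlib and hex-encoded after the scheme. The choice of zlib, the field names, and the 64 KiB limit are assumptions for the sake of the example, not a proposed spec:

```python
import json
import zlib
from typing import Any, Dict

PREFIX = "melwallet://"


def encode_request(request: Dict[str, Any]) -> str:
    """Serialize a request as melwallet://<hex of zlib-compressed JSON>."""
    raw = json.dumps(request, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return PREFIX + zlib.compress(raw).hex()


def decode_request(url: str, max_bytes: int = 64 * 1024) -> Dict[str, Any]:
    """Decode a URL made by encode_request, raising ValueError on bad input."""
    if not url.startswith(PREFIX):
        raise ValueError("not a melwallet URL")
    compressed = bytes.fromhex(url[len(PREFIX):])  # ValueError if not hex
    decompressor = zlib.decompressobj()
    try:
        # Cap the output size so a tiny URL cannot expand into a huge document.
        raw = decompressor.decompress(compressed, max_bytes)
    except zlib.error as exc:
        raise ValueError("payload is not valid zlib data") from exc
    if not decompressor.eof or decompressor.unused_data:
        raise ValueError("payload is too large, truncated, or has trailing data")
    request = json.loads(raw.decode("utf-8"))
    if not isinstance(request, dict):
        raise ValueError("payload must be a JSON object")
    return request
```

With this, a dApp would call something like encode_request({"action": "streambalance", "ws": "ws://localhost:12345/deadbeef"}) and never touch string escaping, at the cost of URLs that humans can no longer read at a glance.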

Some examples of apps doing wallet interaction (WIP)

Traditional “webpage defi dapps”

Press button on webpage => wallet pops up => “links” wallet / sends tx / etc

Non-webpage “crypto” apps

  • Rollup/payment channel L2 clients
    • Prompts for access to wallet to deposit to and withdraw from L1
  • Bisq
    • Accesses wallet to transfer funds into an internal wallet used for peer-to-peer OTC trading
  • melminter

“Non-crypto” apps

  • Federated social media (Mastodon+Matrix killer)
    • Access wallet to fund built-in payment channel “tipping jar”
    • Access wallet to fund TNS transactions for registering “shortnames” or registering globally known server names