This and the following pages are a brain dump, written on (check following dates), outlining the current state of affairs, things that need to be worked on, and things that I (Katharina) don't have the capacity to work on right now. When outlining things that are too complex for me to solve right now, I will try to provide as much context as possible, both for my future self and for other contributors.
If you see any open questions in this document, and you think you can help, please send an e-mail to the mailing list, or, if you're shy, to me personally: email@example.com
As of 2020-01-13 the service API is looking pretty good. I've been hacking on the project full time for ~4 months or so, and there's been a lot of progress. All basic message functions are implemented, as are user creation, authentication, etc. Services can be registered and managed internally, and can pass messages through the stack and receive them as well. All API endpoints are doing blocking IO, meaning that callbacks are only done via lambdas.
One pain point here is that we want to move to all async IO, not for speed but mostly for simplicity in our own scheduling. async-std implements a good scheduler, and it'll be easier for us to use that than to do our own stuff with threads. Nora has been working on this a bunch, and the current state (which doesn't build) is on the branch on upstream.
An issue that Jess raised a few days ago (as of 2020-10-13) was service storage: if we allow people to write services that handle user data, how do we make sure that they don't just drop that data on the FS and call it a day? Furthermore, how do we encourage people to write services that can have their IDs wiped without going all awol, i.e., we want to encourage ephemeral IDs for activist networks.
A solution to this is that we provide a storage API via libqaul, that people can use to store arbitrary data that they think is important to do their whole thing.
One issue that I and others came up with is where to then store this data, but also how to avoid side-channel attacks from a malicious service that might leak user metadata, i.e. which user uses which service. This is a bit of an open question because on one hand we want users to be able to announce services, but it should also be possible to run hidden services that might not be legal to run. Furthermore, how do we design an abstraction over the secure store provided by alexandria (also wip) that doesn't fall victim to a padding-oracle-like attack, i.e. a service that sends queries for various users and services and looks at the failure responses to figure out which user is running which services? There's another layer to this, which is that we do expose a list of local IDs via a user endpoint so that the frontend can render a list of users to log in as, for someone to just select. This obviously means that any code running on the user's device will have access to the list of users on the device.
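To make the oracle concern a bit more concrete, here is a minimal sketch of a store that answers every failed lookup with the same opaque error, so a probing service can't tell "no such user" from "no such service" from "wrong token". All names here (`ServiceStore`, `UserId`, `ServiceName`, `Denied`) are made up for illustration and are not the libqaul or alexandria API. It keeps everything in memory and does nothing about timing differences, which a real implementation would also need to think about.

```rust
use std::collections::HashMap;

// Hypothetical identifier types; the real libqaul types will differ.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct UserId(pub String);

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct ServiceName(pub String);

/// The single error returned for every failure: unknown user,
/// unknown service, or wrong token all look identical to the caller.
#[derive(Debug, PartialEq, Eq)]
pub struct Denied;

struct Slot {
    token: Vec<u8>,
    data: HashMap<String, Vec<u8>>,
}

/// In-memory stand-in for a per-user, per-service storage scope.
#[derive(Default)]
pub struct ServiceStore {
    slots: HashMap<(UserId, ServiceName), Slot>,
}

impl ServiceStore {
    /// Create a storage scope for a service on a user, guarded by a token.
    pub fn register(&mut self, user: UserId, service: ServiceName, token: Vec<u8>) {
        let slot = Slot { token, data: HashMap::new() };
        self.slots.insert((user, service), slot);
    }

    /// Store a value; any failure collapses into `Denied`.
    pub fn put(
        &mut self,
        user: &UserId,
        service: &ServiceName,
        token: &[u8],
        key: &str,
        value: Vec<u8>,
    ) -> Result<(), Denied> {
        let slot = self
            .slots
            .get_mut(&(user.clone(), service.clone()))
            .filter(|slot| slot.token == token)
            .ok_or(Denied)?;
        slot.data.insert(key.to_string(), value);
        Ok(())
    }

    /// Load a value. A valid token with a missing key yields `Ok(None)`;
    /// everything else yields the same `Denied`.
    pub fn get(
        &self,
        user: &UserId,
        service: &ServiceName,
        token: &[u8],
        key: &str,
    ) -> Result<Option<Vec<u8>>, Denied> {
        let slot = self
            .slots
            .get(&(user.clone(), service.clone()))
            .filter(|slot| slot.token == token)
            .ok_or(Denied)?;
        Ok(slot.data.get(key).cloned())
    }
}
```

The design choice worth noting is that only a caller who already holds the right token can learn anything about the contents of a scope, including whether the scope exists at all.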
One possible mitigating factor here would be to have users marked as "secret", where even libqaul itself doesn't know if it owns data for a user. A user would provide a secret, and their own key, similar to how the detached identities hash challenge works (outlined later?), to decrypt a user manifest and then load all the state from alexandria. This would mean that a user is unknown to peers outside of the current user session, and can otherwise only be discovered by dumping the libqaul memory space as a privileged user (but honestly, if an adversary owns your device there's very little you can do - we should look into whether mbedtls supports the Linux secmem extensions!).
As for hidden services for a user, we can use the same mechanism as the hidden user directory, except that we generate a secret for a service which is required when performing service lookup queries. When a user signs in, we can load the hidden service secrets from the user manifest, then unlock the service data, which is ambiguated at rest. If a malicious service tries to check for a service on a user by trying to load certain files from storage, it will require the service auth token, which was only handed to the hidden service. The initialisation of the hidden service then follows another challenge-response mechanism, similar to how we can load a hidden user: the hidden service generates a token, the user uses a trusted app to auth that token, then the service secret is passed to the hidden service. Again, if the attacker can dump memory... etc etc.
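The three steps of that handshake (service submits a token, user approves it in a trusted app, service receives the secret) can be sketched as a tiny state machine. This is only an illustration under assumptions: `HiddenServiceAuth` and its methods are hypothetical names, the token is supplied by the caller rather than generated from a proper CSPRNG, and nothing here is encrypted or persisted.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
pub enum HandshakeError {
    /// The token was never submitted, or was already redeemed.
    UnknownToken,
    /// The token exists but the user has not approved it yet.
    NotApproved,
}

enum State {
    Pending,
    Approved,
}

/// Hypothetical holder of one hidden service's secret for a signed-in user.
pub struct HiddenServiceAuth {
    /// Would be loaded from the user manifest at sign-in.
    secret: Vec<u8>,
    challenges: HashMap<Vec<u8>, State>,
}

impl HiddenServiceAuth {
    pub fn new(secret: Vec<u8>) -> Self {
        Self { secret, challenges: HashMap::new() }
    }

    /// Step 1: the hidden service hands over a token it generated.
    pub fn submit(&mut self, token: Vec<u8>) {
        self.challenges.insert(token, State::Pending);
    }

    /// Step 2: the user confirms the token through a trusted app.
    pub fn approve(&mut self, token: &[u8]) -> Result<(), HandshakeError> {
        match self.challenges.get_mut(token) {
            Some(state) => {
                *state = State::Approved;
                Ok(())
            }
            None => Err(HandshakeError::UnknownToken),
        }
    }

    /// Step 3: the service redeems an approved token for the secret.
    /// Each token can be redeemed exactly once.
    pub fn redeem(&mut self, token: &[u8]) -> Result<Vec<u8>, HandshakeError> {
        match self.challenges.remove(token) {
            Some(State::Approved) => Ok(self.secret.clone()),
            Some(State::Pending) => {
                // Not approved yet: keep the challenge around for later.
                self.challenges.insert(token.to_vec(), State::Pending);
                Err(HandshakeError::NotApproved)
            }
            None => Err(HandshakeError::UnknownToken),
        }
    }
}
```

Making tokens single-use means that a token observed after the fact can't be replayed to get the secret a second time.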
I think these threat models are realistic, and my solutions offer a viable protection mechanism against them. There are some open questions, but this should be enough to implement the required endpoints in the service API now. I would suggest either adding a storage endpoint, or expanding the service endpoint if there aren't too many functions.
Not something that I want to write as much about here because the problem has way more factors and this is not a part of security analysis that I know much about (see previous section for application and protocol security): system integration and malicious daemons.
Say a user installs a malicious instance of libqaul, and it registers itself with dbus, intents, etc and now other applications, possibly our qaul.net app itself, will start talking to it. Is there anything we can do? Qaul.net, the app, could pin its daemon, but that would be annoying, especially because it could deter any app from connecting to any daemon, rendering the whole point of the API useless.
Shipping keys doesn't work (lol, why do people keep trying this?), maybe we could get people to checksum their binaries? But also hard. Anyway, I'll move on because there's more to cover.
The current state in ratman is a mess. Messages can be routed, but floods don't work because they don't get deduplicated. To do this well, we need to finish the journal, which should index frames by ID, and then refuse to broadcast when it has already sent a specific frame ID. Also, there's a lot of missing logic to rebuild the sliced messages, and also I had a conversation at 36c3 about doing the slicing in a more clever way. TODO: message Jeff.
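The deduplication part of the journal boils down to "remember every frame ID we've seen and only broadcast the first time". A minimal sketch, assuming frame IDs can be represented as a `u64` (the real ratman frame ID type may well differ) and ignoring persistence and eviction, so the set grows without bound:

```rust
use std::collections::HashSet;

/// Hypothetical frame ID type, picked only for this sketch.
pub type FrameId = u64;

/// Tracks which flooded frames have already been handled.
#[derive(Default)]
pub struct Journal {
    seen: HashSet<FrameId>,
}

impl Journal {
    /// Returns `true` the first time an ID is seen (so the frame should
    /// be re-broadcast) and `false` for every later duplicate.
    pub fn should_broadcast(&mut self, id: FrameId) -> bool {
        // `HashSet::insert` returns false if the value was already present.
        self.seen.insert(id)
    }
}
```

A real journal would need some way to forget old IDs (by age or by count), otherwise a long-running node keeps every frame ID forever.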
When it comes to asyncing, there's a lot of work to do: ratman itself needs to get async internals so that we can actually call the netmod interface that now exists, but also we need to make sure that we can wrap blocking calls properly? Like, how do we call into native drivers that can't do async in Rust? I have some ideas about how to build the interfaces up on the async branch, but this all needs integration and I'm drawing a blank how this will actually look in C++, so maybe it's time to actually build the Android abstractions? I don't know.
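One common way to wrap a blocking call is to push it onto its own thread and hand back a future that resolves when the thread is done. The sketch below does this with only the standard library, so it doesn't depend on any particular runtime API. `run_blocking` and the tiny `block_on` are hypothetical helpers written for this example, not functions from async-std or ratman, and spawning one thread per call is only reasonable for a sketch. In the real code the future would be awaited on the async-std scheduler instead.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

/// State shared between the worker thread and the future.
struct Shared<T> {
    result: Option<T>,
    waker: Option<Waker>,
}

/// Future that resolves once the blocking closure has finished.
pub struct BlockingTask<T> {
    shared: Arc<Mutex<Shared<T>>>,
}

/// Run a blocking closure (e.g. a call into a native driver) on its own
/// thread and return a future for its result.
pub fn run_blocking<T, F>(f: F) -> BlockingTask<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let shared = Arc::new(Mutex::new(Shared { result: None, waker: None }));
    let worker = Arc::clone(&shared);
    thread::spawn(move || {
        let value = f();
        let mut guard = worker.lock().unwrap();
        guard.result = Some(value);
        // Wake the task if it has already been polled and is waiting.
        if let Some(waker) = guard.waker.take() {
            waker.wake();
        }
    });
    BlockingTask { shared }
}

impl<T> Future for BlockingTask<T> {
    type Output = T;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<T> {
        let mut guard = self.shared.lock().unwrap();
        match guard.result.take() {
            Some(value) => Poll::Ready(value),
            None => {
                // Not done yet: remember who to wake when the result lands.
                guard.waker = Some(cx.waker().clone());
                Poll::Pending
            }
        }
    }
}

/// Waker that unparks the thread driving the future.
struct ThreadWaker(thread::Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal single-future executor, only here so the sketch can be run
/// without an external runtime.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}
```

Usage would look like `block_on(run_blocking(|| some_native_call()))`, where `some_native_call` stands in for whatever blocking driver function needs wrapping. How this maps onto the C++ side is exactly the open question above and is not addressed here.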
After Ratman has an async API towards libqaul we need to async the core of that, and also make sure that we give Ratman a similar abstraction for the native code because let's remember that initialisation of a stack happens from the bottom up and all these APIs are gonna have to be wrapped for C++!
Hmm... I might have missed some things, but that's been making me anxious for the last 2 weeks or so. Please talk to me about these things if you have ideas, thoughts or suggestions.
Check the "Challenges" section following this, for some ideas that you (yes you!) could get started with, if you wanted to help out with developing qaul.net.