EQUIP Component Toolkit state/startup design notes
2004-05-17, Chris & James
Introduction
Persistence is presumed to be ultimately file-based, e.g. file
backup/restore can be used to recover a failed node.
Several installations (installation = one multi-host single dwelling
system, e.g. Tom's flat) may exist on a single LAN (e.g. Tom's
building-area network, BAN).
Setting up a new node should be easy using Java WebStart :-)
Single Dwelling System
A single dwelling system comprises (a) hardware:
- Some LAN networking provision, e.g. WaveLAN access point
- Some Internet access, e.g. routed/firewalled ISP connection,
presumed always on
- One or more PC-like machines = infrastructure servers. Exactly
one PC will be designated as the installation master.
- Misc devices, sensors, etc. plugged into the above PCs
(b) Software:
- Exactly one persistent shared dataspace for non-persistent
coordination (e.g. component adverts, component properties) and
persistent component-independent configuration (e.g. component
requests, property link requests). This will run on the installation
master machine.
- Zero or more (but zero would be a bit silly) Containers,
typically one Java Container per machine, and perhaps one C# container
per machine (or maybe more, depending on which bits are likely to crash
most often).
- Each Container has zero or more component capabilities (e.g. Jar
files containing Java Bean classes)
- Each Container provides its own persistence for recreating
components, e.g. consistent use of GUIDs, persistence of component
internal state (Shahram)
Security and Trust
Direct physical access to hardware is taken as the base-line for
security, i.e. at this level you can do whatever you want :-)
Consequently, all processes on a given host are presumed mutually
trustworthy.
Access to the LAN is assumed to provide a minimal degree of security,
but is not sufficient for granting trust (e.g. see multiple dwellings
per LAN case, above).
Security model 1 presumes that out of band distribution of a shared
secret to a host (PC) incorporates it into that trust domain. E.g.
copying a shared secret via a USB flash disk from an existing member of
the particular dwelling to a new host would establish mutual trust.
Scenarios:
- Tom buys a new infrastructure PC to wire up another room, and is
easily able to add it to his own installation... :-)
- Tom and his neighbour, X, both have separate installations. Tom
should not be able to add nodes to X's installation, or make changes to
X's configuration, or connect components in X's installation to
components in his own installation.
Additional Requirements
For development/testing/etc it should be possible for a single PC to be
part of different installations. It is not clear whether this would
only be at different times, or possibly even at the same time :-)
Design 1
When a machine is re-booted, by inspecting its own filesystem (or other
local persistence mechanism) it should be able to determine:
- What (if any) installations it has previously been a member of.
- The shared secret for such installations
- Which (if any) installations to re-join/re-start automatically
(by default).
- Where it was the installation master machine (and consequently ran
the dataspace for an installation), it will have the persisted state of
the DS (checkpoint(s) and event log files).
NOTE configuring DS persistence in the current implementation requires
the full dataspace URL, including IP address and server port number, to
be specified in advance! If this may vary then the configuration must
be dynamically generated on startup.
NOTE a client will automatically try to reconnect to a failed dataspace
server, but will use the ORIGINAL IP address and port number.
- Where it ran any Container(s) in an installation it will have the
container-specific persistent state (GUIDs, previous component requests
and component state(s)).
What can you do when you start a new machine?
- Restart an installation for which we were previously master.
Optionally clear out persistent information (other than shared secret)
from that installation (= hard reset/wipe and restart)
- Rejoin a previously joined installation of which we were a member
(and which may or may not still exist or be running!).
Optionally clear out persistent information (other than shared secret)
from that installation.
- Start a new installation with this machine as master
NOTE this generates or remotely obtains a new shared secret.
- Join an installation (somewhere on this network, presumably)
which we have not previously been a member of.
NOTE installation's shared secret must be provided securely (e.g. via
dongle).
NOTE when (re)starting an installation master the IP and port of the
installation dataspace may have changed and must be made available.
NOTE when (re)starting a Container the IP and port of the installation
dataspace may have changed and must be determined.
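The start-up choices above can be sketched as a function of what the
machine finds in its local persistence. This is only one reading of the
list: it assumes a former master restarts as master and any other
machine holding a shared secret rejoins as a member. The names
StartupAction and StartupOptions are made up for illustration and are
not existing EQUIP classes.

```java
import java.util.EnumSet;

/** Hypothetical names for the four start-up choices listed above. */
enum StartupAction {
    RESTART_AS_MASTER,
    REJOIN_AS_MEMBER,
    START_NEW_INSTALLATION,
    JOIN_UNKNOWN_INSTALLATION
}

final class StartupOptions {
    private StartupOptions() {}

    /**
     * @param hasSharedSecret   a shared secret was found in local persistence
     * @param hasDataspaceState DS checkpoint/event log files were found,
     *                          i.e. this machine was the installation master
     */
    static EnumSet<StartupAction> available(boolean hasSharedSecret,
                                            boolean hasDataspaceState) {
        // Starting or joining something new never depends on local state.
        EnumSet<StartupAction> actions = EnumSet.of(
                StartupAction.START_NEW_INSTALLATION,
                StartupAction.JOIN_UNKNOWN_INSTALLATION);
        if (hasSharedSecret) {
            // Assumption: a former master restarts as master, anyone else rejoins.
            actions.add(hasDataspaceState
                    ? StartupAction.RESTART_AS_MASTER
                    : StartupAction.REJOIN_AS_MEMBER);
        }
        return actions;
    }
}
```

A start-up interface would call this once per installation directory it
finds and offer only the returned actions.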
Discovery issues
How do you know what installations are available to join?
- Option 1: you remember that you already created one, and have the
shared secret.
- Option 2 (better): there is a network-scoped
discovery/advertisement process, which lets you find existing
installations, although an out of band mechanism is still required to
obtain the shared secret and join them.
ISSUE: make sure that this is not as insecure as WEP :-)
NOTE Option 1 should still be available to cover the case where the
master is currently down/broken/etc.
How do you know what IP and port the installation's dataspace is
running on?
- Option 1 (bad): it is configured when first joined, e.g. with the
shared secret, and presumed never to change.
- Option 2 (better): it is discovered from the running installation
master.
ISSUE: make sure that someone can't fake being the installation master.
Design for use of EQUIP discovery
Background:
- Jini-like announce protocol, mapping group name(s) and type
name(s) to string(s) (usually URLs).
- No security.
Desired outcome (a) installation discovery:
- Human readable/meaningful installation name so that you can ask
someone to join it or remember that you made it. e.g. "Tom's flat"
Desired outcome (b) installation dataspace URL discovery:
- Exactly one URL (protocol, IP, port) of installation dataspace,
with confidence that this is the one true master of this installation
Option 1:
- group name = Human-readable installation name plus additional
installation-specific string (not the shared secret :-) which can
easily be separated from the human-readable name (e.g. '...#...')
- Discovered service string = master dataspace URL PLUS a digital
signature testable using the shared secret, e.g. derive a key from the
shared secret (optionally incorporating information from the URL), use
it to encrypt a secure digest of the dataspace URL, and concatenate the
result with the URL in an unambiguously parsable form.
NOTE non-secure version will not be able to distinguish deliberate
clashes on group name.
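A minimal sketch of the signed service string from Option 1. It swaps
in HMAC-SHA256 (a keyed digest) for the "encrypt a secure digest" step,
which gives the same "testable using the shared secret" property. The
class name, the key-derivation label, the single-space separator and the
example URL are all assumptions for illustration, not existing EQUIP
API.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper: builds and checks "URL + signature" service strings. */
final class SignedDataspaceUrl {
    private static final String HMAC = "HmacSHA256";
    // Assumed label so that the URL-signing key differs from other derived keys.
    private static final byte[] KEY_LABEL =
            "dataspace-url".getBytes(StandardCharsets.UTF_8);

    private SignedDataspaceUrl() {}

    private static byte[] hmac(byte[] key, byte[] message) {
        try {
            Mac mac = Mac.getInstance(HMAC);
            mac.init(new SecretKeySpec(key, HMAC));
            return mac.doFinal(message);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA256 unavailable", e);
        }
    }

    private static byte[] tagFor(String url, byte[] sharedSecret) {
        byte[] key = hmac(sharedSecret, KEY_LABEL);      // derive key from secret
        return hmac(key, url.getBytes(StandardCharsets.UTF_8));
    }

    /** Master side: the URL, one space, then the Base64 tag (no spaces inside). */
    static String sign(String url, byte[] sharedSecret) {
        return url + " " + Base64.getEncoder().encodeToString(tagFor(url, sharedSecret));
    }

    /** Container side: returns the URL if the tag checks out, otherwise null. */
    static String verify(String announced, byte[] sharedSecret) {
        int split = announced.lastIndexOf(' ');
        if (split < 0) {
            return null;
        }
        String url = announced.substring(0, split);
        byte[] claimed;
        try {
            claimed = Base64.getDecoder().decode(announced.substring(split + 1));
        } catch (IllegalArgumentException e) {
            return null;                                  // tag is not valid Base64
        }
        // Constant-time comparison of expected and claimed tags.
        return MessageDigest.isEqual(tagFor(url, sharedSecret), claimed) ? url : null;
    }
}
```

A container would pass every discovered string through verify() and
ignore any that return null. Note that this does not stop someone
replaying an old, validly signed URL.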
So:
- Installation master runs a discovery server to announce both the
installation's existence, and the installation's dataspace URL
- Containers use discovery client with digital signature checking
to discover the installation dataspace URL
- Start-up interface uses general discovery client with group "any"
(or is it "*"??) to create/maintain a list of currently available
installations to join.
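For the start-up interface's list, the '...#...' group name from
Option 1 has to be split back into its two parts. A small sketch,
assuming '#' as the separator and assuming the installation-specific
string itself never contains '#' (so a human-readable name such as
"C# lab" still parses). InstallationGroup is a hypothetical class.

```java
/** Hypothetical value class for a discovery group name "name#id". */
final class InstallationGroup {
    final String displayName;     // human-readable, e.g. "Tom's flat"
    final String installationId;  // installation-specific string, not the secret

    InstallationGroup(String displayName, String installationId) {
        if (installationId.indexOf('#') >= 0) {
            throw new IllegalArgumentException("installation id must not contain '#'");
        }
        this.displayName = displayName;
        this.installationId = installationId;
    }

    /** Full group name as it would be announced. */
    String toGroupName() {
        return displayName + "#" + installationId;
    }

    /** Splits at the LAST '#', so '#' inside the display name is harmless. */
    static InstallationGroup parse(String groupName) {
        int split = groupName.lastIndexOf('#');
        if (split < 0) {
            // Assumption: a group without '#' is shown as-is with an empty id.
            return new InstallationGroup(groupName, "");
        }
        return new InstallationGroup(groupName.substring(0, split),
                                     groupName.substring(split + 1));
    }
}
```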
Design for EQUIP security
Connecting to the dataspace should also require both sides to
demonstrate that they hold the shared secret :-) E.g. challenge-response
using a key deterministically derived from the shared secret.
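A minimal sketch of such a challenge-response, assuming HMAC-SHA256 as
the keyed function. Each side sends a random challenge and checks the
other's response; run in both directions this gives the mutual
demonstration. The class name, the "dataspace-auth" derivation label and
the "client"/"server" role strings are assumptions for illustration, and
nothing here is wired into the real EQUIP connection code.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper holding a key derived from the shared secret. */
final class ChallengeResponse {
    private static final String HMAC = "HmacSHA256";
    private final SecureRandom random = new SecureRandom();
    private final byte[] key;

    ChallengeResponse(byte[] sharedSecret) {
        // Deterministic derivation: same secret gives the same key on every host.
        this.key = mac(sharedSecret, "dataspace-auth".getBytes(StandardCharsets.UTF_8));
    }

    /** A fresh random challenge to send to the other side. */
    byte[] newChallenge() {
        byte[] challenge = new byte[16];
        random.nextBytes(challenge);
        return challenge;
    }

    /**
     * Response to a challenge. The responder's role is mixed in so that a
     * response cannot simply be reflected back in the other direction.
     */
    byte[] respond(String role, byte[] challenge) {
        byte[] label = role.getBytes(StandardCharsets.UTF_8);
        byte[] message = new byte[label.length + 1 + challenge.length];
        System.arraycopy(label, 0, message, 0, label.length);
        // message[label.length] is left as 0 and acts as a separator.
        System.arraycopy(challenge, 0, message, label.length + 1, challenge.length);
        return mac(key, message);
    }

    /** True if the response is what a holder of the shared secret would send. */
    boolean check(String role, byte[] challenge, byte[] response) {
        return MessageDigest.isEqual(respond(role, challenge), response);
    }

    private static byte[] mac(byte[] key, byte[] message) {
        try {
            Mac mac = Mac.getInstance(HMAC);
            mac.init(new SecretKeySpec(key, HMAC));
            return mac.doFinal(message);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA256 unavailable", e);
        }
    }
}
```

The server would check the client's response with role "client", and the
client would check the server's response to its own challenge with role
"server".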
What happens next?
Presumably the user can see:
- what if any installations the machine is currently mastering
- for each active installation which if any containers are
currently running
The user can presumably, for each active installation of which this
machine is currently a master or member:
- start a new container
- restart a container that is thought to have crashed?!
- add/update/remove capabilities for any container
- start a user interface (there may be more than one kind)
What might the filesystem look like?
Deterministic choice of common root directory/
  per-installation subdirectory (?made safe version of discovery group)/
    shared secret
    full version of discovery group, or at least Human-readable name
    installation configuration, e.g. restart/rejoin by default, dataspace port number?
    dataspace persistence directory (for installation master, only)
    per-container directories/ (each with)
      container startup configuration, e.g. executable pathname, restart/rejoin by default
      container's own persistence stuff... (inc. component-specific persistence)
      container component deployment directory/
        e.g. jar files
        ??extra metadata for update management e.g. origin website??
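The layout above could be resolved in code roughly as follows. This is a
sketch only: every file and directory name ("secret", "group",
"installation.properties", "dataspace", "containers", "deploy") is an
assumption, as is the escaping rule used for the "made safe" directory
name. That rule keeps ASCII letters, digits and '-' and writes every
other byte as _xx (hex), so distinct group names give distinct directory
names.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

/** Hypothetical path helper for one installation's directory tree. */
final class InstallationLayout {
    final Path dir;

    InstallationLayout(Path root, String discoveryGroup) {
        this.dir = root.resolve(safeName(discoveryGroup));
    }

    /** Escapes anything but ASCII letters, digits and '-' as _xx (hex of the UTF-8 byte). */
    static String safeName(String name) {
        StringBuilder safe = new StringBuilder();
        for (byte b : name.getBytes(StandardCharsets.UTF_8)) {
            char c = (char) (b & 0xff);
            boolean plain = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9') || c == '-';
            if (plain) {
                safe.append(c);
            } else {
                safe.append(String.format("_%02x", b & 0xff));
            }
        }
        return safe.toString();
    }

    Path sharedSecret()  { return dir.resolve("secret"); }
    Path groupName()     { return dir.resolve("group"); }
    Path configuration() { return dir.resolve("installation.properties"); }
    Path dataspaceDir()  { return dir.resolve("dataspace"); }   // master only

    Path containerDir(String containerId) {
        return dir.resolve("containers").resolve(safeName(containerId));
    }

    Path deploymentDir(String containerId) {
        return containerDir(containerId).resolve("deploy");     // e.g. jar files
    }
}
```

With this rule the group "Tom's flat#a1b2" maps to the directory name
Tom_27s_20flat_23a1b2. On a case-insensitive filesystem two names that
differ only in letter case would still collide.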
EOF