High order bits from SOS discussions:
- SOS/Self-star submission cost us SIGOPS submission. Paper was very
well received by SIGOPS but the view was that if we were showing it at other
workshops, even those w/o proceedings and by invitation only, then it should
not also appear at SIGOPS. This is disconcerting since I specifically
sent email inquiries about this before deciding to submit to the
non-proceedings workshops in addition to SIGOPS.
- dynamic adaptation/proactive recovery for resilience: keep system a moving target. Separates
survivability from debuggability, but we have already bitten that bullet.
- Carl L is a bit disappointed in this--he thinks we can do better and this feels like a hack, like frequency hopping.
- 2 biggest sources of cyberattacks are insider and DDOS. corollary: cannot think of "critical systems" separately from noncritical, since DDOS
relies on recruiting many noncrit sys to attack a crit sys; cannot think of crit sys separately from operator tools,
since operators can be attackers (wittingly or unwittingly).
- a lot on byzantine survival, atomic reliable byzantine broadcast, threshold crypto... in family of
view-synchronous group comm ala ISIS/Ensemble
- eg Christian Cachin, Using SINTRA for intrusion-tolerant distributed DNS (full ppr at DSN04)
- "beyond byzantine ft"?
- turn byz failures into crashes
- Recent work on data struct repair: see Mazieres' SUNDR (repairable FS), UTAustin survivable storage work (martin,alvisi,dahlin)
Fred Schneider (Cornell)
- overview of Lamport state mach repl - "narrow bw channel"==fault isoln
- focus on attack resistance:
- attacks are correlated - indep failure assmp doesn't hold
- all assmpt are vulnerabilities
- DOS attack exploits "synchronous sys" assumption
- servers must manipulate secrets for authenticity -> replication = more copies of secret to steal
- "correct cmpt satisfies spec" - but what is the spec? How do you know you've thought of all assmp? (eg ddos)
- "rapidly changing secrets" doesn't work if client set is large/unknown
- obfuscation: rearranging code blocks/vars to foil "plumbing" attacks like buffer overruns
- chall of "recovering state" if state has been compromised
- implemented @ Cornell as bsd3.4 kernel that periodically reloads itself w/syscall numbers changed
- attacker can still buf-ovflw, but now will just crash instead of taking over machine.
- davidopp points out: have to do this for persistent state too. Can do if svc state is byzantine safe.
- sacrifice some determinism for replica robustness
- idea - test ssm/dstore against malicious but non-DoS attacks?
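
Toy sketch of the syscall-renumbering idea noted above (Python; not the Cornell kernel). The class name, the number range, and the "relink" step are all hypothetical. It only shows why injected code with a hard-coded syscall number crashes instead of taking over.

```python
import random


class FakeKernel:
    """Toy kernel whose syscall numbers are re-permuted on every reload."""

    def __init__(self, names, seed=None):
        self._names = list(names)
        self._rng = random.Random(seed)
        self.reload()

    def reload(self):
        # Draw a fresh secret mapping: syscall number -> syscall name.
        # The range 1000..9999 is an arbitrary choice for illustration.
        numbers = self._rng.sample(range(1000, 10000), len(self._names))
        self._table = dict(zip(numbers, self._names))

    def number_of(self, name):
        # Assumption: legitimate binaries are relinked against the current table.
        for number, known in self._table.items():
            if known == name:
                return number
        raise KeyError(name)

    def trap(self, number):
        # An unknown number kills the process rather than running anything.
        if number not in self._table:
            raise RuntimeError("bad syscall number: process crashed")
        return self._table[number]
```

A relinked binary calls `kernel.trap(kernel.number_of("execve"))`; shellcode that hard-codes an old or stock number hits the `RuntimeError` path.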
CARL LANDWEHR - intrusion det
- tolerate attacks thru graceful degr
- next gen: self-regenerative
- what is survivability? --> ability to continue critical operns in face of cyber attack - not just accidental faults
- critical op: failure to complete means mission not accomplished. may be ok to degrade noncritical ops.
- COROLLARY: survivability only makes sense w/r/t a specific mission objective.
- Serious attacker will do extensive recon, befriend operators, identify specific targets, send probes, etc
- check out quantitative targets in survivability BAA 02-16! ("detect x% of attacks within time y with less than z% false pos rate") - but even these leave room for interpretation, which is happening now as Red Teams gear up.
- audience q: tough to do research in this env since need a lot of engineering to get sys to point where it can be attacked
- Fred Schneider: "There was a surprising amount of publishable research that came out of the DemVal" (suggests interesting priorities!)
- integration of trusted w/nontrusted systems is hot topic
- suggested approach: assign high-assurance security policy to simple separate components; create env where commercial threat model is appropriate, then use COTS to enforce
- one impl: one-way X/citrix channel, hw or sw switch that directs keystrokes to either secure desktop sys or insecure X server; some SW to fake X11 acks to keep the channel one-way (MSL - multiple single level approach)
- Fred points out: if acks are forwarded back to insecure layer, it is a covert channel. If they aren't, SW semantics break.
- notion of "data diode" is challenging b/c of acks problem and b/c of buffering issues and exposure of timing as a covert channel
- Cybertrust vision: freedom from cyberattacks in civilian systems (critical economic functions, sensitive info mgt, etc)
- scope: DHS>Critical infrastructure protection>cyber security>cybertrust
- but cybertrust extends beyond scope of DHS activities
- note, Helen Gill's EHS being subsumed into cybertrust??
- Industry/University Coop Rsch Consortia (I/UCRC): funded by industry w/some support from NSF, typ $500k/yr of which $50k from NSF; industry provides adv bd. Ask Alex Schwarzkopf for the howto manual on doing these.
- Lorenzo: is NSF collaborating w/EU? A: trying to, but constrained b/c NSF is funded by US tax$. No joint funding so far
- see Butler's IEEE Computer article this month; can't trust systems w/lot of SW, must clearly separate critical funcs and have backup
systems that just do those.
CATEGORY-PRESERVING PRIVACY
- given e=escrow(user,txntype,txn) and e'=escrow(user',txntype',txn), can tell if (user,txntype)=(user',txntype') but not identity of user or txntype
- resists guessing attacks on either user or txntype
- can bin (u,type) but not either one separately
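
Minimal sketch of the binning property above (Python). The keyed-HMAC construction is an assumption of mine, not the scheme from the talk, and `same_category` is a hypothetical helper. Equal (user, txntype) pairs get equal tags. Without the escrow key, the tag cannot be checked against guesses for either field.

```python
import hashlib
import hmac


def escrow(key: bytes, user: str, txntype: str, txn: str):
    """Return (tag, txn); the tag depends only on the (user, txntype) pair."""
    # Length-prefix each field so ("ab", "c") and ("a", "bc") cannot collide.
    msg = b"".join(
        len(field.encode()).to_bytes(4, "big") + field.encode()
        for field in (user, txntype)
    )
    tag = hmac.new(key, msg, hashlib.sha256).hexdigest()
    return tag, txn


def same_category(e1, e2) -> bool:
    # Reveals only whether two escrows fall in the same (user, txntype) bin.
    return hmac.compare_digest(e1[0], e2[0])
```

Whoever holds `key` can still enumerate guesses, so the guessing resistance here holds only against parties without the key.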
SURVIVABILITY ON THE INTERNET
- "Static/predictable defense will fail"
- Sun Tzu ("Art of War"): Those who can adapt to enemy tactics "are called divine"
- attackers often find vulnerabilities by reverse engineering the patches. eg "Witty" worm released w/in 48hrs of patch release!
- Compromised nodes can infect other nodes, but patched nodes only protect themselves
- idea: install 'traps' on bait nodes
- if you know how to make a trap, why not just patch node?
- can you convince users unwilling to install patch to act as bait?
- assumes you can repair/contain virus after a trap has triggered
- "OODA loop" (Boyd): observe, orient, decide, act (military planning)
COMMENTS ON SOS-RADS TALK
- Fred Schneider: doesn't work on programs that interact w/the environment a lot, since may be unsafe to recover/retry
- Focus on making more actions idempotent
- "Only if stateless/idempotent" - not true but you have to be willing to sacrifice some bad requests
- emphasize this is one tool in arsenal: what can it be combined with (logging to avoid redoing nonidempotent ops, BFT, etc)
- Too long by 2 content slides
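
Sketch of the "logging to avoid redoing nonidempotent ops" point above (Python). `DedupServer`, `call_with_retry`, and the request-id scheme are hypothetical illustrations, not anything from the talk. The log makes a retry after a lost reply safe.

```python
class DedupServer:
    """Toy server that logs completed requests so a retry is not re-applied."""

    def __init__(self):
        self.balance = 0
        self._done = {}  # request id -> result already returned

    def deposit(self, request_id, amount):
        # A replayed request returns the logged result instead of redoing work.
        if request_id in self._done:
            return self._done[request_id]
        self.balance += amount
        self._done[request_id] = self.balance
        return self.balance


def call_with_retry(op, attempts=3):
    """Retry op() on ConnectionError; safe only if op is idempotent."""
    last_error = None
    for _ in range(attempts):
        try:
            return op()
        except ConnectionError as exc:
            last_error = exc
    raise last_error
```

Without the `_done` log, a deposit whose reply was lost would be applied twice on retry.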
Wormholes
- it's fine to build small kernel of trusted svc, but making that piece trustworthy is still hard. eg, communication with the trusted comp must be secure, etc, and failure semantics must be carefully defined. The talk did not seem to be up front about that.
Martin and Alvisi, fast byzantine consensus:
- 2 comm rounds in common case, meets provable lower bounds on solution feasibility
- in uncommon case may not make progress but will never be unsafe
Forwarding in the face of malicious routers (Savage et al. UCSD)
- goal - detect effects of malicious routers on traffic flows
- traffic validation: does traffic profile at different sites match up?
- metrics - accuracy and completeness. spkr wanted opinion if these are good metrics. roughly, accuracy=precision and completeness=recall.
- challenge: normal internet conditions (pkt drops, etc) cause naive comparisons to fail.
- depending on meas points, faulty router cannot be pinpointed but can be shown to be in some subset of routers. Defs of completeness and accuracy are then modified to apply to the subset. **we should discuss Pinpoint's use of recall/precision w/them, it's a similar problem.
- routers reliably broadcast info about suspected-bad peers.
- fingerprints over flows and conservation of flow are used to compute path profiles. each forwarded packet requires 1 fingerprint computation.
- deployed in prototype form on UW/Sprintlink network (315 routers)
- Countermeasure when alarm: exclude suspected segment from routing topology.
- how about false pos? ideally, due to inherent redundancy in routing, wouldn't matter! but may lead to routing stability issues ... NYI, so no data on how long it takes to rebuild tables.
- how are fingerprints resilient to "normal" routing problems? answer: depends on what "normal" beh of network is. IDEA: use SLT to determine this dynamically!
- malicious router may implicate innocent neighbor? a: you can only implicate routers that are on same path as you, so still the case that SOMEONE in your path is faulty.
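
Rough sketch of the traffic-validation and metrics notes above (Python; not the UCSD implementation). The truncated SHA-256 fingerprint, the 5% loss tolerance, and all function names are assumptions. The first part compares per-packet fingerprints seen at two ends of a path segment, tolerating some normal loss. The second part scores a suspected-router set with precision (accuracy) and recall (completeness).

```python
import hashlib
from collections import Counter


def fingerprint(packet: bytes) -> str:
    # One fingerprint computation per forwarded packet.
    return hashlib.sha256(packet).hexdigest()[:16]


def summarize(packets):
    # Multiset of fingerprints, so reordering alone is not flagged.
    return Counter(fingerprint(p) for p in packets)


def segment_suspicious(sent, received, loss_tolerance=0.05):
    """Conservation-of-flow check: what goes in should mostly come out."""
    sent_fp, recv_fp = summarize(sent), summarize(received)
    missing = sum((sent_fp - recv_fp).values())  # dropped in the segment
    extra = sum((recv_fp - sent_fp).values())    # injected or modified
    total = sum(sent_fp.values())
    if total == 0:
        return extra > 0
    return extra > 0 or missing / total > loss_tolerance


def precision_recall(suspected, faulty):
    """accuracy ~ precision, completeness ~ recall, over sets of routers."""
    suspected, faulty = set(suspected), set(faulty)
    hits = len(suspected & faulty)
    precision = hits / len(suspected) if suspected else 1.0
    recall = hits / len(faulty) if faulty else 1.0
    return precision, recall
```

When the faulty router can only be narrowed to a subset, recall stays at 1.0 as long as the subset contains it, while precision drops as the subset grows.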
SECURE SPECULATIVE REMOTE EXECUTION Bennet Yee, Google
- problem: latency is bottleneck in distrib sys, esp. geoplexed -> mobile code
- remotely fork copy of client in a trusted way. remote client will execute RPC's speculatively and results will be ready sooner. Evil shadow client can initiate but not commit a bogus rpc.
- RPC svr maintains rollback buffer in case attack detected later; only commits once "true" client verifies it executed successfully past the rpc call site and approves the speculative results w/a signed msg.
- this is purely a performance optimization that is resilient to attack/evil shadow client--not a defense mechanism against attacks.
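
Toy sketch of the rollback-buffer idea above (Python; not Yee's design). An HMAC over a key shared by the true client and the server stands in for the signed approval msg, which is an assumption. The class and method names are hypothetical. The shadow client can start an RPC, but only an approval from the true client makes it stick.

```python
import hashlib
import hmac

_MISSING = object()  # marks "key did not exist before the speculative write"


def sign(client_key: bytes, rpc_id: str, key: str, value) -> bytes:
    # Stand-in for the true client's signed approval of one speculative RPC.
    msg = repr((rpc_id, key, value)).encode()
    return hmac.new(client_key, msg, hashlib.sha256).digest()


class SpeculativeServer:
    def __init__(self, client_key: bytes):
        self._client_key = client_key
        self.store = {}
        self._undo = {}  # rollback buffer: rpc id -> (key, value, old value)

    def speculate(self, rpc_id, key, value):
        # Anyone (including an evil shadow) may initiate; keep undo info.
        self._undo[rpc_id] = (key, value, self.store.get(key, _MISSING))
        self.store[key] = value

    def resolve(self, rpc_id, approval: bytes) -> bool:
        # Commit only if the true client approved exactly this RPC.
        key, value, old = self._undo.pop(rpc_id)
        expected = sign(self._client_key, rpc_id, key, value)
        if hmac.compare_digest(expected, approval):
            return True
        if old is _MISSING:
            self.store.pop(key, None)
        else:
            self.store[key] = old
        return False
```

This sketch handles one outstanding speculative write per key; overlapping speculation on the same key would need an ordered undo log.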
Selected notes from DSN 2004
- keynote: Lamport on Paxos (Byz agreement fwk from MSR). 2 new twists:
- fast paxos makes common case take only 1 RTT for consensus (at cost of more overall msgs sent), rare case same as regular paxos.
- generalized paxos allows consensus to be reached more easily for operations that commute: consensus fwk extended to mean consensus on a complete incremental partial ordering of results, not on each specific result. when possible conflicts arise, it is slower than regular paxos; otherwise same # of RTT's.
FAB ppr and M-quorums: Dstore guarantees linearizability; delayed-commit is allowed. FAB enforces strict linearizability, which requires that a partial write be either committed or rolled back on the very next read of the data (not at some arbitrary future time). To achieve this, they use m-quorums -- quorums that intersect in m members rather than in just 1. Algs are in the ppr as are some interesting refs.
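
Quick check of the m-quorum sizing (Python). This is just the counting argument, not FAB's actual construction, and the function names are mine. With n replicas and quorums of size q, any two quorums share at least 2q - n members, so an overlap of m needs q >= ceil((n + m) / 2). The brute-force helper confirms this for small n.

```python
from itertools import combinations


def min_quorum_size(n: int, m: int) -> int:
    """Smallest q such that any two size-q subsets of n replicas share >= m."""
    if not 1 <= m <= n:
        raise ValueError("need 1 <= m <= n")
    return (n + m + 1) // 2  # integer form of ceil((n + m) / 2)


def min_intersection(n: int, q: int) -> int:
    """Brute force: smallest overlap between any two size-q quorums."""
    quorums = [set(c) for c in combinations(range(n), q)]
    return min(len(a & b) for a in quorums for b in quorums)
```

With m = 1 this reduces to an ordinary majority quorum, e.g. 3 of 5.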
DEFENSE-CENTRIC ATTACK TAXONOMY
Roy Maxion's grp (see also their STIDE work and the biblio to this ppr)
- idea: defenders want taxonomy based on what can be detected by IDS; most current taxonomies are based on attackers' goals
- in fact, basically a laundry list for attackers...keep trying until an attack foils the defender's IDS
- goal: construct taxonomy based on how anomalies in sensor data are detected.
- sensors==syscalls, as in the Hofmeyr 1998 ppr
- related: Carl Landwehr (1994) taxonomy based on programming errors that facilitate attack (eg buffer overruns)
- syscall-based IDS ignores the args of syscalls, only uses trap#. According to Mike Jordan, algos are naive in Hofmeyr98, and kernel machines could do better. Could we improve detectability (of attacks undetectable under this taxonomy) by using better algos and considering the arguments too? (like Liblit/Zheng assertion sampling) - a good SWDYFORP! (note, this type of data is called 'categorical data', and the anomaly types corresponding to his taxons are the ONLY anomaly types that arise in categorical data.)
- defense-centric txnmy classes: foreign symbol anomaly, foreign sequence anomaly, dormant sequence (truncated normal sequence), no anomaly (not detectable based on sequences of syscall symbols).
- based on 25 known attacks
- the methodology in this ppr is interesting! (esp. Validation of taxonomy)
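
Small sketch of how the taxonomy classes above could be assigned with a sliding window over syscall names (Python). The window size, the function names, and the exact class definitions are my assumptions, not the ppr's. The dormant-sequence class is left out because these notes do not pin down how it is detected.

```python
def windows(trace, k):
    """All length-k sliding windows of a syscall trace, as a set of tuples."""
    return {tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)}


def classify(train, test, k=3):
    """Label a test trace relative to 'normal' training data."""
    alphabet = set(train)
    # Foreign symbol: a syscall never seen at all in training.
    if any(symbol not in alphabet for symbol in test):
        return "foreign symbol"
    # Foreign sequence: known syscalls in an order never seen in training.
    normal = windows(train, k)
    if any(w not in normal for w in windows(test, k)):
        return "foreign sequence"
    # Nothing in the symbol sequence distinguishes this trace from normal.
    return "no anomaly"
```

An attack that lands in "no anomaly" is exactly the case where looking at syscall arguments might help.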
CHECKPOINTING INTERNAL DATA STRUCTURES IN MAIN MEM DB'S (Kalbarczyk,Iyer et al)
- goal: allow fine grain recovery for in-mem DB by ckpointing its ctrl structures using ARMOR.
- sounds like Sleepycat but not sure
- embed ARMOR into monolithic proc's to allow it to become an ARMOR compt
- q: how do you know you captured ALL relevant state (+haven't introduced INCONSISTENCY by fine grain reco)? A: we don't, but observed no such cases.
- SWDYFORP: improve on JAGR microrecovery using ARMORs?
PROACTIVE RECOVERY IN DISTRIB CORBA APPS
Basically replicates the memory leak/proactive microrejuv sec of Geo's OSDI ppr, but in CORBA rather than J2EE.
SECURE DISTRIBUTED DNS (Cachin)
Use optimistic atomic bcast (requires Byzantine-resistant leader election) to order reads/writes among DNS peers (don't do master-slave dissemination). Client gets N replies and uses majority voting to select right one. For signing DNS updates, use threshold crypto to avoid leaking the secret signing key to a corrupt server.
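
Sketch of the client-side majority vote over N replies (Python). `majority_reply` is a hypothetical name, and the strict more-than-half rule is my reading of "majority voting", not necessarily the threshold SINTRA uses.

```python
from collections import Counter


def majority_reply(replies):
    """Return the answer given by more than half of the replicas."""
    if not replies:
        raise ValueError("no replies")
    value, count = Counter(replies).most_common(1)[0]
    # Refuse to answer on a tie or a plurality that is not a majority.
    if count * 2 <= len(replies):
        raise ValueError("no majority among replies")
    return value
```

A corrupt minority can return bogus records but cannot change what the client accepts.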