rcrowley + ops   13

ideal ops checklist — Gist
Sort of a laundry list but if you hit most of these you're in pretty good shape.
ops  devops 
4 days ago by rcrowley
Operations - Cassandra Wiki
Introductory operational information about Cassandra.
ops  cassandra  backups  monitoring 
5 weeks ago by rcrowley
Google: Achieving Rapid Response Times in Large Online Services
Things to consider when optimizing at the 99th percentile. Variance is key and is one of the biggest reasons EBS blows.

Counterintuitively, Jeff Dean suggests synchronizing sources of variance across a cluster (for example, a Puppet run). Even though all requests are slow for that moment, that is better than letting every request potentially be slow because it fanned out to a temporarily slow node.
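
A rough sketch of the arithmetic behind that claim, under assumptions that are mine and not from the talk: each node is slow for a fixed fraction of the time, nodes are slow independently of one another in the unsynchronized case, and a request is slow if any node it touches is slow. The function names and the 1% figure are hypothetical, chosen only for illustration.

```python
def fraction_slow_unsynchronized(p_slow: float, fanout: int) -> float:
    """Chance that at least one of `fanout` independently slow nodes is slow."""
    return 1.0 - (1.0 - p_slow) ** fanout


def fraction_slow_synchronized(p_slow: float) -> float:
    """If every node is slow during the same window, a request is slow only
    when it arrives inside that window, whatever its fan-out."""
    return p_slow


# Assumed for illustration: each node spends 1% of its time being slow.
P_SLOW = 0.01

for fanout in (1, 10, 100):
    unsync = fraction_slow_unsynchronized(P_SLOW, fanout)
    sync = fraction_slow_synchronized(P_SLOW)
    print(f"fanout={fanout:3d}  unsynchronized={unsync:.3f}  synchronized={sync:.3f}")
```

With these assumed numbers, a request that fans out to 100 nodes hits a slow one well over half the time when slowness is unsynchronized, while synchronizing the slow windows keeps the slow fraction at the per-node rate.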
ops  dist  variance  performance  scalability  optimization  google  jeffdean 
8 weeks ago by rcrowley
lusis/discotheque - GitHub
(Dumping GitHub watches into Pinboard.)

AWS node discovery tools.
aws  ruby  discovery  ops 
october 2011 by rcrowley
lusis/Noah - GitHub
Interesting CM-focused analogue to Chubby/ZooKeeper/Doozer. I can't help but think the high-level API has muddled the design. I'd love to see that same API implemented in terms of Doozer.
noah  ops  cm  chubby  doozer  zookeeper 
april 2011 by rcrowley
Coping With Cloud Downtime | Backdrift
Some good, if a bit obvious, advice on operable cloud infrastructure. "Manage your systems from outside the cloud" is one I'm not so sure about: it implies a more permeable firewall and/or more directly Internet-accessible system components, which may not be an option for many companies because of PCI-DSS standards, among other things.

Plugs for Puppet are always good, too.
ops  cloud  puppet  bcp 
april 2011 by rcrowley
