IoT, DNS, Feature Flags, and Chaos-as-a-Service

Velocity NYC 2018

The O’Reilly Velocity 2018 Distributed Systems Conference was held October 1-3 at New York’s Midtown Hilton. As sponsors, Team Opsview arrived with a song in our hearts and a lot of purple Opsview-branded swag to give away, including a mystery-flavor lip balm that proved irresistibly magnetic to attendees (or perhaps it was the Nintendo Switch we were giving away to someone who correctly guessed the flavor).

Velocity is a bigger, broader, more sprawling event than some of the others we’ve attended recently. Pre-show training days, half-day bootcamps, and multiple presentation tracks, plus a show floor and sponsor pavilion, meant no single person could see everything. So our choice of highlights, this time around, is more spartan, varied, and subjective. You can dig deeper by visiting the Velocity NYC 2018 website and spelunking the schedule for links to video highlights and slides. Full videos are available through O’Reilly’s online learning platform, which offers a free ten-day trial with no credit card required.

SaaS Services for Dev and Ops

Samsara runs a platform for monitoring huge flocks of distributed IoT sensors in fleet, industrial, and other applications. As you can imagine, they understand scale (the company’s name, drawn from Sanskrit, means something like “wandering through the infinite cycles of the world”). In her talk, titled Practical Performance Theory, Samsara’s Kavya Joshi discusses (mostly relatively simple) models of applications, explains their limitations, and shows how to reason with them and derive useful predictions from them. She moves back and forth between basic queueing theory and spookier, more complex, multi-regime scenarios, and provides valuable insights that anyone working on parallel transactions at high volumes (e.g., serverless apps) can put to work today.
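To give a flavor of the basic queueing theory mentioned above, here is a minimal sketch. It assumes the textbook M/M/1 model (one server, random arrivals, exponentially distributed service times), which is our own choice of illustration and not necessarily a model used in the talk. The function names and the service rate are hypothetical.

```python
def utilization(arrival_rate: float, service_rate: float) -> float:
    """Fraction of time the server is busy (rho = lambda / mu)."""
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("arrival_rate must be >= 0 and service_rate must be > 0")
    return arrival_rate / service_rate


def mm1_mean_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean time a request spends queued plus in service, in an M/M/1 queue.

    Uses the standard result 1 / (mu - lambda), which only holds while the
    arrival rate is strictly below the service rate.
    """
    if utilization(arrival_rate, service_rate) >= 1.0:
        raise ValueError("queue is unstable: arrival_rate must be < service_rate")
    return 1.0 / (service_rate - arrival_rate)


if __name__ == "__main__":
    service_rate = 100.0  # hypothetical: the server completes 100 requests/second
    for arrival_rate in (50.0, 90.0, 99.0):
        rho = utilization(arrival_rate, service_rate)
        seconds = mm1_mean_response_time(arrival_rate, service_rate)
        print(f"utilization {rho:.0%}: mean response time {seconds * 1000:.0f} ms")
```

Under these assumptions, a server that handles 100 requests per second averages 20 ms per request at 50 percent utilization, 100 ms at 90 percent, and a full second at 99 percent. That nonlinear blow-up near saturation is the kind of prediction a simple model can deliver.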

NS1 provides DNS and traffic-shaping services to some of the world’s biggest internet companies. Its CEO and founder, Kris Beevers, gave a keynote, “Balancing Good Enough and Perfect,” on how to find the sweet spot between (as he puts it) “not getting to make big mistakes” and delivering new products and features at speed. He opens by discussing a pathfinding/optimization problem on a topo map -- a physical example that turns out to have deep roots in explorations of chaos theory and in the problem of optimizing prematurely and getting stuck on a local maximum.
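The local-maximum trap is easy to reproduce in a few lines. What follows is a toy sketch of our own, not an example taken from the keynote: a greedy climber walks a one-dimensional "ridge line" of made-up elevations, always stepping to the higher neighbor and stopping when neither neighbor is higher.

```python
def hill_climb(heights: list[float], start: int) -> int:
    """Greedy ascent: step to the highest adjacent point until no neighbor is higher.

    Returns the index where the climber stops, which may be only a local maximum.
    """
    position = start
    while True:
        neighbors = [i for i in (position - 1, position + 1) if 0 <= i < len(heights)]
        if not neighbors:
            return position  # single-point terrain: nowhere to go
        best = max(neighbors, key=lambda i: heights[i])
        if heights[best] <= heights[position]:
            return position  # no uphill step left
        position = best


if __name__ == "__main__":
    # Hypothetical elevations: a small peak (5) and a taller one (12).
    terrain = [1, 3, 5, 4, 2, 6, 9, 12, 8]
    for start in (0, 4):
        stop = hill_climb(terrain, start)
        print(f"start at {start}: stopped at index {stop}, height {terrain[stop]}")
```

Starting from index 0, the climber stops on the small peak (height 5) and never sees the taller one. Starting from index 4, it reaches the summit (height 12). Always taking the locally best step is fast, but where you end up depends on where you began.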

Imagine being able to continuously deliver features without hassles -- turning them on and off on production systems without rollbacks or drama. Imagine being able to A/B or canary-test new features: presenting them to fine-tuned subsets of customers -- even selecting behaviorally or from user data such as PII -- while presenting the rest of your customers with tried-and-true functionality. LaunchDarkly provides a SaaS or on-premises platform that enables all this: it serves feature flags to your users in real time, giving you fine-grained manual or automated control over user experience.
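To make the idea concrete, here is a minimal sketch of a percentage rollout, assuming a common approach in which each user is hashed into a stable bucket from 0 to 99. This is an illustration of the general technique only. It is not LaunchDarkly's SDK or algorithm, and the function and flag names are hypothetical.

```python
import hashlib


def rollout_bucket(flag_key: str, user_key: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(f"{flag_key}:{user_key}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_key: str, user_key: str, rollout_percent: int, enabled: bool = True) -> bool:
    """Decide whether a user sees the feature.

    `enabled` acts as a kill switch: setting it to False turns the feature off
    for everyone without a rollback or redeploy.
    """
    if not enabled:
        return False
    return rollout_bucket(flag_key, user_key) < rollout_percent


if __name__ == "__main__":
    users = [f"user-{n}" for n in range(1000)]  # hypothetical user keys
    canary = [u for u in users if is_enabled("new-checkout", u, rollout_percent=10)]
    print(f"{len(canary)} of {len(users)} users see the new feature")
```

Because the bucket depends only on the flag and user keys, a given user gets the same answer on every request, and raising the percentage only adds users to the canary group without removing anyone already in it.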

Developers love the idea (sometimes not so much the practice) of Chaos Engineering -- challenging the resilience of production systems (and testing the systems and people monitoring and managing them) by automatically disabling components and servers at intervals and seeing what happens. Doing this in practice has mostly focused on a relatively hard-to-use solution called Chaos Monkey, invented and later open-sourced by Netflix. Now Gremlin, who we met at Velocity (and who were handing out the best gremlin-shaped swag mints ever!), has introduced what they call a Failure-as-a-Service platform: a hosted service that lets you perform selective experiments in automated system breakage (and/or simulated load application) with the goal of identifying areas of weakness and improving resilience. Clear web displays let you target server resources (e.g., CPU and memory) or network traffic, or simulate various kinds of bad behavior, from unexpected restarts to process failures to time drift, a common cause of “split brain” failures in large-scale cloud frameworks. Not inexpensive, but probably a worthwhile expense if highly controllable chaos is what your team needs.
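The core loop of a chaos experiment (disable something, then check whether the system still meets its availability target) can be sketched against a purely simulated pool of replicas. This is a toy model of our own, not Gremlin's or Chaos Monkey's API. The replica names and the `min_healthy` threshold are hypothetical, and nothing real is terminated.

```python
import random


def run_experiment(replicas: list[str], kill_count: int, min_healthy: int,
                   rng: random.Random) -> dict:
    """Simulate disabling `kill_count` random replicas and report the outcome.

    The service counts as available if at least `min_healthy` replicas survive.
    """
    victims = set(rng.sample(replicas, kill_count))
    survivors = [r for r in replicas if r not in victims]
    return {
        "victims": sorted(victims),
        "survivors": survivors,
        "available": len(survivors) >= min_healthy,
    }


if __name__ == "__main__":
    pool = ["web-1", "web-2", "web-3", "web-4"]  # hypothetical replica names
    rng = random.Random()  # pass a seeded Random to make an experiment repeatable
    for kill_count in (1, 2):
        result = run_experiment(pool, kill_count, min_healthy=3, rng=rng)
        status = "still available" if result["available"] else "degraded"
        print(f"disabled {result['victims']}: service {status}")
```

A real experiment would replace the simulated pool with actual infrastructure and the `available` check with your monitoring, which is where the hard part (and the value) lies.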

by John Jainschigg,
Technical Content Marketing Manager
John is an open cloud computing and infrastructure-as-code/DevOps advocate. Before joining Opsview, John was Technical Marketing Engineer at OpenStack solutions provider, Mirantis. John lives in New York City with his family, a pariah dog named Lenny, and several cats. In his free time, John enjoys making kimchi, sauerkraut, pickles, and other fermented foods, and riding around town on a self-balancing electric unicycle.