Sep 10 2014

Automatic Docker Service Announcement with Registrator

No matter which service discovery system you use, it will not likely know how to register your services for you. Service discovery requires your services to somehow announce themselves to the service directory. This is not as trivial as it sounds. There are many approaches to do this, each with their own pros and cons.

In an ideal world, you wouldn't have to do anything special. With Docker, we can actually arrange this with a component I've made called Registrator.

Before I get to Registrator, let's understand what it means to register a service and see what kind of approaches are out there for registering or announcing services. It might also be a good idea to see my last posts on Consul and on service discovery in general.

Service Registration Data Model

Service registration involves a few different pieces of information that describe a service. At the very least, it will involve a service name, such as "web", and an IP and port for locating it. Often, there is a unique ID for a service instance ("web.2"). Some systems generate this automatically.

Around this, there might be extra information or metadata associated with a service. In some systems this could be key-value attributes. Or maybe just tags. Classic service discovery of the zero-configuration world would also include the protocol (HTTP, SMTP, Jabber, etc), but this isn't very useful information since in our case we already know the protocol of the service we're looking for.

When using etcd or Zookeeper it's up to you how your service directory works, both what information is stored and how to structure it. Specialized service discovery systems like Flynn's discoverd or Netflix's Eureka provide more structure around service semantics. Consul is sort of a hybrid, since it's really a specialized service discovery system built into a general configuration store.

Consul lets you define a service name, IP, port, optional service ID, and optional tags. In a future release, I believe it will tie in more with the key-value store to allow you to have arbitrary attributes associated with a service. Right now, Consul also lets you define a health check to use with its monitoring system, which is unique to Consul.
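
Concretely, a registration sent to the local Consul agent looks something like this (a rough sketch against the HTTP API as documented at the time of writing; the address defaults to the agent's own IP):

$ curl -X PUT http://localhost:8500/v1/agent/service/register -d '{
    "ID": "web.2",
    "Name": "web",
    "Port": 8000,
    "Tags": ["production"],
    "Check": {"TTL": "30s"}
  }'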

So far, that gives you an idea of the data involved in registering a single service, but that's not the complete model. A service "record" is a reference to an actual service, and it's important to understand what that actually is. Whether using containers or not, a service will always boil down to a long-running process, and a process may listen on several ports. This could imply multiple services.

One could argue that if a process listens on multiple ports for the same functional service, it might be a good idea to collapse it into a single service. Modeling it in this way ends up being either complicated (putting the other service ports in metadata) or incomplete ("which port do I use for TLS?"). I've found it's simplest to just model each port a process listens on as a separate service, using the name to logically group them. For example, "webapp-http" and "webapp-https".

Registering In-process or Using a Coprocess

The most common registration strategy is actually for the service process to register itself directly. From a "good solution" perspective, this might seem terrible. But it's common for a reason. Mostly, it's pragmatic, as many organizations build their specific services around their specific service discovery system. However, it does have other advantages.

Service discovery systems like Eureka and discoverd provide a library that can be used in your service to register itself, as well as lookup and discover other services from in-process. This provides opportunities like having balancing and connection pooling logic taken care of for you, without the extra hop of a reverse proxy. And in cases where heartbeats are used for liveness, the library can handle heartbeating for you.

The disadvantage of this approach as a reusable system is that libraries are hard to provide across languages, so there might be limited language support for the library. Depending on how complex the library is, it may also be difficult to port for people who want to make the effort to expand language support.

Though, the biggest disadvantage is putting the responsibility on the service in the first place. This creates two problems. First, if you intend to make your services useful to anybody else, your service will be less portable across environments that use different discovery mechanisms. Netflix open source projects suffer from this, as people already complain it's too hard to use some of their components without using all of them. Second, third-party components and services like Nginx, Memcached, or pretty much any datastore will not register themselves.

While some software might provide hooks or extensions to integrate with your service discovery, this is pretty rare. And patching is not a scalable solution. Instead, the common solution for third-party services is to put the registering responsibility near the service.

If you're not directly registering in-process, the second most common approach is running another parallel process to register the service. This works best with a process manager like systemd that can ensure that if the service starts, so does the paired registering process.
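
As a rough sketch with systemd (the unit names and registration scripts here are made up for illustration), the registering unit ties its lifecycle to the service unit:

# webapp-register.service -- illustrative only
[Unit]
BindsTo=webapp.service
After=webapp.service

[Service]
# register-webapp announces the service to the registry and heartbeats;
# deregister-webapp removes the entry when the pair shuts down
ExecStart=/usr/local/bin/register-webapp
ExecStopPost=/usr/local/bin/deregister-webapp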

Some call this technique using a coprocess or a "sidekick". When working with containers, I usually use coprocess in reference to another process in the same container. A sidekick would be a separate container and process. Either way, this is a useful pattern even beyond service registration. I use it for other administrative services that support the main service, for example to re-configure the service. The open source PaaS Deis used this pattern for shipping out a service's logs. However, it seems that to simplify things they're moving to my tool logspout.

A variation of using a coprocess is process "wrapping", where you use a launcher process that will register and run the service as a child process. Flynn does this with sdutil. Some might say it can make starting services feel very complicated since you now have to configure the service as usual, on top of providing registration details to the launcher. At the end of the day, this is effectively the coprocess model launched with one command instead of two.

The Problem with a Coprocess for Registering

In whatever form it takes, a coprocess brings two challenges: configuration and manageability.

With a parallel announcing process, you need to tell it what service or services it should announce, providing it all the information we talked about before. An interesting problem with any external registration solution is where that service description is stored. For example, if you were doing announcement in-process, it would at least already know what ports it exposes. However, it most likely wouldn't know what the operator wants to call it. Some systems will roll all this information up into higher-level system constructs, like "service groups" or some unit of orchestration. I prefer not to couple service discovery with orchestration. Instead, I'd rather service semantics live as close to the service process as possible.

A coprocess or sidekick for registering also means you'll have one for every service you start. There is no technical problem with this, but it introduces operational complexity. A system has to manage this, whether it's a process manager like systemd or full-on orchestration. That system likely has to be configured, adding more configuration, which may or may not be the right place to define the service. And now you need to be sure to always use this system to launch any service, since running a service by hand will not register the service.

In an ideal world, we don't worry about any of this. We just run a service and its ports somehow get registered as services. If we want to specify more details about the service, we can do this in a way that's packaged as close to the service as possible. And of course, we want an operator and automation friendly way to set or override that service definition at runtime.

How Docker Helps Achieve the Ideal

Running services in Docker provides a number of benefits, and those who believe Docker is just about container isolation clearly miss the point. Docker defines a standard unit of software that can have anything in it and yet have a standard interface of operations. This interface works with a runtime that gives you certain capabilities in managing and operating that unit of software. These capabilities and this common container model happen to have everything we need to automatically register services for any software.

The Docker container image includes default environment variables, which can be defined by the Dockerfile. This turns out to be the perfect place to describe the service it contains. The container author has the option to use the environment variables to include their idea of how the service should be described and registered, which will be shipped with the container wherever it goes. The operator can then set runtime environment variables to further define or redefine their own description of the service.
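
For example, a container author might bake a default description into the image, and an operator can override it at runtime (the variable names here are purely illustrative; they only mean something to whatever does the registering):

# In the Dockerfile:
FROM nginx
ENV SERVICE_NAME webapp
ENV SERVICE_TAGS production
EXPOSE 80

# At runtime, the operator gets the last word:
$ docker run -d -P -e "SERVICE_NAME=webapp-staging" example/webapp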

The Docker runtime makes these values easy to inspect programmatically. The runtime also produces events when a container starts or stops, which is generally when you want to register or deregister the services of the container.
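
For example, roughly, with the stock Docker CLI:

$ docker events                                  # streams container start/stop/die events as they happen
$ docker inspect -f '{{ .Config.Env }}' webapp   # prints the environment of the container named webapp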

All this together lets us provide automatic service registration for any Docker container using a little appliance I've made called Registrator.

Introducing Registrator

Registrator is a single, host-level service you run as a Docker container. It watches for new containers, inspects them for service information, and registers them with a service registry. It also deregisters them when the container dies. It has a pluggable registry system, meaning it can work with a number of service discovery systems. Currently it supports Consul and etcd.
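
Running it looks roughly like this (see the project README for the exact, current invocation):

$ docker run -d \
    -v /var/run/docker.sock:/tmp/docker.sock \
    -h $HOSTNAME \
    progrium/registrator consul://10.0.1.1:8500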

There are a few neat properties of Registrator:

First, it's automatic. You don't have to do anything special other than have Registrator running and attached to a service registry. Any public port published is registered as a service.

Related but fairly significant, it requires no cooperation from inside the container to register services. If no service description is included and the operator doesn't specify any at runtime, it uses Docker container introspection for good defaults.

Next, it uses environment variables as generic metadata to define the services. Some people have asked how you can add metadata to Docker containers, but the answer is right in front of them. As mentioned, this comes with the benefit of being able to define the metadata during container authorship, as well as at runtime.

Lastly, the metadata Registrator uses could become a common interface for automatic service registration beyond Registrator and even beyond Docker. Environment variables are a portable metadata system and Registrator defines a very data-driven way to define services. That same data could be used by any other system.

In terms of previous work, Michael Crosby's project Skydock was a big inspiration on the direction of Registrator, so it might be worth looking into for reference. Registrator is a little more generic and made specifically for distributed systems, not as much for single host registries. For example, Registrator focuses on published ports and uses a host-level IP as opposed to local container IPs. For people interested in single-host discovery, Registrator has already inspired compatible alternatives, including Brian Lalor's docker-hosts.

In any case, I believe I've made the first general purpose solution to automatic service registration. Here's a video demo:

Onward…

In retrospect, the problem we've solved here now seems very trivial, but we've never had this before. Like many good designs, it can take a while for all the pieces to come together and make sense in one's mind before it becomes obvious. Once it's obvious, it seems like it always was.

Combining auto-registration with a good service directory, you're almost to an ideal service discovery system. That last problem is about the other side of discovery: connecting to registered services. The next post will describe how this is also not as trivial as it sounds, and as usual, I will offer an open source solution.


Aug 20 2014

Consul Service Discovery with Docker

Consul is a powerful tool for building distributed systems. There are a handful of alternatives in this space, but Consul is the only one that really tries to provide a comprehensive solution for service discovery. As my last post points out, service discovery is a little more than what Consul can provide us, but it is probably the biggest piece of the puzzle.

Understanding Consul and the "Config Store"

The heart of Consul is a particular class of distributed datastore with properties that make it ideal for cluster configuration and coordination. Some call them lock servers, but I call them "config stores" since it more accurately reflects their key-value abstraction and common use for shared configuration.

The father of config stores is Google's Chubby, which was never made publicly available but is described in the influential Chubby paper. In the open source world we have Apache Zookeeper, the mostly defunct doozerd, and in the last year, etcd and Consul.

These specialized datastores are defined by their use of a consensus algorithm requiring a quorum for writes and generally exposing a simple key-value store. This key-value store is highly available, fault-tolerant, and maintains strong consistency guarantees. This can be contrasted with a number of alternative clustering approaches like master-slave or two-phase commit, all with their own benefits, drawbacks, and nuances.

You can learn more about the challenges of designing stateful distributed systems with the online book, Distributed systems for fun and profit. This image from the book summarizes where the quorum approach stands compared to others:

Quorum datastores such as our config stores seem to have many ideal properties except for performance. As a result, they're generally used as low-throughput coordinators for the rest of the system. You don't use them as your application database, but you might use them to coordinate replacing a failed database master.

Another common property of config stores is they all have mechanisms to watch for key-value changes in real-time. This feature is central in enabling use-cases such as electing masters, resource locking, and service presence.

Along comes Consul

Since Zookeeper came out, the subsequent config stores have been trying to simplify, in terms of user interface, ease of operation, and implementation of the consensus algorithm. However, they're all based on this very expressive, but lowest common denominator abstraction of a key-value store.

Consul is the first to build on top of this abstraction by also providing specific APIs around the semantics of common config store functions, namely service discovery and locking. It also does it in a way that's very thoughtful about those particular domains.

For example, a directory of services without service health is actually not a very useful one. This is why Consul also provides monitoring capabilities. Consul monitoring is comparable, and even compatible, with Nagios health checks. What's more, Consul's agent model makes it more scalable than centralized monitoring systems like Nagios.

A good way to think of Consul is as three layers. The middle layer is the actual config store, which is not that different from etcd or Zookeeper. The layers above and below are pretty unique to Consul.

Before Consul, HashiCorp developed a host node coordinator called Serf. It uses an efficient gossip protocol to connect a set of hosts into a cluster. The cluster is aware of its members and shares an event bus. This is primarily used to know when hosts come and go from the cluster, such as during a host failure. But in Serf the event bus was also exposed for custom events to trigger user actions on the hosts.

Consul leverages Serf as a foundational layer to help maintain its cluster. For the most part, it's more of an implementation detail. However, I believe in an upcoming version of Consul, the Serf event bus will also be exposed in the Consul API.

The key-value store in Consul is very similar to etcd. It shares the same semantics and basic HTTP API, but differs in subtle ways. For example, the API for reading values lets you optionally pick a consistency mode. This is great not just because it gives users a choice, but it documents the realities of different consistency levels. This transparency educates the user about the nuances of Consul's replication model.
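
For example, reads against the key-value API can ask for a consistency mode explicitly (endpoint and parameter names roughly as documented at the time of writing):

$ curl http://localhost:8500/v1/kv/config/db_url              # default mode
$ curl http://localhost:8500/v1/kv/config/db_url?consistent   # strongest guarantee, extra quorum check
$ curl http://localhost:8500/v1/kv/config/db_url?stale        # any server may answer, possibly stale but fast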

On top of the key-value store are some other great features and APIs, including locks and leader election, which are pretty standard for what people originally called lock servers. Consul is also datacenter aware, so if you're running multiple clusters, it will let you federate clusters. Nothing complicated, but it's great to have built-in since spanning multiple datacenters is very common today.

However, the killer feature of Consul is its service catalog. Instead of using the key-value store to arbitrarily model your service directory as you would with etcd or Zookeeper, Consul exposes a specific API for managing services. Explicitly modeling services allows it to provide more value in two main ways: monitoring and DNS.
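
Instead of walking a key hierarchy, you ask about services directly (again, a sketch with paths roughly as documented at the time of writing):

$ curl http://localhost:8500/v1/catalog/service/web           # all registered instances of "web"
$ curl http://localhost:8500/v1/health/service/web?passing    # only instances whose checks are passing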

Built-in Monitoring System

Monitoring is normally discussed independent of service discovery, but it turns out to be highly related. Over the years, we've gotten better at understanding the importance of monitoring service health in relation to service discovery.

With Zookeeper, a common pattern for service presence, or liveness, was to have the service register an "ephemeral node" value announcing its address. As an ephemeral node, the value would exist as long as the service's TCP session with Zookeeper remained active. This seemed like a rather elegant solution to service presence. If the service died, the connection would be lost and the service listing would be dropped.

In the development of doozerd, the authors avoided this functionality, both for the sake of simplicity and that they believed it encouraged bad practice. The problem with relying on a TCP connection for service health is that it doesn't exactly mean the service is healthy. For example, if the TCP connection was going through a transparent proxy that accidentally kept the connection alive, the service could die and the ephemeral node may continue to exist.

Instead, they implemented values with an optional TTL. This allowed for the pattern of actively updating the value if the service was healthy. TTL semantics are also used in etcd, allowing the same active heartbeat pattern. Consul supports TTL as well, but primarily focuses on more robust liveness mechanisms. In the discovery layer I helped design for Flynn, our client library lets you register your service and it will automatically heartbeat for you behind the scenes.
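
The active heartbeat pattern looks roughly like this against etcd's HTTP API (ports and paths as of etcd 0.4; purely a sketch):

$ while true; do
    curl -s -X PUT http://127.0.0.1:4001/v2/keys/services/web/web1 \
      -d value="10.0.1.5:8000" -d ttl=30 > /dev/null   # re-assert presence with a 30s TTL
    sleep 10
  done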

This is generally effective for service presence, but it might not take the lesson to heart. Blake Mizerany, the co-author of doozerd and now maintainer of etcd, will stress the importance of meaningful liveness checks. In other words, there is no one-size-fits-all. Every service performs a different function and without testing that specific functionality, we don't actually know that it's working properly. Generic heartbeats can let us know if the process is running, but not that it's behaving correctly enough to safely accept connections.

Specialized health checks are exactly what monitoring systems give us, and Consul gives us a distributed monitoring system. Then it lets us choose if we want to associate a check with a service, while also supporting the simpler TTL heartbeat model as an alternative. Either way, if a service is detected as not healthy, it's hidden from queries for active services.

Built-in DNS Server

In my last post, I mentioned how DNS is not a sufficient technology for service discovery. I was very hesitant in accepting the value of a DNS interface to services in Consul. As I described before, all our environments are set up to use DNS for resolving names to IPs, not IPs with ports. So other than identifying the IPs of hosts in the cluster, the DNS interface at first glance seems to provide limited value, if any, for our concept of service discovery.

However, it does serve SRV records for services, and this is huge. Built-in DNS resolvers in our environments don't look up SRV records, however, the library support to do SRV lookups ourselves is about as ubiquitous as HTTP. This took me a while to realize. It means we all have a client, even more lightweight than HTTP, and it's made specifically for looking up a service.
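
A lookup is one command away (Consul serves DNS on its own port, 8600 by default, and names follow the name.service.consul convention):

$ dig @127.0.0.1 -p 8600 web.service.consul SRV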

To me this makes SRV the best standard API for simple service discovery lookups. I hope more service discovery systems implement it.

In a later post in this series, we build on SRV records from Consul DNS to generically solve service inter-connections in Docker clusters. I don't think I would have realized any of this if Consul didn't provide a built-in DNS server.

Consul and the Ecosystem

Consul development is very active. In the past few months, they've had several significant releases, although it's still pre-1.0. Etcd is also actively being developed, though currently from the inside out, focusing on a re-design of their Raft implementation. The two projects are similar in many ways, but also very different. I hope they learn and influence each other, perhaps even share some code since they're both written in Go. At this point, though, Consul is ahead as a comprehensive service discovery primitive.

Unfortunately, Consul is much less popular in the Docker world. Perhaps this is just due to less of a focus on containers at HashiCorp, which is contrasted by the heavily container-oriented mindset of the etcd maintainers at CoreOS.

I've been trying hard to help bridge the Docker and Consul world by building a solid Consul container for Docker. I try to design containers to be self-contained, runtime-configurable appliances as much as possible. It was not hard to do this with Consul, which is now available on Github or Docker Hub.

Running Consul in Docker

Running a Consul node in Docker for a production cluster can be a bit tricky. This is due to the amount of configuration that the container itself needs for Consul to work. For example, here's how you might start one node using Docker (one command over several lines for readability):

$ docker run --name consul -h $HOSTNAME  \
    -p 10.0.1.1:8300:8300 \
    -p 10.0.1.1:8301:8301 \
    -p 10.0.1.1:8301:8301/udp \
    -p 10.0.1.1:8302:8302 \
    -p 10.0.1.1:8302:8302/udp \
    -p 10.0.1.1:8400:8400 \
    -p 10.0.1.1:8500:8500 \
    -p 172.17.42.1:53:53/udp \
    -d -v /mnt:/data \
    progrium/consul -server -advertise 10.0.1.1 -join 10.0.1.2

The Consul container I built comes with a helper command letting you simply run:

$ $(docker run progrium/consul cmd:run 10.0.1.1::10.0.1.2 -d -v /mnt:/data)

This is just a special command to generate a full Docker run command like the first one, hence wrapping it in a subshell. It's not required, but a helpful convenience to hopefully get people started with Consul in Docker much quicker.

One of the neat ways Consul and Docker can work together is by giving Consul as a DNS server to Docker. This transparently runs DNS resolution in containers through Consul. If you set this up at the Docker daemon level, you can also specify DNS search domains. That means the .service.consul suffix can be dropped, allowing containers to resolve records with just the service name.
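
Roughly, that daemon-level setup looks like this (flags as of Docker 1.2; 172.17.42.1 is the usual docker0 bridge address where the Consul container's DNS port was published above):

$ docker -d --dns 172.17.42.1 --dns-search service.consul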

The project README has some pretty helpful getting started instructions as well as more detail on all these features. Here's a quick video showing how easy it is to get a Consul cluster up and running inside Docker, including the above DNS trick.

Onward…

Once you have Consul running in Docker, you're close to having great service discovery, but as I mentioned in my last post, you're still missing the other two legs. Stay tuned for the next post on automatically registering containerized services with Consul.

Jul 29 2014

Understanding Modern Service Discovery with Docker

Over the next few posts, I'm going to be exploring the concepts of service discovery in modern service-oriented architectures, specifically around Docker. Many people aren't familiar with service discovery, so I have to start from the beginning. In this post I'm going to be explaining the problem and providing some historical context around solutions so far in this domain.

Ultimately, we're trying to get Docker containers to easily communicate across hosts. This is seen by some as one of the next big challenges in the Docker ecosystem. Some are waiting for software-defined networking (SDN) to come and save the day. I'm also excited by SDN, but I believe that well executed service discovery is the right answer today, and will continue to be useful in a world with cheap and easy software networking.

What is service discovery?

Service discovery tools manage how processes and services in a cluster can find and talk to one another. It involves a directory of services, registering services in that directory, and then being able to lookup and connect to services in that directory.

At its core, service discovery is about knowing when any process in the cluster is listening on a TCP or UDP port, and being able to look up and connect to that port by name.

Service discovery is a general idea, not specific to Docker, but is increasingly gaining mindshare in mainstream system architecture. Traditionally associated with zero-configuration networking, its more modern use can be summarized as facilitating connections to dynamic, sometimes ephemeral services.

This is particularly relevant today not just because of service-oriented architecture and microservices, but also because of the increasingly dynamic compute environments that support these architectures. Already-dynamic VM-based platforms like EC2 are slowly giving way to even more dynamic higher-level compute frameworks like Mesos. Docker is only contributing to this trend.

Name Resolution and DNS

You might think, "Looking up by name? Sounds like DNS." Yes, name resolution is a big part of service discovery, but DNS alone is insufficient for a number of reasons.

A key reason is that DNS was originally not optimized for closed systems with real-time changes in name resolution. You can get away with setting TTLs to 0 in a closed environment, but this also means you need to serve and manage your own internal DNS. What highly available DNS datastore will you use? What creates and destroys DNS records for your services? Are you prepared for the archaic world of DNS RFCs and server implementations?

Actually, one of the biggest drawbacks of DNS for service discovery is that DNS was designed for a world in which we used standard ports for our services. HTTP is on port 80, SSH is on port 22, and so on. In that world, all you need is the IP of the host for the service, which is what an A record gives you. Today, even with private NATs and in some cases with IPv6, our services will listen on completely non-standard, sometimes random ports. Especially with Docker, we have many applications running on the same host.

You may be familiar with SRV records, or "service" records, which were designed to address this problem by providing the port as well as the IP in query responses. At least in terms of a data model, this brings DNS closer to addressing modern service discovery.
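
An SRV record carries a port and a target host along with the usual DNS plumbing, roughly:

_api._tcp.example.com.  300  IN  SRV  10  50  8443  host12.example.com.
; priority 10, weight 50, port 8443, target host12.example.com.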

Unfortunately, SRV records alone are basically dead on arrival. Have you ever used a library or API to create a socket connection that didn't ask for the port? Where do you tell it to do an SRV record lookup? You don't. You can't. It's too late. Either software explicitly supports SRV records, or DNS is effectively just a tool for resolving names to host IPs.

Despite all this, DNS is still a marvel of engineering, and even SRV records will be useful to us yet. But for all these reasons, on top of the demands of building distributed systems, most large tech companies went down a different path.

Rise of the Lock Service

In 2006, Google released a paper describing Chubby, their distributed lock service. It implemented distributed consensus based on Paxos to provide a consistent, partition-tolerant (CP in CAP theorem) key-value store that could be used for coordinating leader elections, resource locking, and reliable low-volume storage. They began to use this for internal name resolution instead of DNS.

Eventually, the paper inspired an open source equivalent of Chubby called Zookeeper that spun out of the Apache Hadoop project. This became the de facto standard lock server in the open source world, mainly because there were no alternatives with the same properties of high availability and reliability over performance. The Paxos consensus algorithm was also non-trivial to implement.

Zookeeper provides similar semantics as Chubby for coordinating distributed systems, and being a consistent and highly available key-value store makes it an ideal cluster configuration store and directory of services. It's become a dependency of many major projects that require distributed coordination, including Hadoop, Storm, Mesos, Kafka, and others. Not surprisingly, it's used mostly in other Apache projects, often deployed at larger tech companies. It is quite heavyweight and not terribly accessible to "everyday" developers.

About a year ago, a simpler alternative to the Paxos algorithm, called Raft, was published. This set the stage for a real Zookeeper alternative and, sure enough, etcd was soon introduced by CoreOS. Besides being based on a simpler consensus algorithm, etcd is simpler overall. It's written in Go and lets you use HTTP to interact with it. I was extremely excited by etcd and used it in the initial architecture for Flynn.

Today there's also Consul by HashiCorp, which builds on the ideas of etcd. I specifically explore Consul and lock servers more in my next post.

Service Discovery Solutions

Both Consul and etcd advertise themselves as service discovery solutions. Unfortunately, that's not entirely true. They're great service directories. But this is just part of a service discovery solution. So what's missing?

We're missing exactly how to get all our software, whether custom services or off-the-shelf software, to integrate with and use the service directory. This is particularly interesting to the Docker community, which ideally has portable solutions for anything that can run in a container.

A comprehensive solution to service discovery will have three legs:

  • A consistent (ideally), highly available service directory
  • A mechanism to register services and monitor service health
  • A mechanism to lookup and connect to services

We've got good technology for the first leg, but the remaining legs, despite how they sound, aren't exactly trivial. Especially when ideally you want them to be automatic and "non-invasive." In other words, they need to work with non-cooperating software that wasn't designed for a service discovery system. Luckily, Docker has both increased the demand for these properties and made them easier to achieve.

In a world where you have lots of services coming and going across many hosts, service discovery is extremely valuable, if not necessary. Even in smaller systems, a solid service discovery system should reduce the effort in configuring and connecting services together to nearly nothing. Adding the responsibility of service discovery to configuration management tools, or using a centralized message queue for everything, are all-too-common alternatives that we know just don't scale.

My goal with these posts is to help you understand and arrive at a good idea of what a service discovery system should actually encompass. The next few posts will take a deeper look at each of the above mentioned legs, touching on various approaches, and ultimately explaining what I ended up doing for my soon-to-be-released project, Consulate.

Jul 25 2014

Building the Future with Docker

Before Docker, a handful of colleagues and I were dealing with a typical service-oriented system on EC2. It was typical in that it was quite far from ideal. Deploying components took too long. Getting new components in production was a nightmare. It was resource inefficient and harder to manage than it needed to be. It was big and complicated, and tightly-coupled in all the wrong places. It was hard to make changes to, especially removing legacy and broken code. Every component was different. Developers could only run a few components at a time locally, which was nothing like production. System-level integration testing was near impossible. On and on…

We built and scaled individual components and solved individual problems to satisfy product requirements. As such, the company continued to be successful. But the system as a whole, which had the most influence on the developer and operator experience, was not improving fast enough to keep up.

We sat down and started to just sketch out ideas of what an ideal system would look like. What would the perfect developer experience be like? How could you keep operators from falling into reactive mode? What would facilitate developers and operators to better work together? What infrastructure would you need to employ everything we knew to be best practice? What were large-scale systems like Google and Twitter doing for these that we weren't? How could you collapse these requirements into the simplest system possible?

For the past three years, I've been focusing on building and refining that ideal system. The full picture is still only in my head, but over time it's been getting validated as new tools come out, either by me or by others. It's been further validated and refined as I talk and work with various companies, both providers and consumers in this space.

It turns out the ideal involves a different way of thinking about systems than common system architecture and operations. Though, it's not so different from the way modern large-scale systems like Netflix and Google have evolved. The difference is in simplicity, orthogonality, and, most important, the overall experience design.

I don't just have a system in mind, I have a particular kind of experience with that system in mind.

I played a very minor role in the creation and success of Docker. To me, it was the first necessary, and maybe most pivotal component in making a reasonable approximation of this ideal system. Docker doesn't solve every problem, but it's not supposed to. It is the meta-primitive needed to build the rest of the primitives for this system. Docker is huge, but it's just the foundation.

I don't know how far I'll be able to take this obsessive drive to flesh out the rest of this ideal system. Luckily, I've found a lot of supporters that have sponsored my work in various ways. Without them, I would likely not even be working on these types of technologies.

Anyway, a couple weeks ago CenturyLink Labs did an interview with me that captures a slice of the specifics in what I'm talking about. It covers some history, but it also gets into my roadmap, which I'm hoping to share more details on in upcoming posts. The big milestone for me is Manifold. As a stepping stone, the project Consulate should have a big impact in the Docker community. Some of the other projects mentioned like Duplex and Configurator are also just as important to me.

Jul 01 2014

Beyond Flynn, or Flynn-as-a-Worldview

About five months ago, I finally wrote about Flynn, a project started almost a year ago, but as I described was in the making for much longer. I wanted to give an update on where I am today and set the stage for some upcoming posts on what I've been working on and what I've been learning.

Lots has happened in five months, but first I want to talk about Flynn. There's been some confusion, so I should probably clarify that while I did help start and co-architect Flynn, it's not actually a project of mine. Jonathan and Daniel have been very considerate in allowing me to share ownership conceptually and architecturally, and they treated me as an equal partner as long as they could afford me, but technically I never was. This isn't a problem, it's just a little confusing to some.

Ultimately, it doesn't matter because Flynn is open source and they're going out of their way to protect the integrity of the project as open source. That's actually why I'm not in the Github organization. The project has rules about maintainers meeting an ongoing contribution requirement. Since I lost the means to work on Flynn in January, I no longer met that requirement, so I'm not listed as a maintainer.

Anyway, what's exciting to me are the ideas and ideals of Flynn architecturally. This is why I've continued to give talks about Flynn at conferences. While the initial goal of Flynn was to be an open source PaaS, the actual scope of Flynn was quite open-ended from there. This is because Flynn is about a new paradigm of infrastructure.

This new paradigm is what's been in the making for so long. It's the reason I collaborated with dotCloud on Docker, and it's the motivation behind most of my work recently, including Flynn. This paradigm is ultimately what I'm interested in figuring out and building. It's more than Docker, it's beyond Flynn. It's an ecosystem of projects that align with this worldview, that nobody owns, and maybe for the time being has no name. But at least people are now starting to see it coming.

For the time being, Flynn is the closest thing to embodying this paradigm. But others, like CoreOS and even Deis, have aspirations towards this ideal. While my work is likely to feed back into Flynn and it's possible we'll collaborate closely again in the future, I'm not specifically working on Flynn right now. It's more like R&D for Flynn and Deis and even Docker itself, forging ahead to bring us closer to this world I've been imagining for quite a while.

Alright, so what have I been working on? What's happened since February?

Not long after my last post, I decided to explore an opportunity working with DigitalOcean. The general premise being, "they want to build some kind of platform layer on top of their VMs, and I can help them build that platform" using Flynn or whatever makes most sense. It seemed like a good vehicle to continue the work I had started, and work with an amazing team and great brand to take it even further.

As a warm-up / get-acquainted project, I decided to take on re-designing the DigitalOcean API. I drafted an initial API design, pushed it through reviews, set up infrastructure for docs and public feedback, and generally got everything moving. Eventually it was handed off since I was not getting anything else done. That API entered public beta last week.

What we learned in that time, however, was that amid the growth chaos of DigitalOcean's success, they were not clear on what they wanted in terms of an application platform layer. And at the time there weren't enough resources to allocate to it when there was plenty to be done with their existing cloud infrastructure. We came to an arrangement to put my employment on hold as a leave of absence for several months, coupled with sponsorship of my open source work in this space.

I'm absolutely thankful for this since otherwise I would not be making any progress towards this vision. Instead, I'd probably become that bitter early innovator that would have to leave the industry to avoid the pain of seeing only a shadow of what could have been…

As of right now, the sponsorship is a little past half over. I've been working on as much as I can in this time, but I think now would be a good time to start sharing it all with more context. If you follow me on Twitter or Github, there's been a lot of activity, and some projects have better documentation than others. I'm going to start explaining them in blog posts and hopefully that will also help convey this paradigm of tooling I'm after.

For now, here are links to a few of the projects built and released in this time. Most of them are components for projects I haven't gotten to yet, but that I'm excited to announce soon.

Lastly, Flynn is still in active development towards a stable release since their preview release in April. I still contribute here and there and stay in touch with those guys. I also stay in touch with Deis, CoreOS, and more recently the Hashicorp guys. Really I talk to everybody and anybody doing anything vaguely related to what I'm after because it's important we're all moving in roughly the same direction, and I can only build on the shoulders of giants.
