Jan 05
2013

Executable Tweets and Programs in Short URLs

A few weeks ago I was completely consumed for the better part of a day that I would have otherwise spent on more practical work.

Yeah, what? Weird, right? It started from a Twitter conversation earlier that day with my friend Joel:

This wishful brainstorming inspired me to start building exactly that. But first, a digression.

This reminded me of an idea I got from Adam Smith back when I was working on Scriptlets. If you can execute code from a URL, you could “store” a program in a shortened URL. I decided to combine this with the curl-pipe-bash technique that’s been getting popular for bootstrapping installs. If you’re unfamiliar, take this Gist of a Bash script:

Given the “view raw” URL for that Gist, you can curl it and pipe it into Bash to execute it right there in your shell. It would look like this:

$ curl -s https://gist.github.com/raw/4464431/gistfile1.txt | bash
Hello world

Instead of having Gist store the program, how could we make it so the source would just live within the URL? Well in the case of curl-pipe-bash, we just need that source to be returned in the body of a URL. So I built a simple app to run on Heroku that takes the query string and outputs it in the body, a sort of echo service.

Letting you do this:

$ curl "http://queryecho.herokuapp.com?Hello+world"
Hello world

Which you could conceal and shorten with a URL shortener, like Bitly. I prefer the j.mp domain Bitly has. And since they’re just redirecting you to the long URL, you’d use the -L option in curl to make it follow redirects:

$ curl -L http://j.mp/RyUN03
Hello world
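For a rough idea of what such an echo service involves, here is a minimal sketch as a Python WSGI app. This is not the code running at queryecho.herokuapp.com; the names query_echo and serve, the port, and the trailing newline are all assumptions made for illustration.

```python
from urllib.parse import unquote_plus
from wsgiref.simple_server import make_server


def query_echo(environ, start_response):
    """WSGI app: send the URL-decoded query string back as the response body."""
    # QUERY_STRING is everything after the "?" in the URL, still URL-encoded.
    source = unquote_plus(environ.get("QUERY_STRING", ""))
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    # The trailing newline is an assumption, so the shell prompt starts on a fresh line.
    return [(source + "\n").encode("utf-8")]


def serve(port=8000):
    """Run the echo service locally (port 8000 is an arbitrary choice)."""
    make_server("", port, query_echo).serve_forever()
```

Calling serve() and then running curl "http://localhost:8000/?Hello+world" should behave like the example above, since unquote_plus turns the + back into a space.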

When you make a short URL from the bitly website, they conveniently make sure the query string is properly URL encoded. So if I just typed queryecho.herokuapp.com/?echo "Hello world" into bitly, it would give me a short URL with a properly URL encoded version of that URL that would return echo "Hello world". This URL we could then curl-pipe into Bash:

$ curl -Ls http://j.mp/VGgI3o | bash
Hello world

See what’s going on there? We wrote a simple Hello world program in Bash that effectively lives in that short URL. And we can run it with the curl-pipe-bash technique.
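If you would rather not lean on bitly for the URL encoding, the same step can be sketched with Python's standard library. The helper names program_url and program_from_url are hypothetical, and quote() is stricter than whatever bitly does (it also escapes characters like $ and |), so the long URL may look different while decoding to the same program.

```python
from urllib.parse import quote, unquote

# The echo service from this post; everything after "?" is the program source.
ECHO_SERVICE = "http://queryecho.herokuapp.com/?"


def program_url(source):
    """Build a long URL whose query string is the URL-encoded program source."""
    return ECHO_SERVICE + quote(source)


def program_from_url(url):
    """Recover the program source from a long URL built by program_url."""
    return unquote(url.split("?", 1)[1])


print(program_url('echo "Hello world"'))
# http://queryecho.herokuapp.com/?echo%20%22Hello%20world%22
```

The resulting long URL is what you would hand to a shortener.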

Later in our conversation, Joel suggested an example “app tweet” that, when executed in Bash with a URL argument, would tell you where that URL redirects. So if you gave it a short URL, it would tell you the long URL.

Just so you know what it would look like, if you put that program in a shell script and ran it against a short URL that redirected to www.google.com, this is what you would see:

$ ./unshortener.sh http://j.mp/www-google-com
http://j.mp/www-google-com
http://www.google.com/

It prints the URL you gave it and then resolves the URL and prints the long URL. Pretty simple.

So I decided to put this program in a short URL. Here we have j.mp/TaHyRh which will resolve to:

http://queryecho.herokuapp.com/?echo%20%22$url%22;%20curl%20-ILs%20%22$url%22%20|%20grep%20Location%20|%20grep%20-o%20'http.*'

Luckily I didn’t have to do all that URL encoding. I just pasted his code in after queryecho.herokuapp.com/? and bitly took care of it. What’s funny is that this example program is made to run on short URLs, so when I told him about it, my example ran on the short URL that contained the program itself:

$ curl -Ls http://j.mp/TaHyRh | url=http://j.mp/TaHyRh bash
http://j.mp/TaHyRh
http://queryecho.herokuapp.com/?echo "$url"; curl -ILs "$url" | grep Location | grep -o 'http.*'

You may have noticed my version of the program uses $url instead of $1. Because the script arrives on Bash’s standard input, the simplest way to give it input is an environment variable (positional arguments would require bash -s). For reference, to run my URL script against the google.com short URL we made before, it would look like this:

$ curl -Ls http://j.mp/TaHyRh | url=http://j.mp/www-google-com bash
http://j.mp/www-google-com
http://www.google.com/
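To make the grep half of that pipeline concrete, here is a rough Python equivalent of grep Location | grep -o 'http.*' applied to a dump of response headers like the one curl -ILs prints. The function name extract_locations and the sample headers are made up for illustration; fetching the headers is left to curl.

```python
import re


def extract_locations(header_dump):
    """Mimic `grep Location | grep -o 'http.*'` on a dump of response headers."""
    urls = []
    for line in header_dump.splitlines():
        # Like grep, this is a case-sensitive substring match on each line.
        if "Location" in line:
            match = re.search(r"http.*", line)
            if match:
                urls.append(match.group(0).strip())
    return urls


# Hypothetical header dump for a single redirect.
sample = (
    "HTTP/1.1 301 Moved Permanently\r\n"
    "Location: http://www.google.com/\r\n"
    "\r\n"
    "HTTP/1.1 200 OK\r\n"
)
print(extract_locations(sample))
```

With a chain of redirects, each hop contributes its own Location line, so the list would contain every intermediate URL in order.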

Okay, so we can now put Bash scripts in short URLs. What happened to installing apps in Tweets? Building an apptweet program like Joel imagined would actually be pretty straightforward. But I wanted to build it in and install it with these weird programs-in-short-URLs.

The first obstacle was figuring out how to get it to modify your current environment. Normally curl-pipe-bash URLs install a downloaded program into your PATH. But I didn’t want to install a bunch of files on your computer. Instead I just wanted to install a temporary Bash function that would disappear when you leave your shell session. In order to do this, I had to do a variant of the curl-pipe-bash technique using eval:

$ eval $(curl -Ls http://j.mp/setup-fetchtweet)
$ fetchtweet 279072855206031360
@jf you asked for it... Jeff Lindsay (@progrium) December 13, 2012

As you can see by inspecting that URL, it just defines a Bash function that runs a Python script from a Gist. I cheated and used Gist for some reason. That Python script uses the Twitter embed endpoint (same one used for the embedded Tweets in this post) to get the contents of a Tweet without authentication.

The next thing I built installed and used fetchtweet to get a Tweet, parse it, and put it in a Bash function named by the string after an #exectweet hashtag (the # also happens to start a comment in Bash). So here we have a Tweet with a program in it:

To install it, we’d run this:

$ id=279087620145958912 eval $(curl -Ls http://j.mp/install-tweet)
Installed helloworld from Tweet 279087620145958912
$ helloworld
Hello world

We just installed a program from a Tweet and ran it! Then I wrapped this up into a command you could install. To install the installer. This time it would let you give it the URL to a Tweet:

$ eval $(curl -Ls http://j.mp/install-exectweet) 
Installed exectweet
$ exectweet https://twitter.com/progrium/status/279087620145958912
Installed helloworld from Tweet 279087620145958912
$ helloworld
Hello world
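The parsing step behind this, splitting a Tweet into a program body and the name after #exectweet, can be sketched in Python. This is not the code behind j.mp/install-tweet; the function names and the example Tweet text are assumptions made for illustration.

```python
import re


def parse_exectweet(text):
    """Split Tweet text into (function name, program body) around #exectweet."""
    match = re.search(
        r"^(?P<body>.*?)\s*#exectweet\s+(?P<name>[\w-]+)", text, re.DOTALL
    )
    if match is None:
        raise ValueError("no #exectweet hashtag found")
    return match.group("name"), match.group("body")


def as_bash_function(name, body):
    """Wrap the program body in a one-line Bash function definition."""
    # Drop a trailing ";" so the definition does not end in an empty command.
    return "%s() { %s; }" % (name, body.rstrip().rstrip(";"))


# Hypothetical Tweet text in the format described above.
name, body = parse_exectweet("echo Hello world #exectweet helloworld")
print(as_bash_function(name, body))
```

The printed definition is the sort of string that eval would then load into the current shell session.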

Where would I go from there? An app that calls itself into a loop, of course!

$ exectweet https://twitter.com/progrium/status/279123541054595074 && recursive-app
Installed recursive-app from Tweet 279123541054595074
Installed recursive-app from Tweet 279123541054595074
Installed recursive-app from Tweet 279123541054595074
Installed recursive-app from Tweet 279123541054595074
...

Obviously, this whole project was just a ridiculous, mind-bending exploration. I shared most of these examples on Twitter as I was making them. Here was my favorite response.

You may have noticed, it just happened to be 12/12/2012 that day.

Jan 01
2013

Where did Localtunnel come from?

Five years ago, async network programming scared me. I was a web developer. Working with the high level tools and frameworks of HTTP seemed much easier than any sort of serious low level networking. Especially since network programming would often also mean some kind of concurrent programming with threads or callbacks. I had mostly avoided multithreading and had no idea what an event loop was. I came from PHP.

Around 2007, I was starting to think about webhooks. One motivator was how you could use webhooks to let web developers, like me, build systems that used other protocols without them having to work with that protocol. For example, one of my first projects with webhooks was called Mailhooks. I wanted to accept email in my application, but I didn’t want to deal with email servers. I wanted to get an HTTP POST when an email came in with all the email fields nicely provided as POST parameters.

This is how I started working with Twisted. Twisted became my main tool to build webhook adapters for existing protocols. I even tried to generalize that idea in a project called Protocol Droid. Slowly I started to grok, and not fear, this kind of programming.

It’s funny how my desire to work with abstractions that didn’t exist yet to avoid a certain kind of programming was directly responsible for me eventually becoming an expert in that kind of programming.

Then in late 2009, I had another idea while thinking about webhooks. It would be great if I could expose a local web server to the Internet with a friendly URL. It should just be a simple command. There would have to be a server, but there could just be a public server that you didn’t even have to think about.

I committed the first prototype of Localtunnel to Github in January 2010. It was written entirely in Twisted. It also didn’t actually work. I really recommend taking a look because it was terrible. One of the challenges was multiplexing the HTTP requests into a single tunnel connection. My approach was so naive it just didn’t work. As soon as you made more than one request at a time, it broke.

A few months later, I decided to take a different approach. Instead of doing my own protocol, client, and server, I’d just make a wrapper around what I knew already worked: SSH tunneling. This was pretty quick to make happen, and that version is basically what’s been in production to this day.

This shortcut came with a lot of weird quirks. For example, the easiest way I found to implement an SSH tunnel client was a Ruby library, so I implemented the client in Ruby. The server, though, was in Python because I still only really knew Twisted for evented programming.

Actually, using SSH was the source of most of the quirks and annoyances. I was pretty bothered that it slowed down the initial user experience by requiring a public key to be uploaded. But most of the pain was operational. The server, sshd, would create a process for every tunnel. Localtunnel also needed its own user and to pretty much own the SSH configuration for that machine. Then, on occasion, something weird would happen where a tunnel would die and the process would go crazy eating up CPU. It would have to be manually killed or it would eventually bring the server to a halt. And, eventually, the authorized_keys file would become enormous from all the keys uploaded.

On top of all this, SSH is pretty opaque. It’s been around for so long and used so much that it certainly just works … you just don’t really know how. I still don’t know how SSH does tunneling or what the protocol looks like, even after trying to read the RFC for it.

By mid-2011, I was working at Twilio building distributed, real-time messaging systems at scale. I had certainly come a long way from fearing async network programming. Localtunnel was still running the implementation based on SSH. By then it had quite a large user base and had collected a number of bugs and feature requests. I also had my own operations and user experience wish list. With such a huge list of new requirements, so many problems with the current implementation, and a drastically different experience level and mindset, I decided to redesign Localtunnel from the ground up.

Since I was pretty consumed by Twilio, I didn’t have a lot of time to work on Localtunnel. I thought the biggest bang for buck in the long term would be to slowly work on the new version. They say software is never done, but I personally believe software can be finished. It just requires an aggressive drive for simplicity, and the only way you can make significant advances in simplicity is through redesign.

In the meantime, users continued to experience issues with the current implementation. These problems only got worse as it became more popular. For example, the biggest issue was that the namespace for tunnel names was too small. Users would get requests from old tunnels, and in rare cases tunnel names would get pulled out from under you while using them. This created confusion and a lot of emails and issue tickets, but it still worked with the occasional restart.

I’ve used this constant stream of complaints, which has been going on for almost two years, to make sure I keep making progress on the new version. In fact, I’m pretty sure I needed it because of my lifestyle of abundant projects.

Last week I finally released a beta of the new version. What’s interesting is that it’s a completely different architecture from what I started out with for the redesign. After the original unreleased prototype, there have been three major approaches to implementation. In the coming weeks I’m going to share a more technical history of the architecture of Localtunnel, leading up to a deep exploration of what I hope will be its final form.

Dec 25
2012

Localtunnel v2 available in beta

A few years back, I released Localtunnel to make it super easy to expose a local web server to the Internet for demos and debugging. Since then, it’s gotten a ton of use. A few people even copied it and tried to make a paid service around the idea. Luckily, Localtunnel will always be free and open source.

With the release of Localtunnel v2, it will not only remain competitive with similar services, but continue to be the innovator of the group. I’ll post more on this later.

For now, let’s talk logistics. The current, soon-to-be-legacy Localtunnel stack includes the client that you install with Rubygems, and a server that runs on a host at Rackspace. These will continue to be available into 2013, but will be marked as deprecated. This means you should be making the switch to v2.

Besides the fact that v1 will eventually be shut down, there are a number of reasons to switch to v2. Here are some of the major ones:

  • It’s actively maintained. Bug reports, pull requests, and service interruptions are dealt with promptly.
  • No more mysterious requests from old tunnels. The subdomain namespace is much larger.
  • Custom subdomains. The new client lets you pick a tunnel name on a first come, first served basis.
  • Supports long-polling, HTTP streaming, and WebSocket upgrades. Soon general TCP tunneling.
  • No SSH key to start using it. A minor annoyance setting up v1, but it doesn’t exist in v2.

One implementation detail that affects users is that the client is now written in Python. This means you won’t use Rubygems to install it. Instead, you can use easy_install or pip.

$ easy_install localtunnel

On some systems, you may need to run this with sudo. If you don’t have easy_install, first make sure you have Python installed:

$ python --version

Localtunnel requires Python 2.6 or later, which comes standard on most systems. If you don’t have Python, you can install it for your platform. If easy_install isn’t available after you install Python, you can install it with this bootstrap script:

$ curl http://peak.telecommunity.com/dist/ez_setup.py | python

Once you’ve installed Localtunnel with easy_install, it will be available as localtunnel-beta. This lets you keep the old client to use in case anything goes wrong with v2 during the beta. Eventually, it will be installed as localtunnel, but only after v1 is shut down.

Using localtunnel-beta is pretty much the same as before:

$ localtunnel-beta 8000
  Thanks for trying localtunnel v2 beta!

  Port 8000 is now accessible from http://fb0322605126.v2.localtunnel.com ...

Like I mentioned earlier, you can use a custom tunnel name if it’s not being used:

$ localtunnel-beta -n foobar 8000
  Thanks for trying localtunnel v2 beta!

  Port 8000 is now accessible from http://foobar.v2.localtunnel.com ...

Keep in mind v2 is in active development. There might be some downtime while I work out operational bugs, but you can always use the old version if you run into problems.

If you do run into any problems, you can ping me on Twitter. If you get a traceback, you can create an issue on GitHub. If you have more in-depth questions or want to get involved in development, check out the Localtunnel Google Group.

Dec 17
2012

HTTP Signatures with Content-HMAC

Today I wanted to propose another header. It would be used for signing HTTP content with HMAC, and is appropriately called Content-HMAC. In a previous post about the Callback header, I mentioned using an X-Signature header in callback requests to sign the payload of the callback. It looked like this:

X-Signature: sha1=<hexdigest of sha1 hmac>

The HMAC would be built with just the content of the request (i.e., no headers, no query params) and a secret key. This was borrowed directly from the PubSubHubbub spec, but the general idea of using HMAC to sign callback requests has become pretty standard in the world of webhooks. Here are details on how Google and Twilio use them.

Each of these providers uses its own header for basically the same use case, so there seems to be an opportunity to standardize on a common header format. There have been a number of proposals for a general Signature header to sign an entire request, including a fairly comprehensive one called Content-Signature. With signing, the difficulty is often getting the input string correct. Most signing mechanisms need to normalize their input. If you’ve ever had to deal with OAuth or AWS signatures, you’ll know what I’m talking about. With request signing, headers are particularly tricky because they often change as the request goes through proxies.

The idea of Content-HMAC is to focus on a simpler goal of signing just the content payload, since it’s normally treated as-is, and is not altered when going through proxies. The X-Signature proposal I had was a decent one, as is almost any cowpath-based proposal, but I realized it would probably be a good idea to limit the implied scope to what it’s really doing: providing an HMAC for request (or response) content.

It turns out there’s a similar header that’s not used that often anymore called Content-MD5. It was a simple mechanism to provide an MD5 digest of the content. My current proposal is to take this existing pattern and apply it to HMAC, giving us the Content-HMAC header:

Content-HMAC: <hash mechanism> <base64 encoded binary HMAC>

Here’s an example:

Content-HMAC: sha1 f1wOnLLwcTexwCSRCNXEAKPDm+U=

This proposal borrows its naming convention from Content-MD5, but the format is more similar to Authorization. The Authorization header allows multiple authorization schemes to be used. You define the scheme followed by a space and then the actual authorization data. Since HMAC allows different hashing techniques to be used, we use that pattern here to let you specify the hashing technique. We also take the existing pattern of base64 encoding used in several HTTP headers to make it conform even more to existing standards.
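To make the format concrete, here is a minimal sketch of producing and checking a Content-HMAC value with Python's standard library. The function names, the SUPPORTED table, and the choice to allow sha256 alongside sha1 are my own assumptions for illustration; the proposal itself only fixes the "mechanism, space, base64 HMAC" layout.

```python
import base64
import hashlib
import hmac

# Hash mechanisms this sketch accepts (an assumption, not part of the proposal).
SUPPORTED = {"sha1": hashlib.sha1, "sha256": hashlib.sha256}


def content_hmac(secret, body, mechanism="sha1"):
    """Return a header value like 'sha1 <base64 of the binary HMAC>'."""
    digest = hmac.new(secret, body, SUPPORTED[mechanism]).digest()
    return mechanism + " " + base64.b64encode(digest).decode("ascii")


def verify_content_hmac(secret, body, header_value):
    """Recompute the HMAC over the body and compare it to the received header."""
    mechanism = header_value.partition(" ")[0]
    if mechanism not in SUPPORTED:
        return False
    expected = content_hmac(secret, body, mechanism)
    # compare_digest avoids leaking where the two values first differ.
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))


# Hypothetical shared secret and payload.
header = content_hmac(b"shared-secret", b'{"event": "ping"}')
print("Content-HMAC: " + header)
```

Only the raw content bytes and the secret go into the HMAC, so there is no header normalization step to get wrong.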

Content-HMAC was created for callback requests, but it’s a useful way to sign any HTTP request or response payload. For requests, it’s worth mentioning it only applies when there is a content payload, so for example it’s meaningless with GET requests.

It’s also worth mentioning that content signing is unnecessary when using HTTPS. It currently looks like the future will eventually be 100% SSL-encrypted HTTP, but until then, there will always be situations where HTTPS is not available. Content-HMAC is perhaps a stop-gap until we reach that ideal, and in the meantime I think it is a good, standard way to authenticate callback requests.

Let me know if you have any questions or feedback on this proposal. Further discussion is likely to happen on the Webhooks Google Group.

Dec 15
2012

Avoiding environmental fallacy with systems thinking

In 1905, German chemist Alfred Einhorn invented Novocaine to be used by doctors in surgery as a general anesthetic. Unfortunately, doctors didn’t find Novocaine to be a suitable general anesthetic. However, dentists were dying to use it as a local anesthetic. The inventor didn’t want to sell it for the “mundane purpose” of drilling teeth, so he continued marketing to doctors and surgeons. Einhorn persisted until his death, unwilling to let the market dictate the use of his invention. He felt the intrinsic value of Novocaine as a general anesthetic was enough to sell it as such, no matter what extrinsic value was placed on it by actual market demands. Charles West Churchman would call this an “environmental fallacy.”

Environmental fallacy is the blunder of ignoring or not understanding the effects of the environment of a system. Examples of this fallacy are all around us. Anti-drug legislation fails to see long-term, societal implications because it is preoccupied with the immediate, localized problems. Efforts to improve a standardized public education are precisely and meticulously solving the wrong problem. Silicon Valley startups spend our brightest intellectual resources on photo sharing and social-whatever, while industries that affect the quality of life for millions are left with bureaucrats.

One could describe all of these as failing to see the bigger picture. In systems thinking, we call this the environment of a system, and its significance is governed by the principle of openness.

Openness is the principle that open systems, which includes everything from problems to corporations to opinions to products, can only be understood in the context of their environment. This is because open systems are dependent on and co-determined by their context. A closed system, like a watch or a hammer, can function entirely based on its own internal structure and process. An open system interacts with and is inextricably linked with its environment.

This insight may seem banal. In fact, the younger generations and the progressive recent generations are quite familiar with this concept at least as a vague intuition. But this is a very recent development. We don’t appreciate how little this idea was understood for basically all of human existence up until just a few decades ago.

Science, for example. Science is our greatest effort to understand our objective reality. Like any other open system, it was defined and limited by the context of its time. As modern science began to develop 350 years ago, it was based on a worldview that denied the principle of openness. Most subjects were studied as closed systems.

For the greater part of its life, science has only understood the environment as something to be minimized. This is best shown in laboratories, a symbol of scientific activity, which are specifically designed to exclude the environment. Based on the doctrines of determinism and reductionism, science up until the last four or five decades has ignored the environment in favor of reductionist explanations focused on internal determinism. At best this only partially describes most actual phenomena. For example, Galileo’s equations for freely falling bodies completely ignore air resistance and the rotation of Earth, and Ohm’s law assumes there will be no dramatic change in surrounding temperature. In both cases, the assumption is no environment.

This understated handicap of traditional science ended up as the major dilemma in the 1992 film Medicine Man. Sean Connery’s character finds a miracle cure for cancer in a flower, but in the lab he’s unable to reproduce it. He eventually finds out the flower itself was not the cure, but that the cure was produced by the flower interacting with another element in its environment. The unintentional moral of the story is about the significance of environment and environmental fallacy.

Not until ecology took off in the mid-20th century did we have a science that explicitly observed the environment, though primarily as a subset of biology. In many ways, ecology was a precursor to systems sciences. The difference between an ecological environment and the environment of a system is that a system environment is more general. It can be used to talk about physical environments, but also abstract environments, such as decision-making and problem-solving environments.

Often the environment refers to all external variables and conditions of a system, but in some cases it might refer to a particular part of the total environment. This is because the environment represents any surrounding system. Any one open system is embedded in a greater system, embedded in an even greater system, and so on. For example, one slice of how nested environments can affect an individual at work might look like this:

If all of these layers influence each other, you start to realize, maybe somewhat helplessly, that everything depends on everything else. No wonder science originally dismissed the environment. But ignoring the complexities and dynamics of open systems leads to sometimes serious disparities from reality.

In 1850, which for historical context was when California became a state and the US got its 13th president, the leading scientists of the western world convened for a conference in Europe. They actually concluded that in just 50 years, through science, they would have a complete understanding of the universe. This absurd notion stemmed from the foundations of scientific thought, which have been tremendously useful, but also severely limiting. Only after Heisenberg’s uncertainty principle in the late 1920s did we begin to accept that reality is just too complicated to fully understand at once.

Ironically, admitting this has been beneficial to our grasp of reality. It’s helped us realize new frameworks for thinking and coping with our increasingly complex and interdependent world. Luckily, our world is so globalized and connected today that modern generations are growing up with this reality as a daily experience. Systems theory and systems thinking are tools that can keep the appreciation of openness and the defining power of context as a first class tenet in all our endeavors.
