Serverless Architectures and the Internet of Garbage - AWS Lambda, Apache OpenWhisk and Google Cloud Services

Content Copyright © 2017 Bloor. All Rights Reserved.
Also posted on: The Norfolk Punt

Can I introduce everyone to the latest IT innovation: the Internet of Garbage, a rather neat idea from GreenQ, which is using IBM’s OpenWhisk (an Apache Open Source Software (OSS) incubator project), on Bluemix, to bring efficient technology into waste management. I was talking to Shlomy Ashkenazy, its CEO, at an IBM Europe Cloud Analyst forum a week or so back, and was impressed by the way that the latest computing ideas (such as executing functions in response to events) were reducing the gap between business and code development.

GreenQ seems to me to be addressing the issue that traditional ways of automating the handling of garbage collection – with an asset database and automated route planning for the garbage pickup trucks – simply aren’t very efficient or effective (and garbage on the streets also has political – career-limiting – implications):

  • In order to avoid garbage spilling onto the street from the busiest bins, you need to collect garbage more often than is necessary for most bins;
  • You need garbage collection systems that can cope with the very busiest days (after New Year’s Eve, perhaps) and which may be fairly idle for most of the year;
  • Garbage collection is reactive, responding to someone noticing an overfilled bin, rather than proactive, responding to a bin that is filling up more quickly than expected but is not yet full;
  • Just as garbage cans fill up, so do garbage trucks. You don’t want to send a truck that is full to pick up the garbage from the next site – that can waste a lot of fuel and add unnecessary pollution.

GreenQ’s solution is to install sensors and intelligence (this is pretty cheap these days) in each garbage truck, so as to, in effect, implement “just in time” garbage collection, with “just enough” collection trucks to address the actual garbage collection requirements (and, presumably, you can proactively hire another truck if you are approaching capacity). And, if garbage peaks in different parts of the city don’t all coincide, you can maintain service levels with fewer trucks overall. It’s a bit like virtualised storage, but with garbage cans in place of hard drives.

The trucks contain a GPS, so you know where they are; a camera, so you can monitor their surroundings; dynamic weighing sensors; and cellular communications, so they can be managed remotely. The garbage container is an “IoT device” with a microcontroller, battery and level/filling sensor. Garbage trucks and bins are pretty physical, but the process is all software-defined.
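To make that concrete, here is a rough sketch of the sort of telemetry event such a truck-and-bin combination might emit at each pick-up. The field names are my own illustration, not GreenQ’s actual schema.

```python
# Hypothetical pick-up telemetry event - field names are illustrative only,
# not GreenQ's actual schema.
pickup_event = {
    "truck_id": "truck-042",
    "bin_id": "bin-0017",
    "timestamp": "2017-06-12T07:31:05Z",          # from the truck's clock/GPS
    "location": {"lat": 52.6297, "lon": 1.2923},  # GPS fix at the pick-up point
    "bin_weight_kg": 38.5,                        # dynamic weighing sensor on the truck
    "bin_fill_percent": 92,                       # level sensor in the bin itself
    "truck_load_percent": 74                      # how full the truck now is
}
```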

This application is essentially event driven, by garbage bin pick-up; and “peaky” because there is more garbage after a festival than on a Monday after a quiet Sunday. So, you don’t really want an always-on server, spending much of its time idle in anticipation of the peak load. You want a platform that provides the functionality you need as a service (Function as a Service, FaaS) as and when it is needed to service a garbage event, and which doesn’t charge when you are not using it.
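To make the FaaS model concrete: in OpenWhisk, a Python “action” is just a function called main() that receives a dictionary of event parameters and returns one, and you only pay for the time it actually runs. The sketch below is my own illustration of the style of thing involved – the event fields and the 85% threshold are assumptions, not GreenQ’s code.

```python
# Sketch of an OpenWhisk Python action. OpenWhisk invokes main() with the
# event's parameters when a trigger (e.g. a bin reporting its fill level) fires.
# The field names and the 85% threshold are illustrative assumptions.
def main(params):
    bin_id = params.get("bin_id", "unknown")
    fill = params.get("bin_fill_percent", 0)

    if fill >= 85:
        # In a real application this might call a routing service or queue a
        # message so that a nearby truck schedules an early pick-up.
        return {"bin_id": bin_id, "decision": "schedule_pickup", "fill": fill}

    return {"bin_id": bin_id, "decision": "no_action", "fill": fill}
```

Such an action would typically be created with the wsk CLI and wired up to a trigger and rule, so it runs – and is charged for – only when a bin event actually arrives.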

GreenQ uses FaaS for its flexibility of scale and timing; for its performance in handling unexpected events; and for cost avoidance, because there are no idling VMs and no server maintenance. It uses OpenWhisk because its OSS Apache 2.0 license lets you see under the hood and contribute; because the open source option facilitates on-premises implementations; and because it gives easy access to Watson cognitive services – Shlomy’s vision is very much around intelligent garbage services.

More generally, I’m really talking about “Serverless Architectures” here, which are rapidly becoming popular in Cloud computing (and, some would say, a bit of a religious movement). Of course, as usual in IT, the name is misleading – these architectures aren’t serverless, it’s just that developers can leave the servers to be looked after by Cloud vendor services (such as Amazon AWS, Google Cloud Services or IBM Bluemix). There is no need for an “always on” server behind the application, which can markedly reduce costs (for peaky or intermittent workloads) – think in terms of Function as a Service (FaaS), where the function is invoked on demand by a business event. If you want a reasonably in-depth look at the technology, read Mike Roberts’ article.

The most popular implementation (and, by a small margin, the oldest) of the concept today is probably AWS Lambda. I recently attended a BCS CMSG event, a “game of buzzword bingo”, where Denis Craig and Tamas Santa of Infinityworks gave us a fascinating workshop on developing applications composed simply of event-triggered functions, which are then automatically deployed in a DevOps process – and on the developer enthusiasm this can generate.

Denis and Tamas described a developer-centric process, but one which made interactive communication with the business remarkably easy, and which (using prototypes) helped elicit requirements that the business stakeholders didn’t fully understand themselves. This is partly because of the inherent simplicity of event-driven processing, and partly because the first million or so Lambda invocations each month are free. If you want to play with serverless and AWS Lambda, check out the Serverless Application Template (which has lots of information, but is still a work in progress).
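For comparison, the programming model on Lambda is just as minimal: you supply a handler function and AWS calls it with the event. The sketch below is my own illustration (the event shape is invented), not code from the Infinityworks workshop.

```python
# Minimal AWS Lambda handler in Python. Lambda calls lambda_handler(event, context)
# on each invocation; print() output ends up in CloudWatch Logs.
# The event fields below are purely illustrative.
import json

def lambda_handler(event, context):
    bin_id = event.get("bin_id", "unknown")
    fill = event.get("bin_fill_percent", 0)

    print(json.dumps({"bin_id": bin_id, "fill": fill}))  # simple structured log line

    return {
        "statusCode": 200,
        "body": json.dumps({"bin_id": bin_id, "needs_pickup": fill >= 85})
    }
```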

To return to OpenWhisk (I’m really not sure about that name, but it comes from the sort of people who gave us the MQSeries name) and the Internet of Garbage (IoG): GreenQ’s OpenWhisk IoG application is described more fully here.

There are, in my view, two things in favour of OpenWhisk: it is an OSS project, so (although IBM only uses it on Bluemix) it is, potentially, not tied to one Cloud platform (I think that 1980s-style Cloud lock-in is an emerging and serious issue); and IBM is developing enterprise-friendly resilience and manageability for its implementation of the Kubernetes-based platform behind OpenWhisk. It is early days yet, but IBM is well aware of the potential configuration management issues; has provided a lot of monitoring and logging capability; and is on a journey to better than four nines availability.

According to IBM, you should beware of developing a religious belief in Serverless – it doesn’t suit all workloads. Typically, it is best for short-running, peaky workloads – for instance, cheque processing in certain countries, which is mostly (but not entirely) done on Friday/Saturday, where OpenWhisk can offer 90% savings against having an always-on server that is mostly idle for the other five days. And IBM provides management capabilities, while the developer-led, event-driven model of running code on demand is easy to develop for.

Fundamentally, however, a key advantage of OpenWhisk is its community-based OSS model, not tied to any particular vendor (Red Hat has now joined the OpenWhisk community, and Adobe is also a co-founder). So, for example, languages not supported by IBM (such as COBOL) are being supported by the community; “packages” expose OpenWhisk to other, IBM and non-IBM, services; and third parties are deploying OpenWhisk on other Clouds and even in-house (it is being used to modernise batch processing).

IBM offers a simple pricing model for its platform, based on the amount of memory used and the time it is used for, with an initial free tier – it is still working on providing use-case examples illustrating the conversion of “gb-secs” (GB-seconds) to real-world implementations, but there are pricing examples on its Medium blog. Allegedly, pricing models on other Serverless platforms, which start free, can get more “interesting” (complicated) as you scale into the Enterprise (see what you think of Cloud Services pricing – bottom of page). There are no OpenWhisk charges for API usage, which frees up your design choices (there’s no temptation to use untidy designs which “game” APIs). Currently, IBM says, its sales discussions are mostly either with start-ups or with senior architects (with digital transformation projects) in very large financial institutions.
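To give a feel for the GB-seconds arithmetic, here is a back-of-envelope estimate. The per-GB-second rate and the free allowance below are illustrative assumptions for the sake of the sum, not published IBM (or AWS) prices – check the provider’s current price list rather than relying on these figures.

```python
# Back-of-envelope FaaS cost estimate. The rate and free tier are illustrative
# assumptions, not a published price.
RATE_PER_GB_SECOND = 0.000017   # assumed USD per GB-second
FREE_GB_SECONDS = 400_000       # assumed monthly free allowance

memory_gb = 0.256               # memory allocated to the function
avg_duration_s = 0.5            # average execution time per invocation
invocations_per_month = 10_000_000

gb_seconds = memory_gb * avg_duration_s * invocations_per_month   # 1,280,000
billable = max(0.0, gb_seconds - FREE_GB_SECONDS)                 # 880,000
print(f"{gb_seconds:,.0f} GB-seconds -> roughly ${billable * RATE_PER_GB_SECOND:,.2f} per month")
```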

I think that Serverless (and I like IBM’s OpenWhisk) is the way that many Mutable Businesses will want to go for suitable applications (and there may be more of these than you think) – it is ideally suited to their needs. “Serverless” is also very fashionable; it sounds to me rather like what everyone always thought cloud computing would be like; and any problems with Lambda’s (and other vendors’) serverless architecture models will be fixed. So, Serverless has to be considered as part of the Infrastructure Model for the Mutable Enterprise even if, strictly speaking, it is abstracted from the physical infrastructure.

People deploying OpenWhisk have to make similar decisions to people using more directly infrastructure-based models. They might well start off with Serverless and then move their code onto a classic VM server or even bare metal, if it makes economic sense (and Tamas points out that this may sometimes be more trouble than you’d think, although Serverless may well be good for “proofs of concept”) – and then move back to Serverless if demand drops off.

Perhaps Configuration Management is the elephant in the room – there are problems with the default configurations of some of the tools associated with Serverless which may lose all your data when you tear down your environment (and although losing temporary data is probably a good thing, you do need to design a Serverless app so that any data you want to keep is persisted in a database or similar). At some point, a serverless function will be used in an application where it matters exactly what code was used for a particular business transaction, how it was configured, and what it impacted (and it may also be necessary to ensure that only the latest version of the code is used in production). Gerry Reilly (VP Event Services, IBM Watson and Cloud platforms) didn’t go into configuration management in detail at the IBM Cloud Analyst event, but he convinced me that the OpenWhisk platform is taking it seriously. I’m hoping to get IBM to give the BCS CMSG a talk on “the future of config management on serverless platforms” and will report back.
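One simple discipline that helps with both of those concerns – persisting the data the function must not lose, and knowing exactly which code handled a given business transaction – is to stamp every record the function writes with a code version identifier. The sketch below is my own illustration, not an IBM-recommended pattern; CODE_VERSION and save_record() are hypothetical stand-ins for a CI-injected build identifier and a real datastore call.

```python
# Sketch: a serverless function that persists its result together with the code
# version that produced it, so each transaction can be traced back to a build.
# CODE_VERSION and save_record() are hypothetical stand-ins.
import datetime

CODE_VERSION = "pickup-handler-1.4.2"   # assumed to be injected by the CI pipeline

def save_record(record):
    # Placeholder for a real persistence call (Cloudant, DynamoDB, etc.);
    # the function's own container is ephemeral, so nothing kept here survives.
    print("persisting:", record)

def main(params):
    result = {"bin_id": params.get("bin_id"), "decision": "schedule_pickup"}
    save_record({
        **result,
        "code_version": CODE_VERSION,
        "processed_at": datetime.datetime.utcnow().isoformat() + "Z",
    })
    return result
```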

So, whether Serverless Architecture is part of the Mutable Infrastructure is, to some extent, a cosmetic issue, but I think it will be increasingly important and that DevOps-oriented developers will simply see FaaS as an alternative to PaaS – both run on virtualised infrastructures, so one should just consider Serverless platforms as another part of the Cloud infrastructure.