I recently suggested that David Luckham add latency as a measure of “power” in his strawman for features that CEP users are interested in. This generated some discussion, including the question of whether I was asserting that all CEP applications require low latency, so I thought I’d share some observations.
When people look at the performance, in terms of speed and capacity, of a CEP platform, they are generally concerned with one or both of two dimensions: throughput and latency. Throughput is typically measured in messages per second, i.e. how many incoming messages (events) per second can be processed without getting overloaded. Latency is the elapsed time between an event and a response. You could also call it the response time or reaction time.
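To make the two measures concrete, here is a minimal Python sketch of how they could be computed from a log of timestamps. The `EventRecord` type and its field names are hypothetical, not taken from any CEP product, and the sketch assumes you can record, in seconds, when each event arrived and when its response was emitted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EventRecord:
    """Hypothetical log entry: arrival and response timestamps, in seconds."""
    arrived_at: float
    responded_at: float


def throughput(records: List[EventRecord]) -> float:
    """Events processed per second over the whole observed window."""
    start = min(r.arrived_at for r in records)
    end = max(r.responded_at for r in records)
    if end <= start:
        raise ValueError("observation window must have positive length")
    return len(records) / (end - start)


def latencies_ms(records: List[EventRecord]) -> List[float]:
    """Per-event latency (response time minus arrival time), in milliseconds."""
    return [(r.responded_at - r.arrived_at) * 1000.0 for r in records]


if __name__ == "__main__":
    sample = [
        EventRecord(0.0, 0.5),
        EventRecord(1.0, 1.25),
        EventRecord(2.0, 4.0),
    ]
    print("throughput (events/s):", throughput(sample))
    print("worst latency (ms):", max(latencies_ms(sample)))
```

Note that the two numbers are independent: a platform can sustain a high message rate while individual responses are slow, or respond quickly to a trickle of events.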
Different applications have very different needs. Most care about throughput at some level, because they need to ensure that their application can keep up with the expected message rate. This can vary widely. One application might expect a few events a day, another might expect a few hundred thousand a second (the most extreme I’ve run across was a firm that was designing a system for a capacity of 15 million messages per second! Yikes). So you care about throughput insofar as you need to have enough for your intended application. Beyond that, excess capacity is irrelevant other than to give you headroom for growth.
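The "enough, plus headroom" test can be written down as a one-line ratio. This is only an illustrative sketch: the function name and the example rates are made up, and it assumes you already have a measured capacity for the platform and an estimate of your peak message rate, both in messages per second.

```python
def headroom_factor(measured_capacity: float, expected_peak_rate: float) -> float:
    """Ratio of what the platform can sustain to what the application needs.

    A value below 1.0 means the platform cannot keep up; a value above 1.0
    is spare capacity, which only matters as room for growth.
    """
    if expected_peak_rate <= 0:
        raise ValueError("expected_peak_rate must be positive")
    return measured_capacity / expected_peak_rate


if __name__ == "__main__":
    # Hypothetical numbers: platform sustains 400,000 msg/s, application peaks at 100,000 msg/s.
    factor = headroom_factor(400_000, 100_000)
    print("keeps up:", factor >= 1.0, "| headroom factor:", factor)
```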
The same is true for latency, in that most applications will care about latency at some level, but expectations may vary widely. We see trading applications that are latency-sensitive to the level where every millisecond counts. We also see many BAM-type applications where the results are being displayed on a dashboard for consumption by a human user. In this case, the requirements might be for latency in the range of 150 ms to a few seconds. Demanding latency of less than 150 ms in this context is pointless, since that’s below the range of human perception. In still other cases, a reaction time within the hour might be perfectly acceptable to the application - and yet it still might be an application that can benefit from CEP. At some level, however, you probably don’t need CEP. If you truly don’t care about latency - if it’s ok that your response comes hours, days or weeks after the fact - then you can probably accomplish what you want through conventional data analysis tools that run periodically against historical event data. Sure, you could still use CEP, but if you aren’t trying to analyze events in real-time, as they occur, there are other tools available. And the only reason to go to the trouble of analyzing them as soon as they occur is that you care about latency, even if your idea of acceptable latency differs from someone else’s by seconds or minutes.
Bottom line, I think the fact that some CEP technology can be used for those very demanding high-performance applications that require very high throughput or very low latency sometimes distorts the discussion. The big numbers (or extremely little numbers in the case of ultra-low latency) get people’s attention, but they shouldn’t cause people to think that they only need CEP if they are building a high-speed application. If there is a need to analyze and act on event data in real-time, then CEP can help, regardless of how fast the events arrive or how quickly you need to respond.
Finally, lest I create the wrong impression, performance is merely ONE feature of CEP that users care about. It is by no means the only one. In fact, a lot of the time I think too much attention gets focused on performance at the expense of other critical factors - the most fundamental one being: can it address the use case in question - at any speed? But that’s for another day.