What We Talk About When We Talk About Queries
A query, as any dictionary will tell you, is a question. Traditional databases are designed to answer concrete questions about existing data — that is, they’re made to tell us what happened in the past.
Querying the past has been well addressed by many years of research and practice, and modern data warehouses do a fantastic job with everything from end-of-quarter reporting to historical trend analysis. However, as technology has progressed and the speed of business has increased, it has become both possible and necessary to ask about the ever more recent past. “How are we doing this month?” “What was our shipping status at close of business yesterday?” “How many visitors has our website had in the last fifteen minutes?” The nearer these questions come to addressing the present moment, the more difficult it becomes for traditional databases to answer them.
This tension between the need for timely answers about recent events and the optimization of databases for retrospective analysis has led to the creation of a new class of data processing technology known as Complex Event Processing (CEP).
Continuous Queries
CEP systems, like databases, are designed to answer questions, but they extend the domain of inquiry into the future. One of the ways in which they do this is by allowing us to specify traditional queries in advance: for instance, asking “how well is our inventory balanced to our orders?” of a system that guarantees it will have the latest answer on hand at any moment in the future. These forward-looking versions of traditional queries are called continuous queries because they operate continuously against arriving data rather than calculating their results once against a body of static data, as conventional database queries do.
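To make the idea concrete, here is a minimal sketch of a continuous query in plain Python. It is not the API of any particular CEP product; the class name `InventoryBalanceQuery`, the event kinds `"stock"` and `"order"`, and the sample events are all assumptions made for illustration. The point it shows is that the answer is updated incrementally as each event arrives, so the latest result is always available without rescanning stored data.

```python
from collections import defaultdict


class InventoryBalanceQuery:
    """Hypothetical continuous query: inventory on hand minus open orders."""

    def __init__(self):
        self.stock = defaultdict(int)    # units on hand per item
        self.ordered = defaultdict(int)  # units on open orders per item

    def on_event(self, kind, item, quantity):
        # Update the running state incrementally as each event arrives.
        if kind == "stock":
            self.stock[item] += quantity
        elif kind == "order":
            self.ordered[item] += quantity
        else:
            raise ValueError(f"unknown event kind: {kind}")

    def result(self):
        # The latest answer is always on hand; no stored data is rescanned.
        items = set(self.stock) | set(self.ordered)
        return {item: self.stock[item] - self.ordered[item] for item in items}


query = InventoryBalanceQuery()
events = [("stock", "widget", 10), ("order", "widget", 4), ("order", "gadget", 3)]
for event in events:
    query.on_event(*event)
    print(query.result())  # the answer after every arriving event
```

A conventional database query would instead be run on demand against the stored tables, and its result would start aging the moment it returned.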
Using a continuous query can allow one to move from a business process that relies on extracting business intelligence via an end-of-day report to one that provides the same information on a continuously updated dashboard, thus granting the enterprise greater agility. In a similar fashion, we see customers in the financial industry moving computations that have traditionally been run every fifteen minutes to real-time dashboards, giving them an important edge over the competition.
Conditional Queries
In addition to changing the way traditional queries are performed, CEP systems allow us to ask a wholly different class of questions about events that may or may not occur in the future. These conditional queries typically involve a pattern of events that represents a condition about which one would want to be informed immediately. A simple conditional query might be something like “alert me if our stock price dips below a certain value,” while a more complicated one could be “flag any credit card charges that match the profile for fraudulent transactions.”
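The simpler of the two examples, the stock price alert, can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `price_dip_alerts`, the `(timestamp, price)` tuple format, and the choice to alert once per downward crossing (rather than on every tick below the threshold) are assumptions, not features of any specific CEP engine.

```python
def price_dip_alerts(prices, threshold):
    """Yield (timestamp, price) each time the price crosses below the threshold."""
    below = False  # whether the previous price was already below the threshold
    for timestamp, price in prices:
        if price < threshold and not below:
            yield (timestamp, price)  # the condition has just become true
        below = price < threshold


# Hypothetical tick stream of (timestamp, price) pairs.
ticks = [(1, 101.0), (2, 99.5), (3, 98.0), (4, 100.5), (5, 99.9)]
alerts = list(price_dip_alerts(ticks, threshold=100.0))
print(alerts)  # [(2, 99.5), (5, 99.9)]
```

Because the function is a generator, it can consume an unbounded stream and emits nothing at all if the condition never occurs, which is the defining trait of a conditional query.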
Conditional queries are frequently used for situation detection, often by integrating statistical models derived from historical data. We see customers applying these techniques to problems like network security, national security, fraud detection, algorithmic trading, and so on.
In many cases, it’s necessary to bring together all three of the technologies discussed in this article (i.e., OLTP, OLAP, and CEP), along with a great deal of domain expertise, to build an application. A fraud prevention application, for example, might build a baseline profile of fraudulent transactions by collecting historical data from a data warehouse, overlay that baseline with up-to-the-minute updates provided by a continuous query, then perform the actual fraud detection with a conditional query that uses the data maintained by the continuous query in its calculations. Because this can be done in milliseconds or less, it can be used to prevent fraudulent transactions as they happen rather than just detect them and shut down future transactions.
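The three-step structure of such an application can be sketched as follows. This is a deliberately simplified stand-in, not a real fraud model: the class `FraudScreen`, the `z_limit` parameter, the sample amounts, and the rule itself (flag a charge that exceeds the expected amount by several standard deviations) are all assumptions chosen to keep the example short. A list of past charge amounts plays the role of the data warehouse.

```python
from statistics import mean, pstdev


class FraudScreen:
    """Hypothetical hybrid check: historical baseline + live state + condition."""

    def __init__(self, historical_amounts, z_limit=3.0):
        # Historical step: a baseline computed once from past charges
        # (stands in for a data warehouse query).
        self.base_mean = mean(historical_amounts)
        self.base_std = pstdev(historical_amounts)
        self.z_limit = z_limit
        self.count = {}  # accepted charges seen per card
        self.avg = {}    # running mean amount per card

    def expected(self, card):
        # Overlay: prefer the card's live running mean, else the baseline.
        return self.avg.get(card, self.base_mean)

    def on_charge(self, card, amount):
        # Conditional step: decide before updating state, so the charge
        # can be blocked rather than merely reported afterwards.
        limit = self.expected(card) + self.z_limit * self.base_std
        suspicious = amount > limit
        if not suspicious:
            # Continuous step: fold the accepted charge into the running mean.
            n = self.count.get(card, 0) + 1
            prev = self.expected(card)
            self.count[card] = n
            self.avg[card] = prev + (amount - prev) / n
        return suspicious


screen = FraudScreen(historical_amounts=[20, 25, 30, 22, 28])
print(screen.on_charge("card-1", 24))   # False: close to the baseline
print(screen.on_charge("card-1", 500))  # True: far above the card's running mean
print(screen.on_charge("card-1", 30))   # False: within the allowed range
```

The check runs inside `on_charge`, before the charge is accepted, which mirrors the distinction the article draws between preventing a transaction and detecting it after the fact.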
In brief, combining the retrospective functionality of a data warehouse (i.e., historical analytics) with the prospective power of a CEP solution (i.e., real-time analytics) provides a hybrid analytics solution that leverages the power of both historical and real-time data.