What is it?
Event streaming is the capture of data events in real time from numerous event sources. Sources can include databases, software applications, sensors, mobile devices and cloud/infrastructure services. Events are captured as unbounded streams and stored durably in logs for later retrieval, manipulation and processing. Event sources are typically referred to as producers. Consumers can react to events in real time, process them retrospectively, and also create new streams of events (stream processing).
Event streaming platforms are a broad class of Message-Oriented Middleware (MoM) that organise messages into append-only topics rather than the queues and topics of more traditional message brokers. They also differ in message ordering: traditional MoM technologies do not necessarily guarantee the order of messages unless specifically configured for first-in, first-out (FIFO) delivery.
Another key difference from a traditional message queue is the persistence of records. Message queues typically process a message and then discard it. Event streams persist a historical log of all events, making them an ideal source of information for analytics and general data intelligence.
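The queue-versus-log distinction above can be sketched in a few lines of Python. This is an illustrative toy, not a real broker: events appended to the log are retained after being read, and each consumer tracks its own offset, so the same history can be replayed at any time.

```python
class EventLog:
    """Toy append-only log: events are retained after being read."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)
        return len(self._events) - 1  # offset of the new event

    def read_from(self, offset):
        # Reading does not remove anything - unlike a queue's "pop".
        return self._events[offset:]


log = EventLog()
for e in ["order_created", "order_paid", "order_shipped"]:
    log.append(e)

# Two independent consumers hold their own positions in the log.
analytics_offset = 0
billing_offset = 1
print(log.read_from(analytics_offset))  # full history for analytics
print(log.read_from(billing_offset))    # billing picks up from offset 1

# Replay is always possible - the log is a durable history.
print(log.read_from(0))
```

A traditional queue would have discarded each message on first read; here the analytics consumer can arrive later and still see every event.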
Value proposition of the capability
Event streaming platforms can typically scale to handle trillions of events a day and offer stronger ordering guarantees than traditional MoM. An event stream is an append-only log of ordered events. This makes streams very fast and highly durable, whereas traditional messaging technologies generally discard events once they are read from a queue or topic. The technology is the foundational underpinning of an event-driven architecture and is generally referred to as the event backbone.
Event driven architecture is associated with a number of key advantages over more traditional application monoliths:
- Loose coupling between components/services
- True decoupling of producers and consumers
- Ability to scale individual components/services
- Components/services can be developed independent of each other
- High cloud affinity
- Asynchronous push based messaging
- Fault tolerance and better resilience
- Ability to build processing pipelines
- Availability of sophisticated brokers to reduce code complexity
- Real time event streams for data science and machine learning
The architecture and technology go hand in hand with microservices architecture but are also increasingly being used in more traditional data integration scenarios. Connectors to common systems, applications and technologies allow data to be offered up at scale for consumption elsewhere in the organisation.
Common uses or use cases
- Messaging e.g. as a replacement for a traditional message broker
- Website activity tracking e.g. user activity tracking
- Metrics e.g. operational monitoring data from multiple applications
- Log aggregation e.g. abstracting log information from traditional file based logging
- Stream processing e.g. processing data in pipelines consisting of multiple stages
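The last use case, a multi-stage processing pipeline, can be sketched with plain Python generators. This is a hedged illustration only; real deployments would use a stream processing framework such as Kafka Streams or Flink, and the stage names and sample events here are assumptions.

```python
def parse(stream):
    """Stage 1: turn raw comma-separated lines into (user, action) events."""
    for line in stream:
        user, action = line.split(",")
        yield user, action


def only_purchases(events):
    """Stage 2: filter the stream down to purchase events."""
    for user, action in events:
        if action == "purchase":
            yield user, action


def count_by_user(events):
    """Stage 3: maintain a running aggregate per user."""
    counts = {}
    for user, _ in events:
        counts[user] = counts.get(user, 0) + 1
    return counts


raw = ["alice,view", "bob,purchase", "alice,purchase", "alice,purchase"]
result = count_by_user(only_purchases(parse(raw)))
print(result)  # {'bob': 1, 'alice': 2}
```

Each stage consumes the previous stage's output lazily, which is the same shape, stages connected by streams, that the frameworks above provide at scale.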
Many organisations across industries undergoing digital transformation initiatives are using event streaming as one of the key technology enablers. Almost every industry has scenarios where an event driven approach can deliver game changing outcomes:
- Automotive – tracking and monitoring vehicle events, driverless cars, smart cities etc
- Manufacturing – plant events, manufacturing line events from IoT devices
- Healthcare – patient and hospital equipment events
- Retail – understanding consumer behaviours, order processing events
- Logistics – supply chain event tracking
Event streams effectively capture behaviours, and as such are being used heavily in machine learning and artificial intelligence scenarios. They also provide opportunities for true real-time analytics, allowing organisations to make more timely and accurate business decisions.
Implementation Best Practices
Implementation best practice for event-driven architecture starts with data design around the events, and more specifically the granularity of events. As with APIs, we need to design events with a consumption mindset: what burden does the event we produce place on a subscriber to that event? Very fine-grained events can leave the consumer needing to aggregate numerous events before it can act. An example might be an event for every single field change within a record, which then needs compiling into one record update. If events are too coarse-grained, there is potential to miss the behaviour in the event and the underlying value that the event stream could deliver.
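The granularity trade-off described above can be made concrete with a small sketch. The event shapes and field names here are illustrative assumptions; the point is the extra merging work that fine-grained events push onto every consumer.

```python
# Fine-grained: one event per field change on the same record.
fine_grained = [
    {"record_id": 42, "field": "email", "value": "a@example.com"},
    {"record_id": 42, "field": "phone", "value": "555-0100"},
    {"record_id": 42, "field": "city",  "value": "Leeds"},
]


def merge_field_events(events):
    """Consumer burden: compile many field events into one record update."""
    record = {}
    for e in events:
        record[e["field"]] = e["value"]
    return record


print(merge_field_events(fine_grained))

# Coarser alternative: one event carrying the whole update, which a
# consumer can act on immediately without buffering or merging.
coarse_grained = {
    "record_id": 42,
    "changes": {
        "email": "a@example.com",
        "phone": "555-0100",
        "city": "Leeds",
    },
}
```

The fine-grained form forces every subscriber to reimplement the merge (and decide when the update is "complete"); the coarse form hands over a usable unit but reveals less about individual behaviours.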
The next step is to group related event data. There may be a tendency to denormalise all of the event structures, which can ultimately make them unwieldy to work with, especially as event structures change. Grouping may also lead naturally into defining stream processing requirements. We have to accept that event structures will change over time, but we need to be careful not to break the subscribers to those events. Loose coupling doesn't mean a complete lack of contract between producer and consumer: the contract lies in making sure the event can be interpreted without inherent knowledge of the underlying system or device that produced it.
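One common way to honour that loose-but-real contract is the tolerant reader pattern: the consumer depends only on the fields it needs and ignores additions, so the producer can evolve the event without breaking subscribers. The event and field names below are illustrative assumptions.

```python
def handle_order_event(event):
    """Consumer that depends only on the contracted fields.

    Unknown extra fields are ignored, so producers can add fields
    (a backward-compatible change) without breaking this subscriber.
    """
    required = {"order_id", "status"}
    missing = required - event.keys()
    if missing:
        raise ValueError(f"event breaks contract, missing: {missing}")
    return f"order {event['order_id']} is now {event['status']}"


# Version 1 of the event.
print(handle_order_event({"order_id": "A1", "status": "paid"}))

# Version 2 adds a field; the tolerant reader is unaffected.
print(handle_order_event(
    {"order_id": "A1", "status": "shipped", "carrier": "DHL"}
))
```

Removing or renaming a contracted field, by contrast, is a breaking change and would need coordination with subscribers (for example via schema versioning).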
Organisations need to structure, size and partition their event architecture appropriately. Event-driven architectures are generally complex affairs, and wholesale redesign is not something you want to find yourself doing post-implementation. Event data is partitioned across infrastructure, and what may seem like minor changes can have further-reaching effects than you might anticipate. Working with the technology vendor and a partner like Chakray to validate your setup can help avoid costly mistakes down the track.
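A concrete example of a "minor" change with broad effects is resizing the partition count. With the common key-hash partitioning scheme, all events for a key land on one partition (preserving per-key order), but changing the number of partitions remaps keys. This sketch uses `zlib.crc32` as a stable hash for illustration; real platforms have their own partitioners.

```python
import zlib


def partition_for(key, num_partitions):
    """Assign a key to a partition via a stable hash (illustrative)."""
    return zlib.crc32(key.encode()) % num_partitions


keys = ["user-1", "user-2", "user-3", "user-4"]

before = {k: partition_for(k, 6) for k in keys}   # original layout
after = {k: partition_for(k, 8) for k in keys}    # after a resize

moved = [k for k in keys if before[k] != after[k]]
print(f"keys remapped after resize: {moved}")
```

Any key that moves partitions loses its guarantee of ordering relative to its earlier events, which is why partition sizing deserves design effort up front rather than adjustment after go-live.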
How do technologies differ?
There are a number of vendors and technologies in the event streaming space. Apache Kafka is one of the best known and most broadly used, but Microsoft Azure Event Grid/Event Hubs, Google Pub/Sub, Amazon Kinesis and Solace PubSub+ also have strong market adoption. Buyers will also find numerous vendor distributions of Apache Kafka, the most popular being Confluent's; Microsoft, Amazon and Aiven all provide supported implementations of Kafka as well.
Confluent sets itself apart through its ongoing contribution to and sponsorship of the Apache Kafka project, with founding members of the project within the organisation. Confluent also provides a number of components and services for Apache Kafka not available elsewhere in the market.
Selection of an appropriate technology for event streaming may well be driven for some organisations by their existing cloud infrastructure investments. Organisations that are heavily invested in Azure may choose Event Grid or Event Hubs. Those with investment in Google may choose Google Pub/Sub. The same principles of success apply regardless of the underlying technology.