

Siddhi 5.1 Features

New features of Siddhi 5.1

Information on new features of Siddhi 5.1 can be found in the release blog.

Development and deployment

Siddhi Streaming SQL allows writing processing logic for event consumption, processing, integration, and publishing as a SQL script.

Siddhi Editor provides graphical drag-and-drop and source-based query building capability, with event flow visualization, syntax highlighting, auto-completion, and error handling support.
Event simulation support to test the apps by sending events one by one, as a feed, or from a CSV file or database.
Ability to export the developed apps as .siddhi files, or as Docker or Kubernetes artifacts with necessary extensions.

Siddhi test framework provides tools to build unit, integration, and black-box tests, to support a CI/CD pipeline with an agile DevOps workflow.
App version management supported by managing .siddhi files in a preferred version control system.

Embedded execution in Java and Python as libraries.
Run as a standalone microservice on bare metal, in a VM, or in Docker.
Deploy and run as a standalone or as distributed microservices natively in Kubernetes, using the Siddhi Kubernetes operator.

Consume and publish events via NATS, Kafka, RabbitMQ, HTTP, gRPC, TCP, JMS, IBM MQ, MQTT, Amazon SQS, Google Pub/Sub, Email, WebSocket, File, Change Data Capture (CDC) (from MySQL, Oracle, MSSQL, DB2, and PostgreSQL), S3, Google Cloud Storage, and in-memory.
Support message formats such as JSON, XML, Avro, Protobuf, Text, Binary, Key-value, and CSV.
Rate-limit the output based on time and number of events.
Perform load balancing and failover when publishing events to endpoints.
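
As a rough illustration of limiting output by number of events, the sketch below emits only the last event out of every group of n events. It is plain Python written to show the idea, not Siddhi code or Siddhi's implementation, and the function name last_every_n is hypothetical.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def last_every_n(events: Iterable[T], n: int) -> Iterator[T]:
    """Yield only the last event of each consecutive group of n events."""
    if n <= 0:
        raise ValueError("n must be positive")
    count = 0
    for event in events:
        count += 1
        if count == n:
            # The group is full: emit its last event and start a new group.
            yield event
            count = 0


if __name__ == "__main__":
    # Ten events in, one event out for every three received.
    print(list(last_every_n(range(1, 11), 3)))  # prints [3, 6, 9]
```

Events that arrive after the last complete group (the tenth event here) are held back until the group fills, which is the trade-off of any count-based limit.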

Filter events based on conditions such as value ranges, string matching, regex, and others.
Clean data by setting defaults and handling nulls, using the default and if-then-else functions, among many others.
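
The filtering and cleansing steps above can be pictured with a small Python sketch: missing values are replaced with a default, and only events inside a value range are passed on. The event shape (room, temp), the function name clean_and_filter, and the threshold values are all assumptions made for illustration; this is not how Siddhi itself is written or invoked.

```python
from typing import Dict, Iterable, Iterator, Optional, Union

Event = Dict[str, Union[str, Optional[float]]]


def clean_and_filter(
    readings: Iterable[Event],
    default_temp: float = 0.0,
    low: float = 20.0,
    high: float = 30.0,
) -> Iterator[Event]:
    """Replace null temperatures with a default, then keep values in [low, high]."""
    for reading in readings:
        temp = reading["temp"]
        if temp is None:
            # Data cleansing: substitute a default for the missing value.
            temp = default_temp
        if low <= temp <= high:
            # Filtering: forward only events inside the accepted range.
            yield {"room": reading["room"], "temp": temp}


if __name__ == "__main__":
    sample = [
        {"room": "a", "temp": 25.0},
        {"room": "b", "temp": None},
        {"room": "c", "temp": 35.0},
    ]
    for event in clean_and_filter(sample, default_temp=22.0):
        print(event)
```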

Support data extraction and reconstruction of messages.
Inline mathematical and logical operations.
Inbuilt functions and 60+ extensions for processing JSON, string, time, math, regex, and others.
Ability to write custom functions in JavaScript and Java.

Query, modify, and join the data stored in in-memory tables which support primary key constraints and indexing.
Query, modify, and join the data stored in stores backed by systems such as RDBMS (MySQL, Oracle, MSSQL, DB2, PostgreSQL, H2), Redis, MongoDB, HBase, Cassandra, Solr, and Elasticsearch.
Support low-latency processing by preloading and caching data using caching modes such as FIFO, LRU, and LFU.
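
To make the caching idea concrete, here is a minimal Python sketch of an LRU (least recently used) cache placed in front of a slower store. The class name LRUCache and the loader callback are hypothetical and stand in for a database lookup; Siddhi's own cache is configured declaratively and is not implemented this way.

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable


class LRUCache:
    """Keep at most `capacity` entries, evicting the least recently used one."""

    def __init__(self, capacity: int, loader: Callable[[Hashable], Any]) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._loader = loader  # assumed to fetch the value from the backing store
        self._data: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Any:
        if key in self._data:
            # Cache hit: mark the entry as most recently used.
            self._data.move_to_end(key)
            return self._data[key]
        # Cache miss: load from the store, then evict the oldest entry if full.
        value = self._loader(key)
        self._data[key] = value
        if len(self._data) > self._capacity:
            self._data.popitem(last=False)
        return value


if __name__ == "__main__":
    cache = LRUCache(2, loader=lambda key: f"row-for-{key}")
    for key in ["a", "b", "a", "c", "b"]:
        print(key, cache.get(key))
```

With FIFO the hit on "a" would not refresh its position, so "a" rather than "b" would be evicted when "c" arrives; that difference is what distinguishes the caching modes.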

Support for calling HTTP and gRPC services in a non-blocking manner to fetch data and enrich events.
Handle responses accordingly for different response status codes.
Various error handling options to handle endpoint unavailability while retrying to connect, such as:
- Logging and dropping the events.
- Waiting indefinitely and not consuming events from the sources.
- Diverting the events to an error stream to handle the errors gracefully.

Aggregate data using sum, count, average (avg), min, max, distinctCount, and standard deviation (StdDev) operators.
Event summarization based on time intervals using sliding time, or tumbling/batch time windows.
Event summarization based on number of events using sliding length, and tumbling/batch length windows.
Support for data summarization based on sessions and uniqueness.
Ability to run long-running aggregations with time granularities from seconds to years, using both in-memory and database storage, via named aggregation.
Support to aggregate data based on group by fields, filter aggregated data using having conditions, and sort & limit the aggregated output using order by & limit keywords.
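
The difference between sliding and tumbling/batch length windows can be seen in a short Python sketch that computes an average both ways. The function names are hypothetical, and the code only illustrates the windowing semantics under the assumption that a sliding window emits on every arrival and a batch window emits once per full batch; it is not Siddhi's implementation.

```python
from collections import deque
from typing import Iterable, Iterator, List


def sliding_length_avg(values: Iterable[float], length: int) -> Iterator[float]:
    """Emit the average of the most recent `length` values on every arrival."""
    window: deque = deque(maxlen=length)  # oldest value drops out automatically
    for value in values:
        window.append(value)
        yield sum(window) / len(window)


def batch_length_avg(values: Iterable[float], length: int) -> Iterator[float]:
    """Emit one average per full, non-overlapping batch of `length` values."""
    batch: List[float] = []
    for value in values:
        batch.append(value)
        if len(batch) == length:
            yield sum(batch) / length
            batch = []  # start the next batch from scratch


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6]
    print(list(sliding_length_avg(data, 3)))  # prints [1.0, 1.5, 2.0, 3.0, 4.0, 5.0]
    print(list(batch_length_avg(data, 3)))    # prints [2.0, 5.0]
```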

Execution of rules based on a single event using the filter operator, if-then-else and match functions, and many others.
Rules based on a collection of events using data summarization, and joins with streams, tables, windows, or aggregations.
Rules to detect event occurrence patterns, trends, or the non-occurrence of critical events using complex event processing constructs such as pattern and sequence.
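
Non-occurrence detection is the least intuitive of these rules, so a small Python sketch may help: it reports orders that were not followed by a matching payment within a timeout. The event tuples (timestamp, kind, order id), the function name unpaid_orders, and the use of event timestamps to advance time are assumptions for illustration only; Siddhi expresses such rules declaratively with its pattern construct.

```python
from typing import Dict, Iterable, List, Tuple

Event = Tuple[float, str, str]  # (timestamp in seconds, "order" or "payment", order id)


def unpaid_orders(events: Iterable[Event], timeout: float) -> List[str]:
    """Return ids of orders with no payment within `timeout` seconds.

    Events must be sorted by timestamp. Time only advances when an event
    arrives, so orders still pending at the end of the input are not reported.
    """
    pending: Dict[str, float] = {}  # order id -> timestamp of the order
    late: List[str] = []
    for timestamp, kind, order_id in events:
        # Expire every pending order whose deadline has already passed.
        for pending_id, ordered_at in list(pending.items()):
            if timestamp - ordered_at > timeout:
                late.append(pending_id)
                del pending[pending_id]
        if kind == "order":
            pending[order_id] = timestamp
        elif kind == "payment":
            # A payment in time cancels the wait; a late one finds nothing to cancel.
            pending.pop(order_id, None)
    return late


if __name__ == "__main__":
    stream = [
        (0, "order", "o1"),
        (1, "order", "o2"),
        (3, "payment", "o1"),
        (12, "payment", "o2"),  # arrives 11 s after the order, past the 10 s limit
        (13, "order", "o3"),
    ]
    print(unpaid_orders(stream, timeout=10))  # prints ['o2']
```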

Serve pre-created ML models based on TensorFlow or PMML that are built via Python, R, Spark, H2O.ai, or others.
Ability to call models via HTTP or gRPC for decision making.
Online machine learning for clustering, classification, and regression.
Anomaly detection using Markov chains.

Process complex messages by dividing them into simple messages using the tokenize function, process or transform them in isolation, and group them back using the batch window and group aggregation.
Ability to modularize the execution logic of each use case into a single .siddhi file (Siddhi Application), and connect them using in-memory source and in-memory sink to build a composite event-driven application.
Provide execution isolation and parallel processing by partitioning the events using keys or value ranges.
Trigger data pipelines periodically based on time intervals or cron expressions, or at app startup, using triggers.
Synchronize and parallelize event processing using @sync annotations.
Ability to alter the speed of processing based on event generation time.
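
Partitioning by key, mentioned above, means that each key gets its own isolated processing state. The Python sketch below shows that idea with a per-key running total; the function name and the (key, amount) event shape are hypothetical, and the sketch runs sequentially, whereas a real engine could process the partitions in parallel.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def partitioned_running_total(
    events: Iterable[Tuple[str, float]],
) -> Iterator[Tuple[str, float]]:
    """Keep a separate running total per partition key and emit it per event."""
    totals: Dict[str, float] = defaultdict(float)  # one state entry per key
    for key, amount in events:
        # Only the state belonging to this event's key is read or updated.
        totals[key] += amount
        yield key, totals[key]


if __name__ == "__main__":
    payments = [("alice", 1.0), ("bob", 2.0), ("alice", 3.0)]
    print(list(partitioned_running_total(payments)))
    # prints [('alice', 1.0), ('bob', 2.0), ('alice', 4.0)]
```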

Act as an HTTP or gRPC service to provide real-time synchronous decisions via the service source and service-response sink.
Provide REST APIs to query in-memory tables, windows, named aggregations, and database-backed stores such as RDBMS and NoSQL databases, to make decisions based on the state of the system.

Support for periodic incremental state persistence to allow state restoration during failures.
