Batch processing Vs Stream processing

What is batch processing?

Batch processing is the processing of transactions in a group, or batch. Once a batch job is running, no user interaction is required. This differentiates batch processing from transaction processing, which handles transactions one at a time and typically requires user interaction.
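As a minimal sketch of the idea, the Python snippet below processes a whole group of transactions in a single unattended run. The account names, amounts, and the `process_batch` function are hypothetical and only serve as an illustration.

```python
from collections import defaultdict


def process_batch(transactions):
    """Process a whole group of transactions in one unattended run."""
    totals = defaultdict(float)
    for account, amount in transactions:
        totals[account] += amount
    return dict(totals)


# Hypothetical transactions collected during the day and processed together.
day_transactions = [("alice", 20.0), ("bob", 5.5), ("alice", -3.0)]
batch_result = process_batch(day_transactions)
print(batch_result)  # {'alice': 17.0, 'bob': 5.5}
```

The whole input is available before the job starts, and nothing is returned until the entire batch has been handled.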

What is stream processing?

Stream processing is the analysis of streaming data in real time. Analysts can continuously monitor a stream of data as it arrives, for example to keep dashboards up to date or to detect anomalies.

Stream processing is a low-latency way to capture information about events while they are in transit, processing the data as it arrives. A data stream, or event stream, can include almost any type of information: social network or web browsing path data, factory production and other process data, stock or financial transaction details, patient data in a hospital, machine learning system data, and IoT (Internet of Things) data.
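To make the contrast with batch processing concrete, here is a small Python sketch that handles events one at a time and updates its result after each one. The `event_stream` generator stands in for a real source such as a message queue or a socket. The sensor names and values are made up.

```python
def event_stream():
    """Hypothetical event source; a real one would be a queue, socket, or log."""
    yield from [
        {"sensor": "a", "temp": 20.0},
        {"sensor": "b", "temp": 22.0},
        {"sensor": "a", "temp": 24.0},
    ]


def running_average(events):
    """Yield the updated average after every event, without storing the stream."""
    count, total = 0, 0.0
    for event in events:
        count += 1
        total += event["temp"]
        yield total / count


averages = list(running_average(event_stream()))
print(averages)  # [20.0, 21.0, 22.0]
```

Only a counter and a running total are kept in memory, so the same logic would work on a stream that never ends.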

Use cases for data processing vary widely. They range from the most accessible to the most in-depth, depending on the company's maturity in this area and on its business needs. Here are some examples:

1. Streaming ETL:

ETL stands for Extract, Transform, Load. It is a mechanism for acquiring data from various source systems (Extract), normalizing it (Transform), and then loading the transformed data into the target data warehouse (Load). Streaming ETL provides real-time insights and dashboards by transforming the data as soon as it arrives, which helps businesses make quick, well-informed decisions.
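The three steps can be sketched as a chain of Python generators, so that each record is transformed and loaded as soon as it is extracted. This is only an illustration. The record fields (`user`, `amount_cents`), the `raw_source` lines, and the in-memory `warehouse` list are assumptions standing in for real source systems and a real data warehouse.

```python
import json


def extract(lines):
    """Extract: parse each raw JSON line as it arrives."""
    for line in lines:
        yield json.loads(line)


def transform(records):
    """Transform: normalize user names and convert cents to dollars."""
    for record in records:
        yield {
            "user": record["user"].strip().lower(),
            "amount_usd": round(record["amount_cents"] / 100, 2),
        }


def load(records, warehouse):
    """Load: append each transformed record to the target store."""
    for record in records:
        warehouse.append(record)


# Hypothetical raw events; a list stands in for the data warehouse.
raw_source = [
    '{"user": " Alice ", "amount_cents": 1999}',
    '{"user": "BOB", "amount_cents": 250}',
]
warehouse = []
load(transform(extract(raw_source)), warehouse)
print(warehouse)
```

Because generators are lazy, a record flows through all three stages before the next one is read, which is the property that distinguishes streaming ETL from a nightly batch load.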

2. Anomaly and fraud detection:

At the most basic level, anomaly detection is the process of identifying data elements that stand out suspiciously from the rest: rare occurrences, unexpected behaviors, fraud, conflicting assets, and other outliers. Technically, these elements are called “dataset outliers”. They can indicate corrupted data, compromised secrets, hardware malfunctions, fraudulent activity of various kinds, and so on. Machine learning-powered anomaly detection builds on traditional anomaly detection by using machine learning to speed up and streamline the process.
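As a rough illustration of the traditional, non-machine-learning approach, the sketch below flags values that lie far from the mean of a sliding window of recent values (a rolling z-score). The window size, the threshold, and the sample readings are arbitrary assumptions, and a production fraud detector would be considerably more involved.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(values, window=5, threshold=3.0):
    """Yield (index, value) for values far from the recent rolling mean."""
    history = deque(maxlen=window)
    for index, value in enumerate(values):
        if len(history) == window:
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                yield index, value
                continue  # keep the outlier out of the baseline window
        history.append(value)


# Hypothetical readings with one obvious outlier at index 5.
readings = [10, 11, 10, 12, 11, 95, 10, 11]
anomalies = list(detect_anomalies(readings))
print(anomalies)  # [(5, 95)]
```

The detector needs a full window before it can judge anything, so the first five readings are never flagged.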

3. Predictive analytics:

Predictive analytics is the process of using historical data to predict future outcomes. When it is combined with real-time streaming data, predictions can be updated as new events arrive, which can improve customer satisfaction and operational efficiency.
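One very simple way to predict the next value from past ones while data keeps arriving is exponential smoothing, sketched below. It is just one possible technique, chosen here for brevity. The smoothing factor `alpha` and the input values are assumptions.

```python
def smoothed_forecasts(values, alpha=0.5):
    """After each observation, yield the forecast for the next value."""
    level = None
    for value in values:
        # The first observation initializes the level; later ones blend in.
        level = value if level is None else alpha * value + (1 - alpha) * level
        yield level


forecasts = list(smoothed_forecasts([10.0, 20.0, 30.0]))
print(forecasts)  # [10.0, 15.0, 22.5]
```

Each forecast is built from all earlier observations, with recent ones weighted more heavily, and it is refreshed the moment a new event arrives.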

Posted on July 20, 2022 by Yassine, LASRI