User-Defined Aggregate Functions (UDAF) Using Apache Spark

UDAF stands for User-Defined Aggregate Function. An aggregate function performs a calculation on a set of values and returns a single value. Writing one is harder than writing a User-Defined Function (UDF) because it must aggregate across multiple rows and columns. An Apache Spark UDAF operates on more than one row or column and returns a single result.
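To make the "many rows in, one value out" idea concrete, here is a minimal sketch in plain Python of how such an aggregation might be structured. It does not use Spark at all. The function names (`initialize`, `update`, `merge`, `evaluate`) are loosely modeled on the steps of Spark's `UserDefinedAggregateFunction`, while `MeanBuffer`, `aggregate`, and the sample data are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Iterable, List, Optional


@dataclass
class MeanBuffer:
    """Intermediate state (hypothetical): running sum and row count."""
    total: float = 0.0
    count: int = 0


def initialize() -> MeanBuffer:
    # Start every partition with an empty buffer.
    return MeanBuffer()


def update(buf: MeanBuffer, value: float) -> MeanBuffer:
    # Fold one input row into the buffer.
    return MeanBuffer(buf.total + value, buf.count + 1)


def merge(left: MeanBuffer, right: MeanBuffer) -> MeanBuffer:
    # Combine two partial buffers, e.g. from different partitions.
    return MeanBuffer(left.total + right.total, left.count + right.count)


def evaluate(buf: MeanBuffer) -> Optional[float]:
    # Produce the single final value; None when no rows were seen.
    return buf.total / buf.count if buf.count else None


def aggregate(partitions: Iterable[List[float]]) -> Optional[float]:
    # Aggregate each partition independently, then merge the partial results.
    partials = [reduce(update, rows, initialize()) for rows in partitions]
    return evaluate(reduce(merge, partials, initialize()))


# Sample rows split across three "partitions" (assumed data).
partitions = [[1.0, 2.0, 3.0], [4.0, 5.0], []]
result = aggregate(partitions)
print(result)
```

Keeping a sum and a count in the buffer, rather than a running mean, is what lets partial results from separate partitions be merged without losing accuracy.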


What is Apache Spark? The Unified engine for large-scale data analytics.

Apache Spark is a distributed processing system, optimized for both in-memory and disk-based execution, that performs real-time analytics using Resilient Distributed Datasets (RDDs). Spark includes a streaming library and a rich set of programming interfaces that make data processing and transformation easier.
