We often find ourselves chaining .read(), .join(), .groupBy() and .agg() for machine learning, recommendations or predictions. I have spent countless hours writing, testing and ultimately debugging these lines of code to work out which line caused a bad transformation. Worse, they are one-off implementations with no reusability. Our goal is to replace these chained implementations with something more reusable and scalable.

Photo by Jez Timms on Unsplash

In this blog post, we will discuss how to modularize a Spark application using Scala's implicit keyword and ad-hoc polymorphism.
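To make the two ideas concrete before going further, here is a minimal sketch of how they might be combined. It is not the implementation this post builds; every name in it (`Loader`, `CsvSource`, `ParquetSource`, `SalesTransforms`, `SalesOps`) and every column name (`customer_id`, `country`, `amount`) is a hypothetical chosen for illustration. The type class `Loader` provides the ad-hoc polymorphism (one `load` call, a different implementation per source type), and the implicit class gives each transformation step a name that can be tested on its own.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, sum}

// Hypothetical source descriptions; the names and fields are assumptions.
final case class CsvSource(path: String)
final case class ParquetSource(path: String)

// Type class: describes how to load a source of type A into a DataFrame.
trait Loader[A] {
  def load(spark: SparkSession, source: A): DataFrame
}

object Loader {
  // One implicit instance per source type; the compiler picks the right one.
  implicit val csvLoader: Loader[CsvSource] = new Loader[CsvSource] {
    def load(spark: SparkSession, source: CsvSource): DataFrame =
      spark.read
        .option("header", "true")
        .option("inferSchema", "true")
        .csv(source.path)
  }

  implicit val parquetLoader: Loader[ParquetSource] = new Loader[ParquetSource] {
    def load(spark: SparkSession, source: ParquetSource): DataFrame =
      spark.read.parquet(source.path)
  }
}

object SalesTransforms {
  // A single entry point for every source type that has a Loader instance.
  def load[A](source: A)(implicit spark: SparkSession, loader: Loader[A]): DataFrame =
    loader.load(spark, source)

  // Implicit class: adds named, reusable steps to any DataFrame.
  implicit class SalesOps(val df: DataFrame) extends AnyVal {
    // Inner join on the assumed key column customer_id.
    def withCustomers(customers: DataFrame): DataFrame =
      df.join(customers, Seq("customer_id"), "inner")

    // Sum the assumed amount column per country.
    def totalByCountry: DataFrame =
      df.groupBy(col("country")).agg(sum(col("amount")).as("total_amount"))
  }
}
```

With `import SalesTransforms._` and an implicit `SparkSession` in scope, a job could then read as `load(CsvSource("orders.csv")).withCustomers(load(ParquetSource("customers.parquet"))).totalByCountry`, where both file paths are placeholders. Each step is an ordinary method, so a bad transformation can be isolated by testing that method against a small in-memory DataFrame.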

Goal

Our goal is to first implement a job as a brute-force Spark transformation script and then improve it using…

Omkar Kulkarni
