Often we find ourselves chaining .read(), .join(), .groupBy() and .agg() for machine learning, recommendations or predictions. I have spent countless hours writing, testing and ultimately debugging these lines of code to understand which line caused a bad transformation. Not to mention, they are one-off implementations with no reusability. Our goal is to replace these chained implementations with something more reusable and scalable.
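To make the pattern concrete, here is a minimal sketch of the kind of one-off chain described above. The dataset names, column names (item_id, category, amount) and the ChainedJob object are hypothetical, chosen only for illustration. The inputs are written to a temporary directory so the sketch runs on its own with a local Spark session.

```scala
import java.nio.file.Files

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.sum

object ChainedJob {

  // Hypothetical job: total order amount per item category.
  def run(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // Write two tiny made-up inputs so there is something to read back.
    val dir = Files.createTempDirectory("chained-job").toString
    Seq((1, "books"), (2, "games")).toDF("item_id", "category")
      .write.option("header", "true").csv(s"$dir/items")
    Seq((1, 10.0), (1, 5.0), (2, 7.5)).toDF("item_id", "amount")
      .write.option("header", "true").csv(s"$dir/orders")

    val items = spark.read
      .option("header", "true").option("inferSchema", "true")
      .csv(s"$dir/items")

    // The one-off chain: read, join, group and aggregate in a single expression.
    spark.read
      .option("header", "true").option("inferSchema", "true")
      .csv(s"$dir/orders")
      .join(items, Seq("item_id"))
      .groupBy("category")
      .agg(sum("amount").as("total_amount"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("chained-job")
      .master("local[*]")
      .getOrCreate()
    run(spark).show()
    spark.stop()
  }
}
```

Every step lives inside one expression, so a bad join key or a wrong aggregation can only be found by breaking the chain apart by hand, and none of the steps can be reused by another job.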
In this blog, we will discuss how to modularize our Spark application using Scala’s implicit keyword and
Our goal is to first implement a job as a brute-force Spark transformation script and then improve it using…