Databricks Spark vs Open Source Spark


As architects, we often face the challenge of picking the right technology for the right use case, a choice that is key to the long-term strategy of products and solutions in the cloud.

The intent of this post is to help architects and data solution engineers make an informed choice and pick the best-of-breed technology, not to criticize either option. Again, this is entirely my own view and opinion, so pick your favorite after doing your own due diligence 🙂

I did a deep dive into this area to find out the differences between open source Spark and Databricks Spark. Open source Spark has also been integrated into several cloud platforms; for example, Azure Synapse adopted Spark as a separate pool, distinct from its SQL DW pool.

My aim is to show the differences in a bird's-eye view based on my own research. This is reported as of April 1st, 2021, and the information below is subject to change as the technologies evolve. I hope it helps.

I would love to see your comments, opinions, and feedback, so please share.

Author: Debashis Paul

Retired Oracle BI enthusiast. Musing on enterprise cloud and data architecture and design in an open source full-stack framework on Kubernetes, and working on Big Data, BI, and analytics. All the views in my blog are my own and do not necessarily reflect the views of my employer. Thanks for visiting my journal. Have a good day!
