Databricks: pros and cons

Renan 60 Reputation points
2025-03-13T10:55:40.2666667+00:00

Hi,

Could you highlight five pros and five cons of using Databricks for data engineering? This would help me make a more informed decision about the tool.

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

1 answer

  1. Chandra Boorla 9,995 Reputation points Microsoft External Staff
    2025-03-13T18:56:05.8433333+00:00

    @Renan

    Databricks is a popular cloud-based data platform that provides a collaborative environment for data engineering, data science, and machine learning. It's great that you are considering all aspects of the platform before making a decision.

    Here are some key pros and cons of using Databricks for data engineering:

    Pros

    Unified Analytics Platform - Databricks combines data processing, analytics, and machine learning tools, facilitating seamless collaboration between data engineers, scientists, and analysts.

    Scalability - Built on Apache Spark, Databricks distributes processing across a cluster, so you can scale workloads from small datasets to very large ones by adding or removing nodes.

    Collaborative Workspace - The platform offers features like notebooks and real-time collaboration, enabling teams to work together effectively and share insights easily.

    Automated Cluster Management - Databricks automates cluster provisioning and management, reducing infrastructure management overhead and allowing you to focus on data processing.

    Built-in Data Connectors - It provides a wide range of connectors to various data sources, simplifying data integration and enhancing pipeline efficiency.
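
    To make the "Automated Cluster Management" point more concrete, here is a minimal sketch of a cluster definition that enables autoscaling and auto-termination. The field names follow the Databricks Clusters API as I understand it; the helper function, cluster name, runtime version, node type, and worker counts are illustrative assumptions, not recommendations.

```python
import json


def build_cluster_spec(name, min_workers, max_workers, idle_minutes):
    """Build a cluster definition with autoscaling and auto-termination.

    Hypothetical helper for illustration only; it just assembles a dict.
    """
    if not 1 <= min_workers <= max_workers:
        raise ValueError("expected 1 <= min_workers <= max_workers")
    return {
        "cluster_name": name,
        "spark_version": "15.4.x-scala2.12",  # placeholder runtime version
        "node_type_id": "Standard_DS3_v2",    # placeholder Azure VM size
        # Workers are added or removed between these bounds based on load.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        # The cluster shuts down after this many idle minutes.
        "autotermination_minutes": idle_minutes,
    }


spec = build_cluster_spec("etl-nightly", min_workers=2, max_workers=8, idle_minutes=30)
print(json.dumps(spec, indent=2))
```

    A definition like this would typically be submitted through the workspace UI, the REST API, or an infrastructure-as-code tool; the sketch only shows which settings drive the automation.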

    Cons

    Cost - While powerful, Databricks can be expensive, especially at scale. It's important to manage usage carefully to avoid unexpected costs.

    Learning Curve - The platform's extensive features can present a learning curve, particularly for those new to Spark or cloud-based data platforms.

    Complexity of Advanced Features - Some advanced features can be complex to configure and use, potentially leading to underutilization or misconfigurations.

    Dependency on Internet Connectivity - As a cloud-based platform, reliable internet access is essential, and connectivity issues can affect productivity.

    Vendor Lock-in - Committing to Databricks can lead to challenges if you later decide to switch platforms or integrate with tools outside its ecosystem.
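
    To illustrate the "Cost" point, here is a rough sketch of how a monthly estimate might be put together. On Azure, Databricks compute is generally billed as Databricks Units (DBUs) plus the underlying virtual machines. All rates and the function name below are hypothetical placeholders; please check the official pricing page for real figures.

```python
def estimate_monthly_cost(nodes, hours_per_day, days,
                          dbu_per_node_hour, price_per_dbu, vm_price_per_hour):
    """Rough estimate: DBU charges plus VM charges for a fixed-size cluster."""
    node_hours = nodes * hours_per_day * days
    dbu_cost = node_hours * dbu_per_node_hour * price_per_dbu
    vm_cost = node_hours * vm_price_per_hour
    return dbu_cost + vm_cost


# Hypothetical rates, chosen only to make the arithmetic easy to follow.
rates = dict(dbu_per_node_hour=0.75, price_per_dbu=0.30, vm_price_per_hour=0.25)

# Same 4-node cluster: left running all day vs. running a 6-hour scheduled job.
always_on = estimate_monthly_cost(nodes=4, hours_per_day=24, days=30, **rates)
scheduled = estimate_monthly_cost(nodes=4, hours_per_day=6, days=30, **rates)

print(f"always on: {always_on:.2f}")
print(f"scheduled: {scheduled:.2f}")
```

    Because both charges scale with node-hours, shutting down idle clusters reduces the bill in proportion to the hours saved, which is why auto-termination and job clusters are common cost controls.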

    For additional information, please refer to: Pros And Cons Of Using Databricks

    Disclaimer: This response contains a reference to a third-party World Wide Web site. Microsoft is providing this information as a convenience to you. Microsoft does not control these sites and has not tested any software or information found on these sites; therefore, Microsoft cannot make any representations regarding the quality, safety, or suitability of any software or information found there. There are inherent dangers in the use of any software found on the Internet, and Microsoft cautions you to make sure that you completely understand the risk before retrieving any software from the Internet.

    I hope this information helps. Please let us know if you have any further queries.

    If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful".
