
A guide on Microsoft Fabric

Microsoft Fabric, the much-awaited end-to-end data and analytics platform, is here. What is it? How does it work? How can you navigate its licensing plans and implement it in your organization? Read the blog to get answers.

author

Suresh

April 17, 2024

9 mins

What is Microsoft Fabric?

On May 23, 2023, Microsoft announced Microsoft Fabric, an all-in-one SaaS platform that covers everything from data engineering to data science and AI in one place.

This much-awaited platform is a one-stop solution for everything data and analytics. It integrates existing Microsoft solutions such as Azure Data Factory, Stream Analytics, and Power BI, which makes it possible to manage everything from data movement and integration to storage and visualization within one comprehensive suite: Fabric.

This is a boon for data-driven businesses that want to harness the full potential of their data flexibly, without having to handle security and governance separately for each tool.

In simple terms, this is what Microsoft Fabric enables you to do. 

  • Build, maintain, and scale data architectures using lakehouses

  • Set up data movement between different data sources

  • View and act on real-time analytics 

  • Visualize data to absorb insights faster

  • Run data science and machine learning projects

  • Use a data lake to store a single copy of data without duplication

All of this happens within a single, all-encompassing platform, without having to rely on different vendors with different plans, pricing, and terms and conditions.

Microsoft Fabric architecture

Built on top of OneLake, a unified data lake that stores data in open formats, Microsoft Fabric is a unified SaaS platform that comes with seven workloads, providing the infrastructure and computing capabilities to meet the dynamic analytics requirements of a business. These workloads, or capabilities, are called experiences, and all of them can be accessed from a single UI.

Besides these workloads, Fabric comes with a lakehouse architecture for storage, plus workspaces and notebooks in which users can write code, share it, and collaborate with other users.

Data factory

This experience helps with data integration, ingestion, preparation, and transformation within Fabric. It does this with the help of two high-level features: dataflows and data pipelines.

Traditionally, we use tools like SQL, Azure Data Factory, and Power Query to perform data transformation and ingestion, separately or together. Data Factory in Fabric combines these experiences through connectors, dataflows, and data pipelines. With 200+ connectors, you can ingest data from a wide range of data sources; with dataflows, you can perform transformations; and with data pipelines, you can set specific conditions and flow control.
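To make the division of labor concrete, here is a minimal sketch in plain Python of the three roles described above: a source (standing in for a connector), a transformation (standing in for a dataflow), and a condition that controls the flow (standing in for a pipeline). It does not use any Fabric API; the names `run_pipeline`, `sample_source`, and `to_numeric` are hypothetical and exist only for illustration.

```python
from typing import Callable, Iterable

Row = dict


def run_pipeline(
    source: Callable[[], Iterable[Row]],
    transform: Callable[[Row], Row],
    min_rows: int = 1,
) -> list[Row]:
    """Ingest rows from a source, transform each one, and load them only if
    enough rows arrived (a simple stand-in for pipeline flow control)."""
    rows = [transform(row) for row in source()]  # ingest + transform
    if len(rows) < min_rows:  # flow control: stop the run on an empty batch
        raise ValueError(f"expected at least {min_rows} rows, got {len(rows)}")
    return rows


def sample_source() -> Iterable[Row]:
    # Hypothetical connector: yields raw records with amounts as strings.
    yield {"order_id": 1, "amount": "19.90"}
    yield {"order_id": 2, "amount": "5.10"}


def to_numeric(row: Row) -> Row:
    # Hypothetical transformation: cast the amount column to a number.
    return {**row, "amount": float(row["amount"])}


loaded = run_pipeline(sample_source, to_numeric)
print(loaded)
```

In Fabric itself these steps are configured in the Data Factory UI rather than written by hand; the sketch only shows how the pieces relate to each other.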

Data warehouse

One of the workloads of Fabric is the data warehouse, which stores data in the OneLake environment in an open data format at scale. Microsoft Fabric enables you to create a lake-centric warehouse in minutes and load data into it with SQL commands, Dataflow Gen2, and data pipelines. Because compute and storage are decoupled, each can scale independently, which supports high scalability and performance. You get the resources you need on demand, pay only for what you use, and can scale down automatically when the capacity is not required.

Synapse data engineering

This experience is mainly for building and maintaining data architecture in a lakehouse environment, which orchestrates data from different sources, processes it, and passes it downstream for analytics and data science.

With this, data engineers can perform the following tasks with ease.

  • Create shortcuts in the data lake to other storage locations, such as Amazon S3, so they can tap into the data within the Fabric environment without moving it from its actual source.

  • Create data lakehouses that store data in Delta format, integrating data from different sources. With the lakehouse's T-SQL endpoint, they can generate analytical reports on the data stored there.

It contains the following components.

Notebooks, in which developers and data engineers can write code for data movement tasks such as ingestion and preparation. Notebooks come with co-authoring capabilities and save progress as you go, just like Microsoft Office 365 products. This makes collaboration easier for large data teams.

Data pipelines, which copy data from a source, ingest it into the data lakehouse, and allow transformations to happen.

Data science

Real-time insights and inferences are important for a growing business that needs to act in time. Machine learning models help generate such growth-driving insights, which can then be shared with business stakeholders. Microsoft Fabric offers the environment to build, train, and deploy machine learning models and generate the insights you need, all in one place. Data scientists can perform all their operations there, from cleaning the data to testing, scoring, and monitoring the model.

Real-time analytics

Data flows in from all directions, and much of it arrives in unsupported formats or with changing schemas, which makes real-time analytics difficult. Examples include data from IoT devices and customer support systems. Traditional warehouses aren't suitable for storing or processing this kind of data. This is where the real-time analytics experience of Microsoft Fabric helps: it ingests, transforms, and integrates the data, and lets you run queries and analytics on it.

Microsoft Fabric makes data ready for analysis without the need for querying or data movement. How? With the help of Direct Lake mode in the semantic layer. It loads Parquet-formatted data from the lake directly into Power BI, making it ready for visualization and exploratory analysis, and it reflects any changes made to the underlying data.

This helps businesses that manage tons of unstructured or streaming data but struggle to extract real value from it. Consider predictive maintenance, a popular use case in the logistics industry: IoT devices draw real-time operational data from running machinery, and anomalies in the values are flagged so that performance issues can be identified right away.
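The anomaly-spotting step in the predictive maintenance example can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how Fabric implements it: the function name `find_anomalies`, the window size, the threshold, and the sample temperature readings are all assumptions made for the example. It flags any reading that deviates from the mean of the previous few readings by more than a chosen number of standard deviations.

```python
from collections import deque
from statistics import mean, stdev


def find_anomalies(readings, window=5, threshold=3.0):
    """Return the indexes of readings that deviate from the trailing window's
    mean by more than `threshold` standard deviations."""
    recent = deque(maxlen=window)  # sliding window of the latest readings
    anomalies = []
    for index, value in enumerate(readings):
        if len(recent) == window:  # only judge once the window is full
            mu = mean(recent)
            sigma = stdev(recent)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                anomalies.append(index)
        recent.append(value)
    return anomalies


# Hypothetical sensor temperatures with one obvious spike.
temps = [70.1, 70.3, 69.9, 70.2, 70.0, 70.1, 95.4, 70.2]
print(find_anomalies(temps))
```

With the sample data, only the spike at index 6 is flagged. A production setup would run this kind of check continuously on the incoming stream rather than on a list in memory.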

Power BI

Many businesses already use Power BI to quickly grasp business insights and intelligence. Because Power BI turns data into visual aids and dashboards, stakeholders can understand the metrics that matter to them without having to ask anyone for data or reports. Power BI in Fabric helps BI teams instantly create reports from data stored in OneLake and publish them so that end users can do self-service analytics.

Microsoft is also working on other workloads to improve the overall experience. Take Data Activator, for example: it is yet to be released and can take actions based on patterns it detects in your analytics data.

Centralized platform - OneLake 

When you use a different platform for each of the above purposes, data can fall out of sync, which affects the outcome and overall performance of your data initiatives. OneLake is the solution that puts an end to these discrepancies by maintaining one copy of the data across all workloads.

Microsoft calls this the OneDrive for data (similar to how OneDrive retains all the changes made to Office 365 files and keeps one centralized, up-to-date version of them).

This data is accessible through multiple analytical engines, such as SQL, Spark, Power BI, and KQL databases, which can query it without moving it.

To give relational database properties to what is otherwise a lake architecture, Delta Lake is integrated into the Microsoft Fabric environment. The properties of this open-source storage layer are as follows.

  • Data lakes store data in Parquet format, whose files are immutable, i.e., once written, the data cannot be altered or edited in place.

  • This is ideal for storing historical data, but handling transactional data requires ACID properties so that changes can be made and tracked reliably. This is where Delta tables help. These tables maintain a separate transaction log that records every create, update, and delete operation. So, the original Parquet files remain undisturbed while users can still update and delete data easily.
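The idea in the two points above, immutable data files plus a separate transaction log, can be illustrated with a toy model in plain Python. This is a simplified sketch and not Delta Lake's actual file format or API: the class `TinyDeltaTable` and its methods are hypothetical. Deleting a row never edits an existing file; it writes a new file and records "remove" and "add" actions in the log, so earlier versions stay readable.

```python
import json


class TinyDeltaTable:
    """Toy model: immutable data files plus an append-only transaction log."""

    def __init__(self):
        self.files = {}  # file name -> tuple of rows, never changed after write
        self.log = []    # JSON-encoded commits, appended in order

    def _write_file(self, rows):
        name = f"part-{len(self.files):05d}"
        self.files[name] = tuple(rows)
        return name

    def _commit(self, actions):
        self.log.append(json.dumps({"version": len(self.log), "actions": actions}))

    def append(self, rows):
        self._commit([["add", self._write_file(rows)]])

    def delete_where(self, predicate):
        actions = []
        for name in self.active_files():
            rows = self.files[name]
            kept = [row for row in rows if not predicate(row)]
            if len(kept) != len(rows):
                actions.append(["remove", name])  # retire the old file
                if kept:
                    actions.append(["add", self._write_file(kept)])
        if actions:
            self._commit(actions)

    def active_files(self, as_of=None):
        """Replay the log (optionally only up to a version) to find live files."""
        entries = self.log if as_of is None else self.log[: as_of + 1]
        active = []
        for entry in entries:
            for kind, name in json.loads(entry)["actions"]:
                if kind == "add":
                    active.append(name)
                else:
                    active.remove(name)
        return active

    def read(self, as_of=None):
        return [row for name in self.active_files(as_of) for row in self.files[name]]


table = TinyDeltaTable()
table.append([{"id": 1}, {"id": 2}])
table.delete_where(lambda row: row["id"] == 2)
print(table.read())         # current version
print(table.read(as_of=0))  # version 0, before the delete
```

Reading the current version returns only the row with id 1, while reading version 0 still returns both rows, because the first file was never modified.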

Benefits of OneLake

It maintains and enforces governance across all Fabric products: multiple teams can work together, collaborate, and access the data they require, while the admin enforces the right access controls for each user.

It supports all kinds of data: business intelligence reports and files, data stored in warehouses, and real-time analytics data. All of it is stored and updated in the OneLake cloud storage, regardless of file type and version. Because data is not locked into a separate relational store, it is easier to access instantly.

Since it allows all types of data storage, it supports a wide range of data initiatives, including real-time analytics, machine learning, and natural language processing.

Any file stored in OneLake can be easily accessed through Windows File Explorer, another ease-of-access factor that helps both technical and non-technical users.

Think of this as a big, bulky storage space within which each department can have its own container. They can share data across departments without having to move the data or make a copy of it. This democratization of data is its biggest advantage.

Licensing structure for Fabric

Microsoft Fabric has a simple pay-as-you-go pricing plan. It also offers reservation plans if you are keen to save costs and have your capacity reserved for a specific period.

Also, every tool is bundled within a unified architecture from a single vendor, so the capacity units are pooled and shared across all workloads. This is how Microsoft Fabric saves costs for you: capacity that one workload is not using is freed up for the others. You don't have to explore and purchase multiple tools from various vendors and waste money on idle compute and other resources.

We will explore both below so you can choose the optimal payment structure for you. 

Pay-as-you-go

Here, you are charged on an hourly basis, depending on the capacity units you use and the SKU you choose (SKU means stock-keeping unit; each SKU denotes a capacity size that can be used across all Fabric experiences). Because of the flexibility it offers, this plan is best suited for companies with changing needs. For example, if you need additional capacity, you can add it and pay accordingly without having it pre-allocated to you.

If you go for the F64 SKU, which has 64 capacity units, the price is $11.52/hour. This is roughly the capacity you will require to create, collaborate on, and view Power BI content using Microsoft Fabric.

The price increases or decreases depending on the SKU and capacity you opt for. Refer to Microsoft's Fabric pricing page for the entire table.

This pricing applies only to Fabric capacity. OneLake storage is charged separately at $0.023 per GB per month. If you need OneLake storage with business continuity and disaster recovery, the charge is $0.0414 per GB per month.
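To see how these numbers add up, here is a rough estimator in Python built only from the figures quoted above. It rests on two assumptions that are not stated in this article: that the hourly price scales linearly with capacity units (so $11.52 for 64 units works out to $0.18 per unit per hour), and that a month has 730 hours. The function and constant names are made up for the example, and actual prices vary by region and over time.

```python
F64_HOURLY_USD = 11.52                   # quoted pay-as-you-go price for F64
F64_CAPACITY_UNITS = 64
STORAGE_USD_PER_GB_MONTH = 0.023         # quoted OneLake storage price
BCDR_STORAGE_USD_PER_GB_MONTH = 0.0414   # quoted price with BCDR
HOURS_PER_MONTH = 730                    # assumption: average hours in a month


def hourly_rate_per_cu():
    """Per-capacity-unit hourly price, assuming linear scaling across SKUs."""
    return F64_HOURLY_USD / F64_CAPACITY_UNITS


def monthly_cost(capacity_units, hours_running, storage_gb, bcdr=False):
    """Estimate one month's bill: compute (pay-as-you-go) plus OneLake storage."""
    compute = capacity_units * hourly_rate_per_cu() * hours_running
    storage_rate = BCDR_STORAGE_USD_PER_GB_MONTH if bcdr else STORAGE_USD_PER_GB_MONTH
    storage = storage_gb * storage_rate
    return round(compute + storage, 2)


# An F64 capacity running all month with 1 TB (1,000 GB) in OneLake.
print(monthly_cost(64, HOURS_PER_MONTH, 1000))
# The same capacity running only 8 hours a day for 22 working days.
print(monthly_cost(64, 8 * 22, 1000))
```

Under these assumptions, an F64 capacity that runs around the clock with 1,000 GB of storage comes to about $8,432.60 a month, while pausing it outside working hours brings the compute part down sharply. That difference is the main argument for pay-as-you-go when usage is uneven.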

Reservation

With this type of plan, you reserve a set amount of Fabric capacity in advance for up to a year and pay upfront or monthly. After the year ends, your workloads remain operational, but the payment plan switches to pay-as-you-go unless you opt for auto-renewal.

Note that this pricing covers only Fabric workloads and capacity, not storage and networking. You can reserve your capacity in the Azure portal.

Conclusion

Microsoft Fabric is the next big shift happening in the data engineering space. Even though the product is still a work in progress and changes with every update, businesses are keen to explore its features and components. This is because of the many issues that Microsoft Fabric solves, from centralized governance and simplified access management to one vendor, one platform, and one payment plan for all data needs.

However, you might have many questions and concerns to clear up before making this significant decision. You might want to know whether Fabric is the best solution for your business's current circumstances and future plans. Join us on a call to get them answered and to explore solutions that meet all of your requirements.
