Upload files to Fabric OneLake with a Service Principal and Python

I've had the unfortunate pleasure of working with Microsoft Fabric lately. Here is one way of uploading a file to Fabric's storage layer, OneLake, using Python and a service principal.

In Azure

  1. Create a new App:
    1. Go to App registrations, click New registration, and complete the process
    2. The name largely doesn't matter, and Accounts in this organizational directory only is generally sufficient. A redirect URI isn't necessary.
  2. Open the newly created App by going back to App registrations and selecting All applications
    1. In the Overview tab, copy the Display name, Application (client) ID, and Directory (tenant) ID
    2. Open the Certificates & secrets page under Manage
    3. Create a new client secret by selecting Client secrets and then New client secret. Choose a name and expiration date appropriate for your use case.
    4. Copy and save the secret value

In Fabric

  1. Open the Tenant settings in Admin Portal: https://app.fabric.microsoft.com/admin-portal/tenantSettings
  2. Scroll down to Developer settings and enable the option: Service principals can use Fabric APIs
  3. Add the service principal to the target workspace (Manage access) with at least Contributor permissions, so it can write to the lakehouse

In Python

Install libraries with the following command:

pip install azure-identity azure-storage-file-datalake

Create a new JSON file called credentials.json with the following content:

{
    "tenant_id": "YOUR-TENANT-ID",
    "client_id": "YOUR-CLIENT-ID",
    "client_secret": "YOUR-CLIENT-SECRET-VALUE"
}
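If you'd rather not keep the secret on disk, a minimal alternative sketch is to read the credentials from environment variables instead of credentials.json. The variable names below (AZURE_TENANT_ID, etc.) are an assumption; match them to whatever your deployment environment actually sets.

```python
import os

# Assumed variable names -- adjust to your environment. The fallbacks are the
# same placeholders as in credentials.json above.
config = {
    "tenant_id": os.environ.get("AZURE_TENANT_ID", "YOUR-TENANT-ID"),
    "client_id": os.environ.get("AZURE_CLIENT_ID", "YOUR-CLIENT-ID"),
    "client_secret": os.environ.get("AZURE_CLIENT_SECRET", "YOUR-CLIENT-SECRET-VALUE"),
}
```

The resulting config dict drops into the script below unchanged.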

Create a new script in the same folder as your credentials.json file, with the following content:

import json

from azure.identity import ClientSecretCredential
from azure.storage.filedatalake import DataLakeServiceClient

with open("./credentials.json") as f:
    config = json.load(f)

credential = ClientSecretCredential(
    tenant_id=config["tenant_id"],
    client_id=config["client_id"],
    client_secret=config["client_secret"],
)


workspace = "YOUR-FABRIC-WORKSPACE-NAME"
lakehouse = "YOUR-FABRIC-LAKEHOUSE-STORAGE-ENDPOINT-NAME"

file_directory = "NAME-OF-THE-FOLDER-TO-UPLOAD-TO"
onelake_file_name = "NAME-OF-THE-FILE-IN-ONELAKE"
filesystem_file_path = "PATH-OF-THE-FILE-IN-CURRENT-FILESYSTEM"

service_client = DataLakeServiceClient(account_url="https://onelake.dfs.fabric.microsoft.com/", credential=credential)
file_system_client = service_client.get_file_system_client(file_system=workspace)
directory_client = file_system_client.get_directory_client(f'{lakehouse}.Lakehouse/Files/{file_directory}/')
file_client = directory_client.get_file_client(onelake_file_name)

with open(filesystem_file_path, "rb") as file:
    file_client.upload_data(file)  # Add the option 'overwrite=True' if the file may already exist
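To confirm the upload landed, you can list the contents of the target folder. The helpers below are a sketch built on the same azure-storage-file-datalake client used above; the path helper mirrors the f"{lakehouse}.Lakehouse/Files/..." convention from the script.

```python
def lakehouse_files_path(lakehouse: str, folder: str) -> str:
    # Mirrors the path convention used in the upload script above.
    return f"{lakehouse}.Lakehouse/Files/{folder}"


def list_uploaded_files(credential, workspace: str, lakehouse: str, folder: str) -> list:
    # Imported lazily so the path helper above also works without the Azure SDKs.
    from azure.storage.filedatalake import DataLakeServiceClient

    service_client = DataLakeServiceClient(
        account_url="https://onelake.dfs.fabric.microsoft.com/", credential=credential
    )
    file_system_client = service_client.get_file_system_client(file_system=workspace)
    # get_paths() yields every path under the folder; keep just the names.
    paths = file_system_client.get_paths(path=lakehouse_files_path(lakehouse, folder))
    return [path.name for path in paths]
```

For example, list_uploaded_files(credential, workspace, lakehouse, file_directory) should include the file you just uploaded.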

Enjoy!

For more info about Microsoft Entra service principals: https://learn.microsoft.com/en-us/entra/identity-platform/howto-create-service-principal-portal