Jason

Automating files upload to Microsoft OneDrive. Unexpected challenges and a success story!

In the current data era we live in, a huge number of reports are being generated every minute. Uploading and distributing these reports is an inconvenience to say the least, and a security threat if sensitive/confidential data is included in these reports.

I am currently in charge of approximately 200 analytics reports that are generated daily, attached to emails, and sent to the appropriate audience in different departments. While this is a time-consuming task, it was not a security concern, because these emails and reports were accessed on company-approved devices in a closed and highly secured network.

Now, with the pandemic and the stay-at-home orders, the chance of these reports being downloaded on unsecured machines has become an urgent issue.

So I decided to take advantage of Microsoft OneDrive as a central storage solution: automatically upload these reports to departmental folders and email my audience just a link to the file location. This reduces the chances of someone downloading the files to their machine, since Microsoft OneDrive encourages web previews.

This sounds like an easy task, right? Especially if you have worked with Python and APIs before. Include the requests library, configure your Client Id and Client secret, request an AccessToken/RefreshToken, and start rolling.

Well, it turned out that working with Microsoft Graph is not that easy, especially if you want to schedule your application to run in the background with no human interaction.

While Microsoft has good documentation, it was still vague on many critical subjects:

  • Authorization and what endpoint to use
  • Authenticating using your Microsoft account
  • An app running in the background (a daemon app) requires high-privileged admin access that I don't have
  • How to use delegated permissions instead and still run your app in the background
  • How to set up the headers for resumable large file uploads

After doing what every developer does, from reading documentation to looking up a solution or idea on Stack Overflow, I wasn't able to find a solution using Python, especially for resumable large file uploads.

So I decided to publish my solution to help any fellow developer who is currently in the same position I was in a couple of weeks ago!

For the complete code please visit my GitHub

There are two major parts to this tutorial:

  1. Create, set up and configure the API on Azure Portal
  2. Write the Python script

Part 1 - Create, set up and configure the API on Azure Portal

Step 1: Register your application
Go to https://portal.azure.com/#home
Azure Active Directory -> App Registration -> New Registration

  • Name your API
  • Accounts in this organizational directory only (Single Tenant)
  • Redirect URI is not needed, as our app runs and authenticates in the background
  • Click register

Once your new API is created, click on the API and save the following two values for the code later:

  • Application (client) ID
  • Directory (tenant) ID

Then grab the OAuth 2.0 authorization endpoint (v2), listed under Endpoints on the app's Overview page.

We will use it during the authorization script to get the URL link to permissions consent
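
The endpoint follows a fixed pattern around the tenant ID, so it can also be assembled in code. A minimal sketch, assuming a single-tenant app; the function name `authorize_endpoint` is hypothetical and not part of the original scripts:

```python
def authorize_endpoint(tenant_id: str) -> str:
    """Return the OAuth 2.0 (v2) authorization endpoint for one tenant.

    tenant_id is the Directory (tenant) ID saved in Step 1.
    """
    # Strip stray whitespace that often sneaks in when copying from the portal.
    tenant_id = tenant_id.strip()
    return "https://login.microsoftonline.com/{}/oauth2/v2.0/authorize".format(tenant_id)
```

Calling `authorize_endpoint("<your tenant ID>")` should give the same URL the portal shows, which makes it easy to spot a copy and paste mistake.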

Step 2: Configure the API permissions

API permissions → Add a permission → Microsoft APIs → Microsoft Graph → Delegated permissions → Select permissions

Permissions needed: “Sites.ReadWrite.All” and “Files.ReadWrite.All”

Step 3: Expose the API

After adding the permissions in Step 2, we have to expose the API and those permissions as scopes.

Then we need to add the client ID and select the authorized scopes we just added.

Expose an API → Add a client application → Enter Client ID → select the Authorized scopes → click add application

Step 4: Edit the manifest (Very important to allow Implicit grant)

This is a very important step in the API setup. Go to Manifest and set oauth2AllowIdTokenImplicitFlow and oauth2AllowImplicitFlow to true.

Now that our API is all set and configured we can start writing some Python code!!!

Part 2 - Write the Python script

Our code consists of two Python scripts:

1- generateOneDriveAPIConsentURL-public.py

A script to generate the consent URL. It is run once, after setting the permissions in the API setup, to give the user's consent to these permissions.
This is the best solution I found for running the main app in the background without using application permissions (which pose a high security risk, as they are global and highly privileged, and require admin approval).
Sticking to delegated permissions limits the app to the current user's privileges and, in virtually all cases, doesn't require admin approval.

import requests
import json
from requests_oauthlib import OAuth2Session
from oauthlib.oauth2 import MobileApplicationClient

client_id = "xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx"
scopes = ['Sites.ReadWrite.All','Files.ReadWrite.All']
auth_url = 'https://login.microsoftonline.com/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/oauth2/v2.0/authorize'

#OAuth2Session is an extension to requests.Session
#used to create an authorization url using the requests.Session interface
#MobileApplicationClient is used to get the Implicit Grant

oauth = OAuth2Session(client=MobileApplicationClient(client_id=client_id), scope=scopes)
authorization_url, state = oauth.authorization_url(auth_url)
consent_link = oauth.get(authorization_url)
print(consent_link.url)
This script connects to the Microsoft authorization v2 endpoint and sends the Client ID and the scopes we are requesting. A consent URL is then generated and printed in the terminal:
c:/Users/jsnmtr/Code/onedrive/generateOneDriveAPIConsentURL-public.py
https://login.microsoftonline.com/xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx/oauth2/v2.0/authorize?response_type=token&client_id=xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx&scope=Sites.ReadWrite.All+Files.ReadWrite.All&state=xxxxxxxxxxxxxxxxxxxxxxxx

Open the link in your web browser and click Accept to grant the requested permissions.
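
To see what the consent URL is made of, it can also be assembled with the standard library alone. This is a sketch rather than the script used in this post: `build_consent_url` is a hypothetical helper, and the optional `redirect_uri` parameter is an assumption, included because readers in the comments report needing a registered redirect URI.

```python
import secrets
from urllib.parse import urlencode


def build_consent_url(auth_url, client_id, scopes, state=None, redirect_uri=None):
    """Build an implicit-grant consent URL from its query parameters."""
    params = {
        "response_type": "token",            # implicit grant asks for a token directly
        "client_id": client_id,
        "scope": " ".join(scopes),           # scopes are space separated
        "state": state or secrets.token_urlsafe(16),  # random value to tie request to response
    }
    if redirect_uri:
        # Assumption: only needed if a redirect URI is registered for the app.
        params["redirect_uri"] = redirect_uri
    return auth_url + "?" + urlencode(params)
```

Printing `build_consent_url(auth_url, client_id, scopes)` with the values from the script above should produce a link with the same shape as the one shown in the terminal output.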

2- AutomatedOneDriveAPIUploadFiles-public.py

This is the main script. It creates a public client application using the MSAL library, requests a token on behalf of the user, gains access to Microsoft Graph, and uses the OneDrive API to upload files.

Basic flow:
-Importing libraries

import os
import requests
import json
import msal

-Configuration

CLIENT_ID = 'xxxxxxxx-xxxxxxx-xxxxxx-xxxxxxx-xxxxxxxxxx'
TENANT_ID = 'xxxxxxxx-xxxxxxx-xxxxxx-xxxxxxx-xxxxxxxxxx'
AUTHORITY_URL = 'https://login.microsoftonline.com/{}'.format(TENANT_ID)
RESOURCE_URL = 'https://graph.microsoft.com/'
API_VERSION = 'v1.0'
USERNAME = 'xxxxxxxxx@xxxxxx.xxx' #Office365 user's account username
PASSWORD = 'xxxxxxxxxxxxxxx'
SCOPES = ['Sites.ReadWrite.All','Files.ReadWrite.All'] # Add other scopes/permissions as needed.
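
Since this script runs unattended, one option is to keep the username and password out of the source file and read them from environment variables instead. A minimal sketch; the helper `load_setting` and the variable names in the usage note are hypothetical, not part of the original script:

```python
import os


def load_setting(name, environ=os.environ):
    """Return a required setting from the environment, or fail loudly."""
    value = environ.get(name, "").strip()
    if not value:
        # Failing early is easier to debug than a confusing auth error later.
        raise RuntimeError("Missing required environment variable: {}".format(name))
    return value
```

With this in place the configuration could read `USERNAME = load_setting("ONEDRIVE_USERNAME")` and `PASSWORD = load_setting("ONEDRIVE_PASSWORD")`, assuming the scheduler that launches the script sets those variables.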

-Create a public client application using the Microsoft Authentication Library (MSAL)

#Creating a public client app, acquiring an access token for the user and setting the header for API calls
cognos_to_onedrive = msal.PublicClientApplication(CLIENT_ID, authority=AUTHORITY_URL)

-Acquire a token from Microsoft identity platform endpoint to access Microsoft Graph API

token = cognos_to_onedrive.acquire_token_by_username_password(USERNAME,PASSWORD,SCOPES)

-Set the request header with access token

headers = {'Authorization': 'Bearer {}'.format(token['access_token'])}
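
When authentication fails, MSAL returns a dictionary with `error` and `error_description` keys instead of an `access_token`, so indexing `token['access_token']` directly raises a bare `KeyError`. A small guard makes background failures easier to diagnose. This is a sketch; `build_auth_headers` is a hypothetical helper name:

```python
def build_auth_headers(token_result):
    """Turn an MSAL token result into request headers, or raise a clear error."""
    if "access_token" not in token_result:
        # MSAL reports failures through these two keys rather than raising.
        raise RuntimeError(
            "Token request failed: {} - {}".format(
                token_result.get("error", "unknown_error"),
                token_result.get("error_description", "no description returned"),
            )
        )
    return {"Authorization": "Bearer {}".format(token_result["access_token"])}
```

Used as `headers = build_auth_headers(token)`, this surfaces messages such as the AADSTS codes readers mention in the comments.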

-Read all the files in the source directory
We loop through the directory and, for each file, get the file path and file size and open the file for reading.


#Looping through the files inside the source directory
for root, dirs, files in os.walk(cognos_reports_source):
    for file_name in files:
        file_path = os.path.join(root,file_name)
        file_size = os.stat(file_path).st_size
        file_data = open(file_path, 'rb')
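
Here `cognos_reports_source` is the local directory that holds the files to upload. The same walk can be wrapped in a generator so that files are opened only when they are actually uploaded and closed again afterwards. A sketch with a hypothetical helper name, `iter_report_files`:

```python
import os


def iter_report_files(source_dir):
    """Yield (file_name, file_path, file_size) for every file under source_dir."""
    for root, _dirs, files in os.walk(source_dir):
        for file_name in sorted(files):
            file_path = os.path.join(root, file_name)
            yield file_name, file_path, os.path.getsize(file_path)
```

A caller can then write `for name, path, size in iter_report_files(cognos_reports_source):` and open each file with `with open(path, 'rb') as file_data:` right before the upload.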

-If the file is smaller than 4 MB:

        if file_size < 4100000:

-Perform a simple upload

            #Perform simple upload to the OneDrive API
            r = requests.put(onedrive_destination+"/"+file_name+":/content", data=file_data, headers=headers)
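
The snippets never show how `onedrive_destination` is defined. The sketch below assumes it is a Microsoft Graph path-based folder address such as `https://graph.microsoft.com/v1.0/me/drive/root:/Reports` (an assumption; adjust it to your own drive or site). The helpers `build_item_url` and `choose_action` are hypothetical; they build the request URL with the file name percent-encoded and pick the upload style using the size threshold from this post:

```python
from urllib.parse import quote


def build_item_url(destination, file_name, action):
    """Return '<destination>/<file_name>:/<action>' with the name URL-encoded."""
    # quote() protects spaces and other special characters in report names.
    return "{}/{}:/{}".format(destination.rstrip("/"), quote(file_name), action)


def choose_action(file_size, threshold=4100000):
    """Pick a simple upload for small files, an upload session otherwise."""
    return "content" if file_size < threshold else "createUploadSession"
```

For example, `build_item_url(onedrive_destination, file_name, choose_action(file_size))` gives the URL for either branch.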

-If the file is larger than 4 MB:
-Create an upload session

upload_session = requests.post(onedrive_destination+"/"+file_name+":/createUploadSession", headers=headers).json()

-Divide the file into byte chunks

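total_file_size = os.path.getsize(file_path)
chunk_size = 327680  # 320 KiB; chunk sizes must be a multiple of this
chunk_number = total_file_size//chunk_size
chunk_leftover = total_file_size - chunk_size * chunk_number
# The next lines run inside a loop over the chunk index i,
# where f is the file opened in 'rb' mode
chunk_data = f.read(chunk_size)
start_index = i*chunk_size
end_index = start_index + chunk_size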

-Set the header to match the starting index and end index of the byte chunk

# For the final partial chunk, use chunk_leftover as the length and total_file_size as end_index
headers = {'Content-Length':'{}'.format(chunk_size),'Content-Range':'bytes {}-{}/{}'.format(start_index, end_index-1, total_file_size)}
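
The header arithmetic is the easiest part to get wrong, because the last chunk is usually shorter than the rest. A minimal sketch of just that calculation; `chunk_ranges` and `range_headers` are hypothetical helper names, and `end` is exclusive while the `Content-Range` header uses an inclusive last byte:

```python
def chunk_ranges(total_size, chunk_size=327680):
    """Yield (start, end) byte offsets for each chunk; end is exclusive."""
    start = 0
    while start < total_size:
        end = min(start + chunk_size, total_size)  # the last chunk may be shorter
        yield start, end
        start = end


def range_headers(start, end, total_size):
    """Build the headers for one chunk of a resumable upload."""
    return {
        "Content-Length": str(end - start),
        # Content-Range is inclusive on both ends, hence end - 1.
        "Content-Range": "bytes {}-{}/{}".format(start, end - 1, total_size),
    }
```

For a 700000 byte file this produces two full chunks followed by a final chunk of 44640 bytes with the range `bytes 655360-699999/700000`.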

-Upload chunks

chunk_data_upload = requests.put(upload_session['uploadUrl'], data=chunk_data, headers=headers)
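
Putting the last three steps together, the chunked upload loop might look like the sketch below. It is not the exact code from my GitHub repository: `upload_in_chunks` is a hypothetical name, and the HTTP call is passed in as `put` (for example `requests.put`) so the loop can be tried out without touching the network.

```python
import os


def upload_in_chunks(file_path, upload_url, put, chunk_size=327680):
    """Send a file to an upload session URL chunk by chunk.

    put is a callable like requests.put(url, data=..., headers=...).
    Returns the response to the last chunk, or None for an empty file.
    """
    total_size = os.path.getsize(file_path)
    start = 0
    response = None
    with open(file_path, "rb") as f:
        while start < total_size:
            chunk = f.read(chunk_size)
            if not chunk:
                raise RuntimeError("File ended before all bytes were sent")
            end = start + len(chunk)  # exclusive end offset of this chunk
            headers = {
                "Content-Length": str(len(chunk)),
                "Content-Range": "bytes {}-{}/{}".format(start, end - 1, total_size),
            }
            # The upload URL is pre-authorized, so no Authorization header is sent.
            response = put(upload_url, data=chunk, headers=headers)
            # 202 means more chunks are expected; 200 or 201 means the file is complete.
            if response.status_code not in (200, 201, 202):
                raise RuntimeError("Chunk upload failed with status {}".format(response.status_code))
            start = end
    return response
```

A call would look like `upload_in_chunks(file_path, upload_session['uploadUrl'], requests.put)`.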

For the complete code please visit my GitHub

To me, any time you achieve a goal it is a success story, and by writing this code I achieved my goal of automating my report uploads to OneDrive. SUCCESS!


Let me know your thoughts!

Find me on LinkedIn | Twitter | Dev.to | StackOverflow

Top comments (10)

ArnaudCampestre

Hi Jason,

Thank you very much for this article, it is very well detailed and explanatory.

When following your steps for the authorisation, when I run the python script, pick up the account and approve the access, I end up on a Microsoft page saying AADSTS500113: No reply address is registered for the application.

Could you advise on what to do please?

Thank you in advance,

Arnaud

Sun Pochin

Error message on webpage when doing auth "python3 generateOneDriveAPIConsentURL-public.py":
AADSTS500113: No reply address is registered for the application
stackoverflow.com/questions/662627...

Added redirect url solved this. (I use google.com as redirect url)

Then, error message on cmd line when doing "python3 AutomatedOneDriveAPIUploadFiles-public.py":AADSTS7000218: The request body must contain the following parameter: 'client_secret' or 'client_assertion'

google found this solution:
stackoverflow.com/questions/456094...

In the Manifest also you can control this by setting:
"allowPublicClient": true

After setting this, it works.

tomshaffner • Edited

Is there a reason to use msal instead of github.com/OneDrive/onedrive-sdk-p...?

Really useful article otherwise, thanks!

Update: Apparently it's deprecated. pypi.org/project/onedrivesdk/ :-/ So I guess just a note for others that come here that since this, pypi.org/project/graph-onedrive/ has been released and might be worth checking out also.

gt5-gerry

Hi, I'm new, a university student. I want to use something like this to upload training files and reports from a Raspberry Pi. Do you think it would work? I can't seem to get the OAuth 2.0 authorization endpoint (v2). Either I don't know where to find it or it's not there.

Lucas Panao

You helped me a lot, thanks for sharing this information! It worked perfectly in python

Hugs from Brazil :)

hkeaylinton

Hi, thanks so much this was extremely helpful! My only question is, in this block of code:

#Looping through the files inside the source directory

for root, dirs, files in os.walk(cognos_reports_source):
    for file_name in files:
        file_path = os.path.join(root,file_name)
        file_size = os.stat(file_path).st_size
        file_data = open(file_path, 'rb')

What is cognos_reports_source?

Cheers

Kyle J. Roux

It's the directory that holds the files you wish to upload.

manishakohli09

Hi Thanks a lot for this it helped a lot.

TruongNguyen012 • Edited

@jason I have an issue when clicking Accept in the link that printed from python AutomatedOneDriveAPIUploadFiles-public.py:

Please help me.