OneDrive as Data Storage for Python Project
source link: https://towardsdatascience.com/onedrive-as-data-storage-for-python-project-2ff8d2d3a0aa
Using OneDrive API to sync OneDrive files directly to your Python projects
You may have already heard the phrase “Data is the new oil”. As data scientists, we work on the data science process: refining this new oil so that it is valuable and ready to use. One of the most fundamental steps of that process is data storage. In this article, I am going to show an example of how to use cloud technology as data storage.
OneDrive is one of the most cost-effective cloud storage options in terms of pricing and capacity, and it is very easy to get. The free version gives you 5 GB of storage, and you can always subscribe to get more. I have subscribed to Microsoft 365 Family myself, which gives me a total of 6 TB of cloud space along with the MS Office software. I think it is a great opportunity to use this space for data science projects with Python.
The Problem!?
Unfortunately, you cannot use a file directly via its OneDrive share URL: the link returns an HTML page from OneDrive.com, which requires you to click a download button before the file can be used in your project.
In this short article, I will focus on how to sync files from OneDrive directly to Python in a few lines of code.
Create OneDrive Direct Download Link
Step 1: Share files through OneDrive and get a download link
This step is relatively simple. Upload a file to OneDrive, then click “Share” followed by “Copy Link” to create a cloud link.
Step 2: Convert OneDrive URL to Direct Download URL
To download your OneDrive files directly in Python, the shared URL from Step 1 has to be converted to a direct download URL that conforms to the OneDrive API guide. The conversion takes only a short script built on Python's base64 module.
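A minimal sketch of such a conversion is shown here. It assumes the share-URL encoding described in the OneDrive API documentation (base64-encode the link, make it URL-safe, strip the padding, prefix it with `u!`) and the `api.onedrive.com/v1.0/shares` endpoint; the function name is illustrative, and the endpoint may behave differently depending on account type and API version.

```python
import base64


def create_onedrive_directdownload(onedrive_link: str) -> str:
    """Convert a OneDrive sharing link into a direct download URL (sketch)."""
    # URL-safe base64 replaces '+' with '-' and '/' with '_'.
    encoded = base64.urlsafe_b64encode(onedrive_link.encode("utf-8"))
    # The encoding scheme also drops the trailing '=' padding.
    token = encoded.decode("utf-8").rstrip("=")
    # Assumption: the shares endpoint serves the file content at /root/content.
    return f"https://api.onedrive.com/v1.0/shares/u!{token}/root/content"
```

The returned string can then be handed to anything that reads from a URL, so no manual click on the download button is needed.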
Pass the shared OneDrive URL from Step 1 into the conversion function to get the direct download URL.
Use Case
Import an Excel File on OneDrive into a Pandas DataFrame
Let’s try the steps above with a sample time-series dataset hosted on my OneDrive.
We can combine the two steps above in one script: generate a direct download link, then import the Excel data directly with Pandas.
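The following is a sketch of how the full flow might look, not the exact script used for the sample dataset. The sharing link is a placeholder to replace with your own, the helper names are illustrative, and reading `.xlsx` files with Pandas assumes an Excel engine such as `openpyxl` is installed.

```python
import base64


def create_onedrive_directdownload(onedrive_link: str) -> str:
    """Convert a OneDrive sharing link into a direct download URL (sketch)."""
    encoded = base64.urlsafe_b64encode(onedrive_link.encode("utf-8"))
    token = encoded.decode("utf-8").rstrip("=")
    # Assumption: the shares endpoint serves the file content at /root/content.
    return f"https://api.onedrive.com/v1.0/shares/u!{token}/root/content"


def load_onedrive_excel(onedrive_link: str):
    """Read an Excel file shared on OneDrive into a Pandas DataFrame."""
    # Imported here so the URL helper stays usable without Pandas installed.
    import pandas as pd

    direct_url = create_onedrive_directdownload(onedrive_link)
    # pandas.read_excel accepts a URL and downloads the file itself.
    return pd.read_excel(direct_url)


# Placeholder link (hypothetical): replace with the link copied in Step 1.
ONEDRIVE_LINK = "https://1drv.ms/x/s!REPLACE_WITH_YOUR_LINK"

# Uncomment once ONEDRIVE_LINK points at a real shared Excel file:
# df = load_onedrive_excel(ONEDRIVE_LINK)
# print(df.head())
```

Keeping the URL conversion separate from the loading step means the same helper can feed `pd.read_csv` or a plain HTTP download for other file types.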