Question

How to download a private file from Spaces using boto3?

I uploaded (or someone else uploaded) a CSV file to my Space using boto3.

import boto3

session = boto3.session.Session()
client = session.client('s3',
                        endpoint_url=ENDPOINT_URL,
                        aws_access_key_id=ACCESS_KEY,
                        aws_secret_access_key=SECRET_KEY)

file_to_copy = 'myfile.csv'
dest_file = 'myfile.csv'
client.upload_file(file_to_copy, SPACE_NAME, dest_file)

The permission is set to private.

Now I want to read that file into a pandas DataFrame without making the file public and without using the Quick Share feature. Is that possible with boto3? Another library would also be fine, as long as it is Python.

Note:

  1. If I set the file to public, I can access it.
  2. If I share the file for 1 hour, etc. (Quick Share feature), I can access it too.

Thanks a lot.

Accepted Answer

All right, I found it. Here is how to download a private file from a DO Space:

import boto3

session = boto3.session.Session()
client = session.client(
    's3',
    region_name=region_name,
    endpoint_url=endpoint_url,
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_key
)
client.download_file(space_name, src_filename, dest_filename)

where:

endpoint_url: f"https://{region_name}.digitaloceanspaces.com"

src_filename: the key of the file in the DO Space. If the file is under a
  folder named "myfolder", set src_filename to "myfolder/src_filename".

dest_filename: the local destination filename.
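
The endpoint pattern and the folder-prefixed key described above can be sketched as two small helpers. This is only an illustration: the function names `spaces_endpoint_url` and `object_key` are hypothetical (they are not part of boto3), and "nyc3" is just an example region.

```python
def spaces_endpoint_url(region_name):
    """Build the regional Spaces endpoint, following the pattern given above."""
    return f"https://{region_name}.digitaloceanspaces.com"


def object_key(filename, folder=None):
    """Return the key of a file, prefixed with its folder when it has one."""
    if not folder:
        return filename
    # Object keys use forward slashes, regardless of the local operating system.
    return f"{folder.strip('/')}/{filename}"


# "nyc3" is an assumed example region; substitute the region of your own Space.
print(spaces_endpoint_url("nyc3"))
print(object_key("myfile.csv", "myfolder"))
```

The two return values can be passed as endpoint_url when creating the client and as src_filename in the download_file call.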

I got some ideas from this GitHub repo: https://github.com/ChariotDev/digital-ocean-spaces

alexdo
Site Moderator
November 30, 2023

Heya,

Another tool you can use to handle these operations is s3cmd. See this question:

https://www.digitalocean.com/community/questions/how-to-manage-digitalocean-spaces-using-s3cmd

Regards

KFSys
Site Moderator
November 28, 2023

Heya,

You should be able to achieve this by downloading the file directly into memory and reading it from there. Instead of downloading the file to the filesystem, you can download it into an in-memory buffer using io.BytesIO.

Here’s how I think you can do it:

import boto3
import pandas as pd
import io

session = boto3.session.Session()
client = session.client('s3',
                        endpoint_url=ENDPOINT_URL,
                        aws_access_key_id=ACCESS_KEY,
                        aws_secret_access_key=SECRET_KEY)

space_name = 'your-space-name'
file_name = 'myfile.csv'

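# Download the object into an in-memory buffer instead of a file on disk
csv_buffer = io.BytesIO()
client.download_fileobj(space_name, file_name, csv_buffer)
# Rewind the buffer so pandas reads from the beginning
csv_buffer.seek(0)
df = pd.read_csv(csv_buffer)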

# Now you can work with your DataFrame
print(df.head())
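
If you need this in more than one place, the same steps could be wrapped in small functions that take the client as an argument. This is only a sketch: the names `fetch_object_to_buffer` and `read_private_csv` are hypothetical, and pandas is imported lazily on the assumption that the buffer helper may be useful without it.

```python
import io


def fetch_object_to_buffer(client, space_name, key):
    """Download one object into memory and return a rewound BytesIO buffer."""
    buffer = io.BytesIO()
    # download_fileobj writes the object's bytes into a writable binary file-like object.
    client.download_fileobj(space_name, key, buffer)
    # Rewind so the next reader starts at the first byte.
    buffer.seek(0)
    return buffer


def read_private_csv(client, space_name, key, **read_csv_kwargs):
    """Read a private CSV object into a DataFrame without writing it to disk."""
    import pandas as pd  # imported here so the helper above does not require pandas

    buffer = fetch_object_to_buffer(client, space_name, key)
    return pd.read_csv(buffer, **read_csv_kwargs)
```

With the client created above, a call would look like df = read_private_csv(client, space_name, file_name). Taking the client as a parameter also makes it possible to check the buffer logic offline with a stand-in object.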

Hope it helps
