How to download a private file from spaces using boto3

Posted on November 27, 2023
By Ferd

I (or someone else) can upload a CSV file to my Space using boto3:

import boto3

session = boto3.session.Session()
client = session.client('s3',
                        endpoint_url=ENDPOINT_URL,
                        aws_access_key_id=ACCESS_KEY,
                        aws_secret_access_key=SECRET_KEY)

file_to_copy = 'myfile.csv'
dest_file = 'myfile.csv'
client.upload_file(file_to_copy, SPACE_NAME, dest_file)

The file's permission is set to private.

Now I want to read that file into a pandas DataFrame without setting the file to public and without using the quick share feature. Is that even possible using boto3? Or, if not boto3, with something else, as long as it uses Python?

Note:

  1. If I set the file to public, I can access it.
  2. If I share the file for 1 hour, etc. (the quick share feature), I can access it too (see the presigned-URL sketch after this list).
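
For reference, the quick share link in note 2 is essentially a presigned URL, which boto3 can also generate programmatically. Here is a minimal sketch, assuming the same placeholder credentials and space name as in the snippet above:

import boto3
import pandas as pd

session = boto3.session.Session()
client = session.client('s3',
                        endpoint_url=ENDPOINT_URL,
                        aws_access_key_id=ACCESS_KEY,
                        aws_secret_access_key=SECRET_KEY)

# Generate a presigned GET URL valid for 1 hour (the boto3 equivalent of quick share).
url = client.generate_presigned_url(
    'get_object',
    Params={'Bucket': SPACE_NAME, 'Key': 'myfile.csv'},
    ExpiresIn=3600)

# While the URL is valid, pandas can read it directly.
df = pd.read_csv(url)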

Thanks a lot.




Accepted Answer

All right, I found it.

Download a private file from a DO Space:

import boto3

session = boto3.session.Session()
client = session.client(
    's3',
    region_name=region_name,
    endpoint_url=endpoint_url,
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_key
)
client.download_file(space_name, src_filename, dest_filename)

where:

  endpoint_url: f"https://{region_name}.digitaloceanspaces.com"

  src_filename: the file in the DO Space. If that file is under a folder
  named "myfolder", then set src_filename to "myfolder/src_filename".

  dest_filename: the destination filename on your local filesystem.

I got some ideas from this GitHub repo: https://github.com/ChariotDev/digital-ocean-spaces
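
The original question was about reading the file with pandas; once the private file has been downloaded as above, that part is just a local read. A minimal follow-up, assuming the same dest_filename variable as in the snippet:

import pandas as pd

# Read the locally downloaded copy into a DataFrame.
df = pd.read_csv(dest_filename)
print(df.head())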

Heya,

You should be able to achieve this by downloading the file directly into memory and reading it from there. Instead of writing the file to the filesystem, you can download it into an in-memory buffer using io.BytesIO.

Here’s how I think you can do it:

import boto3
import pandas as pd
import io

session = boto3.session.Session()
client = session.client('s3',
                        endpoint_url=ENDPOINT_URL,
                        aws_access_key_id=ACCESS_KEY,
                        aws_secret_access_key=SECRET_KEY)

space_name = 'your-space-name'
file_name = 'myfile.csv'

# Create a buffer
csv_buffer = io.BytesIO()
client.download_fileobj(space_name, file_name, csv_buffer)
csv_buffer.seek(0)
df = pd.read_csv(csv_buffer)

# Now you can work with your DataFrame
print(df.head())
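
If you would rather skip the explicit buffer, a roughly equivalent variant (a sketch on my part, using the same client and variable names as above) fetches the object and passes its streaming body straight to pandas:

# get_object returns a dict whose 'Body' is a file-like streaming object.
obj = client.get_object(Bucket=space_name, Key=file_name)
df = pd.read_csv(obj['Body'])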

Hope it helps

Heya,

Another tool you can use is s3cmd, which can handle these operations for you! You can check out this question here:

https://www.digitalocean.com/community/questions/how-to-manage-digitalocean-spaces-using-s3cmd

Regards
