By Ferd
I uploaded (or someone else uploaded) a CSV file to my Space using boto3:
import boto3
session = boto3.session.Session()
client = session.client('s3',
                        endpoint_url=ENDPOINT_URL,
                        aws_access_key_id=ACCESS_KEY,
                        aws_secret_access_key=SECRET_KEY)
file_to_copy = 'myfile.csv'
dest_file = 'myfile.csv'
client.upload_file(file_to_copy, SPACE_NAME, dest_file)
The permission is set to private.
Now I want to read that file into a pandas DataFrame without making the file public and without using the Quick Share feature. Is that possible with boto3? If not boto3, any other Python-based approach would be fine.
Thanks a lot.
Accepted Answer
All right, I found it.
Here is how to download a private file from a DO Space:
import boto3
session = boto3.session.Session()
client = session.client(
    's3',
    region_name=region_name,
    endpoint_url=endpoint_url,
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_key
)
# Download the private object to a local file
client.download_file(space_name, src_filename, dest_filename)
where:
endpoint_url: f"https://{region_name}.digitaloceanspaces.com"
src_filename: the file in the DO Space. If that file is under a folder named "myfolder", set src_filename to "myfolder/src_filename".
dest_filename: the destination filename on the local machine.
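The two naming rules above (the endpoint URL pattern and the folder prefix in the object key) can be sketched as small helpers. The function names `spaces_endpoint` and `object_key` are hypothetical, introduced only for illustration, and "nyc3" in the comments is just an example region value:

```python
import posixpath


def spaces_endpoint(region_name: str) -> str:
    """Build the Spaces endpoint URL for a region (for example "nyc3")."""
    return f"https://{region_name}.digitaloceanspaces.com"


def object_key(filename: str, folder: str = "") -> str:
    """Join an optional folder prefix and a file name into an object key."""
    # Object keys always use forward slashes, regardless of the local OS,
    # so posixpath is used instead of os.path.
    if not folder:
        return filename
    return posixpath.join(folder.strip("/"), filename)
```

With these, `src_filename` for a file in "myfolder" would be `object_key("myfile.csv", "myfolder")`, and `endpoint_url` would be `spaces_endpoint(region_name)`.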
I got some ideas from this GitHub repo: https://github.com/ChariotDev/digital-ocean-spaces
Heya,
You should be able to achieve this by downloading the file directly into memory and reading it from there. Instead of downloading the file to the filesystem, you can download it into an in-memory buffer using io.BytesIO.
Here’s how I think you can do it:
import boto3
import pandas as pd
import io
session = boto3.session.Session()
client = session.client('s3',
                        endpoint_url=ENDPOINT_URL,
                        aws_access_key_id=ACCESS_KEY,
                        aws_secret_access_key=SECRET_KEY)
space_name = 'your-space-name'
file_name = 'myfile.csv'
# Create an in-memory buffer and download the object into it
csv_buffer = io.BytesIO()
client.download_fileobj(space_name, file_name, csv_buffer)
# Rewind the buffer so pandas reads from the beginning
csv_buffer.seek(0)
df = pd.read_csv(csv_buffer)
# Now you can work with your DataFrame
print(df.head())
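A possible variation on the same idea, as a minimal sketch: fetch the object with `get_object` and hand its bytes to pandas, wrapped in a reusable function. The name `read_csv_from_space` is hypothetical, and the function assumes `client` is a boto3 S3 client configured for your Space as in the snippet above:

```python
import io

import pandas as pd


def read_csv_from_space(client, space_name, key, **read_csv_kwargs):
    """Fetch a private object and parse it as CSV without writing to disk."""
    # get_object returns a dict; its "Body" entry is a streaming object.
    response = client.get_object(Bucket=space_name, Key=key)
    # Read the whole object into memory, so this suits files that fit in RAM.
    data = response["Body"].read()
    # Extra keyword arguments (sep, dtype, ...) are passed on to pandas.
    return pd.read_csv(io.BytesIO(data), **read_csv_kwargs)
```

It would be called as `df = read_csv_from_space(client, space_name, file_name)`. The file stays private because the request is signed with your access key.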
Hope it helps
Heya,
Another tool you can use is s3cmd to handle the operations for you! You can check this question here:
https://www.digitalocean.com/community/questions/how-to-manage-digitalocean-spaces-using-s3cmd
Regards