Welcome to Day 9 of the 12 Days of DigitalOcean! Over the past few days, we’ve been building an Email-Based Receipt Processing Service. The goal is simple: allow users to forward receipts or invoices to a dedicated email address, extract key details like date, amount, currency, and vendor, and store them in a database or spreadsheet for easy review later.
Sounds handy, right? Now, think about doing this manually—digging through emails, copying details one by one, and pasting them into a spreadsheet. It’s repetitive, error-prone, and, let’s be honest, a waste of your time. That’s where AI comes in.
Instead of writing a parser for every receipt format, you’ll use DigitalOcean’s GenAI Platform to handle this dynamically. By the end of this tutorial, you’ll have a Flask app that takes messy receipt emails, processes them with AI, and transforms them into clean, structured JSON.
Emails aren’t standardized. Every vendor formats receipts differently, and writing custom parsers for all those variations is impractical. AI is perfect for this kind of problem. It excels at recognizing patterns in unstructured text and extracting relevant information.
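To make the problem concrete, here is a minimal sketch of the rule-based approach. The regular expression and both sample emails are made up for illustration; the point is only that a pattern tuned to one vendor's layout silently misses another's.

```python
import re

# Hypothetical pattern written for one vendor's layout: "Total: $35.99"
TOTAL_PATTERN = re.compile(r"Total:\s*\$(\d+\.\d{2})")

def parse_total(email_text):
    """Return the total as a string, or None if the pattern does not match."""
    match = TOTAL_PATTERN.search(email_text)
    return match.group(1) if match else None

# Two made-up receipts that state the same amount in different ways
vendor_a = "Your order #12345 has been processed. Total: $35.99."
vendor_b = "Amount charged to your card: USD 35.99 on 29 Dec 2024."

print(parse_total(vendor_a))  # 35.99
print(parse_total(vendor_b))  # None
```

Every new layout would need another pattern, which is the maintenance burden the AI-based approach is meant to avoid.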
With DigitalOcean’s GenAI Platform, you get access to pre-trained models from Meta, Mistral, and Anthropic. These models are powerful and flexible, and you can even fine-tune them for specific tasks if needed. For this tutorial, we’ll use a pre-trained model to keep things simple.
To get the most out of this tutorial, we assume the following:
Note: Even if you don’t have these set up, you’ll still learn how to:
Let’s start by creating the core component of this system: a GenAI agent.
Log in to your DigitalOcean account and navigate to the GenAI Platform.
Note: The GenAI Platform is currently in early availability. If you don’t have access, you can request access here.
Click Create Agent, and give your agent a name like receipt-processor-agent.
Add the following instructions to describe the agent’s task:
Extract the following details from emails:
- Date of transaction
- Amount
- Currency
- Vendor name
Ensure the output is in the following JSON format:
{
"date": "<date>",
"amount": "<amount>",
"currency": "<currency>",
"vendor": "<vendor>"
}
Select a Model: For this tutorial, select Llama 3.1 Instruct (8B). It’s great for general-purpose instruction-following tasks.
Pro Tip: Models can be updated later, even after the agent is created. You can also evaluate models in the Model Playground before selecting one. For a complete list of supported models, refer to the DigitalOcean documentation.
Optional: Add a Knowledge Base You can enhance your agent by adding a knowledge base. This allows it to learn from specific documents or datasets you provide. For simplicity, we’ll skip this step in this tutorial.
Create the Agent: Scroll down and click Create Agent to complete the setup.
Before hooking the agent into an app, let’s make sure it works. Go to the Playground tab in the agent settings. This is a testing area where you can interact with the agent and see how it responds.
Thank you for your purchase! Your order #12345 has been processed. Total: $35.99. Date: December 29, 2024. Thank you for shopping at Amazon.com
{
"date": "Dec 29, 2024",
"amount": "35.99",
"currency": "USD",
"vendor": "Amazon"
}
If the response doesn’t match the expected JSON format, adjust the instructions and test again. This step is about making sure the agent understands what you want.
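If you want a quick, repeatable check while iterating on the instructions, you could paste the Playground response into a small script like the one below. This is only a sketch: the helper name `missing_keys` is hypothetical, and it checks nothing beyond the presence of the four keys named in the agent instructions.

```python
import json

# The four keys requested in the agent instructions
REQUIRED_KEYS = {"date", "amount", "currency", "vendor"}

def missing_keys(raw_output):
    """Return the set of required keys absent from the agent's JSON output."""
    data = json.loads(raw_output)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return REQUIRED_KEYS - set(data)

sample = '{"date": "Dec 29, 2024", "amount": "35.99", "currency": "USD", "vendor": "Amazon"}'
print(missing_keys(sample))  # set()
```

An empty set means the response has the shape the instructions asked for.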
https://agent-1234abcd5678efgh1234-abcde.ondigitalocean.app
Now that we have the agent endpoint and secure agent key, we need to store them securely so they’re accessible to our Flask app.
Go to App Settings: Navigate to your app in the DigitalOcean dashboard and click Settings.
Locate Environment Variables: Scroll to the App-Level Environment Variables section.
Add the Following Variables:
AGENT_BASE_URL: Paste the GenAI agent’s base URL, e.g., https://agent-1234abcd5678efgh1234-abcde.ondigitalocean.app.
SECURE_AGENT_KEY: Paste the secure agent key generated in the GenAI settings.
Save and Redeploy: Save the changes, and DigitalOcean will automatically redeploy your app with the new environment variables.
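A missing variable tends to surface later as a confusing connection or authentication error, so it can help to check for both values at startup. The following is a minimal sketch, not part of the original app; `missing_env_vars` is a hypothetical helper name.

```python
import os

# The two variables configured in the App Platform settings
REQUIRED_VARS = ("AGENT_BASE_URL", "SECURE_AGENT_KEY")

def missing_env_vars(environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]

# Example with an explicit mapping instead of the real environment
example_env = {"AGENT_BASE_URL": "https://agent-1234abcd5678efgh1234-abcde.ondigitalocean.app"}
print(missing_env_vars(example_env))  # ['SECURE_AGENT_KEY']
```

Calling `missing_env_vars()` with no argument inspects the real environment; the app could log the result or refuse to start if the list is not empty.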
Now, we’ll update our Flask app to communicate with the GenAI agent using the OpenAI SDK. Here’s how we’ll approach it:
Install Required Libraries: Run this command to ensure you have the necessary libraries:
pip install openai flask python-dotenv
Freeze Requirements: Generate a requirements.txt file to track your dependencies:
pip freeze > requirements.txt
Set Up the OpenAI SDK: In the app, we’ll initialize the OpenAI SDK and configure it to use the DigitalOcean GenAI endpoint. This is what handles communication with the agent we created earlier.
from openai import OpenAI
import os
from dotenv import load_dotenv
# Load environment variables
load_dotenv()
# Secure agent key and endpoint
SECURE_AGENT_KEY = os.getenv("SECURE_AGENT_KEY")
AGENT_BASE_URL = os.getenv("AGENT_BASE_URL")
AGENT_ENDPOINT = f"{AGENT_BASE_URL}/api/v1/"
# Initialize the OpenAI client
client = OpenAI(
    base_url=AGENT_ENDPOINT,
    api_key=SECURE_AGENT_KEY
)
Add a Function to Process Email Content: Now we’ll add a function to send email content to the GenAI agent, process it using a prompt, and return the result. Here’s how it works:
def process_with_ai(email_content):
    """
    Process email content with GenAI.
    """
    prompt = (
        "Extract the following details from the email:\n"
        "- Date of transaction\n"
        "- Amount\n"
        "- Currency\n"
        "- Vendor name\n\n"
        f"Email content:\n{email_content}\n\n"
        "Ensure the output is in JSON format with keys: date, amount, currency, vendor."
    )
    response = client.chat.completions.create(
        model="your-model-id",  # Replace with your GenAI model ID
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content
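Note that the function returns the model's reply as a plain string. Models sometimes wrap the JSON in extra text such as a short preamble, so if you need a dictionary you may want a tolerant parsing step. The sketch below is an assumption about how you might handle that, not something the tutorial's app does; `parse_agent_reply` is a hypothetical helper.

```python
import json

def parse_agent_reply(reply_text):
    """Parse the span from the first '{' to the last '}' as JSON; return None on failure."""
    start = reply_text.find("{")
    end = reply_text.rfind("}")
    if start == -1 or end < start:
        return None
    try:
        return json.loads(reply_text[start:end + 1])
    except json.JSONDecodeError:
        return None

# A made-up reply with a preamble before the JSON object
reply = 'Here is the result:\n{"date": "Dec 29, 2024", "amount": "35.99", "currency": "USD", "vendor": "Amazon"}'
print(parse_agent_reply(reply)["vendor"])  # Amazon
```

Returning `None` instead of raising lets the caller decide whether to retry, log the raw reply, or reject the email.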
Update the Flask Endpoint: Finally, we’ll connect this function to the /inbound route of the Flask app to handle incoming emails. Here’s the updated endpoint:
@app.route('/inbound', methods=['POST'])
def handle_inbound_email():
    """
    Process inbound emails and return extracted JSON.
    """
    email_content = request.json.get("TextBody", "")
    if not email_content:
        return jsonify({"error": "No email content provided"}), 400
    extracted_data = process_with_ai(email_content)
    return jsonify({"extracted_data": extracted_data})
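The endpoint reads only the TextBody field, so an HTML-only receipt would be rejected with a 400. One possible refinement is to fall back to the HTML body. The sketch below assumes the inbound payload also carries an HtmlBody field (check the Postmark inbound webhook documentation to confirm), and the tag stripping is deliberately naive; `get_email_text` is a hypothetical helper.

```python
from html.parser import HTMLParser

class _TextCollector(HTMLParser):
    """Collect the non-empty text nodes of an HTML document."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        if data.strip():
            self.parts.append(data.strip())

def get_email_text(payload):
    """Prefer TextBody; fall back to the text found in HtmlBody (assumed field)."""
    text = (payload.get("TextBody") or "").strip()
    if text:
        return text
    collector = _TextCollector()
    collector.feed(payload.get("HtmlBody") or "")
    collector.close()
    return " ".join(collector.parts)

# Made-up payload with an empty text part
payload = {"TextBody": "", "HtmlBody": "<p>Total: <b>$35.99</b></p>"}
print(get_email_text(payload))  # Total: $35.99
```

In the route, `get_email_text(request.json)` could replace the direct `TextBody` lookup; the existing empty-content check would still apply.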
# app.py
from flask import Flask, request, jsonify
import os
from dotenv import load_dotenv
from openai import OpenAI
import logging
# Load environment variables
load_dotenv()
# Initialize Flask app
app = Flask(__name__)
# Configure logging
logging.basicConfig(level=logging.INFO)
# Secure agent key and endpoint
SECURE_AGENT_KEY = os.getenv("SECURE_AGENT_KEY")
AGENT_BASE_URL = os.getenv("AGENT_BASE_URL")
AGENT_ENDPOINT = f"{AGENT_BASE_URL}/api/v1/"
# Initialize the OpenAI client for GenAI
client = OpenAI(
    base_url=AGENT_ENDPOINT,
    api_key=SECURE_AGENT_KEY
)
def process_with_ai(email_content):
    """
    Process email content with GenAI.
    """
    prompt = (
        "Extract the following details from the email:\n"
        "- Date of transaction\n"
        "- Amount\n"
        "- Currency\n"
        "- Vendor name\n\n"
        f"Email content:\n{email_content}\n\n"
        "Ensure the output is in JSON format with keys: date, amount, currency, vendor."
    )
    response = client.chat.completions.create(
        model="your-model-id",  # Replace with your GenAI model ID
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content
@app.route('/inbound', methods=['POST'])
def handle_inbound_email():
    """
    Process inbound emails and return extracted JSON.
    """
    email_content = request.json.get("TextBody", "")
    if not email_content:
        logging.error("No email content provided.")
        return jsonify({"error": "No email content provided"}), 400
    extracted_data = process_with_ai(email_content)
    logging.info("Extracted Data: %s", extracted_data)  # Log the extracted data
    return jsonify({"extracted_data": extracted_data})
if __name__ == "__main__":
    app.run(port=5000)
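Before deploying, you may want to exercise the route locally. The sketch below builds a POST request shaped like the JSON the /inbound route reads, using only the standard library. The URL assumes the app is running on your machine on port 5000, as in app.py; `build_inbound_request` is a hypothetical helper, and the receipt text is made up.

```python
import json
import urllib.request

def build_inbound_request(text_body, url="http://127.0.0.1:5000/inbound"):
    """Build a POST request that mimics the JSON the /inbound route reads."""
    body = json.dumps({"TextBody": text_body}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_inbound_request("Total: $35.99. Date: December 29, 2024.")
print(req.get_method(), req.full_url)

# With the app running locally, send the request and print the response:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Sending the request triggers a real call to the GenAI agent, so AGENT_BASE_URL and SECURE_AGENT_KEY need to be set in your local environment first.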
To deploy the updated Flask app, follow the steps from Day 7. Here’s a quick summary:
1. Push Your Changes: Commit and push the updated code to your repository:
git add .
git commit -m "Add GenAI integration for receipt processing"
git push origin main
2. Monitor Deployment: You can track the progress in the Deployments section of your app’s dashboard.
3. Verify Your Deployment: After the deployment completes, navigate to your app’s public URL and test its functionality. You can also check the runtime logs in the dashboard to confirm that the app started successfully.
Send a Sample Email: Forward a sample receipt email to the Postmark address you configured in Day 8.
Monitor Postmark Activity: Log in to Postmark and confirm that the email was forwarded to your app. Refer to the Day 8 tutorial.
Check Runtime Logs: In the DigitalOcean App Platform dashboard, view your app’s runtime logs. You should see the JSON output from the Flask app.
Expected Output:
2024-12-31 12:34:56,789 INFO Extracted Data: {
"date": "Dec 29, 2024",
"amount": "35.99",
"currency": "USD",
"vendor": "Amazon"
}
![DigitalOcean runtime logs showing application output](https://doimages.nyc3.cdn.digitaloceanspaces.com/006Community/12-Days-of-DO/email-receipt-processor/digitalocean_runtime_logs_screenshot _json_response.png)
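The values in this output are all strings, and the date format depends on what the model returns. If you plan to store or sort the data, a normalization step along these lines may help. This is a sketch under assumptions: the list of date formats is a guess at what the model might produce, and `normalize_receipt` is a hypothetical helper, not part of the tutorial's app.

```python
from datetime import datetime
from decimal import Decimal

# Assumed date formats; extend this tuple to match what your agent returns
DATE_FORMATS = ("%b %d, %Y", "%B %d, %Y", "%Y-%m-%d")

def normalize_receipt(data):
    """Return a copy with an ISO 8601 date (when recognized) and a Decimal amount."""
    parsed_date = None
    for fmt in DATE_FORMATS:
        try:
            parsed_date = datetime.strptime(data["date"], fmt).date()
            break
        except ValueError:
            continue
    return {
        **data,
        # Leave the date untouched if no format matched
        "date": parsed_date.isoformat() if parsed_date else data["date"],
        "amount": Decimal(data["amount"]),
    }

record = normalize_receipt(
    {"date": "Dec 29, 2024", "amount": "35.99", "currency": "USD", "vendor": "Amazon"}
)
print(record["date"], record["amount"])  # 2024-12-29 35.99
```

`Decimal` avoids floating-point rounding on currency amounts; it raises `decimal.InvalidOperation` if the amount is not numeric, which the caller would need to handle.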
Here’s what we accomplished today:
Here is the previous tutorial from this series on Day 8: Connecting Postmark to Your Flask App.
Up next: In the next tutorial, you’ll complete the pipeline by storing the extracted JSON data in a database. This will make your Email-Based Receipt Processing Service ready for real-world use. Stay tuned: Day 10 is all about tying it all together! 🚀