Prepare source data

Overview

Initialize an object storage (S3 Bucket) to hold original documents (PDF, Word, Text). This acts as the “Source of Truth” that the Knowledge Base will access to read, analyze, and synchronize knowledge for the AI. You can store knowledge related to your field to use in creating your own personal assistant or Chatbot.

Data Preparation

We will create an S3 Bucket to store original documents, acting as the knowledge source for the Chatbot.

Step 1. Create S3 Bucket

  • Access the S3 service from the search bar.
  • AWS Region: Select United States (N. Virginia us-east-1).

Access S3

  • Click Create bucket.

Create S3 Bucket

  • Configure Bucket information:
    • Bucket Type: Click General purpose
    • Bucket name: Enter rag-workshop-demo
    • Object Ownership: Keep default ACLs disabled.
    • Block Public Access settings: Keep default (Selected Block all public access).

Configure S3 Bucket

  • Scroll to the bottom of the page, Click Create bucket.

Image illustrating S3 Bucket creation interface and Region selection

  • Check that the S3 Bucket has been created successfully.

Create S3 Bucket Successful

Step 2. Upload sample documents

  • This is a sample document, relevant to overview of AWS cloud computing knowledge. You can use it to run demos or upload your data.

PDF format file

  • In the Buckets list, Click on the bucket name you just created. Click_S3_Bucket

  • Click Upload. Upload_file_to_Bucket

  • In the Upload interface:

    • Click Add files.
    • Select the sample document file attached above or a file from your computer (PDF or Word file with a lot of text is recommended).
  • Scroll to the bottom of the page, Click Upload.

Upload_file_to_Bucket_02

  • When you see the green “Upload succeeded” notification, Click Close.

Upload file successfully