Week 4 Worklog

Week 4 Objectives:

  • The goal of this week was to strengthen understanding of AWS data and AI services, from building scalable data lake architectures to working with serverless NoSQL databases (DynamoDB) and exploring AI development lifecycle management.
  • Additionally, this week aimed to enhance technical translation and communication skills through hands-on labs, blog translations, and participation in AWS events.

Tasks carried out this week:

Day 1: Building a Data Lake on AWS
- Data Lake Concepts: Explored centralized repository architectures for varied data analytics workloads.
- AWS Glue (ETL): Studied Glue Crawlers for automated schema discovery and Data Catalog generation.
- Amazon Athena: Performed serverless SQL queries directly on S3 data.
- Amazon QuickSight: Visualized data insights via interactive Dashboards.
- Hands-on Lab 35: Deployed Data Lake infrastructure, ingested data via Kinesis Firehose, and queried cataloged data using Athena.
Start: 29/09/2025 | Completed: 29/09/2025 | Reference: Module 07
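
The Athena step from Lab 35 can be sketched in Python. This is a minimal sketch, not the lab's actual code: the database, table, and S3 output names below are hypothetical placeholders for whatever the Glue Crawler cataloged.

```python
# Sketch of the Lab 35 query step: build a serverless SQL query for Athena
# against a Glue-cataloged table. All resource names are hypothetical.

DATABASE = "datalake_db"          # assumed Glue Data Catalog database
TABLE = "firehose_ingested_data"  # assumed table created by the Glue Crawler

def build_query(database: str, table: str, limit: int = 10) -> str:
    """Return a simple preview query; Athena reads the data directly from S3."""
    return f'SELECT * FROM "{database}"."{table}" LIMIT {limit};'

def run_query(sql: str, output_s3: str):
    """Submit the query via boto3 (needs AWS credentials; not called here)."""
    import boto3  # imported lazily so the sketch stays importable offline
    athena = boto3.client("athena")
    return athena.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_s3},
    )

query = build_query(DATABASE, TABLE)
print(query)
```

In the lab itself the query was issued from the console; the `run_query` helper just shows where the boto3 call would slot in.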
Day 2: Query Optimization & Cost Analysis (Lab 40)
- ETL Workflow: Executed end-to-end pipeline: Raw Data Ingestion -> Glue Crawler -> Transformation (Parquet) -> Athena Query.
- Cost Analysis: Analyzed AWS billing data using SQL in Athena (Service breakdown, Tagging).
- Optimization Techniques: Applied cost-saving strategies: Parquet compression, query LIMITs, and data partitioning.
Start: 30/09/2025 | Completed: 30/09/2025 | Reference: Module 07
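
The partitioning idea above can be sketched as follows. This is an illustrative sketch, not the lab's code: the table name is a placeholder, and the column names follow the AWS Cost and Usage Report schema, which is an assumption about the billing data used.

```python
# Sketch of partition-based cost optimization: lay out S3 keys as Hive-style
# partitions so Athena scans (and bills for) only the partitions it needs.

from datetime import date

def partition_prefix(d: date) -> str:
    """Hive-style partition path that Athena/Glue can prune on."""
    return f"year={d.year}/month={d.month:02d}/day={d.day:02d}/"

def pruned_query(table: str, d: date, limit: int = 100) -> str:
    """Cost-by-service query restricted to one partition; unscanned
    partitions cost nothing. Columns assume the CUR schema."""
    return (
        f"SELECT line_item_product_code, "
        f"SUM(line_item_unblended_cost) AS cost "
        f"FROM {table} "
        f"WHERE year='{d.year}' AND month='{d.month:02d}' AND day='{d.day:02d}' "
        f"GROUP BY line_item_product_code "
        f"ORDER BY cost DESC LIMIT {limit};"
    )

print(partition_prefix(date(2025, 9, 30)))  # year=2025/month=09/day=30/
```

Combined with Parquet's columnar compression, the partition predicate and the `LIMIT` clause are the three levers the lab applied to reduce per-query scan costs.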
Day 3: NoSQL Databases with Amazon DynamoDB (Lab 60)
- DynamoDB Fundamentals: Researched Serverless NoSQL architecture, auto-scaling, and Capacity Modes (On-demand vs Provisioned).
- Data Modeling: Examined primary keys (partition + sort) and secondary indexes (GSI/LSI) in depth.
- Consistency Models: Compared Eventual Consistency vs. Strong Consistency trade-offs.
- Implementation: Practiced table creation, item manipulation, and querying via AWS Console and SDK (Boto3).
Start: 01/10/2025 | Completed: 01/10/2025 | Reference: Module 06
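
The composite-key modeling practiced in Lab 60 can be sketched with plain Python. The USER/ORDER entity names are hypothetical examples, and `query_partition` is a local stand-in for DynamoDB's Query API, not the Boto3 call itself.

```python
# Sketch of composite-key data modeling: the partition key groups related
# items, the sort key orders them, and a query reads one partition by
# sort-key prefix. Entity names (USER/ORDER) are hypothetical.

from typing import List

def make_key(user_id: str, order_ts: str) -> dict:
    """Composite primary key: partition key (pk) + sort key (sk)."""
    return {"pk": f"USER#{user_id}", "sk": f"ORDER#{order_ts}"}

def query_partition(items: List[dict], pk: str, sk_prefix: str) -> List[dict]:
    """Local stand-in for a DynamoDB Query with a begins_with() sort-key
    condition; the real call would go through boto3's Table.query."""
    return sorted(
        (i for i in items if i["pk"] == pk and i["sk"].startswith(sk_prefix)),
        key=lambda i: i["sk"],  # DynamoDB returns items sorted by sort key
    )

items = [
    make_key("42", "2025-10-01T09:00") | {"total": 10},
    make_key("42", "2025-10-01T12:30") | {"total": 25},
    make_key("7",  "2025-10-01T10:00") | {"total": 5},
]
print(query_partition(items, "USER#42", "ORDER#2025-10-01"))
```

A GSI would add an alternative pk/sk pair over the same items to support a second access pattern, at the cost of extra write capacity.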
Day 4: AWS Technical Blog Research & Translation
- Generative AI: Translated and studied GenAI lifecycle management using MLflow on SageMaker.
- Industry Application: Researched Deep Learning applications for subsurface infrastructure mapping.
- Contact Center AI: Explored Amazon Connect optimization using AI capabilities.
- Skill Development: Enhanced technical vocabulary and domain knowledge in SageMaker, Batch, and ParallelCluster.
Start: 02/10/2025 | Completed: 02/10/2025
Day 5: AWS Event: AI Development Lifecycle & Kiro
- AI Lifecycle: Gained insights into the full AI development pipeline: Data Prep -> Training -> Deployment -> Monitoring.
- AWS AI Ecosystem: Understood the role of SageMaker, Bedrock, and CodeWhisperer in AI DevSecOps.
- New Tooling (Kiro): Explored Kiro for unified AI workflow management.
- Real-world Use Cases: Analyzed case studies on AI automation and model optimization in production.
Start: 03/10/2025 | Completed: 03/10/2025

Week 4 Achievements:

  • Built an AWS data lake pipeline integrating Glue, Athena, and QuickSight, gaining hands-on experience in data ingestion, transformation, and visualization.
  • Configured AWS Glue Crawlers and Athena queries for cost analysis and schema automation, applying cost optimization strategies (e.g., Parquet, partitioning, query limits).
  • Mastered DynamoDB fundamentals, including primary/composite keys, indexes (GSI, LSI), consistency models, and capacity modes.
  • Enhanced translation and comprehension skills by translating AWS technical blogs on AI, MLflow, and HPC, deepening understanding of SageMaker, Batch, ParallelCluster, and Connect.
  • Participated in an AWS event focused on the AI Development Lifecycle and Kiro, gaining insights into end-to-end model management, versioning, and deployment best practices.
  • Improved technical vocabulary and practical knowledge in cloud computing, AI model governance, and data-driven architecture design within the AWS ecosystem.