Hands-On: Data Pipeline with DynamoDB

Length: 00:05:24

Lesson Summary: AWS Data Pipeline is a web service that helps you reliably process and move data between AWS compute and storage services, as well as on-premises data sources, at specified intervals. With AWS Data Pipeline, you can regularly access your data where it’s stored, transform and process it at scale, and efficiently transfer the results to AWS services such as Amazon S3, Amazon RDS, Amazon DynamoDB, and Amazon EMR. In this lesson, we use AWS Data Pipeline to load data from a tab-delimited file in Amazon S3 into a DynamoDB table. A Hive script defines the necessary data transformation steps, and the pipeline automatically creates an Amazon EMR cluster to perform the work.
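To make the moving parts concrete, here is a minimal sketch of how such a pipeline definition might be assembled in Python. It is not the definition used in the lesson. The object IDs, bucket paths, table name, and script location are all hypothetical, and the object types and field names (Schedule, TSV, S3DataNode, DynamoDBDataNode, EmrCluster, HiveActivity) reflect one reading of the Data Pipeline object reference and should be checked there before use. A real DynamoDB load will probably need additional fields, such as a data format on the output node. The second function converts the JSON-style definition into the key/stringValue/refValue shape that the Data Pipeline API expects for pipeline objects.

```python
import json


def build_definition(s3_input_path, table_name, hive_script_uri):
    """Build a JSON-style pipeline definition (a sketch, not a verified template)."""
    schedule = {"ref": "DefaultSchedule"}
    objects = [
        # Run once, starting when the pipeline is first activated (assumed settings).
        {"id": "DefaultSchedule", "name": "RunOnce", "type": "Schedule",
         "startAt": "FIRST_ACTIVATION_DATE_TIME", "period": "1 day",
         "occurrences": "1"},
        # Describes the tab-delimited layout of the input file.
        {"id": "TsvFormat", "name": "TsvFormat", "type": "TSV"},
        # Source: the tab-delimited file in Amazon S3.
        {"id": "S3Input", "name": "S3Input", "type": "S3DataNode",
         "filePath": s3_input_path, "dataFormat": {"ref": "TsvFormat"},
         "schedule": schedule},
        # Destination: the DynamoDB table to populate.
        {"id": "DdbOutput", "name": "DdbOutput", "type": "DynamoDBDataNode",
         "tableName": table_name, "schedule": schedule},
        # EMR cluster that the pipeline creates to run the Hive script.
        {"id": "EmrForHive", "name": "EmrForHive", "type": "EmrCluster",
         "terminateAfter": "1 Hour", "schedule": schedule},
        # The Hive script that carries out the transformation and load.
        {"id": "LoadTable", "name": "LoadTable", "type": "HiveActivity",
         "scriptUri": hive_script_uri, "input": {"ref": "S3Input"},
         "output": {"ref": "DdbOutput"}, "runsOn": {"ref": "EmrForHive"},
         "schedule": schedule},
    ]
    return {"objects": objects}


def to_api_objects(definition):
    """Convert the JSON-style definition to the API's pipeline-object shape."""
    api_objects = []
    for obj in definition["objects"]:
        fields = []
        for key, value in obj.items():
            if key in ("id", "name"):
                continue  # id and name are top-level attributes, not fields
            if isinstance(value, dict) and "ref" in value:
                fields.append({"key": key, "refValue": value["ref"]})
            else:
                fields.append({"key": key, "stringValue": str(value)})
        api_objects.append({"id": obj["id"], "name": obj["name"], "fields": fields})
    return api_objects


if __name__ == "__main__":
    # Hypothetical locations and names, for illustration only.
    definition = build_definition(
        "s3://example-bucket/input/data.tsv",
        "ExampleTable",
        "s3://example-bucket/scripts/load.q",
    )
    print(json.dumps(definition, indent=2))
```

The list returned by to_api_objects could then be supplied as the pipeline objects when uploading a definition through an AWS SDK. That step is left out here so the sketch runs without AWS credentials.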
