AWS Glue Crawler:
A Glue crawler is part of the AWS Glue service. It crawls over your datasets, infers the necessary metadata (such as schema and data format), and creates tables in the Glue Data Catalog. Those tables can then be queried from other services such as Amazon Athena.
A Glue crawler can handle several data formats, including:
- CSV
- Parquet
- Avro
- JSON
- XML
A crawler can read data from different source systems, such as:
- S3
- JDBC
- MongoDB
- DynamoDB
- Amazon DocumentDB
To set up a crawler, we configure the data source (an S3 location, JDBC connection details, or a DynamoDB table name), the destination database, and the crawler name. When the crawler starts, it reads the data from the source and creates the necessary tables in the Glue Data Catalog.
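The same setup can also be scripted. Below is a minimal sketch using boto3, the AWS SDK for Python, assuming AWS credentials and an IAM role with Glue permissions are already in place. The crawler name, role ARN, database name, and bucket path are hypothetical placeholders, and the helper functions are illustrative names, not part of any AWS library.

```python
def build_crawler_config(name, role_arn, database, s3_path=None, dynamodb_table=None):
    """Build the keyword arguments for the Glue create_crawler call."""
    targets = {}
    if s3_path:
        # S3 source: the crawler scans every object under this prefix.
        targets["S3Targets"] = [{"Path": s3_path}]
    if dynamodb_table:
        # DynamoDB source: "Path" holds the table name.
        targets["DynamoDBTargets"] = [{"Path": dynamodb_table}]
    if not targets:
        raise ValueError("at least one data source is required")
    return {
        "Name": name,              # crawler name
        "Role": role_arn,          # IAM role the crawler assumes
        "DatabaseName": database,  # destination database in the Data Catalog
        "Targets": targets,
    }


def create_and_start_crawler(config):
    """Create the crawler and start it (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the config builder works without boto3

    glue = boto3.client("glue")
    glue.create_crawler(**config)
    glue.start_crawler(Name=config["Name"])


if __name__ == "__main__":
    # All values below are placeholders (assumptions), not real resources.
    config = build_crawler_config(
        name="sales-data-crawler",
        role_arn="arn:aws:iam::123456789012:role/MyGlueCrawlerRole",
        database="sales_db",
        s3_path="s3://my-example-bucket/sales/",
    )
    print(config)
    # Uncomment to create and run the crawler against a real AWS account:
    # create_and_start_crawler(config)
```

Once the crawler run finishes, the tables it created should appear under the chosen database in the Glue Data Catalog.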
Watch video for hands-on