AWS Glue 3.0 version

AWS has released Glue 3.0, which brings a new set of performance optimizations.

Glue 3.0 is based on Apache Spark 3.1.1 and combines optimizations from open-source Spark with ones developed by the AWS Glue and EMR services, such as adaptive query execution, vectorized readers, optimized shuffles, and partition coalescing.

Glue 2.0 runs the older Spark 2.4, whereas Glue 3.0 gives access to an AWS-customized build of Spark 3.1.1.
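As a rough illustration of the adaptive query execution and partition coalescing features mentioned above, the sketch below sets the corresponding Spark 3.x SQL options on a running session. The configuration keys are standard Spark settings; the helper name `apply_settings` is made up for this example, and whether Glue 3.0 already enables any of these by default is not covered here, so treat it as an assumption to verify.

```python
# Standard Spark 3.x SQL configuration keys for adaptive query execution.
AQE_SETTINGS = {
    "spark.sql.adaptive.enabled": "true",
    "spark.sql.adaptive.coalescePartitions.enabled": "true",
    "spark.sql.adaptive.skewJoin.enabled": "true",
}


def apply_settings(conf, settings):
    """Set each key/value on a runtime-config object exposing set(key, value).

    `conf` is expected to behave like `spark.conf` (hypothetical helper,
    not part of Spark or Glue).
    """
    for key, value in settings.items():
        conf.set(key, value)
    return conf


# Usage inside a Glue job script, where a Spark session already exists:
#   spark = glueContext.spark_session
#   apply_settings(spark.conf, AQE_SETTINGS)
```

Setting the options at runtime keeps the job script self-describing; the same keys could also be supplied when the session is built.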

Amazon S3 optimized output committers are enabled by default, which was not the case in earlier Glue versions.

Startup latency is reduced, which improves overall job completion times and interactivity.

Billing is the same as in Glue 2.0: jobs are charged per second.
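To make per-second billing concrete, here is a minimal cost estimate. The function name, the rate of 0.44 USD per DPU-hour, and the 60-second minimum are assumptions for illustration only; check the current AWS Glue pricing page for the real values in your region.

```python
def estimate_job_cost(dpus, runtime_seconds,
                      rate_per_dpu_hour=0.44,  # assumed example rate in USD
                      minimum_seconds=60):     # assumed minimum billed duration
    """Estimate the cost of one job run billed per second."""
    billed_seconds = max(runtime_seconds, minimum_seconds)
    dpu_hours = dpus * billed_seconds / 3600
    return dpu_hours * rate_per_dpu_hour


# Example: a job using 10 DPUs that runs for 5 minutes.
cost = estimate_job_cost(dpus=10, runtime_seconds=300)
print(f"Estimated cost: ${cost:.4f}")
```

Under these assumptions, a job is charged for the seconds it actually runs rather than being rounded up to a larger block of time.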

Glue 3.0 jobs can run on Spark 3.1 with Python 3 or on Spark 3.1 with Scala 2.

AWS Glue 3.0 does not yet support machine learning transforms.

AWS Glue 3.0 does not yet support development endpoints.

Python 2.7 is not supported with Spark 3.1.1.

AWS Glue 3.0 does not run on Apache YARN, so YARN settings do not apply.

AWS Glue 3.0 does not have a Hadoop Distributed File System (HDFS).

Scala is also updated to 2.12 from 2.11, and Scala 2.12 is not backwards compatible with Scala 2.11.
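Because of the Scala binary incompatibility, JARs built for 2.11 need to be rebuilt or replaced before moving a job to Glue 3.0. The sketch below relies on the common convention that Scala artifacts carry their binary version as a suffix (for example `_2.11`) in the file name. The helper names and the sample JAR names are hypothetical, and JARs that do not follow the naming convention will not be detected.

```python
import re

# Matches a Scala binary-version suffix such as "_2.11" or "_2.12"
# when it is followed by "-" or "." in the file name.
_SCALA_SUFFIX = re.compile(r"_(2\.\d+)(?=[-.])")


def scala_binary_version(jar_name):
    """Return the Scala binary version encoded in a JAR name, or None."""
    match = _SCALA_SUFFIX.search(jar_name)
    return match.group(1) if match else None


def incompatible_jars(jar_names, target="2.12"):
    """List JARs whose Scala suffix differs from the target version."""
    return [
        name for name in jar_names
        if scala_binary_version(name) not in (None, target)
    ]


# Hypothetical dependency list for a job being migrated.
jars = [
    "json4s-jackson_2.11-3.5.3.jar",
    "spark-sql_2.12-3.1.1.jar",
    "mysql-connector-java-8.0.23.jar",
]
print(incompatible_jars(jars))
```

Pure Java JARs have no Scala suffix and are skipped by this check.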

Because Glue 3.0 upgrades many dependencies, set the job parameter --user-jars-first to 'true' if you want your own JARs to take precedence over the default JARs in Glue.
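One way to pass this parameter is through the job's default arguments. The sketch below only builds the argument dictionary for a Glue 3.0 job definition; the function name, job name, role ARN, and S3 paths are placeholders invented for this example. The keys follow the shape accepted by boto3's `create_job` call for Glue, which is shown only as a comment so the snippet runs without AWS credentials.

```python
def glue3_job_kwargs(name, role_arn, script_location, extra_jars=None):
    """Build keyword arguments for a Glue 3.0 Spark job definition."""
    default_args = {"--user-jars-first": "true"}
    if extra_jars:
        # --extra-jars takes a comma-separated list of S3 paths.
        default_args["--extra-jars"] = ",".join(extra_jars)
    return {
        "Name": name,
        "Role": role_arn,
        "GlueVersion": "3.0",
        "Command": {
            "Name": "glueetl",
            "ScriptLocation": script_location,
            "PythonVersion": "3",
        },
        "DefaultArguments": default_args,
    }


# Placeholder values, not real resources.
job = glue3_job_kwargs(
    name="example-glue3-job",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
    script_location="s3://example-bucket/scripts/job.py",
    extra_jars=["s3://example-bucket/jars/my-lib_2.12-1.0.jar"],
)

# With boto3 installed and credentials configured:
#   import boto3
#   boto3.client("glue").create_job(**job)
```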

Comparison of older and new versions:

Driver	JDBC driver version in past AWS Glue versions	JDBC driver version in AWS Glue 3.0
MySQL	5.1	8.0.23
Microsoft SQL Server	6.1.0	7.0.0
Oracle Database	11.2	21.1
PostgreSQL	42.1.0	42.2.18
MongoDB	2.0.0	4.0.0

Dependency	Version in AWS Glue 0.9	Version in AWS Glue 1.0	Version in AWS Glue 2.0	Version in AWS Glue 3.0
Spark	2.2.1	2.4.3	2.4.3	3.1.1-amzn-0
Hadoop	2.7.3-amzn-6	2.8.5-amzn-1	2.8.5-amzn-5	3.2.1-amzn-3
Scala	2.11	2.11	2.11	2.12
Jackson	2.7.x	2.7.x	2.7.x	2.10.x
Hive	1.2	1.2	1.2	2.3.7-amzn-4
EMRFS	2.20.0	2.30.0	2.38.0	2.46.0
Json4s	3.2.x	3.5.x	3.5.x	3.6.6
Arrow	N/A	0.10.0	0.10.0	2.0.0
AWS Glue Catalog client	N/A	N/A	1.10.0	3.0.0

World Tech Cure
