Amazon EMR Encryption with Security Configurations

Easier EMR encryption setup offers local disk, Spark, Tez & Hadoop MapReduce options:

  • In-transit data encryption using Transport Layer Security (TLS) happens automatically for data in-transit between Amazon S3 and Amazon EMR.

Describes encryption configuration using security configurations in Amazon EMR.

The following diagram shows the different encryption options available with security configurations in Amazon EMR:

You can use a security configuration to encrypt data at-rest, data in-transit, or both. Each security configuration is stored in Amazon EMR rather than in cluster configuration objects, so you can easily reuse a configuration to specify encryption settings whenever a cluster is created. The settings defined in a security configuration take precedence over security settings that may be configured using cluster configuration objects.
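To make the reuse idea concrete, a security configuration can be thought of as a named JSON document that is defined once and then referenced whenever a cluster is created. The sketch below assembles such a document with the Python standard library only. The key names follow the EMR security configuration JSON format as commonly documented and should be checked against the current Amazon EMR documentation; the helper `build_security_configuration` is a hypothetical name introduced here for illustration.

```python
import json
from typing import Optional


def build_security_configuration(
    at_rest: Optional[dict] = None,
    in_transit: Optional[dict] = None,
) -> dict:
    """Assemble a security configuration document from optional sub-blocks."""
    encryption = {
        # Each flag is switched on only when its settings block is supplied.
        "EnableAtRestEncryption": at_rest is not None,
        "EnableInTransitEncryption": in_transit is not None,
    }
    if at_rest is not None:
        encryption["AtRestEncryptionConfiguration"] = at_rest
    if in_transit is not None:
        encryption["InTransitEncryptionConfiguration"] = in_transit
    return {"EncryptionConfiguration": encryption}


# At-rest encryption only, using SSE-S3 for EMRFS data (illustrative choice).
config = build_security_configuration(
    at_rest={"S3EncryptionConfiguration": {"EncryptionMode": "SSE-S3"}}
)
config_json = json.dumps(config, indent=2)
print(config_json)
```

With the AWS SDK for Python, a string like `config_json` would typically be registered through the EMR client's `create_security_configuration(Name=..., SecurityConfiguration=...)` call and then referenced by name in `run_job_flow(SecurityConfiguration=...)`; those calls are left out here so the sketch runs without AWS credentials.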

Data encryption requires keys and certificates. A security configuration gives you the flexibility to choose from several options, including keys managed by AWS Key Management Service, keys managed by Amazon S3, and keys and certificates from custom providers that you supply.

When using AWS KMS as your key provider, charges apply for the storage and use of encryption keys. For more information, see AWS KMS Pricing.

When you enable at-rest data encryption, a combination of mechanisms helps encrypt the file system. For information specific to Amazon S3 encryption with Amazon EMR, see Encryption for Amazon S3 Data with EMRFS.

Linux Unified Key System (LUKS) provides local-disk encryption, encrypting EC2 instance store volumes (except boot volumes) and EBS volumes attached to cluster nodes. You can use AWS KMS as your key provider, or specify a custom key provider application in Amazon S3.
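The two key-provider choices for local-disk encryption can be sketched as a small helper that returns the corresponding settings block. This is an illustration under assumptions: the field names reflect the EMR security configuration JSON format as commonly documented, while the function name, the key ARN, the S3 path, and the provider class are placeholders invented for this example.

```python
from typing import Optional


def local_disk_encryption(
    kms_key_arn: Optional[str] = None,
    provider_jar: Optional[str] = None,
    provider_class: Optional[str] = None,
) -> dict:
    """Return a local-disk encryption block for AWS KMS or a custom provider."""
    if kms_key_arn is not None:
        # AWS KMS acts as the key provider.
        return {
            "EncryptionKeyProviderType": "AwsKms",
            "AwsKmsKey": kms_key_arn,
        }
    if provider_jar is not None and provider_class is not None:
        # A custom key provider application stored in Amazon S3.
        return {
            "EncryptionKeyProviderType": "Custom",
            "S3Object": provider_jar,
            "EncryptionKeyProviderClass": provider_class,
        }
    raise ValueError("supply a KMS key ARN or a custom provider JAR and class")


# Placeholder ARN, not a real key.
kms_block = local_disk_encryption(
    kms_key_arn="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"
)
# Placeholder S3 path and class name for a hypothetical custom provider.
custom_block = local_disk_encryption(
    provider_jar="s3://example-bucket/providers/MyKeyProvider.jar",
    provider_class="com.example.MyKeyProvider",
)
print(kms_block)
print(custom_block)
```

In a full security configuration, a block like this would sit under the at-rest settings as `LocalDiskEncryptionConfiguration`.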

The Hadoop Distributed File System (HDFS) block-transfer encryption and RPC encryption are used to encrypt the cluster file system. These mechanisms use automatically generated keys, so after you enable encryption at-rest, no additional configuration is required.

The following options are available for encrypting Amazon S3 data and objects used with the Amazon EMR File System (EMRFS):

Server-side encryption with Amazon S3-managed encryption keys (SSE-S3). Data is encrypted in Amazon S3 using Amazon S3-managed keys. See Server-side encryption with Amazon S3-managed encryption keys (SSE-S3) in the Amazon S3 Developer Guide for details.

Server-side encryption with AWS KMS-managed keys (SSE-KMS). Data is encrypted in Amazon S3 using an AWS KMS customer master key (CMK), with policies suitable for Amazon EMR. See Creating Keys for Amazon EMR At-rest Data Encryption and Server-side encryption with AWS KMS-managed keys (SSE-KMS) in the Amazon Simple Storage Service Developer Guide for details.

Client-side encryption with AWS KMS-managed keys (CSE-KMS). Amazon S3 encryption happens in the EMRFS client within the cluster. Similar to SSE-KMS, you use an AWS KMS CMK with policies suitable for Amazon EMR. See Creating Keys for Amazon EMR At-rest Data Encryption and Client-side encryption with AWS KMS-managed keys (CSE-KMS) in the Amazon Simple Storage Service Developer Guide for details.

Client-side encryption with a custom client-side master key (CSE-Custom). Similar to CSE-KMS, Amazon S3 encryption happens in the EMRFS client within the cluster. You specify a custom encryption materials provider in Amazon S3. See Amazon S3 Client-side Encryption with EMRFS and Client-side encryption using a custom client-side master key (CSE-Custom) in the Amazon Simple Storage Service Developer Guide for details.
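The four EMRFS options differ mainly in what extra material each one needs: nothing for SSE-S3, a KMS key for SSE-KMS and CSE-KMS, and a provider location plus class for CSE-Custom. A minimal sketch of that mapping follows. The mode strings and field names are taken from the EMR security configuration JSON format as commonly documented and are worth verifying against the current documentation; the function name and all sample values are hypothetical.

```python
from typing import Optional


def s3_encryption(
    mode: str,
    kms_key_arn: Optional[str] = None,
    provider_jar: Optional[str] = None,
    provider_class: Optional[str] = None,
) -> dict:
    """Return an EMRFS S3 encryption block for one of the four modes."""
    if mode == "SSE-S3":
        # Amazon S3 manages the keys; no further settings are needed.
        return {"EncryptionMode": mode}
    if mode in ("SSE-KMS", "CSE-KMS"):
        if kms_key_arn is None:
            raise ValueError(f"{mode} requires a KMS key ARN")
        return {"EncryptionMode": mode, "AwsKmsKey": kms_key_arn}
    if mode == "CSE-Custom":
        if provider_jar is None or provider_class is None:
            raise ValueError("CSE-Custom requires a provider JAR and class")
        return {
            "EncryptionMode": mode,
            "S3Object": provider_jar,
            "EncryptionKeyProviderClass": provider_class,
        }
    raise ValueError(f"unknown encryption mode: {mode}")


sse_s3 = s3_encryption("SSE-S3")
# Placeholder ARN, not a real key.
cse_kms = s3_encryption(
    "CSE-KMS",
    kms_key_arn="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
)
print(sse_s3)
print(cse_kms)
```

Validating the required material up front keeps a misconfigured mode from surfacing only when the configuration is submitted.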

In-transit data encryption applies to inter-node communication for certain distributed applications that implement Transport Layer Security (TLS) encryption, such as Apache Hadoop Shuffle (with MapReduce or Tez as appropriate) and Apache Spark. When you enable in-transit encryption, the open-source TLS encryption features available in these applications are enabled. You specify the certificates used for encryption either as a zip file of certificates that you upload to Amazon S3 or through a custom certificate provider. For more information, see Amazon EMR In-transit Encryption and Providing Certificates for In-Transit Data Encryption.
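The two ways of supplying certificates can be sketched as a helper that emits the matching in-transit settings block. Treat this as an illustration: the field names and the `PEM` and `Custom` provider types follow the EMR security configuration JSON format as commonly documented, and the function name, S3 paths, and class name are placeholders made up for this example.

```python
from typing import Optional


def tls_certificates(s3_object: str, provider_class: Optional[str] = None) -> dict:
    """Return an in-transit block for a certificate zip or a custom provider."""
    if provider_class is None:
        # s3_object points at a zip file of certificates uploaded to Amazon S3.
        certificate_config = {
            "CertificateProviderType": "PEM",
            "S3Object": s3_object,
        }
    else:
        # s3_object points at a JAR containing the custom certificate provider.
        certificate_config = {
            "CertificateProviderType": "Custom",
            "S3Object": s3_object,
            "CertificateProviderClass": provider_class,
        }
    return {"TLSCertificateConfiguration": certificate_config}


# Placeholder S3 locations and class name.
zip_block = tls_certificates("s3://example-bucket/certs/my-certs.zip")
custom_block = tls_certificates(
    "s3://example-bucket/providers/MyCertProvider.jar",
    provider_class="com.example.MyCertProvider",
)
print(zip_block)
print(custom_block)
```

In a full security configuration, the returned block would be placed under `InTransitEncryptionConfiguration` with in-transit encryption enabled.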
