Nucleus

Replacing Bastion Host with AWS SSM

Nick Zelei


Intro

Bastion hosts are typically used to provide authenticated access to private VPCs. A common use case is tunneling into a database that is not publicly accessible from the internet. We use this at Nucleus to access our staging database and, in rare instances, our production database.

Unless the bastion host is behind a VPN, this results in port 22 (the standard port for SSH) being exposed to the internet. This isn’t great for things like SOC2 or other general security audits. Using a product like AWS SSM (no, this is not an advertisement) allows IAM roles to be used for authentication instead of having to manage SSH keys. A jump box is still required, but we’re relying on AWS as the source of authentication instead of SSH.

All of the code seen in this blog post can be found here.

Goals for this Post

All of the goals below will utilize Terraform to set up the infrastructure.

  • Replace Bastion Host with AWS SSM
  • Set up a basic private EC2 instance to act as the jump box
  • Configure session logging to keep track of any and all commands being used on the jump box
    • Bonus: Set up an S3 bucket that is configured in a separate AWS account for optimal security
  • Set up SSM to allow tunneling to an Aurora Postgres DB

Setting up AWS SSM

We’ll first cover setting up AWS Session Manager Preferences.

This will cover the following:

  1. Using a KMS Key to encrypt user sessions
  2. Setting up an S3 bucket for session logging
    1. Cloudwatch can also be used, but this won’t be covered in this post
  3. Setting up a basic EC2 instance to be used as a jump box

Configuring Session Manager Preferences

AWS SSM works on the notion of using different “documents”, which are effectively access profiles that can be used to do different things when tunneling with SSM.

Context around the SSM Preferences Document

To configure SSM Preferences, another document is used. This is a little confusing because, when done through the UI, the document is lazily created the first time the preferences are accessed. As a result, there are effectively two different places to configure the same preferences! If doing this through Terraform, the document can be created directly, unless the preferences have previously been accessed through the AWS UI. In that case, the document will have to be imported into state, or deleted, prior to using Terraform to manage the preferences. The name of the document is SSM-SessionManagerRunShell.
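If the preferences were previously touched through the console, the existing document can be brought under Terraform management rather than recreated. Assuming the resource address used in the next code block, the import might look like:

```shell
# Import the console-created preferences document into Terraform state
terraform import aws_ssm_document.session_manager_prefs SSM-SessionManagerRunShell
```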

Code

Let’s see what this looks like using Terraform.

resource "aws_ssm_document" "session_manager_prefs" {
  name            = "SSM-SessionManagerRunShell"
  document_type   = "Session"
  document_format = "JSON"

  content = jsonencode({
    schemaVersion = "1.0"
    description   = "Document to hold regional settings for Session Manager"
    sessionType   = "Standard_Stream"
    inputs = {
      kmsKeyId                    = aws_kms_key.ssm_key.id
      s3BucketName                = var.logs_bucket_name
      s3KeyPrefix                 = "ssm/${var.account_id}"
      s3EncryptionEnabled         = true
      cloudWatchLogGroupName      = ""
      cloudWatchEncryptionEnabled = true
      cloudWatchStreamingEnabled  = false
      idleSessionTimeout          = 60
      maxSessionDuration          = null
      runAsEnabled                = false
      shellProfile = {
        linux   = ""
        windows = ""
      }
    }
  })
}

resource "aws_kms_key" "ssm_key" {
  description             = "Encrypts SSM User Sessions"
  deletion_window_in_days = 10
  enable_key_rotation     = true
}

This module is missing the name of the S3 bucket that will be used for session logging. Let’s set that up next.

SSM Session Logging Bucket

Let’s go through the configuration for the S3 bucket and explain what each section does and why.

This first section creates a basic S3 bucket and enables versioning and encryption. Enabling versioning is good practice: in the event that a key is overwritten, the previous version is retained for better auditability. It’s also good practice for buckets like this to enable MFA Delete, for better protection against logs being wiped.

Lastly, we enable server-side encryption on the bucket. Optionally, a CMK can be used instead of the S3-provided encryption, at an additional cost and for further security. Bucket key encryption can be enabled or disabled; we’re choosing to turn it on here to save on encryption costs.

resource "aws_s3_bucket" "ssm_bucket" {
  bucket = "ssm-audit-logs"
}

resource "aws_s3_bucket_versioning" "ssm_bucket" {
  bucket = aws_s3_bucket.ssm_bucket.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "ssm_bucket" {
  bucket = aws_s3_bucket.ssm_bucket.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
    bucket_key_enabled = true
  }
}
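As mentioned above, MFA Delete is worth considering for a log bucket. Here is a sketch of what that could look like on the versioning resource. Note that MFA Delete can only be enabled by the root user presenting an MFA token, so this typically requires a one-off apply with root credentials; var.bucket_mfa is a hypothetical variable holding the device serial and current code:

```hcl
resource "aws_s3_bucket_versioning" "ssm_bucket" {
  bucket = aws_s3_bucket.ssm_bucket.id
  # "<mfa-device-arn> <current-code>", supplied at apply time
  mfa = var.bucket_mfa

  versioning_configuration {
    status     = "Enabled"
    mfa_delete = "Enabled"
  }
}
```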

This next section focuses on the ACL policies.

This one is pretty straightforward: we want the S3 bucket to be private and not publicly accessible at all. This is also a general requirement for SOC2 audits, as buckets should not be public unless absolutely necessary. In the case of this bucket, we’re storing SSM session logs, which can be sensitive depending on what is actually done on the jump box.

resource "aws_s3_bucket_acl" "ssm_bucket" {
  bucket = aws_s3_bucket.ssm_bucket.id
  acl    = "private"
}

resource "aws_s3_bucket_public_access_block" "ssm_bucket" {
  bucket = aws_s3_bucket.ssm_bucket.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

This next section is optional and really depends on how long one wants to retain logs in this bucket. For time-series logging like this, it’s generally good practice to at least transition objects to cheaper storage classes to save on S3 costs.

The setup below has a hard expiration of 90 days for current and noncurrent versions. After 30 days, the storage class changes to STANDARD_IA. Many examples online show a transition to Glacier after 60 days; there is a lot of room for variance here.

resource "aws_s3_bucket_lifecycle_configuration" "ssm_bucket" {
  bucket     = aws_s3_bucket.ssm_bucket.id
  depends_on = [aws_s3_bucket_versioning.ssm_bucket]

  rule {
    id = "ssm"

    filter {
      # Session logs are written under the "ssm/<account-id>" prefix
      prefix = "ssm/"
    }

    expiration {
      days = 90
    }

    noncurrent_version_expiration {
      noncurrent_days = 90
    }

    noncurrent_version_transition {
      noncurrent_days = 30
      storage_class   = "STANDARD_IA"
    }

    status = "Enabled"
  }
}
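If longer retention is desired, a Glacier transition can be layered into the same rule. A sketch:

```hcl
# Optional: also transition noncurrent versions to Glacier after 60 days,
# between the 30-day STANDARD_IA transition and the 90-day expiration
noncurrent_version_transition {
  noncurrent_days = 60
  storage_class   = "GLACIER"
}
```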

The final bit to configure here is the bucket policy. The bucket needs to be configured to allow the IAM Role of the EC2 instance to write logs to this bucket. This is required if the bucket is being configured in a separate AWS account from the jump box, which is recommended!

This policy is doing a number of different things, so let’s dive in and dissect each piece. The policy is explicitly set up with a separate stage role and production role (it could be more compact with loops, but it’s laid out below very plainly to show exactly what each piece is doing).

The stage_role and prod_role are the IAM role ARNs of the jump box’s EC2 instance role. That role has not been configured yet and is shown in the following sections where the instance is configured.

Statement: Encryption Config

This section is pretty simple. It allows the configured IAM Roles to read in the encryption configuration of the specified resources (in this case, the configured S3 bucket.)

Statement: SSMPutLogs_*

There are two sections here, one for stage, another for prod.

The EC2 instance role needs the ability to put objects and put object ACLs, nothing more.

The resources for each statement are scoped such that each role is only able to write to specific paths in the bucket. This path prefix is ssm/<account-id>, so the “stage” account has no ability to write to any path other than under its own account’s prefix. This allows a single bucket to be used for multiple AWS accounts that might have a jump box configured.

resource "aws_s3_bucket_policy" "ssm_bucket" {
  bucket = aws_s3_bucket.ssm_bucket.id
  policy = data.aws_iam_policy_document.ssm_bucket.json
}

data "aws_iam_policy_document" "ssm_bucket" {

  statement {
    sid    = "EncryptionConfig"
    effect = "Allow"

    principals {
      type = "AWS"
      identifiers = [
        var.stage_role,
        var.prod_role,
      ]
    }
    actions = [
      "s3:GetEncryptionConfiguration",
    ]
    resources = [
      aws_s3_bucket.ssm_bucket.arn,
    ]
  }
  statement {
    sid    = "SSMPutLogs_Stage"
    effect = "Allow"

    principals {
      type = "AWS"
      identifiers = [
        var.stage_role,
      ]
    }

    actions = [
      "s3:PutObject",
      "s3:PutObjectAcl",
    ]
    resources = [
      "${aws_s3_bucket.ssm_bucket.arn}${var.stage_prefix}",
      "${aws_s3_bucket.ssm_bucket.arn}${var.stage_prefix}/*",
    ]
  }

  statement {
    sid    = "SSMPutLogs_Prod"
    effect = "Allow"

    principals {
      type = "AWS"
      identifiers = [
        var.prod_role,
      ]
    }

    actions = [
      "s3:PutObject",
      "s3:PutObjectAcl",
    ]
    resources = [
      "${aws_s3_bucket.ssm_bucket.arn}${var.prod_prefix}",
      "${aws_s3_bucket.ssm_bucket.arn}${var.prod_prefix}/*",
    ]
  }
}

Setting up the EC2 Instance

There is a lot to configure here, so let’s dive in and talk about what all is necessary.

What will be set up:

  1. IAM Role + Policy for EC2 instance
  2. IAM Instance Profile based off the above IAM Role that is attached to the EC2 Instance
  3. Custom AWS Launch Template to configure EC2 Instance
  4. AWS Autoscaling Group that manages the EC2 instance to more easily handle restarts
  5. Security group ingress + egress to secure the instance at the networking layer.

Let’s dive in.

IAM Role + Policy

The below code sets up the IAM Role that will be attached to the EC2 instance.

We won’t go into too much detail here; this is pulled primarily from the AWS documentation here.

A few variables are required here though:

  1. log_bucket_arn
    1. This is the arn of the bucket that we set up a little bit ago.
  2. session_kms_arn
    1. This is the ARN of the KMS key we set up when the SSM Document was configured. The EC2 instance needs this to handle user-session encryption; without it, a user won’t be able to connect because the instance can’t decrypt the session data.

locals {
  bastion_launch_template_name = "ssm-bastion-lt"
  name_prefix                  = local.bastion_launch_template_name
}

resource "aws_iam_role" "bastion_host_role" {
  name               = "ssm-bastion-host"
  path               = "/"
  assume_role_policy = data.aws_iam_policy_document.assume_policy_document.json
}

data "aws_iam_policy_document" "assume_policy_document" {
  statement {
    actions = [
      "sts:AssumeRole"
    ]
    principals {
      type        = "Service"
      identifiers = ["ec2.amazonaws.com"]
    }
  }
}

data "aws_iam_policy_document" "bastion_host_ssm_policy_document" {

  statement {
    actions = [
      "ssm:UpdateInstanceInformation",
      "ssmmessages:CreateControlChannel",
      "ssmmessages:CreateDataChannel",
      "ssmmessages:OpenControlChannel",
      "ssmmessages:OpenDataChannel"
    ]
    resources = ["*"]
  }

  statement {
    actions = [
      "s3:PutObject",
      "s3:PutObjectAcl",
      "s3:GetEncryptionConfiguration",
    ]
    resources = [
      var.log_bucket_arn,
      "${var.log_bucket_arn}/*"
    ]
  }

  statement {
    actions = [
      "kms:Decrypt"
    ]
    resources = [
      var.session_kms_arn,
    ]
  }
}

resource "aws_iam_policy" "bastion_host_ssm_policy" {
  name   = "${local.name_prefix}-host-ssm-policy"
  policy = data.aws_iam_policy_document.bastion_host_ssm_policy_document.json
}

resource "aws_iam_role_policy_attachment" "bastion_host_ssm" {
  policy_arn = aws_iam_policy.bastion_host_ssm_policy.arn
  role       = aws_iam_role.bastion_host_role.name
}

resource "aws_iam_instance_profile" "bastion_host_profile" {
  role = aws_iam_role.bastion_host_role.name
  path = "/"
}

The EC2 Instance

Ok, next is the instance itself! This is the actual jump box that is used to tunnel the SSM sessions.

This is a modified and slimmed down version of the EC2 instance that is created from the popular bastion host module on the Terraform registry.

This instance uses the latest Amazon Linux 2 AMI, which comes with the SSM Agent preinstalled. The agent may need to be installed separately if not using Amazon Linux 2.
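On other distributions the agent can be installed manually. As a sketch, on Ubuntu Server it is available as a snap (check the AWS docs for your exact platform):

```shell
# Install and start the SSM Agent on Ubuntu Server
sudo snap install amazon-ssm-agent --classic
sudo snap start amazon-ssm-agent
```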

data "aws_ami" "amazon-linux-2" {
  most_recent = true
  owners      = ["amazon"]
  name_regex  = "^amzn2-ami-hvm.*-ebs"

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }
}

resource "aws_launch_template" "bastion_launch_template" {
  name_prefix            = local.name_prefix
  image_id               = data.aws_ami.amazon-linux-2.id
  instance_type          = "t3.nano"
  update_default_version = true
  monitoring {
    enabled = true
  }
  network_interfaces {
    associate_public_ip_address = false
    security_groups = [
      aws_security_group.bastion_host_security_group.id,
    ]
    delete_on_termination = true
  }
  iam_instance_profile {
    name = aws_iam_instance_profile.bastion_host_profile.name
  }

  tag_specifications {
    resource_type = "instance"
    tags          = merge(tomap({ "Name" = local.bastion_launch_template_name }), merge(var.tags))
  }

  tag_specifications {
    resource_type = "volume"
    tags          = merge(tomap({ "Name" = local.bastion_launch_template_name }), merge(var.tags))
  }

  lifecycle {
    create_before_destroy = true
  }

  metadata_options {
    http_endpoint               = "enabled"
    http_tokens                 = "required" # SOC2
    http_put_response_hop_limit = 1
    instance_metadata_tags      = "enabled"
  }
}

resource "aws_autoscaling_group" "bastion_auto_scaling_group" {
  name_prefix = "ASG-${local.name_prefix}"
  launch_template {
    id      = aws_launch_template.bastion_launch_template.id
    version = aws_launch_template.bastion_launch_template.latest_version
  }
  max_size         = 1
  min_size         = 1
  desired_capacity = 1

  vpc_zone_identifier = var.auto_scaling_group_subnets

  default_cooldown          = 180
  health_check_grace_period = 180
  health_check_type         = "EC2"

  termination_policies = [
    "OldestLaunchConfiguration",
  ]

  dynamic "tag" {
    for_each = var.tags

    content {
      key                 = tag.key
      value               = tag.value
      propagate_at_launch = true
    }
  }

  tag {
    key                 = "Name"
    value               = "${local.name_prefix}-${var.env}"
    propagate_at_launch = true
  }

  instance_refresh {
    strategy = "Rolling"
  }

  lifecycle {
    create_before_destroy = true
  }
}

Security Groups

Let’s next set up some security groups. This is pretty easy and minimal.

No ingress rules are needed, as no ports need to be exposed here. Nice!

One egress rule is required to allow outbound traffic on port 443, as this is the port that SSM uses to communicate.

resource "aws_security_group" "bastion_host_security_group" {
  description = "basic security group for bastion host"
  name        = "${local.name_prefix}-host"
  vpc_id      = var.vpc_id

  tags = merge(var.tags)
}

resource "aws_security_group_rule" "egress_https" {
  description = "Allow HTTPS outbound for SSM connections"
  type        = "egress"
  from_port   = "443"
  to_port     = "443"
  protocol    = "tcp"
  cidr_blocks = ["0.0.0.0/0"]

  security_group_id = aws_security_group.bastion_host_security_group.id
}

Using SSM to connect to the EC2 Instance

Finally! Wow, that was a lot of configuration. We can finally tunnel to our instance using SSM.

But wait! We need an IAM role that we can assume that will allow us to use SSM!

Here is a handy AWS Article that can help with that.

Configuring AWS CLI

The AWS CLI does not natively work with SSM by default, so a plugin is needed. Amazon has a pretty detailed article on installing it on the relevant platforms, found here.

On a Mac, it can easily be installed with Homebrew: brew install --cask session-manager-plugin
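On Linux, the plugin is distributed as a package. For example, on Ubuntu x86_64 (download URLs for other platforms are in the AWS article above):

```shell
# Download and install the Session Manager plugin (Ubuntu 64-bit)
curl "https://s3.amazonaws.com/session-manager-downloads/plugin/latest/ubuntu_64bit/session-manager-plugin.deb" -o session-manager-plugin.deb
sudo dpkg -i session-manager-plugin.deb
# Verify the install
session-manager-plugin
```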

After that, populate a terminal session with valid AWS credentials, then find the instance ID of the EC2 instance created above.

aws ssm start-session --target <id>

Voila! Tunneling successful.

Tunnel into an RDS Database

How do we use SSM to tunnel to an RDS Database?

A new egress rule will need to be opened on the EC2 instance’s security group. (Ideally, scope the CIDR blocks down to the database’s subnets rather than 0.0.0.0/0.)

resource "aws_security_group_rule" "egress_postgres" {
  description = "Allow connections to Postgres"
  type        = "egress"
  from_port   = "5432"
  to_port     = "5432"
  protocol    = "tcp"
  cidr_blocks = ["0.0.0.0/0"]

  security_group_id = aws_security_group.bastion_host_security_group.id
}

After that, it’s pretty straightforward: the IAM role being used will need permission to use the AWS-StartPortForwardingSessionToRemoteHost SSM document.
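That user-side permission is separate from the instance role configured earlier. A sketch of what it might look like, attached to the role that users assume (the resource ARNs can and should be scoped down further to specific instances and regions):

```hcl
data "aws_iam_policy_document" "ssm_user_tunnel" {
  statement {
    actions = ["ssm:StartSession"]
    resources = [
      # The jump box instance(s) - scope this down where possible
      "arn:aws:ec2:*:*:instance/*",
      # The AWS-owned port forwarding document (note the empty account id)
      "arn:aws:ssm:*::document/AWS-StartPortForwardingSessionToRemoteHost",
    ]
  }
  statement {
    actions = ["ssm:TerminateSession", "ssm:ResumeSession"]
    # Users may only manage their own sessions
    resources = ["arn:aws:ssm:*:*:session/$${aws:username}-*"]
  }
}
```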

Starting the tunnel:

aws ssm start-session --region <region> --target <ec2-instance-id> --document-name AWS-StartPortForwardingSessionToRemoteHost --parameters host="<rds-host>",portNumber="5432",localPortNumber="5432"

This starts a tunnel session and opens the local port specified by the localPortNumber parameter. Now psql or any other DB client can be used to connect to the remote database on the localhost port!
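For example, with the tunnel running, connecting with psql might look like this (the database name and user here are hypothetical):

```shell
# Connect to the remote RDS database through the local end of the tunnel
psql -h localhost -p 5432 -U app_user app_db
```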

Whether tunneling into the database is frequent or not, it can be a bit of a pain to find the EC2 instance ID of the jump box.

Below is a shell alias that can be used to easily automate this process. It specifically connects to the staging database, and it does so by looking for a specific tag on an ec2 instance with the name of ssm-bastion-lt-stage.

get_stage_tunnel() {
    INSTANCE_ID=$(aws ec2 describe-instances --filter "Name=tag:Name,Values=ssm-bastion-lt-stage" --query "Reservations[0].Instances[0].InstanceId" --output text)
    echo "Connecting to Jump Box: $INSTANCE_ID"
    aws ssm start-session --region us-west-2 --document-name AWS-StartPortForwardingSessionToRemoteHost --parameters host="<host>",portNumber="5432",localPortNumber="5433" --target "$INSTANCE_ID"
}
alias stage_tunnel='get_stage_tunnel'

Conclusion

That was a lot, but we now have a nice, Terraformed setup for configuring and managing AWS SSM, along with a basic jump box used for tunneling into a VPC and an RDS database. It’s also configured with session logging for auditability, as well as full encryption.

To view the entire code used in this blog post, check out the GitHub repo here.
