Disaster Indifference with AWS and Microsoft DirectAccess MultiSite

In this blog post, I will show you how to create a highly available, disaster-indifferent remote access architecture using Microsoft DirectAccess on AWS.

Everything fails all the time – Dr. Werner Vogels

While we can’t control when things will fail, we certainly can develop architectures that are resilient enough to handle those failures. In his book “Ahead in the Cloud”, Stephen Orban refers to architectures that are “disaster indifferent”. That is a great way to look at architecture, so that we move beyond mere disaster recovery to disaster indifference!

In this example, I will walk you through the setup on both AWS and the Microsoft DirectAccess servers. Nearly everything I’m going to show can be done through scripting or automation platforms like CloudFormation. I will show you the console steps just for clarity.

We will start with a fairly simple architecture across two AWS regions:

  • us-west-2 (Oregon)
  • eu-west-2 (London)

Prerequisites:

There are several prerequisites for this solution. The complete setup of these prerequisites is outside the scope of this post, but I will try to offer guidance where I can. This should not be construed as a production setup, just a proof of concept (POC) you can experiment with.

  1. An AWS VPC network configuration that spans multiple regions.
  2. AWS Security Groups for Domain Controllers, Domain Members, and DirectAccess Servers.
  3. A Microsoft Active Directory deployment in the environment.
  4. A simple 2-tier PKI deployment in the environment. (Microsoft Enterprise CA).*  
  5. Domain joined servers that will be our DirectAccess servers.

*This blog post: https://www.derekseaman.com/2014/01/windows-server-2012-r2-two-tier-pki-ca-pt-1.html is an excellent reference to get you started with a simple PKI deployment. Although it was written for Windows Server 2012 R2, the basics are the same for Windows Server 2016.

With those in place, we can get started.

DirectAccess Group Policy

Although it isn’t strictly required, I like to separate the DirectAccess servers inside of AD into their own OU. We can then set a simple GPO to make sure the IPv6 components are not disabled on the instance. Seems like a no-brainer, but some image templates may disable IPv6 and it is absolutely required for DirectAccess.

Add a registry preference item to:

HKLM\System\CurrentControlSet\Services\Tcpip6\Parameters
Value name: DisabledComponents
Value type: REG_DWORD
Value data: 0x0

We will see in later steps that Group Policy plays a big role in Microsoft DirectAccess.

Instance Configuration

I’ll start with a couple of standard EC2 instances deployed into the public subnets in each region. It doesn’t really do us a lot of good if we don’t have at least two that we can load balance.

DirectAccess servers require static IP addresses, so we will choose our IPs and specify them at EC2 launch time. Once the instance is up and running, change the network adapter configuration to the static IPs we chose earlier. There is a funny (well, not so funny until you do it) trick needed to create a DA load-balanced cluster inside of AWS, which we will get to when we enable load balancing.

Although DirectAccess supports other ports and protocols (Teredo and 6to4), for simplicity we will only use the IP-HTTPS adapter. Besides applying a standard domain member security group to the instance, you will need a security group specifically for the DA servers themselves.

At a minimum, the AWS security group should include:

  1. TCP/443 from either all addresses or an approved set of ingress addresses.
  2. TCP/80 from inside the VPC for ELB health checks.
  3. Allow all traffic between DA members.
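The three rules above can be sketched as data. This is a minimal illustration, not the exact configuration from my environment: the function name, the group ID, and the VPC CIDR are hypothetical placeholders. The dictionaries follow the `IpPermissions` shape that boto3 accepts.

```python
from typing import Any, Dict, List, Optional


def build_da_ingress_rules(
    da_group_id: str,
    vpc_cidr: str,
    client_cidrs: Optional[List[str]] = None,
) -> List[Dict[str, Any]]:
    """Return the three DA ingress rules in boto3 IpPermissions form."""
    # Fall back to "all addresses" when no approved client ranges are given.
    https_sources = client_cidrs or ["0.0.0.0/0"]
    return [
        # 1. IP-HTTPS traffic from clients.
        {
            "IpProtocol": "tcp",
            "FromPort": 443,
            "ToPort": 443,
            "IpRanges": [{"CidrIp": cidr} for cidr in https_sources],
        },
        # 2. Load balancer health checks from inside the VPC.
        {
            "IpProtocol": "tcp",
            "FromPort": 80,
            "ToPort": 80,
            "IpRanges": [{"CidrIp": vpc_cidr}],
        },
        # 3. All traffic between members of the DA security group itself.
        {
            "IpProtocol": "-1",
            "UserIdGroupPairs": [{"GroupId": da_group_id}],
        },
    ]


# Hypothetical identifiers, for illustration only.
rules = build_da_ingress_rules("sg-0123456789abcdef0", "192.168.0.0/16")
```

If you script this, the resulting list could be passed as `IpPermissions` to boto3's `authorize_security_group_ingress` call on an EC2 client, together with the `GroupId` of the DA security group.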

The EC2 instance size is something you will have to determine within your environment based on the number of expected clients. Microsoft provides some guidance with DA capacity planning: https://docs.microsoft.com/en-us/windows-server/remote/remote-access/directaccess/directaccess-capacity-planning

I would go ahead and create the computer objects inside of our DA server OU so they will get the right GPO out of the gate.

This will also be the place where we apply our DirectAccess Server GPOs.

DirectAccess Configuration

There are a few things we will need as we start our configuration of DirectAccess:

  1. One certificate per DirectAccess entry point, installed on each DA server. For this example we will use two certificates (da-us-west-2.site and da-eu-west-2.site).
  2. A place to host the network location server (NLS) website. This site must only be reachable from inside the corporate network.
  3. An Active Directory security group to hold our client computer objects.

On the first DA server (DA1), install the RemoteAccess\DirectAccess and VPN (RAS) role service. Once this is complete we can start the configuration. I don’t like the “Getting Started Wizard”, so I prefer to just launch the Remote Access Management tool and start the configuration from there.

We will only be deploying DirectAccess.

Now just go through the individual steps:

Step 1: Configure Clients

We will select the AD security group that will hold our clients. There are also some choices you can make here around force tunneling and a WMI filter that applies DirectAccess to mobile laptops only.

Pick some resources you can access once you are connected over DA.

Step 2: Remote Access Server Setup

We will deploy behind an edge device with a single network adapter.

Choose the certificate we provisioned earlier.

For our test environment we will use AD authentication (no two-factor) with computer certificates that chain up to our Root CA.

Step 3: Infrastructure Server Setup

In this step we add our NLS website (external to the DirectAccess servers).

Adjust the rest of the settings in this step to match your environment: DNS, the DNS suffix list, and management servers (e.g. SCCM or WSUS servers).

Final Step: Finish Configuration

We are ready to finish our configuration. I like to edit the GPO names so that they make sense. For a DirectAccess multi-site configuration, there will be a single client GPO and DA server GPO per entry point. You will need AD permissions to be able to create the GPOs in your domain.

Enabling External Load-Balancing

Step 1: First DA Server Configuration

I mentioned earlier this sort of funny step that you have to go through to enable load-balancing for your DA cluster. This only has to be done once per DA entry point or region.

  1. We already have the primary IP that we want to use for the DA server (e.g. 192.168.8.25). What we need to do is add a secondary private IP to the EC2 instance.
  2. On the EC2 instance, change the IP address on the network adapter to the secondary IP address.
  3. Reconnect to the EC2 instance using the secondary IP address.

Once we are logged in, open the Remote Access Management tool and select “Enable Load Balancing” using an external load balancer.

In the next step, under “Dedicated IP Addresses”, put in our original EC2 IP address (e.g. 192.168.8.25).

This will temporarily disconnect us from the RDP session, but just log back in using the original IP address. Once this is complete, you can remove the secondary private IP address from the EC2 instance.
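If you want to script the choice of that temporary secondary address, here is a minimal sketch. The function name and the 192.168.8.0/24 subnet are assumptions for illustration; only the 192.168.8.25 address comes from the example above. It skips the first four addresses and the last address of the subnet, which AWS reserves.

```python
import ipaddress
from typing import Iterable


def pick_secondary_ip(subnet_cidr: str, in_use: Iterable[str]) -> str:
    """Return the first subnet address that is neither reserved nor in use."""
    subnet = ipaddress.ip_network(subnet_cidr)
    taken = {ipaddress.ip_address(ip) for ip in in_use}
    # AWS reserves the first four addresses and the last address of a subnet.
    reserved = {subnet[i] for i in range(4)} | {subnet[-1]}
    for candidate in subnet:
        if candidate not in reserved and candidate not in taken:
            return str(candidate)
    raise ValueError(f"no free address left in {subnet_cidr}")


# Hypothetical subnet; 192.168.8.25 is the DA server's primary IP.
temporary_ip = pick_secondary_ip("192.168.8.0/24", ["192.168.8.25"])
```

The chosen address could then be attached to the instance's network interface with boto3's `assign_private_ip_addresses` and released again with `unassign_private_ip_addresses` once load balancing is enabled. In a real environment the `in_use` list would have to include every address already allocated in the subnet, not just the DA server.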

Step 2: Second DA Server Configuration

  1. Make sure the certificate for this entry point is loaded on the second DA server.
  2. Add the DirectAccess roles and features on the second DA server (but do not configure).

In the Remote Access Management console, we will add DA2 to the load-balanced cluster.

Enable DirectAccess MultiSite

From the Remote Access Management Console, enable Multisite. This will configure the first entry point in our DA multisite configuration (i.e. us-west-2).

Now we need to add our second entry point (eu-west-2).

We will add the second entry point DNS name.

We will select the certificate for the second entry point (eu-west-2).

Once this is complete, we should have two entry points defined for our enterprise.

Next we need a load-balanced cluster in eu-west-2 as well. We have to jump through the same hoops of adding a secondary private IP to the EC2 instance and swapping the private IP addresses. Just not fun…

Once that is complete, we can add the second server into our eu-west-2 cluster. Don’t forget to install the certificate on this machine as well. If you’re having trouble, check DNS, and since we are dealing with Microsoft Active Directory, be sure to give things enough time to replicate in the domain.

AWS ELB Setup:

At this point we have a load-balanced DA cluster with a pair of servers in each region. Now we need to add AWS ELBs to start getting client traffic.

  1. Create a simple HTML file (e.g. healthcheck.htm) and place it in the inetpub\wwwroot folder on each DA server.
  2. Create a new internet-facing AWS Network Load Balancer in our public subnets, listening on TCP/443.
  3. Point the health check at our health check HTML file and set the health check interval to 10 seconds.
  4. Register each DA server with the NLB.
  5. Create new alias records in Route53 that will be the entry points for our multisite configuration.
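The target group behind step 2 through step 4 can be sketched as a set of parameters. This is an illustration under assumptions: the function name, target group name, and VPC ID are hypothetical, and the HTTP health check on port 80 is inferred from the TCP/80 security group rule earlier in the post. The keys match the keyword arguments of boto3's `create_target_group` on an `elbv2` client.

```python
from typing import Any, Dict


def build_da_target_group_params(name: str, vpc_id: str) -> Dict[str, Any]:
    """Return create_target_group keyword arguments for one DA entry point."""
    return {
        "Name": name,
        "VpcId": vpc_id,
        # Client IP-HTTPS traffic is passed straight through to the DA servers.
        "Protocol": "TCP",
        "Port": 443,
        "TargetType": "instance",
        # The health check fetches the static page served by IIS on port 80.
        "HealthCheckProtocol": "HTTP",
        "HealthCheckPort": "80",
        "HealthCheckPath": "/healthcheck.htm",
        "HealthCheckIntervalSeconds": 10,
    }


# Hypothetical name and VPC ID, for illustration only.
params = build_da_target_group_params("da-us-west-2", "vpc-0123456789abcdef0")
```

With boto3 this could be used as `elbv2.create_target_group(**params)`, after which each DA server is registered with `register_targets`. You would repeat the same steps in eu-west-2 with that region's VPC.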

At the end, we should see two healthy EC2 instances that are ready to accept incoming client connections.

Client Testing:

We now need to prove out our architecture. For the most part, the Windows 10 client will select the closest entry point, although the exact algorithm that Microsoft uses is somewhat elusive. We need our client to be able to fail over transparently both in-region and across regions.

Chaos Monkey:

Our client is now connected through us-west-2 (which is logical based on my current location) and is currently on the server DA2.

We will start a continuous ping to one of the domain controllers and play chaos monkey and pull the DA2 server out from underneath the client. Once the ELB marks the server as unhealthy, we are redirected over to the DA1 server with only around 9 packets lost.
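If you capture the output of the continuous ping to a file, counting the lost packets can be automated. The sketch below assumes the standard English output of the Windows `ping` command ("Reply from ... bytes=..." for success, "Request timed out." and similar for a loss). The function name and the sample addresses are made up for illustration.

```python
from typing import Iterable, Tuple


def summarize_ping(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (total lost packets, longest run of consecutive losses)."""
    total_lost = 0
    longest_run = 0
    current_run = 0
    for raw in lines:
        line = raw.strip()
        if line.startswith("Reply from") and "bytes=" in line:
            # A successful echo reply ends the current outage.
            current_run = 0
        elif (
            "timed out" in line
            or "unreachable" in line
            or "General failure" in line
        ):
            total_lost += 1
            current_run += 1
            longest_run = max(longest_run, current_run)
        # Header, blank, and statistics lines are ignored.
    return total_lost, longest_run


# Made-up sample output, for illustration only.
sample = [
    "Pinging 192.168.8.10 with 32 bytes of data:",
    "Reply from 192.168.8.10: bytes=32 time=25ms TTL=127",
    "Request timed out.",
    "Request timed out.",
    "Reply from 192.168.8.10: bytes=32 time=26ms TTL=127",
    "Request timed out.",
    "Reply from 192.168.8.10: bytes=32 time=25ms TTL=127",
]
lost, longest = summarize_ping(sample)
```

The longest run of consecutive losses is the number to watch here, since it corresponds to the single outage while the load balancer marks the server unhealthy and the client reconnects.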

We can see that our client has now failed over to DA1.

Chaos Gorilla:

Let’s start the same continuous ping to the domain controller and now let’s play chaos gorilla and take down the whole region.

It took a little longer to fail over. Notice the increase in latency now that the connections are going through eu-west-2.

I hope you found this exercise useful. Don’t hesitate to reach out with any questions you might have.

Cloud on!
