Connect AWS and Azure with a Private Endpoint

Goal

Enable non-public connection between Azure and AWS.

Restrictions

  • Connections cannot be allowed over the internet via public addresses
  • Bi-directional communication is not needed, as connections only go from AWS to Azure

10,000 foot view

  • A Virtual Network in Azure containing the resources that are to be read from and written to
  • A Virtual Private Cloud in AWS containing the resources that are going to read from and write to Azure resources
  • A private connection between the Azure and AWS clouds

Azure – Part 1

Here is a list of resources that are going to be created as part of the Azure side of the equation:

Required Networking Resources:

  • Virtual Network
  • Virtual Network Gateway
    • Public IP Address
  • Local Network Gateway
  • Private DNS Zone
  • DNS Private Resolver
  • Private Endpoint
    • Network Interface
  • Connection

Optional/Testing Resources:

  • CosmosDB

I will walk through the creation of these resources via the Azure Portal, but a Terraform definition can also be found here.

Virtual Network

Let’s start by creating the Azure Virtual Network.
First, choose your subscription, your Resource Group (or create one if necessary), a name for the resource, and the region you want it in. Then click Next.

Creating an Azure Virtual Network in the Azure Portal. Basic Tab.

On the “Security” tab, everything will be left off for now, but you should evaluate each option as necessary for production deployments of this pattern. Click Next.
For the “IP addresses” tab an IPv4 address space needs to be chosen. For the sake of this blog post, I will use nearly the default values, but your specific values may need to be different based on your network configuration. The specific address space you choose will be important in the AWS section, so note it down exactly. In the case below, the CIDR block “10.0.0.0/16” should be noted. Click Next.

Creating an Azure Virtual Network in the Azure Portal. IP Addresses Tab.

There is no need to add tags for this example, but I suggest always having good tags on your resources to make managing them easier. Click Review + create and then click Create.

Once that is created, navigate to the resource and click “Subnets” on the left side panel and then click “+ Gateway subnet”. Again, the defaults are acceptable for the example here, but your configuration may have extra requirements or restrictions. There is now a Subnet called “GatewaySubnet” with the IPv4 address space of 10.0.1.0/24.

Virtual Network Gateway

With the Virtual Network created, a Virtual Network Gateway can now be created. First on the Basics tab, you will choose your Subscription. The Resource Group will be filled in when the Virtual Network that the Gateway is for is selected, so next fill in the Name and Region fields. For this setup, the value “VPN” will need to be used for the Gateway type, the “VpnGw2AZ” SKU, and the “Generation2” Generation. Select the Virtual Network that was just created and it should automatically select the GatewaySubnet that was created as well. Lastly, a public IP is required for the Gateway, so create one now, give it a name, and select the “Zone-redundant” Availability Zone and disable both “Enable active-active mode” and “Configure BGP”. Click Review + create and then Create since tags are not included in this example.

Creating an Azure Virtual Network Gateway in the Azure Portal. Basics Tab.

Once this has finished creating, which may take some time, note the IP address of the newly created “Public IP address” resource. In the case of this example, it was 52.250.33.119. There is nothing more to do in Azure until the AWS side has been set up using the information obtained from Azure.

AWS – Part 1

And here is a list of resources that are going to be created as part of the AWS side of the equation:

Required Networking Resources:

  • VPC Resources:
    • Virtual Private Cloud
    • Subnet
    • Route Table
    • Network ACL
    • Security Group
    • Customer Gateway
    • Virtual Private Gateway
    • Site-to-Site VPN Connection
  • Route53 Resources:
    • Outbound Endpoint
    • Rule

Optional/Testing Resources:

  • Lambda Resources:
    • Lambda Function

Virtual Private Cloud

First in AWS is the VPC. To keep things as simple as possible, select only “VPC only” for resources to create. Then give the VPC a name (recommended but technically optional) and choose IPv4 CIDR manual input. For the IPv4 CIDR input a similar block to the Azure IPv4 address space will be used for simplicity, but it is not required and your situation may require a different CIDR than the one chosen here, which is 10.1.0.0/16. There is no need for an IPv6 CIDR block, and Tenancy should be left as “Default”. Click Create VPC.

Creating an AWS VPC in the AWS Console.
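
The VPC range does not have to resemble the Azure range, but the two should not overlap, since the route added later sends the whole Azure range through the VPN. Below is a minimal sketch of that check using Python's standard `ipaddress` module. The constant and function names are my own, and the CIDR values are the ones from this post, so substitute yours.

```python
import ipaddress

# Address spaces used in this post; replace with your own values.
AZURE_VNET = ipaddress.ip_network("10.0.0.0/16")
AWS_VPC = ipaddress.ip_network("10.1.0.0/16")


def ranges_conflict(a, b):
    """Return True when the two networks share at least one address."""
    return a.overlaps(b)


if __name__ == "__main__":
    if ranges_conflict(AZURE_VNET, AWS_VPC):
        print("Address spaces overlap; pick a different CIDR for one side.")
    else:
        print("Address spaces are disjoint.")
```

With the values from this post the two /16 blocks are disjoint. Something like 10.0.1.0/24 on the AWS side would conflict with the Azure range.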

Subnet

Next, a Private Subnet needs to be created. Under the VPC service, click Subnets in the left panel, then click Create subnet. Choose the VPC that was just created from the dropdown, then give the Subnet a good name and choose a preferred Availability Zone, or No preference if it doesn’t matter, as is the case for us here. The IPv4 CIDR block should be populated based on the VPC selected at the top of the page. Now a CIDR block for the Subnet itself needs to be chosen. Mimicking the setup in Azure again, this example uses 10.1.0.0/24. Tags are optional but recommended for real-world uses.

Creating an AWS Subnet in the AWS Console.

Network Access Control List

After creating the Subnet, there should now also be a Network ACL. For this example, the Inbound and Outbound rules, and Subnet associations, are all sufficient to prove functionality, but should be made as strict as possible for real usage.

Security Group

Similarly with the Security Group, the one currently associated to the VPC has Inbound and Outbound rules that are sufficient for this example but should be made as strict as possible for real usage. The only suggested change would be to add a helpful name for easier management.

Customer Gateway

Now a Customer Gateway needs creating. Provide a helpful name and put the Public IP Address that was obtained when the Azure Virtual Network Gateway was created in for IP address. All the other values should be left default or blank. Click Create customer gateway.

Creating an AWS Customer Gateway in the AWS Console.

Virtual Private Gateway

Very simply give the Virtual Private Gateway a helpful name and leave the ASN as “Amazon default ASN”. Click Create virtual private gateway.

Creating an AWS Virtual Private Gateway in the AWS Console.

Once created, attach the virtual private gateway to the VPC that was recently created.

Route Table

A route table is automatically created for a VPC so it can simply be edited. Click Add route and put in the exact IPv4 Address space that was selected in Azure. In this example it is 10.0.0.0/16 (ensure the IPv4 address and the network identifier bits are exactly correct). Under Target select Virtual Private Gateway and then choose the recently created Virtual Private Gateway. Click Save changes.
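
To see why the CIDR has to be exactly right, here is a rough sketch of how a route table chooses a target: the most specific (longest prefix) route containing the destination wins. The `ROUTES` list is a hypothetical model of this example's table after the edit, not something exported from AWS, and the target labels are just strings I made up.

```python
import ipaddress

# Hypothetical model of the VPC route table after adding the Azure route.
ROUTES = [
    ("10.1.0.0/16", "local"),                    # the VPC's own range
    ("10.0.0.0/16", "virtual-private-gateway"),  # Azure Virtual Network range
]


def pick_target(dest_ip, routes=ROUTES):
    """Return the target of the most specific route containing dest_ip."""
    ip = ipaddress.ip_address(dest_ip)
    matches = []
    for cidr, target in routes:
        network = ipaddress.ip_network(cidr)
        if ip in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None  # no route: the traffic is dropped
    return max(matches)[1]


if __name__ == "__main__":
    print(pick_target("10.0.2.4"))   # an address inside the Azure range
    print(pick_target("10.1.0.25"))  # an address inside the VPC
```

An address in the Azure range goes to the Virtual Private Gateway, while one inside the VPC stays local. If the Azure CIDR were mistyped, Azure addresses would match no route at all.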

Site-to-Site VPN Connection

Now navigate to Site-to-Site VPN connections and click Create VPN connection. Choose a good name, then select the radio option for Virtual private gateway, and then select the newly created Virtual Private Gateway. Next select the radio button for existing Customer gateway and select the recently created Customer Gateway. Select the radio button for the Static routing option, enter the IPv4 address space from the Azure Virtual Network that was saved earlier (in this case, 10.0.0.0/16), and all the other fields can be left as default or blank. Click Create VPN connection.

Creating an AWS Site-to-Site VPN Connection in the AWS Console.

After clicking Create VPN connection, select the newly created VPN Connection from the list and click the “Download configuration” button at the top above the table. Select the “Generic” vendor and then change the IKE version to ikev2. Click Download.

Outbound Endpoint

Now switch to the Route53 service and click “Outbound endpoints” from the left panel. Click Create outbound endpoint. Give the endpoint a name and select the VPC that was created along with the desired Security Group associated with it. The Endpoint Type should be IPv4 and Do53 should work as the Protocol for this endpoint. Next, for IP Address, both IP Address #1 and #2 can have similar settings. The Availability Zone shown may differ for you: “No preference” was selected earlier when creating the Subnet, and “us-east-1c” is apparently the zone AWS picked as a result. After that, choose the Private Subnet and choose to use an IPv4 address that is selected automatically. Click Create outbound endpoint.

Creating an AWS Outbound Endpoint in the AWS Console. General Settings.
Creating an AWS Outbound Endpoint in the AWS Console. IP Addresses.

AWS setup is nearly finished. But before it can be finished, Azure needs to be finished.

Azure – Part 2

Local Network Gateway

Back in Azure, create a Local Network Gateway. Choose your subscription, Resource group, and desired Region. Give the Local Network Gateway a name and choose “IP address” for the Endpoint. The IP address to fill in is the “Outbound IP address” of Tunnel 1 in AWS. To find it, go to VPC -> Virtual Private Network (VPN) -> Site-to-Site VPN connections, select the VPN connection that was created, and click the Tunnel details tab. Then fill the Address Space(s) with the AWS VPC address space, which in this case is 10.1.0.0/16 as was specified when creating the VPC.

Creating an Azure Local Network Gateway in the Azure Portal. Basic Tab.
Corresponding location in the AWS Site-to-Site VPN Connection for the Azure Local Network Gateway IP Address field.

Click Next: Advanced and ensure that Configure BGP settings is set to No. Click Review + create and then Create.

DNS Private Resolver

Create a new DNS Private Resolver resource. As usual, choose your Subscription and Resource group along with a Name for the DNS Private Resolver. The Region is more important this time, as the DNS Private Resolver must reside in the same Region as the Virtual Network, so choose the same Region your Virtual Network is in and then choose the Virtual Network from the drop down. Click Next: Inbound Endpoints. Click Add an endpoint and give your endpoint a name and add a new Subnet by clicking Create new. The default Subnet address range was acceptable for me, which was 10.0.2.0/28, and I named this Subnet Inbound. For IP address assignment, choose Static. I simply chose the first available IP in the Inbound Subnet: a /28 contains 16 addresses (10.0.2.0 – 10.0.2.15), and Azure reserves the first four (10.0.2.0 – 10.0.2.3) as well as the last one in every subnet, so I chose 10.0.2.4 as the Static IP for this endpoint. Click Save to add the endpoint, click Review + create, and then click Create to finish making the DNS Private Resolver.

Creating an Azure DNS Private Resolver in the Azure Portal. Basics Tab.
Creating an Azure Inbound Endpoint within the Azure DNS Private Resolver in the Azure Portal.
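
The subnet arithmetic behind the static IP choice can be double-checked with Python's `ipaddress` module. This is only a sketch: the helper name is mine, and it hard-codes the Azure convention that the first four addresses and the last address of every subnet are reserved.

```python
import ipaddress

AZURE_RESERVED_AT_START = 4  # network address plus three Azure-reserved addresses


def first_usable_azure_ip(cidr):
    """Return the first address in the subnet that Azure will hand out."""
    subnet = ipaddress.ip_network(cidr)
    return subnet.network_address + AZURE_RESERVED_AT_START


if __name__ == "__main__":
    inbound = ipaddress.ip_network("10.0.2.0/28")
    print(inbound.num_addresses)                 # 16
    print(inbound[0], inbound[-1])               # 10.0.2.0 10.0.2.15
    print(first_usable_azure_ip("10.0.2.0/28"))  # 10.0.2.4
```

For the Inbound Subnet in this example that gives 10.0.2.4, which is the address the AWS rule will point to later.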

CosmosDB

For this example CosmosDB will be used as the target of the Private Endpoint and connection from AWS to Azure. Many Azure resources support Private Endpoint so this is not limited to just CosmosDB. Create a CosmosDB and choose Create for Azure Cosmos DB for NoSQL. Select your Subscription and Resource group, then choose a name (CosmosDB only allows lowercase letters). Also for this example, Availability Zones are not needed, so select Disable. For Location, stick with the Region all of the other resources are in, which is (US) West US 2 in this example. For Capacity mode select Provisioned throughput, for Apply Free Tier Discount choose Apply, and uncheck Limit total account throughput. Click Next: Global Distribution.

Creating an Azure CosmosDB in the Azure Portal. Basics Tab.

Ensure all the available radio buttons are set to Disabled, again because this is just an example. Click Next: Networking. Leave everything as default here, but notice that Private endpoint is an option for Connectivity method, as that is what will be set once this has been created. Click Next: Backup Policy. The default values are fine for this example except the Backup storage redundancy should be lowered to Locally-redundant backup storage. Click Review + create and then click Create.

Creating an Azure CosmosDB in the Azure Portal. Global Distribution Tab.
Creating an Azure CosmosDB in the Azure Portal. Backup Policy Tab.

Private Endpoint

Once the CosmosDB has been created, create a new Private Endpoint. Select your Subscription and Resource group, then provide a Name for the Private Endpoint. It will automatically fill in the Network Interface Name for you. The Region should be the same as the Resource Group. Click Next: Resource.

Creating an Azure Private Endpoint in the Azure Portal. Basics Tab.

For the Resource, select the radio option for Connect to an Azure resource in my directory. Choose your Subscription as usual. For Resource type, you can get a glimpse of the many Azure resources that support Private Endpoints, but for this example, filter the list using “cosmos” and select “Microsoft.AzureCosmosDB/databaseAccounts”. Select the newly created CosmosDB and the target sub-resource should fill in automatically to Sql. Click Next: Virtual Network.

Creating an Azure Private Endpoint in the Azure Portal. Resource Tab.

Here the Virtual Network should be selected and the Subnet should be left as default. Network policy for private endpoints can be left as Disabled for this example. Private IP configuration can be left as Dynamically allocate IP address and no Application security group is needed. Click Next: DNS.

Creating an Azure Private Endpoint in the Azure Portal. Virtual Network Tab.

Ensure that Integrate with private DNS zone is set to Yes and the Subscription and Resource group values are acceptable. This will create the Private DNS zone for us. Click Next: Tags, then Next: Review + create, and then Create.

Creating an Azure Private Endpoint in the Azure Portal. DNS Tab.

Connection

The final Azure resource is a Connection from the Local Network Gateway. Navigate to the Local Network Gateway resource and click Connections under the Settings area on the left side panel. Click Add. Choose your Subscription and Resource group. Then change Connection type to Site-to-site (IPsec) and provide a name for the Connection resource. The Region should be the same as the Resource Group’s Region. Click Next: Settings.

Creating an Azure Connection in the Azure Portal. Basics Tab.

Here, bring up the configuration document that was downloaded from AWS after the Site-to-Site VPN Connection was created. Choose the Virtual and Local Network Gateways and then copy and paste the Pre-Shared Key (PSK) from the IPSec Tunnel #1 section of that document. At the time of this post it is line 36. All other values can be left blank or default. Click Review + create and then Create.

Creating an Azure Connection in the Azure Portal. Settings Tab.

AWS – Part 2

Rule

The final resource to create is back in AWS. Navigate to the Route53 service and click Rules under the Resolver section in the left-side panel. Click Create rule. Provide a name for the outbound rule, ensure the Rule type is Forward, and choose the VPC and the Outbound endpoint. The value for Domain name depends on the resource that is connecting with the Azure Private Endpoint. Use this page, looking at the “Public DNS zone forwarders” column, to find the value you need to put here. Since this example uses CosmosDB, the value needed is documents.azure.com for Domain name. Also, very importantly, end the value for Domain name with a “.” for it to be properly configured. Lastly, the IPv4 address for the Target IP addresses is going to be the IP address that was chosen for the Azure DNS Private Resolver Inbound Endpoint. In this case, 10.0.2.4 was chosen and the other values should be left as default. Click Submit.

Creating an AWS Rule in the AWS Console.
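
Conceptually, the forward rule says: any DNS query for this domain or one of its subdomains goes to the target IP instead of the default resolver. The sketch below models that behaviour in plain Python. It is not AWS code, and the `FORWARD_RULES` dictionary and the account name `myaccount` are hypothetical.

```python
# Hypothetical model of the rule created above:
# domain name (with trailing dot) -> target IPs the query is forwarded to.
FORWARD_RULES = {"documents.azure.com.": ["10.0.2.4"]}


def rule_matches(query_name, rule_domain):
    """True when query_name equals rule_domain or is a subdomain of it."""
    query = query_name.rstrip(".").lower()
    domain = rule_domain.rstrip(".").lower()
    return query == domain or query.endswith("." + domain)


def forward_targets(query_name, rules=FORWARD_RULES):
    """Return the target IPs of the most specific matching rule, or None."""
    matching = [d for d in rules if rule_matches(query_name, d)]
    if not matching:
        return None  # falls through to the VPC's default resolver
    most_specific = max(matching, key=lambda d: len(d.rstrip(".")))
    return rules[most_specific]


if __name__ == "__main__":
    print(forward_targets("myaccount.documents.azure.com"))
    print(forward_targets("example.com"))
```

A query for the CosmosDB account name is sent to the Inbound Endpoint at 10.0.2.4, where Azure can answer with the Private Endpoint's address. Anything outside documents.azure.com is resolved as before.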

Testing

Sanity Check

Now that everything is created, a cursory check can be performed to ensure everything is working before going on to the real test. In Azure, navigate to the Connection resource and view the Overview blade. Look for the “Data in” property in the Essentials table. It should have a non-zero amount of Data coming in. Then in AWS, navigate to the VPC service and click Site-to-Site VPN connections under the Virtual private network (VPN) section on the left-side panel. Select the VPN connection, and then click the Tunnel details tab. The Status for Tunnel 1 should be Up in green with a check mark.

Smoke Test

Now the expected networking functionality can be tested with a real end-to-end test where an AWS Lambda Function in the Private Subnet of the VPC attempts to open a connection to the Azure CosmosDB database and read an entry. In Azure CosmosDB, navigate to the Networking blade under Settings from the left-side panel. Select the Selected networks radio button option and click Add my current IP under Firewall. Check the box for Allow access from Azure Portal and click Save. This operation may take some time, but the CosmosDB needs to be populated with sample data to prove the connection works. This change will be reverted after the values that prove the successful connection are copied.

Once that has finished updating, navigate to the Data Explorer blade from the left-side panel. Click Launch quick start and click Next through the various prompts until you can click Create container. When the process is complete, click Items and click any one item from the table. Copy the JSON contents, as they will be needed later to verify the results. Once the value is copied locally, undo the Network access that was just granted, to prove that the public connection is not what is being used to reach the CosmosDB. The easiest way to do this is to set the Public network access radio option to Disabled on the Networking blade.

To create a Lambda that can test the connection follow these instructions for “Creating .NET projects using the .NET CLI”. Then update the default files using this gist. Make sure to update the cosmosItemId and cosmosCategoryId variables with the “id” and “categoryId” obtained from the sample item in CosmosDB. When ready, deploy the Lambda using the instructions for “Deploying .NET projects using the .NET CLI”. If the lambda-deploy tool cannot automatically deploy the zip to AWS Lambda for you, you can always take the generated zip file and manually create a Lambda, using that generated zip as the Code source.

Once that Lambda is deployed, go to the Configuration tab of the Lambda Function and select Environment variables from the left-side panel. Edit the Environment variables to add one with the Key COSMOS_CONNECTIONSTRING. The Value comes from Azure CosmosDB: open the Keys blade under the Settings section and reveal the PRIMARY CONNECTION STRING value. Click Save. Then go to the VPC section and click Edit. Select your VPC from the dropdown, then select your Subnet and security group. Click Save.

Once the Lambda Function has finished updating, click the Test tab and scroll down to edit the Event JSON. Replace the default text with “test”, including the quotes, and click Test.

The result should look like the picture below. Copy out the contents and compare the result to the value that you saved from the CosmosDB Database and make sure that the results are as expected.

Success result in the Lambda Test Tab.
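
If the test fails, one extra check is whether the CosmosDB hostname resolves to a private address from inside the VPC. The Lambda in this post is .NET, so the following Python is just a standalone sketch of the idea. The function names are my own, and it assumes the usual `AccountEndpoint=...;AccountKey=...;` layout of the connection string.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def endpoint_host(connection_string):
    """Pull the hostname out of a CosmosDB connection string."""
    parts = dict(
        segment.split("=", 1)
        for segment in connection_string.split(";")
        if "=" in segment
    )
    return urlparse(parts["AccountEndpoint"]).hostname


def resolve(host):
    """Resolve host to a sorted list of IP address strings (needs network access)."""
    infos = socket.getaddrinfo(host, 443, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def all_private(addresses):
    """True when there is at least one address and every one is private."""
    return bool(addresses) and all(
        ipaddress.ip_address(address).is_private for address in addresses
    )
```

Run from something inside the VPC, `all_private(resolve(endpoint_host(connection_string)))` should be true once the Route53 rule is in place: the name resolves to an address in the Azure Virtual Network range (10.0.0.0/16 here). A public address means DNS is still being answered outside the tunnel.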

Info Dump

Getting this up and running took at least 4 attempts with various guides, a fruitless call with Microsoft, and at least 3 different Teams calls with as many different groups of co-workers. Although that last one is likely more an indication that we don’t have the right person working with us than it is an indication that anyone I called didn’t know what they should. Also, networking is deep arcane magic to me. My brain refuses to understand anything beyond the simplest networking concepts, so I was walking into this with both hands tied behind my back from the word go.

Attempt #1

The first link I found was someone trying to connect Azure SQL, with a private endpoint, to AWS Lambda. I thought I hit the jackpot right off the bat. Unfortunately, as I was reading the accepted answer, I realized that I didn’t understand many of the concepts that were being described as matter-of-fact. Partly, again, because of my networking handicap, but also because this solution is tied up with AWS which I have not been actively using recently. The first and third bullet points of the answer, though, both mention private IP addresses for the Lambda Service, which I still don’t understand, as that is not how Lambdas work with IP addresses, as far as I can tell anyway. So I went ahead and said that I would figure out the first and third bullet points after I did the second one because that one had another link to a tutorial on learn.microsoft.com!

I was able to follow each step in that tutorial, easy-peasy. It is well written and even the AWS parts are accurate enough and essentially complete. So then I had an environment configured in Azure and in AWS. I could even confirm in both clouds that there was communication going through the tunnel from AWS to Azure and vice-versa. Now I needed to come back to the “Lambda private IP” concept that I didn’t understand. I figured since I didn’t know which IP address was the “Lambda private IP”, I would just basically whitelist any IPs I could find from AWS on the CosmosDB resource. When that didn’t work, I realized I also didn’t even know how to tell if any of the tests I was running on the Lambda were even going through the tunnels into Azure. I spent another few days on that, bringing in other people to help, searching for anything I could find on “how to use a private endpoint”, and even talking to Microsoft. None of those ultimately shed any new light, with no thanks to me not even being able to fully articulate the problem and what I need to happen since this may as well be an alien language to me. Ok time to give up here.

Attempt #2

Someone who was helping me, who also did not have any specific answers to any of this, said screw it and asked ChatGPT. They sent me the steps that ChatGPT decided were definitely the right answer, and even I said whatever, I have nothing to lose here, even though I had no confidence it would work. Here is what ChatGPT sent us:

Certainly! Establishing a secure connection between an AWS Lambda function and an Azure Cosmos DB involves configuring network settings in both AWS and Azure. Let’s break down the steps:

  1. Azure Cosmos DB Configuration:
    • Create a Private Endpoint for your Azure Cosmos DB account. Private endpoints allow you to access Cosmos DB through private IP addresses within your virtual network.
    • Configure Private DNS Zones (if needed) to resolve the Cosmos DB endpoint using a custom domain name within your virtual network.
    • Limit Access via Private IP Addresses: With Private Link, you can restrict access to your Cosmos DB account over private IP addresses.
    • Combine with Network Security Groups (NSGs): Use restrictive NSG policies to further reduce the risk of data exfiltration.
  2. AWS Lambda Configuration:
    • Create or Use an Existing VPC: Set up a Virtual Private Cloud (VPC) in AWS. You can create a new VPC or use an existing one.
    • Internet Gateway and Elastic IP:
      • Create an Internet Gateway to allow communication with the internet from within your VPC.
      • Configure a Public Subnet and add a route to the route table that routes to the Internet Gateway (0.0.0.0/0).
      • Create a new Elastic IP address.
      • Set up a NAT Gateway and assign it to the Public Subnet and the Elastic IP address.
      • Create a Private Subnet and add a route to the route table that routes to the NAT Gateway (0.0.0.0/0).
    • Configure Lambda Function:
      • Ensure your Lambda function runs within the VPC you created.
      • Whitelist the assigned Elastic IP (from the NAT Gateway) in your Azure Cosmos DB firewall rules.
  3. Connect Lambda to Cosmos DB:
    • Obtain the Cosmos DB connection string (including the account URL and keys) from the Azure Portal.
    • In your Lambda function code, use this connection string to connect to your Cosmos DB account.

Ok, great. So I started following the directions, filling in the blanks since they are obviously not extensive and complete. Create a CosmosDB and a Private Endpoint, so far so good. It is assuming a Virtual Network, fair enough, make one of those. Configure Private DNS Zones (if needed)? Whether that is needed is precisely what I am asking! Limit Access and NSGs are irrelevant here since they are restrictive measures and I can’t even get a connection before restrictions, so I ignored those. The Azure side is set up, I guess? It seems a little sparse though. Oh well, moving on to AWS.

Create a VPC, yup that’s right so far. Internet Gateway and Elastic IP, plus all the Subnets, NAT Gateway, and proper associations. It was a little confusing for me but I was able to do that too. Assign the Lambda to the VPC, yup that sounds right too. Oh whitelist the assigned Elastic IP in Azure CosmosDB? Maybe that is what I was missing in Attempt #1? Wait a minute. This solution doesn’t even use the Private Endpoint, Virtual Network, or any of those in Azure! It literally just connects via the internet to CosmosDB and has CosmosDB whitelist that public IP that the Lambda will be using!

Well I had a hunch that ChatGPT would be wrong, but I didn’t expect it to try to sleight-of-hand me by having me create the resources I wanted to use and then just totally miss the point and use the public internet anyway. Delete all that and move on.

Attempt #3

Actually, this wasn’t even really an attempt. More like a literal sanity check to make sure that I can even make the simplest version of this work at all. I created a CosmosDB in Azure. I left the networking to “Public all networks”. That’s it. That’s the attempt. I then confirmed I could connect to that CosmosDB from my local machine and from an AWS Lambda (with a public IP of course). Thankfully, that did, in fact, work. I guess time to search even more and find something that will get me in the right direction. Maybe go broader.

Research Break

After setting up some of these resources at least 3-4 times by this point, I felt that I was familiar enough to start finding out exactly what they are for and what they do and maybe be able to reason from there how this connects. I found this page about Name resolution for resources in Azure virtual networks which one of my co-workers suspected was the issue from Attempt #1. I found the scenario that described my goal in the table and the solution column said Azure DNS private zones and Azure DNS Private Resolver. Making the Private Endpoint already auto-created a DNS Private Zone so maybe I also needed an Azure DNS Private Resolver?

I started digging into that and found the page What is Azure DNS Private Resolver? which helped provide some context about that resource and then led me to the page Azure DNS Private Resolver endpoints and rulesets. In this page it talks about Inbound Endpoints, which at the time were just more vague terms, but will become important soon. I went back to Azure Private DNS Zones, a page which showed that they work very simply, but only when resolving within the Azure Virtual Network. At the time, I figured that the peering from AWS to Azure would be as if they were within the Azure VNet, so I kept the page for reference. I found a few more links that looked promising for creating the network I wanted so felt it was time to try again.

Attempt #4

At this point I didn’t even care what resource in AWS connected with what resource in Azure. I just wanted them to connect. Let me preface the next two links by saying I appreciate the authors making the posts, as each had information that helped me ultimately find a solution. I first found The Step-By-Step Guide to Connect Aws with Azure by vineetyadav97 and then the related AWS Site-to-Site VPN Connection by javedkhan0749. I was able to use the network diagram, the general idea, and the use of a Private Azure VM and a Private AWS EC2 from the post by vineetyadav97, combined with the specific inputs and images from the post by javedkhan0749, to get very close to a solution. I did not know how to bastion from a public EC2 into a private EC2 in AWS, so I also used How do I use a bastion host to securely connect to my EC2 Linux instance in a private subnet? which was very clear and simple. But even then, I could only get the EC2 and VM to ping successfully when using the IP address. It still couldn’t resolve by name.

I kept looking at the Microsoft doc Configure Azure Private Link for an Azure Cosmos DB account but it only shows how to make a private endpoint, not how to use one. I went back and checked the AWS resources by referencing against Getting started with AWS Site-to-Site VPN but did not see anything that was misconfigured. It seemed that I was very close to something that would work, but I couldn’t figure out what. I had the connection, but only with IPs. I had finally gotten in touch with another co-worker who had much more of a background in networking and was able to sit down and help figure out what part of the network wasn’t working or, as it turns out, wasn’t there but should be.

She was actually the one to look more closely at the Azure DNS Private Resolver and in particular the Inbound Endpoints. When one of those is configured, it receives a private IP. At this point, while talking to her, she had identified that there is a confirmed connection between AWS and Azure, and the DNS resolution should now be completely set up in Azure, but how was AWS supposed to resolve anything, like the CosmosDB URI for example? I had thought that the Virtual Network having that confirmed connection with AWS would have allowed it to use the DNS resolution from Azure (shows what I know about all this even at this point). She said that it wouldn’t work like that and we needed a way to describe to AWS where it can go to resolve these specific URIs. That is where the final piece of the puzzle comes in.

AWS Route53 Outbound Endpoints and Rules. She just created an Inbound Endpoint in Azure for the DNS Private Resolver, so it stood to reason we needed something outbound in AWS to say where it should look for DNS resolution. As it turns out, the Outbound Rule has an input for the Domain that it needs to resolve and a target IP address that it should go to for resolution. The Inbound Endpoint has an IP address and that was the connection that was missing. Once that rule was added, everything worked.

Links:
Different strategies and scenarios for DNS resolution
Azure DNS Private Resolver documentation
Azure DNS Private Resolver Endpoints documentation
Azure Private DNS Zone documentation
Microsoft Q&A question about connection Azure SQL to AWS Lambda
Tutorial for connecting AWS and Azure using BGP-enabled VPN gateway from the Microsoft Q&A answer
How to create Private Endpoints documentation
Tutorial for setting up AWS Site-to-Site VPN
AWS Knowledge center for using a bastion host to a private EC2 instance
Blog post by vineetyadav97 for connecting AWS with Azure
Blog post by javedkhan0749 for setting up AWS Site-to-Site VPN Connection with Azure
