Azure Networking Blog articles

https://techcommunity.microsoft.com/t5/azure-networking-blog/bg-p/AzureNetworkingBlog

Securing your Azure Networks with AVNM Security Admin Rules and VNet Flow Logs

Introduction

Organizations adopting Microsoft Azure strive to balance giving application teams the freedom to innovate with maintaining the organization's security posture. Azure Virtual Network Manager provides Security Admin Rules to help achieve that goal. Security Admin Rules allow an organization to centrally manage the network security of its virtual networks and stay compliant with its policies, while still giving business units the option to manage the network security of their individual workloads.

Before we dive into how Security Admin Rules work, let's first do a refresher of the basics of Azure Virtual Network Manager.

 

Azure Virtual Network Manager Foundations

An Azure Virtual Network Manager instance (Network Manager) is deployed to a region. The scope of management of a Network Manager is determined by a combination of its resource scope and its functional scope.

The resource scope represents the subscription or subscriptions a Network Manager can manage. A resource scope can include management groups, to manage groups of subscriptions, or individual subscriptions. The image below provides an example of how an organization could configure the resource scope of a Network Manager.

 

sample_arch.png

The functional scope determines which types of configurations the Network Manager will support. Today, there are two functional scopes for management of virtual networks: Connectivity and Security Admin. Connectivity Configurations are used to manage the desired topology of virtual networks' connectivity and Security Admin Configurations are used to manage the network security of virtual networks across an organization's Azure estate. You can apply multiple Network Managers to the same resource scope only if the functional scope is different.

Virtual networks that are within the resource scope of the Network Manager can be added into a logical grouping referred to as a Network Group. A Network Group contains one or more virtual networks, and a Connectivity or Security Admin Configuration is applied to one or more Network Groups (and hence to multiple virtual networks). Virtual networks are added to Network Groups manually or dynamically. When added dynamically through Azure Policy, virtual networks are conditionally added to Network Groups and are then automatically connected or secured according to the configurations deployed to the Network Group. Connectivity and Security Admin Configurations are only applied to virtual networks that are both within the resource scope and members of a Network Group that is targeted by a configuration.
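The two conditions in the last sentence above can be expressed as a small check. This is only an illustrative model, not how Azure implements it: the class names, the `applies_to` method, and the subscription identifiers `sub-apps` and `sub-other` are all hypothetical, and the resource scope is simplified to a flat set of subscription IDs.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VirtualNetwork:
    name: str
    subscription_id: str


@dataclass
class NetworkGroup:
    name: str
    members: set[str] = field(default_factory=set)  # virtual network names


@dataclass
class NetworkManager:
    resource_scope: set[str]             # subscription IDs (simplified scope)
    targeted_groups: list[NetworkGroup]  # groups targeted by a deployed configuration

    def applies_to(self, vnet: VirtualNetwork) -> bool:
        """A configuration reaches a virtual network only if both checks pass."""
        in_scope = vnet.subscription_id in self.resource_scope
        in_group = any(vnet.name in g.members for g in self.targeted_groups)
        return in_scope and in_group


# Hypothetical data: one Network Group containing the production virtual network.
prod_group = NetworkGroup("ng-prod", {"vnetp-r174188"})
manager = NetworkManager(resource_scope={"sub-apps"}, targeted_groups=[prod_group])

in_scope_member = manager.applies_to(VirtualNetwork("vnetp-r174188", "sub-apps"))
in_scope_non_member = manager.applies_to(VirtualNetwork("vnetnp-r174188", "sub-apps"))
out_of_scope_member = manager.applies_to(VirtualNetwork("vnetp-r174188", "sub-other"))
```

Only the first virtual network satisfies both conditions, so only it would receive the configuration in this model.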

A Connectivity Configuration enforces either a mesh or hub and spoke network topology across one or more Network Groups. In a mesh topology, all virtual networks are connected to each other. In a hub and spoke topology, all virtual networks are connected as spokes to a hub virtual network and can optionally be connected with each other within their respective network groups.

A Security Admin Configuration contains one or more security admin rule collections. Each rule collection contains one or more security admin rules. Network Groups are associated to one or more rule collections, which apply the security admin rules to the member virtual networks of its associated Network Group(s).

Security Admin Rules are similar to Network Security Group security rules in that they operate at layer 4 of the OSI model and support 5-tuple rules. Security Admin Rules differ in that they support the AlwaysAllow action in addition to Allow or Deny, and are applied on virtual networks. I will demonstrate a variety of scenarios using Security Admin Rules in this blog.

In order for a configuration to be applied to a virtual network, the configuration must be deployed to the region the virtual network is in. Per Network Manager instance, a single Security Admin Configuration and multiple Connectivity Configurations can be deployed per region.

hierarchy.png
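To make the hierarchy concrete, here is a minimal sketch of how a Security Admin Configuration, its rule collections, and their Network Group associations could be modeled. It is an illustration under assumptions, not the Azure resource schema: the class names, the `rules_for_vnet` helper, and the rule and collection names (`rc-prod`, `rc-sensitive`, `allow-dns`, `deny-http`) are hypothetical, as is the group membership shown.

```python
from dataclasses import dataclass, field


@dataclass
class SecurityAdminRule:
    name: str
    action: str  # "Allow", "Deny", or "AlwaysAllow"


@dataclass
class RuleCollection:
    name: str
    network_groups: list[str]  # names of associated Network Groups
    rules: list[SecurityAdminRule] = field(default_factory=list)


@dataclass
class SecurityAdminConfiguration:
    name: str
    rule_collections: list[RuleCollection] = field(default_factory=list)


def rules_for_vnet(config, memberships, vnet):
    """Collect rule names from every collection whose Network Group contains the vnet."""
    groups = memberships.get(vnet, set())
    return [
        rule.name
        for collection in config.rule_collections
        if groups.intersection(collection.network_groups)
        for rule in collection.rules
    ]


# Hypothetical configuration with two rule collections.
config = SecurityAdminConfiguration(
    "sac-central",
    [
        RuleCollection("rc-prod", ["ng-prod"], [SecurityAdminRule("allow-dns", "AlwaysAllow")]),
        RuleCollection("rc-sensitive", ["ng-sensitive"], [SecurityAdminRule("deny-http", "Deny")]),
    ],
)

# Hypothetical Network Group membership per virtual network.
memberships = {
    "vnetp-r174188": {"ng-prod"},
    "vnets-r174188": {"ng-prod", "ng-sensitive"},
}

prod_rules = rules_for_vnet(config, memberships, "vnetp-r174188")
sensitive_rules = rules_for_vnet(config, memberships, "vnets-r174188")
ungrouped_rules = rules_for_vnet(config, memberships, "vnetnp-r174188")
```

A virtual network that belongs to several Network Groups picks up the rules of every associated collection, while one that belongs to none receives nothing.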

Security Admin Rules and Network Security Groups

The key benefit of Security Admin Rules is that they are processed before the rules within a Network Security Group, because they are evaluated at the virtual network level rather than at the subnet or network interface level where a Network Security Group applies. This provides an organization with the ability to establish a core set of "guardrail" rules while giving application teams freedom to configure Network Security Groups to their own requirements.

The visual below illustrates how Security Admin Rules work with Network Security Group security rules. If a Security Admin Rule uses the Allow action, the traffic is passed on to downstream Network Security Groups where it can be allowed or denied as needed. When a Security Admin Rule uses a Deny action, the traffic is denied at the Security Admin Rule even if the Network Security Group allows the traffic. Security Admin Rules using the AlwaysAllow action will allow the traffic even if the Network Security Group denies the traffic. 

processing.png
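The interaction between the three actions and a Network Security Group can be summarized as a small decision function. This is a simplified model of the behavior described above, not Azure's implementation; the function name `evaluate` and its parameters are hypothetical, and it assumes that traffic matching no Security Admin Rule is left to the Network Security Group.

```python
def evaluate(admin_action, nsg_allows):
    """Return True if the traffic is permitted.

    admin_action: "AlwaysAllow", "Deny", "Allow", or None when no
                  Security Admin Rule matches the flow (assumed behavior).
    nsg_allows:   the verdict the Network Security Group reaches on its own.
    """
    if admin_action == "AlwaysAllow":
        return True       # delivered even if the NSG would deny it
    if admin_action == "Deny":
        return False      # dropped even if the NSG would allow it
    return nsg_allows     # "Allow" or no match: the NSG decides


# Enumerate every combination of admin action and NSG verdict.
outcomes = {
    (action, nsg): evaluate(action, nsg)
    for action in ("AlwaysAllow", "Deny", "Allow", None)
    for nsg in (True, False)
}
```

The table in `outcomes` shows that only Allow (or no matching rule) hands the final decision to the Network Security Group.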

Practical use cases include:

  • Protecting high-risk ports by default for all new and existing virtual networks.
  • Ensuring critical infrastructure services traffic such as DNS and Windows Active Directory can't be mistakenly blocked.
  • Ensuring security signals from applications and virtual machines cannot be blocked when being delivered to a security information and event management (SIEM) solution.
  • Providing support for SSH and RDP traffic to application teams while limiting the source of that traffic to a secure enclave of jump servers.
  • Allowing traffic from trusted boundaries by default unless application teams deny it.

 

Lab Environment

For the demonstrations included in this blog post, the lab environment below was utilized. The virtual network named vnetmgmt74188 contains two virtual machines. The machine named vm1mgmt74188 emulated a trusted machine and the other machine named vm2mgmt74188 emulated an untrusted machine.

The other three virtual networks emulated application team virtual networks. The network named vnets-r174188 emulated a virtual network with a workload storing or processing sensitive data, vnetp-r174188 emulated a production virtual network, and vnetnp-r174188 emulated a non-production virtual network.

Each virtual network contained a single virtual machine running a web server that was secured by a Network Security Group associated to the subnet.

avnm_lab.png

AlwaysAllow Demonstration

In this scenario, the organization's Central IT team must ensure that network traffic from production workloads to critical infrastructure services cannot be mistakenly blocked by a misconfiguration of a Network Security Group. DNS is considered a critical infrastructure service for the organization and is provided by a 3rd-party DNS service hosted at 1.1.1.1.

The scenario goal is pictured below:

d1-goal.png

The Network Security Group configured by an application team has mistakenly been configured to block DNS traffic to the organization's preferred DNS service, as seen in the image below.

d1-outbound-nsg.png

When a DNS lookup directed to the DNS service is performed on the production virtual machine, the request times out because it is blocked by the security rule configured in the Network Security Group.

d1-dig-deny.png

The Central IT team creates an instance of Azure Virtual Network Manager and sets its resource scope to a management group that all of the application team subscriptions are children of. It then creates a new Security Admin Configuration and adds a rule collection. The rule collection is associated with a Network Group that uses Azure Policy to automatically manage its membership based on virtual networks containing the tag environment=production. Contained in this rule collection is a Security Admin Rule, which uses the AlwaysAllow action to allow outbound DNS traffic destined to the organization's DNS service.

d1-arch.png

The Central IT team creates a new Azure Policy definition that applies to any virtual networks with the tag environment=production. The Azure Policy is assigned to the same management group the Network Manager is scoped to.

"policyRule": {
  "if": {
    "allOf": [
      {
        "field": "type",
        "equals": "Microsoft.Network/virtualNetworks"
      },
      {
        "allOf": [
          {
            "field": "tags['environment']",
            "equals": "production"
          }
        ]
      }
    ]
  },
  "then": {
    "effect": "addToNetworkGroup",
    "details": {
      "networkGroupId": "/subscriptions/XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX/resourceGroups/rg-demo-avnm-mgmt74188/providers/Microsoft.Network/networkManagers/avnm-central74188/networkGroups/ng-prod"
    }
  }
}

After Azure Policy is assigned and policy evaluation takes place, the production virtual network is dynamically added into this Network Group.
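To see why only the production virtual network lands in the Network Group, the condition in the policy rule can be walked through with a tiny evaluator. This is a sketch for illustration only: it handles just the `allOf`, `field`, and `equals` constructs used in the snippet, uses exact string comparison, and does not reproduce the real Azure Policy engine or its comparison semantics. The helper names and the sample resources are hypothetical.

```python
def field_value(resource, field):
    """Resolve the two field forms used in the snippet: "type" and "tags['name']"."""
    if field.startswith("tags['") and field.endswith("']"):
        return resource.get("tags", {}).get(field[6:-2])
    return resource.get(field)


def matches(condition, resource):
    """Evaluate a condition built only from allOf / field / equals."""
    if "allOf" in condition:
        return all(matches(c, resource) for c in condition["allOf"])
    return field_value(resource, condition["field"]) == condition["equals"]


# The "if" block of the policy rule shown above.
condition = {
    "allOf": [
        {"field": "type", "equals": "Microsoft.Network/virtualNetworks"},
        {"allOf": [{"field": "tags['environment']", "equals": "production"}]},
    ]
}

# Hypothetical resources as the evaluator would see them.
prod_vnet = {"type": "Microsoft.Network/virtualNetworks", "tags": {"environment": "production"}}
nonprod_vnet = {"type": "Microsoft.Network/virtualNetworks", "tags": {"environment": "nonproduction"}}
untagged_vnet = {"type": "Microsoft.Network/virtualNetworks"}
prod_nsg = {"type": "Microsoft.Network/networkSecurityGroups", "tags": {"environment": "production"}}

results = [matches(condition, r) for r in (prod_vnet, nonprod_vnet, untagged_vnet, prod_nsg)]
```

Only the first resource satisfies both the type check and the tag check, which corresponds to the virtual network that receives the addToNetworkGroup effect.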

d1-policy-dynamic.png

The Central IT team deploys the Security Admin Configuration to the desired Azure region. A short time later, DNS queries to the organization's DNS service running at 1.1.1.1 are successful, demonstrating the Security Admin Rule with the AlwaysAllow action is indeed allowing the traffic.

d1-dig-success.png

 

Deny Demonstration

In this scenario, the organization has a requirement to ensure all web-based communication with production workloads that store or process sensitive data is encrypted. Production workloads that do not store or process sensitive data do not have this requirement and it should not be enforced on those workloads.

The scenario goal is pictured below.

d2-goal.png

A Network Security Group has been configured by the application team to allow HTTP to a production workload storing sensitive data, which does not align with the organization's security policy.

d2-inbound-nsg.png

Performing a curl on the virtual machine from one of the demonstration machines successfully returns the "Hello World" webpage, indicating the Network Security Group is allowing HTTP traffic.

d2-curl-success.png

The Central IT team does not need to create another Security Admin Configuration to satisfy this requirement. Instead, it uses the existing Security Admin Configuration and creates a new rule collection that will block this unencrypted network flow. It is associated with another Network Group that uses Azure Policy to automatically manage its membership based on virtual networks containing the tags environment=production and classification=sensitive. Contained in the rule collection is a Security Admin Rule, which uses the Deny action to block HTTP traffic.

d2-arch.png

The Central IT team creates a new Azure Policy definition that applies to any virtual networks with the tags environment=production and classification=sensitive. The Azure Policy is assigned to the same management group the Network Manager is scoped to.

"policyRule": {
  "if": {
    "allOf": [
      {
        "field": "type",
        "equals": "Microsoft.Network/virtualNetworks"
      },
      {
        "allOf": [
          {
            "field": "tags['environment']",
            "equals": "production"
          },
          {
            "field": "tags['classification']",
            "equals": "sensitive"
          }
        ]
      }
    ]
  },
  "then": {
    "effect": "addToNetworkGroup",
    "details": {
      "networkGroupId": "/subscriptions/XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX/resourceGroups/rg-demo-avnm-mgmt74188/providers/Microsoft.Network/networkManagers/avnm-central74188/networkGroups/ng-sensitive"
    }
  }
}

After Azure Policy is assigned and policy evaluation takes place, the production virtual network containing sensitive workloads is dynamically added into this Network Group.

d2-policy-dynamic.png

The Central IT team re-deploys the Security Admin Configuration to the desired Azure region. A short time later, curl requests over HTTP from the demonstration virtual machine time out because the connection is blocked, demonstrating that the Security Admin Rule with the Deny action blocks the traffic.

d2-curl-failure.png

 

Allow Demonstration

In this scenario, the organization must ensure that remote access to both production and non-production workloads is supported, but only when coming from a trusted enclave of jump servers. The Central IT team should allow application teams to determine if this type of access is needed for their workload. The application team has determined that this traffic is not required for their Production Workload A, but should be supported for their Non-Production Workload B.

The scenario goals are pictured below.

d3-goal.png

The Central IT team can use the existing Security Admin Configuration to satisfy these requirements. It will add additional rules to the existing production rule collection. A Security Admin Rule with the Allow action will allow SSH traffic from the trusted security enclave while a lower priority rule will deny SSH from all sources. A new rule collection for non-production will be created and associated to a new Network Group that uses Azure Policy to automatically manage its membership based on virtual networks containing the tag environment=nonproduction. This rule collection will contain the same two new rules as the production rule collection.

d3-arch.png
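The pairing of a higher-priority Allow rule with a lower-priority Deny rule can be sketched as first-match evaluation over a priority-ordered list. This is a simplified model under assumptions: a lower priority number is evaluated first, the rule names and the addresses `10.0.0.4` (trusted jump server) and `10.0.0.5` (untrusted machine) are hypothetical, and only the source prefix and destination port are matched.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class AdminRule:
    name: str
    priority: int   # lower number = evaluated first (assumption of this sketch)
    action: str     # "Allow" or "Deny"
    source: str     # CIDR prefix
    dest_port: int


# Hypothetical versions of the two rules described above.
RULES = [
    AdminRule("deny-ssh-all", 200, "Deny", "0.0.0.0/0", 22),
    AdminRule("allow-ssh-trusted", 100, "Allow", "10.0.0.4/32", 22),
]


def first_match(rules, source_ip, dest_port):
    """Return the first rule, in priority order, that matches the flow."""
    addr = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        if dest_port == rule.dest_port and addr in ipaddress.ip_network(rule.source):
            return rule
    return None


def ssh_permitted(source_ip, nsg_allows):
    """Combine the Security Admin Rule verdict with the NSG verdict."""
    rule = first_match(RULES, source_ip, 22)
    if rule is None or rule.action == "Allow":
        return nsg_allows   # Allow defers to the Network Security Group
    return False            # Deny stops the traffic before the NSG


trusted_rule = first_match(RULES, "10.0.0.4", 22).name
untrusted_rule = first_match(RULES, "10.0.0.5", 22).name
trusted_to_nsg_deny = ssh_permitted("10.0.0.4", nsg_allows=False)
untrusted_to_nsg_allow = ssh_permitted("10.0.0.5", nsg_allows=True)
trusted_to_nsg_allow = ssh_permitted("10.0.0.4", nsg_allows=True)
```

In this model SSH succeeds only when the source is the trusted address and the Network Security Group also allows the traffic.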

The Central IT team creates a new Azure Policy definition that applies to any virtual networks with the tag environment=nonproduction. The Azure Policy is assigned to the same management group the Network Manager is scoped to.

"policyRule": {
  "if": {
    "allOf": [
      {
        "field": "type",
        "equals": "Microsoft.Network/virtualNetworks"
      },
      {
        "allOf": [
          {
            "field": "tags['environment']",
            "equals": "nonproduction"
          }
        ]
      }
    ]
  },
  "then": {
    "effect": "addToNetworkGroup",
    "details": {
      "networkGroupId": "/subscriptions/XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX/resourceGroups/rg-demo-avnm-mgmt74188/providers/Microsoft.Network/networkManagers/avnm-central74188/networkGroups/ng-nonprod"
    }
  }
}

After Azure Policy is assigned and policy evaluation takes place, the non-production virtual networks are dynamically added into the Network Group.

d3-policy-dynamic.png

The Central IT team re-deploys the Security Admin Configuration to the desired Azure region.

The application team configured the Network Security Group protecting Production Workload A to block all SSH traffic.

d3-nsg-inbound-deny.png

Attempts to SSH from a virtual machine in the trusted enclave to Production Workload A time out because they are denied by the Network Security Group. This demonstrates that, when the Allow action is used, the traffic must be allowed by both the Security Admin Rule and the Network Security Group.

d3-ssh-fail.png

The application team configured the Network Security Group protecting Non-Production Workload B to allow SSH traffic from all sources.

d3-nsg-inbound-all.png

Attempts to SSH from an untrusted virtual machine to Non-Production Workload B time out because the untrusted virtual machine is not a source included in the Security Admin Rule with the Allow action. The traffic instead matches the lower-priority Deny rule, which causes it to be blocked.

d3-ssh-failure-np.png

Attempts to SSH from a trusted virtual machine to Non-Production Workload B are successful because the traffic matches the Security Admin Rule with the Allow action and is allowed by the Network Security Group.

d3-ssh-success.png

 

Multiple Azure Virtual Network Managers

In this scenario, one of the organization's business units has requested an Azure Virtual Network Manager instance to use to manage their subscriptions. Central IT must maintain their instance to ensure compliance with organizational security policy.

Azure Virtual Network Manager supports multiple instances as long as those Network Managers are applied at different scopes. In the scenario above, Central IT would set the resource scope of their instance higher up in the management group structure than where the business unit would assign its resource scope.

The architecture is pictured below.

d4-arch.png

The business unit builds an instance with a Security Admin Configuration containing a rule collection that applies to a new Network Group in the Network Manager for virtual networks running non-production workloads. The Network Group will use Azure Policy to automatically manage its membership based on virtual networks with the tag environment=nonproduction. The policy uses logic similar to the policies seen earlier.

The rule collection contains a single Security Admin Rule that has been mistakenly configured with the AlwaysAllow action to allow all inbound SSH traffic even if the Network Security Group is configured to block it.

The business unit deploys the Security Admin Configuration to the relevant Azure regions.

d4-security-admin-rules.png

An attacker attempts to SSH into a non-production workload from an untrusted machine. The traffic is denied and the attacker is prevented from establishing the session.

d4-ssh-fail.png

The connection fails because, when multiple Network Managers apply to a virtual network and the Security Admin Rules of the two instances conflict, the rule from the higher-scope Network Manager is applied. In this scenario, the Central IT instance is applied at a higher management scope than the business unit instance, so the traffic is blocked because its source is an untrusted machine.
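The conflict behavior described above can be reduced to a one-line rule for illustration. This is a deliberately simplified sketch, not Azure's algorithm: it assumes each Network Manager's scope can be represented as a depth in the management group hierarchy (0 being closest to the root), and the function name `resolve` and the depth values are hypothetical.

```python
def resolve(decisions):
    """Pick the action from the Network Manager with the highest scope.

    decisions: list of (scope_depth, action) pairs, where a smaller depth
    means the instance is scoped higher in the management group hierarchy.
    """
    return min(decisions, key=lambda d: d[0])[1]


# Central IT (depth 1) denies SSH from the untrusted source;
# the business unit (depth 3) has a mistaken AlwaysAllow rule.
central_it = (1, "Deny")
business_unit = (3, "AlwaysAllow")

winning_action = resolve([business_unit, central_it])
```

Because the Central IT instance sits higher in the hierarchy, its Deny decision is the one that takes effect regardless of the order in which the instances are listed.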

 

Virtual Network Flow Logs and Azure Virtual Network Manager Security Admin Rules

Organizations frequently need to log when network traffic is allowed or denied, both to satisfy regulatory requirements and to assist with troubleshooting in day-to-day operations. Traffic that is processed by an Azure Virtual Network Manager Security Admin Rule can be logged using VNet Flow Logs. VNet Flow Logs are a feature of Network Watcher and log information about the IP traffic coming in and out of a virtual network for supported workloads. This includes IP traffic processed by Network Security Groups and Security Admin Rules. VNet Flow Logs also support evaluating the encryption status of network traffic in scenarios where virtual network encryption is used.

In this scenario we will explore how to use VNet Flow Logs to determine if traffic is being blocked by a Security Admin Rule or a Network Security Group.

VNet Flow Logs must be enabled on each virtual network. As of the date of this post, VNet Flow Logs are in Public Preview and available in a limited set of regions. Once onboarded into the preview, the VNet Flow Log must be configured on the virtual network. The logs are delivered to an Azure Storage Account in this demonstration but can also be delivered to Network Watcher Traffic Analytics to provide additional insights around risky flows and top talkers.

In the command below the production virtual network is enabled for VNet Flow Logs.

az network watcher flow-log create --location eastus --name flvnetp-r174188 --resource-group rg-demo-avnm-p74188 --vnet vnetp-r174188 --storage-account "/subscriptions/XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX/resourceGroups/rg-demo-avnm-mgmt74188/providers/Microsoft.Storage/storageAccounts/stlogsr174188"

After the VNet Flow Logs are enabled, an attempt is made to establish an SSH connection to a workload in the production virtual network from one of the jump hosts in the trusted enclave. The SSH connection times out because it is blocked by either a Security Admin Rule or a Network Security Group.

Let's explore how VNet Flow Logs can be used to determine which type of rule is blocking the traffic.

d5-ssh-fail.png

Within the Azure Storage Account a new container has been created named insights-logs-flowlogflowevent. This is the container where VNet Flow Logs are stored.

d5-az-storage-container.png

The latest VNet Flow Log is downloaded from the Azure Storage Account and reviewed. Searching for the jump server's IP identifies a flow record for the virtual network the workload is in, at the time the SSH connection was attempted. The aclId property identifies the resource that evaluated the flow. In this case it is a Network Security Group named nsgp-pri-r169341. The rule property gives the name of the security rule that evaluated the traffic, which was block-all. In the flowTuples array, the highlighted record indicates the SSH traffic from the trusted jump server was denied.

d5-nsg-fail.png
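The manual search described above could be scripted along these lines. This is a sketch under assumptions: the field order of a flow tuple (timestamp, source IP, destination IP, source port, destination port, protocol, direction, flow state, then encryption and counters) and the state code `D` for denied reflect my reading of the published VNet Flow Logs schema and should be verified against the logs you download. The sample record, its addresses, and the placeholder resource ID are hypothetical.

```python
from dataclasses import dataclass

# Assumed flow state codes; confirm against the flow log schema in use.
FLOW_STATES = {"B": "begin", "C": "continuing", "E": "end", "D": "denied"}


@dataclass
class FlowTuple:
    timestamp_ms: int
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str    # assumed to be an IANA protocol number, e.g. "6" for TCP
    direction: str   # "I" inbound, "O" outbound
    state: str


def parse_flow_tuple(raw):
    """Split one comma-separated flow tuple and keep the first eight fields."""
    parts = raw.split(",")
    return FlowTuple(
        timestamp_ms=int(parts[0]),
        src_ip=parts[1],
        dst_ip=parts[2],
        src_port=int(parts[3]),
        dst_port=int(parts[4]),
        protocol=parts[5],
        direction=parts[6],
        state=FLOW_STATES.get(parts[7], parts[7]),
    )


def denied_flows(flow, src_ip):
    """Return (rule name, parsed tuple) pairs for denied flows from src_ip."""
    hits = []
    for group in flow["flowGroups"]:
        for raw in group["flowTuples"]:
            parsed = parse_flow_tuple(raw)
            if parsed.src_ip == src_ip and parsed.state == "denied":
                hits.append((group["rule"], parsed))
    return hits


# Hypothetical excerpt of one flow entry from a downloaded log file.
sample_flow = {
    "aclID": "<resource ID of the evaluating Network Security Group>",
    "flowGroups": [
        {
            "rule": "block-all",
            "flowTuples": [
                "1700000000000,10.0.0.4,10.1.0.4,50000,22,6,I,D,NX,0,0,0,0",
                "1700000001000,10.0.0.9,10.1.0.4,50001,443,6,I,B,NX,0,0,0,0",
            ],
        }
    ],
}

hits = denied_flows(sample_flow, "10.0.0.4")
```

Each hit pairs the rule name with the parsed tuple, which mirrors reading the rule property and the highlighted flowTuples entry by hand.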

Let's look at another scenario where the application team is having issues performing load testing on their production application.

The latest VNet Flow Log is again downloaded from the Azure Storage Account. A search for the IP address used by the load testing service identifies a flow record for the virtual network the workload is in, at the time the testing was performed. Here the aclId property indicates that the Network Manager Security Admin Configuration, with a rule collection named rc-prod, evaluated the traffic. The traffic from the load testing service was denied by a rule named DenyHttp.

d5-http-fail.png
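Since the aclId value tells the two cases apart, a small helper can classify which kind of resource evaluated a flow. This is a hedged sketch: it assumes the aclId is an Azure resource ID containing the resource type segment `networkSecurityGroups` or `networkManagers`, which may not match the exact format written to the logs. The function name and both sample IDs (including the `<config-name>` placeholder) are hypothetical.

```python
def evaluator_kind(acl_id):
    """Guess which kind of resource evaluated a flow from its aclId value."""
    lowered = acl_id.lower()
    if "/networksecuritygroups/" in lowered:
        return "network-security-group"
    if "/networkmanagers/" in lowered:
        return "security-admin-rule"
    return "unknown"


# Hypothetical aclId values modeled on Azure resource ID conventions.
nsg_acl = (
    "/subscriptions/XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX/resourceGroups/rg-example"
    "/providers/Microsoft.Network/networkSecurityGroups/nsgp-pri-r169341"
)
avnm_acl = (
    "/subscriptions/XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX/resourceGroups/rg-demo-avnm-mgmt74188"
    "/providers/Microsoft.Network/networkManagers/avnm-central74188"
    "/securityAdminConfigurations/<config-name>/ruleCollections/rc-prod"
)

nsg_kind = evaluator_kind(nsg_acl)
avnm_kind = evaluator_kind(avnm_acl)
```

A classification like this tells the troubleshooter whether to contact the application team that owns the Network Security Group or the Central IT team that owns the Security Admin Configuration.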

 

Putting It All Together

In this blog post, you've seen how Azure Virtual Network Manager Security Admin Rules, Network Security Groups, and VNet Flow Logs work together to enable organizations to secure their Azure estate. With Security Admin Rules, Central IT can install guardrails to ensure its organizational network security controls are enforced while giving application teams the flexibility to manage the network security of their applications using Network Security Groups. VNet Flow Logs give the entire organization centralized visibility into how traffic is evaluated and processed across these products. This approach provides the network security and visibility organizations need without sacrificing agility.
