
My top 3 learnings: Implementing Network Access Control

Previously, as a red teamer, I was assigned to break into a client's network with the goal of capturing pre-defined flags. One of the flags was to establish a persistent foothold in the network, so we could access internal systems from an insider's perspective and use it as a way to exfiltrate data. Just as in the movies, we prepped a Raspberry Pi with a 4G modem, put some labels ("do not remove") on it and started our reconnaissance of the physical buildings.

Prepped with an infiltration plan, a pretext and, most importantly, the "get out of jail free card" (an authorization letter signed by the client), we managed to get into the building. After looking around, we identified a very convenient location to install our device: a patched network outlet in the floor, with 4G reception good enough for us to connect to the device. After verifying that the connection worked and that we could reach the client's internal systems via the device, we silently moved on.

Flawed by design?

The lack of full network segmentation, filtering and control was something we often encountered, and still do in today's assignments. I think this links back to the perimeter security architecture that has long been the standard concept for designing corporate networks. The concept aims to prevent external adversaries from accessing trusted internal systems by installing "walls" between trusted and untrusted networks (e.g. the internet versus the internal network). And there you have it: suddenly there is a part of the network you "trust", and the focus is on mitigating risks from the "untrusted" part.

As you probably know, the Zero Trust architecture is the opposite of the perimeter security model and assumes that the network is always hostile. Within the Zero Trust concept, all devices, users and network flows are authenticated and authorized. However, for a lot of organizations implementing this concept is a bridge too far, because it takes a big investment while priority is given to other strategic IT initiatives (e.g. cloud migration, machine learning, data insights, AI, cost control). For those organizations, strengthening their perimeter architecture with Network Access Control (NAC) can be a good alternative.

In short, NAC is a security solution that controls which devices and users are able to connect to your network and its segments. It is extensively used in wireless networks and less prevalent in wired ones. In this article I share our past experiences, the top lessons we have learned, and some considerations we encountered while implementing a NAC solution in a wired network infrastructure.

My top 3 learnings during the implementation of Network Access Control are:

💡 Reduce complexity as much as possible

Keep in mind that future maintainability of the solution and potential error reduction are worth a lot. Where possible, combine departments or users with overlapping access needs on the same VLAN.

💡 The implementation of redundancy and failover mechanisms is crucial

Try to limit the impact on users as much as possible when things fail: use redundant authentication servers, configure an auth-fail VLAN that still offers access to, for instance, a VPN server or cloud services when authentication fails, and/or publish the CRL on a separate website in case the CA is down. All in line with the risk appetite of the organization.

💡 Don’t underestimate the impact on user experience

It takes a lot of time to get it right, and time should be spent managing expectations and explaining the added authentication time to the organization (explain the urgency). During the transition it can be helpful to configure "open authentication", so the disruption when authentication fails is not too big. When devices cannot authenticate, users can keep working while the debugging takes place in the background.

Drivers

The main drivers from an IT and security perspective were based on risk mitigation. We needed to be able to control the devices and users accessing the network, enrich our network forensic capabilities, implement network segmentation on the basis of identity, and enforce policies on the devices accessing the network.

As with most IT projects, you know beforehand that challenges will be encountered along the way, so we planned a phased, agile approach instead of one big bang. We conducted initial research into which product best matched our requirements, purchased the appliance and started designing the solution to our liking.

It should be mentioned that many organizations run into challenges when they want to implement a NAC solution using 802.1X (the standard for port-based network access control) in a wired environment:

  1. There are devices in the wired network that do not support 802.1X, due to technical debt or simply because the protocol was never implemented (a fallback sketch follows this list).
  2. The 802.1X configuration is inconsistent between brands, or the protocol is only partly supported, resulting in limited available features.
  3. There is limited budget to replace hardware that lacks 802.1X support but is still within its support lifetime.
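
A common workaround for the first challenge is MAC Authentication Bypass (MAB): when a device has no 802.1X supplicant, the switch falls back to checking its MAC address against a known inventory and assigns it a matching VLAN. Below is a minimal Python sketch of that decision logic; the inventory, MAC addresses and VLAN numbers are hypothetical, and in a real deployment this policy lives on the authentication server rather than in a script.

    # Minimal sketch of an authorization decision with MAB fallback.
    # Inventory, VLAN numbers and MAC addresses are made up for illustration.
    KNOWN_DEVICES = {
        "00:1a:2b:3c:4d:5e": "printers",     # no 802.1X supplicant
        "00:1a:2b:3c:4d:5f": "desk-phones",
    }
    VLANS = {"printers": 30, "desk-phones": 40}
    QUARANTINE_VLAN = 99  # unknown devices land here

    def authorize(mac: str, dot1x_ok: bool) -> int:
        """Return the VLAN a connecting device should be placed in."""
        if dot1x_ok:
            return 10  # regular client VLAN after successful 802.1X
        group = KNOWN_DEVICES.get(mac.lower())
        if group is not None:
            return VLANS[group]  # MAB: recognized non-802.1X device
        return QUARANTINE_VLAN

    print(authorize("00:1A:2B:3C:4D:5E", dot1x_ok=False))  # -> 30

Note that MAC addresses are easy to spoof, so MAB should be treated as a weaker control than certificate-based 802.1X.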

For some companies, a combination of these challenges results in the exploration of other solutions, because the benefits do not outweigh the required investment. This was not the case for us.

Complexity reduction

In the design process there are a few things to decide: how many different user/device groups do you want (or need), and how many virtual LANs (VLANs) belong to those groups? For instance, you can configure a VLAN for every department, so that each department can only access the internal systems it needs (e.g. finance only being able to connect to internal finance systems). This is a crucial part of the design, and you can make it as complex as you want…

We realized that complexity reduction was more important for us in the long run, underpinned by the future maintainability of the solution and potential error reduction. This meant we had to tip the scale toward IT and concede on some of the security requirements, because we would combine some departments on the same VLAN. This made sense given the big overlap in the systems they needed to access. Furthermore, it also meant that we would not implement all the features of the solution at once (e.g. endpoint compliance checks with agents, post-connect monitoring) but first lay a solid foundation and build upon that. A common best practice is a phased deployment in three stages: 1) monitor mode, 2) low-impact mode and 3) closed mode.
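
To make the consolidation decision less of a gut feeling, you can score how much the access needs of departments overlap and merge the ones that score high. Below is a minimal Python sketch of such a grouping; the departments, systems and threshold are hypothetical.

    # Minimal sketch: merge departments whose required systems overlap
    # strongly into a shared VLAN, to keep the number of segments small.
    # All department data and the threshold are made up for illustration.
    required_access = {
        "finance":     {"erp", "reporting", "fileshare"},
        "purchasing":  {"erp", "fileshare"},
        "hr":          {"hris", "fileshare"},
        "engineering": {"git", "ci", "fileshare"},
    }

    def overlap(a: set, b: set) -> float:
        """Jaccard similarity of two sets of access requirements."""
        return len(a & b) / len(a | b)

    THRESHOLD = 0.5  # tune to the organization's risk appetite

    # Greedily add a department to an existing VLAN group when its needs
    # overlap enough with that group's combined needs.
    vlan_groups = []  # list of (departments, combined systems)
    for dept, systems in required_access.items():
        for depts, group_systems in vlan_groups:
            if overlap(systems, group_systems) >= THRESHOLD:
                depts.add(dept)
                group_systems |= systems
                break
        else:
            vlan_groups.append(({dept}, set(systems)))

    for vlan_id, (depts, systems) in enumerate(vlan_groups, start=10):
        print(f"VLAN {vlan_id}: {sorted(depts)} -> {sorted(systems)}")

With these made-up numbers, finance and purchasing end up on one VLAN while hr and engineering each keep their own, which is exactly the kind of trade-off we made: fewer segments, slightly broader access.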

Redundancy and failover mechanisms

We decided that devices and users have to authenticate with certificates from our internal Certificate Authority (CA). This meant that all managed devices had to be supplied with a signed certificate, which was fairly easy to roll out on managed systems. The best practice is to automate the certificate enrollment process to simplify the deployment, which is what we configured. By the way, do not forget printers (who uses those anyway), desk phones and unmanaged clients… if you have any.
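
With every device depending on a valid certificate, watching expiry dates becomes part of daily operations, because an expired certificate means a failed 802.1X authentication. Below is a minimal Python sketch using the cryptography package; the file path and the 30-day threshold are assumptions.

    # Minimal sketch: flag a client certificate that expires soon so it
    # can be re-enrolled before 802.1X authentication starts failing.
    # Path and threshold are made up; requires the "cryptography" package
    # (>= 42 for not_valid_after_utc; older versions use not_valid_after).
    from datetime import datetime, timedelta, timezone
    from cryptography import x509

    with open("client.pem", "rb") as f:
        cert = x509.load_pem_x509_certificate(f.read())

    remaining = cert.not_valid_after_utc - datetime.now(timezone.utc)
    if remaining < timedelta(days=30):
        print(f"Renew soon: {cert.subject.rfc4514_string()} "
              f"expires in {remaining.days} days")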

This is also where availability concerns emerge: what if the Certificate Authority is down and a client certificate has expired? What if the certificate revocation list (CRL) is unavailable, or your only authentication server happens to be down?

While some of these risks can be addressed with a failover mechanism, treated differently or simply accepted, the downside is that failures can disrupt productivity and frustrate users. And those two factors heavily influence how successful the implementation is perceived to be within the organization. So we did everything we could to limit the impact on end users.

Failover mechanisms that can be considered:

  1. Make use of an authentication-failed (auth-fail) VLAN that offers limited access, for instance to a VPN server or cloud services, in case authentication fails
  2. Build redundancy with multiple authentication servers (a monitoring sketch follows this list)
  3. Publish the CRL on a separate webserver (sketched further below)
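
Redundancy only helps if you notice when one of the authentication servers stops answering, so it pays to probe them continuously. Below is a minimal Python sketch using the pyrad package; the server addresses, shared secret, credentials and dictionary path are hypothetical.

    # Minimal sketch: send an Access-Request to each RADIUS server and
    # alert when one stops responding. Uses the "pyrad" package; hosts,
    # secret, credentials and dictionary path are made up for illustration.
    from pyrad.client import Client, Timeout
    from pyrad.dictionary import Dictionary
    import pyrad.packet

    SERVERS = ["10.0.0.11", "10.0.0.12"]  # redundant authentication servers

    for host in SERVERS:
        client = Client(server=host, secret=b"s3cr3t",
                        dict=Dictionary("/etc/pyrad/dictionary"))
        client.timeout = 3
        req = client.CreateAuthPacket(code=pyrad.packet.AccessRequest,
                                      User_Name="nac-healthcheck")
        req["User-Password"] = req.PwCrypt("probe-password")
        try:
            reply = client.SendPacket(req)
            # Even an Access-Reject proves the server is up and answering.
            print(f"{host}: alive (reply code {reply.code})")
        except Timeout:
            print(f"{host}: no response, check failover!")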

The lesson here is that building in redundancy and implementing failover mechanisms is crucial to ensure network access when implementing NAC. It should be done in such a way that it does not lower security and stays within budget.
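
The third mechanism, publishing the CRL on a separate webserver, is simple to set up: export the CRL to a standalone host and serve it over HTTP, so clients can keep doing revocation checks while the CA's own endpoint is down. Below is a minimal sketch using only Python's standard library; the directory and port are assumptions.

    # Minimal sketch: serve an exported CRL from a standalone host as a
    # fallback for revocation checks. Directory and port are made up.
    from functools import partial
    from http.server import HTTPServer, SimpleHTTPRequestHandler

    class CRLHandler(SimpleHTTPRequestHandler):
        # Advertise the proper media type for CRL files (RFC 5280).
        extensions_map = {".crl": "application/pkix-crl"}

    handler = partial(CRLHandler, directory="/srv/crl")
    HTTPServer(("0.0.0.0", 8080), handler).serve_forever()

Remember to include this fallback URL as an extra CRL distribution point in the issued certificates, otherwise clients will never look there.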

Impact on User Experience

Even though we implemented NAC using an agile approach, focusing on one department at a time, we would often run into small access "problems" where users could not reach the services they needed. While this was a good way to get to know our customers, resolving these errors was quite time-intensive (more than expected).

Furthermore, while users were used to connecting to the network instantly, the authentication process now takes a good 5-10 seconds, which was experienced as annoying at first. Although people understood the why and got used to it quickly… The well-known security versus usability balance. Speed is a very important factor, and it does pay off to look into optimizations here.

What also helped us ease the transition was the configuration of "open authentication", which provides regular network access when a device cannot authenticate. We would then debug the problem in the background.

What to do with servers?

I clearly remember a wild brainstorm about also enrolling our on-premises servers in the NAC solution. While NAC is used to authenticate and provide access to client devices, could there be an added benefit to having servers authenticate as well?

From a perimeter security architecture point of view it does not make sense because, well, we trust our internal network, right? Furthermore, a lot of risks were quickly identified: some of our core servers might not be able to authenticate, the administrative burden on our network engineers would grow, and the potential for errors that impact services would increase.

Some insights into how the technology can be beneficial: it can be used to implement micro-segmentation in the on-premises server LAN, it can help enforce security policies and standards for servers and, depending on the NAC solution, it can give the security team access to more features, e.g. threat containment, monitoring tools and increased network visibility.

Eventually we decided not to include the servers, because filtering and segmentation were already in place, other security risks were appropriately mitigated, and the added benefits were small compared to the availability risks.

Legacy solution, or still usable in the cloud era?

While the technology has been with us for "quite some time" (since 2008), I believe it still adds value as a security control in the cloud era, where companies run hybrid networks. Some NAC solutions offer good integrations with, for instance, Microsoft Intune.

However, I think its features might slowly be replaced by other security controls or by different network architectures such as Zero Trust, where the security approach is fundamentally different. Furthermore, the cost of implementing it might be out of proportion to the risks you want to treat, or those risks might be covered by less expensive security controls.

What do you think?

This is a contribution from Fox-IT. Through this link you can learn more about the company’s services.