Exchange Server 2010-Availability & Recovery

How to Design a DAG Environment in Exchange Server 2010

 

As we know, an Exchange Server 2010 DAG supports up to 16 members and, at 100 databases per member, up to 1,600 database copies, which makes designing the layout a complex task. As a rule of thumb, the more servers a DAG has, the more options we have for placing database copies efficiently and resiliently.

  

The scenario above is designed to survive a single-node failure. If more than one member goes down, four databases go offline. Simply adding more nodes to the DAG does not automatically enable it to sustain multiple failures; you must be careful about which nodes hold which database copies as you add nodes.

As the diagram below shows, the servers in this four-node DAG are mirrored in pairs. A failure of both A and B, or of both C and D, would leave a large number of databases unavailable, so the extra nodes in this design do not provide better redundancy.
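To make the effect of copy placement concrete, here is a minimal sketch that compares a mirrored layout with a spread layout. The database names (DB1 to DB6), the two-copies-per-database choice, and the helper functions are assumptions for illustration only; they are not taken from the diagram.

```python
from itertools import combinations


def offline_databases(layout, failed):
    """Return the databases whose every copy sits on a failed server."""
    failed = set(failed)
    return sorted(db for db, hosts in layout.items() if set(hosts) <= failed)


def worst_case(layout, servers, failures):
    """Largest number of databases lost to any `failures` simultaneous server outages."""
    return max(
        len(offline_databases(layout, combo))
        for combo in combinations(servers, failures)
    )


servers = ["A", "B", "C", "D"]

# Mirrored pairs: A and B hold each other's copies, as do C and D (hypothetical names).
mirrored = {
    "DB1": ["A", "B"], "DB2": ["A", "B"], "DB3": ["A", "B"],
    "DB4": ["C", "D"], "DB5": ["C", "D"], "DB6": ["C", "D"],
}

# Spread layout: every pair of servers shares exactly one database.
spread = {
    "DB1": ["A", "B"], "DB2": ["A", "C"], "DB3": ["A", "D"],
    "DB4": ["B", "C"], "DB5": ["B", "D"], "DB6": ["C", "D"],
}

print(worst_case(mirrored, servers, 1))  # 0: a single failure is survivable
print(worst_case(mirrored, servers, 2))  # 3: losing A and B takes DB1-DB3 offline
print(worst_case(spread, servers, 2))    # 1: any double failure costs one database
```

With the same four servers and the same number of copies, spreading the copies reduces the worst-case impact of a double failure, although only a third copy removes it entirely.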

  

Follow these two rules when designing DAG redundancy to meet your SLAs:

1.       Surviving a one-member failure requires two or more high-availability copies, two or more servers, and a witness server.

2.       Surviving a two-member failure requires three or more high-availability copies, four or more servers, and a witness server.
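The two rules can be expressed as a small checklist function. This is only a sketch: the function name `meets_rule` and its parameters are hypothetical, and it encodes nothing beyond the minimums listed above.

```python
def meets_rule(member_failures, ha_copies, servers, has_witness):
    """Check a planned design against the one- or two-member failure rule."""
    # Minimum high-availability copies and servers for each rule.
    minimums = {
        1: {"copies": 2, "servers": 2},
        2: {"copies": 3, "servers": 4},
    }
    if member_failures not in minimums:
        raise ValueError("only the one- and two-member failure rules are covered")
    need = minimums[member_failures]
    return (
        has_witness
        and ha_copies >= need["copies"]
        and servers >= need["servers"]
    )


# A two-server, two-copy design with a witness satisfies rule 1.
print(meets_rule(1, ha_copies=2, servers=2, has_witness=True))  # True
# Three copies on only three servers do not satisfy rule 2.
print(meets_rule(2, ha_copies=3, servers=3, has_witness=True))  # False
```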

DAGs can be designed for a variety of environments; the right design depends on your organization's SLAs and on the recovery time and recovery point objectives for the mailbox services. We will discuss a few DAG designs that may fit your environment:

·         Two-Member DAG in Single Datacenter/Active Directory Site

·         Four-Member DAG in Single Datacenter/Active Directory Site

·         Four-Member DAG in Two Datacenter/Active Directory Sites

·         Two Four-Member DAGs in Two Datacenter/Active Directory Sites

 

Two-Member DAG in Single Datacenter/Active Directory Site

This design suits small offices and branch offices that need high availability from the smallest possible DAG, want to avoid the cost of additional servers, and do not require site resiliency. It provides redundancy for the Client Access, Hub Transport, and Mailbox roles using only two servers.

The Unified Messaging role can also be co-located with the other roles in this scenario, but Microsoft does not recommend it.

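Load balancing is always recommended to achieve high availability for Client Access and Hub Transport. Windows Network Load Balancing cannot be used here, because it cannot run on the same server as Windows failover clustering, which DAG members rely on; another load-balancing option, such as a hardware load balancer, must be used instead.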

 

A two-member DAG has three quorum voters (the two member servers and one witness server). Losing one voter does not affect service, but losing two voters (for example, a DAG member and the witness server) results in loss of quorum and a service interruption.

 

Four-Member DAG in Single Datacenter/Active Directory Site

A two- or three-member DAG can tolerate the loss of only one voter. A four-member DAG provides greater resiliency: it has five quorum voters (four members plus the witness server) and can sustain the loss of two voters without affecting service.
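The voter arithmetic behind these numbers can be sketched as follows. The function names are hypothetical, and the code assumes that the witness server only counts as a voter when the DAG has an even number of members and that quorum means a strict majority of voters.

```python
def quorum_voters(members, has_witness=True):
    """Count quorum voters for a DAG with the given number of members."""
    # Assumption: the witness acts as a tie-breaker only for an even member count.
    uses_witness = has_witness and members % 2 == 0
    return members + (1 if uses_witness else 0)


def tolerated_voter_losses(voters):
    """Number of voters that can be lost while a strict majority remains."""
    majority = voters // 2 + 1
    return voters - majority


for members in (2, 3, 4):
    voters = quorum_voters(members)
    print(members, "members:", voters, "voters, tolerates",
          tolerated_voter_losses(voters), "loss(es)")
```

For two and three members this reports three voters and one tolerated loss; for four members it reports five voters and two tolerated losses, matching the figures above.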

Two Four-Member DAGs in Two Datacenter/Active Directory Sites

As discussed above, when a single four-member DAG spans two datacenters, a WAN outage causes the members in one datacenter to lose quorum and go offline. We can deploy two four-member DAGs to overcome this problem, placing each DAG's witness server in a different datacenter. Because each DAG has the same number of members in each datacenter plus one witness server, a WAN outage leaves each DAG with a majority in the datacenter that hosts its witness, so service continues in both datacenters:

  • For DAG1, members REDMBX1 and REDMBX2 would be in the majority and would continue to service users in the Chicago datacenter because they can communicate with DAG1's witness server, HUB1.
  • For DAG2, members BLMBX3 and BLMBX4 would be in the majority and would continue to service users in the Madrid datacenter because they can communicate with DAG2's witness server, HUB2.
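The WAN-outage behaviour described in the two bullets can be sketched as a small partition check. The names of the remote members (`dag1-remote-a` and so on) are hypothetical placeholders, since only the majority-side servers are named above, and the sketch assumes each DAG has two members in each datacenter.

```python
def sites_with_quorum(voters):
    """Given {voter: site}, return the sites that keep a strict majority
    of voters when the sites can no longer reach each other."""
    majority = len(voters) // 2 + 1
    per_site = {}
    for site in voters.values():
        per_site[site] = per_site.get(site, 0) + 1
    return sorted(site for site, count in per_site.items() if count >= majority)


dag1 = {
    "REDMBX1": "Chicago", "REDMBX2": "Chicago",
    "HUB1": "Chicago",                                      # DAG1's witness server
    "dag1-remote-a": "Madrid", "dag1-remote-b": "Madrid",   # hypothetical names
}
dag2 = {
    "BLMBX3": "Madrid", "BLMBX4": "Madrid",
    "HUB2": "Madrid",                                       # DAG2's witness server
    "dag2-remote-a": "Chicago", "dag2-remote-b": "Chicago", # hypothetical names
}

print(sites_with_quorum(dag1))  # ['Chicago']
print(sites_with_quorum(dag2))  # ['Madrid']
```

Each DAG keeps three of its five voters on the side of its witness, so each datacenter retains one working DAG during the outage.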

 

Using Mailbox Servers That Don't Contain Databases in a DAG for Additional Votes

As mentioned earlier, larger DAGs provide greater resilience because they can sustain more failures without service interruption. Another design strategy adds resilience by leveraging the existing Hub Transport servers in our environment: install the Mailbox server role on them (without any databases or database copies) and add them to the DAG purely as voters. With more voters, the DAG can lose more of them and still maintain quorum.

In the diagram above, a four-member DAG is extended across two remote datacenters. There are five voters in this scenario (four DAG members and one witness server), so the DAG maintains quorum after losing two voters. If it loses a third voter, it loses quorum, and manual administrative intervention is required to restore service.

Based on the design scenarios above, let us map them to some possible deployment models for our organization. We can use either an Active/Passive or an Active/Active model.
