This information isn’t from Microsoft; it comes from in-the-field technicians working with clusters in real situations. At Phoenix Synergy we are contacted regularly to help local businesses (small and large) with any clustering help they may need. In this case the customer acquired 7 Windows 2003 Dell servers, each with the standard dual Broadcom NIC that comes with 1U servers these days. As you may know, those network interfaces can be “teamed” to form one interface. The client would like to make their production environment as fault tolerant as it can be. These seven servers are new and are set up in a lab environment, each running Windows 2003 Standard.

What we have to work with:

They have two dedicated Domain Controllers for Active Directory, which also run DNS for both internal and external name resolution. Their domains’ zone data will be hosted here, and these two servers will become ns1 and ns2.

They have two web servers and three complus (COM+) servers. Later they will implement their SQL clusters, but we will not get into that here.

So far it’s a straightforward configuration. They want the NICs teamed, with NIC1 from each server plugged into switch-1 and NIC2 plugged into switch-2, allowing a switch to fail. A cross-over cable between the two switches allows either NIC to fail. Each switch is connected to its own firewall/router. The gateway on each machine is set to point to the firewall that switch-1 is plugged into, and we add a second gateway IP with a different metric to allow for failure of the main firewall. Each firewall is plugged into a different ISP and has a different external IP configured. This allows ns2 to be an IP on ISP-2, which allows for a complete failure of the first ISP. Having all the host records on ns2 point to IPs from the second ISP allows for complete failure of an entire segment of their network.
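The two-gateway arrangement can be sketched in a few lines of Python. This is only an illustrative model of "prefer the primary firewall, fall back to the second one", not how Windows itself detects a dead gateway. The addresses, the metric values, and the `pick_gateway` helper are all hypothetical, and the sketch assumes the backup gateway carries the higher metric.

```python
from typing import Optional

# Hypothetical gateway table for one server: (gateway IP, metric).
# Assumption: the lower metric marks the preferred firewall.
GATEWAYS = [
    ("192.168.1.1", 1),   # firewall behind switch-1 (primary)
    ("192.168.1.2", 10),  # firewall behind switch-2 (backup)
]


def pick_gateway(gateways, reachable) -> Optional[str]:
    """Return the reachable gateway with the lowest metric, or None."""
    candidates = [(metric, ip) for ip, metric in gateways if ip in reachable]
    if not candidates:
        return None
    # Tuples compare by metric first, so min() yields the preferred gateway.
    return min(candidates)[1]


# Both firewalls up: traffic uses the primary.
print(pick_gateway(GATEWAYS, {"192.168.1.1", "192.168.1.2"}))
# Primary firewall down: traffic falls back to the second gateway.
print(pick_gateway(GATEWAYS, {"192.168.1.2"}))
```

With both firewalls reachable the primary is chosen; with the primary gone the second gateway takes over, which is the behavior the different metrics are meant to give.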

That is the layout. Once we get AD set up and DNS configured, we team and set up the NICs. We move a few plugs to test the theory of the setup and are confident everything is working well. So now we have to set up and test the cluster.

The cluster:

Since we do not have a hardware load balancer, we have to balance the load across the web servers and complus servers by way of Microsoft’s Network Load Balancing (NTLB). We proceed with the NTLB management software to cluster the web servers. Each of the two servers converges into the cluster more or less flawlessly. The moment we try the same on the complus machines, it does not go as well.

The problem:

We add complus1 to the cluster. It joins fine; of course it does, it’s the only member of the cluster. It says “converging” for a moment and then goes green. We attempt to add complus2 to the cluster and it says “converging” forever; it never converges. It stays in the state of “converging” for over 30 minutes, refresh after refresh, stopping and starting, pausing, trying whatever we can. We cannot get the second node to converge. We try adding complus3 and get the same result.

We retrace our steps, checking DNS for internal resolution of both the servers themselves and the cluster IPs, and all looks good. We ping all the nodes, and every node pings the others fine. IPConfig shows the cluster IP on each of the complus servers. NTLB is bound on each “Team” interface. Searching Microsoft’s support site suggests there is a problem with the NIC, so we proceed to unteam and try each NIC separately. As we retrace our steps we find the same problem regardless of which NICs we use on any machine.

On a whim we uncluster each server, reboot, and add complus2 first. Then we add complus3 to the cluster, and they “converge” within seconds. Trying to add complus1 fails. So we have isolated the problem to just one server.

The solution:

As it turns out, NTLB on complus1 was bound to each physical NIC (both members of the “team”). When we re-team the NICs and remove NTLB from nic1 and nic2, the server converges into the cluster without a problem.
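A check like this is easy to automate once you have an inventory of which adapters have NTLB bound. The sketch below is a minimal illustration in Python, not a tool we used on site: the `find_misbound_hosts` helper and the inventory dictionary are hypothetical, and the binding data is assumed to be entered by hand from each server’s adapter properties rather than queried from Windows.

```python
def find_misbound_hosts(bindings):
    """Return hosts where NTLB is not bound to exactly one interface.

    `bindings` maps host name -> {interface name: True if NTLB is bound}.
    The result maps each problem host to the sorted list of bound interfaces.
    """
    problems = {}
    for host, interfaces in bindings.items():
        bound = sorted(name for name, is_bound in interfaces.items() if is_bound)
        if len(bound) != 1:
            problems[host] = bound
    return problems


# Hypothetical inventory modeled on the situation described above.
inventory = {
    "complus1": {"Team": True, "NIC1": True, "NIC2": True},
    "complus2": {"Team": True, "NIC1": False, "NIC2": False},
    "complus3": {"Team": True, "NIC1": False, "NIC2": False},
}

# Lists complus1 together with every interface that has NTLB bound.
print(find_misbound_hosts(inventory))
```

Any host that shows up in the result has NTLB bound to more than one adapter (or to none) and is worth a look before trying to add it to the cluster.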

Summary:

When clustering, be certain to bind the NTLB service to only one interface, the one being used for the cluster. No other NIC should have NTLB bound to it. As we continue with this configuration, everything works. Most tests are successful and it looks like they will have a great fault tolerant production environment. Next up are the SQL clusters: implementing two SQL clusters with an EMC SAN live, with no tolerance for downtime. This could be fun. Until then…
