Anvil! Networking

The Anvil! Cluster implements four main network types, and there can be one or more instances of each.

Anvil! Networks

  • Back-Channel Network (BCN), subnet 10.20{x}/16, used by all machines: The BCN is used for cluster communications and for monitoring foundation pack equipment.
  • Internet-Facing Network (IFN), subnet <user defined>, used by all machines: The IFN is the connection between the servers hosted on the Anvil! and the user's main network. The subnet of this network is, therefore, defined by you.
  • Storage Network (SN), subnet 10.10{x}/16, used by nodes and some DR hosts: The SN is used for storage replication and nothing else. It is often connected back-to-back between the subnodes in an Anvil! node. When connecting to a DR host, the SN can be run through switches (see Storage Network below).
  • Migration Network (MN), subnet 10.199/16, used by nodes: The MN is an optional back-to-back network link between subnodes. When this network is available, the contents of a server's RAM are copied over this dedicated link during live migration. When it is not available, the RAM copy happens over the BCN (which, under high load, could cause contention).

Back-Channel Network

Back-Channel Networks are the primary networks used by the various Anvil! cluster machines. Examples:

  • Commands from the Striker dashboard are transmitted to the nodes and DR hosts via the BCN.
  • Scancore uses the BCN to communicate with UPSes, PDUs, IPMI out-of-band management interfaces, etc.
  • When the MN is not available, the RAM of servers being live-migrated between subnodes is copied over the BCN.

Most clusters have a single BCN, which uses the subnet 10.201/16. If there's need for a second BCN, it would be 10.202/16, and so forth, with the third digit of the second octet matching the network's sequence number (ie: BCN1 uses 10.201.x.y).

Device IP Addresses

Note: The IPs assigned to devices on the BCN follow the same scheme used on the other networks. That scheme is described in detail here.

In the BCN, SN and MN, the third octet of the IP indicates the device type, and the fourth octet defines the sequence number. The third octet is between 1 and 9 for Strikers and foundation pack devices; third octets of 10 and higher are assigned to nodes and DR hosts.

Let's lay this out in a table so that it makes more sense. For this example, 10.201/16 will be used, as we're in the BCN section. Note that, where applicable, the same third and fourth octets are used for the SN and MN as well.

Device             Third Octet  Example (first device)
Ethernet Switches  1            10.201.1.1
Switched PDU       2            10.201.2.1
Managed UPS        3            10.201.3.1
Striker Dashboard  4            10.201.4.1
Striker IPMI       5            10.201.5.1

Third octets 6 through 9 are not yet assigned to any device type.
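
To make the rule concrete, here is a minimal Python sketch of the addressing scheme described above. The device names and the bcn_ip helper are hypothetical, purely for illustration; the Anvil! software does not ship this function.

    # A sketch of the BCN addressing rule: the third octet encodes the
    # device type, the fourth octet the device's sequence number.
    DEVICE_OCTET = {
        "ethernet_switch":   1,
        "switched_pdu":      2,
        "managed_ups":       3,
        "striker_dashboard": 4,
        "striker_ipmi":      5,
    }

    def bcn_ip(device: str, sequence: int, bcn: int = 1) -> str:
        """Return the BCN IP for the given device type and sequence."""
        return f"10.{200 + bcn}.{DEVICE_OCTET[device]}.{sequence}"

    print(bcn_ip("managed_ups", 2))  # 10.201.3.2, the second managed UPS on BCN 1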

Anvil! nodes start with the third octet 10 for node 1, 12 for node 2, 14 for node 3, and so on. The reason for jumping by two is to allow the next octet up to be used by the subnodes' IPMI interfaces. Let's look at an example:

Node                  Subnode    BCN 1 - IP   IPMI - IP
Node 1 (an-anvil-01)  an-a01n01  10.201.10.1  10.201.11.1
                      an-a01n02  10.201.10.2  10.201.11.2
Node 2 (an-anvil-02)  an-a02n01  10.201.12.1  10.201.13.1
                      an-a02n02  10.201.12.2  10.201.13.2
Node 3 (an-anvil-03)  an-a03n01  10.201.14.1  10.201.15.1
                      an-a03n02  10.201.14.2  10.201.15.2
Node 4 (an-anvil-04)  an-a04n01  10.201.16.1  10.201.17.1
                      an-a04n02  10.201.16.2  10.201.17.2
Node 5 (an-anvil-05)  an-a05n01  10.201.18.1  10.201.19.1
                      an-a05n02  10.201.18.2  10.201.19.2
Note: Generally, DR hosts are assigned to a node, matching its hardware. This isn't required; a DR host can protect servers from multiple nodes, but the IP assignment is based around peering DR hosts to nodes. As such, DR hosts have host names and IPs that link them to a node. If you plan to use a DR host with multiple nodes, you can connect it to whichever node you like; the naming and IPs are not used to enforce any actual linkage.

With the above note said, we'll show an example where there is a DR host per node.

Node         DR Host     BCN 1 - IP   IPMI - IP
an-anvil-01  an-a01dr01  10.201.10.3  10.201.11.3
an-anvil-02  an-a02dr01  10.201.12.3  10.201.13.3
an-anvil-03  an-a03dr01  10.201.14.3  10.201.15.3
an-anvil-04  an-a04dr01  10.201.16.3  10.201.17.3
an-anvil-05  an-a05dr01  10.201.18.3  10.201.19.3

If you want to link additional DR hosts to a given node, you can do so by setting the fourth octet to 4, 5, and so on.
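
The node, subnode and DR host numbering above can be expressed as a short rule. Below is a minimal, hypothetical Python sketch of that rule; the helper names are ours, not part of the Anvil! software.

    # Node N's base third octet is 10 + 2*(N-1); the next octet up is
    # reserved for IPMI. Subnodes take fourth octets 1 and 2, and DR
    # hosts peered to the node take 3 and up.
    def node_octet(node: int) -> int:
        return 10 + 2 * (node - 1)

    def subnode_ips(node: int, subnode: int) -> tuple[str, str]:
        octet = node_octet(node)
        return (f"10.201.{octet}.{subnode}",      # BCN 1 IP
                f"10.201.{octet + 1}.{subnode}")  # IPMI IP

    def dr_ips(node: int, dr_seq: int = 1) -> tuple[str, str]:
        octet = node_octet(node)
        return (f"10.201.{octet}.{2 + dr_seq}",
                f"10.201.{octet + 1}.{2 + dr_seq}")

    print(subnode_ips(2, 1))  # ('10.201.12.1', '10.201.13.1')
    print(dr_ips(5))          # ('10.201.18.3', '10.201.19.3')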

Internet-Facing Network

This is the network used to connect machines in the Anvil! cluster to your network. For fully-offline installations, this could better be called the Intranet-Facing Network, but either way, IFN is an accurate acronym.

The IFN is used for all traffic going into and out of the servers hosted on the Anvil! node(s). It is also the network used to connect to the Striker dashboard's web interface, which is used to manage the Anvil! cluster. Lastly, when needed, it is usually the network used by Alteeve (with a support contract) to connect into the Anvil! cluster to provide support. Beyond this, the Anvil! does all it can to minimize its use of the IFN, reserving as much bandwidth as possible for the hosted servers.

When the IFN is not yet defined, we recommend using the same method: the third octet identifies the device type, and the fourth octet defines the device sequence number. Understandably though, this is rarely possible, so it is up to you to decide how IPs on the IFN are allocated to the various machines in the cluster.

Storage Network

Storage in an Anvil! node is kept synchronously replicated between the two subnodes over a dedicated network connection. This replication traffic uses the dedicated Storage Network, and nothing else uses that network to ensure minimum latency and maximum bandwidth.

This works by capturing the incoming write to storage from the guest server and sending a copy to the peer subnode. When the data reaches persistent storage (or protected cache, like flash-backed write cache) on both subnodes, the server's storage driver is told that the write is complete. This way, if the active subnode is destroyed without warning, the server will reboot on the surviving subnode and recover just as if it were a normal server that lost power.
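
Conceptually, the ordering works like the sketch below. This is an illustration of synchronous (write-through) replication in general, not the actual replication code path used by the Anvil!:

    # The guest's write is acknowledged only after both the local and
    # the peer copy report the data persistent.
    import io

    def replicated_write(data: bytes, local: io.BytesIO, peer: io.BytesIO) -> str:
        local.write(data)  # capture the incoming guest write locally
        peer.write(data)   # send a copy to the peer over the SN
        local.flush()      # wait until both copies reach persistent
        peer.flush()       # storage (or protected write cache)
        return "ack"       # only now is the guest's driver notified

    local, peer = io.BytesIO(), io.BytesIO()
    print(replicated_write(b"guest write", local, peer))  # ack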

The storage network can be run through a switch, or it can be a back-to-back connection between the subnodes. The subnet is 10.10x/16, where 'x' indicates the specific storage network. In almost all installations, there is only one storage network, and so the subnet is usually 10.101/16.

The storage network can be used for replication to DR hosts, but it isn't required. When the switching infrastructure is fast enough, running the SN through switches allows the DR host's storage network connection to receive the replication data on that dedicated network. However, given that DR hosts are usually offsite, this is often not possible. So in most cases, the storage network is connected back-to-back within the nodes, and the DR replication traffic is sent over the BCN (or the IFN, if that's the only available link).

Device IP Addresses

The IP addresses used on the SN match the BCN IPs, save for the different subnet (ie: 10.201.10.1 on BCN 1 becomes 10.101.10.1 on SN 1).

Given that there is no IPMI on the storage network, the third octet indicating the node still simply jumps by two (the in-between octets are left unused). For example:

Node                  Subnode    SN 1 - IP
Node 1 (an-anvil-01)  an-a01n01  10.101.10.1
                      an-a01n02  10.101.10.2
Node 2 (an-anvil-02)  an-a02n01  10.101.12.1
                      an-a02n02  10.101.12.2
Node 3 (an-anvil-03)  an-a03n01  10.101.14.1
                      an-a03n02  10.101.14.2
Node 4 (an-anvil-04)  an-a04n01  10.101.16.1
                      an-a04n02  10.101.16.2
Node 5 (an-anvil-05)  an-a05n01  10.101.18.1
                      an-a05n02  10.101.18.2
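
Put another way, an SN address is the matching BCN address with the second octet swapped. A minimal, hypothetical Python sketch:

    # 10.201.x.y on BCN 1 maps to 10.101.x.y on SN 1.
    def sn_ip_from_bcn(bcn_ip: str, sn: int = 1) -> str:
        octets = bcn_ip.split(".")
        octets[1] = str(100 + sn)
        return ".".join(octets)

    print(sn_ip_from_bcn("10.201.14.2"))  # 10.101.14.2 (an-a03n02 on SN 1)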

Migration Network

When the Anvil! determines that the active host for a server is degraded, or when an operator chooses to move a server for any reason, a frozen clone of the server is created on the peer subnode. The actively-used RAM is then copied over to the frozen copy of the server. This copy is what determines how long a live migration takes to complete.

Without a migration network, the RAM copy is done over the BCN. However, the BCN must go through a switch, which often limits the available bandwidth, both because of the cost of higher-speed switches and because of competition with other network traffic.

The migration network is an optional network used exclusively for copying the RAM of running servers during live migration. This is almost always run back-to-back between the subnodes in an Anvil! node, and it is not used for any other purpose. As such, a migration network can greatly reduce the time for live migrations to complete.

Subnode IP Assignment

Given that nothing else uses the MN, there is no reason for a second MN to exist. So the subnet is almost always 10.199/16. As above, the third and fourth octets indicate the node and subnode sequence.

Node                  Subnode    MN 1 - IP
Node 1 (an-anvil-01)  an-a01n01  10.199.10.1
                      an-a01n02  10.199.10.2
Node 2 (an-anvil-02)  an-a02n01  10.199.12.1
                      an-a02n02  10.199.12.2
Node 3 (an-anvil-03)  an-a03n01  10.199.14.1
                      an-a03n02  10.199.14.2
Node 4 (an-anvil-04)  an-a04n01  10.199.16.1
                      an-a04n02  10.199.16.2
Node 5 (an-anvil-05)  an-a05n01  10.199.18.1
                      an-a05n02  10.199.18.2
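
As with the SN, the MN addresses follow directly from the node and subnode sequence. A minimal, hypothetical sketch:

    # MN addresses reuse the node/subnode octets under the fixed
    # 10.199/16 subnet.
    def mn_ip(node: int, subnode: int) -> str:
        return f"10.199.{10 + 2 * (node - 1)}.{subnode}"

    print(mn_ip(3, 2))  # 10.199.14.2 (an-a03n02 on MN 1)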

Network Interface Types

In the Anvil!, there are three network interface types.

Interfaces

A network interface is a single, traditional network interface. Specifically, an interface is a physical connection to a specific port on a switch, or to another machine.

Bonds

A bonded interface is a virtual interface, made up using two physical network interfaces. The two real interfaces are "bonded" together to function as a single interface, but with the benefit of fault tolerance. If the active interface fails, the backup interface will take over extremely quickly, preventing any connections from being dropped or interrupted.

More specifically, the Anvil! uses "mode=1 (active-backup)" bonds. This bonding mode provides the fastest fault-detection and recovery out of all the available bonding modes in Linux.

Note that it is the only mode that does NOT aggregate bandwidth. That is to say, bonding two 10 Gbps interfaces results in a bonded interface that runs at 10 Gbps, not 20 Gbps. The reason for this is primarily the speed of fault detection and recovery, but also to ensure consistent performance in a fault condition.

As a principle of Intelligent Availability: if 10 Gbps (in this example) is not fast enough for normal operation, then faster interfaces should be used. If 10 Gbps is fast enough, then there's no reason to aggregate the bandwidth to 20 Gbps (which would fall back to 10 Gbps on link failure).
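
On a running subnode, the state of an active-backup bond can be inspected through the kernel's /proc/net/bonding/ files. Here is a minimal Python sketch that reads the mode and the currently active interface; the bond name bcn_bond1 is a hypothetical example and will vary by installation:

    # Parse /proc/net/bonding/<bond> for the mode and active interface.
    from pathlib import Path

    def bond_status(bond: str) -> dict:
        status = {}
        for line in Path(f"/proc/net/bonding/{bond}").read_text().splitlines():
            if line.startswith("Bonding Mode:"):
                status["mode"] = line.split(":", 1)[1].strip()
            elif line.startswith("Currently Active Slave:"):
                status["active"] = line.split(":", 1)[1].strip()
        return status

    # Example output: {'mode': 'fault-tolerance (active-backup)', 'active': 'eno1'}
    print(bond_status("bcn_bond1"))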

Bridges

A bridge is a virtual "ethernet switch". A bridged interface (which can connect to a bond or simple interface) acts as a "bridge" between hosted servers and the outside network.

Each server will have one (or more) virtual network interfaces. Outside the server, these interfaces need to "plug into" a switch. Given that the servers are virtual, this connection is made using a special vnetX interface that is created when the server boots and is deleted when the server migrates or shuts off. This transient vnetX device acts like a network cable, "plugging into" a bridge.

The bridge itself connects to the outside world, almost always via a bonded interface, which in turn sends and receives data on its active interface.

By default, all servers created on the Anvil! nodes have a single network interface connected to the Internet-Facing Network bridge. Optionally, an operator could add a second interface to connect to the Back-Channel Network if desired, or simply move the existing interface off of the IFN bridge over to the BCN bridge.
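
To see this in practice, the ports plugged into a Linux bridge can be listed from sysfs. A minimal Python sketch follows; the bridge name ifn_bridge1 is a hypothetical example:

    # Each entry in /sys/class/net/<bridge>/brif/ is a port on the
    # bridge: typically the bond, plus one transient vnetX device per
    # running server.
    import os

    def bridge_ports(bridge: str) -> list[str]:
        return sorted(os.listdir(f"/sys/class/net/{bridge}/brif"))

    print(bridge_ports("ifn_bridge1"))  # e.g. ['ifn_bond1', 'vnet0', 'vnet1']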


 
