Server-to-server storage replication

You can use Storage Replica to configure two servers to sync data so that each has an identical copy of the same volume. This article provides background on this server-to-server replication configuration and explains how to set it up and manage the environment.

To manage Storage Replica, you can use Windows Admin Center or PowerShell.

Prerequisites

  • Active Directory Domain Services forest (doesn't need to run Windows Server 2016).
  • Two servers running either Windows Server 2019 or later, or Windows Server 2016 Datacenter Edition. If you're running Windows Server 2019, you can use Standard Edition if you're OK replicating only a single volume up to 2 TB in size.
  • Two sets of storage, using SAS JBODs, fibre channel SAN, iSCSI target, or local SCSI/SATA storage. The storage should contain a mix of HDD and SSD media. Make each storage set available to only one of the servers, with no shared access.
  • Each set of storage must allow creation of at least two virtual disks, one for replicated data and one for logs. The physical storage must have the same sector sizes on all the data disks, and the same sector sizes on all the log disks.
  • At least one ethernet/TCP connection on each server for synchronous replication, but preferably RDMA.
  • Appropriate firewall and router rules to allow ICMP, SMB port 445, SMB Direct port 5445, and WS-MAN port 5985 bi-directional traffic between all nodes.
  • A network between servers with enough bandwidth to contain your IO write workload and an average of ~5ms round trip latency, for synchronous replication. Asynchronous replication doesn't have a latency recommendation.
  • The replicated storage can't be located on the drive containing the Windows operating system folder.
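
The firewall requirement in the list above can be spot-checked before you install anything. The following is a minimal sketch, not an official validation tool: Test-SRPortAccess is a hypothetical helper name, and SR-SRV06 is the example server name used later in this article. It tests only the TCP ports, so ICMP would need a separate check such as Test-Connection. A failed test on port 5445 might only mean that SMB Direct isn't listening on that server.

```powershell
# Hypothetical helper: tests the TCP ports that Storage Replica traffic uses.
function Test-SRPortAccess {
    param(
        [Parameter(Mandatory)] [string] $ComputerName,
        # SMB, SMB Direct, and WS-MAN ports from the prerequisites list
        [int[]] $Port = @(445, 5445, 5985)
    )
    foreach ($p in $Port) {
        $result = Test-NetConnection -ComputerName $ComputerName -Port $p -WarningAction SilentlyContinue
        [pscustomobject]@{
            ComputerName = $ComputerName
            Port         = $p
            Reachable    = $result.TcpTestSucceeded
        }
    }
}

# Example (run from SR-SRV05, then repeat in the other direction):
# Test-SRPortAccess -ComputerName 'SR-SRV06' | Format-Table -AutoSize
```

Because the rules must allow bi-directional traffic, run the check from each server against the other.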

If you're replicating between on-premises servers and Azure VMs, you must create a network link between the on-premises servers and the Azure VMs. To do so, use ExpressRoute, a Site-to-Site VPN gateway connection, or install VPN software in your Azure VMs to connect them with your on-premises network.

Important

In this scenario, each server should be in a different physical or logical site. Each server must be able to communicate with the other via a network.

You can check many of these requirements by using the Test-SRTopology cmdlet. You get access to this tool if you install the Storage Replica or the Storage Replica Management Tools feature on at least one server. You don't need to configure Storage Replica to use this tool; you only need to install the cmdlet.

Windows Admin Center requirements

To use Storage Replica and Windows Admin Center together, you need the following:

System | Operating system | Required for
Two servers (any mix of on-premises hardware, VMs, and cloud VMs including Azure VMs) | Windows Server 2019, Windows Server 2016, or Windows Server (Semi-Annual Channel) | Storage Replica
One PC | Windows 10 or later | Windows Admin Center

Note

Right now you can't use Windows Admin Center on a server to manage Storage Replica.

Terms

This walkthrough uses the following environment as an example:

  • Two servers, named SR-SRV05 and SR-SRV06.

  • A pair of logical "sites" that represent two different data centers, with one called Redmond and one called Bellevue.

Diagram showing a server in Building 5 replicating with a server in Building 9

Step 1: Install and configure Windows Admin Center

If you're using Windows Admin Center to manage Storage Replica, use the following steps to prep your PC to manage Storage Replica.

  1. Download and install Windows Admin Center.

  2. Download and install the Remote Server Administration Tools.

    If you're using Windows 10, version 1809 or later, install the RSAT: Storage Replica Module for Windows PowerShell from Features on Demand.

  3. Open an elevated PowerShell window and run the following command to enable the WS-Management protocol on the local computer and set up the default configuration for remote management on the client:

    winrm quickconfig
    
  4. Type Y to enable WinRM services and enable WinRM Firewall Exception.

Step 2: Provision operating system, features, roles, storage, and network

  1. Install Windows Server Desktop Experience on both server nodes.

    To use an Azure VM connected to your network via an ExpressRoute, see Adding an Azure VM connected to your network via ExpressRoute.

    Note

    Starting in Windows Admin Center version 1910, you can configure a destination server automatically in Azure. If you choose that option, install Windows Server on the source server and then skip to Step 3: Set up server-to-server replication.

  2. Add network information, join the servers to the same domain as your Windows 10 management PC (if you're using one), and then restart the servers.

    Note

    From this point on, always sign in as a domain user who is a member of the built-in Administrators group on all servers. Always elevate your PowerShell and CMD prompts when running on a graphical server installation or on a computer running Windows 10 or later.

  3. Connect the first set of storage, whether a JBOD enclosure, iSCSI target, FC SAN, or local fixed disk (DAS), to the server in site Redmond.

  4. Connect the second set of storage to the server in site Bellevue.

  5. As appropriate, install latest vendor storage and enclosure firmware and drivers, latest vendor HBA drivers, latest vendor BIOS/UEFI firmware, latest vendor network drivers, and latest motherboard chipset drivers on both nodes. Restart nodes as needed.

    Note

    Consult your hardware vendor documentation for configuring shared storage and networking hardware.

  6. Ensure that BIOS/UEFI settings for servers enable high performance, such as disabling C-State, setting QPI speed, enabling NUMA, and setting highest memory frequency. Ensure power management in Windows Server is set to High Performance. Restart as required.

  7. Configure roles as follows:

    • Windows Admin Center method

      1. In Windows Admin Center, navigate to Server Manager, then select one of the servers.
      2. Navigate to Roles & Features.
      3. Select Features, select Storage Replica, then select Install.
      4. Repeat these steps on the other server.
    • Server Manager method

      1. In Server Manager, select Create a server group, then add all server nodes.

      2. Install the File Server role and Storage Replica feature on each of the nodes and restart them. To learn more, see Install or Uninstall Roles, Role Services, or Features.

    • Windows PowerShell method

      On SR-SRV06 or a remote management computer, run the following command to install the required features and roles on both servers and restart them:

      $Servers = 'SR-SRV05','SR-SRV06'
      $Servers | ForEach-Object { Install-WindowsFeature -ComputerName $_ -Name Storage-Replica,FS-FileServer -IncludeManagementTools -Restart }
      
  8. Configure storage as follows:

    Important

    • You must create two volumes on each enclosure: one for data and one for logs.
    • Log and data disks must be initialized as GPT, not MBR.
    • The two data volumes must be of identical size.
    • The two log volumes should be of identical size.
    • All replicated data disks must have the same sector sizes.
    • All log disks must have the same sector sizes.
    • The log volumes should use flash-based storage, such as SSD. Microsoft recommends that the log storage be faster than the data storage. Log volumes must never be used for other workloads.
    • The data disks can use HDD, SSD, or a tiered combination and can use either mirrored or parity spaces or RAID 1 or 10, or RAID 5 or RAID 50.
    • The log volume must be at least 9GB by default and may be larger or smaller based on log requirements.
    • The File Server role is only necessary for Test-SRTopology to operate, as it opens the necessary firewall ports for testing.
    • For JBOD enclosures:

      1. Ensure that each server can see that site's storage enclosures only and that the SAS connections are correctly configured.

      2. Provision the storage using Storage Spaces by following Steps 1 - 3 in Deploy Storage Spaces on a Stand-Alone Server, using Windows PowerShell or Server Manager.

    • For iSCSI storage:

      1. Ensure that each server can see that site's storage enclosures only. Use more than one network adapter if using iSCSI.

      2. Provision the storage using your vendor documentation. If using Windows-based iSCSI Targeting, see iSCSI Target Server overview.

    • For FC SAN storage:

      1. Ensure that each server can see that site's storage enclosures only and that you properly zoned the hosts.

      2. Provision the storage using your vendor documentation.

    • For local fixed disk storage:

      • Ensure the storage doesn't contain a system volume, page file, or dump files.

      • Provision the storage using your vendor documentation.

  9. Start Windows PowerShell and use the Test-SRTopology cmdlet to determine if you meet all the Storage Replica requirements. You can use the cmdlet either in a requirements-only mode for a quick test or in a long-running performance evaluation mode.

    For example, to validate the proposed nodes that each have an F: and a G: volume and run the test for 30 minutes:

    $params = @{
        SourceComputerName       = 'SR-SRV05'
        SourceVolumeName         = 'F:'
        SourceLogVolumeName      = 'G:'
        DestinationComputerName  = 'SR-SRV06'
        DestinationVolumeName    = 'F:'
        DestinationLogVolumeName = 'G:'
        DurationInMinutes        = 30
        ResultPath               = 'C:\Temp'
    }
    # Create the report folder; -Force avoids an error if it already exists
    New-Item -ItemType Directory -Path 'C:\Temp' -Force | Out-Null
    Test-SRTopology @params
    

    Important

    When using a test server with no write IO load on the specified source volume during the evaluation period, consider adding a workload to generate a useful report. Test with production-like workloads to see real numbers and recommended log sizes. Alternatively, copy some files into the source volume during the test, or download and run DISKSPD to generate write IOs. For instance, the following sample generates a low write IO workload against the D: volume for 10 minutes:

    Diskspd.exe -c1g -d600 -W5 -C5 -b8k -t2 -o2 -r -w5 -i100 -j100 d:\test

  10. Examine the TestSrTopologyReport.html report to ensure that you meet the Storage Replica requirements.

    A screenshot displaying the Test-SRTopology report.
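
Several of the storage rules in step 8 can be checked with the Storage module cmdlets before you run Test-SRTopology. The following is a minimal sketch under stated assumptions: Get-SRDiskFact and Test-SRDiskFact are hypothetical helper names, the server names and the F: and G: drive letters come from this walkthrough, and the check compares partition sizes rather than volume sizes. It doesn't replace Test-SRTopology.

```powershell
# Hypothetical helper: collects partition style, sector sizes, and size for each drive letter.
function Get-SRDiskFact {
    param(
        [Parameter(Mandatory)] [string[]] $ComputerName,
        [char[]] $DriveLetter = @('F', 'G')
    )
    foreach ($server in $ComputerName) {
        $session = New-CimSession -ComputerName $server
        try {
            foreach ($letter in $DriveLetter) {
                $partition = Get-Partition -CimSession $session -DriveLetter $letter
                $disk = Get-Disk -CimSession $session -Number $partition.DiskNumber
                [pscustomobject]@{
                    ComputerName       = $server
                    DriveLetter        = $letter
                    PartitionStyle     = [string]$disk.PartitionStyle
                    LogicalSectorSize  = $disk.LogicalSectorSize
                    PhysicalSectorSize = $disk.PhysicalSectorSize
                    Size               = $partition.Size
                }
            }
        }
        finally {
            Remove-CimSession $session
        }
    }
}

# Hypothetical helper: returns one message for each rule that the collected facts break.
function Test-SRDiskFact {
    param([Parameter(Mandatory)] [object[]] $Fact)
    # Log and data disks must be GPT
    $Fact | Where-Object PartitionStyle -ne 'GPT' | ForEach-Object {
        "$($_.ComputerName) $($_.DriveLetter): partition style is $($_.PartitionStyle), expected GPT"
    }
    # Sector sizes and sizes are compared per drive letter across servers
    foreach ($group in $Fact | Group-Object DriveLetter) {
        foreach ($property in 'LogicalSectorSize', 'PhysicalSectorSize', 'Size') {
            if (($group.Group.$property | Select-Object -Unique).Count -gt 1) {
                "Drive $($group.Name): $property differs between servers"
            }
        }
    }
}

# Example:
# Test-SRDiskFact -Fact (Get-SRDiskFact -ComputerName 'SR-SRV05', 'SR-SRV06')
```

An empty result means none of the checked rules were broken. A size difference reported for the log drive is only a recommendation in step 8, while a difference for the data drive breaks a requirement.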

Step 3: Set up server-to-server replication

Using Windows Admin Center

  1. Add the source server.

    1. Select the Add button.
    2. Select Add server connection.
    3. Type the name of the server and then select Submit.
  2. On the All Connections page, select the source server.

  3. Select Storage Replica from the Tools panel.

  4. Select New to create a new partnership. To create a new Azure VM to use as the destination for the partnership:

    1. Under Replicate with another server select Use a New Azure VM and then select Next. If you don't see this option, make sure that you're using Windows Admin Center version 1910 or a later version.

    2. Specify your source server information and replication group name, and then select Next.

      This begins a process that automatically selects a Windows Server 2019 or Windows Server 2016 Azure VM as a destination for the migration source. Storage Migration Service recommends VM sizes to match your source, but you can override this by selecting See all sizes. Inventory data is used to automatically configure your managed disks and their file systems, and join your new Azure VM to your Active Directory domain.

    3. After Windows Admin Center creates the Azure VM, provide a replication group name and then select Create. Windows Admin Center then begins the normal Storage Replica initial synchronization process to start protecting your data.

  5. Provide the details of the partnership, and then select Create.

    A screenshot of the storage replica new partnership screen in Windows Admin Center.

Note

Removing the partnership from Storage Replica in Windows Admin Center doesn't remove the replication group name.

Using Windows PowerShell

Configure server-to-server replication using Windows PowerShell. You must perform all of the steps on the nodes directly or from a remote management computer that contains the Windows Server Remote Server Administration Tools.

  1. Ensure you're using an elevated PowerShell console.

  2. Configure the server-to-server replication, specifying the source and destination disks, the source and destination logs, the source and destination nodes, and the log size.

    $params = @{
       SourceComputerName      = 'SR-SRV05'
       SourceRGName            = 'RG01'
       SourceVolumeName        = 'F:'
       SourceLogVolumeName     = 'G:'
       DestinationComputerName = 'SR-SRV06'
       DestinationRGName       = 'RG02'
       DestinationVolumeName   = 'F:'
       DestinationLogVolumeName= 'G:'
       LogType                 = 'Raw'
    }
    New-SRPartnership @params
    
    DestinationComputerName : SR-SRV06
    DestinationRGName       : RG02
    SourceComputerName      : SR-SRV05
    PSComputerName          :
    

    Important

    The default log size is 8GB. Depending on the results of the Test-SRTopology cmdlet, you may decide to use the LogSizeInBytes parameter with a higher or lower value.

  3. To get replication source and destination state, use Get-SRGroup and Get-SRPartnership as follows:

    Get-SRGroup
    Get-SRPartnership
    (Get-SRGroup).replicas
    
    CurrentLsn             : 0
    DataVolume             : F:\
    LastInSyncTime         :
    LastKnownPrimaryLsn    : 1
    LastOutOfSyncTime      :
    NumOfBytesRecovered    : 37731958784
    NumOfBytesRemaining    : 30851203072
    PartitionId            : c3999f10-dbc9-4a8e-8f9c-dd2ee6ef3e9f
    PartitionSize          : 68583161856
    ReplicationMode        : synchronous
    ReplicationStatus      : InitialBlockCopy
    PSComputerName         :
    
  4. Determine the replication progress as follows:

    1. On the source server, run the following command and examine event IDs 1237, 2200, 5001, 5002, 5004, and 5015:

      Get-WinEvent -ProviderName Microsoft-Windows-StorageReplica -MaxEvents 20
      
    2. On the destination server, run the following command to see the Storage Replica events that show creation of the partnership. This event states the number of copied bytes and the time taken.

      Get-WinEvent -ProviderName Microsoft-Windows-StorageReplica | Where-Object { $_.Id -eq 1215 } | Format-List
      
      TimeCreated  : 4/8/2016 4:12:37 PM
      ProviderName : Microsoft-Windows-StorageReplica
      Id           : 1215
      Message      : Block copy completed for replica.
      
      ReplicationGroupName: RG02
      ReplicationGroupId: {616F1E00-5A68-4447-830F-B0B0EFBD359C}
      ReplicaName: F:\
      ReplicaId: {00000000-0000-0000-0000-000000000000}
      End LSN in bitmap:
      LogGeneration: {00000000-0000-0000-0000-000000000000}
      LogFileId: 0
      CLSFLsn: 0xFFFFFFFF
      Number of Bytes Recovered: 68583161856
      Elapsed Time (ms): 117
      

      Note

      Storage Replica dismounts the destination volumes and their drive letters or mount points. This is by design.

    3. Alternatively, the destination server group for the replica always reports the number of bytes remaining to copy, and you can query it through PowerShell. For example:

      (Get-SRGroup).Replicas | Select-Object numofbytesremaining
      

      As a progress sample (that doesn't terminate):

      while ($true) {
          $v = (Get-SRGroup -Name "RG02").Replicas | Select-Object NumOfBytesRemaining
          [System.Console]::Write("Number of bytes remaining: {0}`r", $v.NumOfBytesRemaining)
          Start-Sleep -Seconds 5
      }
      
    4. On the destination server, run the following command and examine event IDs 1237, 2200, 5001, 5002, 5004, and 5015 to understand the processing progress. There should be no warnings or errors in this sequence. Multiple occurrences of event ID 1237 indicate progress.

      Get-WinEvent -ProviderName Microsoft-Windows-StorageReplica | Format-List
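
The progress sample in step 4 runs until you stop it. As an alternative, the following sketch stops when the copy finishes or a timeout expires. Get-SRCopyPercent and Wait-SRInitialSync are hypothetical helper names, RG02 is the destination group from this walkthrough, and the percentage assumes that NumOfBytesRemaining counts down toward zero against PartitionSize, as the sample output in step 3 suggests. Run it on the destination server.

```powershell
# Hypothetical helper: converts one replica object from (Get-SRGroup).Replicas into a percentage.
function Get-SRCopyPercent {
    param([Parameter(Mandatory)] $Replica)
    if ($Replica.PartitionSize -le 0) { return 0 }
    $copied = $Replica.PartitionSize - $Replica.NumOfBytesRemaining
    [math]::Round(100 * $copied / $Replica.PartitionSize, 1)
}

# Hypothetical helper: polls the group until no bytes remain or the timeout expires.
# Returns $true if the copy finished and $false if the timeout was reached first.
function Wait-SRInitialSync {
    param(
        [string] $GroupName = 'RG02',
        [int] $TimeoutMinutes = 240,
        [int] $PollSeconds = 5
    )
    $deadline = (Get-Date).AddMinutes($TimeoutMinutes)
    while ((Get-Date) -lt $deadline) {
        $replica = (Get-SRGroup -Name $GroupName).Replicas | Select-Object -First 1
        # Assumption: zero bytes remaining means the initial block copy is done
        if ($replica.NumOfBytesRemaining -eq 0) { return $true }
        $percent = Get-SRCopyPercent -Replica $replica
        Write-Progress -Activity "Initial sync of $GroupName" -Status "$percent% copied" -PercentComplete $percent
        Start-Sleep -Seconds $PollSeconds
    }
    return $false
}

# Example:
# if (-not (Wait-SRInitialSync -GroupName 'RG02')) { Write-Warning 'Initial sync did not finish in time.' }
```

The sketch looks at only the first replica in the group, which matches the single-volume example in this article.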
      

Step 4: Manage replication

Manage and operate your server-to-server replicated infrastructure. You can perform all of the steps on the nodes directly or from a remote management computer that contains the Windows Server Remote Server Administration Tools.

  1. Use Get-SRPartnership and Get-SRGroup to determine the current source and destination of replication and their status.

  2. To measure replication performance, use the Get-Counter cmdlet on both the source and destination nodes. The counter names are:

    \Storage Replica Partition I/O Statistics(*)\Number of times flush paused
    \Storage Replica Partition I/O Statistics(*)\Number of pending flush I/O
    \Storage Replica Partition I/O Statistics(*)\Number of requests for last log write
    \Storage Replica Partition I/O Statistics(*)\Avg. Flush Queue Length
    \Storage Replica Partition I/O Statistics(*)\Current Flush Queue Length
    \Storage Replica Partition I/O Statistics(*)\Number of Application Write Requests
    \Storage Replica Partition I/O Statistics(*)\Avg. Number of requests per log write
    \Storage Replica Partition I/O Statistics(*)\Avg. App Write Latency
    \Storage Replica Partition I/O Statistics(*)\Avg. App Read Latency
    \Storage Replica Statistics(*)\Target RPO
    \Storage Replica Statistics(*)\Current RPO
    \Storage Replica Statistics(*)\Avg. Log Queue Length
    \Storage Replica Statistics(*)\Current Log Queue Length
    \Storage Replica Statistics(*)\Total Bytes Received
    \Storage Replica Statistics(*)\Total Bytes Sent
    \Storage Replica Statistics(*)\Avg. Network Send Latency
    \Storage Replica Statistics(*)\Replication State
    \Storage Replica Statistics(*)\Avg. Message Round Trip Latency
    \Storage Replica Statistics(*)\Last Recovery Elapsed Time
    \Storage Replica Statistics(*)\Number of Flushed Recovery Transactions
    \Storage Replica Statistics(*)\Number of Recovery Transactions
    \Storage Replica Statistics(*)\Number of Flushed Replication Transactions
    \Storage Replica Statistics(*)\Number of Replication Transactions
    \Storage Replica Statistics(*)\Max Log Sequence Number
    \Storage Replica Statistics(*)\Number of Messages Received
    \Storage Replica Statistics(*)\Number of Messages Sent
    

    For more information on performance counters in Windows PowerShell, see Get-Counter.

  3. To reverse the replication direction so that the other site becomes the source, use the Set-SRPartnership cmdlet.

    $params = @{
       NewSourceComputerName  = 'SR-SRV06'
       SourceRGName           = 'RG02'
       DestinationComputerName = 'SR-SRV05'
       DestinationRGName      = 'RG01'
    }
    Set-SRPartnership @params
    

    Warning

    Windows Server prevents role switching when the initial sync is ongoing, as it can lead to data loss if you attempt to switch before allowing initial replication to complete. Don't force switch directions until the initial sync is complete.

    Check the event logs to confirm that the replication direction changed, that recovery mode occurred, and that the volumes then reconciled. Write IOs can then write to the storage owned by the new source server. Changing the replication direction blocks write IOs on the previous source computer.

  4. To remove replication, use Get-SRGroup, Get-SRPartnership, Remove-SRGroup, and Remove-SRPartnership. Run the Remove-SRPartnership cmdlet on the current source of replication only, not on the destination server. Run Remove-SRGroup on both servers. For example, to remove all replication from two servers:

    Get-SRPartnership
    Get-SRPartnership | Remove-SRPartnership
    Get-SRGroup | Remove-SRGroup
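
To illustrate step 2, the following sketch samples three of the listed counters with Get-Counter. Get-SRCounterSample is a hypothetical helper name, the server names come from this walkthrough, and the choice of counters and sampling interval is only an example. Querying remote computers with Get-Counter depends on your environment allowing remote performance counter access, and the wildcard instance might return an error when no replication is configured.

```powershell
# Hypothetical helper: samples a few Storage Replica counters on the named servers.
function Get-SRCounterSample {
    param(
        [string[]] $ComputerName = @('SR-SRV05', 'SR-SRV06'),
        [int] $SampleInterval = 5,
        [int] $MaxSamples = 12
    )
    # Counter paths copied from the list in step 2
    $counters = @(
        '\Storage Replica Statistics(*)\Current RPO'
        '\Storage Replica Statistics(*)\Avg. Message Round Trip Latency'
        '\Storage Replica Partition I/O Statistics(*)\Avg. App Write Latency'
    )
    Get-Counter -ComputerName $ComputerName -Counter $counters -SampleInterval $SampleInterval -MaxSamples $MaxSamples |
        ForEach-Object { $_.CounterSamples } |
        Select-Object Timestamp, Path, CookedValue
}

# Example: about one minute of samples, five seconds apart
# Get-SRCounterSample | Format-Table -AutoSize
```

Each output row carries the full counter path, which includes the server name, so source and destination values can be compared side by side.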
    

Replacing DFS Replication with Storage Replica

Many Microsoft customers deploy DFS Replication as a disaster recovery solution for unstructured user data like home folders and departmental shares. DFS Replication shipped in Windows Server 2003 R2 and all later operating systems and operates on low bandwidth networks, which makes it attractive for high latency and low change environments with many nodes. However, DFS Replication has notable limitations as a data replication solution:

  • It doesn't replicate in-use or open files.
  • It doesn't replicate synchronously.
  • Its asynchronous replication latency can be many minutes, hours, or even days.
  • It relies on a database that can require lengthy consistency checks after a power interruption.
  • It's configured as multi-master, which allows changes to flow in both directions, possibly overwriting newer data.

Storage Replica has none of these limitations. It does, however, have several that might make it less interesting in some environments:

  • It only allows one-to-one replication between volumes. It's possible to replicate different volumes between multiple servers.
  • While it supports asynchronous replication, it's not designed for low bandwidth, high latency networks.
  • It doesn't allow user access to the protected data on the destination while replication is ongoing.

If these aren't blocking factors, Storage Replica allows you to replace DFS Replication servers with this newer technology. The process at a high level that allows users to access their data is as follows:

  1. Install or upgrade Windows Server on two servers and configure storage.
  2. Ensure data to be replicated is on data volumes, not the C: drive.
    1. Optionally seed the data on the other server using backups or file copies.
  3. Share the data on the source server via a DFS namespace to maintain accessibility.
    1. Create matching shares on the destination server; keep them disabled in DFS Namespaces.
  4. Enable Storage Replica replication and complete the initial sync.
    1. Prefer synchronous replication for data consistency.
    2. Enable Volume Shadow Copies and periodically take snapshots for data consistency.
  5. Operate normally until a disaster occurs.
  6. Switch the destination server to the new source to surface replicated volumes.
    1. With synchronous replication, minimal data restoration is needed; with asynchronous, use VSS snapshots if necessary.
  7. Add the server and shares to DFS Namespaces as folder targets.

Note

Disaster recovery planning is a complex subject and requires great attention to detail. Creating runbooks and performing annual live failover drills are highly recommended. When an actual disaster strikes, experienced personnel might be unavailable.

Adding an Azure VM connected to your network via ExpressRoute

  1. Create an ExpressRoute in the Azure portal.

    After the ExpressRoute is approved, a resource group is added to the subscription. Navigate to Resource groups to view this new group. Take note of the virtual network name.

    A screenshot of the Azure portal showing the resource group added with ExpressRoute.

  2. Create a new resource group.

  3. Add a network security group. When creating it, select the subscription ID associated with the ExpressRoute you created, and select the resource group you just created as well.

    Add any inbound and outbound security rules you need to the network security group. For example, you might want to allow Remote Desktop access to the VM.

  4. Create an Azure VM with the following settings:

    • Public IP address: None
    • Virtual network: Select the virtual network you took note of from the resource group added with the ExpressRoute.
    • Network security group (firewall): Select the network security group you created previously.

    A screenshot of the Create a virtual machine screen in the Azure portal displaying ExpressRoute network settings.

  5. After the VM is created, see Step 2: Provision operating system, features, roles, storage, and network.
