1010 Sizing

Maximum System Limits

The following are the maximum system limits of the StorSimple Armada Storage Appliance model 1010.  Treat these limits as upper boundaries in any performance, scalability, and sizing exercise.  Sizing should take into account all of the parameters and considerations defined in the rest of this document.  This section lists only the boundaries of system operation; values listed as "maximum" cannot be exceeded.

Item                                                  Value
Maximum configurable NTP servers                      2
Maximum configurable DNS servers                      2
Maximum CHAP target credentials                       1 (applied globally)
Maximum configured CHAP initiator credentials         256
Maximum configured volume access groups (VAG)         64
Maximum members per VAG                               64
Maximum cloud storage service accounts                3
Maximum volume size                                   2047GB
Maximum connected initiators (all volumes)            64
Maximum connected initiators (single volume)          25
Maximum snapshots and Cloud Clones (all volumes)      10,000
Maximum snapshots and Cloud Clones (single volume)    256
Base capacity license                                 10TB
Maximum capacity license                              50TB
Maximum LAN throughput                                3Gbps
Typical LAN throughput                                1Gbps
Maximum peak WAN throughput                           155Mbps
Typical peak WAN throughput                           45Mbps
Minimum recommended WAN bandwidth                     20Mbps
Peak IOPS                                             5,000
Typical IOPS                                          1,000
SSD working set size (raw, linear)                    25GB
SSD working set size (raw, deduplicated storage)      60GB
SSD working set size (unstructured data)              200GB (assuming 3:1 deduplication)
SSD working set size (structured data)                115GB (assuming 1.5:1 deduplication)

Sizing Summary

The following summarizes each of the use cases presented below as high-level guidelines for any performance, scalability, and sizing exercise when using Armada 1010.  Review the appropriate sections below to accurately determine the performance, scalability, and sizing specific to your environment’s characteristics.

 

Application and Use Case, with Summary for Armada 1010

Microsoft Exchange 2010 Primary Mailbox Storage
  • Up to 2TB of mailbox storage per system
  • High-availability through DAG (multiple servers)
  • No SPOF when using server RAID-1 and two appliances

Microsoft Exchange 2010 Archive Mailbox Storage
  • Up to 20TB of mailbox storage per system
  • High-availability through DAG (multiple servers)
  • No SPOF when using server RAID-1 and two appliances

SharePoint 2007/2010 Primary Storage
  • Up to 100GB of content database capacity
  • Up to 10TB of externalized BLOB capacity
  • High-availability, no SPOF through host-based RAID-1 on SQL Server

SharePoint 2007/2010 Backup Storage
  • Up to 20TB of backup data
  • Volumes of <=400GB if backup app uses virtual tapes (assumes 2:1 dedupe)
  • High-availability, no SPOF through backup server RAID-1 and two appliances
  • Disaster recovery through Cloud Clones and configuration backup/restore

Windows File Server Storage
  • Up to 20TB of file server data
  • High-availability, no SPOF through file server RAID-1 and two appliances
  • High-availability through Distributed File System (DFS) with replication
  • Disaster recovery through Cloud Clones and configuration backup/restore

Backup Target Storage
  • Up to 20TB of backup target storage
  • Volumes of <=400GB if backup app uses virtual tapes (assumes 2:1 dedupe)
  • High-availability, no SPOF through backup server RAID-1 and two appliances
  • Disaster recovery through Cloud Clones and configuration backup/restore

Recommended Reading for Exchange 2010

Please refer to the following Microsoft articles related to performance, scalability, and sizing metrics that are specific to Exchange 2010 and generic in terms of storage systems:

Recommended Reading for SharePoint 2007 and 2010

Please refer to the following Microsoft articles related to performance, scalability, and sizing metrics in SharePoint 2007 and 2010 environments.

Recommended Reading for Windows File Servers

Please refer to the following Microsoft articles related to performance, scalability, and sizing metrics for Windows file server environments.

Minimum WAN Capacity

Armada is a cloud storage gateway device, meaning data stored on StorSimple volumes will be tiered to cloud storage during its lifecycle.  As such, you should consider how much WAN capacity you need in order to support deployment of Armada.  The following factors should be considered:

  • How much data am I initially going to populate on the system?
  • Over what period of time do I plan to populate the system with the initial data set?
  • What is the change rate of data on a daily basis?
  • How much of the data do I plan to protect using Cloud Clones?
  • How frequently do I plan to create Cloud Clones?

 

StorSimple recommends a WAN link of 20Mbps or more for proper operation of Armada.  Additionally, StorSimple recommends that customers deploy Armada capacity to servers and follow these best practices:

  • Start with clean volumes for new servers, and migrate data slowly over a reasonable period of time, rather than all at once
  • Create snapshot and Cloud Clone policies (schedules) immediately upon creation of the volume so that backup copies are created as the volume is being populated
  • Determine your daily change rate by looking at your existing daily incremental backups, and ensure that you have sufficient WAN capacity to transfer that amount of data in addition to the existing amount of utilization you have on your WAN
  • When calculating daily change rate through examination of incremental backups, include the effects of deduplication, which may minimize the overall WAN bandwidth requirement
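
The WAN questions above can be turned into a rough estimate.  The following Python sketch is illustrative only and is not a StorSimple tool: the function names, the sample figures (100GB of daily change, 2TB of initial data), and the use of decimal units are assumptions, and it ignores protocol overhead and any existing traffic on the link.

```python
def required_wan_mbps(daily_change_gb, dedup_ratio=1.0, window_hours=24.0):
    """Average WAN rate in Mbps needed to upload one day's changes within the window."""
    if daily_change_gb < 0 or dedup_ratio <= 0 or window_hours <= 0:
        raise ValueError("daily change must be >= 0; ratio and window must be > 0")
    # Deduplication reduces the data that actually crosses the WAN.
    bits_to_send = (daily_change_gb / dedup_ratio) * 8 * 1000**3  # decimal GB
    return bits_to_send / (window_hours * 3600) / 1_000_000


def days_to_populate(initial_data_gb, available_mbps, dedup_ratio=1.0):
    """Days needed to push the initial data set to the cloud at a steady rate."""
    if initial_data_gb < 0 or available_mbps <= 0 or dedup_ratio <= 0:
        raise ValueError("data must be >= 0; bandwidth and ratio must be > 0")
    bits_to_send = (initial_data_gb / dedup_ratio) * 8 * 1000**3
    seconds = bits_to_send / (available_mbps * 1_000_000)
    return seconds / 86400


if __name__ == "__main__":
    # Hypothetical environment: 100GB of daily change, 2TB initial data, 20Mbps link.
    print(f"Daily change needs about {required_wan_mbps(100):.1f} Mbps sustained")
    print(f"Initial population takes about {days_to_populate(2000, 20):.1f} days")
```

Add the result of the first function to the utilization already present on the WAN link before comparing it with the available bandwidth.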

Exchange 2010 Primary Mailbox Storage

The following are the performance, scalability, and sizing metrics for the StorSimple Armada 1010 when deployed as primary mailbox storage in an Exchange 2010 environment.

Best Practices

The following are best practices provided by Microsoft for implementing mailbox server storage in an Exchange 2010 environment:

 

  • Use Basic Disks as opposed to Dynamic Disks
  • Use GUID Partition Table volumes as opposed to Master Boot Record volumes
  • Use the NTFS file system with a 64KB unit allocation size.  NTFS compression and encryption are not supported
  • Use separate volumes for transaction logs and mailbox databases
  • Include white space and dumpster size in calculating total storage requirements across all users.  A conservative rule of thumb is 25% of the mailbox size
  • Include content indexing in calculating total storage requirements, which is approximately 10% of the size of the mailbox database
  • Allocate sufficient capacity on a separate volume for transaction logs.  A conservative estimate is 5% of the size of the mailbox database volume
  • Leave sufficient space in your volumes to facilitate mailbox moves.  A conservative recommendation is 10% of the mailbox database and transaction log volumes
  • Keep database sizes at or under 2TB.  Use multiple databases when you need to go beyond 2TB, each on separate volumes
  • High availability through DAG should include a minimum of 3 replicas, which includes 1 on-site and 1 off-site.  Each replica requires an identical amount of capacity, both in terms of mailbox database capacity and transaction log capacity

Volume Configuration

For Microsoft Exchange 2010 primary storage deployments, it is recommended that volumes be created in pairs, one pair supporting each mailbox database and its associated transaction logs.  In the Exchange Management Console, ensure that databases and transaction logs are stored on separate volumes.

Maximum Primary Mailbox Storage

The maximum primary mailbox storage for Exchange 2010 is based on the percentage of total data that must be resident locally within Armada 1010 as the working set.  Assuming a 10% working set size (conservative estimation, assumes that users send and receive an amount of email equivalent to 10% of their mailbox quota within one week), and a maximum working set capacity of 200GB (assuming 3:1 deduplication of unstructured data such as messages and email attachments), the maximum primary mailbox storage using the Armada 1010 is 2TB.  2TB can be exceeded if the working set size is smaller, and you are satisfied with the performance provided by Armada.

 

The working set for Exchange 2010 can be calculated by multiplying the number of users by the number of messages sent and received daily, multiplied by the average message size, multiplied by the number of days that are considered the working set of email.  It is a best practice to assume that the last week of email is the active working set to achieve optimum performance.
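
The working set calculation and the resulting capacity ceiling can be sketched as follows.  This is an illustrative Python example, not a StorSimple formula: the function names and the sample inputs (500 users, 100 messages per day, 75KB average message size) are assumptions, and decimal units are used throughout.

```python
def exchange_working_set_gb(users, messages_per_day, avg_message_kb, days=7):
    """Working set = users x messages per day x average message size x days."""
    total_kb = users * messages_per_day * avg_message_kb * days
    return total_kb / 1_000_000  # KB to GB, decimal units


def max_capacity_gb(working_set_limit_gb, working_set_percent):
    """Largest data set whose working set still fits within the local limit."""
    if working_set_percent <= 0:
        raise ValueError("working_set_percent must be greater than zero")
    return working_set_limit_gb * 100 / working_set_percent


if __name__ == "__main__":
    # Hypothetical profile: 500 users, 100 messages/day, 75KB average message.
    print(f"Working set: {exchange_working_set_gb(500, 100, 75):.2f} GB")
    # 200GB working set limit with a 10% working set gives the 2TB guideline.
    print(f"Maximum mailbox storage: {max_capacity_gb(200, 10):.0f} GB")
```

If the computed working set is well under 200GB, the text above suggests that more than 2TB of mailbox storage may be workable, subject to performance qualification.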

Maximum Number of Mailboxes

Determining the maximum number of mailboxes for Exchange 2010 primary storage using Armada 1010 is based on two factors: IOPS and storage capacity.  In general, the vast majority of deployments will be bound by storage capacity rather than IOPS, as the IOPS requirement per user is quite low even with heavy users.

 

According to the Microsoft article "Understanding Database and Log Performance Factors", user profiles have the following IOPS requirements per user:

  • Light user (50 messages sent/received per day) – 0.06 IOPS
  • Medium user (100 messages sent/received per day) – 0.12 IOPS
  • Heavy user (200 messages sent/received per day) – 0.24 IOPS

 

Note: for Exchange 2010 deployments, even with heavy users, most deployments will be bound by storage capacity (number of mailboxes and mailbox size), as the IOPS per user is low even for a heavy user.

 

To determine the maximum number of mailboxes based on the size of the mailbox quota:

  • Divide maximum mailbox database storage (2TB) by mailbox quota including 25% overhead
  • Example: with 10GB quotas, 2TB / (10GB * 1.25) yields roughly 160 users
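
The capacity-bound calculation above can be expressed as a short Python sketch.  The function name is illustrative, and treating 2TB as 2000GB (decimal) is an assumption chosen to match the "roughly 160" figure in the example.

```python
import math


def max_mailboxes_by_capacity(max_storage_gb, quota_gb, overhead=0.25):
    """Mailboxes that fit when each quota carries white space/dumpster overhead."""
    if quota_gb <= 0:
        raise ValueError("quota_gb must be greater than zero")
    per_mailbox_gb = quota_gb * (1 + overhead)
    return math.floor(max_storage_gb / per_mailbox_gb)


if __name__ == "__main__":
    # 2TB (taken as 2000GB) with 10GB quotas and 25% overhead.
    print(max_mailboxes_by_capacity(2000, 10))
```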

 

To determine the maximum number of mailboxes based on average IOPS per user (averaged across all users, using percentage based on user types):

  • Divide typical IOPS by number of IOPS per user (averaged across all users)
  • Example:
  • 50% of users are light users (0.06 IOPS)
  • 40% of users are medium users (0.12 IOPS)
  • 10% of users are heavy users (0.24 IOPS)
  • Average IOPS per user: (0.5 * 0.06) + (0.4 * 0.12) + (0.1 * 0.24) = 0.102 IOPS
  • 1000 IOPS (typical) / 0.102 = 9,803 users
  • As noted above, most deployments are bound by storage rather than IOPS
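
The IOPS-bound calculation can be checked with a small Python sketch.  The per-profile IOPS values come from the list above; the function names and dictionary layout are illustrative assumptions.  For the 50/40/10 mix, the weighted average works out to 0.102 IOPS per user.

```python
# Per-user IOPS by profile, as listed in this section.
PROFILE_IOPS = {"light": 0.06, "medium": 0.12, "heavy": 0.24}


def average_iops_per_user(mix):
    """Weighted average IOPS per user; mix maps profile name to its share (0..1)."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("profile shares must add up to 1.0")
    return sum(share * PROFILE_IOPS[profile] for profile, share in mix.items())


def max_mailboxes_by_iops(typical_iops, mix):
    """Users supported by the given IOPS budget, rounded down."""
    return int(typical_iops / average_iops_per_user(mix))


if __name__ == "__main__":
    mix = {"light": 0.5, "medium": 0.4, "heavy": 0.1}
    print(f"Average IOPS per user: {average_iops_per_user(mix):.3f}")
    print(f"Users at 1000 IOPS: {max_mailboxes_by_iops(1000, mix)}")
```

Comparing this result with the capacity-bound figure shows why storage capacity, not IOPS, is normally the limiting factor.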

 

High Availability Configurations

The Armada 1010 does not provide integrated high availability.  Exchange 2010 relies on Database Availability Groups (DAG) for high availability across mailbox servers.  It is recommended that any deployment of Exchange 2010 in a production environment take advantage of DAG for high availability. 

 

To eliminate storage as a single point of failure (SPOF) behind an individual mailbox server while using Armada 1010, it is required that two Armada 1010 appliances be deployed, with volumes of equal size configured on each Armada 1010, and that server-based RAID-1 be configured to mirror data across those volumes in a pair-wise manner.

Exchange 2010 Archive Mailbox Storage

The following are the performance, scalability, and sizing metrics for the StorSimple Armada 1010 when deployed as archive mailbox storage in an Exchange 2010 environment where Service Pack 1 (SP1) has been deployed.

Best Practices

The following are best practices provided by Microsoft for implementing mailbox server storage in an Exchange 2010 environment:
  • Use Basic Disks as opposed to Dynamic Disks
  • Use GUID Partition Table volumes as opposed to Master Boot Record volumes
  • Use the NTFS file system with a 64KB unit allocation size.  NTFS compression and encryption are not supported
  • Use separate volumes for transaction logs and mailbox databases
  • Include white space and dumpster size in calculating total storage requirements across all users.  A conservative rule of thumb is 25% of the mailbox size
  • Include content indexing in calculating total storage requirements, which is approximately 10% of the size of the mailbox database
  • Allocate sufficient capacity on a separate volume for transaction logs.  A conservative estimate is 5% of the size of the mailbox database volume
  • Leave sufficient space in your volumes to facilitate mailbox moves.  A conservative recommendation is 10% of the mailbox database and transaction log volumes
  • Keep database sizes at or under 2TB.  Use multiple databases when you need to go beyond 2TB, each on separate volumes
  • High availability through DAG should include a minimum of 3 replicas, which includes 1 on-site and 1 off-site.  Each replica requires an identical amount of capacity, both in terms of mailbox database capacity and transaction log capacity.

Volume Configuration

For Microsoft Exchange 2010 archive storage deployments, it is recommended that volumes be created in pairs, one pair supporting each mailbox database and its associated transaction logs.  In the Exchange Management Console, ensure that databases and transaction logs are stored on separate volumes.

Maximum Archive Mailbox Database Storage

The maximum archive mailbox database storage for Exchange 2010 is based on the percentage of total data that must be resident locally within Armada 1010 as the working set.  Given that the purpose of an archive mailbox is to provide long-term retention for aged objects, the frequency with which these objects are accessed is very low, particularly in environments where the primary mailbox size is large.

 

Assuming a 1% working set size (conservative estimation, assumes that 1% of their archive is accessed on a weekly basis), and a maximum working set capacity of 200GB (assuming 3:1 deduplication of unstructured data such as messages and email attachments), the maximum archive mailbox database storage using the Armada 1010 is 20TB.  Microsoft recommends that mailbox database volumes be at most 2TB in size, therefore, when implementing storage in such an environment, use multiple 2TB volumes.  20TB can be exceeded, assuming the working set size is less than 1% and you are satisfied with the performance provided by Armada.
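
As a rough illustration of the paragraph above, the following Python sketch derives the archive capacity ceiling and the number of 2TB volumes it implies.  The function name and defaults are assumptions, and 2TB is treated as 2000GB (decimal).

```python
import math


def archive_volume_plan(working_set_limit_gb=200, working_set_percent=1,
                        max_volume_gb=2000):
    """Return (total archive capacity in GB, number of volumes needed)."""
    if working_set_percent <= 0 or max_volume_gb <= 0:
        raise ValueError("percent and volume size must be greater than zero")
    # Capacity ceiling: the working set limit divided by the working set share.
    total_gb = working_set_limit_gb * 100 / working_set_percent
    # Split the total across volumes no larger than the recommended maximum.
    volumes = math.ceil(total_gb / max_volume_gb)
    return total_gb, volumes


if __name__ == "__main__":
    total_gb, volumes = archive_volume_plan()
    print(f"{total_gb:.0f} GB across {volumes} volumes")
```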

 

As evidenced in the section above on primary storage, IOPS is of minimal consideration for Exchange 2010 primary storage.  It can be safely assumed that the IOPS requirement for archive storage will be substantially lower than that of primary storage, and thus, should not require consideration in determining the maximum number of users.

High Availability Configurations

The Armada 1010 does not provide integrated high availability.  Exchange 2010 relies on Database Availability Groups (DAG) for high availability across mailbox servers.  It is recommended that any deployment of Exchange 2010 in a production environment take advantage of DAG for high availability. 

 

To eliminate storage as a single point of failure (SPOF) behind an individual mailbox server while using Armada 1010, it is required that two Armada 1010 appliances be deployed, with volumes of equal size configured on each Armada 1010, and that server-based RAID-1 be configured to mirror data across those volumes in a pair-wise manner.

SharePoint 2007 and 2010 Primary Storage

The following are the performance, scalability, and sizing metrics for the StorSimple Armada 1010 when deployed as primary storage in a SharePoint 2007 or 2010 environment using StorSimple’s Database Optimizer with either External BLOB Storage (EBS) or Remote BLOB Storage (RBS).

Best Practices

The following are best practices provided by Microsoft for implementing the storage architecture supporting a SharePoint 2007 or 2010 environment:

  • Use Basic Disks as opposed to Dynamic Disks
  • Use GUID Partition Table volumes as opposed to Master Boot Record volumes
  • Use separate volumes for the content database, transaction logs, search databases, BLOB storage, and temporary databases
  • Other applications that use SQL Server should each have their own database volume and their own transaction log volume
  • Ensure that content databases are at or under 200GB in size
  • Create snapshot and Cloud Clone policies at the time of volume creation.  This will help ensure adequate backup performance.

Volume Configuration

It is recommended that for SharePoint 2007 and 2010 primary storage deployments, individual volumes are used as follows:
  • Individual volume for tempdb and transaction logs (same volume)
  • Individual volume for search database
  • Individual volume for search database transaction logs
  • Individual volume for each content database, when content database usage is heavy.  Otherwise, multiple content databases can be stored on an individual volume
  • Individual volume for each content database transaction logs, when content database usage is heavy.  Otherwise, multiple content database transaction logs can be stored on an individual volume
  • Multiple volumes of up to 2TB each supporting externalized BLOBs

Capacity Allocation Overview

In order to ensure adequate SharePoint performance, it is recommended that at least 50% of the data supporting database files and transaction logs (structured data) be kept local, and that at least 10% of the data supporting externalized BLOBs (unstructured data) be kept local. 

 

Additionally, it is generally safe to assume that the sum of capacity consumed by content databases will be approximately 1% of the sum capacity of BLOBs.  Similarly, the sum of capacity consumed by transaction logs will be approximately 25% of the sum capacity of the content databases. 
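
The ratios in this section can be combined into a quick estimate.  The Python sketch below is illustrative only: the function name and dictionary keys are assumptions, and the percentages are the rules of thumb stated above (content databases at 1% of BLOB capacity, transaction logs at 25% of content databases, 50% of structured data local, 10% of BLOB data local).

```python
def sharepoint_capacity_estimate(blob_gb):
    """Estimate structured-data sizes and local-residency targets from BLOB capacity."""
    if blob_gb < 0:
        raise ValueError("blob_gb must not be negative")
    content_db_gb = blob_gb * 1 / 100              # about 1% of BLOB capacity
    transaction_log_gb = content_db_gb * 25 / 100  # about 25% of content databases
    return {
        "content_db_gb": content_db_gb,
        "transaction_log_gb": transaction_log_gb,
        # At least 50% of structured data should stay local.
        "structured_local_gb": (content_db_gb + transaction_log_gb) * 50 / 100,
        # At least 10% of externalized BLOB data should stay local.
        "blob_local_gb": blob_gb * 10 / 100,
    }


if __name__ == "__main__":
    # 10TB of externalized BLOBs, taken as 10000GB (decimal).
    for name, value in sharepoint_capacity_estimate(10000).items():
        print(f"{name}: {value:.1f}")
```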

Maximum Database and Transaction Log Storage

Using these assumptions, the following are the recommended sizing guidelines for SharePoint when using EBS and RBS with StorSimple, where content databases, transaction logs, and externalized BLOBs are stored on volumes provided by StorSimple:

 

  • Maximum content database size (assuming structured data deduplication of 1.5:1, and 50% of data is local): 100GB
  • Maximum transaction log size (assuming structured data deduplication of 1.5:1, and 50% of data is local): 25GB
  • Maximum capacity for externalized BLOBs: 10TB

 

It is not recommended that sizes for content databases or transaction logs as described above exceed the maximum numbers unless performance qualification has been done and you are satisfied with the performance provided by Armada.  The maximum capacity for externalized BLOBs can be increased more liberally, but should also be done after performance qualification.

High Availability Configurations

The Armada 1010 does not provide integrated high availability.  SharePoint relies on SQL Server high availability, which utilizes cluster services and shared storage.  To eliminate storage as a single point of failure (SPOF) behind SQL Servers while using Armada 1010, it is required that two Armada 1010 appliances be deployed, with volumes of equal size configured on each Armada 1010, and that server-based RAID-1 be configured to mirror data across those volumes in a pair-wise manner.

SharePoint 2007 and 2010 Backup Storage

The following are the performance, scalability, and sizing metrics for the StorSimple Armada 1010 when deployed as backup storage in a SharePoint 2007 or 2010 environment.  This use case does not examine BLOB externalization, which is examined in the primary storage use case.  Rather, the volume(s) provided by StorSimple are used by backup applications for SharePoint data.

In backup and data protection environments, the speed at which data is backed up is generally of more concern than the speed at which data is restored, since restore is an infrequent event, whereas backup operations occur very regularly.

In order to support fast restore operations it is recommended that a minimum of 1% of the data be kept locally within the Armada 1010.  With this assumption, the maximum capacity that should be allocated for SharePoint backup and data protection is 20TB. 

Additionally, it is recommended that you create a snapshot and Cloud Clone policy immediately upon creation of the volume.  This way, Armada will begin populating data in the cloud as the volume is being populated, which will help ensure consistent performance.

Best Practices

The following are best practices when configuring Armada to provide volumes in support of SharePoint 2007 and 2010 backup:
  • Always create a snapshot and Cloud Clone schedule upon creation of the volume that will be used as a backup target
  • Ensure that you have sufficient WAN capacity to support transferring an entire day’s incremental backup to the cloud within a 24-hour period
  • If the SharePoint backup application stores data in a backup file format (similar to virtual tapes), use a virtual tape size (volume size) of 400GB or less, assuming a deduplication ratio of 2:1 (with 1:1 deduplication, use volume sizes of 200GB or less)

High Availability Configurations

In data protection use cases, high availability of the backup system itself is generally not required, especially in cases where the media is physically separate from the backup system.  In the case of Armada, backup copies are Cloud Clones, which are stored fully in the cloud.  The configuration backup and restore functionality of the Armada management console, along with the Data Protection MMC snap-in, ensures that you can quickly access your Cloud Clones after replacement of a failed device.

 

Should you require high availability of Armada, you can deploy two appliances and use host-based RAID-1 across identically configured volumes from each of the two separate Armada devices.  This should be done from the backup server that will be writing data to Armada.

Windows File Server Storage

The following are the performance, scalability, and sizing metrics for the StorSimple Armada 1010 when deployed as primary storage supporting a Windows File Server.

 

In file server environments, interactive user performance is of importance, but user operations are typically delay-tolerant.  As such, it is recommended that a minimum of 1% of the data be kept locally within the Armada 1010.  With this assumption, the maximum capacity that should be allocated for Windows File Server storage is 20TB.

Best Practices

The following are best practices to follow when using Armada 1010 to provide capacity to Windows file servers:
  • It is recommended that data migration to file shares stored on Armada volumes be done over a period of time rather than all at once
  • Create snapshot and Cloud Clone policies immediately upon creation of the volume to ensure better initial backup performance
  • Maximum capacity allocated for Windows File Server storage can be increased, assuming you are comfortable with the performance provided by Armada

High Availability Configurations

The Armada 1010 does not provide integrated high availability.  To eliminate storage as a single point of failure (SPOF) behind Windows File Servers while using Armada 1010, it is required that two Armada 1010 appliances be deployed, with volumes of equal size configured on each Armada 1010, and that server-based RAID-1 be configured to mirror data across those volumes in a pair-wise manner. 

 

Alternatively, storage as a single point of failure is less of a concern when multiple file servers are used with Distributed File System (DFS) with replication (including DFS-R), which provides integrated high availability across servers through server-based replication.  In this manner, failure of a server or Armada storage has minimal impact, as users are redirected to servers that are available to satisfy their file requests.

Backup Target Storage

The following are the performance, scalability, and sizing metrics for the StorSimple Armada 1010 when deployed as generic backup storage.

 

In backup and data protection environments, the speed at which data is backed up is generally of more concern than the speed at which data is restored, since restore is an infrequent event, whereas backup operations occur very regularly.

 

In order to support fast restore operations it is recommended that a minimum of 1% of the data be kept locally within the Armada 1010.  With this assumption, the maximum capacity that should be allocated for general backup target storage applications is 20TB.

Best Practices

The following are best practices when configuring Armada to provide volumes in support of general data protection applications:
  • Always create a snapshot and Cloud Clone schedule upon creation of the volume that will be used as a backup target
  • Ensure that you have sufficient WAN capacity to support transferring an entire day’s incremental backup to the cloud within a 24-hour period
  • If the backup application stores data using a virtual tape format, or any format other than native content format, use a virtual tape size (volume size) of 400GB or less, assuming a deduplication ratio of 2:1 (with a deduplication ratio of 1:1, use volume sizes of 200GB or less)
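
The virtual tape guideline above gives two data points: 200GB volumes at 1:1 deduplication and 400GB volumes at 2:1.  The Python sketch below assumes, as an extrapolation not stated in this document, that the recommended volume size scales linearly with the deduplication ratio; the function names are illustrative.

```python
import math

# Volume size recommended at a 1:1 deduplication ratio, per the guideline above.
BASE_VOLUME_GB = 200


def max_backup_volume_gb(dedup_ratio):
    """Recommended maximum volume size, assuming linear scaling with the ratio."""
    if dedup_ratio < 1:
        raise ValueError("dedup_ratio is expected to be 1 or greater")
    return BASE_VOLUME_GB * dedup_ratio


def volumes_needed(total_backup_gb, dedup_ratio):
    """Number of volumes required to hold the total backup capacity."""
    return math.ceil(total_backup_gb / max_backup_volume_gb(dedup_ratio))


if __name__ == "__main__":
    # 20TB of backup target storage (taken as 20000GB) at 2:1 deduplication.
    print(max_backup_volume_gb(2), volumes_needed(20000, 2))
```

Ratios other than 1:1 and 2:1 should be validated through performance qualification before they are relied upon.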

High Availability Configurations

In data protection use cases, high availability of the backup system itself is generally not required, especially in cases where the media is physically separate from the backup system.  In the case of Armada, backup copies are Cloud Clones, which are stored fully in the cloud.  The configuration backup and restore functionality of the Armada management console, along with the Data Protection MMC snap-in, ensures that you can quickly access your Cloud Clones after replacement of a failed device.

 

Should you require high availability of Armada, you can deploy two appliances and use host-based RAID-1 across identically configured volumes from each of the two separate Armada devices.  This should be done from the backup server that will be writing data to Armada.
