Deduplication

Deduplication is a form of data reduction that saves storage space. The deduplication process identifies duplicate content within a domain and stores only one copy of that content.

The HPE storage array implementation of deduplication works at the volume block level on the following arrays:
  • All Flash arrays running release 3.x or later
  • Secondary Flash arrays running release 4.2.0 or later
  • Select models of Adaptive Flash arrays running release 5.0.1 or later

When deduplication is enabled, identical content stored on the array is deduplicated using inline deduplication. With inline deduplication, the array deduplicates data in real time, as the data is received.

The deduplication process uses a two-level fingerprint system, with short fingerprints for speed of detection and long, cryptographically secure fingerprints to ensure reliability. The process also optimizes for “flocks” of duplicate data, that is, consecutive runs of duplicated blocks. This multi-layer deduplication process allows for near-perfect duplicate detection, while dramatically reducing the amount of main memory required to efficiently deduplicate large-capacity SSDs.
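
The two-level fingerprint idea described above can be illustrated with a minimal sketch. This is not the array's actual implementation: the class name DedupStore, the 4096-byte block size, the use of CRC32 as the short fingerprint, and the use of SHA-256 as the long fingerprint are all assumptions chosen for illustration, and the sketch does not model the “flock” optimization. It only shows how a cheap short fingerprint can narrow the search, with the long fingerprint confirming a true duplicate before a block is shared.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


class DedupStore:
    """Toy inline deduplicating block store (hypothetical, not the array's design)."""

    def __init__(self):
        self.blocks = []       # unique block contents, by physical index
        self.long_fps = []     # SHA-256 digest of each stored block
        self.short_index = {}  # short fingerprint -> list of physical indexes

    def write(self, block: bytes) -> int:
        """Deduplicate a block as it is received; return its physical index."""
        short_fp = zlib.crc32(block)                # cheap first-level check
        long_fp = hashlib.sha256(block).digest()    # strong second-level check
        for idx in self.short_index.get(short_fp, []):
            # Short fingerprints can collide, so confirm with the long one.
            if self.long_fps[idx] == long_fp:
                return idx  # duplicate: reference the existing copy
        # No confirmed match: store the block as new unique content.
        idx = len(self.blocks)
        self.blocks.append(block)
        self.long_fps.append(long_fp)
        self.short_index.setdefault(short_fp, []).append(idx)
        return idx


# Four logical blocks, of which only two are unique.
data = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"A" * BLOCK_SIZE * 2
store = DedupStore()
mapping = [store.write(data[i:i + BLOCK_SIZE])
           for i in range(0, len(data), BLOCK_SIZE)]
print(mapping)            # [0, 1, 0, 0]
print(len(store.blocks))  # 2
```

In this sketch the long fingerprint is always computed for simplicity. A memory-conscious design would keep only the short fingerprints in main memory and consult the long fingerprints on a short-fingerprint match, which is the kind of saving the paragraph above refers to.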

Data that was written to the array while deduplication was disabled is not deduplicated retroactively. To deduplicate that data, you must migrate it, either by using array-side functionality (for example, moving the volume to another deduplication-enabled pool in the group) or by using host tools to copy the data to a new deduplication-enabled volume or pool.

NOTE: Volume (and snapshot) limits and reserves are based on pre-deduplication usage.
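
As a worked illustration of the note above, the following sketch compares usage counted against a volume limit with the space consumed after deduplication. All of the numbers and names (limit, logical_written, physical_used) are hypothetical values chosen for the example, not figures from any array.

```python
GIB = 1024 ** 3

limit = 100 * GIB           # hypothetical volume limit
logical_written = 80 * GIB  # data written by hosts (pre-deduplication usage)
physical_used = 30 * GIB    # hypothetical space consumed after deduplication

# The limit is checked against pre-deduplication usage, so the savings
# from deduplication do not add headroom under the volume limit.
headroom = limit - logical_written
savings = logical_written - physical_used

print(headroom // GIB)  # 20
print(savings // GIB)   # 50
```

In this example the volume is 80% of the way to its limit even though the data occupies only 30 GiB after deduplication.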