As technology has evolved in the IT sector, many businesses are transitioning to newer, redundant systems. These include redundancy through RAID on a server or storage area network (SAN), compute redundancy (RAM/CPU) through virtualization, and redundancy from manufacturers incorporating field-replaceable units (FRUs) to swap out failed hardware. Additionally, many businesses now support 24/7 operations, and redundant systems have become a requirement.
This prompts the question, “Why even consider backups when everything is built on a redundant system?” Even with all that built-in redundancy, businesses often need a historical point of recovery or a way to accommodate disaster recovery. Backups are often the last line of defense in a data-loss scenario, and one rule that has consistently been considered a best practice in the industry is the 3-2-1 backup rule.
What Is the 3-2-1 Backup Rule?
The 3-2-1 backup rule translates to:
- Have three separate copies of data (primary copy with two backups),
- Stored on two different physical media,
- With one backup copy kept offsite.
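The rule is simple enough to express as a quick programmatic check. Below is a minimal, illustrative sketch in Python; the BackupCopy fields and the example plan are hypothetical stand-ins for a real inventory of data copies.

```python
from dataclasses import dataclass

@dataclass
class BackupCopy:
    name: str      # e.g., "production SAN", "onsite NAS", "cloud archive"
    media: str     # media type: "san", "nas", "tape", "cloud", ...
    offsite: bool  # True if stored away from the primary site

def satisfies_3_2_1(copies: list[BackupCopy]) -> bool:
    """Check a set of data copies against the 3-2-1 rule."""
    three_copies = len(copies) >= 3                  # primary plus two backups
    two_media = len({c.media for c in copies}) >= 2  # two different media types
    one_offsite = any(c.offsite for c in copies)     # at least one copy offsite
    return three_copies and two_media and one_offsite

# Hypothetical example: a typical small-business layout
plan = [
    BackupCopy("production SAN", media="san", offsite=False),
    BackupCopy("onsite backup NAS", media="nas", offsite=False),
    BackupCopy("cloud copy", media="cloud", offsite=True),
]
print(satisfies_3_2_1(plan))  # True
```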
Applying the 3-2-1 Backup Rule
As a consultant, I am frequently asked, “Do I really need to have at least three copies of my data?”
In truth, the answer is that it depends on the needs of the business; however, 3-2-1 is considered the bare-minimum rule of thumb to follow. One of the very few cases where it does not apply is when a business decides that if the primary site were to permanently go offline (through a tornado, fire, flood, or other natural or man-made disaster), the business would no longer continue.
Breaking Down the 3-2-1 Backup Rule
1st Copy of Data
To explain the 3-2-1 backup rule and why it is used, and to give a few examples, let’s establish a data center as a centralized location where the primary/in-production data exists. In the 3-2-1 backup rule, this can be considered the “1st copy” and can reside on:
- A physical server with direct-attached storage, or
- A SAN with virtualization on the backend.
Above: A Dell EqualLogic PS Series SAN.
2nd Copy of Data
The 2nd copy of data has historically been an onsite backup (onsite meaning in the same data center as the primary data). In both the SAN/virtualization solution and the physical-server case (typically a file server), keeping the backup onsite allows for a faster recovery time objective (RTO).
With the cost of storage continuing to fall, one practice I frequently see businesses trying to adopt when buying new hardware is to buy additional disk space on the SAN/local server and use those spare resources for their backups. It is very important that backup data not be kept on the same physical media as the production/primary data. Although a SAN by nature has built-in redundancy across many components, such as disks and power supplies, it is a really bad practice to keep both primary and backup data on the same media. Why? Because if the SAN becomes unrecoverable due to too many disk failures or the entire unit going bad, both the production data and the backups will be lost. Although losing an entire SAN is rare, it is possible and something I have personally seen occur multiple times.
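As a simple guardrail, a backup script can refuse to run when the backup target resolves to the same OS-visible device as the production data. The sketch below uses hypothetical mount points; note that the check cannot see through a SAN (two LUNs carved from the same array still appear as separate devices), so it supplements, rather than replaces, knowing your storage layout.

```python
import os

def same_device(path_a: str, path_b: str) -> bool:
    """Return True if two paths live on the same OS-visible block device."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev

# Hypothetical mount points for illustration
production = "/mnt/production"
backup = "/mnt/backups"

if same_device(production, backup):
    raise SystemExit("Backup target shares a device with production data!")
```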
With faster internet connection speeds and the cost of disk dropping, it can be argued that onsite backups are no longer a necessity for many businesses and that backups can go directly to the offsite media. Faster connections also allow businesses to perform cloud-based backup and cloud-based recovery within acceptable recovery time objectives.
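For businesses taking that route, pushing a backup archive straight to object storage can be quite simple. The sketch below assumes the boto3 SDK with AWS credentials already configured in the environment; the bucket name, file paths, and keys are placeholders, not recommendations.

```python
import boto3  # assumes AWS credentials are configured (env vars, profile, or IAM role)

def upload_backup(archive_path: str, bucket: str, key: str) -> None:
    """Push a local backup archive to offsite object storage."""
    s3 = boto3.client("s3")
    s3.upload_file(archive_path, bucket, key)

# Hypothetical names for illustration
upload_backup(
    "/backups/nightly-2024-01-15.tar.gz",
    bucket="example-offsite-backups",
    key="nightly/2024-01-15.tar.gz",
)
```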
3rd Copy of Data
The 3rd copy of data should reside at an offsite location. Although having two sets of data onsite covers a lot of data-loss scenarios, it does not cover everything. Fires, floods, blizzards and tornadoes could destroy a data center, and a hurricane may impact a large enough geographic region to make data inaccessible or destroy a copy of the backup.
When selecting an offsite location, it is important to be mindful of what types of disasters may impact the data center and how much geographical separation will be needed. For example, a hurricane hitting the northeast (such as Hurricane Sandy) may impact hundreds of miles at a time, so an acceptable location for the offsite copy may need to be a significant distance away.
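One way to sanity-check a candidate site is to compute the great-circle distance between the two data centers and compare it against a separation policy. In the sketch below, the coordinates and the 250-mile threshold are hypothetical placeholders, not a recommendation; pick a threshold that matches the disasters your region actually faces.

```python
from math import asin, cos, radians, sin, sqrt

def distance_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two sites via the haversine formula."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 3959 * asin(sqrt(a))  # Earth's radius is roughly 3,959 miles

# Hypothetical sites: primary near New York, offsite near Chicago
primary = (40.71, -74.01)
offsite = (41.88, -87.63)
MIN_SEPARATION_MILES = 250  # example policy for regional disasters

d = distance_miles(*primary, *offsite)
print(f"{d:.0f} miles apart: {'OK' if d >= MIN_SEPARATION_MILES else 'too close'}")
```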
Above: Disasters are rare, but recovering data from a physical machine after one strikes is difficult, if not impossible.
Other Considerations Beyond 3-2-1 Backup
Although the 3-2-1 backup rule creates a great baseline, there are several other considerations that should be made when creating a backup plan to achieve optimized recovery time objectives (RTO) and recovery point objectives (RPO):
- How do backups need to be recovered during a major outage?
- How many historical restore points do I need?
- At what frequency do my backups need to occur?
- Does my business have compliance needs that require additional backups beyond 3-2-1?
- How quickly does my data need to be recovered from a backup?
- How quickly is my backup data growing in my environment?
- Does my backup need to be application-aware?
All of these considerations can be tackled with different solutions, each with its own set of benefits. It is important to evaluate the needs of the business before selecting a set of products.
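To make the recovery-speed question concrete, a back-of-envelope estimate of restore time is often enough to rule options in or out. This is an illustrative sketch only; the 70% link-efficiency factor is an assumption, and real restores also depend on the backup product, deduplication, and disk throughput.

```python
def restore_hours(data_gb: float, link_mbps: float, efficiency: float = 0.7) -> float:
    """Rough restore-time estimate: data size over effective link throughput."""
    effective_mbps = link_mbps * efficiency          # assumed real-world overhead
    seconds = (data_gb * 8 * 1000) / effective_mbps  # GB -> megabits, then divide by rate
    return seconds / 3600

# Example: restoring 2 TB of data over a 500 Mbps connection
print(f"{restore_hours(2000, 500):.1f} hours")  # ~12.7 hours
```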
Look for upcoming blogs on considerations and practices for the questions above! In the meantime, if you have any questions about backing up your data, please get in touch!