The objective is to simplify the Azure resiliency options explaining the Local Redundancy (LR) , Zone Redundancy(ZR) and Geo Redundancy(GR) . Redundancy offers degree of High availability so does address the SLA % for Fault and PITR (Point in time recovery) for Disaster.
Redundancy is a key objective on Cloud paradigm giving the agility of the application compute and storage options so that any Disaster scenario could be handled with maximum flexibility , lowest possible downtime and with better cost effective option with minimal impact to platform and infrastructure.
Below diagram tries to address the key standards of redundancy from Architecture standpoint . When we talk about redundancy we should always differentiate Storage vs Compute redundancy. Below cases stated mostly applied for Storage redundancy however Compute redundancy is varied across services and offerings.
For e.g. for by-default Managed services having inherent redundancy inbuilt for e.g. Data is either replicated Synchronously three times in primary region using LRS (local redundant storage) and then replicated asynchronously to the secondary region as GRS (Geo redundant storage) or Data is replicated synchronously across 3 AZ’s in primary region using ZRS and then replicated to secondary in asynchronous way as GZRS (Geo Zone redundant storage)
In above example we focus on deploying Azure resources on US Eastern Region (US EAST). Similarly there would be other Geographical region exist making a Regional Pair for Geo Redundant Storage(GRS)
Now Primary EAST US region comprised of multiple Availability Zones(AZ) which is located into different hardware infrastructure . In this case US EAST having 3 AZ’s to give Zone Redundancy. All AZ’s are connected through Azure Virtual network to allow Synchronous replication. These AZ’s are logical entity combining physically different Datacenters.
Within one AZ there could be hundreds of Azure connected resources / services exist spread over one Data Center and across different floor of a building or could be separate out across hardware shelves/RAC’s. That’s how it is giving the local redundancy . LRS services are physically and logically closed to each other to allow minimal downtime for failure and maintenance for patches.
Hope this simplifies some cloudy areas of redundancy and explains the importance of Redundancy/Replication in cloud.