Distributed and Decoupled Resource Groups & Storage Accounts Paradigm
Nearly every resource in Azure needs underlying storage to operate, from VMs/compute to databases to APIs, middleware, logs, and more.
It is key to segregate storage accounts based on target workloads. There are four primary reasons (along with other auxiliary ones) why this distribution benefits us:
Security – It is important to apply policy and security governance across resources, including storage accounts. This gives better control over the underlying data and the target audience working with it (developers vs. testers), and tighter security control across storage units, containers, etc.
Performance – The heart of a healthy application. Isolating workloads based on their needs can be key to achieving great performance without one workload impacting another.
Ease of Management – With different AD group-level permissions and advanced roles, a distributed storage account paradigm gives administrators additional levers for control, and better policy and security enforcement can be applied.
Cost Analysis – When resources are organized in hierarchies, it is easy to get a detailed cost breakdown.
The diagram below is a high-level vision of how storage accounts could be distributed based on different technology stacks and teams within a business unit. This could be further integrated with other business units for a complete enterprise solution.
The idea behind the above distribution is to isolate storage accounts per resource group. In this example, a business unit's database resources could be co-located under a unified resource group, while DevOps, ML, and Kubernetes-native workloads can each have their own resource group. By isolating resource groups we isolate the associated storage accounts as well, and those storage accounts are layered into multiple logical units.
In the middle (in orange) we have a special resource group whose underlying storage account is shared across multiple technology areas for that specific business unit. A typical use case is a file-landing area shared by machine learning, batch, or streaming workloads, acting as a shared resource.
“There are only two hard things in Computer Science: cache invalidation and naming things.”
As enterprises start to utilize Azure resources, even a reasonably small footprint can begin to accumulate thousands of individual resources. This means that the resource count for much larger enterprises could quickly grow to hundreds of thousands of resources.
Establishing a naming convention during the early stages of establishing Azure architecture for your enterprise is vital for automation, maintenance, and operational efficiency. For most enterprises, these aspects involve both humans and machines, and hence the naming should cater to both of them.
Objective – Naming standards save teams countless hours of deliberation, prevent mistakes that others would otherwise inherit and suffer from, and instill discipline in the cloud ecosystem. This thread outlines the technical guardrails we can adopt to make naming a practice. The idea is to follow the KISS principle: "Keep it simple, stupid."
Naming conventions always end up in heated debates, and someone may arrogantly propose a "one-size-fits-all" theory, but relax! Let's first understand what we can do best for our own benefit.
WHY ADOPT BEST PRACTICES/RECOMMENDATIONS
What is an Azure Resource?
Any component that does some work is a resource in the Azure world. Here is the simple resource hierarchy in Azure:
Management groups: These groups are containers that help you manage access, policy, and compliance for multiple subscriptions. All subscriptions in a management group automatically inherit the conditions applied to the management group.
Subscriptions: A subscription logically associates user accounts and the resources that were created by those user accounts. Each subscription has limits or quotas on the amount of resources you can create and use. Organizations can use subscriptions to manage costs and the resources that are created by users, teams, or projects.
Resource groups: A resource group is a logical container into which Azure resources like web apps, databases, and storage accounts are deployed and managed.
Resources: Resources are instances of services that you create, like virtual machines, storage, or SQL databases.
When naming resources in the cloud, a degree of governance and guidelines is very important to drive a successful, tidy, thoughtful environment.
Hence tagging and naming are BOTH important factors for developers, administrators, and configuration managers to follow.
What important factors drive naming standards?
Consistency – Makes DevOps engineers job easy
Scalability and Flexibility
Ease of Management – Easy to follow and easy to identify resources/pinpoint product and application owners in a few seconds
Billing/Charge Back – Easy to find the resource usage cost by tagging and naming
Better monitoring and logging
Habit the practice – you will live with these names and spend a lot of time with them, so make them a pleasure to work with
Renaming Resources – Azure resources and resource groups can't be renamed once created. "Renaming" means moving all your resources from one group to another, and it can potentially impact many places
What is excluded?
The following will not follow the naming conventions, as they are automated by Azure itself.
Names of resources auto-generated by Azure, like managed disk names (e.g. <VMName>_OsDisk_1_25fbb34c5b074882bcd1179ee8b87eeb)
Supporting resource group for Azure Kubernetes Service
Some resources require a globally unique name
Internal resource group names related to Databricks, e.g. databricks-rg-dev000hidpuse02dbricks001-44gxjl4c7ailk
Azure Regions for Deployment of Resource Group and Resources
EAST US 2 is about 10% cheaper than EAST US and has better resource availability
Co-locating resources in the same region helps reduce ingress and egress data-transfer costs for any integration work
Key Resource Group(KRG) Naming
Resource group prefix – e.g. rg (short for resource group); this simplifies search while creating any resources
Department/Business Unit name – e.g. hosbi, crs – a preceding BU name helps identify the subscription within the merged corporate subscription tree
Type of the resources under that resource group – e.g. db, app, devops, ml, k8s
Environment type – e.g. dev, prd
Regional instance identifier (Azure datacenter) – e.g. eastus, westus – not mandatory to mention explicitly, especially when the business has agreed that all resources use a specific region
Incrementor: 01, 02, 03, etc., if required (can be used for a redundant copy of an existing resource group, or a brand-new resource group used by a different team under the same business unit).
Role/function of resources/primary use – e.g. operation, config, cache, analytics, oltpds, olapds, db, web, mail, shared (when the resource will be shared across resource groups)
Regional instance identifier (Azure datacenter) – e.g. eastus, westus – not mandatory to mention explicitly, especially when the business has agreed that all resources use a specific region
Incrementor: 01, 02, 03, etc., if required – e.g. can be used for HA resources like cluster nodes or VM scale sets.
Examples: dev-databricks-olap, prd-aks-analytics-eastus, dev-redis-cache, prd-postgres-oltp-config
Note: BU/product names are subject to change with organizational mergers, so it is better to keep those names in tags for each KRI rather than in the name itself.
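The convention above can be sketched as a small helper that assembles and sanity-checks a resource group name. This is an illustrative sketch, not an Azure API: the function name and parameter order are assumptions, and only the component order (prefix, BU, resource type, environment, optional region, optional zero-padded incrementor) comes from this article.

```python
import re

# Hypothetical helper illustrating the KRG convention described above.
def build_rg_name(bu, rtype, env, region=None, increment=None):
    parts = ["rg", bu, rtype, env]
    if region:                             # optional when one region is agreed
        parts.append(region)
    if increment is not None:
        parts.append(f"{increment:02d}")   # zero-padded: 01, 02, 03 ...
    name = "-".join(parts).lower()
    # Basic sanity check drawn from the rules in this article:
    # lowercase, single-hyphen separators, no leading digit or trailing hyphen.
    if not re.fullmatch(r"[a-z][a-z0-9]*(-[a-z0-9]+)*", name):
        raise ValueError(f"invalid name: {name}")
    return name

print(build_rg_name("hosbi", "db", "dev"))             # rg-hosbi-db-dev
print(build_rg_name("crs", "ml", "prd", "eastus", 1))  # rg-crs-ml-prd-eastus-01
```

Keeping the construction in one helper means every team produces names in the same shape, which is the whole point of the convention.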
Microservice Container Naming Convention
This can follow the standard pod naming when deploying resources via the corresponding Helm charts. Certain pod names can't be controlled because they are managed within Kubernetes' own realm.
Naming standard challenges:
There are many naming rules across Azure services. Some resources allow hyphens or periods in names while others do not. Some allow underscores while others prohibit anything except alphanumeric characters.
Some limit names to a certain number of characters while others allow quite lengthy names, and some allow uppercase while others are strictly lowercase.
Unifying the Naming Construct and Separators
Description of Rules
Always use lowercase. camelCase, PascalCase, and Initcap are not allowed.
No spaces are allowed in any name.
ABBREVIATION to shorten characters
Where the resource name length exceeds 8 characters, use a conformed and consistent abbreviated form, like df for Data Factory.
HYPHEN or DASH as Separator to break them up
If hyphen or dash is not allowed, use underscore ("_") instead. If no special characters are allowed at all, use 1 as a separator, e.g. poc1storage1db, poc1storage1olapds, poc1storage1shared. This is a rare case; for example, storage account names accept neither hyphens/dashes nor any other special characters, but the file services/blob services/containers underneath them can be named with hyphens. Use hyphen ("-") in resource or resource group names as the separator between identifier values, e.g. prd-react-analytics-eastus. Do not use more than one hyphen ("--") as a separator, e.g. dev--aks--olapds. Don't use hyphens inside the name of the resource type itself; e.g. eventhubs should never be written as event-hubs. Use continuous naming or an abbreviation for long names.
START AND END Character
Never start any resource name with a hyphen ("-") or other special character. Never end any resource name with a hyphen ("-") or other special character (like a period (".")). Never start with a numeric value; ending with a numeric value is allowed, e.g. dev-couchbase-config-eastus2.
Use an incrementor only when it makes sense, e.g. clusters, multiple nodes deployed for similar resources, VM scale sets, etc. Don't use 00 if no incrementor is present. Use 0 padding: never use 1/2/3, use 01/02/03 instead.
Should be exactly 3 characters (dev, tst, stg, prd, prf, uat, poc) – qa should be treated as analogous to the tst or stg environment
If the KRI string is too long to fit within the allowable limit, use an additional tag to identify the resource. In some cases an abbreviation can be used.
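The separator and character rules above can be captured in a single regular expression. This is a minimal sketch under the rules stated in this article; the abbreviation entries in the map are illustrative assumptions, not an official Azure list.

```python
import re

# Illustrative abbreviation map (assumed examples, not an official list).
ABBREVIATIONS = {"datafactory": "df", "eventhubs": "eh"}

NAME_RE = re.compile(
    r"^[a-z]"          # must start with a lowercase letter, never a digit or hyphen
    r"[a-z0-9]*"       # lowercase alphanumerics only, no spaces or uppercase
    r"(-[a-z0-9]+)*$"  # single hyphens as separators; forbids "--" and trailing "-"
)

def check_name(name: str) -> bool:
    """Return True if the name follows the construct rules above."""
    return bool(NAME_RE.match(name))

assert check_name("prd-react-analytics-eastus")
assert check_name("dev-couchbase-config-eastus2")  # ending in a digit is allowed
assert not check_name("dev--aks--olapds")          # double hyphen rejected
assert not check_name("Dev-Aks")                   # uppercase rejected
assert not check_name("2dev-aks")                  # leading digit rejected
assert not check_name("dev-aks-")                  # trailing hyphen rejected
```

A check like this can run in a CI pipeline before any deployment script is allowed to create a resource.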
REDUNDANT NAMING Cross Region
Avoid creating an unnecessary number of resources for redundancy and HA, because some services have built-in high availability (fault domains and update domains) and do not need to be created in different regions.
It is very important to identify the purpose of each resource; in the long term this makes that purpose easy to recognize.
Storage Account Name – cannot contain dashes or dots
SQL Server Name, Storage Account Name – must be unique across all of Azure, not just within the subscription
Search Service and Virtual Machines – 2 to 15 characters
Storage Account Name – cannot contain uppercase characters
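The per-service restrictions above can be checked programmatically. The sketch below is illustrative: the storage account limits (3 to 24 lowercase alphanumerics) reflect Azure's published rules as I understand them, and the 2-to-15-character range comes from this article; always verify against the current Azure naming-rules documentation.

```python
import re

# Assumed per-service patterns, derived from the restrictions listed above.
RULES = {
    "storage_account": re.compile(r"^[a-z0-9]{3,24}$"),    # no dashes, dots, or uppercase
    "virtual_machine": re.compile(r"^[a-zA-Z0-9-]{2,15}$"),  # 2 to 15 characters
}

def valid_for(kind: str, name: str) -> bool:
    """Return True if the name satisfies the (sketched) rule for that service."""
    return bool(RULES[kind].match(name))

assert valid_for("storage_account", "pocstoragedb")
assert not valid_for("storage_account", "poc-storage-db")  # dash not allowed
assert valid_for("virtual_machine", "dev-vm-01")
assert not valid_for("virtual_machine", "a" * 16)          # too long
```

Global uniqueness (for storage accounts and SQL servers) cannot be checked locally; it has to be verified against Azure at creation time.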
Sample abbreviated naming for some Azure Resources
Public IP Address
Network Security Group
App Service Plan
Tagging: Very useful for finding resource usage in Azure by tag name. It also helps categorize similar resources under one application or product.
Another great use of tagging is billing; it is a great way to report cost analysis. Note that tags are key/value pairs and can be changed anytime, whereas renaming a KRG/KRI is not a simple rename.
There are multi-level tagging options:
Tag name/value length: 512 / 256
Tag Name rule:
Name is the key; it can't be a duplicate string
Never use sequences like 1, 2, 3 in the tag name key
Tagging Naming Conventions for Resource group (All parameters below are mandatory)
A common and good use of tag name and value combinations would be as below:
Enforced by policy
Description (why needed?)
Should be only one value
If multiple products, use a hyphen ("-") separator
Use the full name
Identify who owns what; if there is a middle name, add another hyphen ("-")
Used for which platform: poc, dev, test, production?
e.g. couchbase, mongo, kafka, splunk, oracle, which are marketplace products and non-managed resources
Easy to filter and report in the cost sheet
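The tag limits and rules above (name up to 512 characters, value up to 256, no duplicate keys, no numeric sequences in keys) can be sketched as a small validator. The tag keys in the example (product, owner, environment) are assumptions for illustration, not a mandated set.

```python
# Minimal sketch of a tag check following the limits stated above.
# A Python dict already forbids duplicate keys, so that rule is implicit here.
def validate_tags(tags: dict) -> list:
    errors = []
    for name, value in tags.items():
        if len(name) > 512:
            errors.append(f"tag name too long: {name[:20]}...")
        if len(str(value)) > 256:
            errors.append(f"tag value too long for: {name}")
        # Simplification: flag any digit in the key to catch keys like owner1, owner2.
        if any(ch.isdigit() for ch in name):
            errors.append(f"avoid numeric sequences in tag name: {name}")
    return errors

tags = {"product": "web-analytics", "owner": "jane-a-doe", "environment": "prd"}
assert validate_tags(tags) == []
assert validate_tags({"owner1": "x"})  # non-empty error list
```

Run as part of the provisioning script, a check like this rejects non-conforming tags before the resource ever exists, which is much cheaper than retagging later.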
Scripts to create resources and scripts to enforce such policies… bookmark them. Non-prod and prod subscriptions can't create resource names/groups without a script.
Enforcing the naming and tagging practice:
ARM template scripts should be executed via the Azure CLI, PowerShell, or the Azure API when creating resources/tags, following the above rules for prod and non-prod environments (except the POC environment)
POC subscription resource groups should be easy to create without scripts but should still adhere to the above naming principles
Resources with a public endpoint already have an FQDN that accurately describes what they are, so in some cases the resource name is self-explanatory when looking at the default public endpoint URL Azure creates:
Storage is the backbone of any cloud IaaS, PaaS, or SaaS solution and is typically managed by the cloud provider. Identifying the capabilities of the different storage options, differentiating between them, and picking the right one is key to the success of a technology implementation. The differences are listed here at a high level; at lower levels there could be many more.
Note: The scope below is limited to the differences among Microsoft Azure's own offerings, not third-party storage options like NetApp, Pure Storage, or MinIO.
All three of these technologies provide a modern approach to cloud data warehousing, but each has a unique feature set that solves certain problems and poses unique challenges. A modern technology platform for a big enterprise should not take a monolithic approach to data solutions without a clear understanding of the business use case, and a polyglot-persistence architecture must be kept in mind when designing the data store.
It is hard to make an initial judgment about which data store to use for which purpose, hence the research, proof-of-concept, and due-diligence work required when architecting a data solution; this helps build the right things in the right way.
To understand the key differences, I have tried to put all three technology comparisons together in one frame at a very high level of differentiation; at a low level there could be thousands of other feature differences, which are out of scope for this thread. The differences captured below are current as of this writing and subject to change as the technologies evolve.
Let's deep dive into it; happy to hear feedback/comments below:
The objective is to simplify Azure's resiliency options by explaining Local Redundancy (LR), Zone Redundancy (ZR), and Geo Redundancy (GR). Redundancy offers a degree of high availability and thus addresses the SLA percentage for faults and PITR (point-in-time recovery) for disasters.
Redundancy is a key objective of the cloud paradigm, giving application compute and storage the agility to handle any disaster scenario with maximum flexibility, the lowest possible downtime, and the most cost-effective option, with minimal impact to platform and infrastructure.
The diagram below tries to address the key standards of redundancy from an architecture standpoint. When we talk about redundancy we should always differentiate storage redundancy from compute redundancy. The cases below apply mostly to storage redundancy; compute redundancy varies across services and offerings.
For example, managed services have inherent redundancy built in by default: data is either replicated synchronously three times within the primary region using LRS (locally redundant storage) and then replicated asynchronously to the secondary region as GRS (geo-redundant storage), or replicated synchronously across 3 AZs in the primary region using ZRS (zone-redundant storage) and then replicated asynchronously to the secondary region as GZRS (geo-zone-redundant storage).
In the above example we focus on deploying Azure resources in the US East region (EAST US). Similarly, another geographical region exists to form a regional pair for geo-redundant storage (GRS).
The primary EAST US region comprises multiple Availability Zones (AZs) located on different hardware infrastructure. In this case EAST US has 3 AZs to provide zone redundancy. All AZs are connected through the Azure virtual network to allow synchronous replication. These AZs are logical entities combining physically separate datacenters.
Within one AZ there could be hundreds of connected Azure resources/services spread over one datacenter, across different floors of a building, or separated across hardware shelves/racks. That is how local redundancy is provided: LRS resources are physically and logically close to each other, allowing minimal downtime for failures and patch maintenance.
Hopefully this clears up some cloudy areas of redundancy and explains the importance of redundancy/replication in the cloud.