Storage
Azure Storage
Azure Storage offers different services that you can use to store files, messages, tables and other types of information.
An Azure storage account is required to use Azure Storage services.
A storage account represents a collection of settings like location, replication strategy, and the owning subscription. You can have multiple storage accounts.
The storage account name must be globally unique because it's used as part of the URI for API access, like ${mystorageaccount}.blob.core.windows.net
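As a quick sketch of how these settings come together, here is a hedged Azure CLI example (the names, region, and SKU below are placeholder assumptions):
```
# Create a resource group, then a general-purpose v2 storage account.
# The account name must be globally unique (3-24 lowercase letters and digits).
az group create --name myresourcegroup --location westeurope

az storage account create \
  --name mystorageaccount \
  --resource-group myresourcegroup \
  --location westeurope \
  --sku Standard_LRS \
  --kind StorageV2
```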
Azure Blob Storage
Azure Blob Storage is an object storage solution for the cloud. Blob Storage is optimized for storing massive amounts of unstructured data.
Things to know about containers and blobs
- All blobs must be in a container.
- A container can store an unlimited number of blobs.
- An Azure storage account can contain an unlimited number of containers.
- A container name must be unique within the Azure storage account.
- A container's public access level is Private by default. There are three access level choices:
- Private: Prohibits anonymous access to the container and blobs.
- Blob: Allows anonymous public read access for the blobs only.
- Container: Allows anonymous public read and list access to the entire container, including the blobs.
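A minimal sketch of creating a container and uploading a blob with the Azure CLI (account, container, and file names are assumptions):
```
# Create a container; --public-access accepts off (private), blob, or container.
az storage container create \
  --account-name mystorageaccount \
  --name mycontainer \
  --public-access off

# Upload a local file into the container as a blob.
az storage blob upload \
  --account-name mystorageaccount \
  --container-name mycontainer \
  --name hello.txt \
  --file ./hello.txt
```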
Blob Storage access tiers
Blob Storage provides the following access tiers:
- Hot access tier: Optimized for storing data that is accessed frequently.
- Cool access tier: Optimized for data that is infrequently accessed and stored for at least 30 days.
- Archive access tier: Appropriate for data that is rarely accessed and stored for at least 180 days, with flexible latency requirements.
- Premium Blob Storage: Best suited for I/O-intensive workloads that require low and consistent storage latency.
- Best for workloads that perform many small transactions, like a mapping application that requires frequent and fast updates.
Only the hot and cool access tiers can be set at the account level. The archive access tier isn't available at the account level.
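To illustrate, a hedged Azure CLI sketch (placeholder names): the default tier is set at the account level, while Archive is applied per blob:
```
# The account-level default access tier can only be Hot or Cool.
az storage account update \
  --name mystorageaccount \
  --resource-group myresourcegroup \
  --access-tier Cool

# Archive is set per blob, not at the account level.
az storage blob set-tier \
  --account-name mystorageaccount \
  --container-name mycontainer \
  --name archive-me.log \
  --tier Archive
```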
You can use Azure Blob Storage lifecycle management policy rules to accomplish several tasks.
- Transition blobs to a cooler storage tier (Hot to Cool, Hot to Archive, Cool to Archive) to optimize for performance and cost.
- Delete blobs at the end of their lifecycles.
- Define rule-based conditions to run once per day at the Azure storage account level.
- Apply rule-based conditions to containers or a subset of blobs.
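As an illustration, a minimal lifecycle policy applied with the Azure CLI (the rule name, prefix, and day counts are assumptions):
```
# Tier block blobs under logs/ to Cool after 30 days and Archive after 90,
# then delete them 365 days after last modification.
cat > policy.json <<'EOF'
{
  "rules": [
    {
      "enabled": true,
      "name": "age-out-logs",
      "type": "Lifecycle",
      "definition": {
        "actions": {
          "baseBlob": {
            "tierToCool": { "daysAfterModificationGreaterThan": 30 },
            "tierToArchive": { "daysAfterModificationGreaterThan": 90 },
            "delete": { "daysAfterModificationGreaterThan": 365 }
          }
        },
        "filters": { "blobTypes": [ "blockBlob" ], "prefixMatch": [ "logs/" ] }
      }
    }
  ]
}
EOF

az storage account management-policy create \
  --account-name mystorageaccount \
  --resource-group myresourcegroup \
  --policy @policy.json
```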
Object replication (ObjR) copies blobs in a container asynchronously according to policy rules that you configure.
During the replication process, the following contents are copied from the source container to the destination container:
- The blob contents.
- The blob metadata and properties.
- Any versions of data associated with the blob.
Things to know about blob object replication
- Object replication requires that blob versioning is enabled on both the source and destination accounts. Change feed must be enabled for the source account.
- ObjR doesn't support blob snapshots. Any snapshots on a blob in the source account aren't replicated to the destination account.
- ObjR is supported when the source and destination accounts are in the Hot or Cool tier. The source and destination accounts can be in different tiers.
- When you configure object replication, you create a replication policy that specifies the source Azure storage account and the destination storage account.
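A hedged sketch of this setup with the Azure CLI (account and container names are placeholders; verify the or-policy parameters against your CLI version):
```
# Prerequisites: versioning on both accounts, change feed on the source.
az storage account blob-service-properties update \
  --account-name srcaccount --resource-group myresourcegroup \
  --enable-versioning true --enable-change-feed true
az storage account blob-service-properties update \
  --account-name dstaccount --resource-group myresourcegroup \
  --enable-versioning true

# Create the replication policy (configured on the destination account).
az storage account or-policy create \
  --account-name dstaccount \
  --resource-group myresourcegroup \
  --source-account srcaccount \
  --destination-account dstaccount \
  --source-container src-container \
  --destination-container dst-container
```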
Azure Blob Storage supports two forms of immutability policies for implementing immutable storage:
- Time-based retention policies let users store data for a specified interval. When a time-based retention policy is in place, objects can be created and read, but not modified or deleted. After the retention period has expired, objects can be deleted, but not overwritten. The Hot, Cool, and Archive access tiers support immutable storage by using time-based retention policies.
- Legal hold policies store immutable data until the legal hold is explicitly cleared. When a legal hold is set, objects can be created and read, but not modified or deleted. Premium Blob Storage uses legal holds to support immutable storage.
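For illustration, a hedged Azure CLI sketch of both policy types (names, the 30-day period, and the hold tag are assumptions):
```
# Time-based retention: blobs in the container are locked for 30 days.
az storage container immutability-policy create \
  --resource-group myresourcegroup \
  --account-name mystorageaccount \
  --container-name mycontainer \
  --period 30

# Legal hold: data stays immutable until the hold tag is explicitly cleared.
az storage container legal-hold set \
  --resource-group myresourcegroup \
  --account-name mystorageaccount \
  --container-name mycontainer \
  --tags audit2024
```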
A blob can be any type of data and any size of file. Azure Storage offers three types of blobs: block blobs, page blobs, and append blobs.
- Block blobs consist of blocks of data that are assembled to make a blob. Block blob is the default type: if you don't choose a specific type when creating a new blob, it's created as a block blob.
- Block blobs are ideal for storing text and binary data in the cloud, like files, images, and videos.
- Append blobs are similar to block blobs because they also consist of blocks of data.
- The blocks of data in an append blob are optimized for append operations.
- Append blobs are useful for logging scenarios, where the amount of data can increase as the logging operation continues.
- Page blobs can be up to 8 TB in size. Page blobs are more efficient for frequent read/write operations.
- Azure virtual machines use page blobs for operating system disks and data disks.
After you create a blob, you can't change its type.
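A small sketch of choosing the blob type at upload time with the Azure CLI (file and blob names are placeholders):
```
# --type selects block, append, or page; the type is fixed at creation.
az storage blob upload \
  --account-name mystorageaccount --container-name mycontainer \
  --name image.png --file ./image.png --type block

az storage blob upload \
  --account-name mystorageaccount --container-name mycontainer \
  --name app.log --file ./app.log --type append
```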
Azure Files
Azure Files offers fully managed file shares in the cloud that are accessible via the industry standard Server Message Block (SMB) and Network File System (NFS) protocols.
- Azure Files stores data as true directory objects in file shares.
- Azure Files provides shared access to files across multiple virtual machines.
- Applications that run in Azure VMs or cloud services can mount an Azure Files storage share to access file data.
There are two important settings for Azure Files that you need to be aware of when creating and configuring file shares.
- Open port 445 for Windows: Azure Files uses the SMB protocol, which communicates over TCP port 445 on Windows.
- Enable secure transfer: The Secure transfer required setting enhances the security of your storage account by limiting requests to your storage account from secure connections only.
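For example, secure transfer can be required on the account with one hedged Azure CLI call (the account name is a placeholder):
```
# Reject any request that arrives over an insecure connection (HTTP).
az storage account update \
  --name mystorageaccount \
  --resource-group myresourcegroup \
  --https-only true
```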
Premium file storage account - NFS
If you want support for the Network File System (NFS) protocol in Azure Files, use the premium file shares storage account type.
Azure File Sync
Azure File Sync enables you to cache several Azure Files shares on an on-premises Windows Server or cloud virtual machine.
You can use Azure File Sync to centralize your organization's file shares in Azure Files.
- Azure File Sync transforms Windows Server into a quick cache of your Azure Files shares.
- You can use any protocol that's available on Windows Server to access your data locally with Azure File Sync, including SMB, NFS and FTPS.
- Azure File Sync supports as many caches as you need around the world.
Cloud tiering
Cloud tiering is an optional feature of Azure File Sync. Frequently accessed files are cached locally on the server while all other files are tiered or archived to Azure Files based on policy settings. When a file is tiered, File Sync replaces the file locally with a pointer.
File Sync components
Azure File Sync is composed of four main components that work together to provide caching for Azure Files shares on an on-premises Windows Server or VM.
The Storage Sync Service is the top-level Azure resource for Azure File Sync.
This resource is a peer of the storage account resource and can be deployed in a similar manner.
- The Storage Sync Service forms sync relationships with multiple storage accounts by using multiple sync groups.
- The service requires a distinct top-level resource from the storage account resource to support the sync relationships.
- A subscription can have multiple Storage Sync Service resources deployed.
A sync group defines the sync topology for a set of files. Endpoints within a sync group are kept in sync with each other.
The registered server object represents a trust relationship between your server/cluster and the Storage Sync Service resource.
You can register as many servers to a Storage Sync Service resource as you want.
The Azure File Sync agent is a downloadable package that enables Windows Server to be synced with an Azure Files share.
The Azure File Sync agent has three main components:
- FileSyncSvc.exe: The background Windows service that's responsible for monitoring changes on server endpoints and for initiating sync sessions to Azure.
- StorageSync.sys: The Azure File Sync file system filter that supports cloud tiering.
- The filter is responsible for tiering files to Azure Files when cloud tiering is enabled.
- PowerShell cmdlets: Management cmdlets that allow you to interact with the Microsoft.StorageSync Azure resource provider.
Server endpoint
A server endpoint represents a specific location on a registered server, such as a folder on a server volume.
Multiple server endpoints can exist on the same volume if their namespaces are unique.
Cloud endpoint
A cloud endpoint is an Azure Files share that's part of a sync group. As part of a sync group, the entire cloud endpoint (Azure Files share) syncs.
- An Azure Files share can be a member of one cloud endpoint only.
- An Azure Files share can be a member of one sync group only.
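As a rough, hedged sketch (all names are placeholders, and the storagesync commands live in a CLI extension whose exact parameters should be verified), the share behind a cloud endpoint and the top-level sync resources might be created like this:
```
# Create the Azure Files share that will back a cloud endpoint.
az storage share-rm create \
  --storage-account mystorageaccount \
  --resource-group myresourcegroup \
  --name myshare --quota 1024

# The Storage Sync Service and sync group come from the storagesync extension.
az extension add --name storagesync
az storagesync create \
  --name mysyncservice --resource-group myresourcegroup --location westeurope
az storagesync sync-group create \
  --name mysyncgroup --storage-sync-service mysyncservice \
  --resource-group myresourcegroup
```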
Azure Managed Disks
Disk Storage provides disks for Azure virtual machines. Disk Storage allows data to be persistently stored and accessed from an attached virtual hard disk.
Disks come in many different sizes and performance levels, from solid-state drives (SSDs) to traditional spinning hard disk drives (HDDs), with varying performance tiers.
Data disks are used by virtual machines to store data like database files, website static content, or custom application code.
The number of data disks you can add depends on the virtual machine size. Each data disk has a maximum capacity of 32,767 GB.
Choose an encryption option
There are several encryption types available for your managed disks.
Azure Disk Encryption (ADE) encrypts a VM's virtual hard disks (VHDs). If a VHD is protected with ADE, the disk image is accessible only by the VM that owns the disk.
Server-Side Encryption (SSE) is performed on the physical disks in the data center. If someone directly accesses the physical disk, the data will be encrypted.
When the data is accessed from the disk, it's decrypted and loaded into memory. This form of encryption is also referred to as encryption at rest or Azure Storage encryption.
Encryption at host ensures that data stored on the VM host is encrypted at rest and flows encrypted to the Storage service.
Disks with encryption at host enabled aren't encrypted with SSE.
Instead, the server hosting your VM provides the encryption for your data, and that encrypted data flows into Azure Storage.
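Hedged Azure CLI sketches of the first two options (VM, vault, and image names are assumptions; encryption at host also requires the corresponding feature to be enabled on the subscription):
```
# Azure Disk Encryption: encrypt an existing VM's disks with a Key Vault key.
az vm encryption enable \
  --resource-group myresourcegroup \
  --name myvm \
  --disk-encryption-keyvault mykeyvault

# Encryption at host: opt in when creating the VM.
az vm create \
  --resource-group myresourcegroup \
  --name myhostencvm \
  --image Ubuntu2204 \
  --encryption-at-host true
```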
Azure Storage Security
Azure Storage provides a layered security model that lets you secure and control the level of access to your storage accounts.
The model consists of several storage security options, including firewall policies, customer-managed keys and endpoints.
Security options
The Azure security baseline for Azure Storage provides a comprehensive list of ways to secure your Azure Storage resources and grant only limited access to them.
For example, for Blob Storage, four authorization options are available: public access, Azure AD, Shared Key, and SAS.
All data written to Azure Storage is automatically encrypted. Azure Storage encryption offers two ways to manage encryption keys at the storage account level:
- Microsoft-managed keys : By default, Microsoft manages the keys used to encrypt your storage account.
- Customer-managed keys : You can optionally choose to manage encryption keys for your storage account. Customer-managed keys must be stored in Azure Key Vault.
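A hedged sketch of switching an account to customer-managed keys (vault and key names are placeholders; the account's managed identity must already have access to the Key Vault key):
```
az storage account update \
  --name mystorageaccount \
  --resource-group myresourcegroup \
  --encryption-key-source Microsoft.Keyvault \
  --encryption-key-vault https://mykeyvault.vault.azure.net \
  --encryption-key-name mykey
```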
Data in transit: Data can be secured in transit between an application and Azure by using client-side encryption, HTTPS, or SMB 3.
Disk encryption: Operating system disks and data disks used by Azure Virtual Machines can be encrypted by using Azure Disk Encryption.
Shared access signatures (SAS) provide secure delegated access to resources in your storage account. With a SAS, you have granular control over how a client can access your data.
Azure Storage supports three types of SAS:
- User delegation SAS: Can only be used for Blob Storage and is secured with Azure AD credentials.
- Service SAS: Delegates access to a resource in any one of four Azure Storage services: Blob, Queue, Table, or Files (shares accessed over REST, not SMB).
- A service SAS is secured using a storage account key.
- Account SAS: Has the same controls as a service SAS, but can also control access to service-level operations, such as Get Service Stats.
- An account SAS is secured using a storage account key.
When creating a SAS, you can set the following (a CLI sketch follows the list):
- Allowed resource types: Service, Container, Object.
- Expiration date/time.
- Allowed IP addresses.
- Allowed protocols: HTTPS only, or HTTPS and HTTP.
- Allowed permissions.
- Signing method and key.
- ...
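A hedged sketch of issuing SAS tokens with the Azure CLI ($STORAGE_KEY, names, dates, and the IP range are assumptions):
```
# Service SAS for one blob: read-only, HTTPS only, limited to an IP range.
az storage blob generate-sas \
  --account-name mystorageaccount \
  --container-name mycontainer \
  --name hello.txt \
  --permissions r \
  --expiry 2030-01-01T00:00Z \
  --https-only \
  --ip 203.0.113.0-203.0.113.255 \
  --account-key "$STORAGE_KEY"

# Account SAS: scope services (b=blob) and resource types (c=container, o=object).
az storage account generate-sas \
  --account-name mystorageaccount \
  --services b \
  --resource-types co \
  --permissions rl \
  --expiry 2030-01-01T00:00Z \
  --account-key "$STORAGE_KEY"
```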
Use stored access policies to delegate access
SAS is a secure way to give access to clients without having to share your Azure credentials. This ease of use comes with a downside.
Anyone with the correct SAS can access the file while it's still valid.
The only way you can revoke access to the storage is to regenerate access keys. Regeneration requires you to update all apps that are using the old shared key to use the new one.
Another option is to associate the SASs with a stored access policy.
Stored access policies give you the option to revoke permissions without having to regenerate the keys.
A stored access policy is created with the following properties: identifier (name), start/expiry time, and permissions.
All of these actions can be performed after releasing the SAS token for resource access:
- Controlling permissions for the signature.
- Revoking access.
- Changing the start time and the end time for a signature's validity.
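A minimal sketch, assuming placeholder names: create the policy on a container, then issue SAS tokens bound to it. Editing or deleting the policy later changes or revokes every SAS that references it, with no key regeneration needed:
```
# Stored access policy on the container.
az storage container policy create \
  --account-name mystorageaccount \
  --container-name mycontainer \
  --name readpolicy \
  --permissions r \
  --expiry 2030-01-01T00:00Z

# SAS bound to the policy instead of carrying its own permissions/expiry.
az storage blob generate-sas \
  --account-name mystorageaccount \
  --container-name mycontainer \
  --name hello.txt \
  --policy-name readpolicy
```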
Azure Storage creates two 512-bit access keys for every storage account that's created. You share these keys to grant clients access to the storage account.
These keys grant anyone with access the equivalent of root access to your storage.
Shared Key authorization is supported for all four Azure Storage services: Blobs, Queues, Tables, and Files (shares accessed over REST or SMB).
Firewall policies and rules limit access to your storage account. Requests can be limited to specific IP addresses/ranges or to a list of subnets in an Azure virtual network.
The default network rule is to allow all connections from all networks.
VNet service endpoints restrict network access and provide direct connection to your Azure storage.
You can secure storage accounts to your VNet and enable private IP addresses in the virtual network to reach the service endpoint.
By default, Azure services are all designed for direct internet access. All Azure resources have public IP addresses, including PaaS services such as Azure SQL Database and Azure Storage.
Because these services are exposed to the internet, anyone can potentially access your Azure services.
Service endpoints can connect certain PaaS services directly to your private address space in Azure, so they act like they’re on the same virtual network.
Use your private address space to access the PaaS services directly. Adding service endpoints doesn't remove the public endpoint. It simply provides a redirection of traffic.
Azure service endpoints are available for many services, such as: Azure Storage, Azure SQL Database, Azure Cosmos DB, Azure Key Vault, Azure Data Lake, Azure Service Bus
How service endpoints work
To enable a service endpoint, you must:
- Turn off public access to the service.
- Add the service endpoint to a virtual network.
When you enable a service endpoint, you restrict the flow of traffic and enable your Azure VMs to access the service directly from your private address space.
Devices cannot access the service from a public network.
Secure transfer enables an Azure storage account to accept requests from secure connections.
When you require secure transfer, any requests originating from non-secure connections are rejected.
By default, storage accounts accept connections from clients on any network.
To limit access to selected networks, you must first change the default action. You can restrict access to specific IP addresses, ranges, or virtual networks.
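A hedged Azure CLI sketch of locking an account down to a subnet and an IP range (all names and addresses are placeholders):
```
# Enable the Microsoft.Storage service endpoint on the subnet first.
az network vnet subnet update \
  --resource-group myresourcegroup --vnet-name myvnet --name mysubnet \
  --service-endpoints Microsoft.Storage

# Change the default action, then add the allowed networks.
az storage account update \
  --name mystorageaccount --resource-group myresourcegroup \
  --default-action Deny
az storage account network-rule add \
  --account-name mystorageaccount --resource-group myresourcegroup \
  --vnet-name myvnet --subnet mysubnet
az storage account network-rule add \
  --account-name mystorageaccount --resource-group myresourcegroup \
  --ip-address 203.0.113.0/24
```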
Azure AD and RBAC are supported for Azure Storage for both resource management operations and data operations.
Azure AD supports: Blob, Table, Queue. For file shares, access over SMB is supported only through Active Directory (AD DS); access over REST isn't supported.
Authorization: Every request made against a secured resource in Blob Storage, Azure Files, and so on, must be authorized.
Authorization types supported by the storage services
Azure artifact | Shared Key (storage account key) | SAS | Azure AD | AD Domain Services |
---|---|---|---|---|
Azure Blobs | Yes | Yes | Yes | Not supported |
Azure Files (SMB) | Yes | Not supported | Only supported with AD DS | Yes but credentials must be synced to Azure AD |
Azure Files (REST) | Yes | Yes | Not supported | Not supported |
Azure Queues | Yes | Yes | Yes | Not supported |
Azure Tables | Yes | Yes | Yes | Not supported |
Data Redundancy
Azure Storage provides several redundancy options that can help ensure your data is available.
Redundancy in the primary region can be provided as follows:
- Locally redundant storage (LRS): helps protect your data against drive or server rack failures in a data center. But if a disaster occurs within the data center, all replicas of a storage account that uses LRS might be lost.
- Copies your data synchronously three times within a single physical location in the primary region.
- Is the least expensive replication option.
- Isn't recommended for apps that require high availability or durability.
- Zone-redundant storage (ZRS): helps ensure that your data is still accessible for both read and write operations even if a zone becomes unavailable.
- Copies your data synchronously across three availability zones in the primary region.
- For high availability, Microsoft recommends using ZRS in the primary region and also replicating to a secondary region.
For apps that require high durability, you can create copies of the data in a secondary region. Redundancy in the secondary region can be provided as follows:
- Geo-redundant storage (GRS):
- Copies your data synchronously three times within a single physical location in the primary region using LRS.
- Copies your data asynchronously to a single physical location in the secondary region.
- Copies your data synchronously three times within the secondary region using LRS.
- Geo-zone-redundant storage (GZRS):
- Copies your data synchronously across three availability zones in the primary region using ZRS.
- Copies your data asynchronously to a single physical location in the secondary region.
- Copies your data synchronously three times within the secondary region using LRS.
With both GRS and GZRS, your data in the secondary region isn't available for read/write access unless there's a failover to the secondary region.
To enable read access to the secondary region, configure your storage account to use one of the following:
- Read-access geo-redundant storage (RA-GRS).
- Read-access geo-zone-redundant storage (RA-GZRS).
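Redundancy maps to the account SKU, so it can be set at creation and changed later; a hedged sketch (names are placeholders):
```
# Create with geo-zone redundancy, then switch to read-access GZRS.
az storage account create \
  --name mystorageaccount --resource-group myresourcegroup \
  --location westeurope --kind StorageV2 --sku Standard_GZRS

az storage account update \
  --name mystorageaccount --resource-group myresourcegroup \
  --sku Standard_RAGZRS
```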
Geo-Redundant Failover Triggering
After an account failover is triggered manually or automatically, the storage that's configured as geo-redundant starts the failover process.
During failover, DNS is updated to point to the secondary region, and the original primary region is removed from the replication.
After the failover process finishes, the secondary region becomes the new primary region for the storage account, and replication is automatically changed to locally redundant (LRS).
If you want to have copies in two regions again, you need to change the replication back to geo-redundant.
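A manual failover can be started from the CLI; a hedged one-liner (the account name is a placeholder):
```
# Initiate account failover to the secondary region (long-running operation).
az storage account failover \
  --name mystorageaccount \
  --resource-group myresourcegroup
```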
Azure Recovery Services
Data Backup
The Azure Backup service uses Azure resources for short-term and long-term storage. It minimizes or even eliminates the need for maintaining physical backup media.
Azure Backup offers two types of replication to protect your backup data: LRS and GRS.
You can use Azure Backup for these backup types:
- On-premises: Back up files, folders, and system state using the Microsoft Azure Recovery Services (MARS) agent.
- The MARS agent is a full-featured agent that offers many benefits for both backing up and restoring your data.
- Alternatively, you can use Data Protection Manager (DPM) or Microsoft Azure Backup Server (MABS) to protect on-premises VMs and other workloads (SharePoint, SQL Server, ...).
- Azure VMs: Back up entire Windows or Linux VMs (using backup extensions), or back up files, folders, and system state using the MARS agent.
- Azure Files shares: Back up Azure Files shares to a storage account.
Soft delete for VMs protects backups of your VMs from unintended deletion. Even after backups are deleted, they're preserved in the soft-delete state for 14 more days.
Recovery Services vault
The Recovery Services vault is a storage entity in Azure that stores data. Recovery Services vaults make it easy to organize your backup data, while minimizing management overhead.
- The RS vault can be used to back up Azure Files file shares or on-premises files and folders.
- RS vaults store backup data for various Azure services, such as IaaS virtual machines (Linux or Windows) and Azure SQL databases.
- RS vaults support System Center Data Protection Manager, Windows Server, Azure Backup Server, and other services.
- An RS vault must be in the same region as the resources it protects.
Azure Backup automatically handles the storage for your vault.
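For illustration, a hedged sketch of creating a vault and protecting a VM with the Azure CLI (names are placeholders; DefaultPolicy is the built-in backup policy):
```
az backup vault create \
  --name myvault --resource-group myresourcegroup --location westeurope

az backup protection enable-for-vm \
  --resource-group myresourcegroup \
  --vault-name myvault \
  --vm myvm \
  --policy-name DefaultPolicy
```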
Azure Blob Backup & Recovery
Operational backup for blobs is a local backup solution. The backup data isn't transferred to the Backup vault but is stored in the source storage account itself.
This is a continuous backup solution. You don’t need to schedule any backups. All changes will be retained and restorable from a selected point in time.
Take advantage of blob soft delete and versioning.
Soft delete protects an individual blob, snapshot, container or version from accidental deletes or overwrites. Soft delete maintains the deleted data in the system for a specified retention period. During the retention period, you can restore a soft-deleted object to its state at the time it was deleted.
- Container soft delete can restore a container and its contents to their state at the time of deletion. The default retention period is seven days, and the maximum is 365 days.
- Blob soft delete can restore a blob, snapshot or version that has been deleted. The retention period for deleted blobs is between 1 and 365 days.
- Blob versioning works to automatically maintain previous versions of a blob. When blob versioning is enabled, you can restore an earlier version of a blob.
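These protections are enabled on the account's blob service; a hedged sketch with placeholder names and an assumed 7-day retention:
```
# Enable blob soft delete, container soft delete, and blob versioning.
az storage account blob-service-properties update \
  --account-name mystorageaccount \
  --resource-group myresourcegroup \
  --enable-delete-retention true --delete-retention-days 7 \
  --enable-container-delete-retention true --container-delete-retention-days 7 \
  --enable-versioning true
```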
Azure Site Recovery
Azure Site Recovery is a service that provides business continuity and disaster recovery (BCDR) features for your applications in Azure, on-premises, and in other cloud providers. Azure Site Recovery has recovery plans that help automate your disaster recovery by enabling you to define how machines fail over and the order in which they restart after a successful failover.
Azure Site Recovery is designed to provide continuous replication to a secondary region.
Data Migration
Storage Migration Service
The Azure Storage Migration Service can help you migrate unstructured data stored in on-premises file servers to Azure Files and Azure-hosted virtual machines.
The Storage Migration Service implements three steps to move your on-premises unstructured data online:
- Inventory servers: It inventories your servers to gather information about their files and configuration.
- Transfer: The migration service transfers your data from the source to the destination servers.
- Cut over: As an option, the migration service cuts over to the new servers.
Server Migration Overview
Azure Migrate project: Houses metadata regarding assessment findings and migration activities.
- Discover: Use the Azure Migrate appliance to identify servers to migrate.
- Assess: Evaluate the servers, provide estimated costs, and assess readiness for migration.
- Migrate: Migrate the servers and VMs.
AzCopy Tool
An alternate method for transferring data is the AzCopy tool. AzCopy v10 is the next-generation command-line utility for copying data to and from Azure Blob Storage and Azure Files.
The characteristics of the AzCopy tool:
- Every AzCopy instance creates a job order and a related log file. You can view and restart previous jobs and resume failed jobs.
- AzCopy automatically retries a transfer when a failure occurs.
- You can use AzCopy to list or remove files and blobs in a given path. AzCopy supports wildcard patterns in a path, plus --include and --exclude flags.
- AzCopy can copy an entire account to another account by using the Put from URL APIs, so no data needs to be transferred to the client.
- AzCopy supports Azure Data Lake Storage Gen2 APIs.
- AzCopy is built into Azure Storage Explorer.
- AzCopy is available on Windows, Linux, and macOS.
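Typical AzCopy invocations, as a hedged sketch (account and container names are assumptions, and <SAS> stands for a real SAS token):
```
# Sign in, then recursively upload a local folder to a container.
azcopy login
azcopy copy "./data" \
  "https://mystorageaccount.blob.core.windows.net/mycontainer" --recursive

# Account-to-account copy runs server-side (Put from URL), not via the client.
azcopy copy \
  "https://srcaccount.blob.core.windows.net/src?<SAS>" \
  "https://dstaccount.blob.core.windows.net/dst?<SAS>" --recursive

# Jobs can be listed and resumed after a failure.
azcopy jobs list
azcopy jobs resume <job-id>
```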
Azure Storage Explorer: a GUI tool!
Azure Storage Explorer is a standalone GUI application that makes it easy to work with Azure Storage data on Windows, Linux, and macOS.
With Azure Storage Explorer, you can access multiple accounts and subscriptions and manage all your Storage content.
Migrate Data Offline
Azure offers several options for migrating data offline: Azure Import/Export, Azure Data Box...
Azure Import/Export
Azure Import/Export service migrates large quantities of data between an on-premises location and an Azure storage account. By using the Import/Export service, you send and receive physical disks that contain your data between your on-premises location and an Azure datacenter.
- You can use the Azure Import/Export service to export data from Azure Blob Storage only.
- The Import/Export service supports only import of Azure Files into Azure Storage. Exporting Azure Files isn't supported.
- To use the Import/Export service, BitLocker must be enabled on the Windows system.
- You need an active shipping carrier account like FedEx or DHL for shipping drives to an Azure datacenter.
Import Data
- Prepare the drives: WAImportExport is the tool used to prepare drives before importing data and to repair any corrupted or missing files after data transfer.
- Create the driveset.csv file: the driveset file has a list of disks and corresponding drive letters so the tool can correctly pick the list of disks to be prepared.
- Create an import job: When you create the job, you must create the journal file first, upload the journal, and then specify the storage account to which the journal will be uploaded.
- Ship the drives to Azure datacenter : Once you create the job, you must physically ship the disks to the Azure datacenter.
- Update the job with tracking information : After creating the job, you have two weeks to update the job to include the tracking information from the shipping carrier.
- If you do not fill in the tracking information, the job will be cancelled and the data will not be imported into Azure.
- Verify data upload to Azure : Track the job to completion. Then make sure that your data has been uploaded to Azure successfully.
Azure Data Box
Azure Data Box provides a quick, reliable and inexpensive method for moving large volumes of data to Azure.
By using Data Box, you can send terabytes of data into and out of Azure.
The solution is based on a secure storage device that's shipped to your organization. Your Data Box can include various devices such as disks, ruggedized server chassis or mobile disks.
- Data Box device: A physical device (capacity 80 TB) that provides primary storage, manages communication with cloud storage, and helps to ensure the security and confidentiality of all data stored on the device.
- Data Box service: An extension of the Azure portal that lets you manage a Data Box device by using a web interface that you can access from different geographical locations.
- Data Box local web-based user interface: A web-based UI that's used to configure the device so it can connect to the local network and then register the device with the Data Box service.
- You can also use the local web UI to shut down and restart the Data Box device and to view copy logs.