Database
Database
Azure Cosmos DB
Azure Cosmos DB is a globally distributed, multi-model database service.
Azure Cosmos DB supports schemaless data, which lets you build highly responsive and Always On applications to support constantly changing data.
You can use this feature to store data that's updated and maintained by users around the world.
At the lowest level, Azure Cosmos DB stores data in atom-record-sequence (ARS) format.
The data is then abstracted and projected as an API, which you specify when you're creating your database.
Your choices include NoSQL, MongoDB, Cassandra, Tables and Gremlin.
- Multi-model : Including key-value, documents, graphs and columnar.
- Massive scale : Built for global scale and performance.
- Multi+region writes : Supports active-active writes across replicas.
Point-in-time Restore
Azure Cosmos DB's point-in-time allows you to do the following :
- To recover from a write or delete operation within a container.
- To restore a deleted container, database or account.
- To restore into any region if a backup existed to a specific point in time.
A point-in-time restore does not restore the following configurations :
- Consistency settings.
- Stored procedures - Triggers - UDFs
- Firewall - VNet - Private Endpoint settings.
- Regions.
Azure Table Storage
Azure Table storage stores large amounts of non-relational structured data.
The service is a NoSQL datastore which accepts authenticated calls from inside and outside the Azure cloud.
Common uses of Table storage include :
- Storing TBs of structured data capable of serving web scale applications.
- Storing datasets that don't require complex joins, foreign keys or stored procedures and can be denormalized for fast access.
- Quickly querying data using a clustered index.
- Accessing data using the OData protocol and LINQ queries with WCF Data Service .NET Libraries.
Table VS Cosmos DB
Tables | Cosmos DB |
---|---|
For storing tabular, key-value data. | Supports a variety of data models and APIs for NoSQL data. |
Pricing predominantly based on storage. | Pricing predominantly based on throughput. |
Throughput limited to 20K operation/s. | Throughput not limited, supports >10M operations/s. |
No latency guarantees. | < 10ms reads, <10ms writes |
Leverages storage account redundancy (read-only). | Multi-region write support for global access. |
Azure SQL Database
Azure SQL Database is a relational database based on the latest stable version of the Microsoft SQL Server database engine.
Azure SQL Database is a PaaS database engine. It handles most of the database-management functions such as upgrading, patching, backups and monitoring without user involvement.
elastic database pool
You can create a SQL Database elastic database pool where all databases in the pool share the same set of compute and storage resources.
Each database can use the resources it needs within the limits you set depending on current load.
It is ideal when you have multiple databases that will be used infrequently and it also provides a cost-effective solution.
There are two primary pricing options for SQL Database : DTU and vCore (Microsoft recommended). A serverless option is also available for a single database.
- vCore : A vCore is a virtual core. You choose the number of virtual cores and have greater control over your compute costs. This option supports the Azure Hybrid Benefit for SQL Server and reserved capacity (pay in advance).
- DTU : A Database Transaction Unit is a combined measure of compute, storage and I/O resources.
- Serverless : A compute tier for single database in SQL Database. The serverless model automatically scales compute based on workload demand and bills only for the amount of compute used.
Scaling
Implement vertical scaling by using SQL Database elastic database pools. The databases within an elastic database pool share the allocated resources.
Vertical scaling allows you to change the compute size for a set of databases.
To properly configure SQL elastic database pools to reduce server costs, choose the appropriate purchasing model and service tier :
- DTU Model : Basic, Standard and Premium tiers.
- vCore Model : General Purpose and Business Critical tiers.
Horizontal scaling is managed by using the SQL Database Elastic Database client library. There are two ways to apply horizontal scaling : read scale-out provisioning and sharding.
-
Read scale-out : Load-balance read-only workloads for a set of SQL databases. Offload read-only workloads by using the compute capacity of a read-only replica instead of running workloads on the read-write replica.
-
Sharding Partition data across a set of SQL databases that are identically structured. A set consists of a primary read-write replica and secondary read-only replicas. You can split large databases into smaller components to improve performance and make them easier to manage.
Restore a database from a backup in Azure SQL Database
Azure SQL Managed Instance
Azure SQL Managed Instance is a scalable cloud data service that provides the broadest SQL Server database engine compatibility with all the benefits of a fully managed platform as a service.
Like Azure SQL Database, Azure SQL Managed Instance is a PaaS database engine to take advantage of the best features of moving your data to the cloud in a fully managed environment.
- You can use SQL Managed Instance to do lift-and-shift migrations to Azure without having to redesign your applications.
- Azure SQL Managed Instance is ideal for customers interested in instance-scoped features, such as
- SQL Server Agent.
- Common language runtime (CLR).
- Database Mail
- Distributed transactions.
- Machine Learning Services.
- SQL Managed Instance uses vCores mode. You can define the maximum CPU cores and storage allocated to your managed instance. All databases within the managed instance share the resources allocated to the instance.
Azure Database for MySQL
Azure Database for MySQL is a relational database service in the cloud and it's based on the MySQL Community Edition database engine
Azure Database for MySQL delivers :
- Built-in high availability with no additional cost.
- Predictable performance and inclusive, pay-as-you-go pricing.
- Scale as needed, within seconds.
- Ability to protect sensitive data at rest and in motion.
- Automatic backups.
- Enterprise-grade security and compliance.
Azure Database for PostgreSQL
Azure Database for PostgreSQL is a relational database service in the cloud.
Moreover, Azure Database for PostgreSQL delivers the following benefits :
- Built-in high availability compared to on-premises resources. There's no additional configuration, replication or cost required to make sure your applications are always available.
-
Simple and flexible pricing. You have predictable performance based on a selected pricing tier choice that includes software patching, automatic backups, monitoring and security.
-
Scale up or down as needed, within seconds. You can scale compute or storage independently as needed.
- Adjustable automatic backups and point-in-time-restore for up to 35 days.
- Enterprise-grade security and compliance to protect sensitive data at rest and in motion.
Azure Database for PostgreSQL is available in two deployment options : Single Server and Hyperscale (Citus).
Azure SQL Edge
Azure SQL Edge is an optimized relational database engine geared for IoT and IoT Edge deployments. Azure SQL Edge is built on the same engine as SQL Server and Azure SQL.
- Azure SQL Edge is a containerized Linux application. The startup-memory footprint is less than 500 MB.
- You can design and build apps that run on many IoT devices. Capture continuous data streams in real time or integrate data in a comprehensive organizational data solution.
- Azure SQL Edge interacts with components at the network edge including edge gateways, IoT devices and edge servers.
-
Azure SQL Edge is available in two editions that have identical feature sets (SQL Edge Developer & SQL Edge).
-
Access a built-in streaming engine to help derive insights from data streams.
- Perform transformation, Windowed aggregation, Simple anomaly detection and classification of incoming data streams.
- Use time-series storage for time-indexed data, which can be aggregated and stored in the cloud for future analysis.
Azure SQL Edge has two deployment options :
-
Connected deployment : For connected deployment, Azure SQL Edge is available on the Azure Marketplace and can be deployed as a module for Azure IoT Edge.
-
Disconnected deployment : Disconnected deployment is accomplished through Azure SQL Edge container images. The images can be pulled from docker hub and deployed either as a standalone docker container or on a Kubernetes cluster.
Database High Availability Solutions
What is active geo-replication?
Active geo-replication enables you to create readable secondary databases of individual databases. These databases can be on a server in the same or in a different region.
Active geo-replication is available for : Azure SQL Database & Cosmos DB.
You can have up to 4 readable secondary replicas.
-
Azure SQL Database : You can configure active geo-replication for any database in any elastic database pool. You can use active geo-replication to :
- Create a readable secondary replica in a different region.
- Fail over to a secondary database if your primary database fails or needs to be taken offline.
-
Cosmos DB : Supports geo-replication across regions. However, you also can :
- Designate one region as the writable region and all others as read-only replicas.
- Fail over by selecting another region to be the write region during an outage.
What are auto-failover groups?
Because active geo-replication isn’t supported by Azure SQL Managed Instance, for geographic failover of instances of SQL Managed Instance, you must use Auto-failover groups.
Auto-failover groups enable you to manage the replication and failover of a group of databases to another region. As with geo-replication, you can choose to initiate failover manually or you can delegate it to the Azure service based on a policy that you define.
Data Redundancy ? See Data Redundancy in storage part.
Data Encryption for structured data
Data exists in three basic states: data at rest, data in motion, and data in process.
Data Encryption
Data at rest is data on a storage device that isn't being moved or used. data at rest includes archived email messages stored in your Outlook inbox, or files on your laptop that you aren't using.
Data in motion (also called data in transit) is data that's being moved from one device to another within a private network or public network like the internet. data in motion can also be data that's being read (used) but not changed. Data in motion includes email messages in transit, browsing internet websites, or using company applications like an organization chart.
Data in process is data that's open and being changed. Data in process includes writing an email message, saving your work files, or ordering from a website.
There are different encryption methods for each of data state.
Data state | Encryption method | Encryption level |
---|---|---|
Data at rest | Transparent data encryption (TDE) | Always encrypted |
Data in motion | Secure Socket Layers and Transport Layer Security (SSL/TLS) | Always encrypted |
Data in process | Dynamic data masking | Specific data is unencrypted, Remaining data is encrypted |