What are the consistency models for modern DBs offered by AWS?

This recipe explains the consistency models of the modern databases offered by AWS.


Database consistency is defined by a set of rules that all data in the database system must satisfy in order to be read and accepted. If data enters the database that violates these predefined rules, the dataset experiences consistency errors. Consistency is achieved by establishing rules: any transaction written to the database may change affected data only as permitted by the constraints, triggers, variables, cascades, and so on defined by the database developer.

Assume you work for the National Transportation Safety Institute (NTSI). You've been assigned the task of compiling a database of new California driver's licenses. California's population has exploded in the last ten years, necessitating a new alphanumeric format for all first-time driver's license holders. Your team has determined that the new format for a California driver's license in your database is: 1 alphabetic character + 7 numeric characters. This rule is now mandatory for all entries. An entry with the string "C08846024" would result in an error. Why? Because the entered value is 1 alpha + 8 numeric, which is inconsistent data.
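The 1-alpha + 7-numeric rule above can be sketched as a simple validation check. This is a hypothetical validator (the function and pattern names are illustrative, not part of any real NTSI system):

```python
import re

# Hypothetical rule from the example above: exactly 1 alphabetic
# character followed by exactly 7 digits.
LICENSE_PATTERN = re.compile(r"^[A-Za-z][0-9]{7}$")

def is_valid_license(value: str) -> bool:
    """Return True if the value satisfies the 1-alpha + 7-numeric rule."""
    return LICENSE_PATTERN.fullmatch(value) is not None

print(is_valid_license("C0884602"))   # 1 alpha + 7 numeric -> True
print(is_valid_license("C08846024")) # 1 alpha + 8 numeric -> False
```

A database would enforce such a rule with a constraint or trigger, rejecting any write that fails the check.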


Consistency also implies that any change to an object in one table must be reflected in every other table where that object appears. Continuing with the driver's license example, if the new driver's home address changes, that change must be reflected in all tables where the prior address existed. If one table has the old address while the others have the new address, that is data inconsistency.
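One common way to avoid this class of inconsistency is normalization: store the address once and reference it, so an update touches a single row. A minimal sketch using Python's built-in sqlite3 (the schema here is hypothetical, invented for illustration):

```python
import sqlite3

# Hypothetical normalized schema: the address lives in one table and
# is referenced by the drivers table, so an address change is a
# single UPDATE rather than edits scattered across many tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE addresses (id INTEGER PRIMARY KEY, street TEXT);
    CREATE TABLE drivers (
        license TEXT PRIMARY KEY,
        address_id INTEGER REFERENCES addresses(id)
    );
""")
conn.execute("INSERT INTO addresses VALUES (1, '12 Old Rd')")
conn.execute("INSERT INTO drivers VALUES ('C0884602', 1)")

# The driver moves: one UPDATE keeps every reader consistent.
conn.execute("UPDATE addresses SET street = '99 New Ave' WHERE id = 1")

row = conn.execute("""
    SELECT d.license, a.street
    FROM drivers d JOIN addresses a ON d.address_id = a.id
""").fetchone()
print(row)  # ('C0884602', '99 New Ave')
```

Every query that joins through `address_id` now sees the new address; there is no second copy to fall out of sync.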

Data-Centric Consistency Models

Andrew S. Tanenbaum and Maarten van Steen, two experts in this field, define a consistency model as a contract between software (processes) and memory implementation (the data store): if the software follows certain rules, the memory will work correctly. Because it is difficult to determine which write was the last one in a system without a global clock, constraints must be placed on the values that a read operation may return.

Client-Centric Consistency Models

The emphasis in a client-centric consistency model is on how data is perceived by the clients. While data replication is incomplete, the data may differ from client to client. Because faster data access is the primary goal, we may choose a less strict consistency model, such as eventual consistency.

Eventual Consistency

In this approach, the system guarantees that if no new updates are made to a piece of data, all reads of that item will eventually return the most recently updated value. The replica that accepts an update propagates it to all other replicas. In the interim, different replicas may return different values when queried, but all replicas eventually receive the update and become consistent. This model suits applications with hundreds of thousands of concurrent reads and writes per second, such as Twitter updates, Instagram photo uploads, Facebook status pages, and messaging systems, where reading slightly stale data is acceptable.
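The behaviour can be illustrated with a toy in-memory store (this is a sketch of the concept, not any real AWS API): a write lands on one replica, other replicas serve stale reads until propagation runs, and afterwards all replicas agree.

```python
# Toy sketch of eventual consistency: three replicas with
# asynchronous update propagation. All names are illustrative.
class ReplicatedStore:
    def __init__(self, n_replicas=3):
        self.replicas = [dict() for _ in range(n_replicas)]
        self.pending = []  # updates not yet propagated everywhere

    def write(self, key, value):
        # The write lands on one replica and is queued for the rest.
        self.replicas[0][key] = value
        self.pending.append((key, value))

    def read(self, key, replica):
        # A read may hit any replica and observe stale/missing data.
        return self.replicas[replica].get(key)

    def propagate(self):
        # Anti-entropy step: eventually every replica gets every update.
        for key, value in self.pending:
            for r in self.replicas:
                r[key] = value
        self.pending.clear()

store = ReplicatedStore()
store.write("tweet:1", "hello")
print(store.read("tweet:1", replica=0))  # 'hello' (the updated replica)
print(store.read("tweet:1", replica=2))  # None    (not yet propagated)
store.propagate()
print(store.read("tweet:1", replica=2))  # 'hello' (eventually consistent)
```

The "eventually" in eventual consistency is exactly the window between `write` and `propagate`.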

Read-Your-Write Consistency

RYW (Read-Your-Writes) consistency is achieved when the system guarantees that any attempt by a client to read a record after it has updated that record will return the updated value. RDBMSs typically provide read-your-writes consistency.
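One common way to obtain this guarantee in a replicated system is session stickiness: route each client's reads to the replica that took its last write. A toy sketch (hypothetical names, not an AWS API):

```python
# Toy sketch of read-your-writes via session stickiness: each client
# reads from the replica that holds its own latest write.
class SessionStore:
    def __init__(self, n_replicas=3):
        self.replicas = [dict() for _ in range(n_replicas)]
        self.session_replica = {}  # client -> replica holding its writes

    def write(self, client, key, value):
        replica = hash(client) % len(self.replicas)
        self.replicas[replica][key] = value
        self.session_replica[client] = replica  # remember where we wrote

    def read(self, client, key):
        # Route the client's read to its own write replica, so the
        # client always observes its own updates.
        replica = self.session_replica.get(client, 0)
        return self.replicas[replica].get(key)

store = SessionStore()
store.write("alice", "profile", "v2")
print(store.read("alice", "profile"))  # 'v2' -- alice sees her own write
```

Note that other clients may still read stale data from other replicas; RYW only guarantees that a writer sees its own writes.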

Read-after-Write Consistency

RAW consistency is stricter than eventual consistency: all clients see a newly inserted data item or record immediately. Keep in mind that it applies only to new data; this model makes no guarantee for updates or deletions.

Amazon S3 Consistency Models

In all regions, Amazon S3 traditionally provided read-after-write consistency for PUTs of new objects in your S3 bucket, and eventual consistency for overwrite PUTs and DELETEs. As a result, if you add a new object to your bucket, both you and your clients see it immediately. However, if you overwrite an object, it may take some time for its replicas to be updated, which is why the eventual consistency model is used. (Note that since December 2020, Amazon S3 delivers strong read-after-write consistency for all PUT and DELETE requests, including overwrites.) Amazon S3 ensures high availability by replicating data across multiple servers and Availability Zones, so data integrity must be maintained whenever a record is added, updated, or deleted. The scenarios for these cases under the classic model are:

• A new PUT request is submitted. If the object is queried immediately, it may not appear in the list until the changes are propagated to all servers and AZs. The read-after-write consistency model is used in this case.

• An UPDATE request is submitted. Because the eventual consistency model is used for UPDATEs, a query to list the object may return an outdated value.

• A DELETE request is issued. Because the eventual consistency model is used for DELETEs, a query to list or read the object may still return the deleted object.
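The three scenarios can be contrasted with a toy in-memory model (a sketch of the classic S3 behaviour, not the real S3 API): new PUTs are visible at once, while overwrites and deletes propagate to replicas asynchronously.

```python
# Toy model of classic S3 consistency: a primary that takes all writes
# and a replica that lags until sync() runs. Names are illustrative.
class ToyS3:
    def __init__(self):
        self.primary = {}   # accepts all writes
        self.replica = {}   # lags behind until sync() runs

    def put(self, key, body):
        is_new = key not in self.primary
        self.primary[key] = body
        if is_new:
            # Classic model: new objects get read-after-write consistency.
            self.replica[key] = body

    def delete(self, key):
        self.primary.pop(key, None)   # replica still holds the object

    def get(self, key):
        # Reads may be served by the lagging replica.
        return self.replica.get(key)

    def sync(self):
        # Eventual consistency: the replica converges with the primary.
        self.replica = dict(self.primary)

s3 = ToyS3()
s3.put("report.txt", "v1")
print(s3.get("report.txt"))  # 'v1' -- new object, visible at once
s3.put("report.txt", "v2")   # overwrite PUT
print(s3.get("report.txt"))  # 'v1' -- stale until replication catches up
s3.sync()
print(s3.get("report.txt"))  # 'v2'
```

The stale read after the overwrite (and a read still returning a deleted object before `sync`) is exactly the eventual-consistency window described above.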

Amazon DynamoDB Consistency Models

Amazon DynamoDB is a popular NoSQL service from AWS. NoSQL storage is designed to be distributed: Amazon DynamoDB maintains three replicas of each table, spread across Availability Zones, to ensure high availability and data durability. In DynamoDB, a write operation follows eventual consistency. A DynamoDB table read operation (GetItem, BatchGetItem, Query, or Scan) is an eventually consistent read by default; however, for the most recent data, you can request a strongly consistent read. Note that a strongly consistent read consumes twice as many read capacity units as an eventually consistent read. In general, eventually consistent reads are recommended, because DynamoDB's change propagation is very fast (DynamoDB uses SSDs for low latency) and you will usually get the same result at half the cost of a strongly consistent read.
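The 2x cost difference follows from DynamoDB's documented read pricing rule: one read capacity unit (RCU) covers one strongly consistent read per second of an item up to 4 KB, and an eventually consistent read costs half as much. A small sketch of the arithmetic:

```python
import math

# DynamoDB read pricing rule: reads are billed in 4 KB chunks;
# a strongly consistent read costs 1 RCU per chunk, an eventually
# consistent read costs half that.
def rcus_per_read(item_size_bytes: int, strongly_consistent: bool) -> float:
    chunks = math.ceil(item_size_bytes / 4096)
    return chunks if strongly_consistent else chunks / 2

print(rcus_per_read(3000, strongly_consistent=True))   # 1   RCU
print(rcus_per_read(3000, strongly_consistent=False))  # 0.5 RCU
print(rcus_per_read(9000, strongly_consistent=True))   # 3   RCUs
print(rcus_per_read(9000, strongly_consistent=False))  # 1.5 RCUs
```

In the AWS SDKs the choice is made per request with the `ConsistentRead` flag on read operations, e.g. in boto3: `table.get_item(Key={...}, ConsistentRead=True)`.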

