I just want to improve myself every day and share what I learn with everyone.
Hi reader, the adoption of DevOps practices has become increasingly prevalent within development teams today. Many organizations have integrated the role of a DevOps engineer into their teams or have expanded the responsibilities of existing team members to encompass DevOps methodologies, akin to the Agile framework. In these scenarios, development teams either recruit dedicated individuals to drive DevOps processes or distribute these responsibilities across the job descriptions of all team members. This approach acknowledges that DevOps aims to enhance productivity by optimizing the workflows and responsibilities associated with each role.
A crucial aspect of DevOps is the application of the idempotence principle (also called idempotency; an operation that satisfies it is idempotent) in coding practices. This concept ensures that infrastructure can be consistently managed and controlled through code. Fundamentally, it builds upon the previously established concept of Infrastructure as Code (IaC), which is widely implemented using tools such as Ansible, Terraform, Puppet, and others.
What is Idempotence
It's an important principle in IaC and in programming generally, particularly for operations or functions that modify state or data. An operation is considered idempotent if it can be applied multiple times without changing the result beyond the initial application.
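As a minimal sketch of that definition, the snippet below contrasts an idempotent operation with a non-idempotent one. The `Account` type and the `Activate` and `Deposit` functions are hypothetical names invented for this illustration only.

```go
package main

import "fmt"

// Account is a hypothetical piece of state used only for this illustration.
type Account struct {
	Balance int
	Active  bool
}

// Activate is idempotent: after the first call, further calls leave the
// account in exactly the same state.
func Activate(a *Account) {
	a.Active = true
}

// Deposit is not idempotent: every call changes the balance again.
func Deposit(a *Account, amount int) {
	a.Balance += amount
}

func main() {
	a := &Account{}

	Activate(a)
	Activate(a) // repeating the call changes nothing
	fmt.Println("active:", a.Active)

	Deposit(a, 10)
	Deposit(a, 10) // repeating the call changes the result
	fmt.Println("balance:", a.Balance)
}
```

Calling `Activate` twice leaves the account as it was after the first call, while calling `Deposit` twice leaves a balance of 20 instead of 10. That difference is why a retried deposit needs extra care.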
In programming, idempotence is desirable for several reasons:
- Reliability: Idempotent operations can be safely retried or repeated without causing unintended side effects or data corruption. This is especially crucial in distributed systems, where network failures or other issues may cause requests to be duplicated or retried.
- Consistency: Idempotent operations ensure that the final state of the system is consistent and predictable, regardless of how many times the operation was performed.
- Simplicity: Idempotent operations are easier to reason about and debug since their behavior is deterministic and predictable.
Example use cases of idempotence
Idempotence is particularly valuable in scenarios like:
- RESTful APIs: HTTP methods such as GET, PUT, DELETE, and HEAD are defined to be idempotent. For instance, multiple identical PUT requests should have the same effect as a single PUT request, ensuring data integrity. Likewise, when a DELETE request is repeated, only the first request actually removes the resource; later requests leave the server in the same state, even though the response status may differ.
- Database operations: UPDATE or DELETE queries that target specific rows by a unique identifier (e.g., a primary key) are often idempotent, provided the new values do not depend on the current ones. Executing the same query multiple times results in the same final state.
- Caching: Cache invalidation and cache warming operations are typically idempotent. Cache invalidation is the process of removing data from the cache when that data is no longer valid or useful, so repeated invalidation requests for the same key have the same effect as a single one.
- Infrastructure as Code: Tools like Ansible, Terraform, and Puppet rely on idempotent operations to manage infrastructure resources. Running the same configuration multiple times should result in the desired state without unintended modifications. For example, if you run Terraform more than once with the same configuration to provision instances, the resources stay in the desired state and no further changes are made.
- Data processing: Operations such as deduplication, normalization, or transformation of data can be designed to be idempotent, ensuring that repeated applications produce the same output.
It's important to note that idempotence is a property of an operation, not necessarily of an entire system or application. In practice, developers usually strive to make critical operations idempotent to improve the reliability, consistency, and simplicity of their software.
The example Golang code
The example code, written in Golang with the hashicorp/go-memdb package, illustrates a practical implementation of the idempotent concept in a basic data management application. A Person struct is defined to represent individual records, with the Email field serving as the unique identifier. Two simple functions are implemented:
- One for inserting new Person records into an in-memory database.
- One for retrieving all existing records.
Let's write the code
Begin by defining a Person struct.
type Person struct {
	Email string
	Name  string
	Age   int
}
Declare the package-level db handle and define the schema of the in-memory database in init().
var db *memdb.MemDB

func init() {
	// Create the DB schema
	schema := &memdb.DBSchema{
		Tables: map[string]*memdb.TableSchema{
			"person": &memdb.TableSchema{
				Name: "person",
				Indexes: map[string]*memdb.IndexSchema{
					"id": &memdb.IndexSchema{
						Name:    "id",
						Unique:  true,
						Indexer: &memdb.StringFieldIndex{Field: "Email"},
					},
					"age": &memdb.IndexSchema{
						Name:    "age",
						Unique:  false,
						Indexer: &memdb.IntFieldIndex{Field: "Age"},
					},
				},
			},
		},
	}

	var err error
	// Create a new database
	db, err = memdb.NewMemDB(schema)
	if err != nil {
		panic(err)
	}
}
Create an insert function that writes each record to the database in a transaction.
func InsertDataWithID(member *Person) error {
	// Create a write transaction. Abort is a no-op after Commit,
	// so the deferred call only rolls back on an early error return.
	txn := db.Txn(true)
	defer txn.Abort()

	// Insert the new person
	fmt.Printf("Inserting ID: %s\n", member.Email)
	if err := txn.Insert("person", member); err != nil {
		return err
	}

	// Commit the transaction
	txn.Commit()
	return nil
}

func InsertMultipleData(people []*Person) error {
	for _, p := range people {
		if err := InsertDataWithID(p); err != nil {
			return err
		}
	}
	return nil
}
Create a function that retrieves all existing data from the database and prints it to the console.
func ListAllValues() error {
	// Create a read-only transaction and release it when done
	txn := db.Txn(false)
	defer txn.Abort()

	// List all the people
	it, err := txn.Get("person", "id")
	if err != nil {
		return err
	}

	fmt.Println("All the people:")
	for obj := it.Next(); obj != nil; obj = it.Next() {
		p := obj.(*Person)
		fmt.Printf("[+] %s | %s | %d\n", p.Name, p.Email, p.Age)
	}
	return nil
}
Define the main() function to call both functions: insert the records, then retrieve them.
func main() {
	// Insert some people
	people := []*Person{
		&Person{"joe@aol.com", "Joe", 30},
		&Person{"lucy@aol.com", "Lucy", 35},
		&Person{"joe@aol.com", "Joey", 26},
		&Person{"tariq@aol.com", "Tariq", 21},
		&Person{"dorothy@aol.com", "Dorothy", 53},
		&Person{"joely@aol.com", "Joe", 35},
	}

	if err := InsertMultipleData(people); err != nil {
		panic(err)
	}

	if err := ListAllValues(); err != nil {
		panic(err)
	}
}
The complete code in a single file is shown below. Note that the test data contains a duplicate key: the first and third records share the same email.
// main.go
package main

import (
	"fmt"

	"github.com/hashicorp/go-memdb"
)

type Person struct {
	Email string
	Name  string
	Age   int
}

var db *memdb.MemDB

func init() {
	// Create the DB schema
	schema := &memdb.DBSchema{
		Tables: map[string]*memdb.TableSchema{
			"person": &memdb.TableSchema{
				Name: "person",
				Indexes: map[string]*memdb.IndexSchema{
					"id": &memdb.IndexSchema{
						Name:    "id",
						Unique:  true,
						Indexer: &memdb.StringFieldIndex{Field: "Email"},
					},
					"age": &memdb.IndexSchema{
						Name:    "age",
						Unique:  false,
						Indexer: &memdb.IntFieldIndex{Field: "Age"},
					},
				},
			},
		},
	}

	var err error
	// Create a new database
	db, err = memdb.NewMemDB(schema)
	if err != nil {
		panic(err)
	}
}

func InsertDataWithID(member *Person) error {
	// Create a write transaction. Abort is a no-op after Commit,
	// so the deferred call only rolls back on an early error return.
	txn := db.Txn(true)
	defer txn.Abort()

	// Insert the new person
	fmt.Printf("Inserting ID: %s\n", member.Email)
	if err := txn.Insert("person", member); err != nil {
		return err
	}

	// Commit the transaction
	txn.Commit()
	return nil
}

func InsertMultipleData(people []*Person) error {
	for _, p := range people {
		if err := InsertDataWithID(p); err != nil {
			return err
		}
	}
	return nil
}

func ListAllValues() error {
	// Create a read-only transaction and release it when done
	txn := db.Txn(false)
	defer txn.Abort()

	// List all the people
	it, err := txn.Get("person", "id")
	if err != nil {
		return err
	}

	fmt.Println("All the people:")
	for obj := it.Next(); obj != nil; obj = it.Next() {
		p := obj.(*Person)
		fmt.Printf("[+] %s | %s | %d\n", p.Name, p.Email, p.Age)
	}
	return nil
}

func main() {
	// Insert some people
	people := []*Person{
		&Person{"joe@aol.com", "Joe", 30},
		&Person{"lucy@aol.com", "Lucy", 35},
		&Person{"joe@aol.com", "Joey", 26},
		&Person{"tariq@aol.com", "Tariq", 21},
		&Person{"dorothy@aol.com", "Dorothy", 53},
		&Person{"joely@aol.com", "Joe", 35},
	}

	if err := InsertMultipleData(people); err != nil {
		panic(err)
	}

	if err := ListAllValues(); err != nil {
		panic(err)
	}
}
When you execute the above code, you will see the result below. The insert function accepted the duplicated key and overwrote the existing row with the new information, as the output of the last function shows. The idempotent concept isn't applied here.
Inserting ID: joe@aol.com
Inserting ID: lucy@aol.com
Inserting ID: joe@aol.com
Inserting ID: tariq@aol.com
Inserting ID: dorothy@aol.com
Inserting ID: joely@aol.com
All the people:
[+] Dorothy | dorothy@aol.com | 53
[+] Joey | joe@aol.com | 26
[+] Joe | joely@aol.com | 35
[+] Lucy | lucy@aol.com | 35
[+] Tariq | tariq@aol.com | 21
Let's see how to apply idempotence
To implement idempotence, I add a query that checks for existing data before inserting or updating.
First, I create a new function to retrieve existing data.
func GetDataByID(key string) *Person {
	// Create a read-only transaction and release it when done
	txn := db.Txn(false)
	defer txn.Abort()

	raw, err := txn.First("person", "id", key)
	if err != nil || raw == nil {
		fmt.Printf("[-] Not found %s", key)
		// An empty Person signals "not found" to the caller
		return &Person{}
	}
	return raw.(*Person)
}
Then, I write a new version of InsertMultipleData, named InsertIdempotentData, that looks up each record first and skips the insert when the record already exists.
func InsertIdempotentData(people []*Person) error {
	for _, p := range people {
		existed := GetDataByID(p.Email)
		if existed.Email == p.Email {
			fmt.Printf("Skipping insertion for ID: %s (already exists)\n", p.Email)
			continue
		}

		// Insert the new person
		if err := InsertDataWithID(p); err != nil {
			return err
		}
	}
	return nil
}
If you replace the call to InsertMultipleData with InsertIdempotentData, you will see the result below. One log message shows that an insert operation was skipped.
[-] Not found joe@aol.comInserting ID: joe@aol.com
[-] Not found lucy@aol.comInserting ID: lucy@aol.com
Skipping insertion for ID: joe@aol.com (already exists)
[-] Not found tariq@aol.comInserting ID: tariq@aol.com
[-] Not found dorothy@aol.comInserting ID: dorothy@aol.com
[-] Not found joely@aol.comInserting ID: joely@aol.com
All the people:
[+] Dorothy | dorothy@aol.com | 53
[+] Joe | joe@aol.com | 30
[+] Joe | joely@aol.com | 35
[+] Lucy | lucy@aol.com | 35
[+] Tariq | tariq@aol.com | 21
Conclusion
Idempotence is an important concept in programming, particularly when dealing with operations that modify state or data. An operation is considered idempotent if it can be applied multiple times without changing the result beyond the initial application. Idempotent operations are desirable because they enhance reliability, consistency, and simplicity in software designs. They ensure that the final state remains predictable and free from unintended side effects or data corruption, even if the operation is retried or repeated due to failures or other issues.
Top comments (1)
I think if you follow your examples a little further it starts to get confusing.
Imagine inserting the same email address but with a different name (because people change their names from time to time). Now, this probably should be an update rather than an insertion, but you don't know that until you look them up, and relying on a system which works this way can lead to things getting overlooked. You'd need to put an exception in there and handle it somehow.
I'm not saying things shouldn't be idempotent, I'm saying it's not always straightforward at anything above the lowest level.