As cloud-native and distributed systems continue to grow in adoption thanks to their flexibility and high availability, their complexity keeps growing too, especially for the IT teams that have to operate and monitor them properly.
But what the hell is observability? Why is it important, and how can it actually help organizations?
What is Observability?
Observability is the ability to measure the internal states of a system by examining its outputs. A system is considered “observable” if the current state can be estimated by only using information from outputs, namely sensor data.
Think of it like a KPI. You'll never know whether the current state is abnormal or stable if you don't have historical data, or a state that can be estimated from the system's outputs (its performance).
For example, how can you tell whether an employee is performing well if they never report their daily activities or tasks? Can you tell just by watching them work? Nah.
The same goes for your applications. You cannot tell whether an application is performing well just by looking at its CPU and memory utilization without knowing its internal state.
But how do you make your application observable, so that it reports what you need to validate its Key Performance Indicators (KPIs)?
Instrumentation
This is where instrumentation comes in. Instrumentation means modifying your source code so that it emits output you can use to validate whether your application is currently performing well.
But how can we instrument it?
OpenTelemetry
OpenTelemetry lets us do exactly that: make our application emit the right output to validate its performance.
OpenTelemetry, or OTel, is an open source observability framework made up of a collection of tools, APIs, and SDKs.
OTel enables IT teams to instrument, generate, collect, and export telemetry data for analysis and to understand software performance and behavior.
Back to Basics
Now we have a basic picture of how observability works and what instrumentation brings to the table. For our example, we will be using the following tech:
- SigNoz (Open Source Observability Tool)
- OpenTelemetry (Open Source Telemetry Instrumentation Framework)
- Golang
- Gin Gonic (REST API Framework for Go)
Tracing with Golang
First, imagine we have a microservice application made up of product-service, reviews-service, and ratings-service. The services call each other to get the information requested by the REST client.
Let's assume that we have already built our basic REST application, since starting from scratch would turn this into a much longer discussion.
.
├── Makefile
├── README.md
├── docker-compose.yaml
├── k8s
├── load-gen
│   ├── Dockerfile
│   ├── go.mod
│   ├── main.go
│   └── seed
│       ├── products.json
│       ├── ratings.json
│       └── reviews.json
├── product-service
│   ├── Dockerfile
│   ├── Makefile
│   ├── README.md
│   ├── controller
│   │   ├── CheckEnv.go
│   │   ├── DefaultResponseController.go
│   │   ├── FeatureProductController.go
│   │   ├── ProductController.go
│   │   ├── RatingsController.go
│   │   ├── ReviewsController.go
│   │   └── UserController.go
│   ├── crypt
│   │   └── passwdhash.go
│   ├── db
│   │   ├── database.go
│   │   └── setup.go
│   ├── go.mod
│   ├── go.sum
│   ├── logging
│   │   └── logging.go
│   ├── main.go
│   ├── models
│   │   └── models.go
│   ├── routes
│   │   └── routes.go
│   └── tracer
│       └── tracer.go
├── ratings-service
│   ├── Dockerfile
│   ├── Makefile
│   ├── controller
│   │   ├── RatingsController.go
│   │   └── RestAPIResponseController.go
│   ├── db
│   │   ├── database.go
│   │   └── setup.go
│   ├── docker-compose.yaml
│   ├── go.mod
│   ├── go.sum
│   ├── logging
│   │   └── logging.go
│   ├── main.go
│   ├── models
│   │   └── model.go
│   ├── routes
│   │   └── routes.go
│   └── tracer
│       └── tracer.go
└── reviews-service
    ├── Dockerfile
    ├── Makefile
    ├── controller
    │   ├── RestAPIResponseController.go
    │   └── ReviewsController.go
    ├── db
    │   ├── database.go
    │   └── setup.go
    ├── docker-compose.yaml
    ├── go.mod
    ├── go.sum
    ├── logging
    │   └── logging.go
    ├── main.go
    ├── models
    │   └── model.go
    ├── routes
    │   └── routes.go
    └── tracer
        └── tracer.go

25 directories, 57 files
Now let's look at an example by creating tracer.go under product-service. This file contains the initialization of our global tracer: the trace provider, the OTel endpoint, and our service name (product-service).
package tracer

import (
	"context"
	"fmt"
	"os"
	"strings"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	"go.opentelemetry.io/otel/propagation"
	"go.opentelemetry.io/otel/sdk/resource"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
	"google.golang.org/grpc/credentials"
)

var (
	collectorURL = os.Getenv("OTEL_EXPORTER_OTLP_ENDPOINT")
	insecure     = os.Getenv("INSECURE_MODE")
	ServiceName  = os.Getenv("SERVICE_NAME")
	Tracer       = otel.Tracer("gin-server")
)

// InitTracer configures the global tracer provider and propagator.
func InitTracer() (*sdktrace.TracerProvider, error) {
	// Use TLS only when INSECURE_MODE is explicitly set to a false value.
	var secureOption otlptracegrpc.Option
	if strings.ToLower(insecure) == "false" || insecure == "0" || strings.ToLower(insecure) == "f" {
		secureOption = otlptracegrpc.WithTLSCredentials(credentials.NewClientTLSFromCert(nil, ""))
	} else {
		secureOption = otlptracegrpc.WithInsecure()
	}

	// Export spans to the collector over OTLP/gRPC.
	exporter, err := otlptrace.New(
		context.Background(),
		otlptracegrpc.NewClient(
			secureOption,
			otlptracegrpc.WithEndpoint(collectorURL),
		),
	)
	if err != nil {
		return nil, fmt.Errorf("failed to create exporter: %w", err)
	}

	// Attributes attached to every span this service emits.
	resources, err := resource.New(
		context.Background(),
		resource.WithAttributes(
			attribute.String("service.name", ServiceName),
			attribute.String("library.language", "go"),
		),
	)
	if err != nil {
		return nil, fmt.Errorf("failed to create resource: %w", err)
	}

	traceProvider := sdktrace.NewTracerProvider(
		sdktrace.WithSampler(sdktrace.AlwaysSample()),
		sdktrace.WithBatcher(exporter),
		sdktrace.WithResource(resources),
	)
	otel.SetTracerProvider(traceProvider)
	otel.SetTextMapPropagator(propagation.NewCompositeTextMapPropagator(propagation.TraceContext{}, propagation.Baggage{}))
	return traceProvider, nil
}
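The `propagation.TraceContext{}` propagator registered in tracer.go carries trace identity between services in the W3C `traceparent` HTTP header. To make that less abstract, here is a small sketch that splits such a header into its fields. The `traceParent` type and `parseTraceParent` function are hypothetical helpers written for this post, not part of OpenTelemetry, and the parser is deliberately simplified (it checks field counts and lengths only).

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// traceParent holds the fields of a W3C "traceparent" header value.
type traceParent struct {
	Version string // 2 hex characters, currently "00"
	TraceID string // 32 hex characters, shared by every span in the trace
	SpanID  string // 16 hex characters, the caller's span
	Flags   string // 2 hex characters; "01" has the sampled bit set
}

// parseTraceParent is a simplified parser for illustration only.
// It validates the number of fields and their lengths, nothing more.
func parseTraceParent(h string) (traceParent, error) {
	parts := strings.Split(h, "-")
	if len(parts) != 4 || len(parts[0]) != 2 || len(parts[1]) != 32 ||
		len(parts[2]) != 16 || len(parts[3]) != 2 {
		return traceParent{}, errors.New("malformed traceparent header")
	}
	return traceParent{
		Version: parts[0],
		TraceID: parts[1],
		SpanID:  parts[2],
		Flags:   parts[3],
	}, nil
}

func main() {
	tp, err := parseTraceParent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("trace:", tp.TraceID)
	fmt.Println("parent span:", tp.SpanID)
	fmt.Println("flags:", tp.Flags)
}
```

When product-service calls reviews-service or ratings-service, it is the trace ID in this header that lets the backend stitch the spans from all three services into one trace.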
Injecting traces with Contexts
With tracer.go in place, let's start a span from the Gin context in our ProductController.go:
func GetProducts(c *gin.Context) {
	dbInstance, err := db.SetupDatabase()

	// Write the current trace context into the request headers
	otel.GetTextMapPropagator().Inject(c.Request.Context(), propagation.HeaderCarrier(c.Request.Header))

	// Start a span for this handler
	_, span := tracer.Tracer.Start(c.Request.Context(), "GetProducts")
	// End the span when the function returns
	defer span.End()

	if err != nil {
		ServerError(c)
		return // stop here; the database setup failed
	}

	sql := db.MySQLDB{DBhandler: dbInstance}
	products, err := sql.GetProducts(c)
	if err != nil {
		ServerError(c)
		return // don't fall through to the success response
	}

	c.JSON(http.StatusOK, products)
}
Next, let's initialize our tracer in our main.go file:
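package main

import (
	"context"
	"log"

	"productservice/routes"
	"productservice/tracer"

	"github.com/gin-gonic/gin"
	"go.opentelemetry.io/contrib/instrumentation/github.com/gin-gonic/gin/otelgin"
)

func main() {
	// Set up the global tracer provider before serving requests.
	tp, err := tracer.InitTracer()
	if err != nil {
		log.Fatal(err)
	}
	// Flush any buffered spans on exit.
	defer func() {
		if err := tp.Shutdown(context.Background()); err != nil {
			log.Printf("Error shutting down tracer provider: %v", err)
		}
	}()

	router := gin.New()
	// Create a span for every incoming request.
	router.Use(otelgin.Middleware(tracer.ServiceName))
	routes.SetupRoute(router)

	// Log instead of calling log.Fatal so the deferred shutdown still runs.
	if err := router.Run(":8000"); err != nil {
		log.Printf("Server stopped: %v", err)
	}
}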
Wrapping Up
We don't have to go deeper into how to implement observability, since there are tons of ways to do it and many different use cases. But one thing to note: never operate an application in PRODUCTION without knowing its internal state. 👌😁