Logging Frameworks... Moving from SEQ to f/Lo/G... and why I didn't pick ELK

First, to anyone with a "FLOG" app: I'm not trying to take the name away or claim it, but I do think "Flog" is a neat little way of describing the logging stack, similar to ELK... (Fluent Bit / Loki / Grafana)

the why

With the latest flare-ups of companies doing a "rug pull" on their licensing schemes, locking down functionality, and outright abandoning the people who put them in the spots they are in (looking at you, Redis)... I felt it was time to start migrating any proprietary solutions over to more free and open-source ones. I'm not in any way saying Datalust is doing this with their Seq product. I have used it for years as a log aggregator and it has worked very well. I did have a few hiccups a while back when looking at logs on my computer and phone at the same time, due to limitations of the software, but overall it has done what I wanted it to do: be a very lightweight logging system that just works out of the box.

Evaluating the options was rough, though. The industry standard says ELK (or an OpenSearch variation) is the way to go. Having Elasticsearch as your main engine, Logstash as the aggregator, and Kibana as your main UI felt very heavy to me for running something that does one thing (or in my mind does one thing). Also... is it just me, or does setting up these solutions feel like you need a PhD in logging to get them to work?

So I did what any developer would do... googled something for 5 minutes, gave up, and asked ChatGPT what it would do. A lot of the recommendations it gave are great, I'm sure, but for a full suite of things that has to run on something like a Raspberry Pi, most of my research pointed to Grafana Loki as the main product to use.

I've used Loki before, and it was OK, but Grafana was kind of overkill. I originally started with Grafana / Prometheus / agents to monitor my systems. That solution is great, but very, very overkill for what I have, which is a basic home network with a handful of services I host. I prefer, and still use, Uptime Kuma as my main monitoring solution. When I was using Loki, I was overwhelmed. Not so much by the data itself, but by the amount of data. I was using Promtail as my collector, and the system felt very, very slow... especially with the amount of data I collected. See, I was collecting all of my Docker container logs, and some of my home lab services are very chatty on the console. That said, I figured I'd give it a fresh start and use something other than Promtail. To me, the concept of something like Promtail, which either has to read the data live or query it every so many minutes, felt like a waste. When it comes to my applications, I want live data coming into one and only one location, and only a limited amount of it.

I could have used Loki's REST API directly for my solution. I hate adding complexity to a project, but I do love performance, and for me Fluent Bit felt like a good fit. I can configure my endpoints and how I want to connect to them, which I did. I picked a raw TCP input since TCP sits a level below HTTP in the network stack: there are no HTTP headers and no request/response round trip per event, and the connection (which still needs the usual TCP handshake) can stay open across many events. That keeps each log event as light as possible. It also gave me a trio of services similar to the ELK platform, each serving the same use case. The third reason I went with Fluent Bit is that if I ever need to switch over to ELK or OpenSearch, I can. Fluent Bit is a great, quick middleware layer and is often recommended for people who don't want to run Logstash in their ELK stack.

Final thought on this... In terms of resources used, going by my Portainer instance, my "Flog" stack uses just a few more resources than my Seq instance. This is good, since I was worried that by cutting over to a multi-app stack I'd somehow be pushing the hardware limits of my Raspberry Pi.

the code

The cutover. At the moment I'm doing a proof of concept more than anything, but I found that Serilog for .NET was probably the best bet for cutting over. My ASP.NET services already use the Microsoft logging platform, and my original service was calling, in the service.cs, a simple

services.AddLogging(loggingBuilder =>
{
    loggingBuilder.AddSeq(Configuration.GetSection("SEQ"));
});

With Serilog, it took a bit, but I was able to get fairly close to the same code. I needed to go to NuGet and install the Serilog.AspNetCore and Serilog.Sinks.Network packages. I really, really tried using the Serilog.Settings.Configuration package to make the setup easy, but settled on doing it without the automatic configuration bindings. This left me with

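// Needs at the top of the file: using Serilog; using Serilog.Formatting.Compact;
services.AddLogging(loggingBuilder =>
{
    var logger = new LoggerConfiguration()
        .Enrich.WithProperty("App", name)   // tag every event with the app name
        .MinimumLevel.Debug()
        .WriteTo.Console()
        // only Warning and above are shipped over TCP to Fluent Bit
        .WriteTo.TCPSink(loggerConnection, new CompactJsonFormatter(), Serilog.Events.LogEventLevel.Warning)
        .CreateLogger();
    loggingBuilder.AddSerilog(logger);
});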

Not as clean as the Seq version, but to be fair, not too bad either. I can trim it down by not writing to the console if I don't want to, and I am adding an extra App property to the JSON structure so that I can tell which app is causing the error message.
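To make it a bit more concrete what ends up on the wire, here is a minimal Python sketch of a client doing roughly what the TCP sink does: build one newline-terminated JSON event and push it over a plain TCP connection. The field names (`@t`, `@mt`, `@l`) follow Serilog's compact JSON format as I understand it; the helper names `build_event` and `send_event`, the host, and the sample message are made up for illustration and are not taken from my services.

```python
import json
import socket
from datetime import datetime, timezone


def build_event(app: str, level: str, message: str) -> bytes:
    """Build one newline-terminated JSON event in a compact-JSON-like shape."""
    event = {
        "@t": datetime.now(timezone.utc).isoformat(),  # event timestamp
        "@mt": message,                                 # message template
        "@l": level,                                    # level, e.g. "Warning"
        "App": app,                                     # same idea as Enrich.WithProperty("App", name)
    }
    return (json.dumps(event) + "\n").encode("utf-8")


def send_event(host: str, port: int, payload: bytes) -> None:
    """Open a TCP connection, send the payload, and close the connection."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(payload)
```

Assuming the stack is reachable on localhost, something like `send_event("localhost", 12201, build_event("demo-app", "Warning", "Disk almost full"))` should produce one record on the TCP input. A real sink would keep the connection open between events instead of reconnecting each time.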

Now to the server-side stuff.
I am a huge fan of docker compose files, and this stack is no exception. The layout is as follows (note: anything with an _ is a folder)

.
├── docker-compose.yml
├── fluent-bit.conf
├── loki-config.yaml
├── _fluentbit
│ └── _logs
├── _loki
│ ├── _loki
│ └── _wal
└── _grafana

I prefer having my configs right in the project root, but to each their own. The folders are needed for storing the live data. I'm not a fan of volumes in the Docker sense, so I just bind-mount the folders that exist in the project.

The first step in the process is Fluent Bit. The config I have set is

[INPUT]
    Name            http
    Listen          0.0.0.0
    Port            24224

[INPUT]
    Name            tcp
    Listen          0.0.0.0
    Port            12201

[OUTPUT]
    Name        grafana-loki
    Match       *
    Url         http://loki:3100/loki/api/v1/push
    RemoveKeys  source,container_id
    Labels      {job="fluent-bit"}
    LabelKeys   container_name
    BatchWait   1s
    BatchSize   1001024
    LineFormat  json
    LogLevel    info

I expose two inputs on two different ports (24224 for HTTP, 12201 for TCP).
Input 1 is an HTTP endpoint that I can make POST calls to. This will come in handy for the basic bash shell scripts I have, so they can log their items. The second input is the TCP input, which I will use for any application-based logging I do. The output is configured specifically for Loki, and the log level is set to info. I could have gone higher, but I think that kind of limit fits better in the application than in the logger (at least for my scenario, which is hosted 100% internally and has a lower likelihood of being breached).
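As a rough illustration of the HTTP input, here is a small Python sketch that posts a single JSON record to it. It assumes the Fluent Bit HTTP input accepts a JSON body with a `Content-Type: application/json` header and takes the tag from the URL path, which is how I read the Fluent Bit docs. The helper names, the `backup.script` tag, and the record fields are hypothetical.

```python
import json
import urllib.request


def build_request(base_url: str, tag: str, record: dict) -> urllib.request.Request:
    """Build a POST request carrying one JSON record; the URL path is used as the tag."""
    body = json.dumps(record).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{tag}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_log(base_url: str, tag: str, record: dict) -> int:
    """Send the record and return the HTTP status code of the response."""
    request = build_request(base_url, tag, record)
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

From a script, a call like `post_log("http://localhost:24224", "backup.script", {"message": "backup finished", "level": "info"})` would be the equivalent of a one-line curl POST.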

From Fluent Bit, we go to the Loki configuration

auth_enabled: false
server:
  http_listen_port: 3100
  grpc_listen_port: 9095

ingester:
  lifecycler:
    address: 127.0.0.1
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
    final_sleep: 0s
  chunk_idle_period: 15m
  max_chunk_age: 1h
  chunk_target_size: 1048576
  chunk_retain_period: 30s


schema_config:
  configs:
    - from: 2020-02-25
      store: boltdb
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

storage_config:
  boltdb:
    directory: /var/lib/loki/index

  filesystem:
    directory: /var/lib/loki/chunks

limits_config:
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h

chunk_store_config:
  max_look_back_period: 0s

table_manager:
  retention_deletes_enabled: true
  retention_period: 30d

compactor:
  working_directory: /var/lib/loki/boltdb-shipper
  shared_store: filesystem

ruler:
  storage:
    type: local
    local:
      directory: /var/lib/loki/rules
  rule_path: /var/lib/loki/rules-temp

I'm sure there is some cleanup I can do with this. For one, I'm not using gRPC for any of my logging, so I don't think I need that port (I could be wrong, though). My goal is to keep data for about a month before throwing it away, which means the Loki instance needs persistent storage. I've yet to run this for longer than a few hours, so there might end up being some corrections to these files.

Third and last is the compose file. I could have provisioned Loki as a data source right from the compose setup, or something like that, but it only takes a few seconds to add Loki (http://loki:3100) to the data sources in Grafana by hand.

version: '3'

services:
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3003:3000"
    volumes:
      - ./grafana:/var/lib/grafana
    networks:
      - logging_network

  loki:
    image: grafana/loki:latest
    command: -config.file=/etc/loki/local-config.yaml
    ports:
      - "3100:3100"
    volumes:
      - ./loki/loki:/var/lib/loki/
      - ./loki-config.yaml:/etc/loki/local-config.yaml
      - ./loki/wal:/wal
    networks:
      - logging_network

  fluent-bit:
    image: grafana/fluent-bit-plugin-loki:latest
    #image: fluent/fluent-bit:latest
    container_name: fluent-bit
    environment:
      - LOKI_URL=http://loki:3100/loki/api/v1/push
      - LOG_PATH=/var/log/*.log=value
    ports:
      - "24224:24224"
      - "12201:12201"
    volumes:
      - ./fluentbit/logs:/var/log
      - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
    networks:
      - logging_network

networks:
  logging_network:

I'm currently using the grafana/fluent-bit-plugin-loki image, which only runs on x64 at the moment. I do plan on trying to cut this over to the official Fluent Bit image, which supports ARM-based CPUs.
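Once the stack is up, a quick way to check that events are actually landing in Loki (without clicking through Grafana) is to ask Loki's HTTP API directly. The sketch below is a hedged example, not part of my setup: it assumes Loki's `/loki/api/v1/query_range` endpoint with `query`, `start`, `end`, and `limit` parameters and the usual `data.result[].values` response shape. The helper names are made up; the `{job="fluent-bit"}` selector matches the static label from the Fluent Bit output above.

```python
import json
import time
import urllib.parse
import urllib.request
from typing import List, Optional


def build_query_url(base_url: str, logql: str, minutes: int = 15,
                    limit: int = 20, now_ns: Optional[int] = None) -> str:
    """Build a query_range URL covering the last `minutes` minutes."""
    end_ns = now_ns if now_ns is not None else time.time_ns()
    start_ns = end_ns - minutes * 60 * 1_000_000_000
    params = urllib.parse.urlencode({
        "query": logql,      # LogQL stream selector
        "start": start_ns,   # nanosecond Unix timestamps
        "end": end_ns,
        "limit": limit,
    })
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{params}"


def extract_lines(response: dict) -> List[str]:
    """Flatten the log lines out of a query_range response body."""
    lines = []
    for stream in response.get("data", {}).get("result", []):
        for _timestamp, line in stream.get("values", []):
            lines.append(line)
    return lines


def recent_lines(base_url: str, logql: str) -> List[str]:
    """Fetch and return the most recent log lines for a selector."""
    with urllib.request.urlopen(build_query_url(base_url, logql), timeout=10) as resp:
        return extract_lines(json.load(resp))
```

With the port mapping from the compose file, `recent_lines("http://localhost:3100", '{job="fluent-bit"}')` should return whatever was pushed in the last 15 minutes, or an empty list if nothing has arrived yet.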

As with any of my posts... if you see any improvements I can make or ways to clean things up, let me know. I'm still in the middle of my project, but I'm at a point where publishing a quick little how-to / why article felt OK to do.
