TL;DR - Fixed-length files are troublesome to handle, and even more so when a single file mixes several kinds of records. To solve this, we'll use Guiabolso's fixed-length-file-handler, a library designed just for this purpose.
Hello Kotliners!
In this article we will take a quick look at how to handle fixed-length files using this library by Guiabolso (with some contributions of my own). There are many solutions for this kind of problem on the JVM, but all of them are focused on Java.
These libraries are old, not optimized for Kotlin, and cumbersome and verbose to use from it. We created a Kotlin DSL to handle these cases in a beautiful and concise way.
Remembering: What is a Fixed-Length file?
If you have ever worked with a fixed-length file, you know that most of the time it's a big pain in the ass.
A fixed-length (or fixed-width) file is a file whose data is split into fields. Each field has a specific length and is sometimes padded at the start or the end with a filler character to occupy the unused space, such as 0003 for an int 3 in a field of fixed length 4.
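To make the padding idea concrete, here is a minimal sketch that uses only the Kotlin standard library. The helper names `encodeInt` and `decodeInt` are hypothetical and are not part of any library mentioned in this article.

```kotlin
// Hypothetical helpers (not from the library): pad and un-pad an Int field.

// Writes a non-negative Int as a fixed-length field, left-padded with '0'.
fun encodeInt(value: Int, length: Int): String {
    require(value >= 0) { "only non-negative values are supported in this sketch" }
    val text = value.toString()
    require(text.length <= length) { "$value does not fit in $length characters" }
    return text.padStart(length, '0')
}

// Reads the field back: strip the left padding, then parse.
// An all-zero field such as "0000" trims to "", so fall back to "0".
fun decodeInt(field: String): Int =
    field.trimStart('0').ifEmpty { "0" }.toInt()
```

With these helpers, `encodeInt(3, 4)` produces the `0003` from the example above, and `decodeInt("0003")` turns it back into `3`.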
An example of a fixed length file is the following:
Kotlin    1201.63
Java      0219.52
Python    0129.62
Javascript0308.43
It represents data from PYPL, and the data is organized as:
Field | From index | To index (exclusive) | Padding |
---|---|---|---|
Language Name | 0 | 10 | PaddingRight(' ') |
Ranking | 10 | 12 | PaddingLeft('0') |
Share % | 12 | 17 | PaddingLeft('0') |
This kind of file is broadly used by legacy systems (I'm looking at you, banking system), and integrating with them is troublesome and not fun.
Parsing fixed-length files
These files are hard to understand and boring to deal with. Parsing them usually involves a lot of string manipulation and manual buffer control (or loading the entire file into memory, when that's viable).
When dealing with Kotlin code, this verbosity is annoying. We want simplicity and conciseness.
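For comparison, this is roughly what the manual approach looks like for the PYPL layout described above. It is a sketch using only the standard library; `ManualPyplRecord`, `parsePyplLine` and `parsePyplLines` are names made up for this illustration.

```kotlin
import java.io.Reader

// Hypothetical record type for this sketch only.
data class ManualPyplRecord(val langName: String, val ranking: Int, val share: Double)

// Cuts one 17-character line by hand, following the index table above.
fun parsePyplLine(line: String): ManualPyplRecord {
    require(line.length >= 17) { "expected at least 17 characters, got ${line.length}" }
    return ManualPyplRecord(
        // Columns 0..9: name, right-padded with spaces.
        langName = line.substring(0, 10).trimEnd(' '),
        // Columns 10..11: ranking, left-padded with zeros.
        ranking = line.substring(10, 12).trimStart('0').ifEmpty { "0" }.toInt(),
        // Columns 12..16: share; toDouble() accepts leading zeros as-is.
        share = line.substring(12, 17).toDouble()
    )
}

// Reads every non-blank line and closes the reader when done.
fun parsePyplLines(reader: Reader): List<ManualPyplRecord> =
    reader.useLines { lines ->
        lines.filter { it.isNotBlank() }.map(::parsePyplLine).toList()
    }
```

Every index and padding rule is repeated inline, so a one-character change in the layout means hunting through `substring` calls. That is the verbosity a declarative description of the fields is meant to remove.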
For this, the company I work for, Guiabolso, developed a small library for fixed-length file handling in Kotlin.
GuiaBolso / fixed-length-file-handler
Handlers for Fixed Length files in a beautiful Kotlin DSL
This library provides a beautiful (I wrote it, so I can say it's beautiful, right?) Kotlin DSL to parse this kind of file. Using our previous example:
data class PYPLRecord(val langName: String, val ranking: Int, val share: Double)

val pyplSequence = fixedLengthFileParser<PYPLRecord>(fileStream) {
    PYPLRecord(
        field(0, 10, Padding.PaddingRight(' ')),
        field(10, 12, Padding.PaddingLeft('0')),
        field(12, 17, Padding.PaddingLeft('0'))
    )
}
This will allow us to map our file to a lazy Sequence, which will process the file as a stream instead of loading it all into memory. The library already supports many of the usual Java/Kotlin types, without having to cast and translate them.
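To illustrate what "lazy" means here, the sketch below uses a plain standard-library line sequence as a stand-in for the parser's output (it does not call the library). `CountingLines` and `firstLanguageNames` are hypothetical names; the counter only exists to show how many lines were actually pulled.

```kotlin
import java.io.BufferedReader

// Hypothetical wrapper: exposes the reader as a Sequence and counts pulled lines.
class CountingLines(private val reader: BufferedReader) {
    var linesRead = 0
        private set

    // lineSequence() is lazy: a line is read only when a consumer asks for it.
    fun asSequence(): Sequence<String> = reader.lineSequence().onEach { linesRead++ }
}

// Takes the language names of the first n lines and stops reading after that.
fun firstLanguageNames(lines: Sequence<String>, n: Int): List<String> =
    lines.map { it.substring(0, 10).trimEnd(' ') }.take(n).toList()
```

Asking for only the first name of a three-line input leaves `linesRead` at 1, because `take(1)` stops pulling from the source once it has its element. A `Sequence` returned by a parser can be consumed with the same operators.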
"Advanced" fixed-length files
For some reason yet to be determined, some of these legacy systems use the same file for more than one record type.
In these cases, the file from our example above also carries other kinds of records, such as developers with their name and preferred language:
1Kotlin    1201.63
1Java      0219.52
1Python    0129.62
1Javascript0308.43
2Leonardo Colman LopesKotlin    
2Jane Doe             Javascript
The type of the record is marked at some position in the line, and your system must find a way to parse it anyway.
This leads to an even bigger spaghetti of String manipulation and to code that is harder to maintain.
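As a sketch of that manual dispatch, using only the standard library (Kotlin 1.5+ for `sealed interface`), the record type in column 0 selects which layout to apply. All names here (`FileRecord`, `LanguageRecord`, `DeveloperRecord`, `parseMixedLine`) are invented for the illustration.

```kotlin
// Hypothetical types for this sketch only.
sealed interface FileRecord
data class LanguageRecord(val langName: String, val ranking: Int, val share: Double) : FileRecord
data class DeveloperRecord(val devName: String, val preferredLang: String) : FileRecord

// Picks a layout from the marker in column 0, then cuts the fields by index.
fun parseMixedLine(rawLine: String): FileRecord {
    require(rawLine.isNotEmpty()) { "empty line has no record type" }
    // Re-pad to the longest layout (32 chars) in case trailing spaces were stripped.
    val line = rawLine.padEnd(32, ' ')
    return when (line[0]) {
        '1' -> LanguageRecord(
            langName = line.substring(1, 11).trimEnd(' '),
            ranking = line.substring(11, 13).trimStart('0').ifEmpty { "0" }.toInt(),
            share = line.substring(13, 18).toDouble()
        )
        '2' -> DeveloperRecord(
            devName = line.substring(1, 22).trimEnd(' '),
            preferredLang = line.substring(22, 32).trimEnd(' ')
        )
        else -> throw IllegalArgumentException("Unknown record type '${line[0]}'")
    }
}
```

Each new record type adds another branch with its own set of hard-coded indexes, which is where the spaghetti comes from.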
The library also provides a way to parse this kind of file:
data class PYPLRecord(val langName: String, val ranking: Int, val share: Double)
data class DevRecord(val devName: String, val preferredLang: String)

fixedLengthFileParser<Any>(fileInputStream) {
    // Lines starting with '1' are language records
    withRecord({ line -> line[0] == '1' }) {
        PYPLRecord(
            field(1, 11, Padding.PaddingRight(' ')),
            field(11, 13, Padding.PaddingLeft('0')),
            field(13, 18, Padding.PaddingLeft('0'))
        )
    }
    // Lines starting with '2' are developer records
    withRecord({ line -> line[0] == '2' }) {
        DevRecord(
            field(1, 22, Padding.PaddingRight(' ')),
            field(22, 32, Padding.PaddingRight(' '))
        )
    }
}
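Because the parser above is typed as `Any`, downstream code has to tell the record types apart again. A minimal sketch of that step, assuming the result is a `Sequence<Any>` like in the single-record example: `sampleRecords` is a hand-built stand-in for the parser's output, and `describe` and `topLanguage` are hypothetical helpers.

```kotlin
data class PYPLRecord(val langName: String, val ranking: Int, val share: Double)
data class DevRecord(val devName: String, val preferredLang: String)

// Stand-in for the parser's output (assumption: a Sequence<Any> of both record types).
fun sampleRecords(): Sequence<Any> = sequenceOf(
    PYPLRecord("Kotlin", 12, 1.63),
    PYPLRecord("Python", 1, 29.62),
    DevRecord("Jane Doe", "Javascript")
)

// Smart casts inside `when` give typed access to each record's fields.
fun describe(record: Any): String = when (record) {
    is PYPLRecord -> "#${record.ranking} ${record.langName} (${record.share}%)"
    is DevRecord -> "${record.devName} prefers ${record.preferredLang}"
    else -> "unknown record: $record"
}

// filterIsInstance keeps only one record type, lazily.
fun topLanguage(records: Sequence<Any>): PYPLRecord? =
    records.filterIsInstance<PYPLRecord>().minByOrNull { it.ranking }
```

If the record classes shared a common sealed parent, that parent could be used as the type argument instead of `Any`, making the `when` exhaustive without an `else` branch.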
We believe that parsing fixed-length files will be easier with this library, and we hope to help anyone that needs this kind of feature. Take a look!
GuiaBolso / fixed-length-file-handler
Handlers for Fixed Length files in a beautiful Kotlin DSL
Fixed Length File Handler
Introduction
When processing data from some systems (mainly legacy ones), it's usual to have Fixed Length Files, which are files whose lines have their content split using a specific length for each field of a record.
These files are sometimes tricky to handle, as many times there is a spaghetti of string manipulations and padding, and character counting and... Well, many things to take care of.
This library comes to the rescue of programmers dealing with fixed length files. It enables you to simply define how your records are structured and it will handle these records for you in a nice Kotlin DSL for further processing.
Using with Gradle
Import it into your dependencies:
dependencies {
    implementation("br.com.guiabolso:FixedLengthFileHandler:{version}")
}
Basic Usage
The basic usage assumes that you're reading a file with a single type of record.
Given a Fixed-Length File:
…