Hello World
Near the end of the last post, I noted we would put the static site generator project aside for the time being. I decided that to keep things moving forward I'd change up what we're looking at every few posts. @ladydascalie suggested a couple of exercises that I thought would be good to tackle. This time around we are going to take a swing at the first idea.
End Goal
- Write a program to sort files within a folder by their extension
- Later make it sort them in logical folders ex: .txt in Documents, .jpg in Images etc...
We are going to focus on the first point this time around. The idea is that we'll take a bunch of file names (strings) and print them to standard out in alphabetical order. With that in mind, I decided to start with a slice of filename-like strings, that is, strings with a period (.) in there somewhere. We can then take this slice of strings and range through it. In each step of our range, we will strings.Split() the string at the period. If Split() returns more than one element we have an extension. Extensions are usually two to three characters but could be any length. We're not judging, and will take anything after the last period. The extension and the filename will go into a map[string][]string. We can imagine our final map as JSON which looks something like:
{
"epub": [
"lil-go-book.epub"
],
"jpg": [
"as23dsd.jpg"
],
"md": [
"README.md"
],
"mp3": [
"something.mp3"
],
"pdf": [
"go-in-action.pdf"
],
"txt": [
"asdf.txt",
"qwerwe.txt"
]
}
In fact, I'll add in a feature to print the list as plain text output or as a JSON object. Then you could pipe it to jq, which might be useful.
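The splitting rule described above (keep whatever follows the last period) can be sketched on its own before wiring it into a map. This is only an illustration: lastPart is a hypothetical helper name that does not appear in the post, and it simply wraps the same strings.Split() call.

```go
package main

import (
	"fmt"
	"strings"
)

// lastPart is a hypothetical helper (not from the post). It returns the text
// after the last period, and false when the name contains no period at all.
func lastPart(name string) (string, bool) {
	parts := strings.Split(name, ".")
	if len(parts) == 1 {
		// Split found no period, so there is no extension to report.
		return "", false
	}
	return parts[len(parts)-1], true
}

func main() {
	for _, name := range []string{"README.md", "qwe.rwe.txt", "no-ext"} {
		ext, ok := lastPart(name)
		// Print the raw Split result next to the extension we picked.
		fmt.Printf("%q -> parts=%q ext=%q ok=%v\n", name, strings.Split(name, "."), ext, ok)
	}
}
```

One thing worth knowing about this rule: a dotfile such as .bashrc splits into an empty string and "bashrc", so it would be grouped under "bashrc".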
Let's Go
Let's take a look at our first iteration of the code! It follows the pattern I laid out in my head and works as expected - which was a nice touch.
package main

import (
	"fmt"
	"strings"
)

func main() {
	// m maps an extension to the file names that end in it.
	var m = make(map[string][]string)
	list := []string{"no-ext", "README.md", "asdf.txt", "qwe.rwe.txt", "as23dsd.jpg", "something.mp3", "go-in-action.pdf", "lil-go-book.epub"}
	for _, s := range list {
		// Split at every period; the last element is the extension.
		ext := strings.Split(s, ".")
		if len(ext) > 1 {
			m[ext[len(ext)-1]] = append(m[ext[len(ext)-1]], s)
		}
	}
	fmt.Printf("%v", m)
}
From here I added an if statement to account for files with no extension. While we're at it, let's add a sort.Strings so we print each group in alphabetical order. I'm not sorting the extensions themselves at this point, though; that comes later. You can see our small tweaks in the snippet below.
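...
	for _, s := range list {
		ext := strings.Split(s, ".")
		if len(ext) > 1 {
			m[ext[len(ext)-1]] = append(m[ext[len(ext)-1]], s)
		}
		if len(ext) == 1 {
			m["no-ext"] = append(m["no-ext"], s)
		}
		// Runs on every pass through the loop. For a name without a period
		// the key here is the whole name, so "no-ext" itself is never sorted.
		sort.Strings(m[ext[len(ext)-1]])
	}
	fmt.Printf("%v", m)
}
...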
Edit: As pointed out by @detunized, the sort.Strings() call is not in the best spot. As written in the examples it runs on every pass through the loop, which is not what we want in the end.
Hah. You caught me! The sort should have been moved up into the print function(s) at the very least. It's a bad design decision. I put it there at first just for the sake of simplicity and never got around to cleaning it up. It doesn't really matter in a directory of a few files but would really impact performance in a larger directory. Something like the following might be fine, and still pretty simple to follow.
func plainList(m map[string][]string, v []string) {
	for _, value := range v {
		sort.Strings(m[value])
		for _, file := range m[value] {
			fmt.Println(file)
		}
	}
}
I think I may update the article to make sure it's called out for clarity.
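Another option, along the lines of the loop suggested in the comments, is to sort every group exactly once after all the names have been collected. The sketch below assumes the same Split-based grouping as the post; groupByExt is a hypothetical helper name used only for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// groupByExt is a hypothetical helper (not from the post). It buckets names by
// the text after the last period and files everything else under "no-ext".
func groupByExt(names []string) map[string][]string {
	m := make(map[string][]string)
	for _, name := range names {
		parts := strings.Split(name, ".")
		key := "no-ext"
		if len(parts) > 1 {
			key = parts[len(parts)-1]
		}
		m[key] = append(m[key], name)
	}
	// Sort each bucket exactly once, after every name has been added.
	for key := range m {
		sort.Strings(m[key])
	}
	return m
}

func main() {
	m := groupByExt([]string{"qwe.rwe.txt", "no-ext", "asdf.txt", "README.md"})
	fmt.Println(m["txt"])
}
```

With the sort done here, none of the print functions would need to sort anything themselves.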
Do It For Real-ish
We have the basic program done! Now we need to be able to run it against the actual file system. To do this we are going to use the os and io/ioutil packages from the standard library, as well as reflect. We're going to add a couple of different pieces in this iteration of the code so let's dive in.
package main

import (
	"fmt"
	"io/ioutil"
	"os"
	"reflect"
	"sort"
	"strings"
)
In main() we're adding os.Getwd() to grab the user's current working directory. If we can't determine it for some reason we'll exit with a message. Note that I'm trying to give a slightly more detailed error. We don't panic here but instead call os.Exit(). Why? Exiting with an error code felt better in this situation than a wordy panic(). If that succeeds, we'll try to read the directory, again exiting if we can't. We also check whether each entry is a directory and skip it if so, since we're only looking at files for now. We could sort them into a "directory" group next time, I suppose.
func main() {
	wd, err := os.Getwd()
	if err != nil {
		msg := fmt.Sprintf("An error occurred getting the current working directory.\n%s", err)
		fmt.Println(msg)
		os.Exit(1)
	}
	dir, err := ioutil.ReadDir(wd)
	if err != nil {
		msg := fmt.Sprintf("An error occurred reading the current working directory.\n%s", err)
		fmt.Println(msg)
		os.Exit(1)
	}
	var m = make(map[string][]string)
	for _, file := range dir {
		// Skip directories; only regular entries are grouped.
		if !file.IsDir() {
			fileName := file.Name()
			ext := strings.Split(fileName, ".")
			if len(ext) > 1 {
				m[ext[len(ext)-1]] = append(m[ext[len(ext)-1]], fileName)
			}
			if len(ext) == 1 {
				m["no-ext"] = append(m["no-ext"], fileName)
			}
			sort.Strings(m[ext[len(ext)-1]])
		}
	}
We're using reflect to get the keys of our map, which are our extension strings. Thank goodness for Go Docs! This lets us print a nested list, with each extension followed by the files in its group.
	values := reflect.ValueOf(m).MapKeys()
	for i, k := range values {
		fmt.Println(values[i])
		for _, val := range m[k.String()] {
			fmt.Println(" -", val)
		}
	}
}
That seems to fulfill the base program requirements...
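As an aside, reflect is not strictly required to get at the keys: ranging over the map yields them directly, and since Go does not guarantee any map iteration order, sorting the collected keys gives a stable listing. The following is a minimal sketch of that alternative, not the approach the post takes; sortedKeys is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys is a hypothetical helper (not from the post). It collects the
// map's keys with a plain range loop and returns them in alphabetical order.
func sortedKeys(m map[string][]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	m := map[string][]string{
		"txt": {"asdf.txt"},
		"md":  {"README.md"},
		"jpg": {"as23dsd.jpg"},
	}
	// Print each extension followed by its files, as in the nested output.
	for _, ext := range sortedKeys(m) {
		fmt.Println(ext)
		for _, file := range m[ext] {
			fmt.Println(" -", file)
		}
	}
}
```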
But Wait There's More
We're not done yet! We need to do one more iteration. Since this post is already getting a bit long we're going to skip forward. I'm going to add in several things we mentioned above: a switch to output JSON, a "plain" ls style, and the nested style shown above. We'll read the output format from the command line and use a simple switch statement to choose the right one. I wasn't very explicit with the variable names, but it should still be easy to follow.
package main

import (
	"encoding/json"
	"fmt"
	"io/ioutil"
	"os"
	"reflect"
	"sort"
	"strings"
)
The first thing I did on this iteration was pull the print routine out of the main loop and into its own function. I then made two more print functions, one for each remaining output type. I was going to try to be clever and overcomplicate things by having only one "print" function. In the end, I decided they were different enough that it would be fine to have each routine on its own.
func plainList(m map[string][]string, v []string) {
	for _, value := range v {
		for _, file := range m[value] {
			fmt.Println(file)
		}
	}
}

func nestedList(m map[string][]string, v []string) {
	for i, value := range v {
		fmt.Println(v[i])
		for _, file := range m[value] {
			fmt.Println(" - ", file)
		}
	}
}
If you look at the next three err sections you'll see that they are more or less the same. If I extend this program any further beyond the basics it may be worth pulling these bits out. We could make an isOK() type of function, I suppose. This function would check the error and either exit or return as needed.
func jsonList(m map[string][]string) {
	j, err := json.Marshal(m)
	if err != nil {
		msg := fmt.Sprintf("An error occurred formatting the JSON.\n%s", err)
		fmt.Println(msg)
		os.Exit(1)
	}
	fmt.Printf("%s", j)
}

func main() {
	wd, err := os.Getwd()
	if err != nil {
		msg := fmt.Sprintf("An error occurred getting the current working directory.\n%s", err)
		fmt.Println(msg)
		os.Exit(1)
	}
	dir, err := ioutil.ReadDir(wd)
	if err != nil {
		msg := fmt.Sprintf("An error occurred reading the current working directory.\n%s", err)
		fmt.Println(msg)
		os.Exit(1)
	}
	var m = make(map[string][]string)
	for _, file := range dir {
		// Skip directories; only regular entries are grouped.
		if !file.IsDir() {
			fileName := file.Name()
			ext := strings.Split(fileName, ".")
			if len(ext) > 1 {
				m[ext[len(ext)-1]] = append(m[ext[len(ext)-1]], fileName)
			}
			if len(ext) == 1 {
				m["no-ext"] = append(m["no-ext"], fileName)
			}
			sort.Strings(m[ext[len(ext)-1]])
		}
	}
	values := reflect.ValueOf(m).MapKeys()
To print the extensions in alphabetical order, I've added this quick loop. We use the values that we got from reflect to build an ordered list of the extensions. The now sorted extensions are then passed into our print functions.
	var extensions []string
	for _, value := range values {
		extensions = append(extensions, value.String())
	}
	sort.Strings(extensions)
When the program executes we check the number of command line arguments. If an argument was passed (os.Args has more than one element, since the first is the program name) we check if it matches one of the cases. If not we print the usage instructions. If we have no command line arguments we print out the nested style file list.
	if len(os.Args) > 1 {
		switch arg := os.Args[1]; arg {
		case "plain":
			plainList(m, extensions)
		case "nested":
			nestedList(m, extensions)
		case "json":
			jsonList(m)
		default:
			fmt.Println("Usage: gls [plain|nested|json]")
		}
	} else {
		nestedList(m, extensions)
	}
}
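The isOK() idea mentioned earlier could take a shape like the one below. This is a sketch under assumptions, not code from the post: the names failure and exitOn are hypothetical, the message goes to standard error rather than standard out, and os.ReadDir (available since Go 1.16) stands in for ioutil.ReadDir.

```go
package main

import (
	"fmt"
	"os"
)

// failure is a hypothetical helper (not from the post). It builds the error
// message for an action, or returns "" when err is nil.
func failure(action string, err error) string {
	if err == nil {
		return ""
	}
	return fmt.Sprintf("An error occurred %s.\n%s", action, err)
}

// exitOn prints the message and exits with status 1 when err is non-nil.
// When err is nil it simply returns.
func exitOn(action string, err error) {
	if msg := failure(action, err); msg != "" {
		fmt.Fprintln(os.Stderr, msg)
		os.Exit(1)
	}
}

func main() {
	wd, err := os.Getwd()
	exitOn("getting the current working directory", err)

	// Assumption: os.ReadDir is used here in place of the post's ioutil.ReadDir.
	entries, err := os.ReadDir(wd)
	exitOn("reading the current working directory", err)

	fmt.Println(len(entries), "entries in", wd)
}
```

Each of the repeated err sections would then shrink to a single exitOn call. Keeping the message building in its own function also leaves something that can be checked without exiting the process.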
Next time
And there we go! The post is getting a bit long so we'll hold off on the "bonus goal" of sorting files into directories. This code will become the base for that next time around. In the meantime, how would you have written a similar program? Let me know in the comments!
You can find the code for this and most of the other Attempting to Learn Go posts in the repo on GitHub.
shindakun / atlg
Source repo for the "Attempting to Learn Go" posts I've been putting up over on dev.to
Attempting to Learn Go
Here you can find the code I've been writing for my Attempting to Learn Go posts that I've been writing and posting over on Dev.to.
Top comments (11)
Steve, why do you sort inside the loop on every iteration?
And here's my take on it. You can sort by predicate. It's not exactly very efficient though, since the extension is recalculated every time. But come on, Go could be really annoying sometimes. Look at this verbosity:
In Ruby that would be:
Nah, no need for the boilerplate to define a custom sort. This would do sorting the map:
Edit: sorting the `[]string` within the `map[string][]string`. The map itself can't be sorted.
I don't have a map in my version. I sort an array by a predicate.
Yes, I have seen it. Your approach is different in that it just sorts a list of filenames. The original intent is to sort files by extension into different buckets.
Your use of filepath.Ext() is quite clever. I hadn't thought of that.
This would make the example even shorter:
@detunized @dirkolbrich
Thanks for the replies!
filepath.Ext()! Didn't occur to me to try that. It goes to show that the standard library really is pretty complete.
Dirk, I like the for ext := range m { sort.Strings(m[ext]) } solution. Then I wouldn't need to have a separate sort in each "print" function; it's much clearer that way.
Hah. You caught me! The sort should have been moved up into the print function(s) at the very least. It's a bad design decision - I put it there at first just for the sake of simplicity and never got around to cleaning it up. It doesn't really matter in a directory of a few files but would really impact performance in a larger directory. Something like the following might be fine, and still pretty simple to follow.
I think I may update the article to make sure it's called out for clarity.
You don't mind?
use it with:
Hey Steve, fantastic start.
You've come up with some great solutions in there, so I thought I'd share my own.
I restricted myself to fitting a subset of what you've solved thus far, that is to say, get all the files organised by category, and print them out as JSON. I've ignored plain / nested output, since that is somewhat trivial/not business logic.
Here's my solution:
As you can see, I've drastically cut down on the number of operations needed to get there, as well as corrected for a few problems you weren't looking out for yet. These are mainly:
- You should skip dotfiles or hidden files, which start with a . character (at least by default), as these are frequently config files or important somehow.
- You're expending a lot of effort sorting / printing your data, when really all you need is a map to handle the listing
output from my program (against a sample directory):
usage:
sorter -dir ./sample | jq
If I wanted plain output, with a map I could do something like this:
which would output like so:
This is somewhat trite and gross but you get the point, dealing with one map makes this much easier to handle!
Looking forward to seeing what you come up with next!
I think the core of my issue is I'm also not leaning on the standard library as much as I should. I didn't realize filepath.Ext() was a thing. :/ Yeah, I read "sorting files by ext" as just that, sorting alphabetically; had I left that out I would have been done quite a bit quicker. I suppose that made me go off the rails a bit, so to speak. The different printing methods were not needed at all but what are you gonna do lol.