TaiKedz

Posted on Sep 7, 2020 • Edited on May 5, 2022

Your bash scripts are rubbish, use another language

#bash #linux #python #reaction

(Headline photo from nixcraft's post to which I was reacting below)

TLDR:

Shell scripting is real programming

If you want to write shell script learn it properly

Use programming best practices like you would with other languages

Source-control it

It's weirdly idiosyncratic and needs extra attention

If the above doesn't suit you, you'll be better served by a less idiosyncratic language

So the below was a rant I posted in response to some pushback - someone suggested using Python instead of bash, and a few people complained about how it's overkill, how there are two versions of Python you need to get right, or you have to get it onto the machines in the first place, or suchlike.

I love shell scripting, and I still use it a lot, but I'm no fool. There are so many issues with it that blindly defending it for all use cases is foolhardy.

Not least because in company settings, most other people you will work with haven't the slightest idea what they're doing with shell scripts. They already get brownie points for thinking of putting the scripts in source control, kind of like getting grades for writing your name on your test.

SHELL SCRIPTING IS REAL PROGRAMMING. It should be source controlled, code reviewed and written to clean, maintainable standards. Because that code is meant for production.

So this is what I retorted:

If it's a personal machine, install Python - or other language of choice.

If it's the enterprise machines, have a policy to ensure Python - or other language of choice.

Shell is good and great, I'm a command line junkie myself, and I still turn to shell scripts for a lot of my work, but a shell script that's for more than wrapping a long pipe or two quickly becomes madness unless you actually put in the time and effort to learn the language PROPERLY. (The article feature pic is an example of utterly shitty code and their problem is not the filename spaces YET, but their RUBBISH handling of variables, and the lack of any effort to write cleanly)

I am an extensive bash scripter, and became so historically only for the reasons listed around here, which I once saw as valid reasons. They can all be worked around - and the overhead of solving "language X not on our machine farm" is often better return on effort than the years of unmaintainable brittle shell scripts you've been writing by the dozen and never maintaining, or that your colleagues cannot fathom and won't touch for love nor money. Unfathomable shell scripts that run the backbones of deployments, builds, farm management and more - the backbones of many a company.

I've seen bash scripts by seasoned developers, and those are also utter trash. Your code may be cleaner than the average, but that's an extremely low bar.

Shell scripting is good and powerful in its own right, it is true, and I have advocated that people give it a proper try and actually learn it, but the sad reality is that nobody does. If it's any "proper" language, they learn the ins and outs gladly, by peer expectation or inherited bias; but shell, even if you learn it properly, someone else will come f/ck up your clean code because they can't be arsed.

I am endlessly pushing for developers and admins to actually learn to use bash/shell properly, make use of functions, encapsulate steps of logic, write clean code. "It's just a shell script," "that's overkill," "it's fine like this," "I don't want to sink time into learning this, it's not a real language anyway."

The other truth is that, in the sysadmins space, most can't even write clean Python/Ruby/Perl/PHP/JavaScript/chosen-lang either, and given the number of gotchas and things shell lets you get away with until you hit a catastrophic bug (of the coder's carelessness, the misunderstood shell behaviour is documented) (Steam bug anyone?), they'd be better off in a safer environment than shell scripting.

Shell scripting perennial issues that are not the fault of the coder:

will gladly let you get away by default with undefined variables (unless you explicitly set -ue)
comparison vs assignment is the only place where space matters (a=b and a = b are COMPLETELY different statements, wtf)
you cannot return arrays from functions, only a stream of text (the power and the Achilles' heel) (this one issue compounds many of the others, by preventing workaround functions from being written)
arrays cannot be passed down to functions as distinct items alongside other arguments (you can use references as a way around, but how many bash scripters know those?) (easier to use global variables right? yuck)
variables are global by default. unless you make your iteration counter local, you stand to see some weeeeird bugs
string splitting is done around an inherent part of the string, not as a function operating on it (do we all know about IFS, does everybody know how to use it? didn't think so)
Is it really the shell you thought you were running? Ever deployed bash scripts only to find that the only interpreter on the machine is sh? Or the environment forces you into sh by default? Or that in fact you're not running bash but ash? Or maybe the system default is dash. Anyhow, you have to write everything now in plain sh and lose any improvements that bash ever brought that make the task more bearable.
Inconsistent environments for common commands are rife. Your script uses the "mail" function? GNU or BSD, which options to use? You use netcat? Which variant, which options? You use tar, grep, rsync? You using GNU, BSD or Busybox implementations? (these variations happen endlessly when mixing Ubuntu, CentOS and Alpine deployments, and that's just the surface)
(I scoff at any pushback of "ensuring the right version of Python on the company systems")
attempting anything remotely event-driven yields a nasty pile of workarounds (I've tried, with muted success) Granted, this is a space which shell is definitely not designed for, but that's to say how far I tried to do everything in shell at one point. It's possible, but it's damn hard work where another language would have been better.
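
For contrast, here is a minimal Python sketch of how a few of the points above (undefined variables, returning and passing lists, variable scope) behave in a less idiosyncratic language. The function names `first_words` and `count_items` are made up for illustration; nothing here comes from a real script.

```python
def first_words(lines):
    """Return a real list: the caller gets distinct items, not a text stream."""
    return [line.split()[0] for line in lines if line.strip()]


def count_items(label, items, limit):
    """A list travels alongside other arguments without being flattened."""
    i = 0  # local by default: does not touch the caller's `i`
    for _ in items[:limit]:
        i += 1
    return f"{label}: {i}"


i = 99  # module-level counter that the function above must not clobber
words = first_words(["alpha beta", "  ", "gamma delta"])
summary = count_items("seen", words, 5)

# Using a name that was never assigned fails loudly instead of expanding to "".
try:
    print(undefined_name)
    unset_result = "silently continued"
except NameError:
    unset_result = "NameError"
```

None of this is impossible in bash, but each point needs a technique you have to know about first, which is the whole problem.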

Most fundamentally, the view of shell as "not a proper language" hampers any impetus at large to learn it correctly and extensively, and understand its own idiosyncrasies. At least with one of the other languages, developers have an inherited mindset that their skill in that language needs continual improvement, and will work towards this.

I still write tons of bash scripting. I love it. But recommending other people use another language is much more sane. Personally, I chose Python too. But in the end, I wouldn't recommend it unless you are going to do your darndest to learn. It. Properly.

Top comments (68)

Piotr Gaczkowski • Sep 7 '20

I saw once a Go library designed to mimic a UNIX shell and some of UNIX utilities. The idea was to use Go for what would traditionally be shell scripts.

The main two strengths of a UNIX shell are effortless pipes (try that with Python!) and external command execution as a first-class citizen.

xtofl • Sep 7 '20

Most modern languages allow function composition (pipes between functions). And they have extensive, stable repositories of libraries.

This boils down to shipping the right packages with the distribution - something Linux distros already have, of course.

Cliff • Sep 10 '20 • Edited

Python's support for pipes is reasonable and not that hard to use, but it's painful compared to actually using pipes directly in bash/tcsh/etc. Even using FIFOs on the command line feels more natural than the way you have to compose them in Python. The Unix way of composing operations is probably the best implementation of functional/stream programming ever.

And I am a Python lover, so don't think I'm knocking Python!
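
To make the comparison concrete, here is a minimal sketch of what a two-stage pipeline (`producer | consumer` in a shell) looks like with Python's standard `subprocess` module. The helper name `pipe_through` is an assumption for illustration, and child Python processes stand in for real tools so the sketch does not depend on which grep or cat is installed.

```python
import subprocess
import sys


def pipe_through(producer_cmd, consumer_cmd):
    """Run the equivalent of `producer_cmd | consumer_cmd`; return the consumer's stdout."""
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(
        consumer_cmd, stdin=producer.stdout, stdout=subprocess.PIPE, text=True
    )
    # Close our copy of the pipe so the producer notices if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output


# Stand-ins for real commands: one emits lines, the other keeps the http:// ones.
emit = [sys.executable, "-c",
        "print('http://a'); print('ftp://b'); print('http://c')"]
keep_http = [sys.executable, "-c",
             "import sys\n"
             "for line in sys.stdin:\n"
             "    if 'http://' in line: sys.stdout.write(line)"]

result = pipe_through(emit, keep_http)
```

That is a dozen lines of plumbing for what the shell spells with a single `|`.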

xtofl • Sep 10 '20

Absolutely right if you're going in and out of Python to create a pipeline of full fledged processes (I suppose you refer to the subprocess module and alike).

What I meant is: you can stay in the environment that supports 'tacit/point-free programming', and make sure you have everything you need:

# 'function' composition is a dash
# functions are processes
tac logs.txt | grep "http://" | xargs wget

Could be just as easy:

# function composition not built-in
# functions are native
compose( read_file("logs.txt"), filter_lines("http://"), wget )

provided you have these functions lying around somewhere.

Granted: Python has these FP concepts built-in, but not as nicely as the unix way. There are better languages for that: Haskell, F#, erlang...

-- my haskell is rusty - but function composition is a dot:
-- functions are native; (.) composes right to left, so the first step goes last
pipeline = web_get . filter_lines "http://" . read_file
pipeline "logs.txt"

Interestingly enough, someone already thought up a Haskell shell: Turtle

Cliff • Sep 10 '20

Yes, I was talking about composing processes like you would at the shell.

I see what you're saying about point-free programming, though. I still think the Unix style is the cleanest, most natural implementation of point-free programming, and I think the fact that it is a genuine stream of processing is a big point in its camp. However, if your Haskell example is accurate, I like it. The examples the Wikipedia article give seem less intuitive and a lot more LISP-y.

I think most programmers would probably find the use of compose in Python a lot less intuitive than nested generator functions, and it's certainly an inelegant implementation of point-free programming. I also wonder if it can eliminate some of the advantages of the generators? It probably doesn't based on the sample implementation, but I'd have to think carefully about if applying partial like that would have unintended consequences, at least in some cases.

xtofl • Sep 10 '20 • Edited

I think so, too. You can make a nice 'fluent' DSL out of it, though.


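from functools import reduce

class Chain:
    """Collects functions; calling the chain applies them left to right."""
    def __init__(self, *fns):
        self.fns = fns
    def __or__(self, fn):
        # `chain | fn` returns a new Chain with fn appended
        return Chain(*self.fns, fn)
    def __call__(self, arg):
        # feed arg through each function in turn
        return reduce(lambda ret, f: f(ret), self.fns, arg)

# usage, given suitable read_file, create_filter and web_get functions:
# Chain() | read_file | create_filter("https://") | web_get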

def double(x):
        return 2*x

def fromstr(s):
        return int(s)

def inc(x):
        return x+1

def repeat(n):
        return lambda s: s * n

c = Chain() | fromstr | inc | double | str | repeat(3)

assert c("1") == "444"
assert c("20") == "424242"
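
A short annotated sketch of why this works, under the assumption that the class is used exactly as above. Each `|` calls `__or__`, which builds a new `Chain` rather than modifying the old one, and calling the chain is just a fold over the stored functions. The names `base`, `plus_one`, `doubled` and `call_with_loop` are made up for illustration.

```python
from functools import reduce


class Chain:
    def __init__(self, *fns):
        self.fns = fns

    def __or__(self, fn):
        return Chain(*self.fns, fn)

    def __call__(self, arg):
        return reduce(lambda ret, f: f(ret), self.fns, arg)


# Chains are never mutated, so one prefix can be extended in two directions.
base = Chain() | int
plus_one = base | (lambda x: x + 1)
doubled = base | (lambda x: 2 * x)


def call_with_loop(chain, arg):
    """The same thing __call__ does, written as an explicit loop."""
    value = arg
    for fn in chain.fns:
        value = fn(value)
    return value


# An empty chain has nothing to apply, so it hands its argument straight back.
identity = Chain()
```

One limitation worth noting: unlike a shell pipeline, nothing streams here. Each function receives the complete return value of the previous one, unless the functions themselves pass generators along.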

TaiKedz • Sep 10 '20

..... ingenious.

I'm still trying to brain this, its possibilities and its limitations but... wow.

A bit of commentary would be very welcome :-)

xtofl • Sep 10 '20

I'll expand it in a full fledged post :) Or 'leave it as an exercise'?

Cliff • Sep 10 '20

I either love this or hate it, I can't decide. Bravo, sir!

xtofl • Sep 16 '20

Somehow, I have hit the 'publish' button on dev.to/xtofl/i-want-my-bash-pipe-34i2.

Cliff • Sep 16 '20

I've only skimmed the article so far, but it looks like a good one. I like the title! 😁

TaiKedz • Dec 9 '20 • Edited

@xtofl , @Cliff , I have finally gotten round to this, I think you will be gleefully dismayed.

dev.to/taikedz/shellpipe-shellpipe...

TaiKedz • Sep 7 '20

Yeah, doing pipes is a nightmare in anything other than shells, and the reason why I still love using shell scripts - chaining tools.

But there's a lot of logic that can often be sub-moduled out to other languages to make a cohesive whole. Some systems people don't like the idea of a program not being self-contained in a single file and you end up with 2000+ lines of sub-optimal code at best, more often than not downright horrendous though... take a peek at the install scripts of some of your favorite software to see what I mean...

Shawn McElroy • Sep 8 '20

In regards to having to choose between 2 versions of Python, no, not anymore. 2.7 is end of life and at minimum everyone should be using 3.6+. But you should be on the latest stable, currently 3.8.

TaiKedz • Sep 8 '20

Yeah, but legacy scripts are totally a thing. Some outfits don't even know you can have both installed side-by-side so you can do a progressive script migration. So they did no migration. And they mandate all Python2 even now. It's not pretty. Doing my best on my side to further educate my colleagues, but that's just my corner...

Shawn McElroy • Sep 8 '20

Yea legacy is a different thing I agree. But we should all be encouraging people to upgrade.

Andrew Pazikas • Sep 7 '20

As an Oracle Engineer for a large corporation, 95% of the automation I write uses ksh: there's no python3 in RHEL 7 where I work, and old legacy hosts don't even have bash, so ksh works everywhere. So much of the automation my team writes is all ksh for a whole host of tasks, and while that last 5% is a bit of Python, I can't see that dynamic changing anytime soon, as shell scripting is so embedded into the way the team and the business function.

TaiKedz • Sep 7 '20

Once a technology is adopted it's hard to unstick it... That being said in your case it sounds like everybody gets the training required to write clean ksh? So long as they've learnt ksh fully and not "just the easy bits," then my argument remains the same: so long as there is impetus and requirement to learn the language properly, there's no (or much less of a) problem!

Louis Low • Sep 7 '20 • Edited

Bash is real programming. I have written tons of modular, complex logic and functions in just Bash, running on top of many critical backbone servers. Bash is a beautiful language to me. Once you come to the Linux environment on server or IoT platforms, Bash is a mandatory language. If I need my program to do floating-point maths, I'll use C or C++ to create a tiny efficient engine with added APIs for the Bash script to sit on top of. You know... Bash loves to piggyback on anything. It's cute.

TaiKedz • Sep 7 '20

Yep it's real programming alright. That said, I've seen many cases (and been guilty of a few) where the logic would have been better moved to other languages, in their own succinct modules, and used bash to tie the pieces together. It all depends on the use case of course :-)

csgeek • Sep 8 '20 • Edited

I usually start with a bash script. Because it's just easy. Then I need to do something too complicated like concatenate a string or use arrays, hashmaps etc and I end up rewriting the entire thing in python.

I'm sure there are many people that are amazing at writing bash scripts but it's never been very readable. Once you get just a bit complicated languages like PERL are starting to look pretty by comparison.

Also my biggest pet peeve: bash, sed and awk are not compatible between Linux, Mac and likely other variants.

TaiKedz • Sep 8 '20 • Edited

I'm sure there are many people that are amazing at writing bash scripts

No there are only a handful :-D (I'm only half kidding)

Also my biggest pet peeve: bash, sed and awk are not compatible between Linux, Mac and likely bad variants.

My biggest gripe with shell scripting is the tooling dependency.

I used to quip "in bash, ANY language is your library!". The obverse of course is true: any tool can subsume any other in any given environment, and you won't know until your script crashes.

macOS uses the BSD Utils by default, as do the BSDs in general ; Fedora uses GNU Coreutils, except when they use a BSD adaptation ; Ubuntu is GNU Coreutils most of the time; Alpine uses BusyBox (a great tool in and of itself, but a thorn for cross-platform shell scripting) ; ...

csgeek • Sep 9 '20

it's just limited. It makes it harder to write tests and follow most coding patterns.

Granted there are tools like this: github.com/sstephenson/bats but not sure if anyone uses them. Also.. Libraries!! How many times do we need to re-write the same fix to the same problem.

macOS uses the BSD Utils by default, as do the BSDs in general ; Fedora uses GNU Coreutils,
except when they use a BSD adaptation ; Ubuntu is GNU Coreutils most of the time; Alpine uses
BusyBox (a great tool in and of itself, but a thorn for cross-platform shell scripting) ; ...

So.. you're saying that it's a very repeatable and consistent ecosystem? O.o

Yeah that's part of the issue. It's easy, it's everywhere just run bash foobar.sh, except when it doesn't work and you have to write 7 versions to support all the various edge cases.

It's not as easy to write, but I'm really liking Go. It's way more complicated and verbose than bash, but at the end of the day I end up with 1 file to copy around.

TaiKedz • Sep 9 '20

Yeah libraries... is why I started my bash-builder project and its sibling bash-libs. Build the script and have... a single file to copy around ;-)

The backbone of most of my bash scripting nowadays...

John Robertson • Sep 10 '20

"you cannot return arrays from functions, only a stream of text"

Never tell an engineer they can't do something. Please see:
dev.to/jrbrtsn/returning-an-array-...

TaiKedz • Sep 10 '20

That's not "returning" an array, but passing a reference to be manipulated in-place :-)

Still, a useful way of passing arrays down/up. You just.... need to know the technique... forget about recursive functions though. The name in the receiving function must be different from the one in the caller function.

John Robertson • Sep 10 '20 • Edited

Passing a result buffer by reference into a function is exactly what C++ does under the hood to present the illusion that functions may return something too large to fit in a CPU register. Recursive functions in Bash are a bit trickier, but I'll post a solution to that later today ;-)

TaiKedz • Sep 10 '20

I dunno, last I heard (and I am not versed in C so I could be talking complete nonsense), C functions return memory addresses... and of course, it is up to the programmer to know what any one function is passing back, be it an actual int, or a reference to a data structure of any kind...

I look forward to whatever solution you have to enable getting around the name clash during recursion with reference variables :-)

John Robertson • Sep 10 '20

Tai,
C functions can only "return" something which can fit in a CPU register, because that is an efficient place to stash information which can be retrieved after the function "returns" - meaning the content of the CPU register gets copied into a stack variable or some other place useful to the programmer. C++ presents the illusion of returning something larger than a CPU register by silently creating space on the stack (a return buffer, if you will) before the function gets called, and then silently passing a reference to this return buffer into the function where it will get populated. Semantic shenanigans if you ask me.

xtofl • Sep 10 '20 • Edited

That is wrong indeed.

C functions return typed values. Indeed once the structures grow bigger, programmers tend to revert to output arguments, which are often pointers to structures. Also, for lack of exceptions in C, functions often return error codes, and accept 'output' arguments as well.

In C++, types are much more used to advance the correctness of the program. Exceptions, variants, ... all make this a lot easier.

But... you can pass mere void* around. Severe pain should be your punishment - though there are legitimate cases for it.

John Robertson • Sep 10 '20

xtofl - you may wish to review the assembly code produced by a function call in C before declaring that I have made an error ;-)

xtofl • Sep 10 '20

I'm sure the assembler does emit low level stuff like that. The semantics of the language are copying a value out of the function, though.

John Robertson • Sep 10 '20 • Edited

In C the semantics of what a function can return are limited to something which will fit in a CPU register because that is what the term return means. The C compiler provides an additional service of promoting or truncating this returned value to match a simple return type. You can't successfully return any arbitrary local-to-the-called-function stack based struct from a C function, because the content of this struct is undefined as soon as the function returns. You can pass the address of any arbitrary struct into a C function, where it may then get populated.
In other words, for C there is a bright line of distinction found in what may be returned, while in C++ this line is syntactically blurred with some compiler tricks.

xtofl • Sep 10 '20

I am very confused now. It's off topic here, but what does 6.9.1/3 of n1256 mean, then?

The return type of a function shall be void or an object type other than array type.

And then there is paragraph 12...

12 If the } that terminates a function is reached, and the value of the function call is used by
the caller, the behavior is undefined.

Maybe I should stick to C++ then.

John Robertson • Sep 10 '20

If you don't care about the assembly generated by your compiler, sticking with C++ is a good choice. That said, I picked up C 32 years ago, and it remains one of the most popular programming languages. I am roughly 5x as productive in C as C++, and I've been programming in C++ for 27 years. It's getting worse with each subsequent C++ standard.

xtofl • Sep 10 '20

Not debating language superiority.

But I'm truly dazzled you would not be allowed to return a perfectly valid object. Let the compiler inject the correct assembly to accommodate it.

What about this one stackoverflow.com/a/9653083/6610? Wrong, too?

John Robertson • Sep 10 '20

The stackoverflow example returns a value which is the size of an int. That fits just dandy into a CPU register. One of the focuses of the C++ standards committee has been trying to squeeze out all the superfluous copying which has been implemented to support clear semantics - so there's your reason.

xtofl • Sep 10 '20 • Edited

Sorry, I keep finding counterarguments to your statement, unless I misunderstood.

This article clarifies some things, too: uninformativ.de/blog/postings/2020....

Could you be talking about 'early versions' of C?

I'll try compiler explorer tomorrow - now's nap time here.

John Robertson • Sep 10 '20 • Edited

From the Wikipedia page:

In regard to how to return values, some compilers return simple data structures with a length of
2 registers or less in the register pair EAX:EDX, and larger structures and class objects requiring
special treatment by the exception handler (e.g., a defined constructor, destructor, or assignment)
are returned in memory. To pass "in memory", the caller allocates memory and passes a pointer to it
as a hidden first parameter; the callee populates the memory and returns the pointer, popping the
hidden pointer when returning.[2]

There's your compiler trick of silently allocating stack space, and then passing it by reference to the function to get populated. After the function returns, C++ compilers historically have then copied the result from the invisible-to-the-programmer return buffer into something the programmer knows about. For large objects this is a significant performance penalty to pay for some syntactic sugar.

xtofl • Sep 11 '20 • Edited

Can you put a date on that? "Historically", that must mean pre-2003: since then, every major C++ compiler does copy elision. C++ has wildly drifted away from what C used to be, to follow its credo "don't pay for what you don't use".

But thanks. It is indeed very interesting to see how you're drawing totally opposite conclusions depending on your viewpoint. Here's a (reasonably) honest analysis of the schism between C and C++, by someone in the standardization world: cor3ntin.github.io/posts/c/index.html

hardware-focused: a C compiler has to perform assembly 'tricks' to accommodate struct returning
problem-focused: returning structs is the most trivial thing a compiler should allow

I conclude that

the C standard totally allows returning structs
it leaves the semantics to compiler builders ('undefined')

Btw. with godbolt.org/z/a4exrM, you can compare the assembly generated by a large number of compilers.

John Robertson • Sep 11 '20 • Edited

I'll concede that my understanding of C++ compilers is dated - in 2003 I had already been coding in C++ for 10 years.
The crux of this disagreement is the definition of the word "return". In my opinion, creating the semantics of returning an object by silently passing in the reference to a possibly significantly sized return buffer of which the programmer may or may not be aware is not "returning" an object at all. Passing return buffers to a function by reference has been possible and clear since 1972, but I still cannot write in C or C++:

BigClass1, BigClass2, BigClass3 funcname(int arg1, double arg2)
{
   BigClass1 rtn1;
   BigClass2 rtn2;
   BigClass3 rtn3;
 // do stuff with arg1, arg2. Populate rtn1, rtn2, rtn3
  return rtn1, rtn2, rtn3;
}

However, I have always been able to write:

int funcname (BigClass1 *rtn1, BigClass2 *rtn2, BigClass3 *rtn3, int arg1, double arg2)
{
   // do stuff with arg1, arg2. Populate *rtn1, *rtn2, *rtn3
   return SOME_ERROR_CODE;
}

So, is the second example "returning" 3 objects? If not, then why?
By implementing the semantic charade of returning a single arbitrary object, modern compilers have accommodated a single case where it appears in source code that a function can return something which will not fit in CPU registers:

HugeSizeClass funcname (int arg1, double arg2)
{
   HugeSizeClass rtn;
   // do stuff with arg1, arg2, populate rtn
   // Throw exception on error
   return rtn;
}

So, where (stack|heap|static data segment) does rtn exist? How do the contents of rtn make it back to the caller? How is this not superfluous copying?
I contend that the following code is much clearer and guaranteed to be at least as efficient:

HugeSizeClass* funcname (HugeSizeClass *rtn, int arg1, double arg2)
{
   // do stuff with arg1, arg2, populate *rtn
   // Return NULL or throw exception on error
   return rtn;
}

xtofl • Sep 13 '20

"Returning" is indeed an abstract concept. Each platform makes it concrete in its own way. To the CPU, there's no such thing as 'returning'. I don't think opinions should matter here; it's always a choice.

To me, returning a tuple is far more readable than mixing input and output arguments of a function. Modern languages accommodate this, and together with 'destructuring', it leads to code that most developers can readily understand.

#python
def explode_url(url):
  ...
  return prot, dom, path

protocol, domain, path = explode_url("https://x.y.z.com/a/b/c")

// C++
auto explode_url(const string_view& url) {
  ...
  return make_tuple(prot, dom, path);
}
auto [protocol, domain, path] = explode_url("https://x.y.z.com/a/b/c");

As an application programmer (less so as a driver implementer or kernel hacker), I value this expressiveness in a language. Any compiler that does not know this 'magic' forces me to work around it.

Hey, as an aside - cor3ntin wrote a nice overview of what divides C and C++ worlds: cor3ntin.github.io/posts/c/. He shouldn't have called it 'The Problem', but I like his analysis, which goes way beyond the technical.

John Robertson • Sep 13 '20 • Edited

I had already read the article, thanks for sharing. "Expressiveness" is an entirely subjective term which merits little discussion. In my opinion, any pointer passed in which is not marked const refers to a return buffer. If you study libc's prototypes, you can see the established convention of passing in the address of the return buffer(s) first.
Back in 1993 I would have asserted confidently that C++ will overtake C in a decade or so. Live and learn.

xtofl • Nov 12 '20

I have noticed that in most circles I have communicated with (I started around 2000), abstractions are welcomed (within limits), while in the embedded domain, the feel for the hardware is appreciated more. There seems to be a scale ranging from bare metal to functional programming.

It would be an interesting exercise to see which concepts make code more expressive for you, and which for me, and if there are certain 'clusters' of developers that benefit from the same kind of style. Probably there are some studies about it already - I'll have to look around.

Lex • Sep 7 '20

I support you. I love bash scripting; it is my favorite language, and it is very powerful. When I was starting out with bash, my big ego led me to make my scripts complex, you know, to prove that I knew how to code in bash. Right now I always try to keep my code very, very simple, so anyone can understand it easily. I like one-liners a lot, using the [[ ]] test and && for flow control on one line, but that makes my code harder to read, so the best thing is to use indentation 😁 and write your code in the simplest way possible 😁👍✌️

Eljay-Adobe • Sep 7 '20

I use bash day in and day out. It's my favorite shell. I've been using it for a long time.

But... whenever I need to do something for production, I reach for Python 3.x. I use git for source control. I have my code peer reviewed.

I've converted some other people's bash scripts into Python. I've converted some other people's Perl scripts into Python. I've converted some other people's JavaScript on Node.js into Python.

Because I love Python? (Well, I am fond of it, true.) No, because the other code was hard to understand and hard to maintain and hadn't been code reviewed and wasn't abiding by the approved scripting engines. (Node.js is actually approved, but the other points hold.)

The "hard" part isn't a shortcoming of the language (after all, these aren't PHP), it's a shortcoming of the programmer making a tasty pot of spaghetti code -- and spaghetti code can be written in any language.

TaiKedz • Sep 8 '20

spaghetti code can be written in any language.

Yes, but some languages (and their baggage) can be more conducive to spaghetti ;-)

If you look for examples of good Python, Java, JavaScript, C, Golang, etc, you can find them, and there are LOTS of people trying to demonstrate how to do it properly. Examples are everywhere.

Shell (and to a lesser extent Perl) is plentiful in the wild - and it is mostly the awful stuff that is most readily available. (this was my point about peer expectations to improve skills in some languages but not others).

If you did the conversions in a company, and everyone else can write clean Python then great :) Although perhaps getting people to write clean lang-x in the first place would have been just as productive. I speak from experience when I say trying to make people learn and write clean shell is a pain.

Eljay-Adobe • Sep 8 '20

That is a valid counterpoint, and I concur. Good code can be written in any language, except PHP of course. (I would have said PHP or Perl, but I've actually seen good code in Perl, so I know it is actually possible.)

Cliff • Sep 10 '20

I really can't say I find much to disagree with here.

If I can think of how to do a task 80-90% of the way on the command line without much head-scratching, I almost immediately reach for Bash. If it's a clearly OS- or file-system-centric or especially stream-oriented task, same. But as soon as it seems like there's a lot of in-place mutation needed and certainly if I feel like an array/list/hashmap will be necessary -- arrays are such a pain in Bash -- then I reach for Python. If it seems like it's going to be complex with optional arguments or different types of runs, straight to Python. As soon as what was a straight-forward Bash script starts getting requests to do more stuff, rewrite it in Python.

I once wrote a 650+ line Bash script -- including the generous comments and formatting -- to generate reports to streamline security auditing on a network of user systems and servers, because one of the sys admins didn't know Python and said it would be easier if they ever needed to modify it. It was solid, effective, fast, and even configurable with the ability to add new checks using regular expressions. But that was the point at which I said "never again" for something like that. The last I heard, it's still in use a decade later and has hardly been touched except for a little tweak here or there due to some network changes. They actually use it to vet any new tool they bring in. So I'm pretty proud of that, even if I'd never write such a monstrosity now.

TaiKedz • Sep 10 '20

I know the pain of such pride :-)

I did a couple projects where a fair bit from one was being re-used in the other (generic stuff, like output control, string validations etc) and I started putting bits into individual files etc and eventually ended up putting together a build tool that allows using scripts from a "library" and writing #%include lines in the main script. There's some extra management in there to prevent double-including files (two dependencies with a shared third dependency) etc. Everything gets built into a single script that can then be passed around/deployed.
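
As a rough illustration of that include-once idea (not how bash-builder is actually implemented; this is a sketch under stated assumptions), the resolution step could look like the following. The `#%include` marker comes from the description above, but its exact syntax here, the `build` function and the in-memory `library` dict standing in for files on disk are all assumptions.

```python
def build(name, library, seen=None):
    """Return the lines of script `name`, with each '#%include a b ...' line
    replaced by the recursively built contents of a, b, ..., each at most once."""
    if seen is None:
        seen = set()
    seen.add(name)  # marked before recursing, so circular includes terminate
    output = []
    for line in library[name].splitlines():
        if line.startswith("#%include "):
            for dep in line.split()[1:]:
                if dep not in seen:  # skip anything already pulled in
                    output.extend(build(dep, library, seen))
        else:
            output.append(line)
    return output


# Hypothetical library: two dependencies that share a third one.
library = {
    "main.sh": "#%include out.sh validate.sh\nsay starting",
    "out.sh": "#%include colours.sh\nsay() { echo \"$@\"; }",
    "validate.sh": "#%include colours.sh\nis_int() { [[ $1 =~ ^[0-9]+$ ]]; }",
    "colours.sh": "readonly RED=1",
}

script = build("main.sh", library)
```

Here `colours.sh` is requested twice but lands in the built script only once, ahead of the first file that needs it.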

Some of my scripts are now beyond the 1000+ line but that's only because they pull in a few "external" libs :-)

And yeah, your heuristic for transition point sounds eminently sensible :)

Cliff • Sep 10 '20 • Edited

Ha! Bash can definitely give you a bit of a Frankenstein complex, simultaneously proud and horrified at what you've created.

(Not to be confused with Asimov's definition of "Frankenstein complex".)

Sheldon • Sep 9 '20

This is where PowerShell shines. Cross platform, shell capable, rich objects and ecosystem. Pester test framework supports tested robust modules. If dotnet is in your environment it works great and the pipeline is powerful.
I was surprised at how much of a transition I had leaving Windows behind and moving to macOS and Docker-based workspaces. Most of my stuff runs smoothly on all 3 systems.

TaiKedz • Sep 10 '20

I never gave PowerShell a go on anything other than Windows... I found it there to be excessively verbose for a command language, but I can see where that might in fact improve its use as a scripting language.

The availability of object output from tools is certainly a step up from parsing text streams (especially those that are designed for on-screen display, instead of automations), but then that puts the onus on the tool writer to write for that compatibility. I'd more readily have a JSON parser in lang-x and use it to extract from the outputs from other tools, and standardise other tools around JSON (seems to be the most common thing to handle these days). This could even unify web-API-based programming and shell programming nicely...
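
A minimal sketch of that idea in Python, assuming a tool that can emit JSON on stdout. The tool here is a stand-in child process and the field names (`name`, `up`) are invented; a real tool's schema would differ.

```python
import json
import subprocess
import sys

# Stand-in for any command with a JSON output mode (hypothetical data).
tool = [sys.executable, "-c",
        "import json; print(json.dumps(["
        "{'name': 'eth0', 'up': True}, {'name': 'wlan0', 'up': False}]))"]

# check=True raises if the tool exits non-zero, instead of parsing garbage.
completed = subprocess.run(tool, capture_output=True, text=True, check=True)

# Parse structured data instead of scraping columns meant for a screen.
interfaces = json.loads(completed.stdout)
up_names = [iface["name"] for iface in interfaces if iface["up"]]
```

The same few lines would work unchanged on the body of a web API response, which is the unification hinted at above.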

Question remains though - is most of its functionality derived from built-in functions, or calls to commands (actual executables) in the local system? Not sure, but as I understand you, you are running the same PS in Windows and non-Windows environments?

Also what's the average quality of examples out in the wild to learn from? From Python / JS / C etc there are plenty of articles, blogs, tutorials and the likes where "clean code" is insisted on, and professional environments may insist on it.

Shell scripts - including PowerShell so far as I have seen - are rarely seen as targets for the same zeal of cleanliness and suffer as a result....
