Basics of Semigroup in Haskell

aibhstin profile image Aibhstin ・3 min read

Fundamental to learning Haskell is learning the algebras that make Haskell as powerful as it is. The ‘Semigroup’ typeclass in Haskell is an abstraction that serves to provide the semigroup algebraic structure for types that implement it.

Any type ‘a’ can implement the Semigroup typeclass, providing that the type has an associative binary operation of the type:

a -> a -> b

The binary function needs only to be associative, not commutative. If the function is both associative and commutative then the semigroup can be called an abelian semigroup.

In Haskell, the Semigroup typeclass is (minimally) defined like this[1]:

class Semigroup a where
  (<>) :: a -> a -> a

In order to have a valid instance of Semigroup defined for a type, the only function that must be defined is (<>). This function serves as the associative binary operation mentioned before, that takes two values of type ‘a’ and returns a single value of type ‘a’.

GHCi> [1, 2, 3] <> [4, 5, 6]
GHCi> "Hello " <> "World"
"Hello World"

The binary operation (<>) performs on lists works as expected, by concatenating the two lists together. You can see the associative properties here:

GHCi> [1, 2, 3] <> ([4, 5, 6] <> [7, 8, 9])
GHCi> ([1, 2, 3] <> [4, 5, 6]) <> [7, 8, 9]

So the semigroup operations works as expected with lists. What about numbers? Numbers also form a semigroup (Actually, an abelian semigroup.) under the addition operation. So 5 <> 5 should yield 10? Let’s see:

GHCi> 5 <> 5

<interactive>:23:1: error:
    • Ambiguous type variable ‘a0’ arising from a use of ‘print’
      prevents the constraint ‘(Show a0)’ from being solved.
      Probable fix: use a type annotation to specify what ‘a0’ should be.
      These potential instances exist:
        instance (Show a, Show b) => Show (Either a b)
          -- Defined in ‘Data.Either’
        instance Show Ordering -- Defined in ‘GHC.Show’
        instance Show Integer -- Defined in ‘GHC.Show’
        ...plus 23 others
        ...plus 89 instances involving out-of-scope types
        (use -fprint-potential-instances to see them all)
    • In a stmt of an interactive GHCi command: print it

Instead of the expected 10, GHCi throws an error. Why?

The reason, as GHCi vaguely implies, is because of ambiguity. Whilst integers do form a semigroup under the additive operator, they also form a semigroup under the multiplicative operator. The ambiguity is that GHCi doesn’t know which one to use.

data Sum a =
  Sum a
  deriving (Eq, Ord Show)

First, we define a new datatype called ‘Sum’.

instance Num a => Semigroup (Sum a) where
  (<>) (Sum x) (Sum y) =
    Sum (x + y)

By creating a new datatype to wrap around the numerical part, we can define a semigroup specifically for this new datatype to implement only one of the possible semigroup operations. We can do the same for multiplication too:

data Product a =
  Product a
  deriving (Eq, Ord, Show)

instance Num a => Semigroup (Product a) where
  (<>) (Product x) (Product y) =
    Product (x * y)

Now, the semigroup operation works as expected through these new datatypes:

GHCi> Sum 1 <> Sum 3
Sum 4
GHCi> Product 3 <> Product 4
Product 12

Now, the only issue we face is extracting the numerical part from this datatype in order to use it as normal. To do this, we can make one small change to the datatypes we defined:

data Sum a =
  Sum {
        getSum :: a
  deriving (Eq, Ord, Show)

data Product a =
  Product {
            getProduct :: a
  deriving (Eq, Ord, Show)

We’ve simply turned the datatypes into records, which makes extracting the number trivial:

GHCi> Sum 1 <> Sum 3
Sum {getSum = 4}
GHCi> getSum $ Sum 1 <> Sum 3
GHCi> Product 2 <> Product 10
Product {getProduct = 20}
GHCi> getProduct $ Product 2 <> Product 10

Key Takeaways

  • Semigroup is an algebraic structure that ties a type together with an associative, binary operation.
  • An abelian semigroup is a semigroup whose associative binary operation is also commutative.
  • In Haskell, this algebra is abstracted out through the ‘Semigroup’ typeclass
  • An instance of the Semigroup typeclass can be defined for any type as long as the semigroup operation (<>) :: a -> a -> a is defined.
  • The definition of datatypes to wrap around other types is necessary to make use of semigroup for certain types that have multiple valid semigroup operations.

And that’s it. The Semigroup typeclass is the simplest and most intuitive in Haskell, but is necessary both in the sense that knowledge of it is important to the understanding of more advanced algebras, and that many of the more advanced typeclasses derive from Semigroup or one of its other derivations.

Posted on by:

aibhstin profile



I'm an Ethical Hacking & Cybersecurity student and a Haskell programmer.


markdown guide

Any type ‘a’ can implement the Semigroup typeclass, providing that the type has an associative binary operation of the type: a -> a -> b

Don't you mean a -> a -> a?