Here is my first effort to explain, in basic terms, the concept of a “module”. I tried to make it accessible to those not studying pure math, while still remaining interesting to those who are. I have no idea how well this can actually work in practice (or if it even did work); hopefully people will just skip over any terminology they don’t understand instead of giving up on the whole article. Better yet, ask me questions in the comments!
In linear algebra courses, we learn about vector spaces: these are algebraic structures where you can add two vectors ($v + w$), or scale a vector by some amount ($\lambda v$). In applied treatments, these “scalars” are usually tacitly assumed to come from the real or complex numbers ($\mathbb{R}$ or $\mathbb{C}$); it is seldom mentioned that the “scalar multiplication” of a vector space is really an action of a field $F$ on an abelian group $V$.
If you’re puzzled by this last phrase, don’t worry. The word action merely indicates a rule for assigning to each element of $F$ a transformation (indeed, a group homomorphism) of $V$, in a particularly nice way: to each $\lambda \in F$ we associate the “multiplication by $\lambda$” map $\mu_\lambda : V \to V$ given by the rule
$$\mu_\lambda(v) = \lambda v.$$
To indicate the field of scalars explicitly, we might call $V$ an $F$-vector space or a vector space over $F$. To summarize, an $F$-vector space is a set $V$, equipped with a rule for adding two elements of $V$, as well as a rule for scaling elements of $V$ “by” elements of $F$.
A ring is, loosely speaking, a structure in which we can add and multiply elements, which satisfies most of the usual arithmetic laws. Fields are a very special kind of ring: the multiplication of a field is commutative ($ab = ba$), and every nonzero element $a$ has a reciprocal $a^{-1}$. Since fields have so many wonderful properties, they are much more “rigid”: on an elementary level, their nature is far less complicated than that of rings in general (don’t get me wrong, there’s still a lot we don’t know about fields). However, there are rings we deal with every day which (for simple reasons, e.g. the lack of reciprocals) are not fields, for example, the ring of integers:
$$\mathbb{Z} = \{\ldots, -2, -1, 0, 1, 2, \ldots\}.$$
Something I’ve been meaning to learn more about for a while is “vector spaces where the scalars are allowed to come from any ring $R$ at all, not necessarily a field”. In mathematics we call these $R$-modules, or modules over $R$ (or just modules when the ring of scalars is clear). Hence, a vector space is just a module over a field. Modules are immensely useful in the study of rings themselves, and most people glimpse them for the first time in a course on commutative algebra (perhaps when they begin as a grad student). Unfortunately, it will be another 8 months before I have a chance to take an actual course in commutative algebra (PMATH 446), and I’m too impatient for that.
One thing we notice about vector spaces is that their structure theory is trivial; it’s about as nice as it could possibly be. Namely, $F$-vector spaces are in some sense “completely determined” by their dimension. You’re probably familiar (at least in the case where the dimension $n$ is finite) with the fact that, by choosing a basis, any such space can be viewed simply as $F^n$.
Intriguingly, when we move from the setting of vector spaces to the broader world of modules, the “more complicated” nature of general rings (compared to fields) mangles the situation significantly. A lot of our linear algebra, which we were able to develop with elementary methods, is vehemently defenestrated. In particular, it is no longer even true that a basis always exists for a module (in fact this is a pretty rare situation, and modules which do have a basis are called free). This means our much-applauded concept of dimension doesn’t, in general, even make sense for modules. Nor is the “completely decomposable” nature of vector spaces shared by modules: it’s actually possible to construct huge modules $M$ which don’t even have a single “submodule” (other than the obvious ones, $0$ and $M$).
For our very first example of some of the ideas modules generalize, let’s talk about abelian groups: these are simply sets equipped with an associative, commutative binary operation $+$, an identity element $0$, and inverses. Given any abelian group $A$, we can define a rule for “scaling” elements of $A$ by integers, that is, elements of $\mathbb{Z}$. Namely, for $n \geq 0$ and $a \in A$ we define $n a = a + a + \cdots + a$ ($n$ times, so that $0a = 0$), simply using the group operation $+$, and otherwise if $n < 0$ we define $n a = -((-n)a)$ (that is, with recourse to the $n \geq 0$ case). This is a perfectly natural way of turning any abelian group into a $\mathbb{Z}$-module. On the other hand, it is obvious that if you start with a $\mathbb{Z}$-module you can just forget about the scalar multiplication altogether, and you’re left with an abelian group. So $\mathbb{Z}$-modules are the same thing as abelian groups.
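To make this scaling rule concrete, here is a minimal Python sketch. It assumes the abelian group is handed to us as three ingredients: an addition function, an identity element, and a negation function. The names `z_scale`, `add`, `zero`, and `neg` are made up for illustration and don't come from any library, and the integers modulo 12 are just a convenient example group.

```python
def z_scale(n, a, add, zero, neg):
    """Scale the group element a by the integer n, using only the group structure."""
    if n < 0:
        # Negative scalars fall back on the non-negative case: n*a = -((-n)*a).
        return neg(z_scale(-n, a, add, zero, neg))
    total = zero  # the empty sum is the identity element
    for _ in range(n):
        total = add(total, a)
    return total


# Example group (an assumption for illustration): the integers modulo 12.
MOD = 12
add_mod = lambda x, y: (x + y) % MOD
neg_mod = lambda x: (-x) % MOD

print(z_scale(5, 7, add_mod, 0, neg_mod))   # 5 * 7 = 35, which is 11 mod 12
print(z_scale(-2, 7, add_mod, 0, neg_mod))  # -(2 * 7) = -14, which is 10 mod 12
```

Notice that `z_scale` never multiplies anything: the “scalar multiplication” is built entirely out of the group operation, which is the whole point.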
If you think back, you’ll probably recall that a large part of linear algebra had to do with linear operators (more concretely, matrices) and doing things with them, like finding their eigenvalues and eigenvectors, characteristic polynomials, determinants, traces, and so on. A lot of work was put into discussing when “diagonalisation” is possible, and how to achieve it. Since we’re not always lucky enough to be able to do this, you probably learned about canonical forms: the “next best thing” to diagonalisation where we usually try to get some kind of “block diagonal” form. Namely, Jordan canonical form, rational canonical form, and all that. So why should we even care about modules? Furthermore, why were linear operators (square matrices) so much subtler objects to deal with than vector spaces themselves?
The following cool idea provides what I believe is an epistemologically satisfactory answer. First, recall that the set of all polynomials with coefficients in $F$ forms a ring under the usual operations of addition and multiplication, known as the polynomial ring $F[x]$. Suppose $V$ is an $F$-vector space. I claim that a linear operator $T : V \to V$ is the same thing as an $F[x]$-module structure on $V$. Notice that $V$ is already an $F$-vector space, and any action must respect the ring structure of $F[x]$, so my previous claim reduces merely to saying that a linear operator $T$ is the same as a rule for scaling an element $v \in V$ by the element $x \in F[x]$. What is the obvious thing to do? Well, simply define $x \cdot v = T(v)$ for all $v \in V$, right? Then right away, this gives us a scalar multiplication of $F[x]$ on $V$: namely,
$$(a_0 + a_1 x + \cdots + a_n x^n) \cdot v = a_0 v + a_1 T(v) + \cdots + a_n T^n(v).$$
Here, of course, $T^k$ refers to the composition of $T$ with itself $k$ times, that is, the map $T \circ T \circ \cdots \circ T$. For this reason many authors will refer to this as an $F[T]$-module structure on $V$, since the indeterminate $x$ is literally acting as $T$. Okay, so every linear map $T : V \to V$ gives rise to an $F[x]$-module structure on $V$. What about the other way? That is, if we have some $F[x]$-module structure on $V$, can we get a linear map from it? We definitely can: define $T : V \to V$ by merely setting $T(v) = x \cdot v$, where $\cdot$ denotes the scalar multiplication of $V$ by $F[x]$, provided to us by the $F[x]$-module structure! Then, simply by the conditions we impose on how a “scalar multiplication rule” must behave, it follows that $T$ is linear.
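If you’d like to see a polynomial act on a vector, here is a small Python sketch. It is only an illustration under some assumptions of mine: the field is the rationals (via `fractions.Fraction`), the operator is stored as a matrix (a list of rows), and the helper names `apply_matrix` and `poly_act` are hypothetical.

```python
from fractions import Fraction


def apply_matrix(T, v):
    """Apply the square matrix T (a list of rows) to the vector v."""
    return [sum(t * x for t, x in zip(row, v)) for row in T]


def poly_act(coeffs, T, v):
    """Compute (a_0 + a_1 x + ... + a_n x^n) . v, where x acts on v as T.

    coeffs is the list [a_0, a_1, ..., a_n].
    """
    result = [Fraction(0)] * len(v)
    power = list(v)  # T^0(v) = v
    for a in coeffs:
        result = [r + a * p for r, p in zip(result, power)]
        power = apply_matrix(T, power)  # step from T^k(v) to T^(k+1)(v)
    return result


# Hypothetical example: T swaps the two coordinates of Q^2.
T = [[Fraction(0), Fraction(1)],
     [Fraction(1), Fraction(0)]]
v = [Fraction(1), Fraction(2)]

# (3 + 2x) . v = 3v + 2T(v) = 3(1, 2) + 2(2, 1) = (7, 8)
print(poly_act([Fraction(3), Fraction(2)], T, v))
```

Going the other way is just as short: the operator recovered from the module structure is the action of the polynomial $x$, that is, `poly_act([0, 1], T, v)`, which hands back `apply_matrix(T, v)`.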
If we think about it for a second, we realize that the submodules of the module obtained from the linear map $T$ are precisely the subspaces of $V$ which are invariant under $T$, namely, the subspaces $W \subseteq V$ such that $T(w) \in W$ for all $w \in W$, or stated another way, $T(W) \subseteq W$. When we diagonalise a matrix by finding a basis consisting of eigenvectors, what we’re effectively doing is understanding how the associated linear map’s domain is made up of a bunch of one-dimensional invariant subspaces (each spanned by an eigenvector). Since we know this is not possible in general, we deduce that these modules will not, in general, admit a decomposition into one-dimensional submodules. It’s interesting to think about how properties of the matrix, like its characteristic polynomial for example, are encoded in the algebraic properties of the resulting $F[x]$-module…
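Here is a tiny worked example of such a failure (the matrix is my own choice, picked because it is the smallest one that isn’t diagonalisable). Take $V = F^2$ with standard basis $e_1, e_2$, and let
$$T = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad T(e_1) = 0, \quad T(e_2) = e_1.$$
A one-dimensional invariant subspace is spanned by an eigenvector, and the only eigenvectors of $T$ are the nonzero multiples of $e_1$. So the submodules of the resulting module are exactly $0$, $\mathrm{span}(e_1)$, and $V$ itself, and $V$ cannot be written as a direct sum of two one-dimensional submodules.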
Canonical form theory for square matrices over a field $F$ falls out as an easy consequence of structure theory for certain kinds of modules (to be precise, the “finitely-generated modules over principal ideal domains”); the relevant principal ideal domain here is the polynomial ring $F[x]$. Recall that we previously mentioned the interchangeability of the concepts of “abelian group” and “$\mathbb{Z}$-module”. Since $\mathbb{Z}$ is one of the first examples of a “principal ideal domain”, the celebrated structure theorem for finitely generated abelian groups (and its special case for finite abelian groups) is also a special case of this theorem on modules! So, aside from the formality, one could almost argue that you were essentially doing some primitive, well-cloaked module theory in Linear Algebra 2.
To close off this quick initial glimpse into module theory, I will mention one more place modules crop up: a branch of mathematics called the representation theory of finite groups. Loosely speaking, a representation of a group is a way of viewing the group as some set of matrices acting on a vector space.
The Yoneda lemma from category theory tells us that contemplating how one algebraic structure acts on others can yield profound revelations about the object itself: for a concrete example, Cayley’s theorem in group theory says that every group “is” just a permutation group of some set, and this lies at the heart of why we study representations of groups, modules over rings, and so on.
Formally, a representation is a group homomorphism $\rho : G \to \mathrm{Aut}(V)$ where $\mathrm{Aut}(V)$ is the “automorphism group” of $V$, or in undoubtedly more friendly language, the set of invertible linear operators $V \to V$. It turns out that, in much the same way we whipped out a module over a polynomial ring in one variable to capture the “essence” of a linear operator on $V$, we can construct a pretty natural ring from the group $G$, known as the group ring or group algebra, denoted $F[G]$. Basically, you consider the set of all finite “formal sums” of elements in $G$ with coefficients from $F$ and define a multiplication on it by using the group operation of $G$. Then it turns out that representations of $G$ and $F[G]$-modules are (just like abelian groups and $\mathbb{Z}$-modules) completely interchangeable concepts. Of course, the analogy becomes a bit more complicated if you move to, say, the representation theory of topological (for example, Lie) groups, since then you need to introduce a kind of “analytic version” of the group algebra.
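The multiplication of formal sums is easier to see in action than to describe. Below is a minimal Python sketch under assumptions of my own choosing: the group is the cyclic group of order 3 (stored as the integers 0, 1, 2 under addition mod 3), the coefficients are rationals, and a formal sum is a dictionary from group elements to coefficients. The names `group_op` and `multiply` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

N = 3  # hypothetical example group: the cyclic group of order 3


def group_op(g, h):
    """The group operation: addition modulo N."""
    return (g + h) % N


def multiply(a, b):
    """Multiply two formal sums, each stored as a dict {group element: coefficient}."""
    product = defaultdict(Fraction)
    for g, coeff_g in a.items():
        for h, coeff_h in b.items():
            # (coeff_g * g)(coeff_h * h) = (coeff_g * coeff_h) * (g h)
            product[group_op(g, h)] += coeff_g * coeff_h
    # Drop zero coefficients so that equal elements are stored as equal dicts.
    return {g: c for g, c in product.items() if c != 0}


# Write e for the identity (stored as 0) and g for the generator (stored as 1).
a = {0: Fraction(1), 1: Fraction(2)}   # e + 2g
b = {1: Fraction(1), 2: Fraction(-1)}  # g - g^2

# (e + 2g)(g - g^2) = g - g^2 + 2g^2 - 2g^3 = -2e + g + g^2, since g^3 = e
print(multiply(a, b))
```

The only place the group enters is the call to `group_op`; everything else is ordinary distributivity, which is exactly what “define a multiplication by using the group operation” means.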
Anyway, I’ve barely scratched the surface of all the interesting questions you can ask. Thanks for reading, and again, feel free to leave questions or comments.