
Linear Transformations: How Matrices Move and Transform Vectors

6 min read · Jul 2, 2025


Introduction: From Combining to Transforming

In the previous blog, we learned how new vectors can be created from existing vectors in a plane. By scaling (or squishing) vectors and then adding them together, we create new vectors, one at a time.

But what if we could transform ALL the vectors in the plane in one go? What if we had a systematic way to rotate, stretch, or flip every single vector in our space simultaneously?

Think of a video game where, in a single moment, the view switches from a front view of the room to a top-down view. Or imagine applying an Instagram filter to a photo: every pixel (which is just data represented as vectors) gets transformed according to the same rule.

That’s exactly what matrices do!

They’re like transformation machines that take in vectors and spit out new, transformed versions. And the best part? This is happening millions of times every second in AI systems — from rotating images in computer vision to transforming data through neural network layers.

Let’s see how this mathematical magic works and why it’s absolutely everywhere in machine learning.

What is a Linear Transformation? (Visual First)

Let’s first visualize what a transformation actually looks like.

Now, here’s the thing — not every transformation qualifies as a “linear transformation.”

There are a couple of important rules:

  • Rule 1: Keep things straight. Linear transformations must preserve straight lines. Imagine the coordinate grid as a bunch of parallel lines running horizontally and vertically. After transformation, these lines might get rotated, stretched, or squished — but they must remain straight and parallel. No curves, no bending! This ensures that vectors (which are straight lines from origin to a point) stay as vectors.
  • Rule 2: Origin stays home. The origin (0,0) cannot move. Think of it as the anchor point of our entire coordinate system. If we allowed the origin to shift, we’d be doing translation (moving everything by the same amount), which makes it an affine transformation, not a linear one. Linear transformations can stretch, rotate, and flip — but they can’t slide the entire space around.

These rules might seem restrictive, but they’re what make linear transformations so powerful and mathematically elegant. They preserve the fundamental structure that makes vector math work beautifully.
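If you want to see these rules in action numerically, here is a minimal sketch (assuming Python with NumPy; the matrix A and the scalars are arbitrary examples, not anything special):

import numpy as np

# A candidate linear transformation: a 2x2 matrix (arbitrary example).
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

u = np.array([1.0, 2.0])
v = np.array([-3.0, 0.5])
a, b = 2.0, -1.5

# Linearity: transforming a combination equals combining the transformed vectors.
lhs = A @ (a * u + b * v)
rhs = a * (A @ u) + b * (A @ v)
print(np.allclose(lhs, rhs))       # True

# Rule 2: the origin stays put under a matrix transformation...
print(A @ np.array([0.0, 0.0]))    # [0. 0.]

# ...but a translation (adding a fixed offset) moves it, making it affine, not linear.
shift = np.array([1.0, 1.0])
print(np.array([0.0, 0.0]) + shift)  # [1. 1.]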

Scaling

When all vectors in the space are stretched or compressed by the same factor. This can be uniform (same scaling in all directions) or non-uniform (different scaling along different axes). Think of zooming in/out on an image or making everything twice as tall.
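As a quick sketch (NumPy, with illustrative numbers), uniform and non-uniform scaling are just diagonal matrices:

import numpy as np

v = np.array([2.0, 1.0])

uniform = np.array([[2.0, 0.0],      # stretch everything by a factor of 2
                    [0.0, 2.0]])
non_uniform = np.array([[1.0, 0.0],  # keep x as-is, make everything 3x as tall
                        [0.0, 3.0]])

print(uniform @ v)      # [4. 2.]
print(non_uniform @ v)  # [2. 3.]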

Rotation

When the entire vector space is rotated around the origin by a specific angle. Every vector gets rotated by the same angle, but their lengths stay the same. Imagine spinning a piece of paper around a pin stuck at the center.
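For reference, this is the standard 2D rotation matrix, sketched in NumPy (the 90-degree angle is chosen only for illustration). Notice that the vector's length does not change:

import numpy as np

theta = np.deg2rad(90)   # rotate 90 degrees counter-clockwise
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

v = np.array([1.0, 0.0])
rotated = R @ v
print(rotated)                                     # approximately [0. 1.]
print(np.linalg.norm(v), np.linalg.norm(rotated))  # both lengths are 1.0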

Reflection (Flipping)

When the vector space is flipped across a line (like the x-axis or y-axis) or plane. It’s like holding up a mirror to the coordinate system — everything appears on the opposite side. For example, reflecting across the x-axis flips all y-coordinates to their negative values.
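A minimal sketch of reflection across the x-axis (NumPy, illustrative vector):

import numpy as np

reflect_x = np.array([[1.0,  0.0],   # x stays the same
                      [0.0, -1.0]])  # y flips sign

v = np.array([3.0, 2.0])
print(reflect_x @ v)   # [ 3. -2.]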

Shearing

When the vector space gets “slanted” or “skewed” in one direction while keeping one axis fixed. Imagine pushing the top of a rectangle sideways while keeping the bottom edge in place — that’s shearing. It creates a parallelogram-like distortion of the grid.
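And a sketch of a horizontal shear (NumPy; the shear factor k is arbitrary):

import numpy as np

k = 1.0
shear = np.array([[1.0, k],      # x picks up a multiple of y
                  [0.0, 1.0]])   # the x-axis itself stays fixed

v = np.array([0.0, 1.0])         # the "top corner" of a unit square
print(shear @ v)                 # [1. 1.]  -- it slides sideways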

Enter the Matrix: The Transformation Machine


Let’s understand how matrices transform the complete vector space and all vectors within it.

Remember our transformation rules? The origin stays fixed, acting like a pivot point around which the entire space can rotate, stretch, or flip. This means our x-axis and y-axis (along with everything else) move around this fixed origin.

Here’s the key insight: A matrix is like a recipe that tells us exactly how this transformation happens.

What the matrix tells us:

  • Each column shows where our basis vectors (î and ĵ) land after transformation
  • Each row gives us the recipe for calculating new coordinates

For a 2D plane, we use a 2×2 matrix:

[a b]
[c d]

Column 1, [a, c], is where î lands; Column 2, [b, d], is where ĵ lands.

Why this works: From our previous blog, we know any vector can be written as a linear combination of î and ĵ:

Any vector = x * î + y * ĵ

So if we know where î and ĵ land after transformation, we can figure out where ANY vector lands!

The magic happens through matrix multiplication:

New Vector = Transformation Matrix × Original Vector

The bottom line: Find where î and ĵ go, and you’ve cracked the code for transforming the entire vector space. The matrix is just a compact way to store this information!
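Here is that idea as a small NumPy sketch (the matrix entries and the vector are arbitrary examples):

import numpy as np

# Column 1 is where î lands, column 2 is where ĵ lands.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
i_hat_lands_at = A[:, 0]   # [2. 1.]
j_hat_lands_at = A[:, 1]   # [1. 3.]

x, y = 4.0, -2.0
v = np.array([x, y])       # v = x*î + y*ĵ

new_v = A @ v              # New Vector = Transformation Matrix × Original Vector

# Same answer as sending î and ĵ to their new homes and recombining:
print(np.allclose(new_v, x * i_hat_lands_at + y * j_hat_lands_at))  # True
print(new_v)                                                        # [ 6. -2.]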

The Connection: From Span to Transformation

Remember from our previous blog how two non-parallel vectors could “span” the entire 2D plane? Well, transformations completely change this game!

How Transformations Affect Span

When we transform our vector space, we’re essentially changing what our “building blocks” (basis vectors î and ĵ) look like. This directly impacts what vectors we can create through linear combinations.

Example: What Happens to Our Span?

Let’s say we apply this shearing transformation:

[1 1]
[0 1]

Before transformation: î = [1,0] and ĵ = [0,1] could span the entire 2D plane.

After transformation:

  • î becomes [1,0] (stays the same)
  • ĵ becomes [1,1] (gets tilted)

These transformed vectors can still span the entire 2D plane because they’re not parallel — just in a “skewed” coordinate system!
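A quick numeric check (NumPy; the target point is arbitrary) that the sheared basis vectors still span the plane:

import numpy as np

i_hat = np.array([1.0, 0.0])   # unchanged by the shear
j_hat = np.array([1.0, 1.0])   # tilted by the shear

B = np.column_stack([i_hat, j_hat])
print(np.linalg.det(B))        # 1.0, nonzero, so the plane is still fully covered

# Any point can still be reached: solve for the combination that hits it.
target = np.array([3.0, -2.0])
coeffs = np.linalg.solve(B, target)
print(np.allclose(coeffs[0] * i_hat + coeffs[1] * j_hat, target))  # True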

The Critical Question: Invertible vs Non-Invertible

  • Invertible transformations: Preserve the span. If you could reach any point before, you still can after transformation.
  • Non-invertible transformations: Collapse the span. Some information gets lost and you can’t reach all points anymore.

Example of collapse: A transformation that squashes everything onto a line destroys the 2D span — you lose a whole dimension of possibilities!
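A sketch of such a collapse (NumPy; the matrix is one illustrative choice that squashes the plane onto the line y = x):

import numpy as np

collapse = np.array([[1.0, 1.0],
                     [1.0, 1.0]])

print(np.linalg.det(collapse))   # 0.0, so the transformation is non-invertible

# Every vector lands on the line y = x: both output components are always equal.
for v in [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([2.0, -5.0])]:
    print(collapse @ v)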

Why This Matters in AI/ML

You might be wondering — “Cool math, but why does this matter for AI?” Here’s the exciting part:

Images Are Just Matrices

Every photo is a matrix of pixel values. When you rotate, resize, or flip images for data augmentation, you’re applying linear transformations! Computer vision models use these transformations constantly.
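As a toy sketch (NumPy, with a made-up 2×3 "image"), flips and 90-degree rotations of an image are exactly these kinds of grid transformations applied to the pixel coordinates:

import numpy as np

img = np.array([[10, 50,  90],   # each entry is a pixel intensity
                [20, 60, 100]])

print(np.fliplr(img))   # horizontal flip: a reflection of the pixel grid
print(np.rot90(img))    # 90-degree rotation of the pixel grid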

Neural Networks = Transformation Chains

Each layer in a neural network is a linear transformation (matrix multiplication) followed by an activation function. Your data gets transformed multiple times as it flows through the network — that’s how AI learns complex patterns.
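A minimal sketch of one such layer, assuming NumPy and made-up weights (strictly speaking, the bias term makes the step affine rather than purely linear):

import numpy as np

def dense_layer(x, W, b):
    # Matrix multiplication (the linear transformation), a bias shift,
    # then a ReLU activation applied element-wise.
    return np.maximum(0.0, W @ x + b)

W = np.array([[0.5, -1.0,  0.2],   # maps 3 inputs to 2 outputs
              [1.5,  0.3, -0.7]])
b = np.array([0.1, -0.2])
x = np.array([3.0, 1.0, 2.0])

print(dense_layer(x, W, b))        # approximately [1.0, 3.2]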

Data Preprocessing Magic

  • Scaling features to the same range? A scaling transformation (see the sketch after this list).
  • PCA for dimensionality reduction? Finding the best rotation of the data.
  • Normalizing (standardizing) data? A combination of shifting and scaling transformations.
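Here is a rough sketch of the first two bullets (NumPy, toy data; the exact numbers are made up):

import numpy as np

# Toy dataset: 4 samples, 2 features on very different scales.
X = np.array([[1.0, 100.0],
              [2.0, 300.0],
              [3.0, 200.0],
              [4.0, 400.0]])

# Standardization: shift each feature to zero mean, then scale it.
# The scaling step is a diagonal (linear) transformation; the shift is affine.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)
print(X_scaled.mean(axis=0), X_scaled.std(axis=0))   # roughly [0 0] and [1 1]

# PCA's core step: rotate the data onto the eigenvectors of its covariance matrix.
cov = np.cov(X_scaled, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
X_rotated = X_scaled @ eigvecs   # a rotation (possibly with a reflection) of the data
print(X_rotated.shape)           # (4, 2)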

Bottom Line

Every AI system — whether recognizing faces, translating text, or predicting stock prices — is fundamentally applying sequences of matrix transformations to convert raw input into meaningful output.
