Why Melt (unpivot) is the most powerful function in Pandas

Pandas is a Python library that lets us do Excel-type-stuff. Well, that’s not really giving it the credit it deserves. Pandas is a Python library which makes Excel-type stuff waaaaaaaaaaaaaaaaaay easier.

You might have seen me speak about how Jupyter Notebooks can make our lives easier as marketers (if not – you’ve clearly been missing out on Distilled Searchlove and you should absolutely buy a ticket). A lot of the examples I use are to do with how using Pandas is much much easier than trying to do the same stuff in Excel.

One function I haven’t been able to talk about on-stage is melt. As I said in the title, melt is kind of like unpivot and it is one of the best functions in Pandas because it lets us easily do things that wouldn’t just be harder in Excel – they would be pretty much impossible for anyone who isn’t a pretty advanced Excel user.

When is melt useful?

Think of it like this. Imagine you have a table of people and kinds of expertise;

PersonExpertise 1Expertise 2Expertise 3
SharonTech SEOArcheryPlumbing
JaneWildlife TamingPlumbingAstrophysics
BuellerLooking AroundRoller BladingDancing

Imagine you need to find two contractors – someone to fix the plumbing in your house and someone to fix the plumbing in your site (tech SEO). So you want to get a list of all of the people who have tech SEO listed as a skill, and separately find a list of all the people who have plumbing as a skill.

If you were doing this in Excel, you would probably have to ask at least six questions;

  1. Who has Tech SEO as skill number 1? OR
  2. Who has Tech SEO as skill number 2? OR
  3. Who has Tech SEO as skill number 3?
  4. Who has Plumbing as skill number 1? OR
  5. Who has Plumbing as skill number 2? OR
  6. Who has Plumbing as skill number 3?

You could be thinking “ah I’d just copy and paste that around” or “I don’t mind asking six questions” but things get much more complicated when we apply this to a real-world situation.

Imagine we are an e-commerce site. We have crawled a competitor website and we’ve extracted all of the product titles they have listed on their category pages. Maybe they have started selling products with our name on them and we need to know which ones. You could have other reasons – I’ll claim ignorance.

Now we have a table in a similar format but it looks quite different. There are way more columns to deal with.

URLProduct 1Product 2….Product 999
page-1Red dressCompetitor’s dressYellow dress
page-50Full length jeansJean shortsCompetitor’s jeans
page-500Competitor’s jumperCardigan

So now, assuming the brand name is always used in the same way we’re going to have to ask up to 999 questions for every category page on their site. Ain’t nobody got time for that.

Another example case is trying to fix redirect chains. If we use Screaming Frog to crawl our site, the redirect chains report will look something like the below. We have a column showing us where each redirect chain ends up, and all of the individual steps.

If we want to say to our devs – we need to make sure each of these redirects goes straight to the final address page, that’s going to be a massive pain for them – scrolling left to right to find each individual hop – and let’s be fair, it’ll probably be pushed to the bottom of their list.

Final AddressRedirect 1Redirect 2….Redirect 20
page-1page-apage-bpage-c
page-2page-dpage-epage-f
page-3page-gpage-hpage-i

Our problem is we can’t say to Excel – “all of these columns are essentially the same and I want to treat them that way”. This is where Melt comes in.

What does melt do?

With melt we can stack columns on top of each other so we can look at them all at the same time.

With melt – we give the columns we want to stay in the same place, and the columns we want to turn into rows and Pandas handles that for us. Let’s use our first example to see how this works. So we start with a table like this, with the Expertise split across different columns.

PersonExpertise 1Expertise 2Expertise 3
SharonTech SEOArcheryPlumbing
JaneWildlife TamingPlumbingAstrophysics
BuellerLooking AroundRoller BladingDancing

We say to melt – the Person column should stay in place, and it should melt everything else. What we get back looks like this;

PersonVariableValue
SharonExpertise 1Tech SEO
SharonExpertise 2Archery
Sharon Expertise 3Plumbing
JaneExpertise 1Wildlife Taming
JaneExpertise 2Plumbing
etc.etc.etc.

Now we only have to ask one question by looking in the values column. Same goes for our product scraping or redirect examples;

Final AddressVariableValue
page-1Redirect 1page-a
page-1Redirect 2page-b
page-1Redirect 20page-c
etc.etc.etc.
URLVariableValue
page-1Product 1Red dress
page-1Product 2Competitor’s dress
page-1Product 999Yellow dress
etc.etc.etc.

If you’d like to try using melt, the Pandas documentation is a great source of information. If you’d like to get stuck in with a real-world example can could make a difference to your site today – I also use the redirect example in the free Google Colab code I include in my How to fix broken links post

Leave a Reply

Your email address will not be published. Required fields are marked *