Data Science, Data Pipelining, and Tools

Tag Mapper - impact graph

In my experience, the main problem companies face isn’t a lack of data. It’s often too much data that’s too hard to understand.

People are overwhelmed, and too busy to spend all the time they need to draw out insights and consider different interpretations.

That’s where I come in.

I get teams the insights they need, when they need them. Sometimes that’s about accessing data they couldn’t, sometimes it’s just giving them a new way to understand things they already know.

Show, don’t tell

Anyone can say they do data science or pipelining, so how about some examples?

Topic clustering & opportunity at scale
Saving tens of thousands of pounds in billable hours

Millions in sales and retainers rely on Keyword Navigator to identify SEO opportunities

Challenge: Keyword clustering and opportunity finding is high-value SEO work, but it is very time-consuming. I was working with a team that was pushed for time and was wasting hours weeding out keywords that weren’t valuable. The team couldn’t code but needed a simple, cheap way to do lots of processing.

Solution: I wrote a Python framework that gets live Google data and groups keywords based on similarity in search results, estimating opportunity page by page and creating Google Sheets with pivot tables to share results. Thanks to Git library management, all code is maintained and updated centrally by the in-house tech team.
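To make the grouping idea concrete, here is a minimal sketch, not the Keyword Navigator code itself. It assumes the search results have already been fetched into a dictionary of keyword to ranking URLs, and it treats two keywords as related when they share a set number of URLs. The function name, the input shape, the example domains, and the threshold of 3 are all illustrative assumptions.

```python
from itertools import combinations


def cluster_keywords(serps, min_shared=3):
    """Group keywords whose results share at least `min_shared` URLs.

    `serps` maps keyword -> list of ranking URLs (a hypothetical input shape).
    """
    # Every keyword starts as its own cluster representative.
    parent = {kw: kw for kw in serps}

    def find(kw):
        # Walk up to the cluster representative, shortening the path as we go.
        while parent[kw] != kw:
            parent[kw] = parent[parent[kw]]
            kw = parent[kw]
        return kw

    url_sets = {kw: set(urls) for kw, urls in serps.items()}
    for a, b in combinations(serps, 2):
        # Merge the two clusters when the result pages overlap enough.
        if len(url_sets[a] & url_sets[b]) >= min_shared:
            parent[find(a)] = find(b)

    clusters = {}
    for kw in serps:
        clusters.setdefault(find(kw), []).append(kw)
    return sorted(sorted(group) for group in clusters.values())


# Made-up example data; real input would come from live search results.
example_serps = {
    "running shoes": ["a.com", "b.com", "c.com", "d.com"],
    "best running shoes": ["a.com", "b.com", "c.com", "e.com"],
    "trail running shoes": ["b.com", "c.com", "e.com", "a.com"],
    "marathon training plan": ["x.com", "y.com", "z.com", "a.com"],
}
clusters = cluster_keywords(example_serps, min_shared=3)
```

Comparing every pair of keywords is fine for a few thousand terms. A larger list would likely need an index from URL to keywords so that only keywords with at least one shared result are compared.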

Secure Google Workspace add-on
Saving thousands of pounds in capacity across the company

Challenge: Consultants often waste time on small tasks in Google Sheets, but the time required to build, track, and release new tools to handle those little tasks means it’s hard to find ROI when building tools.

Solution: I created a workspace add-on framework that lets Aira just add new tools and push them out to every consultant who already has an add-on installed.

The add-on cut our dev cycle from weeks to hours, which meant we could release a series of tools over two months that save thousands of pounds in billable hours.

Machine Learning forecasting and measuring impact
Finding >£100K in wasted spend

Challenge: An ecommerce client was spending lots of money on advertising but didn’t know how well it was working. Standard tracking methods can over-report, hiding the fact that spending money on some activity actually has no impact at all.

Solution: I worked with them to switch campaigns on and off strategically, and used Machine Learning tools like Causal Impact to measure the revenue impact while handling things like seasonal fluctuations.

Revenue impact chart

We identified over 30% of potentially wasted spend, earmarked for further testing. We also established at least £45K uplift from the activity that was running, and found new information about how long customers were taking to convert.
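The reasoning behind this kind of test can be sketched in a few lines. The example below is a deliberately simplified stand-in, not Causal Impact itself (which fits a Bayesian structural time-series model and reports uncertainty). It assumes a hypothetical control series that moves with the same seasonality but is unaffected by the campaign, learns the relationship before the campaign is switched on, and compares actual revenue with what that relationship predicts afterwards. All names and numbers are made up for illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    return y_bar - slope * x_bar, slope


def estimate_uplift(control, revenue, switch_on_index):
    """Total revenue above the expected baseline after the campaign starts."""
    # Learn how revenue tracks the control series before the campaign.
    intercept, slope = fit_line(control[:switch_on_index], revenue[:switch_on_index])
    # Predict what revenue would have been without the campaign.
    expected = [intercept + slope * x for x in control[switch_on_index:]]
    actual = revenue[switch_on_index:]
    return sum(a - e for a, e in zip(actual, expected))


# Illustrative numbers only: revenue follows the control series, then gains
# 50 per period once the campaign is switched on at index 3.
control = [100, 120, 90, 110, 130, 95]
revenue = [210, 250, 190, 280, 320, 250]
uplift = estimate_uplift(control, revenue, switch_on_index=3)
```

Because the baseline is predicted from the control series, a seasonal rise that affects both series is not mistaken for campaign impact.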

3.2 Billion rows of Search Console data
Millions of rows of client insights saved

Video we shared publicly explaining one way we use this data.

Challenge: Google Search Console offers crucial insights into how websites are performing and where the top opportunities are, but the user interface only offers a fraction of the data, and everything older than 16 months is automatically deleted.

Solution: I wrote a Python data pipeline that regularly extracts all Search Console data and puts it into a secure database for later use. To avoid getting blocked by Search Console (which can happen if you ask for too much at the same time), we used Google Cloud Tasks to ask for everything as quickly as possible without breaking the limits.
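The pacing idea can be shown without any cloud services. The sketch below is an assumption-laden illustration rather than the pipeline described above: it splits a date range into one request per day and spaces the run times so that a chosen rate cap is never exceeded. In the real pipeline Google Cloud Tasks handled the queueing; here the function simply returns the schedule. The function name and the cap of 2 requests per minute are hypothetical, not Search Console’s actual quota.

```python
from datetime import date, datetime, timedelta


def plan_requests(start, end, max_per_minute, first_run):
    """Return (run_at, day) pairs: one request per day of data, evenly spaced.

    `max_per_minute` is a hypothetical rate cap chosen by the caller.
    """
    # Smallest gap between requests that still respects the cap.
    gap = timedelta(seconds=60 / max_per_minute)
    days = (end - start).days + 1
    return [(first_run + i * gap, start + timedelta(days=i)) for i in range(days)]


plan = plan_requests(
    start=date(2024, 1, 1),
    end=date(2024, 1, 5),
    max_per_minute=2,
    first_run=datetime(2024, 2, 1, 9, 0),
)
```

Each pair could then be handed to whichever task queue is in use, with `run_at` as the scheduled time and `day` as the slice of data to request.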

SQL learning web app
Thousands of players getting better insights from their own data

More than 40K plays to-date

Challenge: Lots of databases rely on SQL as a way to give users data. It’s a really powerful language and, with tools like BigQuery, lets us process millions of rows of information easily. I’ve worked with many consultants who say they want to learn, but it can be hard because SQL is quite dry and not always very clear, and it can be hard to find an excuse to practice (or the data to practice on).

Solution: I created Lost At SQL – a rich learning game with a story mode that takes players from the absolute basics all the way to advanced multi-step SQL queries, as well as challenge modes where players can pit their skills against others from around the globe.

Tag Manager and Analytics errors visualised
Avoiding tens of thousands of pounds of analytics problems

Over 1K uses by the public

Challenge: A client had a huge number of businesses with different websites trying to use the same tracking platform. When something broke it took a lot of time to work out why.

Solution: I created Tag Mapper which helps visualise the dependencies in a Google Tag Manager account. It helped us quickly avoid and fix tracking issues that could have cost the client tens of thousands, and the tool has been used by analytics practitioners across the globe.
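For a sense of what mapping dependencies means in practice, here is a minimal sketch under stated assumptions, not Tag Mapper’s actual code. It assumes the container has already been reduced to a dictionary where each tag, trigger, or variable lists the items it references, then walks that graph backwards to find everything affected when one item changes. The item names and the input shape are hypothetical.

```python
from collections import deque


def impacted_by(dependencies, changed):
    """Return every item that directly or indirectly depends on `changed`.

    `dependencies` maps item -> list of items it references (hypothetical shape).
    """
    # Invert the mapping so we can look up "who uses this item?".
    dependents = {}
    for item, refs in dependencies.items():
        for ref in refs:
            dependents.setdefault(ref, set()).add(item)

    # Breadth-first walk outward from the changed item.
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for item in dependents.get(current, ()):
            if item not in seen:
                seen.add(item)
                queue.append(item)
    return sorted(seen)


# Made-up container contents for illustration.
container = {
    "tag: GA4 purchase": ["trigger: purchase event", "variable: order value"],
    "tag: Ads conversion": ["trigger: purchase event", "variable: order value"],
    "trigger: purchase event": ["variable: event name"],
    "variable: order value": ["variable: data layer ecommerce"],
    "tag: GA4 page view": ["trigger: all pages"],
}
impacted = impacted_by(container, "variable: data layer ecommerce")
```

In this example a change to the data layer variable reaches both purchase tags through the order value variable, while the page view tag is untouched.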

Speaking and training

From classes of 10 to audiences of 2,000

Challenge: I have long had a passion for teaching and sharing knowledge, but topics like Python and data processing can be intimidating.

Solution: I have given conference talks designed to get audiences excited about coding (they have been called “one of the best talks for the last 10 years”). I’ve also created in-person and on-demand training to help people learn.

Get in touch
