Skip to main content

This blogpost marks the beginning of a long and very exciting journey in the world of Analytics Engineering at Nimbus Intelligence. 

That journey of course starts from the basics, and SQL (Structured Query Language) is a must. SQL is a powerful language for managing and querying data. One of its most useful features is the ability to perform aggregations on large datasets using functions like SUM(), AVG(), COUNT() etc. However, there are times when we need to perform more complex calculations that involve multiple levels of aggregation. 

In this post, we will explore nested aggregated functions in SQL, and how they can be used to solve real-world data problems.

To begin with

Let’s take a look at a table named shipping_details that contains 8 columns:

ID INTEGER
customer_f_name VARCHAR
customer_l_name VARCHAR
quantity INTEGER
price FLOAT
shipping_date DATETIME
city VARCHAR
country VARCHAR

The table holds information about the customer’s full name, the quantity of products that were shipped to them, the price of each product, the shipping date, the city and the country they are located in.

The table looks like this: 

Suppose we want to calculate the total revenue generated by each country, and then find the average revenue per city within each country. In simple words how each city performs inside its country.

To answer that question, we can use the following SQL query:

WITH country_revenue AS (
        SELECT
            country,
            SUM(quantity * price) AS total_revenue
        FROM
            shipping_details
        GROUP BY
            country
    ),
    city_revenue AS (
        SELECT
            country,
            city,
            SUM(quantity * price) AS total_revenue
        FROM
            shipping_details
        GROUP BY
            country,
            city
    )
SELECT
    country_revenue.country,
    city_revenue.city,
    city_revenue.total_revenue / country_revenue.total_revenue AS avg_revenue_per_city
FROM
    country_revenue
    JOIN city_revenue ON country_revenue.country = city_revenue.country;

The result of that query is the following:

In this query we first calculate the total_revenue in each country. Then we calculate the total_revenue at each city. Finally, we calculate the average_total_revenue_per_city.

We clearly see that Turin for instance holds a higher market share in Italy, and The Hague in The Netherlands. This resulted by calculating the average revenue of each city in its country.

Consequently, nested aggregated functions allow us to perform aggregations on the results of other aggregations. This can be useful when we need to calculate more complex metrics that can’t be computed with a single aggregation function.

Conclusion

To conclude with, nested aggregated functions are powerful techniques for performing complex calculations on large datasets in SQL. By using these techniques, we can unlock deeper insights into our data and gain a better understanding of the relationships between different dimensions of the given data.

Auteur