Are you ready for a new challenge? In this episode of “What’s Under the Snowflake?” let’s talk about Python UDF and Snowflake! Frosty Frida – week 5!
“To start out create a simple table with a single column with a number, the size and amount are up to you. After that we’ll start with a very basic function: multiply those numbers by 3. The challenge here is not ‘build a very difficult python function’ but to build and use the function in Snowflake”
Frosty Friday ‘Week 5 – Basic’
Snowflake SuperHeros suggest to test the code with a simple select statement :
SELECT timesthree(start_int)
FROM FF_week_5
How to proceed? A possible strategy!
- P1: Create a table
- Just remember that before doing anything, you need to configure the environment into your worksheet.
- You need to CREATE a table. No matter what you choose. The task simply requests a single column with a number. By means a select statement you can see the original values of such table.
- P2: Multiply values for others
- The goal of this challenge is to build and use a Python UDF. Do you want a clear but comprehensive presentation of this topic? Read the blog! It will really help you!
- In Snowflake you can use several User Defined Functions since you can write both Python, SQL, Java, and JavaScript functions.
P1: Create a table
Remember to configure the working environment before starting No idea? Look at the blog!
--Create a new table and insert data into it
CREATE OR REPLACE TABLE SIMPLE_TABLE
AS SELECT uniform(1, 100, random()) as VALUES_TO_MULT;
--Query your table to see how actually it is
SELECT * FROM SIMPLE_TABLE;
At this point, you should have something like the picture below:
P2: Multiply values for others
The goal of the challenge is to multiply the obtained value by three. By a Python UDF we can reach it. How? We need to create the function that takes as input the value into the SIMPLE_TABLE as returns the result of the multiplication. Hence:
CREATE OR REPLACE FUNCTION multiply_by_three(i int)
RETURNS INT
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
HANDLER = 'multiply_by_three_py'
AS
$$
def multiply_by_three_py(i):
return i*3
$$;
Having set your desired function, you can make a SELECT statement that returns the result of the defined UDF ‘multiply_by_three’ which has the value into the SIMPLE_TABLE as input.
SELECT multiply_by_three (SELECT * FROM SIMPLE_TABLE) AS RESULT_OF_MULTx3;
Unsurprisingly, we expect the outcome of this statement to be…
All we did was multiply the original table value by three!