User-Defined Functions allow us to write functions inside our SQL code. In Snowflake you can use SQL, Java, JavaScript and Python to write UDFs. While you are creating a UDF, the main thing that you should consider is to be sure that the function has a return value. Spoiler!! This is the main difference between User-Defined Functions and the Stored Procedures that we are going to talk about in the next blog. There are two types of them; User-Defined Function (UDF) which is a scalar function that returns one output and User-Defined Table Function (UDTF) which is a tabular function that returns tabular value.
Once you have created these functions; you can call them from any part of your code. So it gives you the ability to write your whole code in a more determine and in a cleaner way.
With UDFs you can’t take actions to modify the database. You can’t use temporary tables inside the UDFs.
Let’s have some examples to understand better how we can use them. We will create 2 differet User-Defined Functions where one will be executed in Python and will have UDF and the other one will execute SQL and will have UDTF. For simplicity let’s do the exact same operation for two of them; just multiply the inserte value with 3:
User-Defined Function Example (in Python)
CREATE OR REPLACE FUNCTION multiply_three(num integer)
RETURNS int
LANGUAGE python
RUNTIME_VERSION = '3.8'
HANDLER = 'multiply_three_py'
--PACKAGES = ('math')
AS
$$
def multiply_three_py(i):
return i*3
$$;
SELECT multiply_three(3);
The most important part that you have to pay attention to is naming your HANDLER with the same name of your function declaration of your code. Since HANDLER will be your function name; your system will require that handler from your SQL code; while a user want to use our function, they will call it with the SQL name as in the example with multiply_three(x).
While using Python we could insert the python package that will help us in the code. To do so we should add the package parameter in the function creation and give the packages that we want to use in our function. For example if we would like to use Numpy in our function, we could add:
packages = ('numpy', 'pandas')
we could use multiple packages in the same function.
User-Defined Table Function Example (in SQL)
The return value will be the biggest difference between UTDF and UDF. Since we are expecting a table to return; the return value will be table and we should specify the column types of that table at the same time. Let me show you with the example that we mentioned before.
As you can see here inside of our udtf_exercise table we have 6 rows of integer values.
And we expect from our function to return us a table that has these values in a column and to add one more column that stores the triplicated ones. (Without making changes on table)
CREATE OR REPLACE FUNCTION mult_columns_per_three()
returns TABLE(column_1 integer, mult_three integer)
AS
$$
SELECT col_1, col_1*3 FROM udtf_exercise
$$
;
SELECT * FROM TABLE(mult_columns_per_three());
after the function creation and function call we will have a returned value as below:
with this operation we could have some extra data that will not be stored in any storage place in our Snowflake account. If we would like to store these values in a new table we could simply create a new table or view using this function returned values.
CREATE TABLE udtf_result_table AS
(
SELECT * FROM TABLE(mult_columns_per_three())
);
SELECT * FROM udtf_result_table;
and now we have our results stored in the table udtf_result_table.
Conclusion
User-Defined Functions and User-Defined Table Functions are features that simplify our code complexity. We could easily create a function for the repeating process and use that for the rest of the code.
There are also Stored Procedures that have specific areas of use and they are very similar to UDFs. You can check the blog in here for the details of Stored Procedures and check here for the detailed difference between UDFs. Thanks for reading.
Credits