Managing too many tools can quickly become a headache, especially when building complex data pipelines. During one of my latest projects I experimented with using Python stored procedures in Snowflake. Looking for a way to simplify development and reduce the number of different applications necessary to build an ingestion pipeline. With tasks, external access integrations, stored procedures and computing resources all in one location Snowflake has the potential to do it all. However during development I discovered that there are some trade-offs of using this method.
The Pros: Centralized Logic
Snowflake stored procedures written in Python are a game-changer on paper.
- All Logic in One Place Having business logic live directly inside Snowflake simplifies architecture. You’re no longer juggling between transformation logic in dbt, orchestration in Airflow and database logic in Snowflake. Everything runs in one place, with less latency and no data movement.
- Python in the Warehouse Python is familiar to most data engineers, so leveraging Snowpark Python inside stored procedures allows for more expressive, maintainable, and powerful logic. The support for common libraries and built-in Snowpark functions makes this especially appealing.
- Runs on Snowflake Resources Since the code runs directly on Snowflake’s computing resources, you don’t need to provision separate compute or worry about setting up additional orchestration frameworks. This means all your costs are associated with one platform simplifying cost and development.
The Cons: One big messy string
But the day-to-day developer experience? That’s where the cracks start to show.
- A Big String Blob When working with stored procedures using any language but SQL, you are creating an entire Python function inside an SQL string block. This results in indentation issues and no syntax highlighting in the Snowflake UI. Even if you’re developing locally, you often find yourself copying large blocks of Python into SQL scripts. This development process is awkward and error-prone.
- Version Control is non-existent Stored procedures live in the Snowflake database. That means no native Git integration, no history tracking, and no PR reviews unless you’re exporting the SQL string to a file and managing it externally. Even then, reading diffs in giant string blocks is rough. Your Python IDE isn’t going to lint or test that block like normal code either.
Looks simple, right? Now imagine this for 500+ lines of logic.
A possible solution?: @sproc decorator
A while back Snowflake introduced support for the @sproc decorator as part of the Snowpark for Python API. This is a huge step toward improving maintainability.
With @sproc, you can define a stored procedure as a standard Python function in a Python file outside of SQL and register it directly into Snowflake. Using this decorator enables:
- Clean, testable Python code (not one big string)
- Proper use of IDE features during development (linting, autocomplete, etc.)
- Actual version control via Git
But even with this improvement, the solution is not perfect. The developer experience is definitely improved but there are still some shortcomings.
- Argument Names Vanish One subtle (but annoying) detail: argument names to procedures aren’t visible once deployed. This means that unless you’re thorough with documentation, you’ll have to look up the original code to remember what each parameter was.
- Code is Hidden in Snowflake Once deployed, if the code block is large, the actual code won’t be visible in the Snowflake UI. The procedure will reference a zip file containing your function. You can see that a stored proc exists, but not what it does. If you’re debugging or trying to audit code, you’re stuck. You will have to look where the procedure was created in the git or external documentation.
Using this decorator will turn the procedure created above in snowflake into the unreadable mess below when created using the @sproc decorator. This makes it very challenging to use and understand this procedure if a user only has access to Snowflake.
Final Thoughts
Python stored procedures in Snowflake are a promising feature and for many teams, they’re already solving problems. But for large scale production code it is not there yet in terms of maintainability, visibility, and version control. I would suggest using it for small procedures with simple logic that do not require version control.
Have you had similar issues with Python stored procedures in Snowflake, or found clever solutions for these issues let me know!