Beyond Warehousing: Snowflake’s Leap into AI and LLMs

January 19, 2024 (updated March 19, 2024)

Over the last few years, Snowflake has positioned itself in the data world through its exceptional handling of massive datasets. It is a powerhouse in data management, known above all for managing and querying data efficiently at scale.

Historically, however, it has been oriented more towards BI and data warehousing than towards Machine Learning or AI. With the arrival of LLMs and the profound impact they are having on the data world, Snowflake could no longer afford to stay behind.

This time the challenge went beyond merely incorporating tools like Jupyter Notebooks, as had been done before. Large Language Models (LLMs) are a different kind of workload, and they strain certain aspects of the Snowflake platform.

Snowflake’s Architecture Adaptation

Snowflake’s architecture is built primarily for structured and semi-structured data. LLMs, by contrast, mostly work on unstructured text, so integrating them requires Snowflake to adapt or extend that architecture. This includes moving data efficiently between structured storage and the models, which may demand additional layers of data processing.

Additionally, limited integration with the machine learning ecosystem made ML tasks more difficult. Users typically had to extract data from Snowflake, process it using external ML tools (Python scripts, Jupyter notebooks, or frameworks like TensorFlow or PyTorch), and then, if necessary, load the results back into Snowflake. Every hop outside the platform made the process slower and less secure.

Then there are the high computational demands of LLMs: the models often require high-performance GPUs or chips specialized for neural networks, which were not the central focus of Snowflake’s original data warehousing design.

To address these issues, Snowflake has opened three new fronts: Snowflake Cortex, Snowpark Container Services and Document AI. As Jack the Ripper would say, let’s take it part by part.

Snowflake Cortex

This service forms the foundation for incorporating ML into the Snowflake environment. One of the key features of Snowflake Cortex is its set of serverless functions. These functions enable users to perform a variety of tasks, such as data analysis and application development, directly within the Snowflake environment. This means that users can execute these functions without having to manage the underlying infrastructure, which significantly simplifies the process of developing and running AI-powered applications.
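To make the idea of a serverless function concrete, here is a minimal sketch that builds a SQL query calling the Cortex sentiment function on a text column. The table name `product_reviews` and the column `review_text` are hypothetical, the helper functions are ours rather than part of any Snowflake library, and the availability of `SNOWFLAKE.CORTEX.SENTIMENT` depends on your account and region, so treat this as an illustration rather than a reference.

```python
import re

# Plain unquoted SQL identifiers only; anything else is rejected so that
# user-supplied names cannot smuggle extra SQL into the query.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def _check_identifier(name: str) -> str:
    """Return the name unchanged if it is a simple identifier, else raise."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def build_sentiment_query(table: str, text_column: str, limit: int = 10) -> str:
    """Build a query that scores each row's text with a Cortex function.

    The function runs inside Snowflake, so no model has to be deployed
    and no data has to leave the platform.
    """
    table = _check_identifier(table)
    text_column = _check_identifier(text_column)
    return (
        f"SELECT {text_column}, "
        f"SNOWFLAKE.CORTEX.SENTIMENT({text_column}) AS sentiment "
        f"FROM {table} LIMIT {int(limit)}"
    )


# Hypothetical table and column names, for illustration only.
query = build_sentiment_query("product_reviews", "review_text")
print(query)
```

With a Snowpark session at hand (not shown here), the resulting string could be run with something like `session.sql(query).collect()`. The point is that the model call is just another SQL function, with no infrastructure to manage.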

Snowpark Container Services

Snowpark Container Services is a framework that provides the infrastructure for running applications and services. With it, developers can run containerized data apps on Snowflake-managed infrastructure, including containers accelerated with NVIDIA GPUs, which is vital for handling the computational demands of LLMs. Because these containers run directly within Snowflake accounts, data can be transferred and processed efficiently, bridging the gap between Snowflake’s structured data storage and the unstructured data processing capabilities of LLMs.
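As a rough illustration of what requesting a GPU looks like, the sketch below renders a small service specification as YAML text. The field layout (`spec`, `containers`, `resources`, `nvidia.com/gpu`, `endpoints`) reflects our understanding of the Snowpark Container Services specification format and should be checked against the official documentation. The container name, image path, endpoint name and port are all hypothetical, and `render_service_spec` is our own helper, not a Snowflake API.

```python
def render_service_spec(container_name: str, image_path: str,
                        gpu_count: int = 1, port: int = 8000) -> str:
    """Return YAML text for a single-container service that requests GPUs.

    Assumption: the key names follow the service specification layout as
    we understand it; verify them against the documentation before use.
    """
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    lines = [
        "spec:",
        "  containers:",
        f"  - name: {container_name}",
        f"    image: {image_path}",
        "    resources:",
        "      requests:",
        f"        nvidia.com/gpu: {gpu_count}",
        "      limits:",
        f"        nvidia.com/gpu: {gpu_count}",
        "  endpoints:",
        "  - name: api",
        f"    port: {port}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical container name and image path, for illustration only.
spec_text = render_service_spec(
    "main", "/my_db/my_schema/my_repo/llm_server:latest"
)
print(spec_text)
```

A specification like this only describes the container. Actually running it also requires a compute pool with GPU-capable nodes, which is created separately.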

Document AI

As part of its strategy to better handle unstructured data, Snowflake has also developed Document AI. This tool leverages a purpose-built, multimodal LLM that is natively integrated within the Snowflake platform. Document AI allows users to extract and process content from unstructured documents, such as invoice amounts or contractual terms, using a visual interface and natural language. This integration directly addresses the challenge of processing unstructured data within Snowflake’s traditionally structured data environment.

With serverless functions in Snowflake Cortex, NVIDIA GPU-backed containers in Snowpark Container Services, and simpler unstructured data processing with Document AI, Snowflake is poised to enter the battle for the next generation of data.

We will dive deeper into these new functionalities in future posts.
