Snowflake in the Clouds

Snowflake is a cloud-native database that can be hosted on any of three clouds: AWS, GCP, or Azure. With Snowflake, we pay only for the storage we use and for the compute actually consumed during the billing cycle.


It has a three-layer architecture:

  1. Cloud Services Layer (also referred to as the Global Services Layer)

  2. Compute Layer

  3. Storage Layer

Snowflake stores data in the storage layer, which sits on the underlying blob storage of the chosen cloud. Virtual warehouses in the compute layer run on top of this storage to query the data and deliver results to the user. The services layer manages all other tasks, such as authorization, metadata management, infrastructure management, and caching of results.


Snowflake Data Lifecycle

Organizing Data -> Databases -> Schemas -> Tables

Storing Data -> INSERT/COPY INTO commands

Querying Data -> SELECT statements

Working with Data -> DML statements, DDL statements, and CLONING

Removing Data -> DELETE, DROP and TRUNCATE
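
To make the lifecycle stages above more concrete, here is a small sketch that pairs each stage with example Snowflake SQL statements. The object names (sales_db, raw, orders, orders_dev, my_stage) are hypothetical and only for illustration; the script just prints the statements and does not connect to Snowflake.

```python
# Illustrative Snowflake SQL for each lifecycle stage.
# All object names (sales_db, raw, orders, orders_dev, my_stage) are hypothetical.
LIFECYCLE = {
    "Organizing Data": [
        "CREATE DATABASE sales_db;",
        "CREATE SCHEMA sales_db.raw;",
        "CREATE TABLE sales_db.raw.orders (id INT, amount NUMBER(10, 2));",
    ],
    "Storing Data": [
        "INSERT INTO sales_db.raw.orders VALUES (1, 19.99);",
        "COPY INTO sales_db.raw.orders FROM @my_stage FILE_FORMAT = (TYPE = 'CSV');",
    ],
    "Querying Data": [
        "SELECT id, amount FROM sales_db.raw.orders WHERE amount > 10;",
    ],
    "Working with Data": [
        "UPDATE sales_db.raw.orders SET amount = 24.99 WHERE id = 1;",     # DML
        "ALTER TABLE sales_db.raw.orders ADD COLUMN status STRING;",       # DDL
        "CREATE TABLE sales_db.raw.orders_dev CLONE sales_db.raw.orders;", # cloning
    ],
    "Removing Data": [
        "DELETE FROM sales_db.raw.orders WHERE id = 1;",
        "TRUNCATE TABLE sales_db.raw.orders;",
        "DROP TABLE sales_db.raw.orders;",
    ],
}


def print_lifecycle(lifecycle=LIFECYCLE):
    """Print each stage followed by its example statements."""
    for stage, statements in lifecycle.items():
        print(stage)
        for sql in statements:
            print("    " + sql)


if __name__ == "__main__":
    print_lifecycle()
```

In a real project these statements would be run from a worksheet or a client connection rather than printed.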


Snowflake Tables

  1. Permanent - These tables exist until they are dropped. They allow Time Travel of up to 90 days (on Enterprise Edition and above; 1 day on Standard Edition) and have a Fail-safe period of 7 days.

  2. Temporary - These tables exist only for the duration of the session. They have Time Travel of up to 1 day and no Fail-safe period.

  3. Transient - These tables exist until they are dropped, but they have Time Travel of up to 1 day and no Fail-safe period.

  4. External - The data for these tables is stored outside Snowflake, and the tables are read-only. They do not have Time Travel or Fail-safe.
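
The differences between the table types can be summarized in a small lookup table. The sketch below is only an illustration: the TableType class and the create_table_sql helper are hypothetical names, not part of any Snowflake library. It builds a CREATE statement with the DATA_RETENTION_TIME_IN_DAYS parameter and rejects a retention period longer than the table type allows. External tables are left out because they are created against files in an external location and have no Time Travel setting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableType:
    keyword: str               # word placed between CREATE and TABLE ("" for permanent)
    lifetime: str              # how long the table itself exists
    max_time_travel_days: int  # upper bound on Time Travel retention
    fail_safe_days: int        # Fail-safe period after Time Travel ends


# Values follow the list above; 90 days assumes Enterprise Edition or higher.
TABLE_TYPES = {
    "permanent": TableType("", "until dropped", 90, 7),
    "temporary": TableType("TEMPORARY", "current session", 1, 0),
    "transient": TableType("TRANSIENT", "until dropped", 1, 0),
}


def create_table_sql(name, columns, table_type="permanent", retention_days=1):
    """Build a CREATE TABLE statement, validating the Time Travel retention."""
    spec = TABLE_TYPES[table_type]
    if not 0 <= retention_days <= spec.max_time_travel_days:
        raise ValueError(
            f"{table_type} tables allow at most "
            f"{spec.max_time_travel_days} day(s) of Time Travel"
        )
    # Skip the empty keyword for permanent tables.
    prefix = " ".join(part for part in ("CREATE", spec.keyword, "TABLE") if part)
    return f"{prefix} {name} ({columns}) DATA_RETENTION_TIME_IN_DAYS = {retention_days};"


if __name__ == "__main__":
    print(create_table_sql("orders", "id INT", "permanent", 30))
    print(create_table_sql("orders_stage", "id INT", "transient", 1))
```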

There are many other objects in Snowflake, such as Organization, Account, Database, Schema, View, Function, Stored Procedure, Sequence, Task, and Stream.


Snowflake can be integrated with a variety of tools from the data world:

  1. Business Intelligence (Tableau, Power BI, QlikView, ThoughtSpot)

  2. Data Integration (dbt Labs, Informatica, Pentaho, Fivetran)

  3. Security and Governance (Collibra, Datadog, HashiCorp Vault)

  4. Machine Learning and Data Science (DataRobot, Dataiku, Amazon SageMaker, Zepl)

  5. SQL Development and Management (SqlDBM, SeekWell, Hackolade, Agile Data Engine)

Snowpark

Snowpark is an API for working with Snowflake data from code, outside the Snowflake interface. It supports Python, Java, and Scala.

It provides DataFrame methods such as .select() and .join() for working with the data.

It executes code lazily, i.e. only when the results are requested.

It uses push-down computation: the data is not moved out of Snowflake; instead, the code is pushed down to Snowflake and runs there.
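
The two ideas above, lazy execution and push-down, can be illustrated with a toy example. This is not the Snowpark API: LazyFrame is a hypothetical class written for this post, and an in-memory SQLite database stands in for Snowflake so that the sketch runs anywhere. Each method call only records what should happen; a single SQL statement is sent to the engine when collect() is called, so the filtering happens inside the database rather than in Python.

```python
import sqlite3


class LazyFrame:
    """Toy stand-in for a Snowpark-style DataFrame (hypothetical, not the real API)."""

    def __init__(self, conn, table, columns=("*",), conditions=()):
        self.conn = conn
        self.table = table
        self.columns = tuple(columns)
        self.conditions = tuple(conditions)

    def select(self, *columns):
        # Nothing runs here: we only remember which columns were requested.
        return LazyFrame(self.conn, self.table, columns, self.conditions)

    def filter(self, condition):
        # Nothing runs here either: the condition is stored for later.
        return LazyFrame(self.conn, self.table, self.columns,
                         self.conditions + (condition,))

    def to_sql(self):
        # Toy SQL builder; string interpolation is not safe for untrusted input.
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(f"({c})" for c in self.conditions)
        return sql

    def collect(self):
        # Only now is the query pushed down to the engine and executed there.
        return self.conn.execute(self.to_sql()).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")  # stand-in for a Snowflake session
    conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)",
                     [(1, 5.0), (2, 25.0), (3, 40.0)])
    df = LazyFrame(conn, "orders").filter("amount > 10").select("id")
    sql = df.to_sql()    # the plan exists, but no rows have been read yet
    rows = df.collect()  # the engine does the filtering and returns the result
    conn.close()
    return sql, rows


if __name__ == "__main__":
    sql, rows = demo()
    print(sql)
    print(rows)
```

With the real Snowpark library the pattern looks similar: DataFrame methods build up a query, and an action such as collect() triggers execution inside Snowflake.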


There are many other aspects to Snowflake as a cloud-native database. This blog gives only an overview.

Thanks for Reading!!



© Copyright 2022 by NumPy Ninja
