

Alter Snowflake with Dynamic SQL and Python

Dec 17, 2020

Adding Python to Alteryx can make your Snowflake dreams come true

Anyone who follows me knows how much I love the combination of Alteryx and Snowflake. Alteryx provides a logic layer that enables any analyst to manipulate data, and Snowflake provides the scale needed to get some amazing insights across large data sets. However, I have found there were always a few things missing. The Alteryx In-DB tools, while comprehensive, cannot do everything. There are some gaps (most notably - ahem - any Alteryx product managers reading this - simple Pivot or Unpivot tools in-DB). There is also a lot of functionality, like changing warehouses, schemas or databases, that cannot easily be done from within Alteryx.

Enter the Python tool

Now, I am more business and less code. I muddle through code when I have to, and I finally decided this was something I had to do: I had to be able to mix custom SQL with the native In-DB tools on Snowflake. I was pleasantly surprised that building some custom macros that take advantage of the Snowflake Python connectors was WAY easier than I expected.


To start with, the newer versions of Alteryx with integrated Jupyter notebooks help a lot!


When you drop a Python tool onto the Alteryx canvas, a Jupyter notebook will open up. You can enter your code there (or, for us non-coders, find the code on Stack Overflow and paste it in :))



This code installs the packages required to execute code against Snowflake. As a note, you may get an error when installing them. If you do, try running Alteryx as an administrator on your machine. You may also have success by adding install_type=User to the command, as in Alteryx.installPackages(package="snowflake-sqlalchemy", install_type=User). Right-clicking on Alteryx Designer and running it as administrator is what worked for me.


from ayx import Package
from ayx import Alteryx

# One-time setup: install the Snowflake connector and its SQLAlchemy
# dialect into the Alteryx Python environment.
Alteryx.installPackages(package="snowflake-connector-python")
Alteryx.installPackages(package="snowflake-sqlalchemy")


I am using the standard code samples from Snowflake on how to set up your connections - with one little trick :).


As you can see, we are creating a variable, inputs, which will read data stream #1 that I connect to the Python tool. This data stream contains the basic information about the Snowflake account. We can set these values using interface tools as part of a macro or analytic app.


The line sfUser = inputs['username'].values[0] sets that variable to the value in the first row of the 'username' column.


The rest below is just the standard Snowflake connector syntax. You will know it is working because we ask Snowflake to return its current version (which you will see in your Python tool, or in your Alteryx messages when you run this in a workflow).



from ayx import Alteryx
import snowflake.connector
from sqlalchemy import create_engine

# Read data stream #1: a single row carrying the Snowflake account details.
inputs = Alteryx.read("#1")

sfUser = inputs['username'].values[0]
sfAccount = inputs['account'].values[0]
sfPswd = inputs['password'].values[0]

# SQLAlchemy engine, built from the same credentials.
engine = create_engine(
    'snowflake://{user}:{password}@{account}/'.format(
        user=sfUser,
        password=sfPswd,
        account=sfAccount,
    )
)

# Native Snowflake connector, and a cursor for executing statements.
con = snowflake.connector.connect(
    user=sfUser,
    password=sfPswd,
    account=sfAccount,
)

sfq = con.cursor()

# Ask Snowflake for its current version to confirm the connection works.
sfq.execute("SELECT current_version()")
print(sfq.fetchone()[0])



Now for the dynamic part

What we are doing here is creating a second input connection (#2) for the SQL script we want to run.

The command sfq.execute('WILL EXECUTE THIS SQL SCRIPT') runs whatever statement you pass it. The problem with this is that it only accepts one statement at a time.


That said, we can read in multiple SQL statements and use a very simple loop to step through each statement and execute it.


# Read data stream #2: one SQL statement per row, in a column named 'SQL'.
inputs2 = Alteryx.read("#2")

# Loop through every statement in the input and execute it in turn.
for index, row in inputs2.iterrows():
    sfq.execute(row['SQL'])
    print(row['SQL'] + " executed")

# Close the connections when we are done.
con.close()
engine.dispose()


This means we can feed individual SQL statements into the macro and execute them from an Alteryx workflow.
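For example, the rows arriving on input #2 might look like the following. This is just a sketch - the warehouse, database and schema names are placeholders, and a plain pandas data frame stands in for the Alteryx stream:

import pandas as pd

# Stand-in for Alteryx.read("#2"): one SQL statement per row.
# The warehouse/database/schema names here are hypothetical.
inputs2 = pd.DataFrame({
    "SQL": [
        "USE WAREHOUSE ANALYTICS_WH",
        "ALTER WAREHOUSE ANALYTICS_WH SET WAREHOUSE_SIZE = 'LARGE'",
        "USE DATABASE SALES_DB",
        "CREATE SCHEMA IF NOT EXISTS SALES_DB.STAGING",
    ]
})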



That's a wrap!

Now, it's just a question of connecting a few interface tools. I am splitting the SQL statement into rows on semicolons to mirror the effect of writing out a long SQL script. The input and output tools on the bottom are really just there to give this macro a seamless in-DB feel. We could very easily tweak the Python macro to return the results of a Snowflake SELECT statement as a data frame and have an output from the macro as well. However, the purpose of this macro is to automate administrative tasks in a workflow. I can now have complete command of my Snowflake environment from within Alteryx, and I can easily build up a suite of macros to deal with common tasks or execute anything remotely in Snowflake from Alteryx!
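If you did want that output, a minimal sketch might look like this (reusing the SQLAlchemy engine created earlier; the SELECT here is just a placeholder query):

import pandas as pd

# Run a SELECT through the SQLAlchemy engine and push the results
# out of the macro on output anchor 1.
df = pd.read_sql("SELECT current_version()", engine)
Alteryx.write(df, 1)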

Now in Action!

Full code below:
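This is the connection setup and the execution loop from above, stitched together into one script:

from ayx import Alteryx
import snowflake.connector
from sqlalchemy import create_engine

# Account details from data stream #1.
inputs = Alteryx.read("#1")
sfUser = inputs['username'].values[0]
sfAccount = inputs['account'].values[0]
sfPswd = inputs['password'].values[0]

engine = create_engine(
    'snowflake://{user}:{password}@{account}/'.format(
        user=sfUser,
        password=sfPswd,
        account=sfAccount,
    )
)

con = snowflake.connector.connect(
    user=sfUser,
    password=sfPswd,
    account=sfAccount,
)
sfq = con.cursor()

# Confirm the connection by asking for the current version.
sfq.execute("SELECT current_version()")
print(sfq.fetchone()[0])

# Execute each SQL statement from data stream #2.
inputs2 = Alteryx.read("#2")
for index, row in inputs2.iterrows():
    sfq.execute(row['SQL'])
    print(row['SQL'] + " executed")

con.close()
engine.dispose()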
