Creating a Postgres docker container and connecting dbt

  • data
  • 2024-05-09
  • 5 minutes to read

Introduction

I am currently researching dbt, a data transformation tool that allows you to transform data in your warehouse more effectively. One of the things that is really interesting about it is that dbt has a set of adapters that allow you to switch between different database engines.

It is also really easy to get started with local development. I initially tried to get the SQL Server adapter up and running because I am able to run SQL Server locally on my Mac. However, the SQL Server adapter is a community adapter and it’s not so quick to get started, so for now I’m going to work with Postgres.

I’ve chosen to use the Postgres adapter because I can spin up a Postgres database locally in a Docker container in a few minutes. This means I can do development completely offline without having to sign up for any cloud services. Here’s how I set things up.

Pre-requisites

Before you can complete the steps below, you need to have Docker installed and configured.
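
If you want to sanity-check that Docker is ready before going any further, a couple of standard commands will do it (nothing here is specific to this walkthrough):

# confirm the Docker CLI is installed
docker --version
# confirm the Docker daemon is running and reachable
docker info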

Create a container instance

The command I’m going to run to create a Postgres container is:

# make sure that Docker is running when you run this command
docker run -d --name dbt-quickstart -p 5432:5432 -e POSTGRES_USER=justin -e POSTGRES_PASSWORD=justin postgres:15-alpine

The parameters are as follows:

  • --name dbt-quickstart is the name of the container.
  • -p 5432:5432 maps port 5432 on the host to port 5432 in the container.
  • -e POSTGRES_USER=justin is the user name.
  • -e POSTGRES_PASSWORD=justin is the password (look away security fans).
  • postgres:15-alpine is the image that will be used to create the container.

If you haven’t pulled the postgres:15-alpine image before, Docker will download the image and then create the container.

Unable to find image 'postgres:15-alpine' locally
15-alpine: Pulling from library/postgres
bca4290a9639: Pull complete
73755f5e6a3e: Pull complete
3a60b4a71875: Pull complete
56089f613fb2: Pull complete
3972f5d4acdc: Pull complete
edbb275221bf: Pull complete
fe375eefc236: Pull complete
f8ac95c0dcf9: Pull complete
f89251e2b9f9: Pull complete
Digest: sha256:8a8d55343d6fc456cb183453e3094ff140b984157b36b48f817dd581654f2aec
Status: Downloaded newer image for postgres:15-alpine
8b122e2d8b752c8ef7a32094a897c826193415151bcf0fa9371edafa46e3196a

You can check that the container is running by running docker ps or by reviewing the Docker dashboard.

The screenshot shows the Docker dashboard displaying a running container called dbt-quickstart, using the postgres:15-alpine image and listening on port 5432.
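
If you prefer the command line over the dashboard, the same information is available there too; a quick check, assuming the container name used above:

# show only the dbt-quickstart container and its port mapping
docker ps --filter "name=dbt-quickstart"
# tail the Postgres startup logs to confirm the database is ready to accept connections
docker logs dbt-quickstart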

Install and configure dbt-postgres

Next I need to install the dbt adapter for postgres:

# installs the dbt adapter for Postgres
pip install dbt-postgres
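
To confirm the adapter has been picked up, dbt can report its own version along with any installed adapter plugins:

# lists the dbt-core version and the installed adapter plugins
dbt --version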

Run dbt

I’m using the jaffle-shop sandbox project to review dbt. All I need to do from here is set up my profile in the profiles.yml file. I want this project to be built in my new Postgres database, so the profiles.yml file looks like this:

quickstart-postgres:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      user: justin
      password: justin
      port: 5432
      dbname: jaffle
      schema: dbo
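
One thing to be aware of: the container created earlier only has the default database (named after POSTGRES_USER), so the jaffle database referenced by dbname needs to exist before dbt can build into it. A minimal way to create it, assuming the container and credentials from above:

# create the jaffle database inside the running container
docker exec -it dbt-quickstart createdb -U justin jaffle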

By updating the profile in the dbt_project.yml file to quickstart-postgres, I can now run dbt commands to build and test the project against my Postgres database.
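
For reference, the relevant setting in dbt_project.yml is just the profile key; a minimal excerpt (the rest of the file is unchanged):

# dbt_project.yml (excerpt)
profile: quickstart-postgres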

Accessing the database in Azure Data Studio

I can also connect to the database using Azure Data Studio. I can see the tables that have been created by dbt as a result of running the dbt seed command, so I know that everything is working.

The image shows Azure Data Studio connected to my local Postgres database, with the jaffle database expanded to show the tables created by the dbt seed command.
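
If you don’t have Azure Data Studio to hand, the same check works from psql inside the container; a quick sketch, assuming the names used earlier:

# list the tables that dbt seed created in the dbo schema
docker exec -it dbt-quickstart psql -U justin -d jaffle -c "\dt dbo.*"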

And so from there I can run dbt run to build the project:

dbt run

# output
Running with dbt=1.7.14
Registered adapter: postgres=1.7.14
Found 13 models, 6 seeds, 6 sources, 0 exposures, 19 metrics, 651 macros, 0 groups, 6 semantic models
Concurrency: 1 threads (target='dev')

1 of 13 START sql table model dbo.metricflow_time_spine ........................ [RUN]
1 of 13 OK created sql table model dbo.metricflow_time_spine ................... [SELECT 3651 in 0.14s]
2 of 13 START sql view model dbo.stg_customers ................................. [RUN]
2 of 13 OK created sql view model dbo.stg_customers ............................ [CREATE VIEW in 0.06s]
3 of 13 START sql view model dbo.stg_locations ................................. [RUN]
3 of 13 OK created sql view model dbo.stg_locations ............................ [CREATE VIEW in 0.05s]
4 of 13 START sql view model dbo.stg_order_items ............................... [RUN]
4 of 13 OK created sql view model dbo.stg_order_items .......................... [CREATE VIEW in 0.04s]
5 of 13 START sql view model dbo.stg_orders .................................... [RUN]
5 of 13 OK created sql view model dbo.stg_orders ............................... [CREATE VIEW in 0.05s]
6 of 13 START sql view model dbo.stg_products .................................. [RUN]
6 of 13 OK created sql view model dbo.stg_products ............................. [CREATE VIEW in 0.05s]
7 of 13 START sql view model dbo.stg_supplies .................................. [RUN]
7 of 13 OK created sql view model dbo.stg_supplies ............................. [CREATE VIEW in 0.05s]
8 of 13 START sql table model dbo.locations .................................... [RUN]
8 of 13 OK created sql table model dbo.locations ............................... [SELECT 6 in 0.04s]
9 of 13 START sql table model dbo.products ..................................... [RUN]
9 of 13 OK created sql table model dbo.products ................................ [SELECT 10 in 0.04s]
10 of 13 START sql table model dbo.order_items ................................. [RUN]
10 of 13 OK created sql table model dbo.order_items ............................ [SELECT 90900 in 0.15s]
11 of 13 START sql table model dbo.supplies .................................... [RUN]
11 of 13 OK created sql table model dbo.supplies ............................... [SELECT 65 in 0.03s]
12 of 13 START sql table model dbo.orders ...................................... [RUN]
12 of 13 OK created sql table model dbo.orders ................................. [SELECT 61948 in 0.33s]
13 of 13 START sql table model dbo.customers ................................... [RUN]
13 of 13 OK created sql table model dbo.customers .............................. [SELECT 935 in 0.09s]
Finished running 7 table models, 6 view models in 0 hours 0 minutes and 1.31 seconds (1.31s).
Completed successfully

Which will build the project in the Postgres database:

The image shows Azure Data Studio connected to my local Postgres database, with the jaffle database expanded to show the tables created by the dbt run command.
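
The profile also supports the rest of the usual workflow, so to round things off I can run the project’s tests as well (output omitted here):

# run the tests defined in the project against the Postgres database
dbt test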

Conclusion

If you want to explore dbt and you don’t want to go through the hassle of setting up a cloud database resource, then this is a pretty quick way to get started. I do like the idea of being able to develop and test the entire solution offline. There is also the option to connect to DuckDB, but I haven’t tried that yet.

#mtfbwy


