In recent weeks, I have been working intensively with SQLalchemy for a consulting client. SQLalchemy is a Python-language toolkit for using SQL databases in applications. I’ve used Python, and SQL databases, and SQL queries, and a different Python-language toolkit for using SQL databases in applications, this was my first in-depth encounter with SQLalchemy. I had to do a lot of learning. SQLalchemy, despite its scads of documentation, and good tutorials, didn’t tell me some important concepts. Here’s a brief list, in an attempt to gather my thoughts and insights.

This list doesn’t include the important concepts the documentation does include, just what it (to my reading) left out. And I haven’t attempted to flesh out these points. That might be a good future blog.  It is concepts that I wish I had learned earlier and more easily.

If you are writing complex queries to retrieve data, expect to have the tutorials let you down. They tend to explain well tasks like defining a database table structure using Python classes, and adequately tasks like inserting data to the database. Generally the tutorials fall short in explaining how to retrieve data which requires a complex query.

You can write queries using uninstantiated classes. Normally in Python language object-oriented programming, the classes describe a capability, but you create instances of the class to use that capability.  The functions you call are attributes of the instance.  By contrast, SQLalchemy includes functions, usable as attributes of the class, which get real work done. You don’t need to instantiate the class if you aren’t inserting rows into the database.  I wasted time trying to figure out how to instantiate my model classes to get a query written, before I realised I could get the query written with the classes themselves.

I wanted a tutorial on query writing for the SQL developer, and I couldn’t find it. I had my fairly complex query written in raw SQL, and it worked fine. I looked for a tutorial in converting this query to SQLalchemy. The tutorials I found seemed aimed at Python language developers, explaining database queries. I wanted the complement.

Expect to have to get a PhD in SQLalchemy to get serious development done. Some Python packages help developers become proficient easily through a simple one-page document, or by intuitive method and parameter names, or by reading code written using the package.  By contrast, I was stuck on SQLalchemy until I took some time to read several tutorials and long class documentation in careful detail, in order to get the concepts clear in my head.

SQLalchemy seems to have a few classes which are crucial to understand, e.g. Select, FromClause, Table, ColumnClause. Each has families of similarly-named and related classes. Each has methods that let you transform one to another. Once I finished the Python code to represent my fairly complex SQL query, I could see that it started with a Table class, modulated to Select classes, to FromClause, to Select again.  I suspect that part of SQLalchemy mastery is figuring out how this path can work in general. I don’t see documentation which describe using SQLalchemy in terms of this general path between crucial classes.

Once you have the object representing a fairly complex query, it is hard from this object to get to the kinds of objects representing the columns which I could use for collating the query. There may be such a path, but I couldn’t find it easily. So, I ended up storing a dict with the final column names of the query as keys, and the objects representing the columns as values.

SQLalchemy can do powerful things. It’s role abstracting away differences between databases is valuable. I appreciate all the work of the developers who write this wonderful tool, and the writers who produce the scads of documentation.  But I sure would have been more effective sooner if I had learned this things at the start.

Do you have things the documentation never told you about SQLalchemy? Please add them to the comments.