Model queries with raw (open-ended) SQL
As extensive as Django model queries are, there can be circumstances when neither of the options presented in previous sections is sufficient to execute certain CRUD operations. Under these circumstances, you must rely on raw (open-ended) SQL queries, which represent the most flexible approach to execute operations against databases connected to Django projects.
Django offers two ways to execute
raw SQL queries. The first consists of using a model manager's
raw()
method that fits the resulting SQL query data
into a Django model, which has the added advantage of raw SQL
queries behaving as close as possible to native Django model
queries. And a second approach which consists of using a Python
database connection -- managed by Django -- to fetch the SQL query
data and process it with lower level Python DB API functions (e.g.
cursor
)[6].
SQL queries with a model manager's raw() method
A model manager's
raw()
method should be your first option to execute
raw SQL queries, since the results are structured as a
RawQuerySet
class instance, which is very similar to
the QuerySet
class instances produced by Django model
queries you've used up to this point.
Thus a RawQuerySet
class instance -- just like a QuerySet
class instance
-- offers an easy way to access records using a Django model's
fields, the ability to defer the loading of model fields, as well
as indexing and slicing. Listing 8-58 illustrates a series a raw
SQL query performed with a model manager's raw()
method.
Listing 8-58. Django model manager raw() method
from coffeehouse.items.models import Drink, Item # Get all drink all_drinks = Drink.objects.raw("SELECT * FROM items_drink") # Confirm type type(all_drinks) # Outputs: <class 'django.db.models.query.RawQuerySet'> # Get first drink with index 0 first_drink = all_drinks[0] # Get Drink name (via item OneToOne relationship) first_drink.item.name # Use parameters to limit a raw SQL query caffeine_limit = 100 # Create raw() query with params argument to pass dynamic arguments drinks_low_caffeine = Drink.objects.raw("SELECT * FROM items_drink where caffeine < %s",params=[caffeine_limit]);
The first snippet in listing 8-58
uses the Drink
model manager's raw()
method to issue the SELECT * FROM items_drink
query.
It's worth mentioning this raw query produces the same results as
the native Django query Drink.objects.all()
, but
unlike native queries that produce QuerySet
data
structures, notice how raw()
method queries produce a RawQuerySet
data structure.
Because a
RawQuerySet
data structure is a subclass of
QuerySet
, listing 8-58 shows how it's possible to use
many of the same mechanisms as QuerySet
data
structures. For example, access to records is also done by index
(e.g. [0] to get the first elements) and it's also possible to
access related models using dot notation.
Finally, the last example in
listing 8-58 illustrates how to create raw SQL queries with dynamic
arguments using the params
argument. In all cases
where you need to create raw()
SQL queries that depend
on dynamic values (e.g. provided by a user or another sub-routine)
you should always create the backing raw SQL query string with
placeholders -- %s -- which then get substituted through the
params
argument. In the case of listing 8-58, notice
how the caffeine_limit
variable is declared in
params
to later be substituted into the raw SQL query.
The params
argument ensures dynamic values are escaped
from queries before being applied to the database, avoiding a
potential SQL injection security attack[7].
The raw()
SQL
example in listing 8-58 is straightforward because the results of
the query map directly to the intended model. In other words, the
SELECT * FROM items_drink
query produces the necessary
results for Django to create Item
records without
additional help. Sometimes though, raw()
SQL queries
require additional configuration to be able to create the
underlying model records.
For example, if you perform a raw
SQL query on a legacy table or multiple tables, with the intention
to use the raw()
method of a certain model, you must ensure Django
is able to interpret the results of raw SQL query to the model, by
either using SQL AS statements in the raw SQL or relying on the
raw()
translations parameter. Listing 8-59 illustrates
both techniques.
Listing 8-59. Django model manager raw() method with mapping, deferred fields and aggregate queries.
# Map results from legacy table into Item model all_legacy_items = Item.objects.raw("SELECT product_name AS name, product_description AS description from coffeehouse_products") # Access legacy results as if they are standard Item model records all_legacy_items[0].name # Use explicit mapping argument instead of 'as' statements in SQL query legacy_mapping = {'product_name':'name','product_description':'description'} # Create raw() query with translations argument to map table results all_legacy_items_with_mapping = Item.objects.raw("SELECT * from coffeehouse_products", translations=legacy_mapping) # Deferred model field loading, get item one with limited fields item_one = Item.objects.raw("SELECT id,name from items_item where id=1") # Acess model fields not referenced in the raw query, just like QuerySet defer() item_one[0].calories item_one[0].price # Raw SQL query with aggregate function added as extra model field items_with_inventory = Item.objects.raw("SELECT *, sum(price*stock) as assets from items_item"); # Access extra field directly as part of the model items_with_inventory[0].assets
The first example in listing 8-59
declares a raw()
method with multiple SQL AS
statements, in this case, each of the AS clauses corresponds to an
Item model field. In this manner, when Django inspects the results
of the raw query, it knows how to map the results to
Item
instances, irrespective of the underlying
database table column names.
The second example in listing
8-59 declares a raw()
method with the
translations
argument whose value is a Python
dictionary that maps database table column names to Django model
fields. In this case, when Django encounters an unknown database
table column name in the raw query results, it uses the
translations
dictionary to determine how to map the
results to Item instances.
The third example in listing 8-59
illustrates how even when issuing a partial raw SQL query with the
raw()
method, Django is capable of fetching missing
fields as if it were a native QuerySet
data structure.
Finally, the fourth example in listing 8-59 illustrates how the
raw()
method is capable of handling extra fields
declared as aggregation queries and how they become accessible as
if they were added with the native model aggregate()
method.
SQL queries with Python's DB API
Although the Django model
raw()
method offers a great alternative to create raw
SQL queries and have the ability to leverage native Django model
features, there are circumstances where raw SQL queries with a
model's raw()
method won't work. Either because the
results of a raw SQL query can't be mapped to a Django model, or
because, you simply want access to the raw data without any Django
model influence.
Under such circumstances, you'll need to use the second Django alternative to perform raw SQL queries, which consists of directly connecting to a database and explicitly extracting the results of a query. Although this second Django alternative is technically the most flexible to interact with a database, it also requires using lower-level calls from Python's DB API.
The only thing you can leverage
from Django when performing raw SQL queries using this technique is
the database connection defined in a Django project (i.e. the
DATABASES
variable in settings.py
). Once
a database connection is established, you'll need to rely on Python
DB API methods like cursor()
, fetchone()
and fetchall()
-- as well as perform manual extraction
of the results -- to be able to successfully run raw SQL
queries.
Listing 8-60 illustrates SQL queries using the Python DB API in the context of Django.
Listing 8-60. Django raw SQL queries with connection() and low-level DB API methods
from django.db import connection # Delete record target_id = 1 with connection.cursor() as cursor: cursor.execute("DELETE from items_item where id = %s", [target_id]) # Select one record salad_item = None with connection.cursor() as cursor: cursor.execute("SELECT * from items_item where name='Red Fruit Salad'") salad_item = cursor.fetchone() # DB API fetchone produces a tuple, where elements are accessible by index salad_item[0] # id salad_item[1] # name salad_item[2] # description # Select multiple records all_drinks = None with connection.cursor() as cursor: cursor.execute("SELECT * from items_drink") all_drinks = cursor.fetchall() # DB API fetchall produces a list of tuples all_drinks[0][0] # first drink id
The first statement in listing
8-60 imports django.db.connection
which represents the
default
database connection defined in the
DATABASES
variable in settings.py
. Once
you have a connection
reference to the Django
database, you can start to make use of the Python DB API, which
generally starts with the use of the cursor()
method[8].
The first raw SQL query in
listing 8-60 opens a cursor
on the
connection
and executes the
cursor.execute()
method to perform a delete operation.
Because delete queries don't return results, the operation is
considered concluded after the calling the
cursor.execute()
method
Tip If you declare multiple database reference in DATABASES, you can can use thedjango.db.connections
reference to create a cursor on a specific database, instad of the default:from django.db import connections cursor = connections['analytics'].cursor() # Cursor connects to 'analytics' DB
The second raw SQL query in
listing 8-60 first declares the salad_item
placeholder
variable to store the results of the raw SQL query. Once this is
done, another cursor
is opened on the
connection
to execute a select operation using the
same cursor.execute()
method. Because select queries
return a result, an additional call is made on the
cursor.fetchone()
method to extract the results of the
query and assign them to the placeholder variable. Note the
fetchone()
method is used because it's expected the
raw SQL query will return a single record result.
Next, observe how the results of
raw SQL query in salad_item
are accessed by index.
Since the Python DB API cursor.fetchone()
method makes
no use of field names or other references, you have to know the
order in which record fields are returned, a process that can be
particularly cumbersome for raw SQL with many fields.
The third raw SQL query in
listing 8-60 first declares the all_drinks
placeholder
variable to store the results of the raw SQL query. Once this is
done, another cursor
is opened on the
connection
to execute anther select operation using
the same cursor.execute()
method. Because select
queries return a result, an additional call is made on the
cursor.fetchall()
method to extract the results of the
query and assign them to the placeholder variable. Note the
fetchall()
method is used because it's expected the
raw SQL query will return multiple record result.
Next, observe how the results of
raw SQL query in all_drinks
are accessed by multiple
index. Since the Python DB API cursor.fetchall()
method makes no use of field names or other references, the first
index represents a record in the result and the second index
represents a field value from a given record.