Model queries by SQL keyword

In the previous sections you learned how to query single records , multiple records and related records with Django model methods. However, the matching process was done for the most part on exact values. For example, a query for the Store record with id=1 translated into the SQL WHERE ID=1 or a query for all Store records with state="CA" translated into the SQL WHERE STATE="CA".

In reality, exact SQL matching patterns are far from most real world scenarios that require finer grained SQL queries. In the following sub-sections, you'll learn about the various Django model query options classified by SQL keywords, this way you can easily identify the required Django syntax using the better known SQL keywords as identifiers.

WHERE queries: Django field lookups

The SQL WHERE keyword is among the most used keywords in relational database queries, because it's used to delimit the amount of records in a query through a field value. Up to this point, you've mostly used the SQL WHERE keyword to create queries on exact values (e.g. WHERE ID=1), however, there are many other variations of the SQL WHERE keyword.

In Django models, variations of the SQL WHERE keyword are supported through field lookups, which are keywords appended to field filters using __ (two underscores) (a.k.a. "follow notation").

The pk lookup shortcut

Django queries rely on model field names to classify queries. For example, the SQL WHERE ID=1 statement in a Django query is written as ...(id=1), the SQL WHERE NAME="CA" statement in a Django query is written as ...(state="CA").In addition, Django models can also use the pk shortcut -- where pk="primary key" -- to perform queries against a model's primary key. By default, a Django model's id field is the primary key, so id field and pk shortcut queries are considered equivalent (e.g. Store.objects.get(id=1) Store.objects.get(pk=1))
A query with a pk lookup only has a different meaning than one with an id field, when a model defines a custom primary key model field.

=/EQUAL and !=/NOT EQUAL queries: exact, iexact

Equality or = queries is the default WHERE behavior used in Django models. There are two syntax variations for equality searches, one is a short-handed version and the other uses the exact field lookup, listing 8-33 shows both approaches.

Listing 8-33. Django equality = or EQUAL query

from coffeehouse.stores.models import Store
from coffeehouse.items.models import Item

# Get the Store object with id=1
Store.objects.get(id__exact=1)

# Get the Store object with id=1 (Short-handed version)
Store.objects.get(id=1)

# Get the Drink objects with name="Mocha"
Item.objects.filter(name__exact="Mocha")

# Get the Drink objects with name="Mocha" (Short-handed version)
Item.objects.filter(name="Mocha")

As you can see in listing 8-33, you can either use the exact field lookup to explicitly qualify the query or use the short-handed syntax <field>=<value>. Because exact WHERE queries are the most common, Django implies exact searches by default.

Tip You can do case insensitive equality queries with the iexact field lookup (e.g. match 'IF','if','If' or 'iF'). See LIKE and ILIKE queries section for details.

Inequality or != searches also have two syntax variations presented in listing 8-34.

Listing 8-34. Django inequality != or NOT EQUAL query with exclude() and Q objects

from coffeehouse.stores.models import Store
from coffeehouse.items.models import Item
from django.db.models import Q

# Get the Store records that don't have state 'CA'
Store.objects.exclude(state='CA')

# Get the Store records that don't have state 'CA', using Q
Store.objects.filter(~Q(state="CA"))

# Get the Item records and exclude items that have more than 100 calories
Item.objects.exclude(calories__gt=100)

# Get the Item records and exclude those with 100 or more calories, using Q
Item.objects.filter(~Q(calories__gt=100))

As you can see in listing 8-34, one syntax variation uses the exclude() method to exclude objects that match a given statement. Another alternative is to use a Django Q object to negate a query. In listing 8-34 you can see the Q object Q(state="CA") that matches state values with CA, but because the Q object is preceded with ~(tilde symbol) it's a negation pattern (i.e. matches state values that aren't CA).

Both the exclude() and Q object syntax produce the same results. Q objects are mostly used in more complex queries, but in this case a negated Q object works just like exclude().

AND queries

To create SQL WHERE queries with an AND statement you can add multiple statements to a query or use Q objects, as illustrated in listing 8-35.

Listing 8-35. Django AND query

from coffeehouse.stores.models import Store
from django.db.models import Q

# Get the Store records that have state 'CA' AND city 'San Diego'
Store.objects.filter(state='CA', city='San Diego')

# Get the Store records that have state 'CA' AND city not 'San Diego'
Store.objects.filter(Q(state='CA') & ~Q(city='San Diego'))

The first example in listing 8-35 adds multiple field values to the filter() method to produce a WHERE <field_1> AND <field_2> statement. The second example in listing 8-35 also uses the filter() method, but uses two Q objects to produce a negation with the AND statement (i.e. WHERE <field_1> AND NOT <field2>) through the & operator.

Tip If you're looking for a broader AND query than the ones in listing 8-35, for example, get Store objects with state 'CA' AND those with state 'AZ', look at either OR queries or IN queries.

Tip If you're looking to combine two queries, for example query1 AND query 2, look at the Merge queries section later in this same chapter.

OR queries: Q() objects

To create SQL WHERE queries with an OR statement you can use Q objects, as illustrated in listing 8-36.

Listing 8-36 Django OR query

from coffeehouse.stores.models import Store
from coffeehouse.items.models import Item
from django.db.models import Q

# Get the Store records that have state 'CA' OR state='AZ'
Store.objects.filter(Q(state='CA') | Q(state='AZ'))

# Get the Item records with name "Mocha" or "Latte"
Item.objects.filter(Q(name="Mocha") | Q(name='Latte'))

Both examples in listing 8-36 uses the | (pipe) operator between Q objects to produce a WHERE <field1> OR <field2> statement , similar to how the & operator is used for AND conditions.

IS and IS NOT queries: isnull

The SQL IS and IS NOT statements are typically used with WHERE in queries involving NULL values. And depending on the database brand, SQL IS and IS NOT can also be used in boolean queries. To create SQL WHERE queries with an IS or IS NOT statement you can use a Python None data type with an equivalency test or the isnull field lookup, as illustrated in listing 8-37.

Listing 8-37. Django IS and IS NOT queries

from coffeehouse.stores.models import Store
from coffeehouse.items.models import Drink
from django.db.models import Q

# Get the Store records that have email NULL
Store.objects.filter(email=None)

# Get the Store records that have email NULL
Store.objects.filter(email__isnull=True)

# Get the Store records that have email NOT NULL
Store.objects.filter(email__isnull=False)

The first example in listing 8-37 attempts a query on Python's None value, in this case None gets translated to SQL's NULL (i.e. IS NULL). The second and third examples in listing 8-37 use the isnull field lookup, to create IS NULL and IS NOT NULL queries, respectively.

IN queries: in

The SQL IN statement is used with WHERE clauses to generate queries that match a list of values. To create SQL WHERE queries with an IN statement you can use the in field lookup, as illustrated in listing 8-38.

Listing 8-38 Django IN queries

from coffeehouse.stores.models import Store
from coffeehouse.items.models import Drink

# Get the Store records that have state 'CA' OR state='AZ'
Store.objects.filter(state__in=['CA','AZ'])

# Get the Item records with id 1,2 or 3
Item.objects.filter(id__in=[1,2,3])

As you can see in listing 8-38, the Django in field lookup can be used to create a query for records that match a list values from any field (e.g. integers, strings).

LIKE and ILIKE queries: contains, icontains, startswith, istartswith, endswith, iendswith

The SQL LIKE and ILIKE queries are used with WHERE clauses to match string patterns, with the former being case-sensitive and the latter case-insensitive. Django offers three field lookups to generate SQL LIKE queries, depending on the string pattern you wish to match. Listing 8-39 illustrates how to generate three different SQL LIKE queries with Django field lookups.

Listing 8-39. Django LIKE queries

from coffeehouse.stores.models import Store
from coffeehouse.items.models import Item, Drink

# Get the Store records that contain a 'C' anywhere in state (LIKE '%C%')
Store.objects.filter(state__contains='C')

# Get the Store records that start with 'San' in city (LIKE 'San%')
Store.objects.filter(city__startswith='San')

# Get the Item records that end with 'e' in name (LIKE '%e')
Drink.objects.filter(item__name__endswith='e')

As you can see in listing 8-39, the % symbol represents an SQL wildcard and is placed in different positions in the SQL LIKE pattern value depending on the Django field lookup: to generate an SQL query with the LIKE '%PATTERN%' you use the contains field lookup; to generate an SQL query with the LIKE 'PATTERN%' you used the startswith field lookup; and to generate an SQL query with the LIKE '%PATTERN' you use the endswith field lookup.

Django also supports the SQL ILIKE queries, which functions as LIKE queries, but are case-insensitive. Listing 8-40 illustrates how to create ILIKE queries with Django field lookups.

Listing 8-40. Django ILIKE queries

from coffeehouse.stores.models import Store
from coffeehouse.items.models import Item

# Get the Store recoeds that contain 'a' in state anywhere case insensitive (ILIKE '%a%')
Store.objects.filter(state__icontains='a')

# Get the Store records that start with 'san' in city case insensitive (ILIKE 'san%')
Store.objects.filter(city__istartswith='san')

# Get the Item records that end with 'a' in name case insensitive (ILIKE '%A')
Item.objects.filter(name__iendswith='A')

# Get the Store records that have state 'ca' case insensitive (ILIKE 'ca')
Store.objects.filter(state__iexact='ca')

The examples in listing 8-40 are just like those in listing 8-39, the only different is Django's field lookups are preceded with the letter i to indicate a case-insensitive ILIKE query.

It's worth mentioning the last example in listing 8-40 is a case-insensitive version of =/EQUAL and !=/NOT EQUAL queries. However, because iexact uses ILIKE under the hood it's mentioned again in this section.

REGEXP queries: regex, iregex

Sometimes the patterns supported by SQL LIKE & ILIKE statements are too basic, in which case you can use an SQL REGEXP statement to define a complex pattern as a regular expression. Regular expressions are more powerful because they can define fragmented patterns, for example: a pattern that starts with sa followed by any letters, followed by a number; or a conditional pattern, such as pattern that starts with Los or ends in Angeles. Django supports the SQL REGEXP keyword through the regex field lookup and also supports case-insensitive regular expression queries through the iregex field lookup.

Although it would be beyond the scope of our discussion to describe the many regular expression syntax variations, a sample regular expression query to match Store records with a city that starts with Los or San would be: Store.objects.filter(city__regex=r'^(Los|San) +').

Note the recommended practice to define patterns for regex or iregex field lookups is to use Python raw string literals. A Python raw string literal is a string preceded by r that conveniently expresses strings that would be modified by escape sequence processing (e.g. the raw string r'\n' is identical to the standard string '\\n'). This behavior is particularly helpful with regular expressions that rely heavily on escape characters. Appendix A describes the use of Python raw strings in greater detail.

>/GREATER THAN and </LESS THAN queries: gt, gte, lt, lte

SQL WHERE statements associated with numeric fields often use the mathematical operators >, >=, < and <= to restrict queries to a certain number ranges. Django models support the use of the mathematical operators >, >=, < and <= through the gt, gte, lt and lte field lookups, respectively. Listing 8-41 illustrates the use of these field lookups in Django.

Listing 8-41. Django GREATER THAN and LESSER THAN queries

from coffeehouse.items.models import Item

# Get Item records with stock > 5
Item.objects.filter(stock__gt=5)

# Get Item records with stock > or equal 10
Item.objects.filter(stock__gte=10)

# Get Item records with stock < 100
Item.objects.filter(stock__lt=100)

# Get Item records with stock < or equal 50
Item.objects.filter(stock__lte=50)

Date and time queries: range, date, year, month, day, week, week_day, time, hour, minute, second

Although SQL WHERE queries for date and time fields can be done with equality, greater than and lesser than symbols, writing SQL date and time queries can be time consuming due to their special characteristics. For example, to create an SQL query to get all records with a 2018 year timestamp, you need to create a query like 'WHERE date BETWEEN '2018-01-01' AND '2018-12-31'. As you can see, syntax wise these queries can become complex and error prone if you add the need to deal with things like timezones, months and things like leap years.

To simplify the creation of SQL WHERE queries with date and time values, Django offers various field lookups, which are illustrated in listing 8-42.

Listing 8-42. Django date and time queries with field lookups

from coffeehouse.online.models import Order
from django.utils.timezone import utc
import datetime

# Define custom dates
start_date = datetime.datetime(2017, 5, 10).replace(tzinfo=utc)
end_date = datetime.datetime(2018, 5, 21).replace(tzinfo=utc)

# Get Order recrods from custom dates, starting May 10 2017 to May 21 2018
Order.objects.filter(created__range=(start_date, end_date))

# Get Order records with exact start date
orders_2018 = Order.objects.filter(created__date=start_date)

# Get Order records with year 2018
Order.objects.filter(created__year=2018)

# Get Order records with month January, values can be 1 through 12 (1=January, 12=December).
Order.objects.filter(created__month=1)

# Get Order records with day 1, where values can be 1 through 31.
Order.objects.filter(created__day=1)

# Get Order records from January 1 2018
Order.objects.filter(created__year=2018,create__month=1,created__day=1)

# Get Order records that fall on week number 24 of the yr, where values can be 1 to 53.
Order.objects.filter(created__week=24)

# Get Order recrods that fall on Monday, where values can be 1 to 7 (1=Sunday, 7=Saturday).
Order.objects.filter(created__week_day=2)

# Get Order records made at 2:30pm using a time object
Order.objects.filter(created__time=datetime.time(14, 30))

# Get Order records made at 10am, where values can be 0 to 23 (0=12am, 23=11pm).
Order.objects.filter(date__hour=10)

# Get Order records made at the top of the hour, where values are 0 to 59.
Order.objects.filter(date__minute=0)

# Get Order records made the 30 second mark of every minute, where values are 0 to 59.
Order.objects.filter(date__second=30)

The first example in listing 8-42 uses the range field lookup which takes two Python datetime.datetime objects to define a date range for the query. Although range is the most flexible approach to create date and time queries, there are other field lookup alternatives that offer simpler syntax. The date field lookup allows you to create query for an exact date.

The year, month and day field lookups allow you to create queries that match records for a given year, month or day, respectively. In addition, if you look at the middle of listing 8-42, you'll notice it's also possible to create a query with multiple field lookups to match a combination of year, month and day.

Finally, toward the bottom half of listing 8-42 you can see the week and week_day field lookups can create a query for records that match a given week of the year or day of the week, respectively. In addition to the time field lookup designed to make a query based on a datetime.time object, as well as the hour, minute and second field lookups designed to create queries for records that match a given hour, minute or second, respectively.

Tip To make a query that only extracts dates and times from a record (and not the full record), look at the DISTINCT section under the date and time sub-section.

Can't find an SQL WHERE statement ? Use custom lookups, extra(), sub-queries or raw query

Although Django provides an extensive list of field lookups to generate various SQL WHERE statements, this doesn't mean you will always find the necessary field lookup to generate a desired SQL WHERE statements. In such cases, you have the following alternatives:

Create a custom lookup: Just like other Django custom constructs, you can create custom lookups with custom SQL WHERE statements^[2].

Use the extra() method: The Django model extra() method can also be used to create custom SQL WHERE statements^[3].

Use a sub-query: Sub-queries allow the creation of WHERE statements dependent on the results of other queries. A later section in this chapter addresses how to crate SQL sub-queries on Django models.

Raw SQL query: You can create a raw(open-ended) SQL query with verbatim SQL WHERE statements. A later section in this chapter addresses how to execute raw SQL queries on Django models.

DISTINCT queries

The SQL DISTINCT keyword is used to filter duplicate records and is supported in Django models through the distinct() method. By default, SQL DISTINCT and the Django distinct() method are applied against the contents of entire records. This means that unless a query limits its number of fields or a query spans multiple models, the distinct() method will never produce distinct results. Listing 8-43 illustrates several queries that use the distinct() method that better illustrate this behavior.

Listing 8-43. Django DISTINCT queries with distinct()

from coffeehouse.stores.models import Store

# Get all Store records number
Store.objects.all().count()
4

# Get all distinct Store record number
Store.objects.distinct().count()
4

# Get distinct state Store record values 
Store.objects.values('state').distinct().count()
1
# ONLY for PostgreSQL, distinct() can accept model fields to create DISTINCT ON query
Store.objects.distinct('state')

The first query in listing 8-43 gets the total count for all Store records, where as the second query gets the total count for distinct Store records. Notice how even though the second query uses the distinct() method both counts are the same, since there's at least one field value (e.g. id) across all records that's distinct.

The third query in listing 8-43 makes use of the values() method to restrict the query records to only the state field. Once this is done, the distinct() method is applied to the query followed by the count() method, to get the total number of distinct state values. By applying a selective query field method (e.g. values() or values_list()) prior to the distinct() method, the logic performed by the distinct() method produces a logical output.

The final example in listing 8-43 passes a model field to the distinct() method to produce an SQL DISTINCT ON query. This last distinct() method syntax is only supported for PostgreSQL databases which understand the SQL DISTINCT ON statement.

Dates and times queries: dates() and datetimes()

In addition to the distinct() method, Django also offer two special methods designed to extract DISTINCT date and time values from records. The dates() and datetimes() methods generate a list of datetime.date or datetime.datetime objects (respectively) based on model record values that match distinct dates or times.

The dates() method accepts three arguments, two required and one optional. The first argument (required) is a date field on which to perform the DISTINCT query, the second argument (required) is the date component on which to perform the DISTINCT query, which can be 'year', 'month' or 'day'. The third argument (optional) is the query order which defaults to 'ASC' for ascending, but can also be for 'DESC' descending.

The datetimes() method also accepts three arguments, two required and one optional. The first argument (required) is a date time field on which to perform the DISTINCT query, the second argument (required) is the date time component on which to perform the DISTINCT query, which can be 'year', 'month', 'day', 'hour', 'minute' or 'second'. The third argument (optional) is the query order which defaults to 'ASC' for ascending, but can also be for 'DESC' descending.

Listing 8-44 illustrates a series of examples using the dates() and datetimes() methods.

Listing 8-44 Django DISTINCT date and time queries with dates and datetimes() methods.

from coffeehouse.online.models import Order

# Get distinct years (as datetime.date) for Order objects 
Order.objects.dates('created','year')
# Outputs: <QuerySet [datetime.date(2017, 1, 1),datetime.date(2018, 1, 1)]>

# Get distinct months (as datetime.date) for Order objects
Order.objects.dates('created','month')
# Outputs: <QuerySet [datetime.date(2017, 3, 1),datetime.date(2017, 6, 1),datetime.date(2018, 2, 1)]>

# Get distinct days (as datetime.datetime) for Order objects
Order.objects.datetimes('created','day')
# Outputs: <QuerySet [datetime.datetime(2017, 6, 17, 0, 0, tzinfo=<UTC>)...]>

# Get distinct minutes (as datetime.datetime) for Order objects
Order.objects.datetimes('created','minute')
# Outputs: <QuerySet [datetime.datetime(2017, 6, 17, 3, 13, tzinfo=<UTC>)...]>

As you can see in listing 8-44, the dates() method produces a list of datetime.date objects generated from a given date component across all model records, where as the datetimes() method produces a list of datetime.datetime objects generated from a given date time component across all model records. Note the examples in listing 8-44 apply the dates() and datetimes() methods to all model records, but it's valid to use these methods on any query (i.e. filter() or exclude()).

Tip You can also use an aggregation query to count distinct values. See the Aggregation queries section for additional details on this process.

ORDER queries: order_by() and reverse()

SQL queries often use the ORDER keyword to tell the database engine to sort query results based on certain field or fields. This technique is helpful because it avoids the additional overhead of sorting records outside the database (i.e. in Python). Django models support the SQL ORDER statement through the order_by() method. The order_by() method accepts model fields as input to define the query order, a process that's illustrated in listing 8-45.

Listing 8-45. Django ORDER queries

from coffeehouse.stores.models import Store

# Get Store records and order by city (ORDER BY city)
Store.objects.all().order_by('city')

# Get Store recrods, order by name descending, email ascending (ORDER BY name DESC, email ASC)
Store.objects.filter(city='San Diego').order_by('-name','email')

The first example in listing 8-45 defines a query for all Store objects ordered by city. By default, order_by sets the order ascending (i.e. 'A' records first, 'Z' records last). The second example in listing 8-45 defines a Store query but with multple fields, so the query is first ordered by the first field and then with the second. In addition, the second example illustrates the use of the -- (minus) symbol to override the default ascending order, -name indicates to order records by name but in descending order (i.e. 'Z' records first, 'A' records last).

Tip You can declare the ordering Meta option on a model to set its default query ordering behavior, instead of declaring the order_by() method. See the previous chapter's Query meta options section.

In addition to the order_by() method, Django models also support the reverse() method which inverts the results of a QuerySet. The reverse() method works just like Python's standard reverse() method that inverts the order of a list, except it's designed to operate on Django QuerySet data structures before the data is materialized.

LIMIT queries

THE SQL LIMIT statement is used when you want to avoid reading an entire set of records in a query and instead limit the resulting records to a smaller set. The SQL LIMIT statement is helpful for cases when you purposely want to read query records gradually (e.g. large queries that are displayed on multiple pages, a.k.a. pagination) -- or you want to sample a query (e.g. get the first, last, latest or oldest record in a query).

Django models offers various mechanisms to generate LIMIT queries described in the next sections.

LIMIT and OFFSET queries: Python slice syntax

SQL LIMIT queries are often accompanied by the OFFSET statement, the last of which is used to extract records starting from a given point in the whole set of records. Django models support the creation of SQL queries with LIMIT and OFFSET statements using standard Python slice syntax (i.e. the same syntax used to split lists). Listing 8-46 illustrates how to generate LIMIT and OFFSET queries.

Listing 8-46. Django LIMIT and OFFSET queries with Python slice syntax

from coffeehouse.stores.models import Store
from coffeehouse.items.models import Item

# Get the first five (LIMIT=5) Store records that have state 'CA'
Store.objects.filter(state='CA')[:5]

# Get the second five (OFFSET=5,LIMIT=5) Item records (after the first 5)
Item.objects.all()[5:10]

# Get the first (LIMIT=1) Item object
Item.objects.all()[0]

As you can see in listing 8-46, the technique to generate LIMIT and OFFSET queries is through Python's slice syntax applied directly to QuerySet data structures. In case you've never used Python's slice syntax, the technique is straightforward: The syntax QuersySet[start:end] gets items from start to end-1 of a QuerySet, the syntax QuerySet[start:] gets items from start through the rest of a QuerySet and the syntax QuerySet[:end] gets items from the beginning of a QuerySet through end-1.

Pseudo LIMIT 1 order queries: first() and last()

Under certain circumstances, the SQL LIMIT statement is used to get a single record that's the first or last record in a set of records. Django models support the first() and last() methods which generate a LIMIT 1 query just as if you created a query with the slice syntax [0] -- describe in listing 8-46.

The first() and last() methods are typically preceded by the order_by() model method, in order to guarantee an expected record order and thus get the first or last record of said records. If the first() and last() methods are applied without the order_by() model method, then the query is applied against the default ordering mechanism -- by the id field -- and thus first() returns the record with the first id value and last() returns the record with the last id value.

For example, the query Store.objects.filter(state='CA').first() gets the first Store record with state='CA' with the lowest id (since order defaults to id), a query that's equivalent to Store.objects.filter(state='CA')[0]. The query Item.objects.all().order_by('name').last() gets the last Item record with the name that comes last in the alphabet (since order is specified by name), a query that's equivalent to Item.objects.all().order_by('name').reverse()[0].

Pseudo LIMIT 1 date and time queries: latest() and earliest()

For SQL LIMIT queries associated with dates or times, Django offers the latest() and earliest() methods to obtain the most recently or first created model records (respectively) based on a date field. Both the latest() and earliest() methods accept a date field on which to perform a query and provide much shorter syntax to deal with LIMIT queries related to dates or times vs. the first() and last(). This is because latest() and earliest() methods automatically perform the order_by() operation on the field provided as an argument.

For example, Order.objects.latest('created') gets the most recent Order record based on the created field, where as Order.objects.earliest('created') gets the oldest Order record based on the created field.

Tip Use the get_latest_by Meta option in a model to set a default field on which to execute the latest() and earliest() methods. See the previous chapter on model Meta options for additional details.

Merge queries

SQL queries often need to be merged to produce different sets of results, such as combining the records of multiple SQL queries or obtaining common records between multiple SQL queries. Django supports various ways to merge SQL queries, both as QuerySet data structures, as well as through SQL query statements like UNION, INTERSECT and EXCEPT.

QuerySet merger: pipe and itertools.chain

As you've learned throughout this chapter, Django models most often use QuerySet data structures to represent SQL queries. Such QuerySet data structures often require to be merged, to present a larger set of results and avoid having to perform new database queries. Listing 8-47 illustrates the two syntax variations available to merge QuerySet data structures.

Listing 8-47. Combine two Django queries with | (pipe) and itertools.chain

from coffeehouse.items.models import Item, Drink
from itertools import chain

menu_sandwich_items = Item.objects.filter(menu__name='Sandwiches')
menu_salads_items = Item.objects.filter(menu__name='Salads')

drinks = Drink.objects.all()

# A pipe applied to two QuerySets generates a larger QuerySet
lunch_items = menu_sandwich_items | menu_salads_items

# | can't be used to merge QuerySet's with different models
# ERROR menu_sandwich_items | drinks
# itertools.chain generates a Python list and can merge different QuerySet model types
lunch_items_with_drinks = list(chain(menu_sandwich_items, drinks))

The first option in listing 8-47 uses the | (pipe) operator to combine two QuerySet data structures. This technique produces yet another QuerySet data structure, but has the caveat of only working on QuerySet's that use the same model (e.g. Item).

The second option in listing 8-47 uses the Python itertools package to merge two QuerySet data structure with the chain() method. This technique produces a standard Python list -- with the respective model objects -- and is the more flexible option because it can combine QuerySet data structures even if they use different models (e.g. Item and Drink).

UNION queries: union()

The SQL UNION statement is used to merge two or more queries directly in the database. Unlike the previous merge query techniques -- illustrated in listing 8-47 -- which take place in Django/Python, UNION queries are done entirely by the database engine. Django supports the SQL UNION statement through the union() method, as illustrated in listing 8-48.

Listing 8-48. Merge Django queries with union()

from coffeehouse.items.models import Item
menu_breakfast_items = Item.objects.filter(menu__name='Breakfast')
menu_sandwich_items = Item.objects.filter(menu__name='Sandwiches')
menu_salads_items = Item.objects.filter(menu__name='Salads')

# All items merged with union() 
all_items = menu_breakfast_items.union(menu_sandwich_items,menu_salads_items)
print(all_items.query)
SELECT "items_item"."id", "items_item"."menu_id" ... WHERE "items_menu"."name" = Breakfast UNION
SELECT "items_item"."id", "items_item"."menu_id" ... WHERE "items_menu"."name" = Sandwiches UNION
SELECT "items_item"."id", "items_item"."menu_id"... WHERE "items_menu"."name" = Salads

Listing 8-48 first declares three standard SQL queries that produce QuerySet data structures. Next, notice how the union() method is linked to one of the queries and the remaining queries are passed as arguments. Finally, listing 8-48 illustrates how the results of the union() method produce a query with multiple SQL UNION statements that merge the individual queries.

In addition to the union() method accepting different QuerySet instances as arguments, the union() method also accepts the optional keyword all argument which is set to False. By default, the union() method ignores duplicates values across QuerySet instances, however, you can set the all argument to True to tell Django to merge duplicate records (e.g. menu_breakfast_items.union(menu_sandwich_items,menu_salads_items, all=True)).

INTERSECT queries: intersection()

The SQL INTERSECT statement is used to obtain records that intersect (i.e. are present) across multiple queries. Django supports the SQL INTERSECT statement through the intersection() method, as illustrated in listing 8-49.

Listing 8-49. Intersect (Common) Django query records with intersection()

from coffeehouse.items.models import Item

all_items = Item.objects.all()
menu_breakfast_items = Item.objects.filter(menu__name='Breakfast')

# Intersected (common) records merged with intersect() 
intersection_items = all_items.intersection(menu_breakfast_items)
print(intersection_items.query)
SELECT "items_item"."id", "items_item"."menu_id", "items_item"."name"... INTERSECT
SELECT "items_item"."id", "items_item"."menu_id", "items_item"."name"... WHERE "items_menu"."name" = Breakfast

Listing 8-49 first declares two standard SQL queries that produce QuerySet data structures. Next, notice how the intersection() method is linked to one of the queries and the remaining query is passed as an argument. Finally, listing 8-49 illustrates how the results of the intersection() method produce a query with an SQL INTERSECTION statements to produce the common records across queries.

The intersection() method only accept QuerySet instances as arguments. In addition, be careful when declaring more than two QuerySet instances on an intersection() query, as only records that are present in all QuerySet instances form part of the final query result.

EXCEPT queries: difference()

The SQL EXCEPT statement is used to obtain records that are present in a query, but missing in other queries. Django supports the SQL EXCEPT statement through the difference() method, as illustrated in listing 8-50.

Listing 8-50. Except Django query records with difference()

from coffeehouse.items.models import Item
all_items = Item.objects.all()

menu_breakfast_items = Item.objects.filter(menu__name='Breakfast')
menu_sandwich_items = Item.objects.filter(menu__name='Sandwiches')
menu_salads_items = Item.objects.filter(menu__name='Salads')

# Extract records in all_items, except those in:
#     menu_breakfast_items, menu_sandwich_items & menu_salads_items
ex_items = all_items.difference(menu_breakfast_items, menu_sandwich_items, menu_salads_items)
print(ex_items.query)
SELECT "items_item"."id", "items_item"."menu_id", "items_item"."name"...EXCEPT
SELECT "items_item"."id", "items_item"."menu_id", "items_item"."name"... EXCEPT
SELECT "items_item"."id", "items_item"."menu_id", "items_item"."name", ... EXCEPT
SELECT "items_item"."id", "items_item"."menu_id", "items_item"."name" ... WHERE "items_menu"."name" = Salads

Listing 8-50 first declares four standard SQL queries that produce QuerySet data structures. Next, notice how the difference() method is called on the all_items query and the remaining queries are passed as arguments to be excluded from the all_items query. Finally, listing 8-50 illustrates how the results of the difference() method produce a query with multiple SQL EXCEPT statements that exclude query records from the parent query.

Aggregation queries

SQL queries sometimes need to produce values derived from the core fields contained in Django models (e.g. mathematical calculations such as counts, averages, maximum or minimum values from sets of records). Storing this type of aggregate information as individual Django model fields is redundant -- since it can be derived from core data -- and calculating this data outside the context of a database is also wasteful (e.g. reading all records and producing the aggregate results in Python).

SQL offers the necessary statements for a database to solve this problem, through aggregation functions. An aggregation function forms part of an SQL query, which is executed by the database engine and returned as a standalone result -- when used in conjunction with the aggregate() method -- or as an additional field along with the resulting SQL response -- when used in conjunction with the annotate() method. Django supports aggregation queries through a series of methods the include count(), aggregate() and annotate(), as well as aggregation classes.

COUNT queries: count() method and Count() class

The SQL COUNT aggregation function is used in cases where you only need to get the number of records that match a certain criteria, rather than reading all records the make up a query. Queries that use SQL COUNT are also more efficient because it's the database engine that makes the calculation, instead of getting all the data and making the calculation in Python.

Django models supports the SQL COUNT aggregation function through the count() method and the aggregate Count class. Both variations are illustrated in listing 8-51.

Listing 8-51. Django COUNT queries with aggregate(), annotate() and Count()

from coffeehouse.stores.models import Store
from django.db.models import Count

# Get the number of stores (COUNT(*))
stores_count = Store.objects.all().count()
print(stores_count)
4

# Get the number of stores that have city 'San Diego' (COUNT(*))
stores_san_diego_count = Store.objects.filter(city='San Diego').count()

# Get the number of emails, NULL values are not counted (COUNT(email))
emails_count = Store.objects.aggregate(Count('email'))
print(emails_count)
{'email__count': 4}

# Get the number of emails, NULL values are not counted (COUNT(email) AS "coffeehouse_store_emails_count")
emails_count_custom = Store.objects.aggregate(coffeehouse_store_emails_count=Count('email'))
print(emails_count_custom)
{'coffeehouse_store_emails_count': 4}

# Get number of distinct Amenities in all Stores, NULL values not counted (COUNT(DISTINCT name))
different_amenities_count = Store.objects.aggregate(Count('amenities',distinct=True))
print(different_amenities_count)
{'amenities__count': 5}

# Get number of Amenities per Store with annotate
stores_with_amenities_count = Store.objects.annotate(Count('amenities'))

# Get amenities count in individual Store 
stores_With_amenities_count[0].amenities__count

# Get number of Amenities per Store with annotate and custom name
stores_amenities_count_custom = Store.objects.annotate(amenities_per_store=Count('amenities'))
stores_amenities_count_custom[0].amenities_per_store

The first two examples in listing 8-51 append the count() method as the last part of a Django query to get a total count.

The third example in listing 8-51 uses the aggregate() function and the aggregate Count class to get the total count of emails in Store records. Notice how a query with the aggregate() method produces a dictionary, where the key is the counted field -- in this case email -- suffixed with the __count to indicate aggregate class, and the dictionary value is the resulting count. The fourth example in listing 8-51 is very similar to the third one, except it prefixes the aggregate Count class with a custom string to simulate the SQL AS keyword, so the resulting dictionary value uses the custom string coffeehouse_store_emails_count as the key result.

Note If no string is assigned to an aggregate class (e.g. Count) in a query, the resulting query output defaults to: <field>__<aggregate_class>.

The fifth example in listing 8-51 illustrates how the aggregate Count class can accept the optional distinct=True argument to omit duplicate values in the count. In this case, a count is made for all amenities associated with Store records, but the count only reflects distinct amenities values.

Although the aggregate() method produces aggregation results, it's limited to only producing the aggregation result by itself, that is, it requires additional queries to get the core data from where the aggregation result was calculated. The annotate() method solves this problem, as show in listing 8-51.

The last two examples in listing 8-51 use the annotate() method to add an additional field to a query's records to hold an aggregate result. The second to last example in listing 8-51 adds the amenities__count field via the aggregate Count() class to all Store records. And the last example in listing 8-51, assigns a custom string to the aggregate Count() class to create the custom amenities_per_store field to hold the amenities count for all Store records.

MAX, MIN, SUM, AVG, VARIANCE and STDDEV queries: Max(), Min(), Sum(), Avg(), Variance() and StdDev() classes

In addition to the SQL COUNT aggregation function, SQL queries also support other aggregation functions for mathematical operations that are best done in the database. These SQL aggregation functions include MAX to get a maximum value from a set of records, MIN to get a minimum value from a set of records, SUM to a sum of values from a set of records, AVG to get the average from a set of records, VARIANCE to get the statistical variance of values from a set of records and STDDEV to get the statistical deviation from a set of records.

Django models support all the previous SQL aggregation functions through the use of aggregation classes -- just like Count() aggregation described in listing 8-51. Therefore to make use of these additional SQL aggregation functions, you use a Django model's aggregate() or annotate() methods in conjunction with the relevant aggregation class, a processed that's llustrated in listing 8-52.

Listing 8-52 Django MAX, MIN,SUM, AVG, VARIANCE and STDDEV queries with Max(), Min(), Sum(), Avg(), Variance() and StdDev() classes

from coffeehouse.items.models import Item
from django.db.models import Avg, Max, Min
from django.db.models import Sum
from django.db.models import Variance, StdDev

# Get the average, maximum and minimum number of stock for all Item records
avg_max_min_stock = Item.objects.aggregate(Avg('stock'), Max('stock'), Min('stock'))
print(avg_max_min_stock)
{'stock__avg': 29.0, 'stock__max': 36, 'stock__min': 27}

# Get the total stock for all Items
item_all_stock = Item.objects.aggregate(all_stock=Sum('stock'))
print(item_all_stock)
{'all_stock': 261}

# Get the variance and standard deviation for all Item records
# NOTE: Variance & StdDev return the population variance & standard deviation, respectively.
#       But it's also possible to return sample variance & standard deviation, 
#       using the sample=True argument
item_statistics = Item.objects.aggregate(Variance('stock'), std_dev_stock= StdDev('stock'))
{'std_dev_stock': 5.3748, 'stock__variance': 28.8888}

As you can see in the first example in listing 8-52, it's possible to define multiple aggregate classes to a single query as part of the aggregate() method, in this case the query gets the average, minimum and maximum stock values across all Item records.

The second example in listing 8-52 gets the sum of all stock values across all Item records by using the aggregate Sum class. Notice how it's possible in this second example to prefix the aggregate class with a custom string to act as an SQL AS keyword, in order for the query to output the results with a different value than the aggregate class name. Finally, the last example in listing 8-52 calculates the variance and standard deviation for all stock values across all Item records.

Tip If want to perform more complex aggregation queries, such as multi-field math operations (e.g. multiplication), see the F expressions sub-section.

Tip If want to perform more complex aggregation queries, see the section on Model queries with raw (open-ended) SQL.

Expression and function queries

SQL queries irrespective of their many statements generally reference values provided by the calling environment. For example, when you create a query to get all Store records that have certain state value, Django/Python provides a value reference for the state, similarly, if you create a query to get all Item records that belong to a certain Menu model, Django/Python provides a value reference for the Store.

For certain SQL queries though, it's necessary to use references that point toward data in the actual database. This is necessary because the results for certain SQL queries depend on the data present in the database, or because, manipulating the data outside the context of a database (i.e. Python) represents an additional effort that can easily be solved in SQL.

You already learned about this technique in the past section on Aggregation queries, where an SQL query can tell a database engine to calculate things like counts and averages, without the need to pull the data and do the operations outside the database (i.e. in Python). Aggregation queries rely on a special sub-set of expressions properly called aggregation expressions, but in this upcoming sections you'll learn how Django supports many other types of SQL expressions.

Another SQL technique designed in the same spirit of SQL expressions to favor the delegation of work to the database engine are: SQL functions. SQL functions are intended to allow a database to alter the results of query (e.g. concatenate two fields or transform a field to upper/lower case) and alleviate the need to do such tasks in the calling party environment (i.e. Django/Python). In the upcoming section you'll also learn about different SQL functions supported by Django models.

SQL expression queries: F expressions

Django F expressions are among the most common type of SQL expressions you'll use in Django models. At the start of this chapter you were exposed to the utility of F expressions when you learned how it's possible to update a record in a single step, and let the database engine perform the logic without the need pull the record out of the database.

Through an F expression, it's possible to reference a model field in a query and let the database perform an operation on the model field value without the need to pull the data from the database. In turn, this not only provides a more succinct query syntax -- a single update query, instead of two (one to read, one to update) -- it also avoids 'race conditions'^[4].

Listing 8-53 illustrates various ways F expressions can be used on update queries.

Listing 8-53 Django F() expression update queries

from coffeehouse.items.models import Item
from django.db.models import F

# Get single item
egg_biscuit = Item.objects.get(id=2)

# Check stock
egg_biscuit.stock
2

# Add 10 to stock value with F() expression
egg_biscuit.stock = F('stock') + 10

# Trigger save() to apply F() expression
egg_biscuit.save()

# Check stock again
egg_biscuit.stock
<CombinedExpression: F(stock) + Value(10)>

# Ups, need to re-read/refresh from DB
egg_biscuit.refresh_from_db()

# Check stock again
egg_biscuit.stock
12

# Decrease stock value by 1 for Item records on the Breakfast menu
breakfast_items = Item.objects.filter(menu__name='Breakfast')
breakfast_items.update(stock=F('stock') -- 1)

# Increase all Item records stock by 20 
Item.objects.all().update(stock=F('stock') + 20)

The first example in listing 8-53 reads a single model record and applies an F() expression to the stock field. Once the F() expression is applied, it's necessary to call the save() method on the record for the database to trigger the update. Next, notice that in order for the model record reference to reflect the results of the F() expression, you must re-read the record from the database -- in this case with the refresh_from_db() method -- given the database is the only party aware of the result of the update operation.

Next in listing 8-53, you can see how it's also possible to perform a subtraction operation on an F() expression, as well as apply an F() expression to all the records in a QuerySet through the update() method.

In addition to updating records without the need to extract data from the database, F() expressions can also be used in database read operations. F() expressions are helpful for read queries and aggregation queries where the results are best determined by the database engine, as shown in listing 8-54.

Listing 8-54. Django F() expressions in read queries and aggregate queries

from django.db.models import  F, ExpressionWrapper, FloatField
from coffeehouse.items.models import Drink, Item 

calories_dbl_caffeine_drinks = Drink.objects.filter(item__calories__gt=F('caffeine')*2)

items_with_assets = Item.objects.annotate( 
                                 assets=ExpressionWrapper(F('stock')*F('price'),
                                 output_field=FloatField()))

Notice how the first query example in listing 8-54 lacks any fixed values and is instead composed of references that are checked by the database engine to return records that match the condition. In this case, the query obtains all Drink records that have calories greater than two times their caffeine content, where it's the database engine -- via the F() expression -- tasked with determining which Drink records comply with this rule.

The second example in listing 8-54 creates an aggregate query from two F() expressions. In this case, a new field called assets is calculated with the annotate() method, by multiplying the value of a record's stock and price fields via F() expressions. Unlike the aggregation queries examples in the previous section dedicated to aggregation queries, this aggregation has two important differences and arguments:

ExpressionWrapper.- Because the aggregate query in listing 8-54 is composed of multiple model fields, it's necessary to delimit its scope by wrapping the aggregate query in the ExpressionWrapper statement.
output_field.- When an aggregate query is composed of multiple model fields and the data types differ, it's necessary to specify output_field with a model data type. In listing 8-54, because stock is an IntegerField model field and price is a FloatField model field, the output_field tells Django to generate the aggregated assets field as a FloatField, thus avoiding data type ambiguity.

SQL function queries: Func expressions & Django database functions

Func expressions are another expression sub-set supported by Django models, which have the same purpose as other SQL expressions: to use the database to execute operations, instead of fetching data and later performing the operation outside the database (i.e. in Python).

Func expressions are used in Django to trigger the execution of database functions. Unlike F expressions which are used to perform basic operations against model fields, Func expressions are used to execute more sophisticated functions supported by databases and run them against models fields.

Listing 8-55 illustrates an example of a Func expression that calls an SQL function, as well as a couple of Django database functions that simulate SQL functions.

Listing 8-55. Django Func() expressions for SQL functions and Django SQL functions

from django.db.models import  F, Func, Value
from django.db.models.functions import Upper, Concat
from coffeehouse.stores.models import Store

# SQL Upper function call via Func expression and F expression
stores_w_upper_names = Store.objects.annotate(name_upper=Func(F('name'), function='Upper'))
stores_w_upper_names[0].name_upper
'CORPORATE'
stores_w_upper_names[0].name
'Corporate'

# Equivalent SQL Upper function call directly with Django SQL Upper function
stores_w_upper_names_function = Store.objects.annotate(name_upper=Upper('name'))
stores_w_upper_names_function[0].name_upper
'CORPORATE'

# SQL Concat function called directly with Django SQL Concat function 
stores_w_full_address = Store.objects.annotate(full_address=
                               Concat('address',Value(' - '),'city',Value(' , '),'state'))
stores_w_full_address[0].full_address
'624 Broadway - San Diego , CA'
stores_w_full_address[0].city
'San Diego'

The first example in listing 8-55 makes use of the Func() expression to generate the additional name_upper field via annotate(). The purpose of the additional name_upper field is to get the name of all Store records in an upper case format, a process that fits perfectly with the SQL UPPER function. In the case of listing 8-55, the Func() expression declares two arguments: an F expression to specify the model field on which to apply the function and the function argument to specify the SQL function to use. Once the query is created, you can see in listing 8-55, each record has access to the additional name_upper field with an upper case version of the name field, as well as access to the other model fields.

Although Func() expressions are the most flexible option to generate Django model queries with SQL expressions, Func() expressions can be verbose for default scenarios. Django offers a quicker alternative to generate SQL expressions via SQL functions that are part of the django.db.models.functions package.

The second query in listing 8-55 is equivalent to the first query, but notice this variation uses the Django database Upper() function as an argument of the annotate() method, similar to how Django aggregate classes are declared in annotate() statements.

The third example in listing 8-55 generates the additional full_address field via annotate() and makes use of the Django database Concat() function. The purpose of the Concat() function is to concatenate the values of multiple model fields. In the case of listing 8-55, the Concat() function concatenate the values of the Store model's address, city and state. In order to leave spaces between the concatenated field values, the Concat() function uses the Django Value() expression to output verbatim separators and spaces. Once the query is created, you can see in listing 8-55, each record has access to the additional full_address field with a concatenated value of the address, city and state fields, as well as access to the other model fields.

Django includes over a dozen database functions in the django.db.models.functions package^[5] for strings, dates and other data types you can leverage as SQL functions in queries.

SQL sub-queries: Subquery expressions

SQL sub-queries are queries that are nested inside other standard CRUD queries or inclusively other sub-queries. Most SQL sub-queries are used under two scenarios. The first scenario occurs when you need to create SQL queries with related fields that span multiple tables, yet the underlying tables don't have an explicit relationship between one another.

This first SQL sub-query scenario is common for queries involving multiple Django models with missing relationship data types (i.e. OneToOneField, ForeignKey and ManyToManyField). Listing 8-56 illustrates this SQL sub-query scenario solved through the use of Subquery expressions.

Listing 8-56. Django Subquery expression with SQL sub-query to get related model data

from django.db.models import OuterRef, Subquery

class Order(models.Model):
    created = models.DateTimeField(auto_now_add=True)

class OrderItem(models.Model):
    item = models.IntegerField()
    amount = models.IntegerField()
    order = models.ForeignKey(Order) 

# Get Items in order number 1
order_items = OrderItem.objects.filter(order__id=1)

# Get item 
order_items[0].item
1

# Get item name ?
# OrderItem item field is IntegerField, lacks Item relationship
# Create sub-query to get Item records with id
item_subquery = Item.objects.filter(id=(OuterRef('id')))

# Annotate previous query with sub-query
order_items_w_name = order_items.annotate(item_name=Subquery(item_subquery.values('name')[:1]))

# Output SQL to verify
print(order_items_w_name.query)
SELECT `online_orderitem`.`id`, `online_orderitem`.`item`,
    `online_orderitem`.`amount`, `online_orderitem`.`order_id`, 
     (SELECT U0.`name` FROM `items_item` U0 WHERE U0.`id` = (online_orderitem.`id`) LIMIT 1) 
     AS `item_name` FROM `online_orderitem` WHERE `online_orderitem`.`order_id` = 1

# Access item and item_name
order_items_w_name[0].item
1

order_items_w_name[0].item_name
'Whole-Grain Oatmeal'

The first lines in listing 8-56 show the Order and OrderItem models, including a query that gets all the OrderItem records that belong to Order number 1. Next, you can see that although the OrderItem model has an item field, its value is an integer. This presents a problem because it isn't possible to obtain the name and other properties associated with an item field integer value, or in other words, the OrderItem records are missing a relationship to the Item model. This problem can be solved with a sub-query.

Next in listing 8-56, the Item.objects.filter(id=(OuterRef('id'))) sub-query is declared to get all the Item records by id, which is the value the main OrderItem expects to map to item values. The special OuterRef syntax work like an F expression to be evaluated until the parent query is resolved, after all, the Item records to get by id are dependent on the parent query (e.g. The sub-query should only Item records by id for only those items in an OrderItem record).

Once the sub-query is defined in listing 8-56, it's linked via the annotate() method and the Subquery() expression to the initial OrderItem query. Next, you can see the SQL generated by the query contains a sub-query referencing the Item model. Finally, listing 8-56 illustrates the output of the additional item_name field on the OrderItem query that's generated via a sub-query.

The second scenario involving sub-queries is when an SQL query must generate a WHERE statement with values that are dependent on the results another SQL query. This scenario is illustrated in listing 8-57.

Listing 8-57. Django Subquery expression with SQL sub-query in WHERE statement

# See listing 8-56 for referenced model definitions
from coffeehouse.online.models import Order
from coffeehouse.items.models import Item
from django.db.models import OuterRef, Subquery

# Get Item records in lastest Order to replenish stock
most_recent_items_on_order = Order.objects.latest('created').orderitem_set.all()

# Get a list of Item records based on recent order using a sub-query
items_to_replenish = Item.objects.filter(id__in=Subquery(
                                    most_recent_items_on_order.values('item')))
print(items_to_replenish.query)
SELECT `items_item`.`id`, `items_item`.`menu_id`, `items_item`.`name`, `items_item`.`description`,
`items_item`.`size`, `items_item`.`calories`,
 `items_item`.`price`, `items_item`.`stock` FROM `items_item` WHERE `items_item`.`id` 
  IN (SELECT U0.`item` FROM `online_orderitem` U0 WHERE U0.`order_id` = 1)

The first step in listing 8-57 gets all OrderItem records from the latest Order record, with the purpose to detect which Item stock to replenish. However, because OrderItem records use a plain integer id to reference Item records, it's necessary to create a sub-query which gets all Item records based on the OrderItem integer reference.

Next in listing 8-57, a query is made for Item records whose id is contained in a sub-query. In this case, a Subquery expression is used to point toward the most_recent_items_on_order query that only gets the item values (i.e. integer values) from the most recent Order record.

Finally, listing 8-57 illustrates how the generated query uses a WHERE statement that makes use of a sub-query.

JOIN QUERIES

The SQL JOIN keyword is used to produce queries from multiple database tables. Django supports JOIN queries for related models through the select_related() method, described earlier in the CRUD relationship records across Django models.
If you want to create a JOIN query between tables that don't have a Django model relationship, you can use a raw SQL query, described in the next section.