I am struggling to create the correct prefetch behavior in Django. Here is the outline of the problem:
- Each Account has DailyQuotes, updated daily at different times (think snapshot)
- Need to query all of those DailyQuotes, and only get the most recent quotes for each account
Here are the models:
class Account(models.Model): name = models.TextField(default="") ... class DailyQuotes(models.Model): account = models.ForeignKey(Account, related_name="quote", on_delete=models.CASCADE) date = models.DateField(default=None) ...
Currently the query inside of my view for this looks like:
acc_ids = [1,2,3] max_date = DailyQuotes.objects.aggregate(Max("date"))["date__max"] accounts = ( Account.objects.filter(id__in=acc_ids) .prefetch_related( Prefetch( "quote", queryset=DailyQuotes.objects.filter(date=date), ), ) ) # Feed into serializer, etc
This works and generates 3 queries: 1 for the max date, 1 for accounts, and 1 for the quotes. The problem with this solution is that if one account has more up to date DailyQuotes, then the other accounts will return no quotes. So I need to get the latest DailyQuotes for each account based on the max date for that account, not all accounts.
I have generated the SQL query that does what I want, but turning it into Django code has been giving me issues. I could execute the raw SQL but I would like to keep it in Django. Here is what the current SQL looks like and what it needs to be:
Current prefetch query (generated by Django):
SELECT ... FROM dailyquotes WHERE (dailyquotes.date = 2022-05-05 AND dailyquotes.account_id IN (1,2,3))
Required prefetch query (or similar):
SELECT ... FROM dailyquotes dq JOIN (SELECT account_id, MAX(date) AS date__max FROM dailyquotes WHERE account_id in (1,2,3) group by account_id) dates ON dq.account_id = dates.account_id AND dq.date = dates.date__max
Any help would be greatly appreciated!
With some help from SamSparx, here is what I came up with:
acc_ids = [1,2,3] max_dates = (DailyQuotes.objects.filter(account_id__in=acc_ids) .values("account_id") .annotate(max_date=Max("date"))) recordsets = None # get the recordsets for max_date in max_dates: qs = DailyQuotes.objects.filter( account_id=max_date["account_id"], date=max_date["max_date"]) if recordsets is None: recordsets = qs else: recordsets = recordsets | qs accounts = (Account.objects.filter( id__in=permissions["acc_ids"].values()) .prefetch_related( Prefetch( "quote", queryset=recordsets, ), ) )
If you are not tied to prefetch_related you can do it in Django via DailyQuotes in 2 calls – 1 to gather the max dates and 1 for the final recordset (even using select_related if you want accompanying account info).
from django.db.models import Max #define lists acc_ids = [0,1,2] max_dates =  recordsets =  final_recordset =  #get the max date for each account ID max_dates = DailyQuotes.objects.filter(account_id__in=acc_ids).values('account_id').annotate(max_date = Max('date')) #get the recordsets for max_date in max_dates: qs = DailyQuotes.objects.filter(account_id = max_date['account_id'], date = max_date['max_date'] ) #qs = DailyQuotes.objects.filter(account_id = max_date['account_id'], date = max_date['max_date']).select_related('account') if you need associated account info recordsets.append(qs) #combine the recordsets for serialising - you may want to modify this based on length of recordsets list (in case of empty accounts) for robustness final_recordset = recordsets.union( recordsets, recordsets)
Answered By – SamSparx