Discussion:
Max function and grouping with Oracle
Dan Davis
2018-11-26 17:32:09 UTC
Permalink
I have a parent model that has a relationship to some data that is changing:


class Parent(models.Model):
name = models.CharField(...)
created_timestamp = models.DateTimeField(auto_now_add=True, null=True)
updated_timestamp = models.DateTimeField(auto_now=True, null=True)


class Child(models.Model):
parent = models.ForeignKey(Parent, on_delete=models.CASCADE)
created_timestamp = models.DateTimeField(auto_now_add=True, null=True)
updated_timestamp = models.DateTimeField(auto_now=True, null=True)



I am trying to annotate a query with a Max updated_timestamp for the
children:

Parent.objects.annotate(child_updated_timestamp=models.Max('child__updated_timestamp',
output_field=models.DateTimeField()))


It seems like Oracle backend is attempting to GROUP BY every field in the
child model.

Can anyone tell me whether they've seen anything like this and how to
constrain the GROUP BY?

Thanks,

-Dan
--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-users+***@googlegroups.com.
To post to this group, send email to django-***@googlegroups.com.
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/4a41b74a-a51b-4282-b292-2e6a59fe1bd7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Dan Davis
2018-11-26 18:03:19 UTC
Permalink
Looks like there is no problem here. Django groups by whatever is in the
query, so that this would be done as follows:

Parent.objects.values('id').annotate(child_updated_timestamp=models.Max('child_updated_timestamp'))
Post by Dan Davis
name = models.CharField(...)
created_timestamp = models.DateTimeField(auto_now_add=True, null=True)
updated_timestamp = models.DateTimeField(auto_now=True, null=True)
parent = models.ForeignKey(Parent, on_delete=models.CASCADE)
created_timestamp = models.DateTimeField(auto_now_add=True, null=True)
updated_timestamp = models.DateTimeField(auto_now=True, null=True)
I am trying to annotate a query with a Max updated_timestamp for the
Parent.objects.annotate(child_updated_timestamp=models.Max('child__updated_timestamp',
output_field=models.DateTimeField()))
It seems like Oracle backend is attempting to GROUP BY every field in the
child model.
Can anyone tell me whether they've seen anything like this and how to
constrain the GROUP BY?
Thanks,
-Dan
--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-users+***@googlegroups.com.
To post to this group, send email to django-***@googlegroups.com.
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/71f1bd0e-73e9-45ea-8c94-3d83e2015bc9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Simon Charette
2018-11-27 08:46:08 UTC
Permalink
Dan,

The root of the issue here is that you query is asking to retrieve all
fields from
the Parent table and perform an aggregation on the JOIN'ed child table.

That results in the following query

SELECT parent.*, MAX(child.updated_timestamp)
FROM parent
JOIN child ON (child.parent_id = parent.id)
...

Since you SELECT all parent columns Django has no choice but to GROUP BY
parent.* as well because on Oracle GROUP BY must contain all non-aggregate
SELECT clauses. That is not the case on MySQL and PostgreSQL for example.

If you use .values('id') as you described you'll only SELECT parent.id
which will
solve your issue as you've come to discover.

FWIW .only('id') would have worked as well and returned Parent objects
instead
of a dict but all fields accesses would have been deferred.

Also there's a ticket tracking support for aggregation through subquery
that would
have worked for your case[0]

Parent.objects.annotate(
child_updated_timestamp=Child.objects.filter(
parent=OuterRef('pk'),
).aggregate(Max('updated_timestamp'))
)

Cheers,
Simon

[0] https://code.djangoproject.com/ticket/28296
Post by Dan Davis
name = models.CharField(...)
created_timestamp = models.DateTimeField(auto_now_add=True, null=True)
updated_timestamp = models.DateTimeField(auto_now=True, null=True)
parent = models.ForeignKey(Parent, on_delete=models.CASCADE)
created_timestamp = models.DateTimeField(auto_now_add=True, null=True)
updated_timestamp = models.DateTimeField(auto_now=True, null=True)
I am trying to annotate a query with a Max updated_timestamp for the
Parent.objects.annotate(child_updated_timestamp=models.Max('child__updated_timestamp',
output_field=models.DateTimeField()))
It seems like Oracle backend is attempting to GROUP BY every field in the
child model.
Can anyone tell me whether they've seen anything like this and how to
constrain the GROUP BY?
Thanks,
-Dan
--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-users+***@googlegroups.com.
To post to this group, send email to django-***@googlegroups.com.
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/1e2feec6-bd1f-4ea6-9c85-6f8abd8aaf12%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Continue reading on narkive:
Loading...