While trying to understand the inner workings of Django last week, I learned about traceback, select_related, and the N+1 query problem.
To give some context, I was creating a Django list view and was curious about how the get_queryset() method that we define gets invoked. By using traceback to view the call stack, I found out that get_queryset() is called by BaseListView under the hood. The important part is not the traceback itself, but gradually understanding the inner workings of Django. For my own learning, the framework shouldn't be a mysterious entity, and I want to keep reminding myself of that.
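Here is a minimal sketch of that idea that runs without Django installed. The BaseListView below is a simplified stand-in I wrote for illustration, not Django's real class, and ReadingRecordListView is a hypothetical view name. The only thing it borrows from Django is the fact that the base view's get() calls self.get_queryset().

```python
import traceback


class BaseListView:
    """Simplified stand-in for Django's BaseListView (not the real class)."""

    def get(self, request):
        # Like Django's BaseListView.get(), this calls self.get_queryset().
        self.object_list = self.get_queryset()
        return self.object_list


class ReadingRecordListView(BaseListView):
    """Hypothetical list view; the name is an assumption for this example."""

    def get_queryset(self):
        # extract_stack() returns the call stack, oldest frame first.
        # The last entry is this method; the one before it is our caller.
        stack = traceback.extract_stack()
        self.caller = stack[-2].name
        # print_stack() writes the whole stack to stderr for inspection.
        traceback.print_stack()
        return ["record 1", "record 2"]


view = ReadingRecordListView()
view.get(request=None)
print(view.caller)
```

In a real project, dropping traceback.print_stack() into your own get_queryset() should show the Django frames that lead to it in a similar way.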
After that, I continued working on the list view and ran into an issue with multiple queries. The view displays reading records, and each record has a foreign key to a readings file. If we displayed the file name associated with each record in the list view as is (with no performance improvement), accessing each record's file would trigger an additional query. I confirmed this by using the Django Debug Toolbar.
The N+1 query problem can occur in ORMs. In our example, if we had two million records, we would end up running two million and one queries (one for the list of reading records, plus one per record for its associated file). Using select_related avoids this issue: the related file is fetched in the same query through a SQL join under the hood, which improves performance.
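To make the query counts concrete, here is a rough sketch that uses Python's built-in sqlite3 module instead of the Django ORM, so it runs on its own. The table names, column names, and the run() helper are all assumptions made up for this example. It only mimics the kind of SQL that the lazy approach and a join-based approach would issue.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE readings_file (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE reading_record (
        id INTEGER PRIMARY KEY,
        value REAL,
        file_id INTEGER REFERENCES readings_file(id)
    );
    INSERT INTO readings_file VALUES (1, 'a.csv'), (2, 'b.csv');
    INSERT INTO reading_record VALUES (1, 0.5, 1), (2, 0.7, 1), (3, 0.9, 2);
""")

query_count = 0


def run(sql, params=()):
    """Hypothetical helper: execute a query and count it."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


# N+1 pattern: one query for the records, then one more per record.
query_count = 0
rows_lazy = []
records = run("SELECT id, value, file_id FROM reading_record ORDER BY id")
for record_id, value, file_id in records:
    name = run("SELECT name FROM readings_file WHERE id = ?", (file_id,))[0][0]
    rows_lazy.append((record_id, value, name))
n_plus_one_count = query_count

# Join pattern: a single query returns each record with its file name.
query_count = 0
rows_joined = run("""
    SELECT r.id, r.value, f.name
    FROM reading_record AS r
    JOIN readings_file AS f ON f.id = r.file_id
    ORDER BY r.id
""")
joined_count = query_count

print(n_plus_one_count, joined_count)
```

With three records, the first approach runs four queries and the second runs one, and both produce the same rows. In Django, the join version would look something like ReadingRecord.objects.select_related("file"), where the model and field names are again my assumptions.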
Side note: prefetch_related is another useful tool, but it behaves differently. I might cover that in a future article.