Chapter 22. Overview of PostgreSQL Internals

Table of Contents
The Path of a Query
How Connections are Established
The Parser Stage
The Postgres Rule System
Planner/Optimizer
Executor

Author: This chapter originally appeared as a part of Simkovics, 1998, Stefan Simkovics' Master's Thesis prepared at Vienna University of Technology under the direction of O.Univ.Prof.Dr. Georg Gottlob and Univ.Ass. Mag. Katrin Seyr.

This chapter gives an overview of the internal structure of the backend of Postgres. After having read the following sections you should have an idea of how a query is processed. Don't expect a detailed description here (I think such a description dealing with all data structures and functions used within Postgres would exceed 1000 pages!). This chapter is intended to help understanding the general control and data flow within the backend from receiving a query to sending the results.

The Path of a Query

Here we give a short overview of the stages a query has to pass in order to obtain a result.

  1. A connection from an application program to the Postgres server has to be established. The application program transmits a query to the server and receives the results sent back by the server.

  2. The parser stage checks the query transmitted by the application program (client) for correct syntax and creates a query tree.

  3. The rewrite system takes the query tree created by the parser stage and looks for any rules (stored in the system catalogs) to apply to the querytree and performs the transformations given in the rule bodies. One application of the rewrite system is given in the realization of views.

    Whenever a query against a view (i.e. a virtual table) is made, the rewrite system rewrites the user's query to a query that accesses the base tables given in the view definition instead.

  4. The planner/optimizer takes the (rewritten) querytree and creates a queryplan that will be the input to the executor.

    It does so by first creating all possible paths leading to the same result. For example if there is an index on a relation to be scanned, there are two paths for the scan. One possibility is a simple sequential scan and the other possibility is to use the index. Next the cost for the execution of each plan is estimated and the cheapest plan is chosen and handed back.

  5. The executor recursively steps through the plan tree and retrieves tuples in the way represented by the plan. The executor makes use of the storage system while scanning relations, performs sorts and joins, evaluates qualifications and finally hands back the tuples derived.

In the following sections we will cover every of the above listed items in more detail to give a better understanding on Postgres's internal control and data structures.