Type-Safe Queries with Plain Typescript

Typescript is God-sent. It turns a frustrating, toy-like language into one of the best languages ever. But it really doesn’t get the love it deserves. It’s often seen as just a slight improvement on the JavaScript situation by adding support for basic static typing. But Typescript’s type system is far from basic. In fact, it’s one of the most expressive, yet elegant, type systems out there.

Mainstream statically typed languages like Java and Go were designed with the conventional expectations from an object-oriented static type system in mind. All code written in those languages implicitly respect those constraints from the start. Design patterns that lend themselves well to those constraints are automatically favored for implementing the desired logic. In this respect, Typescript had it the other way around. From the start, one of the goals of Typescript was to develop a type system capable of representing the public API of existing JavaScript libraries. By it’s very nature, it did not get to impose patterns, but rather try to create a type system which was expressive and flexible enough to represent the vast variety of JavaScript APIs out there - APIs which make full use of JavaScript’s dynamically typed nature, and are often difficult, if not impossible, to express in conventional object-oriented languages. As a result, Typescript has developed some very interesting type manipulation features, not found in other mainstream languages.

In this post, we will explore some of these fairly unique features of Typescript and learn how to use them to build type-safe APIs for database interaction using pure Typescript magic.

Indexed Access Types with Generics

One really cool feature of Typescript is Indexed Access Types. It sounds fancy, but it’s basically just the ability to refer to the type of a specific property of another type. For example:

type User = {
  id: number,
  email: string,
}

type Email = User['email']  // string

const user1 = {
  id: 1,
  email: 'user1@email.com'
}

const email: Email = user1.email

In the code above, the type Email is an alias for string. With the type expression User['email'], we are referencing the email property of the User type (not the user1 object), which is also a type - string.

Note that the . notation that works for objects doesn’t work with types. You even get a really helpful error message if you try it:

type Email = User['email']  // works

type Email = User.email  // error!
// Cannot access 'User.email' because 'User' is a type, but not a namespace. 
// Did you mean to retrieve the type of the property 'email' in 'User' 
// with 'User["email"]'?
// ts(2713)

Using a union of strings to index the User type gives us a union of the types of all those properties.

type IdOrEmail = User['id' | 'email'] // number | string

Sometimes, you want to define the parameters or the return type of a function in terms of other types. For example, suppose we have a function getUserAttr that accepts

a User object
the name of one of it’s properties

and returns the value of the specified property from the object after some validation:

const userEmail = getUserAttr(user, 'email')  // returns a string
const userId = getUserAttr(user, 'id')        // returns a number

So how would we type this getUserAttr function? A first attempt could look like this:

type UserAttrs = 'id' | 'email'

function getUserAttr(user: User, attrName: UserAttrs): User[UserAttrs] {
  const attr = user[attrName]
  // do validations
  return attr
}

const email = getUserAttr(user, 'email') // string | number
const id = getUserAttr(user, 'id')       // string | number

getUserAttr(user, 'age') // error - good, `age` is not in `UserAttrs`

This is not quite there yet, but we’re close. We are able to pass in a User object, along with any of its properties (but no unknown property) as a string. The issue is that we still get a union type as the result. The return type User[UserAttrs] resolves to string | number. Ideally, we want getUserAttr(user, 'email') to return a string, not a string | number. We want the return type to change depending on the string we’re passing in as the attrName parameter. And that should be possible, because the string email is statically known.

This is the perfect use case for generics:

function getUserAttr<T extends 'id' | 'email'>(user: User, attrName: T): User[T] {
  const attr = user[attrName]
  // do validations
  return attr
}

const email = getUserAttr(user, 'email') // string
const id = getUserAttr(user, 'id')       // number

A few things have changed:

We’ve created a generic type parameter T, which has a constraint extends 'id' | 'email'.
The attrName parameter is now T instead of UserAttr.
The return type is now User[T] instead of User[UserAttr].

The key here is the type parameter T which represents a specific sub-type of 'id' | 'email'. When T is used to index the User type, as in User[T], we can refer to a specific property of the User type, depending of the value of T. So when T is initialized to the literal type 'email', the expression User[T] resolves to User['email'], which is string.

Note that we don’t need to manually specify the type for T because it is inferred from the attrName argument.

const email = getUserAttr(user, 'email')

// equivalent to this

const email = getUserAttr<'email'>(user, 'email')

Obviously, our example of a function to simply retrieve a particular property of a User object is a little contrived, but this ability to define the types of a function’s parameters or return value in terms of the type of another parameter can be very powerful for creating flexible and concise interfaces. Not only that, using generics with indexed access types like this also enables type-safety for a lot of common patterns in JavaScript which rely on receiving arguments of different types based on a previous argument, like the addEventListener(event, callback) DOM API, where the expected parameters of callback depends on the value of event.

This is going to be the core mechanism behind our type-safe queries!

Type-Safe Query Executor

Before we get into the implementation, let’s put down some basic requirements for our type-safe query interface:

It must allow only known queries to be executed - i.e. it should not be possible to execute arbitrary queries using this interface.
Type-safety for the query parameters and result - i.e. different parameters and result types depending on the query being executed.

As an example, we’re going to envision a simple Reddit-like platform where users can post links and comment on them.

Let’s say our query interface is essentially a single function - executeQuery().

For our first requirement - that arbitrary queries cannot be executed - we have a familiar requirement, very common in all of programming - to restrict the set of possible values. Enumerations is the right tool for the job, and we can use a Typescript enum to represent the set of queries that are “known”, and thus should be allowed to be executed via the executeQuery() function.

enum QUERIES {

  SELECT_USER_POSTS = `
    SELECT 
      post_id, title, blurb
    FROM 
      posts 
    WHERE 
      user_id = $userid
    ;
  `,

  SELECT_POST_COMMENTS = `
    SELECT
      comment_id, comment_text, parent_comment_id
    FROM
      comments
    WHERE
      post_id = $postid
    ;
  `,

}

Here, we have an enum called QUERIES, which was two members - SELECT_USER_POSTS and SELECT_POST_COMMENTS. Each of these queries are written using $ placeholders for the input values - $userid and $postid respectively.

With this, we can start defining the types for our executeQuery() function like this:

function executeQuery(
  query: QUERIES, 
  args: Record<string, any>
): Record<string, any>[] {
  // TODO: use db client lib to make query
}

We will deal with implementing the actual logic of the function later, for now we will just focus on the interface, or the signature of the function. Also, currently, the args parameter and the return type of the function are typed using Record<string, any>. We will deal with removing these uses of any later as well.

For now, let’s focus on the query parameter - we’ve set the type of the query parameter as QUERIES - the enum we defined above. This means that we can call this function only with members of the QUERIES enum, and anything else will give us a type error:

executeQuery(QUERIES.SELECT_USER_POSTS, {userid: 1})      // works
executeQuery(QUERIES.SELECT_POST_COMMENTS, {postid: 1})   // works

executeQuery("DROP TABLE posts;", {}) // ERROR!
// Argument of type '"DROP TABLE posts;"' is not assignable to 
// parameter of type 'QUERIES'.ts(2345)

Obviously, this does not stop us from adding dangerous queries to the QUERIES enum itself, but it does guarantee that we can make the executeQuery() function available to the rest of our code knowing that only queries listed in our QUERIES enum can be executed, and not any random, potentially dangerous, query - at least not without a blasphemous cast to any. As long as the QUERIES enum is not changed, we can be sure no other unexpected and un-reviewed queries are being executed.

Now we will tackle our second requirement - type-safety for the query parameters and result. Notice that so far, we had typed the args parameter of our executeQuery() function as Record<string, any>, which is basically the same as { [key: string]: any }. This doesn’t prevent us from passing in the wrong arguments for the query. For example, if we execute the QUERIES.SELECT_USER_POSTS query, we should pass in { userid: number } as the args parameter, but we wouldn’t get a type error if we did this:


executeQuery(QUERIES.SELECT_USER_POSTS, {someotherid: 123})
// wrong args but no type error :(       ^

To accomplish proper type-safety, we must first define the types for the inputs and outputs of each query. We want to create a mapping between each query and it’s expected arguments.

type QueryArgsMapping = {
  [QUERIES.SELECT_USER_POSTS]: { userid: number }
  [QUERIES.SELECT_POST_COMMENTS]: { postid: number }
}

We’ve created a new type which describes an object where the each of our queries is a property of this object, and type associated with each of these properties corresponds to the arguments for that query. In other words, it is a mapping of each query to the type of that query’s arguments. We’re not going to use this QueryArgsMapping directly, but instead we’ll use it to express the type for our executeQuery() function’s args parameter, which should depend on the query argument.

We’ll use the trick with generic type parameters and indexed access types we saw earlier.

function executeQuery<Q extends QUERIES>(
  query: Q,
  args: QueryArgsMapping[Q],
): Record<string, any>[] {
  // TODO: use db client lib to make query
}

We’ve introduced a generic type parameter Q, which we used directly for the query argument, and also to index the QueryArgsMapping type we created earlier for the args parameter’s type. In this way, we’ve made the type of the args parameter dependent on the query parameter. With this, we have real type safety for our queries.

executeQuery(QUERIES.SELECT_USER_POSTS, { userid: 1 }) // correct args - no error

executeQuery(QUERIES.SELECT_USER_POSTS, { postid: 1 }) // wrong args - ERROR :)

Again, the type of the type parameter Q is inferred from the query argument, so it’s really the equivalent of doing this:

executeQuery<QUERIES.SELECT_USER_POSTS>(QUERIES.SELECT_USER_POSTS, {
  userid: 1,
})

This works because QUERIES.SELECT_USER_POSTS is itself a type, since in a type context, constant enums are treated as a union of its members. Typescript is able to tell that a value of type QUERIES.SELECT_USER_POSTS is being passed as the args parameter, and therefore, substitutes Q with the type QUERIES.SELECT_USER_POSTS.

We could apply the same trick to enable type-safety for the queries’ results. We create a type that maps each query to a type representing a row in the result set, and we use the generic type parameter Q to index into that mapping.

type QueryRowMapping = {
  [QUERIES.SELECT_USER_POSTS]: { post_id: number, title: string, blurb: string }
  [QUERIES.SELECT_POST_COMMENTS]: { comment_id: number, body: string, parent_comment_id: number }
}

function executeQuery<Q extends QUERIES>(
  query: Q,
  args: QueryArgsMapping[Q],
): QueryRowMapping[Q][] {
  // TODO: use db client lib to make query
}

Of course, now we have two mappings, which feels repetitive. With a bit of Typescript magic, we can tidy this up to require only one mapping. We could do even further and create concise type aliases for the arguments and return types.

Here’s the final version, including all the code snippets discussed above, made a bit cleaner by introducing new type aliases QuerySig, QArgs and QRows.

enum QUERIES {

  SELECT_USER_POSTS = `
    SELECT 
      post_id, title, blurb
    FROM 
      posts 
    WHERE 
      user_id = $userid
    ;
  `,

  SELECT_POST_COMMENTS = `
    SELECT
      comment_id, comment_text, parent_comment_id
    FROM
      comments
    WHERE
      post_id = $postid
    ;
  `,

}

type QuerySig<A = never, R = never> = {
  args: A
  rows: R
}

type QueryToSigMapping = {

  [QUERIES.SELECT_USER_POSTS]: QuerySig<
    { userid: number },
    { post_id: number; title: string; blurb: string }
  >,

  [QUERIES.SELECT_POST_COMMENTS]: QuerySig<
    { postid: number },
    { comment_id: number, body: string, parent_comment_id: number }
  >,

}

type QArgs<Q extends QUERIES> = QueryToSigMapping[Q]['args']
type QRows<Q extends QUERIES> = QueryToSigMapping[Q]['rows']

function executeQuery<Q extends QUERIES>(query: Q, args: QArgs<Q>): QRows<Q>[] {
  // TODO: use db client lib to make query
}

Now we have a fully type-safe interface to make queries with! Every query has the types of it’s arguments and result rows defined, and the executeQuery() function’s signature changes depending on it’s first argument - query, which is an enum - QUERIES.

The real magic is how this evolves as queries are added. Any query we want to execute has to be a member of our QUERIES enum. Let’s add a new query:

enum QUERIES {

  SELECT_USER_POSTS = `
    SELECT 
      post_id, title, blurb
    FROM 
      posts 
    WHERE 
      user_id = $userid
    ;
  `,

  SELECT_POST_COMMENTS = `
    SELECT
      comment_id, comment_text, parent_comment_id
    FROM
      comments
    WHERE
      post_id = $postid
    ;
  `,

  SELECT_LATEST_POSTS = `
    SELECT
      post_id, title, blurb, created_time
    FROM
      posts
    ORDER BY created_time DESC
    LIMIT $maxposts
    ;
  `

}

Now that we’ve added a new member to the QUERIES enum, unless we define it’s signature type in QueryToSigMapping, we get a type error when we try to index QueryToSigMapping with QUERIES:

type QArgs<Q extends QUERIES> = QueryToSigMapping[Q]['args']
// error TS2536                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
// Type 'Q' cannot be used to index type 'QueryToSigMapping'

type QRows<Q extends QUERIES> = QueryToSigMapping[Q]['rows']
// error TS2536                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
// Type 'Q' cannot be used to index type 'QueryToSigMapping'

As long as we don’t add an entry for QUERIES.SELECT_LATEST_POSTS in QueryToSigMapping, it is not type-safe to use a QUERIES value as the index, since Typescript knows that the QUERIES.SELECT_LATEST_POSTS property does not exist on QueryToSigMapping.

To fix it, we must add the new query’s signature to QueryToSigMapping.

type QueryToSigMapping = {

  [QUERIES.SELECT_USER_POSTS]: QuerySig<
    { userid: number },
    { post_id: number; title: string; blurb: string }
  >,

  [QUERIES.SELECT_POST_COMMENTS]: QuerySig<
    { postid: number },
    { comment_id: number, body: string, parent_comment_id: number }
  >,

  [QUERIES.SELECT_LATEST_POSTS]: QuerySig<
    { maxposts: number },
    { post_id: number, title: string, blurb: string, created_time: Date }
  >,

}

type QArgs<Q extends QUERIES> = QueryToSigMapping[Q]['args']  // no error
type QRows<Q extends QUERIES> = QueryToSigMapping[Q]['rows']  // no error

This way, we are forced to declare signatures for any new queries we add, because our interface enforces it. It compiles only when given the signature of each query that can be executed, and those signatures must include any arguments or result fields that are referenced by the code executing the query. We get one type error after another as we work our way through all the plumbing issues in our code till we are left to worry only about the application logic, just as God intended.

Obviously, this does not save us from specifying the signature of the queries incorrectly. There is no actual understanding of the database schema here, just a typed layer over the queries, which has to be maintained manually. But, that’s not necessarily a bad thing. Typescript is really all about layering strong types over non-type-safe interfaces. A database is really no different from a JavaScript library in that sense - we must manually ensure that the types match up with the actual API. Once that’s done and tested, the rest of the program’s interaction with the database is 100% type-safe. Now there is more we can do to guard against human error - more on that later.

Now comes what is possibly the simplest part of the whole puzzle - actually executing the query on a real database. Let’s just assume we’re working with Deno and its postgres module. A few things to note about our executeQuery() function in a practical implementation:

Usually, there will be some state involved in the implementation of the interface, like a database connection.
There will be a mechanism to instantiate an implementation of the interface.

Below, we’ve defined our executeQuery() function as a method of a Typescript interface called QueryExecutor. We’ve also defined a factory function - getDbClient() - which returns an object which implements our QueryExecutor interface.

interface QueryExecutor {
    execQuery<Q extends QUERIES> (query: Q, args?: QArgs<Q>): Promise<QRows<Q>[]>
}

import { Client } from 'postgres'

export async function getDbClient (): Promise<QueryExecutor> {

  const client = new Client({
    hostname: "localhost",
    port: 5432,
    database: "postgres",
    user: "postgres",
    password: "password",
  });
  
  await client.connect();

  return {
    async execQuery<Q extends QUERIES> (query: Q, args?: QArgs<Q>): Promise<QRows<Q>[]>{
      const { rows } = await client.queryObject<QRows<Q>>(query, args)
      return rows
    }
  }
}

And now, finally, we have our fully working, type-safe query executor interface.

Runtime Validation

There is one last thing we could do to get that sweet extra layer of safety - runtime validation of arguments and results. That is, the guarantee that even if, by some miracle, we end up with the wrong type of value for the query arguments, or for the query results, our program will detect it and fail fast instead of continuing down the destructive path of proceeding with invalid or unexpected data. Similar to how schema validation is used when interacting with other APIs and micro-services, we can use schema validation for database queries too. The motive is the same - at the system boundary, we take pains to ensure that only valid data enters the system.

The obvious choice here is zod. With a little modification to the code above, we can express the query signatures using Zod schemas instead of Typescript types, and we get runtime schema validation basically for free. Below, we define an enhanced implementation of the QueryExecutor interface which performs runtime validation on the executed query’s arguments and results.

import { z } from "https://deno.land/x/zod@v3.21.4/mod.ts";

const QUERY_TO_Z_MAPPING = {

  [QUERIES.SELECT_USER_POSTS]: {
    args: z.object({ userid: z.number() }),
    rows: z.object({ post_id: z.number(), title: z.string(), blurb: z.string() })
  },

  [QUERIES.SELECT_POST_COMMENTS]: {
    args: z.object({ postid: z.number() }),
    rows: z.object({ comment_id: z.number(), body: z.string(), parent_comment_id: z.number() })
  },

  [QUERIES.SELECT_LATEST_POSTS]: {
    args: z.object({ maxposts: z.number() }),
    rows: z.object({ post_id: z.number(), title: z.string(), blurb: z.string(), created_time: z.date() }),
  },

}

type QArgs<Q extends QUERIES> = z.infer<(typeof QUERY_TO_Z_MAPPING)[Q]['args']>
type QRows<Q extends QUERIES> = z.infer<(typeof QUERY_TO_Z_MAPPING)[Q]['rows']>

export async function getStrictDbClient (): Promise<QueryExecutor> {

  const client = new Client({
    hostname: "localhost",
    port: 5432,
    database: "postgres",
    user: "postgres",
    password: "password",
  });

  await client.connect();

  return {
    async executeQuery<Q extends QUERIES> (query: Q, args: QArgs<Q>): Promise<QRows<Q>[]> {

      // validate query arguments
      args = QUERY_TO_Z_MAPPING[query].args.parse(args)   // throws if parse fails

      const { rows } = await client.queryObject<QRows<Q>>(query, args)

      return rows.map(
        row => QUERY_TO_Z_MAPPING[query].rows.parse(row)  // throws if parse fails
      )

    }
  }

}

We’ve replaced QueryToSigMapping (which was a Typescript type, declared with the type keyword), with QUERY_TO_Z_MAPPING (an actual object constant, declared with the const keyword). Instead of mapping the queries to Typescript types, QUERY_TO_Z_MAPPING maps the queries to Zod schema objects representing the types for the arguments and result rows.

Next, we use z.infer to pull out the inferred types from the Zod schemas and use them to type our args argument and return type as we did before.

Finally, we use the .parse() method on the Zod schema objects to validate the arguments and result rows. If they don’t pass validation, an error will be thrown before any more harm can be done. This makes it practically impossible to get our query signatures wrong, since any discrepancy between the actual and expected shape of the query results will immediately give us an error during development.

But what about ORMs?

So you’re probably asking - why not just use an ORM? Don’t we get type safety for free with most modern ORMs? Well, yes, but you might not want to use an ORM.

Whether or not to use an ORM is a tired debate, and it won’t be settled today. Ultimately, it depends on the situation. Even with all the benefits that the state-of-the-art ORMs like TypeORM and Prisma provide, sometimes, you need to work directly with raw SQL. Sometimes, you don’t have control over the database or the schema, you just need to work with the data. Sometimes, you need to optimize your SQL query by understanding exactly what’s going on in the database.

Sometimes, you just want to work directly with SQL instead of avoiding it till you are forced to think in SQL.

For those cases, a 100-line module like the one described above gives us practically all the type-safety we could ever ask for without having Typescript literally understand SQL. No model definitions, no verbose overloads, no duplicate type definitions, just runtime-checked type-safe signatures for all your raw SQL queries!