Your Book as a Database: A Primer



By Chris Kubica

Read Part Two and Part Three of this article on our news blog.

Imagine the future of books not as physical objects, but as relational databases…

  • Autobiographies, written in semi-real-time as the authors live their lives
  • Massively multi-reader “Choose Your Own Adventure”-like role-playing books where everyone’s choices shape the story
  • Serialized novels, like David Copperfield, only infinite and with alternate story lines
  • Recipe books that keep growing and puzzle books that always have more puzzles
  • Multimedia automobile manuals that self-update by pushing recall warnings and maintenance reminders out to you and to mechanics around the world, who then share their fix-it tips with each other and with everyone else
  • Textbooks where student annotations, highlights, and notes are more valuable than the original text, so much so that students can monetize their contributions
  • Series of technical books built with shared chapters: an update to a chapter in one book automatically updates every book in which that chapter appears

What is a relational database?

You use relational databases every day, whether you realize it or not. When you get money from an ATM, you are accessing the bank’s relational database (which is quietly keeping track of you and how much money you have available). When you travel by plane, massive international relational databases keep track of the airport you’re at, the plane you’re in, where you are, where your bags are, et cetera. When you search a bookstore’s website for a good book to buy, you are accessing a vast database of interrelated titles, authors, subjects, reviews and cover shots.

It may be easier to think of a relational database this way: You know that old, tin recipe box on your kitchen counter that’s filled with all the wonderful hand-written recipe cards you inherited from Grandma? That recipe box and all its contents is a relational database, too.

A relational database is something that stores information in a structured, organized way and keeps track of how each piece of information relates to the others.

Let’s mull over that recipe box example for a bit.

The recipe box itself is the relational database. All the information about Grandma’s recipes is stored (hopefully) neatly inside. The recipe cards inside the recipe box are related to the recipe box. In other words, this particular set of recipe cards can only be inside one recipe box at a time, right?

Each and every recipe card in the box has its own related list of ingredients as well as a related list of instructions or steps you take to make the recipe. There might be a list of necessary equipment (like cookie cutters) on each card as well.

And if Grandma was a Fancy Nancy she may have written a category (dessert, main course, appetizer, et cetera) on each card to keep the recipes organized. She may have even had the recipes sectioned up into groups or sets using tabbed dividers.

A diagram for this tin and paper-based relational recipe database looks like this:

If you have an upgraded Grandma 2.0, she has moved beyond the tin and paper and now types all of her recipes as “records” into a relational recipe database on her kitchen netbook (which is why she has donated her tin and paper “backup” to you). Now that everything is stored electronically and relationally (meaning she’s put things in their right place, typing only instructions into the instructions part of a recipe record, for example), Grandma (and you) can use the data contained in all of the recipes, in aggregate, in ways that are impossible with paper and tin alone.
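
To make “putting things in their right place” a little more concrete, here is a minimal sketch in Python, using its built-in sqlite3 module. The table names, column names and the sample recipe are assumptions made purely for illustration; Grandma’s actual software could be organized quite differently.

```python
import sqlite3

# Hypothetical schema: every table and column name here is illustrative.
SCHEMA = """
CREATE TABLE category   (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE recipe     (id INTEGER PRIMARY KEY, title TEXT NOT NULL,
                         category_id INTEGER REFERENCES category(id));
CREATE TABLE ingredient (id INTEGER PRIMARY KEY,
                         recipe_id INTEGER NOT NULL REFERENCES recipe(id),
                         name TEXT NOT NULL, quantity REAL, unit TEXT);
CREATE TABLE step       (id INTEGER PRIMARY KEY,
                         recipe_id INTEGER NOT NULL REFERENCES recipe(id),
                         position INTEGER NOT NULL, instruction TEXT NOT NULL);
CREATE TABLE equipment  (id INTEGER PRIMARY KEY,
                         recipe_id INTEGER NOT NULL REFERENCES recipe(id),
                         name TEXT NOT NULL);
"""


def build_recipe_box():
    """Create an empty, in-memory recipe box."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def add_recipe(db, title, category, ingredients, steps, equipment=()):
    """Store one recipe card, with each kind of information in its own table."""
    db.execute("INSERT OR IGNORE INTO category (name) VALUES (?)", (category,))
    (category_id,) = db.execute(
        "SELECT id FROM category WHERE name = ?", (category,)).fetchone()
    recipe_id = db.execute(
        "INSERT INTO recipe (title, category_id) VALUES (?, ?)",
        (title, category_id)).lastrowid
    db.executemany(
        "INSERT INTO ingredient (recipe_id, name, quantity, unit) "
        "VALUES (?, ?, ?, ?)",
        [(recipe_id, name, qty, unit) for name, qty, unit in ingredients])
    db.executemany(
        "INSERT INTO step (recipe_id, position, instruction) VALUES (?, ?, ?)",
        [(recipe_id, n, text) for n, text in enumerate(steps, start=1)])
    db.executemany(
        "INSERT INTO equipment (recipe_id, name) VALUES (?, ?)",
        [(recipe_id, name) for name in equipment])
    return recipe_id


# Made-up sample card; the quantities are not from any real recipe.
db = build_recipe_box()
brownies = add_recipe(
    db, "Brownies", "dessert",
    ingredients=[("flour", 1.0, "cup"), ("egg", 2.0, "whole")],
    steps=["Mix everything.", "Bake."],
    equipment=["egg beater"])
```

Each kind of information (ingredients, steps, equipment, category) lives in its own table and points back to the recipe it belongs to, which is the “relational” part.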

For example, you can run queries (searches) on your recipes:

  • Which recipes contain chicken?
  • Which recipes are desserts?
  • Which recipes require my egg beater (or don’t, because it’s broken)?
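
As a rough sketch, the three questions above might look like this when asked of a small SQLite database from Python. The simplified tables, the function names and the three sample recipes are all assumptions invented for this example.

```python
import sqlite3

# A deliberately tiny, hypothetical recipe box with made-up contents.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE recipe     (id INTEGER PRIMARY KEY, title TEXT, category TEXT);
CREATE TABLE ingredient (recipe_id INTEGER REFERENCES recipe(id), name TEXT);
CREATE TABLE equipment  (recipe_id INTEGER REFERENCES recipe(id), name TEXT);
INSERT INTO recipe VALUES (1, 'Chicken Soup', 'main course'),
                          (2, 'Brownies', 'dessert'),
                          (3, 'Fruit Salad', 'dessert');
INSERT INTO ingredient VALUES (1, 'chicken'), (1, 'carrot'),
                              (2, 'flour'), (2, 'egg'), (3, 'apple');
INSERT INTO equipment VALUES (1, 'stock pot'), (2, 'egg beater');
""")


def titles(sql, params=()):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in db.execute(sql, params)]


def recipes_with_ingredient(name):
    """Which recipes contain this ingredient?"""
    return titles(
        """SELECT r.title FROM recipe r
           WHERE EXISTS (SELECT 1 FROM ingredient i
                         WHERE i.recipe_id = r.id
                           AND i.name LIKE '%' || ? || '%')
           ORDER BY r.title""", (name,))


def recipes_in_category(category):
    """Which recipes belong to this category?"""
    return titles(
        "SELECT title FROM recipe WHERE category = ? ORDER BY title",
        (category,))


def recipes_without_equipment(name):
    """Which recipes can be made without this piece of equipment?"""
    return titles(
        """SELECT r.title FROM recipe r
           WHERE NOT EXISTS (SELECT 1 FROM equipment e
                             WHERE e.recipe_id = r.id AND e.name = ?)
           ORDER BY r.title""", (name,))


print(recipes_with_ingredient("chicken"))       # ['Chicken Soup']
print(recipes_in_category("dessert"))           # ['Brownies', 'Fruit Salad']
print(recipes_without_equipment("egg beater"))  # ['Chicken Soup', 'Fruit Salad']
```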

Or you can run reports or summaries on the recipes:

  • Print out these five recipes to share with my sister.
  • Print out a shopping list based on a selection of recipes and their ingredients.
  • Tell me how much I need of each ingredient if I make a quadruple batch of Grandma’s Famous Mary Jane Brownies.
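
Reports like the shopping list and the quadruple batch are, at heart, sums and multiplications over the ingredient records. Here is a small sketch under the same kind of assumptions as before: the tables, the function names and the quantities are invented for illustration and are not taken from any real recipe.

```python
import sqlite3

# Hypothetical data; the quantities are made up for the example.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE recipe     (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE ingredient (recipe_id INTEGER REFERENCES recipe(id),
                         name TEXT, quantity REAL, unit TEXT);
INSERT INTO recipe VALUES (1, 'Brownies'), (2, 'Pancakes');
INSERT INTO ingredient VALUES (1, 'flour', 1.0, 'cup'),
                              (1, 'egg',   2.0, 'whole'),
                              (2, 'flour', 2.0, 'cup'),
                              (2, 'egg',   1.0, 'whole'),
                              (2, 'milk',  1.5, 'cup');
""")


def shopping_list(recipe_titles):
    """Add up each ingredient across the selected recipes."""
    marks = ",".join("?" * len(recipe_titles))  # one placeholder per title
    sql = f"""SELECT i.name, SUM(i.quantity), i.unit
              FROM ingredient i JOIN recipe r ON r.id = i.recipe_id
              WHERE r.title IN ({marks})
              GROUP BY i.name, i.unit
              ORDER BY i.name"""
    return db.execute(sql, list(recipe_titles)).fetchall()


def scaled(recipe_title, factor):
    """Multiply every quantity in one recipe by the batch factor."""
    sql = """SELECT i.name, i.quantity * ?, i.unit
             FROM ingredient i JOIN recipe r ON r.id = i.recipe_id
             WHERE r.title = ?
             ORDER BY i.name"""
    return db.execute(sql, (factor, recipe_title)).fetchall()


print(shopping_list(["Brownies", "Pancakes"]))
# [('egg', 3.0, 'whole'), ('flour', 3.0, 'cup'), ('milk', 1.5, 'cup')]
print(scaled("Brownies", 4))
# [('egg', 8.0, 'whole'), ('flour', 4.0, 'cup')]
```

This only works because quantities and units were stored in their own columns; a card that says “a couple of eggs” in free text could not be added up this way.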

Your book as a relational database

So what would a relational database of a book look like in diagram form? Somewhat similar. Like this:

The book is the anchor of this relational database much like the recipe box is in the recipe box example. Inside the book: chapters, a table of contents, maybe an index. Each chapter is made up of sentences (at least sentences…maybe pictures, too!) and each sentence is made up of words.

You can extend this as I do above to include the book’s genre (or several genres) and language (like English), as well.

There are other relationships at work in a book, too, but for simplicity I’ve left them off the diagram. For example, each entry in the table of contents has a related chapter and certain sentences or words might relate to an entry or three in the index, and so on.

Assuming the program or “platform” that your book-as-database is published on has tools you can use to do such things, what types of queries might you or your readers like to run on the book?

How about:

  • Show me sentences with the word “green” in them.
  • Show me, please, the next occurrence of the phrase “if you want to know the truth.”
  • Show me all chapters that include x and y characters together.
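
Here is a sketch of how those three queries could work if the book were stored as chapters, sentences and words, as described earlier. It assumes a Python and SQLite setup; the table layout, function names and sample sentences are hypothetical and do not describe how any particular platform is built.

```python
import re
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE chapter  (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE sentence (id INTEGER PRIMARY KEY,
                       chapter_id INTEGER REFERENCES chapter(id), text TEXT);
CREATE TABLE word     (sentence_id INTEGER REFERENCES sentence(id), text TEXT);
""")

# Made-up sample book, just enough to exercise the queries.
SAMPLE = {
    "Chapter 1": ["The green door was shut.", "Alice knocked twice."],
    "Chapter 2": ["Alice met Bob by the river.",
                  "If you want to know the truth, Bob was late.",
                  "The evergreen swayed."],
}

for title, sentences in SAMPLE.items():
    chapter_id = db.execute(
        "INSERT INTO chapter (title) VALUES (?)", (title,)).lastrowid
    for text in sentences:
        sentence_id = db.execute(
            "INSERT INTO sentence (chapter_id, text) VALUES (?, ?)",
            (chapter_id, text)).lastrowid
        # Naive tokenizer: lowercase runs of letters and apostrophes.
        words = re.findall(r"[a-z']+", text.lower())
        db.executemany("INSERT INTO word VALUES (?, ?)",
                       [(sentence_id, w) for w in words])


def sentences_with_word(word):
    """Sentences containing the whole word (so 'evergreen' is not 'green')."""
    rows = db.execute(
        """SELECT s.text FROM sentence s
           WHERE EXISTS (SELECT 1 FROM word w
                         WHERE w.sentence_id = s.id AND w.text = ?)
           ORDER BY s.id""", (word.lower(),))
    return [row[0] for row in rows]


def next_occurrence(phrase, after_sentence_id=0):
    """First (id, text) after the given sentence containing the phrase."""
    return db.execute(
        """SELECT id, text FROM sentence
           WHERE id > ? AND instr(lower(text), lower(?)) > 0
           ORDER BY id LIMIT 1""", (after_sentence_id, phrase)).fetchone()


def chapters_with_both(name_a, name_b):
    """Chapters in which both names appear somewhere."""
    has_name = """EXISTS (SELECT 1 FROM sentence s
                          JOIN word w ON w.sentence_id = s.id
                          WHERE s.chapter_id = c.id AND w.text = ?)"""
    rows = db.execute(
        f"SELECT c.title FROM chapter c WHERE {has_name} AND {has_name} "
        "ORDER BY c.id", (name_a.lower(), name_b.lower()))
    return [row[0] for row in rows]


print(sentences_with_word("green"))        # ['The green door was shut.']
print(chapters_with_both("Alice", "Bob"))  # ['Chapter 2']
```

Storing individual words is what lets the first query tell “green” apart from “evergreen”; a plain text search on the sentence would match both.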

If it’s an e-book, you could run summaries such as:

  • How many times does the word “pimple” appear in this book?
  • How many total words are in the book? Just in chapter 3? What is the average number of words per chapter? Per sentence?
  • How many total unique words are used?
  • Show me a list of the most frequently used words in this book, minus common articles such as “the” and “a” and conjunctions like “and.”
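
Summaries like these boil down to counting rows. The sketch below assumes every word of the book has been stored as one row that records its chapter and sentence; that single-table layout, the helper names, the short stop-word list and the sample text are all assumptions chosen to keep the example small.

```python
import re
import sqlite3

# Made-up sample text, keyed by chapter number.
BOOK = {
    1: ["The pimple appeared overnight.", "He stared at the pimple and sighed."],
    2: ["Nobody noticed."],
    3: ["A pimple is a small thing.", "The end."],
}

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE word (chapter INTEGER, sentence INTEGER, text TEXT)")
for chapter, sentences in BOOK.items():
    for number, sentence in enumerate(sentences, start=1):
        db.executemany(
            "INSERT INTO word VALUES (?, ?, ?)",
            [(chapter, number, w)
             for w in re.findall(r"[a-z']+", sentence.lower())])


def one(sql, *params):
    """Run a query that yields a single number and return that number."""
    return db.execute(sql, params).fetchone()[0]


def top_words(limit, stop_words=("the", "a", "and")):
    """Most frequent words, skipping the (very short) stop-word list."""
    marks = ",".join("?" * len(stop_words))
    sql = f"""SELECT text, COUNT(*) AS n FROM word
              WHERE text NOT IN ({marks})
              GROUP BY text ORDER BY n DESC, text LIMIT ?"""
    return db.execute(sql, (*stop_words, limit)).fetchall()


total = one("SELECT COUNT(*) FROM word")
in_chapter_3 = one("SELECT COUNT(*) FROM word WHERE chapter = ?", 3)
pimples = one("SELECT COUNT(*) FROM word WHERE text = ?", "pimple")
unique = one("SELECT COUNT(DISTINCT text) FROM word")
chapters = one("SELECT COUNT(DISTINCT chapter) FROM word")
sentences = one(
    "SELECT COUNT(*) FROM (SELECT DISTINCT chapter, sentence FROM word)")

print(pimples, total, in_chapter_3, unique)  # 3 21 8 16
print(total / chapters, total / sentences)   # 7.0 4.2
print(top_words(1))                          # [('pimple', 3)]
```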

If your book is paper only, the above queries are simply not possible. That “intelligence” is locked away in the book with no way to get to it.

The reading experience of a book-as-database must suck, Chris.

Not at all! Building the book sentence by sentence, record by record, in a database doesn’t stand in the way of a simple, elegant “interface” (how the book looks and how it works) on the reader side. In other words, it will still walk and talk like the types of books we’re currently used to reading. For example, here’s a screenshot of a book I’m writing myself using my own book-as-database platform, neverend books:

See? It looks like a pretty calm, easy-on-the-eyes e-book, doesn’t it? But it is a book-as-database behind the scenes.

Writing/building a book-as-database from the start requires thinking about how the contents of a book can later be searched, shared, aggregated, re-organized, re-presented, re-purposed and indexed.

However, the interface for readers and even for the author need not be complex or extra-technical in the slightest. A writer could write the book the “old-fashioned way”, using a word processor, and then upload the manuscript to be “processed” into database format by the publishing platform (more on platforms later). Or a writer could simply write right in a Web browser while the platform automatically saves the work regularly into book-as-database format.
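
What might that “processing” step involve? The following is a deliberately naive sketch in Python, not a description of how any real platform works. It assumes a plain-text manuscript in which each chapter starts with a line such as “Chapter 1”, and it splits sentences on end punctuation, which would stumble on abbreviations and quoted dialogue. The function name and record fields are hypothetical.

```python
import re

# Made-up manuscript in the assumed plain-text layout.
MANUSCRIPT = """Chapter 1
It was late. Nobody was home!

Chapter 2
Was that a knock? She opened the door.
"""


def to_records(manuscript):
    """Turn a manuscript into one record per sentence."""
    records = []
    chapter, sentence_no = None, 0
    for line in manuscript.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between paragraphs
        if re.fullmatch(r"Chapter \d+", line):
            chapter, sentence_no = line, 0  # a new chapter restarts numbering
            continue
        # Split after '.', '!' or '?' when followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", line):
            sentence_no += 1
            records.append({
                "chapter": chapter,
                "sentence": sentence_no,
                "text": sentence,
                "words": re.findall(r"[A-Za-z']+", sentence),
            })
    return records


records = to_records(MANUSCRIPT)
for record in records:
    print(record["chapter"], record["sentence"], record["text"])
```

Each record could then be saved as a row in a database, so the writer keeps working in ordinary prose while the structure is built behind the scenes.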

The world is already filled with many millions of existing books, and new books are being written and published every day. So while the diagram of your book by itself that we see here is interesting, it doesn’t paint a complete picture of the ecosystem in which your book can live.

Tomorrow, in Part 2, which will be posted here on PP’s “News Blog”, I consider how to incorporate your book-as-database into the larger ecosystem of libraries, collections and information.

Chris Kubica of Chapel Hill, North Carolina is the founder of He is a software developer by day, author by night, reader always. He can be reached via e-mail, phone at 1-919-259-8023 or Twitter @chriskubica


VISIT: Chris Kubica’s own “book as database”
