Cleaning up an ebook library with Calibre

Reader Susan Erich seeks a bit of organization in her literature. She writes:

I’ve downloaded hundreds of free ebooks in a variety of formats and I need help organizing them. Some, I guess, are meant for a Kindle and others can be read on my iPad. I think there are duplicate titles and some of the author information is incorrect. Is there an easy way to sort these things out?

I was completely with you until you mentioned “easy.” Separating ebook types is a cinch, locating and deleting duplicates isn’t terribly difficult, but when you talk about tidying up title and author information (which relies on the book’s metadata) you could be looking at a long and tedious process. But let’s hope for the best and run through the steps.

Organizing your books

There are multiple ways to organize ebooks. Some people are concerned only with placing their Kindle-compatible files (with the .mobi extension) in one folder and their iBooks-compatible files (epub files) in another. You can do this with a couple of smart folders.

Move to the Finder and choose File > New Smart Folder (or press Command-Option-N). In the window that appears click on the Plus (+) button near the window’s top-right corner so that you see a row of conditions that reads, by default, Kind is Any. Click on the Kind pop-up menu and choose Other. In the sheet that appears enter extension in the Search field. Two matching items will appear: File Extension and File Extension Hidden. Tick the box to the right of File Extension and click OK.

Filtering files via file extension.

The window’s condition will now read File Extension Is. Enter mobi. The window will fill up with ebook files. These are the files compatible with the Kindle. Click Save and in the sheet that appears name the smart folder “Kindle Books,” ensure that Add to Sidebar is checked, and click Save.

Repeat this process with a new smart folder, but this time enter epub as the file extension. Save this smart folder as well with a title such as “iBooks Books.” You’ve now created virtual folders for each kind of book.
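If you'd rather see the same split from a script, here is a minimal Python sketch of what those two smart folders do: walk a folder and group files by extension. The function name group_by_extension and the idea of pointing it at a single folder are my own assumptions for illustration; nothing here is part of the Finder or Calibre.

```python
from collections import defaultdict
from pathlib import Path


def group_by_extension(root, extensions=("mobi", "epub")):
    """Return {extension: [paths]} for matching files found under root.

    Extensions are compared case-insensitively, so "Book.EPUB" counts
    as an epub file. Subfolders are searched as well.
    """
    wanted = {ext.lower() for ext in extensions}
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        ext = path.suffix.lower().lstrip(".")
        if path.is_file() and ext in wanted:
            groups[ext].append(path)
    # Sort each list so the output is stable from run to run.
    return {ext: sorted(paths) for ext, paths in groups.items()}
```

Calling group_by_extension on the folder that holds your downloads returns one list of mobi files and one of epub files, which is the same view the “Kindle Books” and “iBooks Books” smart folders give you.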

But you want more. And more is what you’ll get.

Using Calibre

Download a copy of the free and open-source Calibre ebook management application. This is a powerful tool, but one that’s not entirely intuitive. However, in your case, we can manage what you want without digging into the really obscure stuff.

Create a library: You’ll want to start by creating a folder for your books. When you first launch Calibre it will prompt you to do that. By default the library is called Calibre Library and is placed within your user folder, but you can change that location if you like. When you click Next in the setup wizard you’ll be asked to choose the kind of device you’re using. If you have a Kindle or want to use the Kindle iOS app, choose Amazon. If you have an iOS device, click on Apple. Or you can choose Generic if you like. Then click Finish in the next window and the Calibre library window appears.

Choosing a device type in the Calibre setup wizard.

Importing your books: Now let’s get your books into Calibre. If you’ve created those smart folders I suggested, you can do this by choosing Add Books > Add Books From a Single Directory and in the sheet that appears, select the appropriate smart folder from the sidebar. Click on the first book in the resulting list, hold down the Shift key, and then click on the list’s last book. Then click Open. Those books will be imported into Calibre’s library.

Alternatively, if you have all your ebooks jammed into the same folder or volume rather than in a smart folder, choose Add Books > Add Books From Directories, Including Sub Directories (Multiple Books Per Directory, Assumes Every Ebook File Is A Different Book). And yes, we have a winner for the world’s longest menu command. Navigate to the folder or volume that contains your ebooks and click Choose. Calibre will dig down and add all the ebooks it finds to the library.

If you have a mix of ebook types—mobi as well as epub books—all of these files will appear in the library. We can sort them out later.
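Calibre also ships with a command-line tool, calibredb, that can do the recursive import without the menus. The sketch below only builds and runs that command from Python. The helper names build_add_command and add_books are hypothetical, and it assumes calibredb is on your PATH (on a Mac it normally lives inside the Calibre application bundle, so you may need to add it yourself). Quit Calibre before running it, since calibredb may refuse to modify a library that the main application has open.

```python
import shutil
import subprocess


def build_add_command(ebook_dir, library_dir, calibredb="calibredb"):
    """Build the argument list for a recursive calibredb import.

    --recurse makes calibredb descend into subfolders; --with-library
    names the library folder that receives the books.
    """
    return [
        calibredb, "add", "--recurse",
        "--with-library", str(library_dir),
        str(ebook_dir),
    ]


def add_books(ebook_dir, library_dir):
    """Run the import, failing early if calibredb cannot be found."""
    exe = shutil.which("calibredb")
    if exe is None:
        raise FileNotFoundError("calibredb was not found on PATH")
    return subprocess.run(
        build_add_command(ebook_dir, library_dir, exe), check=True
    )
```

Keeping the command in its own function lets you print it and look it over before anything touches your library.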

Cleaning up the metadata: You can clear up a lot of your problems simply by attaching better metadata—information such as author, title, genre, and year of publication, for example—to your books. In order to do that, select all the books in your library and choose Edit Metadata > Download Metadata And Covers. Calibre will go online and try to track down the most appropriate metadata for the books in your library, including cover art. This can take a long time if you have a large library.

Once the process is complete you’ll be asked if you’d like to replace your books’ metadata. Click Yes.

After downloading new metadata choose to attach it to your ebooks.

When you do this you’ll find that your library contains more information than it once did. For example, you’ll probably see more entries under Publisher than before. You might also find ratings and tags. Hopefully, title and author information will be cleaned up as well. Instead of Dickens, Charles you should see Charles Dickens. (Though this doesn’t always work.)

Removing duplicates: Let’s get rid of your duplicates. To do that choose Calibre > Preferences and then click Plugins in the resulting sheet. In the next sheet click Get New Plugins. In the list that appears locate Find Duplicates, select it, and click Install. Then quit and restart Calibre.

Selecting Fuzzy gives you the best chance of finding the greatest number of duplicate ebooks.

Once restarted, Calibre will display a Find Duplicates entry in the toolbar. Click on it and in the window that appears choose Fuzzy under both the Title Matching and Author Matching headings and click OK. A list of books that are duplicated will appear. Command-click any additional copies of a book (leave one unselected so that you don’t delete the original) and then click on Remove Books in the toolbar. The duplicates will disappear.
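To get a feel for why a fuzzy match catches more duplicates than an exact one, here is a rough Python sketch. It is not the Find Duplicates plugin's actual algorithm, which I haven't inspected. It simply normalizes titles and authors (case, accents, punctuation, leading articles, and name order) and groups books whose normalized values agree. All function names are my own.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase, strip accents and punctuation, drop a leading article."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    words = text.split()
    if words and words[0] in {"a", "an", "the"}:
        words = words[1:]
    return " ".join(words)


def author_key(author):
    """Make "Dickens, Charles" and "Charles Dickens" compare equal."""
    return " ".join(sorted(normalize(author).split()))


def find_duplicates(books):
    """Group (title, author) pairs that match after normalization.

    Returns only the groups that contain two or more entries.
    """
    groups = defaultdict(list)
    for title, author in books:
        groups[(normalize(title), author_key(author))].append((title, author))
    return [group for group in groups.values() if len(group) > 1]
```

An exact comparison would treat “A Tale of Two Cities” by Charles Dickens and “a tale of two cities!” by Dickens, Charles as different books. After normalization they land in the same group, which is the kind of result you're after when you choose Fuzzy.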

Further cleanup: Despite better metadata, it’s possible that you still face author names presented as Dickens, Charles. While you can scroll down your library list, seek out these errant entries, and manually correct them, I’d instead suggest this technique that I picked up from user “Garcie” in the MobileRead forum.

In the Search field, enter author:, (author-colon-comma). Press Return and you’ll see a list of just those books whose authors use this Last Name, First Name scheme. Select all these titles.

Press the E key to produce the Edit Meta Information window and click the Search and Replace tab. From the Search Mode pop-up menu choose Regular Expression. From the Search Field pop-up menu choose Authors. In the Search For field enter (.*), (.*). In the Replace With field enter \2 \1 (with a space between \2 and \1). Below, in the Test Text area, enter Dickens, Charles in the Your Test field. When you do this, the Test Result field should read Charles Dickens, indicating that your other settings are correct. If they are, click the OK button. Your selected titles should now display authors in First Name/Last Name format.

The cure for last name/first name author entries.
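If regular expressions are new to you, the same search and replace can be tried outside Calibre. This short Python sketch uses the pattern and replacement given above. The function name flip_author is my own, and Python's re module stands in for Calibre's Search and Replace tab here.

```python
import re

# First group: everything before the comma-space (the last name).
# Second group: everything after it (the first name).
PATTERN = re.compile(r"(.*), (.*)")


def flip_author(name):
    """Turn "Last, First" into "First Last"; leave other names alone."""
    return PATTERN.sub(r"\2 \1", name)


print(flip_author("Dickens, Charles"))   # Charles Dickens
print(flip_author("Charles Dickens"))    # Charles Dickens (no comma, unchanged)
```

One caution: because the first group is greedy, a name with two commas splits at the last one. An entry such as King, Martin Luther, Jr. comes out as Jr. King, Martin Luther, so glance over the results for names like that and correct them by hand.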

Export your books: After cleaning up your books you could stop right here. In the Finder navigate to the library folder you created and within it you’ll find all your ebooks organized in folders by author’s first name.

But suppose you’ve imported multiple ebook formats and they’re all sitting in your library. My guess is that you’ll want these separated so that you know which you can copy to your Kindle and which will work within iBooks. That’s easily done.

On the left side of the Calibre window click on Formats to reveal the format entries below. Now choose a format such as MOBI. Select all the books in the library. Choose Save to Disk > Save Only MOBI Format to Disk. In the Choose Destination Directory window that appears click New Folder and name it something like “My Kindle Books” and click Choose.

View your ebooks by format before exporting them.

The selected books from Calibre’s library will be placed in this folder and filed, by default, by the author’s last name. Open one of these author folders and you’ll see other folders representing each of the author’s books.

Repeat this process for epub files. Select EPUB under the Formats entry, select all the books that appear, and choose Save to Disk, but this time choose the Save Single Format to Disk command. EPUB will appear as the single option. Select it and click OK. You’ll then be prompted for a location for your books.

And there’s more

You ask a seemingly simple question and this is what happens. As I said, Calibre is very powerful and I’ve just scratched the surface. There are other ways to export ebooks so that they don’t also include images and metadata files (check its Saving Books to Disk preference). And you can output all your books into a single directory rather than having them split out into individual folders. You can additionally export books directly to connected devices so that you needn’t fuss with creating these folders or syncing books through iTunes.

If what I’ve provided isn’t enough, click on Calibre’s Help menu. Your default browser will open and you’ll be taken to the Calibre manual page, where you can dig into the details (and there are many).

In the meantime, I welcome comments from experienced Calibre users who can recommend sleeker workflows. And if you know of a different tool that makes cleaning ebooks easier, I’m all ears.
