Micro Matt


Support for #categories is deployed. This gives us a chance to make sure it works in the wild, and everything seems to be running smoothly so far.

One tricky thing: do we track which #categories are included in the post itself vs. in pure metadata?

Assuming all are inline, things are simple: on publish or update, we always parse hashtags in the body. If a tag isn’t there, we delete its association with the post. If it is there, we associate it with the post.
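Roughly, that inline-only sync could look like the sketch below. The hashtag pattern and the function names are made up for illustration; this isn't the actual backend code.

```python
from __future__ import annotations

import re

# Assumed hashtag pattern: '#' followed by word characters, not preceded
# by another word character (so "a#b" is not treated as a tag).
HASHTAG_RE = re.compile(r"(?<!\w)#(\w+)")


def parse_hashtags(body: str) -> set[str]:
    """Return the lowercased hashtags found in a post body."""
    return {tag.lower() for tag in HASHTAG_RE.findall(body)}


def sync_inline_tags(current: set[str], body: str) -> tuple[set[str], set[str]]:
    """Compare stored tags against the body on publish or update.

    Returns (tags to associate, tags to delete).
    """
    found = parse_hashtags(body)
    return found - current, current - found
```

For a post previously tagged `dev` and `old` whose body now reads "Working on #categories and #dev today", this returns `categories` to add and `old` to delete.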

If we also want to have pure-metadata categories, so we don’t have to clutter up a post with visible hashtags, then we need to track which categories are inline vs. metadata-only, so we know which ones to remove on update and which ones to keep. (And now this is getting complicated.)
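One way to picture the extra bookkeeping: each post-category link carries a flag saying whether it came from the body. This is only a sketch; the `PostCategory` type, the `inline` flag, and the rule that a metadata-only link wins when the same tag also shows up inline are all assumptions, not decisions we've made.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PostCategory:
    """Hypothetical link between a post and a category."""
    lookup_slug: str
    inline: bool  # True if the tag came from a hashtag in the body


def update_associations(existing: set[PostCategory],
                        body_tags: set[str]) -> set[PostCategory]:
    """Rebuild a post's links on update.

    Metadata-only links are always kept; inline links are replaced by
    whatever hashtags are in the body now.
    """
    kept = {link for link in existing if not link.inline}
    kept_slugs = {link.lookup_slug for link in kept}
    # Assumption: if a tag is already metadata-only, don't add an inline copy.
    inline = {PostCategory(tag, True) for tag in body_tags if tag not in kept_slugs}
    return kept | inline
```

With this shape, removing a hashtag from the body drops only its inline link, and a metadata-only category survives every update until someone removes it explicitly.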

It’s mostly getting complicated in the UI. Assuming the editor now has a “categories” field, we’ll need to keep that field in sync with both the inline hashtags in the post body and the metadata-only tags. I mean, it’s unlikely someone would use both; I don’t want to cater to that edge case, but I also don’t want to exclude it if we don’t have to.

We might also follow the pattern we’ve laid out with other add-on post metadata, like #authors. Right now, you can add an author in the Rich Text (RT) editor, but not the Plain Text (PT) editor. In this way, we keep our underlying flexibility, but the client / editing UI guides users toward the correct input method. I think we can assume that the PT editor is for focused writing and inline metadata; the RT editor is for exact control over presentation and metadata. I’m not sure if that’s entirely correct, but I think I have to start there.


Continuing yesterday’s work, internal support for #categories is finished. The result is basically a lightweight layer on top of the existing hashtag-based system.

Now whenever you create or modify a post, or move it to a blog, we’ll parse out the hashtags and automatically create categories from them, as necessary. Categories store the original hashtag information, plus a user-friendly title (which can include spaces, punctuation, any capitalization you want, etc.) and a URL-friendly slug. (You can see some of the underlying code here.)

In this way, they’re completely optional and unobtrusive by default. If you care to carefully manage your categories, with this new system, you’ll be able to do it. If you just want to tag a post occasionally, this won’t slow you down. And if your needs change as you write more posts, this will be there when you decide to organize things.

I still have more testing to do before deploying this change, and even then, users won’t notice anything new yet. But the groundwork will be there for us to tackle the management side next.


Working on support for categories today. There are a few functional goals we’re trying to meet with this:

  • We can list all categories used on a blog
  • We can quickly filter all posts under a certain category
  • We can create tags with specific capitalization, Unicode characters, and spaces in the name
  • We can associate posts with certain categories via the existing plain-text tagging system

The last point in particular is pretty tricky to solve; all other points are solved easily by adding some new data structures. I’m thinking we’ll just store three pieces of data for each category: a slug (e.g. united-states), a title (e.g. United States), and a normalized “lookup slug” that can be represented by a hashtag (e.g. unitedstates).
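As a quick sketch of those three fields: the `Category` type and the idea of deriving both slugs from the title are assumptions for illustration, not the real schema.

```python
import re
from dataclasses import dataclass


@dataclass
class Category:
    slug: str         # URL-friendly, e.g. "united-states"
    title: str        # user-facing, e.g. "United States"
    lookup_slug: str  # hashtag-friendly, e.g. "unitedstates"


def category_from_title(title: str) -> Category:
    """Derive both slugs from a title (an assumed convention)."""
    # \w+ matches Unicode letters and digits, so non-ASCII titles work too.
    words = re.findall(r"\w+", title.lower())
    return Category(slug="-".join(words), title=title, lookup_slug="".join(words))
```

Here `category_from_title("United States")` gives the slug `united-states` and the lookup slug `unitedstates`, so a plain `#UnitedStates` hashtag can be matched to the category by lowercasing it.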

Then, whenever a post is created or updated, we’ll do some magic on the backend: parse the post, then create a new category automatically and / or associate the post with an existing category. That will allow existing posts to use this new categorization system. Then we might also support a new “silent” way of adding categories via a new API field, so you can associate a post with a category without inserting it into the body of the post.
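Here's a rough idea of what that backend step might do, using plain dicts in place of a database. The function name, the hashtag pattern, and the storage shapes are all hypothetical.

```python
from __future__ import annotations

import re

# Assumed hashtag pattern: '#' followed by word characters.
HASHTAG_RE = re.compile(r"(?<!\w)#(\w+)")


def categorize_post(post_id: str, body: str,
                    categories: dict[str, dict],
                    post_categories: dict[str, set[str]]) -> set[str]:
    """Find or create a category for each hashtag, then link the post.

    categories maps lookup slug -> category record;
    post_categories maps post ID -> set of lookup slugs.
    """
    linked = set()
    for tag in HASHTAG_RE.findall(body):
        lookup = tag.lower()
        if lookup not in categories:
            # New category: fall back to the hashtag itself for slug and title.
            categories[lookup] = {"slug": lookup, "title": tag, "lookup_slug": lookup}
        linked.add(lookup)
    post_categories[post_id] = linked
    return linked
```

An existing category keeps its hand-edited title and slug; only hashtags with no matching lookup slug create new records.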

Just some implementation ideas so far; we’ll see if this works in practice.

#dev #categories