Gold Miner is an app I created to transform interesting conversations we have at thoughtbot into blog posts. The articles generated are part of the This week in #dev series, and today I’ll talk about some of the technical details of the app, like how we use artificial intelligence, async Ruby, and other interesting patterns.
The MVP
The first step was to classify what I thought were “interesting messages”. We share a lot on our public Slack channels, so I decided to search for messages containing “tip” or “TIL”. To allow people to hand-pick particular messages, I also fetched anything that received a :rupee-gold: emoji reaction.

I created a `MessagesQuery` class to help me build a message search query like:
```ruby
interesting_messages = MessagesQuery
  .new
  .on_channel("dev")
  .sent_after("2023-04-12")
```
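The article doesn’t show `MessagesQuery` itself, but based on its usage it could be an immutable builder over Slack’s search modifiers (`in:`, `after:`, `has:`). Here’s a hypothetical sketch; all method bodies are assumptions:

```ruby
# Hypothetical sketch of MessagesQuery as an immutable query builder.
# Each chained call returns a new instance instead of mutating state.
class MessagesQuery
  def initialize(terms = [])
    @terms = terms
  end

  def on_channel(channel)
    with_term("in:##{channel}")
  end

  def sent_after(date)
    with_term("after:#{date}")
  end

  def with_topic(topic)
    with_term(topic)
  end

  def with_reaction(emoji)
    with_term("has::#{emoji}:")
  end

  # The Slack client would ultimately need the query as a string.
  def to_s
    @terms.join(" ")
  end

  private

  def with_term(term)
    self.class.new(@terms + [term])
  end
end
```

With that, `MessagesQuery.new.on_channel("dev").sent_after("2023-04-12").with_topic("TIL").to_s` produces a Slack-style search string like `in:#dev after:2023-04-12 TIL`.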
Due to some limitations of the Slack API, I had to fetch the messages in three different requests. It was a bit slow, but not too bad:
```ruby
def search_interesting_messages
  til_messages = @slack.search_messages(
    query: interesting_messages.with_topic("TIL")
  )
  tip_messages = @slack.search_messages(
    query: interesting_messages.with_topic("tip")
  )
  hand_picked_messages = @slack.search_messages(
    query: interesting_messages.with_reaction("rupee-gold")
  )

  til_messages + tip_messages + hand_picked_messages
end
```
After that, I would grab those messages, extract the text, author, and permalink, and format them as a very simple Markdown file. Then I’d manually go through each message, read it, summarize it, choose tags, think of a title, and publish the article.
I’m Too Lazy For This
For a while, that was it. I’d run the script, manually create the article, and open a PR to our blog repo. That was just too much work! My developer brain was begging for automation. I immediately thought of using an LLM to summarize the messages, generate titles, and extract topics! OpenAI had several APIs available, so it was an easy choice for me.
Before I started, I didn’t want to tie the app to a particular vendor, so I developed the concept of a `BlogPost::Writer`. The `BlogPost` class would delegate all that manual work I used to do to a `writer` object it would receive on initialization. That’s a case of the Strategy pattern (with an immutable strategy).
Here’s an example of how it generates a highlight from a message:
```ruby
class BlogPost
  def initialize(messages, writer:)
    @messages = messages
    @writer = writer
  end

  def highlight_from(message)
    <<~MARKDOWN
      ## #{@writer.give_title_to(message)}

      #{@writer.summarize(message)}
    MARKDOWN
  end
end
```
The `writer` is now the one responsible for generating a title and a summary, and for extracting relevant topics from a message. Since Ruby doesn’t have interfaces, I decided to codify that protocol in a shared RSpec example:
```ruby
RSpec.shared_examples "a blog post writer" do
  it { expect(writer_instance).to respond_to(:extract_topics_from).with(1).argument }
  it { expect(writer_instance).to respond_to(:give_title_to).with(1).argument }
  it { expect(writer_instance).to respond_to(:summarize).with(1).argument }
end
```
The old behavior, i.e., returning a message as is (not summarized), was moved to a writer class called `BlogPost::SimpleWriter`.
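The article doesn’t list `SimpleWriter`’s code, but given the protocol above, a minimal version could look like this. The method bodies here are assumptions that merely satisfy the writer protocol without doing any summarizing:

```ruby
# Hypothetical take on BlogPost::SimpleWriter; the real implementation
# isn't shown in the article, so these bodies are guesses.
class BlogPost
  class SimpleWriter
    # No AI here: the "summary" is just the original message text.
    def summarize(message)
      message[:text]
    end

    # A simple placeholder title based on the message author.
    def give_title_to(message)
      "A tip from #{message[:author]}"
    end

    # Without an LLM, we can only surface topics we already know about.
    def extract_topics_from(message)
      message.fetch(:topics, [])
    end
  end
end
```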
Artificial Intelligence To The Rescue
Now that I had a blog post `writer` protocol, I could create a new writer backed by OpenAI. I used the `ruby-openai` gem and implemented the interface in no time.
This class is quite simple because ChatGPT itself is doing all the heavy lifting. One detail I added was a fallback to the `SimpleWriter` if the call to the OpenAI API fails for some reason. It enables the app to keep running even if ChatGPT is down or one of the requests can’t complete.
Here’s how it extracts topics from a message:
```ruby
def extract_topics_from(message)
  topics_json = ask_openai <<~PROMPT
    Extract the 3 most relevant topics, if possible in one word,
    from this text as a single parseable JSON array: #{message[:text]}
  PROMPT

  if (topics = try_parse_json(topics_json))
    topics
  else
    # If we can't parse the JSON, fall back to the simple writer
    fallback_topics_for(message)
  end
end
```
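The `try_parse_json` helper isn’t shown in the article. One plausible implementation (an assumption on my part) swallows parse errors and returns `nil`, so the caller can decide to fall back:

```ruby
require "json"

# Hypothetical helper: returns the parsed array, or nil when the LLM's
# response isn't the parseable JSON array we asked for.
def try_parse_json(json)
  parsed = JSON.parse(json)
  parsed.is_a?(Array) ? parsed : nil
rescue JSON::ParserError, TypeError
  nil
end
```

Returning `nil` instead of raising keeps the happy path and the fallback path in one `if`, which reads nicely in `extract_topics_from`.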
Boom! Now, I let AI do the hard work for me. I still have to read the messages to check if the content is correct, but I don’t have to think about titles, summaries, or topics anymore.
Also, because we support multiple strategies, I could create new writers for any other tools like Google’s Bard AI or even a self-hosted AI like Dolly.
Wait For It
Everything was working fine, and I shipped many editions of This week in #dev, but there was one problem with the OpenAI writer: it was slow. For each Slack message, we had to issue three API requests (summary, title, and topics), and network calls are slow. On top of that, all the requests were made sequentially, so it could take anywhere between 20 and 60 seconds to generate a blog post with four messages. That was even worse during peak hours for the ChatGPT API (not to mention the Slack API calls).
Those observations are key, though: this app was IO-bound, spending most of its time waiting for HTTP requests to complete. A perfect candidate for async Ruby.
The method for generating a highlight from a message, for instance, now looks like this:
```ruby
def highlight_from(message)
  title_task = Async { @writer.give_title_to(message) }
  summary_task = Async { @writer.summarize(message) }

  <<~MARKDOWN
    ## #{title_task.wait}

    #{summary_task.wait}
  MARKDOWN
end
```
It’s cool that none of the other code had to change, and because I added all the `async` infrastructure to the `BlogPost` class, every writer now runs asynchronously! I even added tests to ensure all the writer calls run concurrently. While at it, I also made the Slack API calls async, so the app searches messages in parallel.
The total time was reduced to less than a fourth of what it was before, a massive win!
Other Goodies
There are a few other minor things I did in this app that are worth mentioning:
- Monads: I used the dry-monads gem to handle errors gracefully. It helped me to structure the code in a railway-oriented way, which I find much easier to maintain than exceptions (in particular, when dealing with Threads).
- Dependency injection: I did a fair amount of dependency injection in this app. Because I was doing TDD, it made testing much easier, especially when dealing with code that interacts with external services.
- Zeitwerk: I used the Zeitwerk gem to load all the app code. It avoids all those manual `require`s and keeps the code organized in the same way the files are arranged in the file system (like we do in Rails apps). All that for a single `Zeitwerk::Loader.for_gem.setup` call? Love it!
- App setup: I created a `bin/setup` script to install all the dependencies and set up the app. It helps new developers get started quickly and is a nice form of documentation.
Next steps
There’s still a lot of room for improvement in Gold Miner, but since I’m the only user, I’ve been taking it slow. One thing I’d like to add is having Gold Miner automatically open a PR against our blog repo, with each of the message authors added as reviewers. Some parts of the code could be better encapsulated and organized, but it has been good enough for me so far, so I didn’t bother.