The Secret Engines of the Internet
Some thoughts on building an invisible tool that reaches millions.
We build platforms at Postlight. It’s a recurring theme of our conversations, both here on Track Changes and in our podcast. We’re passionate about building the things that power other things.
And we’re not the only ones! Most of the software you use every day lives within an integrated panoply of platforms and services, all working together to deliver your shopping carts and DMs and cat photos and Vines (RIP). The stuff we call “websites” and “apps” is actually a complicated, creaky mass full of unsung, sturdy chunks of code working in conjunction with other unsung, sturdy chunks of code. Take one piece out, and a bunch of other pieces break. They’re indispensable, even if you don’t know they’re there.
Vroom vroom
To give you a sense of how these hidden tools are delivering bits and pieces of the apps you use every day, why don’t we take a close look at one in particular? Oh, how convenient, we have one RIGHT HERE: The Mercury Web Parser, which we at Postlight launched back in 2016!
Mercury Parser extracts meaningful content from the chaos of any web page — specifically from articles, blog posts, and other similar web content. On its own, that doesn’t mean much, but combined with the ingenuity of developers around the world, it means a lot. In the course of an average day, Mercury Parser processes over 4 million requests on behalf of our users. It’s possible that you’re already using Mercury Parser and didn’t even know it.
As we’ve said before: Platforms are the jet engines of the Internet. Without them you just have a big cylinder with a lot of seating.
So if you knew where to look, where might you find our particular jet engine?
Feeds
Mercury has become a staple of many of the most popular RSS readers, most often used to retrieve full text from a truncated RSS feed, clean up an existing article, or pull in bigger and better images than the feed provides. RSS readers may be in an existential battle with the likes of Twitter and Facebook, but they remain invaluable tools for keeping up with what’s happening in the world, or just in a beloved corner of the internet. RSS readers including Reeder, NewsBlur, Feedbin, News Explorer, and Feedly all use Mercury Parser either directly or, as is the case with Feedly, indirectly via third-party browser extensions.
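To make the RSS use case concrete, here is a minimal Python sketch of how a reader might merge a parser result into a feed item. Everything in it is an assumption made for illustration: the `FeedEntry` type, the `enrich` helper, and the `content` and `lead_image_url` keys are hypothetical names, not a description of how any of the readers above actually work.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeedEntry:
    """Hypothetical shape of one item as delivered by an RSS feed."""
    url: str
    title: str
    summary: str                      # often truncated by the publisher
    image_url: Optional[str] = None   # often small or missing


def enrich(entry: FeedEntry, parsed: dict) -> FeedEntry:
    """Merge a parser result into a feed entry.

    `parsed` is assumed to be a dict with optional "content" and
    "lead_image_url" keys; both key names are assumptions of this sketch.
    """
    content = parsed.get("content") or ""
    # Use the parsed body only when it is longer than what the feed provided.
    body = content if len(content) > len(entry.summary) else entry.summary
    # Prefer the parser's image and fall back to the feed's own.
    image = parsed.get("lead_image_url") or entry.image_url
    return FeedEntry(entry.url, entry.title, body, image)
```

The fallback matters as much as the upgrade: if the parser returns nothing useful for a page, the reader still shows whatever the feed supplied.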
Imports and Pipelines
Outside the realm of RSS readers, other popular tools use Mercury for similar purposes. Apollo, a best-in-class Reddit client for iOS, uses Mercury to pull titles and preview images from articles when they’re not available. Bear, a note-taking and writing app, uses Mercury to import full articles from the web as notes in Bear. Zapier, the indispensable automated-workflow tool, provides an experimental workflow for parsing URLs on demand. Medium, the web’s foremost blogging and hand-clapping platform, uses Mercury to import content from elsewhere on the web into Medium.
A Better Reading Experience
Maybe you like to read your articles without the clutter and distraction of related links and chum and autoplaying videos? Postlight (that’s us, the product studio and Mercury Web Parser people who wrote this thing you’re reading, right now!) are using the Parser in Mercury Reader, an extension that brings distraction-free reading to Chrome — and according to Google, it currently has 1,764,216 users. A like-minded Android developer uses Mercury for clean, distraction-free reading in two of his popular Android apps: Pulse SMS and Talon for Twitter (he’s also released an open-source Android library so any Android dev can easily do the same).
Passion Projects
Other independent developers are using Mercury to power ambitious independent projects or small personal scripts. See, for example:
- Brighter Timeline, an open-source, Trump-free news aggregator.
- Viomatic, a tool that turns articles into narrated videos using images and text extracted via Mercury.
- latr.fm, a read-later service for podcasts.
- tlrl, an open-source, send-to-Kindle command-line tool.
- An InDesign script that allows designers to easily import a web page into InDesign.
And these are just a few of the tools and apps we know about. Thousands of developers have signed up for Mercury API keys, and, as I mentioned above, we’re delivering them parsed web pages at a rate of over 4 million requests per day. It’s amazing to think that a few lines of code that began as a browser bookmarklet (long story) for finding text on a web page have found a home in some of the apps we love most.
Why Mercury?
The case for using Mercury in your app is the same as the case for any other tool: developers want to solve new problems, so rather than create a general-purpose parser to extract content from a chaotic web page, they turn to Mercury, or tools like it, so they can get on with the thing they REALLY want to do. As one student emailed us:
“Because of the service, I have been able to create my app without an actual backend…. I am glad I found such an easy and quick solution to parsing websites. It has allowed me to pursue this project in my free time as I am a student.”
We use similar tools in our own work here at Postlight: open-source libraries, third-party APIs, magical infrastructure tools and services. Mercury itself relies on a handful of open-source libraries, each of which relies on open-source dependencies of its own. We love solving tough problems, but there’s nothing better than realizing that someone has already found, attacked, and brilliantly solved a problem that was in your way. Especially when you get to pay it back.
Mercury in particular is near and dear to this programmer’s heart. The idea that eventually became Mercury started out as a couple hundred lines of JavaScript in a bookmarklet written years ago by Postlight’s co-founder, Rich Ziade. When I was first teaching myself to program, I rewrote that bookmarklet in another language as a learning experience — and because I wanted to use it in a side project that tracked changes on web pages over time. (It was called ToDiffer!) Unfortunately, reader, the result was not good: my rewrite didn’t work very well.
I failed, but that bookmarklet spawned several other open-source libraries in many different languages, written by considerably more skilled programmers than me, and I was able to use their tools when I didn’t have the time or skill to solve that problem myself.
Then — roughly 7 years after my failed attempt at rewriting that bookmarklet — I joined Postlight, and through some bizarre cosmic kismet, my first assignment was to rewrite a substantially more complex version of that bookmarklet from Python back to JavaScript so that we could release the Mercury Web Parser. This time, I like to think, I did a not-so-terrible job.
For me, that experience serves as a great reminder that the only reason anything in our weird, sometimes terrible, incredible world of technology works at all is that programmers are so often willing to share tricks, tools, software, and APIs that solve a problem, allowing someone else to focus on a larger or more specific one. And I’m so happy Mercury fills that role for so many people.
So, You Want to Try Mercury?
It’s easy (and free!) to integrate Mercury into something you’re building. Just sign up for a Mercury account to get started (see API details here). While you’re at it, you may want to try one of these unofficial third-party Mercury Parser wrapper libraries in your language of choice:
- Node: mercury-client, mercury-parser, node-mercury-parser
- R: hgr, mercury, postlightmercury
- Ruby: mercury_parser, mercury_web_parser
- Elixir: mercury_ex
- Java: mercury-web-parser-java
- Go: gomercury
- Python: mercury-parsepy, mercury-parser, python-mercury-api
(Know of any other Mercury API wrappers? Written one yourself? Please let us know!)
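If you would rather skip the wrappers, a direct call is short. The Python sketch below uses only the standard library and rests on stated assumptions: the endpoint URL, the `x-api-key` header name, and the response field names are not given in this post, so treat them as placeholders and check them against the API details. The helper names `build_request`, `summarize`, and `parse_article` are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed values: confirm both against the API details before relying on them.
PARSER_ENDPOINT = "https://mercury.postlight.com/parser"
API_KEY_HEADER = "x-api-key"


def build_request(article_url: str, api_key: str) -> Request:
    """Build a GET request asking the parser to process one article URL."""
    query = urlencode({"url": article_url})  # percent-encodes the article URL
    return Request(f"{PARSER_ENDPOINT}?{query}", headers={API_KEY_HEADER: api_key})


def summarize(raw_json: str) -> dict:
    """Pick a few fields out of a JSON response body.

    The field names are assumptions; any that are absent come back as None.
    """
    data = json.loads(raw_json)
    wanted = ("title", "author", "lead_image_url", "word_count")
    return {key: data.get(key) for key in wanted}


def parse_article(article_url: str, api_key: str) -> dict:
    """Send the request over the network and summarize the response."""
    with urlopen(build_request(article_url, api_key), timeout=10) as resp:
        return summarize(resp.read().decode("utf-8"))
```

Calling `parse_article("https://example.com/post", "YOUR_API_KEY")` would perform the actual request. Keeping `build_request` and `summarize` separate from the network call means both can be tested without an API key.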
Story published on Jun 4, 2018.