I have spent the past couple of weeks fixated on implementing Webmentions on this website, and I have finally deployed the feature!
Webmention is a protocol for notifying a website when another website mentions it. It’s as simple as a
POST request with the URL of the mentioning page as the
source and the URL of the mentioned page as the
target. The rest of the specification is around discovering support and verifying requests.
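As a rough illustration of how small the notification itself is (this is not the code running on this site), here is a sketch that builds the form-encoded POST using only the Python standard library. The function name and the example.com / example.org URLs are made up for the example.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_webmention_request(endpoint: str, source: str, target: str) -> Request:
    """Build (but do not send) the form-encoded POST that notifies `endpoint`."""
    body = urlencode({"source": source, "target": target}).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical URLs, for illustration only.
request = build_webmention_request(
    endpoint="https://example.org/webmention",
    source="https://example.com/posts/hello",
    target="https://example.org/articles/1",
)
# Passing `request` to urllib.request.urlopen() would actually send it.
```

Everything else in a sender (finding the endpoint, retrying, recording the result) sits around this one request.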
As simple as the protocol is, I found it surprisingly complicated to implement. Answering the “you’ve received a webmention, now what?” question is a lot harder than handling the request itself. On top of that, this website is the only Django project I work on, so I’m still somewhat unfamiliar with the framework.
I had previously attempted to implement sending webmentions on this site, but I got stuck on receiving them, and my implementation for sending turned out to be fragile. I looked at some of the available Python packages for this, but none worked the way I thought they should.
django-wm came the closest, and some of my architecture is inspired by that project. My new implementation overhauls the sending functionality and adds support for receiving. Here’s an overview of what I’ve done:
- Queue saved content for later processing.
- Parse saved content for links.
- Send webmentions to sites that support them.
Queue Saved Content for Later Processing
Whenever I save a published item that supports webmentions, I also save an
OutgoingContent record with the permalink of the saved item.
With my previous implementation, the save is where I also tried to parse the content for links and notify the referenced websites. Almost always, this returned an error because the freshly saved item would not yet be available on my website to be verified. Also, saving models had to wait for these HTTP requests to respond before finishing, so saving took a few noticeable extra seconds.
The new implementation saves the
OutgoingContent for later processing. So far I’ve been running that processing manually, but I have a Django management command ready to hook up to a cron job and run on a schedule.
OutgoingContent records have a unique constraint on their
source URL field, so if an item is saved multiple times before processing, it won't be queued multiple times.
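The effect of that unique constraint can be sketched outside Django with the standard library’s sqlite3 module. This is only a stand-in for the real model: the table name, column names, and `queue_content` helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE outgoing_content ("
    " id INTEGER PRIMARY KEY,"
    " source TEXT NOT NULL UNIQUE,"  # one queued row per permalink
    " processed INTEGER NOT NULL DEFAULT 0)"
)


def queue_content(conn: sqlite3.Connection, source: str) -> bool:
    """Queue `source` for later processing; return True if a new row was added."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO outgoing_content (source) VALUES (?)", (source,)
    )
    return cursor.rowcount == 1


# Saving the same (hypothetical) item twice only queues it once.
first = queue_content(conn, "https://example.com/posts/hello")
second = queue_content(conn, "https://example.com/posts/hello")
queued = conn.execute("SELECT COUNT(*) FROM outgoing_content").fetchone()[0]
```

In a Django model the same idea would usually be expressed with a unique field plus a get-or-create style lookup rather than raw SQL.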
Parse Saved Content for Links
Processing an OutgoingContent involves sending a
GET request to its URL, parsing the response for an
h-entry microformat, and getting all of the URLs linked in it. All of my website items are marked up with microformats so I know it’s available, and limiting the parsing to the
h-entry means I won’t try to send webmentions to any miscellaneous links on my webpage (like the navigation, for example). Using a live URL ensures that it will also be available to the target URLs when they validate the webmention.
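To show why scoping to the h-entry matters, here is a simplified sketch that collects only the links inside an element with the `h-entry` class, using the standard library’s HTML parser. It is an assumption-laden stand-in: a dedicated microformats parser would be more robust, and this version assumes well-formed markup where every non-void element has a closing tag. The class name and sample HTML are invented for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "source", "track", "wbr"}


class HEntryLinkParser(HTMLParser):
    """Collect absolute URLs of <a href> elements nested inside an h-entry."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.depth = 0  # greater than 0 while inside an h-entry element
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self.depth > 0:
            if tag == "a" and attrs.get("href"):
                url = urljoin(self.base_url, attrs["href"])
                if url not in self.links:
                    self.links.append(url)
            if tag not in VOID_ELEMENTS:
                self.depth += 1
        elif "h-entry" in classes and tag not in VOID_ELEMENTS:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID_ELEMENTS:
            self.depth -= 1


SAMPLE_HTML = """
<nav><a href="/about">About</a></nav>
<article class="h-entry">
  <p class="e-content">See <a href="https://example.org/post">this</a>
  and <a href="/notes/1">that</a>.<br></p>
</article>
<footer><a href="/feed">Feed</a></footer>
"""

parser = HEntryLinkParser("https://example.com/posts/hello")
parser.feed(SAMPLE_HTML)
links = parser.links
```

The navigation and footer links are skipped because they sit outside the h-entry, so only the two links in the post body would become webmention targets.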
For each link found in the content, an
OutgoingWebmention record is created (or retrieved if one already exists). Similar to
OutgoingContent,
OutgoingWebmention has a unique constraint on the combined
source and
target fields. If the record already exists, its fields are reset so that it can be reprocessed.
OutgoingWebmentions that already exist with a
source that matches the
OutgoingContent source are reset for reprocessing. This captures the scenario where a link was removed from the content and the other website should be notified of its removal.
Send Webmentions to Sites That Support Them
For each OutgoingWebmention that needs processing, a
GET request is sent to its
target URL. The response is parsed looking for a webmention endpoint link. This process is laid out in the specification nicely.
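A simplified sketch of that discovery step follows: check the HTTP `Link` header first, then fall back to the first `<link>` or `<a>` element with `rel="webmention"`, resolving relative URLs against the target. The function and class names are hypothetical, and the header parsing is deliberately naive (it splits on commas, so it would mishandle a URL containing one).

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class _EndpointParser(HTMLParser):
    """Remember the href of the first <link>/<a> whose rel includes webmention."""

    def __init__(self):
        super().__init__()
        self.endpoint = None

    def handle_starttag(self, tag, attrs):
        if self.endpoint is not None or tag not in ("link", "a"):
            return
        attrs = dict(attrs)
        rels = (attrs.get("rel") or "").lower().split()
        if "webmention" in rels and attrs.get("href") is not None:
            self.endpoint = attrs["href"]


def discover_endpoint(target_url, link_header, html):
    """Return the absolute webmention endpoint for `target_url`, or None."""
    # 1. HTTP Link header, e.g. <https://example.org/wm>; rel="webmention"
    for part in (link_header or "").split(","):
        match = re.match(r"\s*<([^>]*)>\s*;(.*)", part, re.S)
        if match:
            rel = re.search(r'rel\s*=\s*"?([^";]*)"?', match.group(2))
            if rel and "webmention" in rel.group(1).lower().split():
                return urljoin(target_url, match.group(1))
    # 2. First matching <link> or <a> element in the document.
    parser = _EndpointParser()
    parser.feed(html)
    if parser.endpoint is not None:
        return urljoin(target_url, parser.endpoint)
    return None
```

For example, `discover_endpoint("https://example.org/articles/1", None, '<link rel="webmention" href="/webmention">')` resolves the relative href against the target URL.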
If a webmention endpoint is found, then the webmention
POST request is sent to that endpoint. The result is then stored with the
OutgoingWebmention, along with the number of attempts made for that
OutgoingWebmention. For the scheduled process, I plan on limiting the number of attempts: enough to retry failures that are the result of a network blip, but not so many that I keep sending unnecessary requests to websites that don’t support webmentions.
I have a single endpoint set up to receive webmentions, and I advertise it with a
link[rel=webmention] in the
head on my Detail views.
Models that can receive webmentions inherit from a
Webmentionable abstract model class that sets up a relation to
IncomingWebmention. Additionally, the
urlpatterns for routes that accept webmentions contain an extra
wm_model_name so that a query can be constructed with that data.
On receipt of a webmention, I parse the request for the
source and
target values and create-or-get an
IncomingWebmention with that data. Like
OutgoingWebmentions,
IncomingWebmentions have a unique constraint on those fields so that multiple requests for the same webmention will not result in multiple records in my database.
If the record already exists, its processing fields are reset so that it will be reprocessed. This handles cases where the webmention is notifying me of updated content, so my website can update accordingly.
The request itself is then verified. At this stage, all that is checked is that the request itself is valid. Are the source and target valid URLs? Are the source and target different URLs? Does the resource at the target URL accept webmentions?
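Those request-level checks could look something like the sketch below. It is not the site’s actual view code: `validate_webmention` and the `accepts_webmentions` callback are hypothetical names, and the callback stands in for whatever lookup decides whether the target resource supports webmentions.

```python
from urllib.parse import urlparse


def validate_webmention(source, target, accepts_webmentions):
    """Return an error message, or None if the request can be queued."""
    for name, url in (("source", source), ("target", target)):
        parts = urlparse(url or "")
        if parts.scheme not in ("http", "https") or not parts.netloc:
            return f"{name} is not a valid http(s) URL"
    if source == target:
        return "source and target must be different URLs"
    if not accepts_webmentions(target):
        return "target does not accept webmentions"
    return None


# Hypothetical rule: only URLs under /posts/ accept webmentions.
def accepts(url):
    return urlparse(url).path.startswith("/posts/")


ok = validate_webmention(
    "https://example.org/reply", "https://example.com/posts/hello", accepts
)
same = validate_webmention(
    "https://example.com/posts/hello", "https://example.com/posts/hello", accepts
)
```

A view built on this would respond with a 400 when a message comes back and accept the webmention otherwise.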
Once this is done, if everything is in order, the endpoint returns a 202 indicating that the webmention has been accepted and queued for processing.
As with sending webmentions, I intend for the processing to be automated, but for the time being I’m handling it manually.
For each IncomingWebmention, the content is retrieved from the
source URL and the existence of the
target URL in its content is verified.
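A minimal sketch of that verification, assuming the source document is HTML: collect every `href` and `src` attribute and look for an exact match of the target. The names are hypothetical, and relative URLs in the source are not resolved here.

```python
from html.parser import HTMLParser


class _LinkCollector(HTMLParser):
    """Collect the values of all href and src attributes in a document."""

    def __init__(self):
        super().__init__()
        self.urls = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.urls.add(value)


def source_mentions_target(source_html: str, target: str) -> bool:
    """Return True if `source_html` links to `target` with an exact URL match."""
    collector = _LinkCollector()
    collector.feed(source_html)
    return target in collector.urls
```

Checking attributes rather than searching the raw text avoids accepting a page that merely prints the URL without linking to it.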
Then the content is parsed for microformats. If an
h-entry exists, then it’s saved with the
IncomingWebmention. If the
h-entry has an
author property, it is parsed for an
h-card. If no
h-card is found there, then the whole document is parsed for an
h-card. The
h-card is also saved with the record.
The target URL is resolved using Django’s
django.urls.resolve to find the view for the URL, and through the information in the URL pattern, the
IncomingWebmention is attached to the record being mentioned.
Similar to the
OutgoingWebmentions, I’m saving the number of processing attempts and plan on a limit when this piece is automated.
For IncomingWebmentions that have been processed, the final step is approval, signifying that the webmention is okay to display on the website.
I’m displaying approved webmentions in a few different ways on the site.
First, replies are listed in chronological order under a post as comments. After that, I’m displaying any mentions that are from my own site; this becomes a kind of related articles section. All other mentions are listed below that with simple “Mentioned by…” or “Bookmarked by…” text.
One of the tricky parts of implementing this feature is testing locally. Since both target and source must be available to each other for the process to work, I could only test with my own content. I have no doubt I’ll see bugs pop up once external webmentions come in, but I’m happy with what I have so far and will improve it as I go.
Looking forward, I’d like to add support for allow/deny lists for auto-approving or auto-rejecting webmentions from certain sources. I’d also like to support likes and reposts as distinct from generic mentions. I also have a vague notion of implementing native comments and/or likes through this same webmention framework.