Webmentions (Maybe?)

I have spent the past couple of weeks fixated on implementing Webmentions on this website, and I have finally deployed the feature! (I think)

Webmentions are a protocol for notifying a website when another website mentions it. It’s as simple as a POST request with the URL of the mentioning page as the source and the URL of the mentioned page as the target. The rest of the specification is about discovering support and verifying requests.
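
In Python terms, the notification is just a form-encoded POST. All URLs here are placeholders; the endpoint itself would normally be discovered from the target page first (covered below):

    import requests

    # Notify the target site that my page links to one of theirs.
    requests.post(
        "https://example.com/webmention-endpoint",
        data={
            "source": "https://my-site.example/posts/my-post/",  # page doing the mentioning
            "target": "https://example.com/posts/their-post/",   # page being mentioned
        },
    )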

As simple as the protocol is, I found it surprisingly complicated to implement. The wire format is straightforward, but answering the “you’ve received a webmention, now what?” question is a lot harder. On top of that, this website is the only Django project I work on, so I’m not deeply familiar with the framework.

I had previously attempted to implement sending webmentions on this site, but I got stuck on receiving them, and my sending implementation turned out to be fragile. I looked at some available Python packages for this, but none worked the way I thought they should. django-wm came the closest, and some of my architecture is inspired by that project. My new implementation overhauls the sending functionality and adds support for receiving. Here’s an overview of what I’ve done:

Sending Webmentions

  1. Queue saved content for later processing.
  2. Parse saved content for links.
  3. Send webmentions to sites that support them.

Queue Saved Content for Later Processing

Whenever I save a published item that supports webmentions, I also save an OutgoingContent record with the permalink of the saved item.

With my previous implementation, the save was also where I tried to parse the content for links and notify the referenced websites. This almost always returned an error, because the freshly saved item was not yet available on my website to be verified. Saving also had to wait for those HTTP requests to respond, which added a few noticeable seconds.

The new implementation saves the OutgoingContent for later processing. So far I’ve been doing this manually, but I have a Django management command ready to set up with a cron job and run on a schedule.
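
The command itself is a thin wrapper around the processing step. A sketch, with illustrative import paths and a hypothetical helper name:

    from django.core.management.base import BaseCommand

    from webmentions.models import OutgoingContent  # illustrative import path
    from webmentions.processing import process_outgoing_content  # hypothetical helper

    class Command(BaseCommand):
        help = "Process queued OutgoingContent records"

        def handle(self, *args, **options):
            for content in OutgoingContent.objects.all():
                # Parse for links and send webmentions, as described below.
                process_outgoing_content(content)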

OutgoingContent records have a unique constraint on their source URL field, so if an item is saved multiple times before processing, it won’t be queued multiple times.
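
In outline, the model and the queueing hook might look something like this (field names are simplified, and the permalink property is hypothetical):

    from django.db import models

    class OutgoingContent(models.Model):
        # Full permalink of the saved item; unique so repeated saves
        # before processing don't queue duplicates.
        source = models.URLField(unique=True)
        created = models.DateTimeField(auto_now_add=True)

    # Called from save() (or a post_save signal) on published items;
    # get_or_create respects the unique constraint.
    def queue_outgoing_content(item):
        OutgoingContent.objects.get_or_create(source=item.permalink)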

Parse Saved Content for Links

Processing an OutgoingContent record involves sending a GET request to its URL, parsing the response for an h-entry microformat, and collecting all of the URLs linked within it. All of my website items are marked up with microformats, so I know an h-entry will be there, and limiting the parsing to the h-entry means I won’t try to send webmentions for miscellaneous links on the page (the navigation, for example). Using a live URL also ensures the content will be available to the target sites when they validate the webmention.
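
A minimal sketch of that step, assuming mf2py for the microformats parsing and BeautifulSoup for the link extraction (not necessarily my exact dependencies):

    import mf2py
    from bs4 import BeautifulSoup

    def links_in_h_entry(url):
        parsed = mf2py.parse(url=url)  # fetches and parses the live page
        for item in parsed["items"]:
            if "h-entry" in item["type"]:
                # e-content parses into {"html": ..., "value": ...}
                for content in item["properties"].get("content", []):
                    soup = BeautifulSoup(content.get("html", ""), "html.parser")
                    return [a["href"] for a in soup.find_all("a", href=True)]
        return []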

For each link found in the content, an OutgoingWebmention record is created (or retrieved if one already exists). Similar to OutgoingContent, OutgoingWebmention has a unique constraint, here on the combined source and target fields. If the record already exists, its fields are reset so that it can be reprocessed.

Additionally, all OutgoingWebmentions that already exist with a source matching the OutgoingContent source are reset for reprocessing. This captures the scenario where a link was removed from the content and the other website should be notified of its removal.
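
Sketched out, with illustrative status fields:

    from webmentions.models import OutgoingWebmention  # illustrative import path

    def queue_webmentions(source, targets):
        # Reset everything previously sent from this source, so targets of
        # since-removed links are notified too.
        OutgoingWebmention.objects.filter(source=source).update(
            processed=False, attempts=0
        )
        for target in targets:
            # Unique on (source, target); get_or_create avoids duplicates,
            # and the reset above covers pre-existing records.
            OutgoingWebmention.objects.get_or_create(source=source, target=target)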

Send Webmentions to Sites That Support Them

For each OutgoingWebmention that needs processing, a GET request is sent to its target URL, and the response is parsed for a webmention endpoint link. This discovery process is laid out nicely in the specification.

If a webmention endpoint is found, the webmention POST request is sent to that endpoint, and the result is stored on the OutgoingWebmention along with the number of attempts. For the scheduled process, I plan to limit the number of attempts, allowing retries for failures that are the result of a network blip while avoiding unnecessary requests to websites that don’t support webmentions.
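
Discovery checks the HTTP Link header first, then link and a elements in the returned HTML. A condensed sketch (error handling and some spec edge cases omitted):

    from urllib.parse import urljoin

    import requests
    from bs4 import BeautifulSoup

    def discover_endpoint(target):
        resp = requests.get(target)
        # requests exposes parsed Link headers on resp.links, keyed by rel.
        if "webmention" in resp.links:
            return urljoin(target, resp.links["webmention"]["url"])
        soup = BeautifulSoup(resp.text, "html.parser")
        element = soup.find(["link", "a"], rel="webmention")
        if element and element.get("href") is not None:
            return urljoin(target, element["href"])  # endpoint may be relative
        return None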

Receiving Webmentions

  1. Receive webmention.
  2. Process webmention.

Receive Webmention

I have a single endpoint set up to receive webmentions, and I advertise it with a link[rel=webmention] in the <head> of my detail views.

Models that can receive webmentions inherit from a Webmentionable abstract model class that sets up a GenericRelation with IncomingWebmention. Additionally, the urlpatterns for routes that accept webmentions include an extra dict with wm_app_name and wm_model_name, so that a query can be constructed from that data later.
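
Condensed, with placeholder app, view, and model names:

    # models.py
    from django.contrib.contenttypes.fields import GenericRelation
    from django.db import models

    class Webmentionable(models.Model):
        webmentions = GenericRelation("webmentions.IncomingWebmention")

        class Meta:
            abstract = True

    # urls.py -- the extra dict rides along on the ResolverMatch later
    from django.urls import path

    urlpatterns = [
        path(
            "posts/<slug:slug>/",
            PostDetailView.as_view(),  # placeholder detail view
            {"wm_app_name": "blog", "wm_model_name": "post"},
            name="post_detail",
        ),
    ]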

On receipt of a webmention, I parse the request for the source and target values and get-or-create an IncomingWebmention with that data. Like OutgoingWebmentions, IncomingWebmentions have a unique constraint on those fields, so multiple requests for the same webmention will not result in multiple records in my database.

If the record already exists, its processing fields are reset so that it will be reprocessed. This handles the case where the webmention is notifying me of updated content, so my website can update accordingly.

The request itself is then verified. For now, all that’s verified is that the request is well formed: Are the source and target values valid URLs? Are the source and target the same URL (they must not be)? Does the resource at the target URL accept webmentions?

Once this is done, if everything is in order, the endpoint returns a 202 indicating that the webmention has been accepted and queued for processing.
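
Stripped down, the endpoint looks roughly like this; the validation helper stands in for the checks above:

    from django.http import HttpResponse, HttpResponseBadRequest
    from django.views.decorators.csrf import csrf_exempt
    from django.views.decorators.http import require_POST

    from webmentions.models import IncomingWebmention  # illustrative import path

    @csrf_exempt
    @require_POST
    def webmention_endpoint(request):
        source = request.POST.get("source", "")
        target = request.POST.get("target", "")
        if not is_valid_webmention(source, target):  # hypothetical helper
            return HttpResponseBadRequest("invalid webmention")
        mention, created = IncomingWebmention.objects.get_or_create(
            source=source, target=target
        )
        if not created:
            mention.reset_processing()  # hypothetical: flag for reprocessing
        return HttpResponse(status=202)  # accepted and queued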

Process Webmention

As with sending webmentions, I intend for the processing to be automated, but for the time being I’m handling it manually.

For an IncomingWebmention, the content is retrieved from the source URL and the existence of the target URL in that content is verified.

Then the content is parsed for microformats. If an h-entry exists, then it’s saved with the IncomingWebmention. If the h-entry has an author property, it is parsed for an h-card. If no h-card is found there, then the whole document is parsed for an h-card. The h-card is also saved with the record.
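
The h-card lookup order sketched out, again assuming mf2py (the target-link check from the previous step is omitted):

    import mf2py

    def extract_entry_and_card(source_url):
        parsed = mf2py.parse(url=source_url)
        entry = next((i for i in parsed["items"] if "h-entry" in i["type"]), None)
        card = None
        if entry:
            # Prefer an h-card embedded in the entry's author property...
            for author in entry["properties"].get("author", []):
                if isinstance(author, dict) and "h-card" in author.get("type", []):
                    card = author
                    break
        if card is None:
            # ...falling back to a top-level h-card in the document.
            card = next((i for i in parsed["items"] if "h-card" in i["type"]), None)
        return entry, card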

Finally, the target URL is resolved using Django’s django.urls.resolve to find the view for the URL, and, using the extra information attached to the URL pattern, the IncomingWebmention is attached to the record being mentioned.
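
Roughly like so, with the slug lookup standing in for however each view identifies its object:

    from urllib.parse import urlparse

    from django.apps import apps
    from django.urls import resolve

    def attach_to_target(mention):
        match = resolve(urlparse(mention.target).path)
        # wm_app_name and wm_model_name come from the extra dict in urlpatterns.
        model = apps.get_model(
            match.kwargs["wm_app_name"], match.kwargs["wm_model_name"]
        )
        # Assumes a GenericForeignKey named content_object on IncomingWebmention.
        mention.content_object = model.objects.get(slug=match.kwargs["slug"])
        mention.save()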

Similar to the OutgoingWebmentions, I’m saving the number of processing attempts and plan to add a limit once this piece is automated.

For all IncomingWebmentions that have been processed, the final step is approval, signifying that the webmention is okay to display on the website.

Displaying Webmentions

For approved webmentions, I’m displaying them in a few different ways on the site.

First, replies are listed in chronological order under a post as comments. After that, I’m displaying any mentions that are from my own site; this becomes a kind of related articles section. All other mentions are listed below that with simple “Mentioned by…” or “Bookmarked by…” text.
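
In queryset terms, that ordering is a few filters. A sketch with illustrative field names and a placeholder domain:

    approved = post.webmentions.filter(approved=True)
    replies = approved.filter(kind="reply").order_by("published")
    own_site = approved.exclude(kind="reply").filter(
        source__startswith="https://my-site.example/"
    )
    other_mentions = approved.exclude(kind="reply").exclude(
        source__startswith="https://my-site.example/"
    )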

You can see a reply in action on this photo and first-party mentions on this post.

Closing Thoughts

One of the hardest parts of implementing this feature was testing locally. Since both the target and the source must be available to each other for the process to work, I could only test with my own content. I have no doubt bugs will pop up once external webmentions come in, but I’m happy with what I have so far and will improve as I go.

Looking forward, I’d like to add support for allow/deny lists that auto-approve or auto-reject webmentions from certain sources. I’d also like to support likes and reposts as distinct from generic mentions. And I have a vague notion of implementing native comments and/or likes through this same webmention framework.

I could not have implemented this without the well-written spec, the IndieWeb documentation, Webmention Rocks, and snooping through the HTML of Jeremy Keith’s site. Thanks to everyone who put in the work there.