I have spent the past couple of weeks fixated on implementing Webmentions on this website, and I have finally deployed the feature!
Webmentions are a protocol for notifying a website when another website mentions it. It’s as simple as a POST request with the URL of the mentioning website as the source and the URL of the mentioned website as the target. The rest of the specification covers discovering support and verifying requests.
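To make the request concrete, here is a minimal sketch using only the Python standard library. It builds the form-encoded POST but does not send it. The function name and the example URLs are hypothetical, and this is not the code running on this site.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_webmention_request(endpoint: str, source: str, target: str) -> Request:
    """Build (but do not send) the form-encoded POST for a webmention."""
    body = urlencode({"source": source, "target": target}).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical URLs, for illustration only.
request = build_webmention_request(
    "https://example.org/webmention",
    source="https://example.com/notes/1",
    target="https://example.org/articles/2",
)
```

Passing the request to urllib.request.urlopen would actually send it.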
As simple as the protocol is, I found it surprisingly complicated to implement. The protocol itself is straightforward, but answering the “you’ve received a webmention, now what?” question is a lot harder. On top of that, this website is the only Django project I work on, so I’m not deeply familiar with the framework.
I had previously attempted to implement sending webmentions on this site, but I got stuck on receiving them, and my implementation for sending turned out to be fragile. I looked at some available Python packages, but none worked the way I thought it should. django-wm came the closest, and some of my architecture is inspired by that project. My new implementation overhauls the sending functionality and adds support for receiving. Here’s an overview of what I’ve done:
Sending Webmentions
- Queue saved content for later processing.
- Parse saved content for links.
- Send webmentions to sites that support them.
Queue Saved Content for Later Processing
Whenever I save a published item that supports webmentions, I also save an OutgoingContent record with the permalink of the saved item.
With my previous implementation, the save is where I also tried to parse the content for links and notify the referenced websites. Almost always, this returned an error because the freshly saved item would not yet be available on my website to be verified. Also, saving models had to wait for these HTTP requests to respond before finishing, so saving took a few noticeable extra seconds.
The new implementation saves the OutgoingContent
for later processing. So far I’ve been doing this manually, but I have a Django Management Command ready to set up with a cron job and run on a schedule.
OutgoingContent records have a unique constraint on their source URL field, so if an item is saved multiple times before processing, it won’t be queued multiple times.
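The effect of that unique constraint can be sketched outside of Django with the standard library’s sqlite3 module. This swaps the Django model for a plain table, and the table and column names are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE outgoing_content (
        id INTEGER PRIMARY KEY,
        source TEXT NOT NULL UNIQUE  -- the permalink of the saved item
    )
    """
)


def queue_content(conn: sqlite3.Connection, source: str) -> None:
    """Queue a permalink; a repeat save of the same permalink is a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO outgoing_content (source) VALUES (?)",
        (source,),
    )


# Saving the same (hypothetical) item twice leaves a single queued row.
queue_content(conn, "https://example.com/notes/1")
queue_content(conn, "https://example.com/notes/1")
```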
Parse Saved Content for Links
Processing the OutgoingContent involves sending a GET request to its URL, parsing the response for an h-entry microformat, and getting all of the URLs linked in it. All of my website items are marked up with microformats so I know it’s available, and limiting the parsing to the h-entry means I won’t try to send webmentions to any miscellaneous links on my webpage (like the navigation, for example). Using a live URL ensures that it will also be available to the target URLs when they validate the webmention.
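A rough sketch of the link extraction, using the standard library’s HTMLParser rather than a dedicated microformats parser. The class and function names are hypothetical, and it assumes well-formed HTML in which every non-void element is explicitly closed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class HEntryLinkParser(HTMLParser):
    """Collect href values of <a> elements nested inside an h-entry."""

    def __init__(self):
        super().__init__()
        self.depth = 0  # 0 means we are outside any h-entry
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.depth == 0:
            classes = (attrs.get("class") or "").split()
            if "h-entry" in classes and tag not in VOID_ELEMENTS:
                self.depth = 1
            return
        if tag not in VOID_ELEMENTS:
            self.depth += 1
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID_ELEMENTS:
            self.depth -= 1


def extract_entry_links(html: str, base_url: str) -> list:
    """Return absolute, de-duplicated link URLs found inside h-entry markup."""
    parser = HEntryLinkParser()
    parser.feed(html)
    absolute = (urljoin(base_url, href) for href in parser.links)
    return list(dict.fromkeys(absolute))  # keeps first-seen order
```

Links in the navigation or footer sit outside the h-entry, so they never reach the list.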
For each link found in the content, an OutgoingWebmention record is created (or retrieved if one already exists). Similar to OutgoingContent, OutgoingWebmention has a unique constraint on the combined source and target fields. If the record already exists, its fields are reset so that it can be reprocessed.
Additionally, all OutgoingWebmentions that already exist with a source that matches the OutgoingContent source are reset for reprocessing. This captures the scenario where a link was removed from the content and the other website should be notified of its removal.
Send Webmentions to Sites That Support Them
For each OutgoingWebmention that needs processing, a GET request is sent to its target URL. The response is parsed looking for a webmention endpoint link. This process is laid out nicely in the specification.
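Endpoint discovery could look roughly like the sketch below. It checks the HTTP Link header first, then falls back to the first link or a element with rel=webmention in the HTML, and resolves the result against the target URL. Fetching the page is left out, the function and class names are hypothetical, and the Link header parsing is deliberately simplified (it would not cope with commas inside quoted parameter values).

```python
import re
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


def endpoint_from_link_header(header: str) -> Optional[str]:
    """Return the first URL in a Link header whose rel includes webmention."""
    for match in re.finditer(r"<([^>]*)>([^,]*)", header):
        url, params = match.groups()
        rel = re.search(r';\s*rel\s*=\s*"?([^";]*)"?', params, re.IGNORECASE)
        if rel and "webmention" in rel.group(1).lower().split():
            return url
    return None


class EndpointParser(HTMLParser):
    """Remember the href of the first <link> or <a> with rel=webmention."""

    def __init__(self):
        super().__init__()
        self.endpoint = None

    def handle_starttag(self, tag, attrs):
        if self.endpoint is not None or tag not in ("link", "a"):
            return
        attrs = dict(attrs)
        rels = (attrs.get("rel") or "").lower().split()
        if "webmention" in rels and attrs.get("href") is not None:
            self.endpoint = attrs["href"]


def discover_endpoint(
    target_url: str, link_header: Optional[str], html: str
) -> Optional[str]:
    """Return the absolute endpoint URL for a target, or None if not found."""
    found = endpoint_from_link_header(link_header or "")
    if found is None:
        parser = EndpointParser()
        parser.feed(html)
        found = parser.endpoint
    if found is None:
        return None
    return urljoin(target_url, found)  # endpoints may be relative URLs
```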
If a webmention endpoint is found, then the webmention POST request is sent to that endpoint. I’m storing the result and the number of attempts for each OutgoingWebmention. For the scheduled process, I plan on limiting the number of attempts to allow for retrying failures that are the result of a network blip, while also avoiding unnecessary requests to websites that don’t support webmentions.
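One way the attempt limit might work, sketched with a plain dataclass instead of a Django model. The class name, the result strings, and the limit of three attempts are all assumptions; the actual limit has not been decided.

```python
from dataclasses import dataclass
from typing import Optional

MAX_ATTEMPTS = 3  # assumed limit, chosen only for illustration


@dataclass
class OutgoingAttempt:
    """Hypothetical stand-in for the fields tracked on an OutgoingWebmention."""

    source: str
    target: str
    attempts: int = 0
    result: Optional[str] = None  # e.g. "sent", "no endpoint", "network error"


def needs_processing(mention: OutgoingAttempt) -> bool:
    """True until the webmention is sent or the attempt limit is reached."""
    return mention.result != "sent" and mention.attempts < MAX_ATTEMPTS


def record_attempt(mention: OutgoingAttempt, result: str) -> None:
    """Store the outcome of one send attempt."""
    mention.attempts += 1
    mention.result = result


def reset(mention: OutgoingAttempt) -> None:
    """Clear the tracking fields so the webmention is picked up again."""
    mention.attempts = 0
    mention.result = None
```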
Receiving Webmentions
Receive Webmention
I have a single endpoint set up to receive webmentions, and I advertise it with a link[rel=webmention] in the head on my detail views.
Models that can receive webmentions inherit from a Webmentionable abstract model class that sets up a GenericRelation with IncomingWebmention. Additionally, the urlpatterns for routes that accept webmentions contain an extra dict with wm_app_name and wm_model_name so that a query can be constructed with that data.
On receipt of a webmention, I parse the request for the source and target values and get or create an IncomingWebmention with that data. Like OutgoingWebmentions, IncomingWebmentions have a unique constraint on those fields so that multiple requests for the same webmention will not result in multiple records in my database.
If the record already exists, its processing fields are reset so that it will be reprocessed. This handles cases where the webmention is notifying of updated content, so my website can update accordingly.
The request itself is then verified. For now, all that is verified is that the request itself is valid. Are the source and target URLs valid URLs? Are the source and target URLs the same (which is not allowed)? Does the resource at the target URL accept webmentions?
Once this is done, if everything is in order, the endpoint returns a 202 indicating that the webmention has been accepted and queued for processing.
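The checks above can be sketched as a small function that returns an HTTP status code. This is an illustration under assumptions, not the view running on this site: the function names are made up, and accepts_webmentions stands in for whatever lookup decides whether the target is a page that takes webmentions.

```python
from typing import Callable, Tuple
from urllib.parse import urlsplit


def is_http_url(value: str) -> bool:
    """True for absolute http(s) URLs with a host."""
    try:
        parts = urlsplit(value)
    except ValueError:  # raised for some malformed URLs
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def check_webmention_request(
    source: str,
    target: str,
    accepts_webmentions: Callable[[str], bool],
) -> Tuple[int, str]:
    """Return (status code, message) for an incoming webmention request."""
    if not source or not target:
        return 400, "source and target are required"
    if not (is_http_url(source) and is_http_url(target)):
        return 400, "source and target must be valid URLs"
    if source == target:
        return 400, "source and target must be different"
    if not accepts_webmentions(target):
        return 400, "target does not accept webmentions"
    return 202, "accepted and queued for processing"
```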
Process Webmention
As with sending webmentions, I intend for the processing to be automated, but for the time being I’m handling it manually.
For an IncomingWebmention, the content is retrieved from the source URL and the existence of the target URL in its content is verified.
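For an HTML source, that check can be as simple as looking for an href or src attribute that exactly matches the target. A minimal standard-library sketch, with hypothetical names and with fetching the source left out:

```python
from html.parser import HTMLParser


class TargetFinder(HTMLParser):
    """Look for any href or src attribute that exactly matches the target."""

    def __init__(self, target: str):
        super().__init__()
        self.target = target
        self.found = False

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value == self.target:
                self.found = True


def source_mentions_target(source_html: str, target: str) -> bool:
    """True if the fetched source document links to the target URL."""
    finder = TargetFinder(target)
    finder.feed(source_html)
    return finder.found
```

A source that only mentions the URL in plain text, without linking to it, would not pass this check.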
Then the content is parsed for microformats. If an h-entry exists, then it’s saved with the IncomingWebmention. If the h-entry has an author property, it is parsed for an h-card. If no h-card is found there, then the whole document is parsed for an h-card. The h-card is also saved with the record.
Finally, the target URL is resolved using Django’s django.urls.resolve to find the view for the URL, and through the information in the URL pattern, the IncomingWebmention is attached to the record being mentioned.
Similar to the OutgoingWebmentions, I’m saving the number of processing attempts and plan on a limit when this piece is automated.
For all IncomingWebmentions that have been processed, the final step is approving, signifying that the webmention is okay to display on the website.
Displaying Webmentions
For approved webmentions, I’m displaying them in a few different ways on the site.
First, replies are listed in chronological order under a post as comments. After that, I’m displaying any mentions that are from my own site; this becomes a kind of related articles section. All other mentions are listed below that with simple “Mentioned by…” or “Bookmarked by…” text.
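That ordering could be expressed as a small grouping function. In this sketch each webmention is a plain dict, and the "type", "source", and "published" keys are assumptions made for illustration. It also assumes a reply from my own site still counts as a reply.

```python
from typing import Dict, List, Tuple
from urllib.parse import urlsplit

Mention = Dict[str, str]


def group_for_display(
    mentions: List[Mention], own_host: str
) -> Tuple[List[Mention], List[Mention], List[Mention]]:
    """Split approved webmentions into replies, first-party mentions, and the rest."""
    replies = sorted(
        (m for m in mentions if m["type"] == "reply"),
        key=lambda m: m["published"],  # ISO 8601 strings sort chronologically
    )
    rest = [m for m in mentions if m["type"] != "reply"]
    own = [m for m in rest if urlsplit(m["source"]).hostname == own_host]
    others = [m for m in rest if urlsplit(m["source"]).hostname != own_host]
    return replies, own, others
```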
You can see a reply in action on this photo and first-party mentions on this post.
Closing Thoughts
One of the difficult parts of implementing this feature is testing locally. Since both target and source must be available to each other for the process to work, I could only test with my own content. I have no doubt I’ll see bugs pop up once external webmentions come in, but I’m happy with what I have so far and will improve it as I go.
Looking forward, I’d like to add support for allow/deny lists for auto approving/rejecting webmentions from certain sources. And I’d like to support likes and reposts as distinct from generic mentions as well. I also have a vague notion of implementing native comments and/or likes through this same webmention framework.
I could not have implemented this without the well-written spec, the IndieWeb documentation, Webmention Rocks, and snooping the HTML of Jeremy Keith. Thanks to everyone who put in the work there.