The Mastodon server I use recently experienced some technical issues: while we could all still talk to each other on the local feed (and our toots seemed to be reaching people outside of our instance), we could not see incoming toots, and our notifications were also broken.
But, as we could still access our data, many of us backed it up using the Export function in the preferences section of the web interface. However, once we have our data, what can actually be done with it? In this post, I’ll go through how I archive my toots using my Mac.
Toots?
Before I begin, I should explain, for those unaware, that Mastodon is a federated alternative to Twitter, and toots are that platform’s equivalent of tweets - though many prefer to call them posts or notes instead. I like using toots, but do alternate between that and posts. For clarity here, I’ll be using “toots” for the microblogging messages on Mastodon, and “posts” for the Hugo blog posts that come from .md files.
Micro Blog
Many wanted a backup in case we needed to migrate to a different instance, but I have long wanted to create an archive of my toots in case the instance went offline, or the admin decided to turn on automatic deletion of toots after a certain amount of time in order to save space.
I created a test Micro Blog containing some copied-and-pasted toots a few months ago, and while I was happy with the end result, actually getting there was slow and cumbersome because everything had to be copied and pasted toot by toot.
Spurred on by the technical issues of my instance, I decided to have another look at what could be done with the outbox.json file I extracted from the exported archive.
Scripts
I started with a web search and some scripts came up, but I ran into multiple errors. I’m not well versed in many languages - more of an HTML and CSS person, with dabbles in YAML and TOML to reconfigure Hugo. As such, even though the guides I found seemed straightforward, and probably are to those who regularly use Scala or JavaScript, I encountered errors I didn’t know how to fix.
However, I finally managed to get something working through a slightly modified version of a Node.js script by Chris Deluca:
#!/usr/bin/env node
import { readFile, writeFile } from 'node:fs/promises';
import { Buffer } from 'node:buffer';
import striptags from 'striptags';

(async () => {
  try {
    // Read and parse the exported outbox.json sitting next to this script.
    const filePath = new URL(
      './outbox.json',
      import.meta.url
    );
    const contents = await readFile(filePath, { encoding: 'utf8' });
    const data = JSON.parse(contents);

    data.orderedItems.forEach(async (item) => {
      // Skip replies and anything without content (such as boosts).
      if (item.object.inReplyTo || !item.object.content) return;

      const unixTimestamp = Math.floor(
        new Date(item.published).getTime()/1000
      );
      const publishDate = new Date(item.published).toISOString();

      // Hugo front matter (TOML), followed by the toot text with most of
      // the HTML stripped out.
      const template = `+++
title = "Note on ${item.published}"
slug = "${unixTimestamp}"
publishdate = "${publishDate}"
draft = false
syndicated = [ "${item.object.url || ''}" ]
+++
${striptags(unescape(item.object.content), {allowedTags: new Set([
  'a',
  'strong',
  'em',
])})}`;

      // One .md file per toot, named by its Unix timestamp.
      const fileData = new Uint8Array(Buffer.from(template));
      await writeFile(
        `/Users/Jessica/posts/${unixTimestamp}.md`,
        fileData
      );
    });
  } catch (err) {
    console.error(err.message);
  }
})();
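The script only relies on a handful of fields from each entry in `outbox.json`, which is an ActivityStreams collection with an `orderedItems` array. Stripped right down (with made-up values, and many of the real fields omitted), an item looks roughly like this:

{
  "orderedItems": [
    {
      "published": "2023-02-01T18:30:00Z",
      "object": {
        "content": "<p>The toot text, as HTML.</p>",
        "inReplyTo": null,
        "url": "https://example.social/@jessica/109789012345678901"
      }
    }
  ]
}

If `object` is just a URL string (boosts appear this way) or `inReplyTo` is set (a reply), the script skips that item.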
For some reason, I had an issue with the `./outbox.json` path passed to `new URL()`: I needed the `./` in front to make the pathing work at all, as without it, Node became confused and complained about lacking permissions.
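If you hit similar path confusion, it can help to log where the script is actually looking before it reads anything; `new URL()` with `import.meta.url` as the base resolves relative to the script file itself, not whichever folder Terminal happens to be in. A minimal check (assuming `convert.mjs` lives in your home folder, as in the guide below):

// Prints the absolute location convert.mjs will try to read, e.g.
// file:///Users/Jessica/outbox.json when the script sits in /Users/Jessica.
const filePath = new URL('./outbox.json', import.meta.url);
console.log(filePath.href);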
Also, as someone who doesn’t use Node.js often, I had to install striptags separately before the script would run.
As such, here is a step-by-step guide to how I did it, in case you need to start from scratch and, like me, don’t know much about this sort of thing!
Guide
I’ve written this guide for Mac as that’s what I’m using. The entire process uses Finder, Terminal, The Unarchiver, and CotEditor - though similar applications ought to work fine.
1. In Terminal, install Homebrew:
   /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
2. In Terminal, install Node.js:
   brew install node
3. In Terminal, install striptags:
   npm install striptags
4. Go to your Mastodon instance in a web browser, log in, and go to Preferences → Import and Export.
5. There should be a button to create an archive of your data.
6. Once the archive has finished collecting everything (this may take a long time depending on how much data there is), press Download Your Archive.
7. A .tar.gz file will download. In Finder, extract its contents with an extractor such as The Unarchiver or Keka.
8. In Finder, locate the `outbox.json` file and copy it into your home folder (mine is named `Jessica`). You can quickly get to your home folder by pressing Command, Shift and H at the same time (`command+shift+h`).
9. In a text editing application like CotEditor or Sublime Text, copy and paste the large JavaScript code above, and save the file as `convert.mjs` in your home folder.
10. Still in the text editor, change the output path in the `writeFile` call near the bottom of the script; it determines where the Hugo posts will be created. Change the home folder name from `Jessica` to your own.
11. In Finder, go into your home folder and create a new folder called `posts`.
12. In Terminal, you ought to be in your home folder automatically; you can confirm this by entering `pwd` and pressing Enter, which should return `/Users/[NAME]`. If you are elsewhere, keep entering `cd ..` and hitting Enter until you see `/ %` at the end, then type `cd /Users/[NAME]` and press Enter. Verify you are in the right place again by using `pwd` as described before.
13. In Terminal, type `node convert.mjs` and press Enter. It may not look like anything is happening, but if you use Finder to navigate to `posts` in your home folder, it should now be populated with .md files.
14. In Finder, copy and paste the .md files to wherever the posts go within your Hugo website.
15. You may need to edit the .md files slightly, due to possible formatting issues. You can do this within your text editor.
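For reference, each generated file should look roughly like this (the timestamp, URL, and toot text here are made up; the structure comes straight from the template in `convert.mjs`):

+++
title = "Note on 2023-02-01T18:30:00Z"
slug = "1675276200"
publishdate = "2023-02-01T18:30:00.000Z"
draft = false
syndicated = [ "https://example.social/@jessica/109789012345678901" ]
+++
Hello fediverse, this is what an archived toot ends up looking like as a Hugo post.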
Optional Tweaks
- By default, the script will generate .md files titled `Note on [TIME]` within the Hugo front matter (the `title` line in the template). You may wish to rename this.
- You may also wish to modify the slug (the last part of the URL the post will have) on the `slug` line just below it.
Limitations
- Only Public posts are converted. For example, replies, which are usually sent as Unlisted, will not appear, but if you sent them as Public, they will.
- Any `#hashtags` will appear in the .md file, but won’t be functional.
- Images didn’t import or load. This isn’t an issue for me as I rarely use them, but if they’re a significant part of your Mastodon experience, you may want to look into how images can also be imported into Hugo posts.
- Custom emojis didn’t appear either, with only their shortcode visible in the text, for example: `:infinity_rainbow:`
- Content warnings vanished too.
- The archive can only be requested every seven days, meaning your Hugo posts may often be out of date, depending on how regularly you post.
- You will also need to manually request the archive every seven days, extract it, and move the `outbox.json` file to the appropriate location.
- Running the script will create posts from the entirety of the `outbox.json` file. If you are doing this weekly, make a note of the most recent .md file created, and then, when you run the script the following week, delete the .md files older than it so you don’t end up with duplicates. Alternatively, you could copy and paste all the contents of `posts` to wherever the posts go within your Hugo website, being careful not to overwrite existing files and instead only adding the newly created ones.
- As I alluded to in step 15: “you may need to edit the .md files slightly, due to possible formatting issues”. This is because, while paragraphs appear normally within Mastodon, they do not convert well.
For example, a daily waffle game comes out looking like this: #waffle442 2/5🟩🟩🟩🟩🟩🟩⭐🟩⬜🟩🟩🟩🟩🟩🟩🟩⬜🟩⭐🟩🟩🟩🟩🟩🟩
Instead of this:
#waffle442
2/5
🟩🟩🟩🟩🟩
🟩⭐🟩⬜🟩
🟩🟩🟩🟩🟩
🟩⬜🟩⭐🟩
🟩🟩🟩🟩🟩
As such, you may need to spend time reformatting if you use paragraphs extensively, like me! (One possible tweak to the script is sketched below.)
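If that reformatting becomes tedious, one possible tweak - just a sketch, and not something I have actually tested - is to turn Mastodon’s `<p>` and `<br>` tags into plain newlines before the remaining HTML is stripped. The `tootToMarkdownText` name below is purely illustrative:

// Untested sketch: convert paragraph and line-break HTML into newlines
// before stripping the rest, so multi-paragraph toots keep their shape.
import striptags from 'striptags';

function tootToMarkdownText(html) {
  const withBreaks = html
    .replace(/<br\s*\/?>/gi, '\n')     // <br> becomes a newline
    .replace(/<\/p>\s*<p>/gi, '\n\n')  // a paragraph boundary becomes a blank line
    .replace(/<\/?p>/gi, '');          // drop the remaining <p> wrappers
  // Same allowed-tags call as convert.mjs already uses.
  return striptags(unescape(withBreaks), { allowedTags: new Set(['a', 'strong', 'em']) });
}

In `convert.mjs`, you would then call `tootToMarkdownText(item.object.content)` in place of the existing `striptags(...)` expression inside the template.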
Other Options
If you just want to archive toots, and not necessarily republish them, you may instead be more interested in mastodon-data-viewer.py or mastodon-archive, with the former using the export function again, and the latter accessing the data via the API.
Not really wanting to put a strain on the instance I’m on by requesting all my data via the API for no real reason, I have only tested mastodon-data-viewer.py. It creates a local “website” that only your device can access, where you can browse your past toots (and replies!) easily by navigating between months, and via a search bar. As such, it may be useful to spin up every now and then just for the search function if your instance (like mine) does not have Elasticsearch and you want to find something in your past toots. Content warnings and images are also supported here, if that’s important to you, and there are buttons to reply to toots and to go to them directly.
Lastly, there is also Meow, which appears to have a nice interface, but I wasn’t too sure about trusting it as it isn’t open source and I’d rather be able to run something like that offline.
Conclusion
Despite the limitations, and although there are other methods out there which may be better suited to your situation (I recall stumbling across one using the API, but cannot find it now), I am personally okay with this approach. Although it isn’t fully automated and needs manual intervention, I know my toots have many paragraphs in them (as well as waffle game scores that do not need archiving!), so I don’t mind doing a bit of light reformatting once a week to make sure the content all works fine.
However, if there was a way to make sure the converted posts carried the same formatting the toots had, I would be interested in alternatives - even if I still have to export the archive, run the script, and re-upload weekly. It’s never a bad idea to do a weekly backup of your Mastodon data anyway!
Tags: Setup, Social Media, Website