Tag Archives: wordpress

Sort of Blogging on the Fediverse

Over at Key Smash!, I’ve been helping beta-test the Pterotype plugin to hook up a self-hosted WordPress to the Fediverse. It gives WordPress an ActivityPub presence, so new posts and comments can be seen in Mastodon, Pleroma, and other ActivityPub-powered networks, and replies from those networks can come back as comments.

But Key Smash! is a simple test case. It’s at the top of the site, there’s no caching, it’s only got a handful of posts, and it hasn’t been bombarded by spammers for years.

So I’ve installed it on here. Older posts won’t federate, but new ones (starting here) should, and replies should show up as comments. With luck they’ll land in the moderation queue instead of the spam queue.

You may be able to follow the site by searching for this post’s URL in Mastodon/etc. Maybe. I need to report a bug in the handling of sites that aren’t at the top level: To find the site I need to search for @blog@www.hyperborea.org/journal – the first time. Then that search stops working, but I can find it at @blog@www.hyperborea.orgjournal instead. But that only works after I’ve searched for the first one.

Well, that’s part of why I set it up here: to help beta test.

Update: You can now follow the blog directly at @blog@www.hyperborea.org

Update (Dec): I turned it off temporarily due to spam problems. Spam comments were visible through ActivityPub, and couldn’t be deleted due to a FK constraint on the Pterotype tables.

Possibly Out-There Federation Idea

Now that Pixelfed federation and Pterotype are taking shape, I can hook up my photos and blogging directly into Mastodon and the Fediverse, but you know what would be even cooler?

Connecting them to each other.

A lot of my blog ideas grow out of photos or statuses that I’ve posted previously, as I find more to say or a better way to say it. And while it’s always possible to just post a comment or reply with a link, imagine posting them into the same federated thread.

Here’s a scenario we can do today:

  1. Photo of something interesting on Pixelfed, boosted to Mastodon. I believe we’re one update away from Mastodon replies and Pixelfed comments appearing together.
  2. Blog post on Plume or WordPress with Pterotype going into more detail about the photo. Comments and Mastodon/Pleroma replies can interleave right now. (Try it, if you want!)
  3. Another photo on Pixelfed as a follow-up. Again, comments and replies can interleave.

This is already pretty cool, but it still creates three separate discussions. The best I can do is add a “Hey, I wrote more on my blog over here: <link>” to the first discussion.

What if there were a way to publish the blog entry as a reply to the Pixelfed photo? Or to publish the second photo as a reply to the blog?

And that opens up other possibilities where people can reply to other people’s photos and blog entries with their own. (Webmentions sort of do this, but they’re not going to create a single federated discussion.)

I’m not sure what form this interleaved discussion would take, or what the pitfalls might be. (Visibility might suffer, for instance.) Blogging and photo posting tend to be platforms for an original post that can have comments, rather than platforms where a top-level post can be an OP or a reply, and this would change that model.

What’s in Your Social Media Archive?

I checked out what you get when you export your content from Twitter, Facebook, Google+, LinkedIn, WordPress and LiveJournal, with an eye for both private archives and migrating to your own site.

Tired of Twitter? Fed up with Facebook? Irritated by Instagram?

If you want to leave a major social network but keep your content, or even just make sure you have your own backup in case the site shuts down, purges accounts, or changes its TOS (*cough* LiveJournal *cough*), you can usually get some of your info. But not all of them give it to you in a way that’s useful.

Twitter

You get a CSV spreadsheet containing all your tweets since the dawn of Twitter, with the text in one column, ID in another, timestamp, reply-to, and so on. It’s pretty easy to import this into another system. (I pulled mine into a test WordPress site using the WP All-Import plugin.)

Links in the text appear as the t.co shortened URL, with the “real” URL in another column. Of course, if the “real” URL was also a shortener, you’ll just see bit.ly or whatever. And if you’ve been on Twitter long enough, you may find that some of your older links use shorteners that don’t exist anymore (or have purged their archives), like ping.fm or tr.im.

You also get an offline web app with an index.html that allows you to view all your tweets month by month without visiting the site.

But you don’t get any of your uploaded media, or direct messages. So if you mostly use Twitter for text-based microblogging, you’re fine, but if you use it for photo sharing or private conversations, you’re out of luck.

Update: Retweets are sometimes incomplete in the spreadsheet. The text field is a constructed manual retweet — “RT @otheruser: Text of the original tweet” — but it’s truncated to fit in 140 characters (even if the original was made after the 280-character update). So if adding the username pushes it over the limit, or if it was longer to begin with, you don’t actually get the entire original tweet in the spreadsheet. I suspect this means retweets don’t actually use that field, and get the content straight from the original tweet by ID.
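If you want to find the retweets that may have been cut off, here is a minimal Python sketch. It assumes the spreadsheet has columns named tweet_id, text and retweeted_status_id (check the header row of your own export; those names are an assumption here), and it simply flags retweets whose text reaches the 140-character limit. Plain character counting won’t exactly match how Twitter counts, so treat the result as a list of suspects rather than a definitive answer.

```python
import csv

def find_possibly_truncated_retweets(csv_file, limit=140):
    """Return the IDs of retweets whose text is at or over the limit.

    csv_file is any open text file (or file-like object) holding the export.
    Column names are assumptions; adjust them to match your header row.
    """
    suspects = []
    for row in csv.DictReader(csv_file):
        # Assumption: ordinary tweets leave retweeted_status_id empty.
        is_retweet = bool(row.get("retweeted_status_id"))
        if is_retweet and len(row.get("text", "")) >= limit:
            suspects.append(row["tweet_id"])
    return suspects

# Usage (the file name is hypothetical):
# with open("tweets.csv", newline="", encoding="utf-8") as f:
#     print(find_possibly_truncated_retweets(f))
```

From there you could look up each flagged ID on the site to recover the full text of the original tweet.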

Facebook

You get all your photos, videos and messages, organized by folders, but the names are all just numeric IDs. You do get an offline web application that includes names and indexes.

It also has your entire timeline in one giant HTML file. But it only includes the text, the type of update, and the timestamps. If you posted a link, it doesn’t include the link. If you posted a photo, it doesn’t link to the photo.

And you don’t get comments, either your comments on other people’s posts, or their comments on yours.

Worst, though? It doesn’t indicate the privacy of each post. That means you can’t take the timeline and import it to a new system unless you separate the public and private posts one by one.

Update March 2018: Apparently if you use the Facebook Messenger app on Android, there’s a good chance Facebook also has your SMS messages and call history. This is probably not something you expected them to have.

Google Plus

Google Takeout allows you to export various categories of data, including your Google+ stream, circles, +1s and page posts.

Each post is exported as a separate HTML file, named after the first line. Comment threads are included, along with timestamps, a permalink to the original post, and a visibility indicator. It only marks Public vs. Limited, but that’s better than you get from Facebook.

The HTML files are suitable for publishing as-is, and marked up so that it shouldn’t be hard to write an import tool for a CMS. (I’m planning on writing a script to convert them to WordPress’ XML format.)

Images aren’t included in the G+ stream download, and are instead hotlinked on photo posts and galleries. I haven’t checked, but I suspect any images you uploaded to Google+ will be included in your Google Photos download.

There is an index of all your posts…but it’s in alphabetical order.

Bonus: Google Buzz

When Google shut down Buzz a few years ago, they generated archives and put them in each person’s Drive account. They did one cool thing, which was to create two sets of archives: One complete, the other containing only public posts.

The format? Long PDFs, dozens of pages each, with all your posts, labeled by source (Buzz, Twitter, a specific site, etc.)…with the letters scrambled. Apparently they left the “reduce file size” option turned on when they generated them. This means you can’t copy/paste or search in the PDF itself, but you can open it in Google Docs and it’ll convert the text back, at which point you can do both. But that doesn’t preserve links or media, which you have to get out of the original PDF…

LinkedIn

LinkedIn generates its archives in two phases. The first one, available within minutes of requesting it, contains your profile info, your messages, contacts and invitations in CSV files.

The complete archive, available within 24 hours, actually lives up to the name. Everything is in a set of CSV files: Your contacts, your shares, your group posts, your group comments, even your behind the scenes info like ad targeting categories and recent login records. (One word of warning: They’re encoded as UTF-16, so if the tool you use to import afterward isn’t expecting that, you may need to convert it.)
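If your import tool chokes on the encoding, converting a file is only a few lines of Python. This is a sketch, not the only way to do it: the function name and the file names in the usage comment are made up for illustration, and it assumes the file starts with a byte-order mark so Python’s utf-16 codec can work out the byte order.

```python
def convert_utf16_to_utf8(src_path, dest_path):
    """Copy a UTF-16 text file to a new UTF-8 file.

    newline="" keeps line endings exactly as they are, which matters
    for CSV fields that contain embedded line breaks.
    """
    with open(src_path, "r", encoding="utf-16", newline="") as src, \
         open(dest_path, "w", encoding="utf-8", newline="") as dest:
        # Copy in chunks so large exports don't have to fit in memory.
        for chunk in iter(lambda: src.read(64 * 1024), ""):
            dest.write(chunk)

# Usage (file names are hypothetical):
# convert_utf16_to_utf8("Connections.csv", "Connections-utf8.csv")
```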

I’m not sure how photos and video are handled, as I’ve never uploaded either to LinkedIn (other than my profile picture, which landed in a folder called Media Files).

LiveJournal

LiveJournal’s own export tool will export a month at a time into a CSV or XML file, which includes your posts and their metadata (timestamps, moods, etc.), but not comments, userpics or photos.

There are other tools available using the API, which might be able to get more data. I’ve looked at two:

The WordPress importer will pull in all your posts, and the comments on them, and makes a note of moods, music, etc. (you can use my LJ-Moods plugin to display them). It doesn’t transfer any images you’ve uploaded.

Dreamwidth’s importer seems more complete – LJ and Dreamwidth are based on the same code, after all – and is able to natively handle moods, userpics, etc. But it doesn’t transfer your media library either.

WordPress

WordPress exports a giant XML file containing all your posts, their comments, and their metadata. You can import it into another WordPress instance, and have virtually the same blog. Or you can merge two blogs together by importing both. (I’ve moved posts with comment threads from one blog to another by putting them in a category, exporting the category, and then importing them on the new blog.)

It doesn’t include your media library, but if you import the file to a new site before closing down the old one, the importer should offer to pull in all of the images and other media that are actually used in posts.

Plus on a self-hosted site you have a lot of tools available: backup plugins that will include everything, SFTP access through your web host, etc.
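To get a feel for what’s in the export file, here is a rough Python sketch that lists each post with its comment count. It assumes the file uses the WXR 1.2 namespace (older exports declare a different version in the namespace URL, so check the top of your file) and that comments appear as wp:comment elements inside each item.

```python
import xml.etree.ElementTree as ET

# Assumption: WXR 1.2. Check the xmlns:wp attribute in your own export.
WP_NS = {"wp": "http://wordpress.org/export/1.2/"}

def summarize_wxr(source):
    """Return (title, comment_count) for each post in a WordPress export.

    source is a file path or a binary file object.
    """
    root = ET.parse(source).getroot()
    summary = []
    for item in root.iter("item"):
        post_type = item.findtext("wp:post_type", default="", namespaces=WP_NS)
        if post_type != "post":
            continue  # skip pages, attachments, menu items, etc.
        title = item.findtext("title", default="")
        comments = len(item.findall("wp:comment", WP_NS))
        summary.append((title, comments))
    return summary

# Usage (the file name is hypothetical):
# for title, count in summarize_wxr("export.xml"):
#     print(count, title)
```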

Update: Tumblr

I didn’t initially include Tumblr because it doesn’t have an exporter…but WordPress has an importer that does a good job of transferring your blog directly from Tumblr to a WordPress blog. (Look on your WordPress dashboard under Tools/Import.) It even imports images (though sometimes it imports a single-image post as a gallery for some reason). The original URL is stored in a custom field, and you can leave it connected and import new items when you want to bring them in.

Some gotchas: It can only map to one author, but you get to choose which one. It puts everything in the default category. Videos don’t get imported, even if you’ve just embedded a YouTube video.

Update: Mastodon

With the 2.3.0 update (March 2018), Mastodon has added its first archive tool. It’s essentially complete, but it’s only machine-readable so far. You get a pair of files in ActivityPub format (based on JSON), one containing your profile and one containing your formatted posts. You also get a folder structure containing any images and videos you’ve uploaded, and your icon and header image.

If you’re willing to slog through the JSON files, you can figure out which image goes to which post, but it’s still a pain.

But this is a first pass, aimed more at portability (keep your own backups or move your data to another instance or service) than readability. ActivityPub is a new standard, so there aren’t many converters yet, but that’s likely to improve.
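In the meantime, matching images to posts can be scripted. The following is a sketch under some assumptions about the archive: that the posts file is called outbox.json, that it holds a list named orderedItems, and that each post object carries its media in an attachment list with a url on each entry. Check those names against your own download before relying on it.

```python
import json

def load_outbox(path):
    """Read the posts file from the archive (file name is an assumption)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def media_by_post(outbox):
    """Map each post's id to the list of media URLs attached to it."""
    mapping = {}
    for activity in outbox.get("orderedItems", []):
        obj = activity.get("object")
        if not isinstance(obj, dict):
            # Assumption: boosts only reference the original post by URL.
            continue
        urls = [a["url"] for a in obj.get("attachment", []) if "url" in a]
        if urls:
            mapping[obj.get("id")] = urls
    return mapping

# Usage:
# for post_id, urls in media_by_post(load_outbox("outbox.json")).items():
#     print(post_id, urls)
```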

Others

Instagram doesn’t have an export tool, so you have to rely on third-party solutions.

Flickr allows you to bulk-download photos from your Camera Roll, and it helpfully uses the title to name the files, but it doesn’t export the description, tags, or comments.

When I first wrote this, Mastodon only let you export your contacts and block lists, with archiving and migration (from one Mastodon instance to another) on the roadmap. See the Mastodon update above for the archive tool added in 2.3.0.

Using WP-CLI with WordPress 4.6 on DreamHost

I use WP-CLI from time to time to do maintenance on my WordPress sites that I host on a DreamHost VPS. But today I tried to run the search-replace function and found that wp wouldn’t run. Instead I got this error:

Fatal error: Call to undefined function apply_filters() in /path/to/wp-includes/load.php on line 317

It didn’t take long to confirm that WordPress 4.6 had changed things around, breaking the version of WP-CLI on my server. As it turns out, WP-CLI 0.24 fixes this, but DreamHost is running 0.24-alpha.

So I tried installing the current version locally on my account, only to get a different error:

Fatal error: Class 'Phar' not found in /path/to/wp-cli.phar on line 3

I found this article very helpful for enabling PHAR support on a DreamHost VPS. I went into ~/.php/5.6/phprc (create this file for your version of PHP if you don’t have it) and added:

extension=phar.so
phar.readonly = Off
phar.require_hash = Off
suhosin.executor.include.whitelist = phar

Once I verified that it would work by running /usr/local/bin/php-5.6 wp-cli.phar --info, I took the opportunity to (a) override the wp command with the local one and (b) make sure it used PHP 5.6 by adding the following alias to my .bash_profile:

alias wp='/usr/local/bin/php-5.6 ~/bin/wp-cli.phar'

This won’t be needed once DreamHost updates their WP-CLI package, but for now, it solves my problem faster than waiting for a response from tech support.

Make Feedly Notice an Updated WordPress Post by Changing the GUID

Sometimes it’s better to update an existing post than to write a new one. Maybe there’s an ongoing conversation in the comments thread, or news is breaking over the course of a day. Sometimes I’ll post a poll, then reuse the same post for the results, to keep discussion together. The problem is that not everyone will get the notice that the article has changed.

After posting a question on Feedly’s Google+ page, I confirmed that Feedly (and no doubt other readers) uses the post GUID to decide whether to fetch content. If it’s seen the GUID before, it assumes it’s already seen the content, and stops looking at it.

If you’ve never delved into RSS markup, you’ve probably never noticed this field, but the GUID is a “globally unique identifier” used to tell feed readers whether they’ve seen an article before. A new GUID means a new post. Most of the time, you don’t want that to happen, because you don’t want it adding the same post over and over again every time you fix a typo or change a headline.

Under limited circumstances, though, it might make sense to tell reader software that the updated post is a new one.

A StackExchange post pointed me to a filter hook that can be used to change the GUID. WordPress uses the ID-style permalink by default, because it should be unique, but it’s important to remember that the field is only an ID, and isn’t used as a link — so you can modify it without worrying about it staying a valid URL.

The answer suggested just appending the modification date, but I didn’t want to do that. I frequently update old posts to fix typos, update links, remove dead links, etc. So instead, I added a custom field that I can fill out when I make a big enough change that I want the post to show up as new again.

// Modify the GUID in the RSS feed after major revisions (but not after every
// little change) so that clients like Feedly will pull the updated content.
function ktv_feed_guid_revisions($content) {
	$revised = get_post_meta( get_the_ID(), 'updated', true );
	if ($revised != "") {
		$content .= "?updated=$revised";
	}
	return $content;
}
add_filter('get_the_guid', 'ktv_feed_guid_revisions', 7);

I’ve got it in a functionality plugin for now. When I make a change that I really want to update everywhere, I add a custom field called “updated” and give it a value – usually a number, so that I can add to it if I have something that I update more than once while it’s still new enough to show up in feeds.
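To see why changing the GUID works, here is a toy Python model of the reader’s side. It is only an illustration of the behavior described above, not Feedly’s actual code: the function name, the item dictionaries and the example URLs are all made up.

```python
def new_items(feed_items, seen_guids):
    """Return only the items whose GUID hasn't been seen before.

    seen_guids is a set that is updated in place, standing in for
    whatever a real reader stores between fetches.
    """
    fresh = []
    for item in feed_items:
        if item["guid"] not in seen_guids:
            seen_guids.add(item["guid"])
            fresh.append(item)
    return fresh

seen = set()
first = [{"guid": "https://example.com/?p=42", "title": "Poll"}]
edited = [{"guid": "https://example.com/?p=42", "title": "Poll results"}]
revised = [{"guid": "https://example.com/?p=42?updated=1", "title": "Poll results"}]

print(len(new_items(first, seen)))    # first fetch: the post is new
print(len(new_items(edited, seen)))   # same GUID: the edit is ignored
print(len(new_items(revised, seen)))  # new GUID: treated as a new post
```

Bumping the value of the custom field changes the suffix, so the same post can be pushed out again after a later revision.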

I wrote this months ago, but never got around to publishing it. Yesterday’s 10 ways to optimize your feed for feedly reminded me it was still sitting in my drafts, so I dusted it off.