
I haven’t been over there much since /r/GC and /r/LGBdroptheT got canned, but, inspired by KF and their meticulous archiving, I wonder what the best way is these days to archive a Reddit user’s entire posting and commenting history.

Is it best to use something like Removeddit and then archive.ph, or something else?

4 comments

[–] Lipsy 4 points Edited

I've seen tools that can scrape the entire haul of all of Reddit for up to a month at a time. No images, and it's almost 100 GB for an entire month's worth of Reddit activity.
Such tools are built on APIs that archive each comment as soon as they "see" it, so the vast majority of comments that aren't deleted within a few seconds of posting are captured. Please let me know if you're interested in this sort of thing and I'll go back and try to find the required names/specs.
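For pulling one account's activity out of a bulk dump like that, here is a rough sketch. It assumes the dump has been decompressed to a text file with one JSON object per line and that each object has an "author" field; I haven't confirmed that layout for any particular tool, and the function name `filter_by_author` is just made up for illustration.

```python
import json


def filter_by_author(lines, username):
    """Yield parsed records whose "author" field matches username.

    `lines` is any iterable of strings (for example an open file), so the
    dump is processed one line at a time instead of being loaded whole.
    The one-JSON-object-per-line layout and the "author" key are assumptions.
    """
    wanted = username.lower()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(record, dict):
            continue  # skip JSON values that are not objects
        if str(record.get("author", "")).lower() == wanted:
            yield record


# Example use (file name and user name are placeholders):
# with open("dump.ndjson", encoding="utf-8") as handle:
#     for record in filter_by_author(handle, "some_user"):
#         print(record)
```

Because it reads line by line, memory use stays small even if the dump is many GB. It only finds what the dump already captured, so anything deleted before it was scraped won't be there.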

If you're looking to download the content of a few specific subs, there seem to be quite a variety of tools for that (found with a search; I can't vouch for or against any of them).

You'll also need a specialized text editor (designed to "load" the file in strategic chunks) if you're going to be loading up and searching through text dumps that run to hundreds of MB or several GB. Trying to do this with a standard text editor or word processor will most likely make it hang or crash.

One such editor is "010 Editor", which has builds for Windows, macOS and Linux.
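If all you need is to search a big dump rather than browse it, the same read-it-in-pieces idea can be sketched in a few lines of Python. This is only an illustration, not how 010 Editor works internally; the function name `search_large_file` is made up, and it assumes the dump is plain text broken into lines.

```python
def search_large_file(path, needle, encoding="utf-8"):
    """Yield (line_number, line) for every line that contains `needle`.

    The file object is iterated lazily, so only one line is held in
    memory at a time no matter how large the file is.
    """
    # errors="replace" keeps the scan going if some bytes are not valid text.
    with open(path, encoding=encoding, errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if needle in line:
                yield number, line.rstrip("\n")


# Example use (file name and search term are placeholders):
# for number, line in search_large_file("dump.txt", "some_user"):
#     print(number, line)
```

One caveat: if the file has no line breaks at all, the whole thing counts as a single line and this loses its advantage.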

I'm interested as well! I'm always looking for new tools for archiving!

Wow Lipsy, thank you for this! I definitely need to learn more about it all.

In this case I’m interested in archiving the contents of a particular user account on behalf of a friend who may need it as evidence of real-life harassment.

You're welcome. There are others here who know far more than I do (among other things, anyone who helps code the site!) but I'll poke around my fledgling network and see what I can find.