help clicking "next" using command line in project for archiving post title timelines & redirect links (see details)
Posted March 24, 2025 by Maplefields in Ovarit

disclaimer: I'm a self-taught beginner. I don't know how to write webpage scripts. I've only just started learning the Linux bash command line at my own slow pace, and I'm applying what I've learned so far. I don't know any programming languages like Python or JavaScript, and I don't have time to learn one right now. I am familiar with loops and basic scripting in Bash.

Update (success!):

Tentative solution:

next=$(grep -E "after=" pg2.txt | cut -d "=" -f 3 | cut -d "\"" -f 1)

I have no problems looping the variable in Git Bash. I'm in the process of downloading all the post title & link data (not the posts themselves) from o/GC in raw .txt format.
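Roughly, the loop looks like this (a sketch rather than my exact script; the page cap of 500 and the 2-second delay are arbitrary, and the head -n 1 assumes the first "after=" match on each page is the "next" link):

next="Njc1ODE3"   # token copied by hand from the first page
for i in $(seq 3 500); do
    curl "https://ovarit.com/o/GenderCritical/new?after=$next" --ssl-no-revoke > "pg$i.txt"
    next=$(grep -E "after=" "pg$i.txt" | head -n 1 | cut -d "=" -f 3 | cut -d "\"" -f 1)
    [ -z "$next" ] && break   # no "next" link found: last page reached
    sleep 2   # small delay so the server isn't hammered
done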

Currently working on creating a script to extract the data.

Goal:

To go through every page of posts in each GC-related circle and collect the post titles and URL links (Ovarit links and article links) in chronological order.

To get something like this: [date] [title] [ovarit URL link] [article redirect link] [ovarit archive link]

(a list for each circle).

Current problem (update: working a new angle):

How do you "click" the "next" button from the command line (currently using Git Bash*, where only the curl command works)?

I've gotten as far as downloading the page source using curl.

I'm confident I can extract the data I need, given enough time to fine-tune the grep (or awk) command and the script, and output it into a neat txt or markdown file. I'll share the extracted text on Ovarit (this may help the archivers organize what has already been archived) and Saidit (I need to create an account).

$ curl "https://ovarit.com/o/GenderCritical/new?after=Njc1ODE3" --ssl-no-revoke >pg3.txt
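For the extraction itself, a generic starting point might be pulling out every anchor tag with its link and visible text (this is a guess at a first pass; the exact pattern depends on Ovarit's real HTML, which I still need to study in the saved pages):

$ grep -oE '<a[^>]*href="[^"]*"[^>]*>[^<]*</a>' pg3.txt

From there, sed or awk can split each match into the [title] and [URL] columns for the list format above.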

Unfortunately, I don't understand the logic behind "after=Njc1ODE3". How does the "next" button decide what seemingly random string to assign the next page? I've been manually clicking "next" and copying the URL into the CLI to figure out a procedure, but I want this process automated, so that the script finds the "next" button and "clicks" it automatically.
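That said, the string may not be random at all: it looks like plain base64. Decoding it in Git Bash gives a number, presumably the ID of the last post on the page (cursor-style paging):

$ echo "Njc1ODE3" | base64 -d
675817

If that's right, there's no formula to reverse: each page's source simply contains the ready-made after= token for the page that follows it, which is what the grep command above already extracts.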

Problem 2:

How do I check, from the CLI, whether an Ovarit URL has been archived on archive.is/.ph/.li, etc.?

I want to retrieve a true/false response and, if possible, an archive link.

I want to post this list so that Ovarit archivers using archive.is know what's already been done.
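One idea I want to test (I haven't confirmed this behavior): archive.today (archive.ph/.is/.li) appears to serve a /newest/<url> endpoint that redirects to the newest snapshot of <url> when one exists. If that holds, curl's -w variables give a true/false check plus the link:

is_archived() {
    local code
    code=$(curl -s -o /dev/null -w "%{http_code}" --ssl-no-revoke "https://archive.ph/newest/$1")
    if [ "$code" = "200" ] || [ "$code" = "302" ]; then
        # follow the redirect and print the snapshot URL
        curl -s -o /dev/null -w "%{url_effective}\n" -L --ssl-no-revoke "https://archive.ph/newest/$1"
    else
        echo "false"
    fi
}

$ is_archived "https://ovarit.com/o/GenderCritical"

(One caveat: archive.today is known to throttle automated requests, so this would need delays between checks.)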

Idea that failed:

I thought I could use the Firefox extension PageZipper, which clicks "next" for you and appends each following page as you scroll (I got through 60 pages in about a minute of scrolling). But when I viewed the page source (and tried to save the .htm file), the appended pages didn't appear. In hindsight that makes sense: "view source" shows the HTML as originally downloaded, not the pages PageZipper later injects into the live page with JavaScript.

* asterisk note:

(*) Unfortunately, this is a busy month for me (and April will be too), and I'm stuck using Windows, so I can't boot into Linux for now. I'll need this bash script (once I figure it out) to run in the background on Windows while I do my other work. If it were May, I'd have more time to research how to archive Ovarit on archive.is without manually copying and pasting links. Ideally, we'd have an automated script that splits up the work, so each volunteer would only need to run it once overnight (or over several nights) and share the text output with the group (see the sketch below).
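As a rough sketch of the splitting idea (hypothetical: links.txt is a shared master list of post URLs, and each volunteer picks a distinct number N between 0 and M-1), volunteer N of M takes every M-th line, so nobody duplicates work:

N=2; M=5    # this volunteer is number 2 of 5
awk -v n="$N" -v m="$M" 'NR % m == n' links.txt > my_share.txt

Each person then archives only the links in their own my_share.txt and shares the results with the group.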

Why I'm omitting archive.org

It's very easy to send archive.org an email request to delete archived webpages. I've seen the Kfarm archives get entirely wiped from their database.

About archiving each post

I already read the threads. It's impossible for one IP address to download the entire Ovarit site within a month. Volunteers would have to organize as a group and divide the task among many IP addresses for this to succeed.
