
/gamergatehq/ - The GamerGate Headquarters

BTFOs are Life, Ethics is Hometown


Happy 5th Anniversary GamerGate!

File: 1457586335777.webm (4.32 MB,320x240,4:3,All hands on deck.webm)

48cd8e No.318649

ARCHIVE.IS MAY BE COMPROMISED, OR NEARLY COMPROMISED.

>All 8chan archive links were pulled off the site for a few hours today.

>During that time, we exposed a severe vulnerability in relying on Archive.is.

>The cited reason for the pulldown of the site was archived pages containing CP keep showing up.

>After much arguing and drama, the 8chan archives were reinstated.

BUT WE DON'T KNOW HOW LONG THAT WILL BE THE CASE!

http://s000.tinyupload.com/index.php?file_id=07149261893554013542

Contains a small script I wrote that will download a local copy of all the /v/ #GG thread archives. Use it to download your own backup copy.

I want to do something similar for /gamergate/ and GGHQ, for Deepfreeze links, and Wiki citations. We can worry about hosting later. If we lose Archive.is, we lose everything.

Help in this Herculean effort would be incredibly appreciated!

____________________________
Disclaimer: this post and the subject matter and contents thereof - text, media, or otherwise - do not necessarily reflect the views of the 8kun administration.

48cd8e No.318652

Thanks. Doing it right now.


48cd8e No.318659

I have a massive amount of corrupted archives. So far: about 1200 corrupted files (616 bytes each) against 630 proper archives. These corrupted archives seem otherwise available on archive.is; any idea why it's happening?


48cd8e No.318660

>>318659

Is this happening with the script, or a general observation?

If the former, give me an example link and I'll look into it.


48cd8e No.318662

File: 1457594702385.png (13.9 KB,737x188,737:188,0dpTc.png)

>>318660

General observation. The file contains some JavaScript code; it looks like Google Analytics.

I rewrote "archlink" so it contains only the failed downloads, and tried again. The second time around some finally downloaded, others won't. Maybe it's just archive.is' way of saying "not now honey, mommy's busy".
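A rough sketch of that retry-list step in Python, for anyone who wants to automate it. It assumes each local file is named after the last path segment of its link and that duds are the 616-byte stubs reported above; `failed_links` and its parameters are made-up names, not part of the original script.

```python
from pathlib import Path

STUB_SIZE = 616  # size of the dud files reported in this thread


def failed_links(links, folder, stub_size=STUB_SIZE):
    """Return the links whose local .zip is missing or no bigger than a stub."""
    folder = Path(folder)
    retry = []
    for link in links:
        link = link.strip()
        if not link:
            continue
        # Assumption: the local file is named after the last URL path segment.
        name = link.rstrip("/").rsplit("/", 1)[-1]
        if not name.endswith(".zip"):
            name += ".zip"
        target = folder / name
        if not target.exists() or target.stat().st_size <= stub_size:
            retry.append(link)
    return retry
```

Writing the returned list back out, one link per line, gives a smaller link file to feed to the downloader on the next pass.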


48cd8e No.318702

>>318662

I want porn of this.


48cd8e No.318709

File: 1457612018850.png (136.56 KB,752x422,376:211,DramaIsNotMyStrongSuit.png)

>>318702

Original Character, Do Not Steal


48cd8e No.318711

Semi-related: Deepfreeze uses archive(dot)is links, so its articles are affected too. There are some alternatives for archives (e.g. OpenWayback).


48cd8e No.318724

Back up to where?


48cd8e No.318727

script to pull archive.is urls out of a text file: https://pastee.org/gecp3
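For reference, a minimal Python sketch of the same idea. It is not the pastee script; it assumes snapshot links look like `archive.is/XXXXX` or `archive.today/XXXXX` with a five-character alphanumeric suffix, as in the examples posted in this thread. `extract_suffixes` is a made-up name.

```python
import re

# Assumption: snapshot links are archive.is/XXXXX or archive.today/XXXXX,
# where XXXXX is exactly five letters or digits.
ARCHIVE_LINK = re.compile(
    r"https?://archive\.(?:is|today)/([A-Za-z0-9]{5})(?![A-Za-z0-9])"
)


def extract_suffixes(text):
    """Return unique snapshot suffixes in order of first appearance."""
    seen = []
    for match in ARCHIVE_LINK.finditer(text):
        suffix = match.group(1)
        if suffix not in seen:
            seen.append(suffix)
    return seen
```

Reading a bookmarks export or a saved thread into a string and passing it to `extract_suffixes` should give a de-duplicated list to work from. Links in any other format are silently skipped.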


48cd8e No.318731

>>318727

Thanks, have some archive links https://pastee.org/34wz7

HTH


48cd8e No.318739

>>318662

Interesting. /pol/ talked about that script at one point, saying there was no reason why traffic from archive.is should be redirecting there.

Educated guess: If the script tries to load on a page, the grabber gets a corrupted copy. If it doesn't try to load you get the clear .zip of the archive. Plausible?


48cd8e No.318740

>>318739

It's getting weirder… I launched the script again with the missing files, and this time all of the downloaded zips are 131.3 kB. They are technically valid, yet there's only one file inside: index.html. If I extract this file and open it with a hex editor, it contains the opening and closing "html" tags, but starting at address 0x0D it seems to be binary?! It's not a program, but it doesn't look like a compressed stream either. I have no clue what's going on.

Is anybody else having issues?
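If you'd rather not open every zip by hand, this kind of inspection can be sketched in Python with the standard `zipfile` module. The function names are made up, and the "binary" test is a crude heuristic (NUL bytes or bytes that don't decode as UTF-8), not a real format detector.

```python
import zipfile


def describe_archive(path, peek=64):
    """List each member of a .zip with its size and first few raw bytes."""
    report = []
    with zipfile.ZipFile(path) as zf:
        for info in zf.infolist():
            with zf.open(info) as member:
                head = member.read(peek)
            report.append((info.filename, info.file_size, head))
    return report


def looks_binary(head):
    """Crude heuristic: NUL bytes or undecodable UTF-8 suggest non-text data.

    A multi-byte character cut off at the end of `head` can give a false
    positive, so treat the result as a hint only.
    """
    if b"\x00" in head:
        return True
    try:
        head.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False
```

An archive whose only member is an `index.html` that `looks_binary` flags would match the symptom described above.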


48cd8e No.318742

>>318740

Give me a couple suffixes for known bad files pls, I'd like to look at this. By suffix I mean the five letter "xDyNM" from the archive link.

I'll download em manually and see what happens.


48cd8e No.318743

>>318742

1keQV

1kOrY

1lPT4

1mPAr

1N9tv

1nCzA

1Oa5H

1ob2Z

1oXlb

1peXJ

1ptmQ

1SDVw

1sIN8

1UHVr

1XupV

21j3o

23p93

24mMx

24tvv

2HJuj

2ia1s

2Jico

2KZL4

2lgQn

2lii3

2m4aX

2N0yZ

2nhKw

2QEUH

2sfbf

2StTm

2yc27


48cd8e No.318745

>>318743

This is really fucking weird.

I'm going to town on 1kOrY from that list and NOTHING can download it. Every time, you get a zip with a corrupted file beginning with html headers.

Yet if you go to the page http://archive.is/1kOrY and you click the "Download .zip" link, you get the EXACT same source file, but everything is 100% perfect.

Happens with wget, happens with curl… Just what the fug is going on here?


48cd8e No.318746

>>318745

To make it clear: these files would download as 616-byte files yesterday. It's always these same files I'm trying to download. Yesterday I had a little more luck and would sometimes manage to finally grab some files that previously wouldn't download properly…


48cd8e No.318747

>>318746

I have a possible solution. Stand by.


48cd8e No.318748

File: 1457668249601.jpg (47.75 KB,550x550,1:1,Asuka fuck yeah.jpg)

>>318746

>>318747

Alright, I think I've got it!

Actually this is probably better, since it opens archiving up to non-Linuxfags too. Here we go:

Download the "DownThemAll!" extension for Firefox or Pale Moon from here:

https://addons.mozilla.org/en-US/firefox/addon/downthemall/

Install it and restart your browser.

Take the 'archlinks' file and open it in a text editor, and resave it in your GG archive folder as 'archlinks.txt'

Now, in Firefox or Pale Moon, go to:

>Tools

>DownThemAll Tools

>Manager

This will open the DownThemAll raw file interface. You'll see some bullshit files there that you don't want in the list, so just click - delete them.

Now right click anywhere on that screen and choose

>Advanced

>Import from File

Select your archlinks.txt file as the source (change the filetype from Meta to .txt using the little dropdown box in the bottom right corner to make it show up)

Now at the main Manager screen again choose your destination folder for the downloads from the box on the left. It will default to your home folder or desktop if you let it, lol. Get them in the right spot.

At this point you should see the massive list of downloads, all from archive.is, filling the manager window.

Right click one and hit "Select All", then right click one again and say "Check all Selected." That tells it to download every fucking one of the files.

Press the Start! button at the bottom and let it go to work.

I tested it on 5 of the problem files above and it worked like a charm.


48cd8e No.318749

>>318748

It works. Thanks!


48cd8e No.318751

File: 1457669843024.jpg (79.49 KB,490x480,49:48,Badunkadunk.jpg)

Scratch that. Some of them work, but I still get 616-byte files more than half of the time:

m7x5b

M81dB

MbGja

mcv5h

MeCrT

qXYx0


48cd8e No.318752

>>318751

I just tested all of those you listed, DLs completed without issue.

Running a selection of 50 right now for a bigger test.


48cd8e No.318753

>>318752

To be fair, I have a list of 999 (yeah, it's frustrating) that won't download.


48cd8e No.318754

File: 1457671425693.png (191.47 KB,1600x900,16:9,Shot.png)

>>318752

I'll just keep forcing download until I get the proper thing I guess.


48cd8e No.318755

>>318751

Alright, I'm beginning to suspect it's a bandwidth issue.

I did 50 random files, and about 7 came out corrupted. A good example would be

naHEC

It came in at 8.7 MB the first time, corrupt, and at over 33 MB on the redo that gave me the proper archive. Similar with most of the problem files: they're really big in their non-corrupt state.

I'll play with more options tonight and tomorrow and see if we can compensate for this somehow without having to go full turbonerd. If we can throttle it on our end such that archive.is' servers don't choke on it we can probably make this work.

Duds are easy to find in *nix because you can automate an integrity check on the files very easily, but we'd need some kind of recursive program to keep downloading them til everything was correct.

Oy fucking Vey.
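A rough sketch of such an integrity check in Python, using the standard `zipfile` module so it isn't *nix-only. The function names are made up. One caveat: it only catches files that aren't zips at all (like the 616-byte stubs) or that fail their CRC check. The "technically valid" zips with a garbage index.html described earlier in the thread would most likely pass it.

```python
import zipfile
import zlib
from pathlib import Path


def is_good_zip(path):
    """True if the file opens as a zip and every member passes its CRC check."""
    try:
        with zipfile.ZipFile(path) as zf:
            # testzip() returns the first bad member name, or None if all pass.
            return zf.testzip() is None
    except (zipfile.BadZipFile, OSError, EOFError, zlib.error):
        return False


def find_duds(folder):
    """Return the names of .zip files in folder that fail the check, sorted."""
    return sorted(p.name for p in Path(folder).glob("*.zip") if not is_good_zip(p))
```

Running `find_duds` after each pass and re-queuing only the names it returns would give the "keep downloading until everything is correct" loop.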


48cd8e No.318758

>>318755

Think you can get some backups going on the archives I sent you in an email?


48cd8e No.318761

File: 1457674007116.jpg (29.47 KB,423x469,423:469,Asuka little ole'.jpg)

>>318758

Yes, but we need to solve this file integrity issue first and foremost.

Downloading all the archives won't do a lot of good if half of them come up busted. I have a plan and I'll keep working on it tomorrow and updating this thread regularly.


48cd8e No.318762

>>318755

>>318746

Someone I know who downloaded a bunch of the zips manually said they stopped being able to access archive.is for a while, and I've personally had the same thing happen after manually archiving too many pages in a short time.

Are you just hitting the limit from abusing their servers too much and then getting a bunch of dummy files until enough time passes that the servers let you download them again? Maybe try putting a significant delay between each download so you don't download too many in an hour (or whatever timeframe they use) and then let them slowly download overnight.


48cd8e No.318763

>>318761

Excellent.

Hopefully this shit with archive.is is a one time thing, but better safe than sorry.


48cd8e No.318764

>>318762

It's likely. I'll try to break the initial list into smaller chunks, and download one chunk every thirty minutes maybe?
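The chunk-and-wait idea could look something like this in Python. It is only a sketch: `fetch` stands in for whatever actually downloads one link (hypothetical, supply your own), and the batch size of 50 is an arbitrary guess. The thirty-minute pause is the figure suggested above; nobody here knows what limit archive.is really enforces.

```python
import time


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def run_in_batches(links, fetch, batch_size=50, pause_seconds=30 * 60,
                   sleep=time.sleep):
    """Call fetch(link) for every link, pausing between batches.

    There is no pause after the final batch. `sleep` is injectable so the
    loop can be tried out without actually waiting.
    """
    batches = chunked(list(links), batch_size)
    for index, batch in enumerate(batches):
        for link in batch:
            fetch(link)
        if index < len(batches) - 1:
            sleep(pause_seconds)
```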


48cd8e No.318765

>>318762

My theory is that their server throttles you with that redirect if you have too much bandwidth taken up.

I configured DownThemAll to use no more than one connection per server and to download one file at a time with the speed capped at 50kbps. I also activated the setting that downloads the last part of a file first, to keep the checksum from getting messed with. I'm running a batch of 200 archives with those settings right now.

Currently at ~21 files complete with no corruptions yet. Cross your fingers.

EDIT:

At 68 files downloaded, zero corrupt files. I have to get to bed, but I ~think~ we've figured it out? Knock on wood. I'll be back late tomorrow to check it out again, but this seems very promising.
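For anyone scripting this instead of using DownThemAll, the speed cap can be sketched in Python as a throttled copy from one file-like object to another. This is not how DownThemAll does it internally, just one simple way to keep the average rate under a limit. `throttled_copy` is a made-up name, and the cap is in bytes per second, so convert first if your "50kbps" means kilobits.

```python
import time


def throttled_copy(src, dst, max_bytes_per_sec, chunk_size=4096,
                   sleep=time.sleep, clock=time.monotonic):
    """Copy src to dst, sleeping so the average rate stays at or below the cap.

    `sleep` and `clock` are injectable so the logic can be checked without
    real waiting. Returns the number of bytes copied.
    """
    start = clock()
    copied = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        copied += len(chunk)
        # Earliest moment at which `copied` bytes may have been transferred.
        earliest = start + copied / max_bytes_per_sec
        delay = earliest - clock()
        if delay > 0:
            sleep(delay)
    return copied
```

`src` would be the response body of a single download and `dst` the open output file; doing one download at a time matches the one-connection-per-server setting described above.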


48cd8e No.318782

Hilarious. Absolutely top kek.


48cd8e No.318804

File: 1457732768001.png (6.56 KB,255x252,85:84,1388263733556.png)

Okay, now will people pay attention to my archive.is batch downloader? Just feed it a text file or export of your bookmarks and you get all the .zip backups from archive.is or archive.today links.

https://gitgud.io/MetalUpa/BackUpa


48cd8e No.318812

File: 1457742219955.png (1.02 MB,817x1200,817:1200,feac5271817f662da363e648b2….png)

I don't understand half the technobabble in this thread but keep at it and let me know whenever it's finished.


48cd8e No.318863

File: 1457769399054.jpg (34.69 KB,457x446,457:446,Asuka turned on.jpg)

>>318765

CAN CONFIRM

200 archives downloaded without error.

SETTINGS

In the DownThemAll manager, click the Preferences link at the bottom and go to the Advanced tab.

>Set 'Max number of segments per download' to 1

>Timeout to 15 minutes

>Download last few kilobytes first to somewhere around 2800

In the 'Network' tab:

>Concurrent downloads to 8

>Downloads per server to 1

That fixed the file corruption for me. Can anyone else confirm?


48cd8e No.318874

>>318863

At 50 files now with no corruptions using those settings.


48cd8e No.318893

File: 1457800691060.jpg (61.11 KB,490x416,245:208,autism.jpg)

Close, but I still had issues with some files being 616 bytes. I capped the transfer rate in the manager window. It takes a fuck ton of time, but that'll do it. I hope.


48cd8e No.318975

>>318893

What did you set the limit to? 50kbps?


48cd8e No.319006

>>318975

100Kb. Finished downloading everything a couple of hours ago. I still need to copy the files to another computer and check if everything can be unpacked properly.


48cd8e No.319761

My mum is always involved in some shit with the council / old people's homes, I told her to save their pages with archive.is, so they can't change things later.

I believe the Finnish government has used a national firewall to block the archive. It's surely only a matter of time before other countries follow suit. No doubt claiming they're only doing it to "stop the hate speech from Gamergate".


48cd8e No.319822

>>319761

Using the good ol' GamerGate boogeyman could be done, although there are other avenues governments could use to block the service:

The "right to be forgotten", which allows someone to request that search engines remove links to pages deemed private, even if the pages themselves remain on the internet. This has been in practice in the EU and Argentina since 2006. Sure, archive.is isn't a search engine, but I'm also (and probably legitimately) concerned about the slippery slope.

The Digital Millennium Copyright Act, which tilts strongly in favor of copyright holders, could become an issue, since there's a push to revise the way copyright works on the Internet (everything and anything you publish should be protected by a copyright, either from you, the service you use or other private registration services).


48cd8e No.319873

You still need some kind of trusted broker; that's the point of an archive. You can't tamper with it. A flat file under your control on your local host can be tampered with, compromising its trustworthiness.
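One partial mitigation, sketched here in Python, is to record a checksum for every downloaded archive right away and put that list somewhere other people can keep a copy. It doesn't replace a trusted third party: it only lets anyone holding the earlier manifest see whether a local file changed afterwards. The function names are made up, and the choice of SHA-256 is an assumption.

```python
import hashlib
from pathlib import Path


def sha256_of(path, block_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each .zip file name in folder to its SHA-256 digest."""
    return {p.name: sha256_of(p) for p in sorted(Path(folder).glob("*.zip"))}


def changed_files(folder, manifest):
    """Return names that are missing or whose digest differs from the manifest."""
    current = build_manifest(folder)
    return sorted(name for name, digest in manifest.items()
                  if current.get(name) != digest)
```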


48cd8e No.328457

pppp



