Scraper to download and categorize multiple files

27178908 - zz.jpg (150.4KiB, 1058x1010)

I made this one mainly because I don't like sorting the files by hand after downloading them all. I don't know if there are any other scrapers with these features, I was just too lazy to search (but not too lazy to code it :D)
here it is: https://github.com/thophys/yiff_scraper
I tried to make the instructions as clear as I could
I'll link the Windows executable here, but you'll need to follow the instructions on the GitHub repo

actually I forgot to link it, but it's on the repo anyway

Why not just use wget?

Because I'm lazy

I'm on Windows and idk where it's downloading to, but it's not where I tell it to

My bad, just fixed it (I hadn't tested on Windows before xD)
turns out that Windows directory paths use "\" instead of "/"
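
For anyone curious about the separator issue, here is a minimal sketch in Python (not the scraper's own code, which runs on Node). It assumes a hypothetical helper `build_target` and made-up folder names, and shows how a path library picks the right separator instead of hard-coding "/":

```python
from pathlib import PurePosixPath, PureWindowsPath


def build_target(base_dir: str, creator: str, filename: str, windows: bool) -> str:
    """Join path parts using the separator of the chosen platform.

    `build_target` and its arguments are hypothetical, for illustration only.
    """
    path_cls = PureWindowsPath if windows else PurePosixPath
    return str(path_cls(base_dir) / creator / filename)


# Windows-style result uses backslashes, even if the base used "/".
print(build_target("C:/downloads", "some_creator", "01.jpg", windows=True))
# POSIX-style result keeps forward slashes.
print(build_target("/home/user/downloads", "some_creator", "01.jpg", windows=False))
```

In real code `pathlib.Path` selects the platform flavour automatically; the explicit `windows` switch is only there so both cases can be shown on one machine.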

that was fast, ty

whose artwork is that? @thophys

from post on 2019-05-27

This works great. The only problem I found (I don't know if I'm doing something wrong, I don't know much about this kind of stuff) is that it doesn't download all the files from one post.

Example: I tried scraping https://yiff.party/patreon/16688671 . This artist has multiple files per post (02.jpg / 03.jpg / 04.jpg and so on), but it only downloads one of them (e.g. 01.jpg), ignores the rest (02.jpg / 03.jpg / 04.jpg), then jumps to the next post.

>>86458 colored version https://data.yiff.party/patreon_data/7290217/27178881/1.jpg

Oh, that's a problem, gonna fix that. I didn't notice it while testing because the creators I downloaded mostly had only shared files and attachments

29758720 - 41435628 - Poppy_blacksmith_wip_6.PNG (189.7KiB, 1014x831)


Just fixed it in version 0.6.2, sorry for the inconvenience :)



crashes on my machine after downloading a couple of videos :(
no problem with images though

could you tell me which creator you're trying to download, so I can try to replicate the error?

you should implement a --verbose flag
i'm trying to download https://yiff.party/patreon/2755238

I just downloaded it all here with no crashing :/
but I found a bug, so no time wasted :D
not sure what might have happened on your side. I'll add a verbose flag, as well as a save-log option, so it's easier to understand what's going on

there you have it:
tell me if you have any problems...

Loading config...
"withVerbose": true,

SyntaxError: Unexpected string in JSON at position 590
at JSON.parse (<anonymous>)
at /snapshot/yiff_scraper/dist/services/file-system.js:106:34
at FSReqCallback.readFileAfterClose [as oncomplete] (internal/fs/read_file_context.js:63:3)


your config file seems to be wrong

is it?

you missed a comma after "saveExternalLinksToTxt": true
and the last property (in this case "saveLog": true) must not have a trailing comma
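
To make the two comma rules concrete, here is a small Python sketch (again, not part of the scraper). It only uses the three keys quoted in this thread; the real config file has more properties that are not shown here, and `check_config` is a hypothetical helper name:

```python
import json

# Missing comma after the first property, plus a trailing comma on the last one.
BROKEN = """{
  "saveExternalLinksToTxt": true
  "withVerbose": true,
  "saveLog": true,
}"""

# Commas between properties, none after the last property.
FIXED = """{
  "saveExternalLinksToTxt": true,
  "withVerbose": true,
  "saveLog": true
}"""


def check_config(text: str) -> str:
    """Return "ok" for valid JSON, otherwise a message pointing at the error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"
    return "ok"


print(check_config(BROKEN))
print(check_config(FIXED))
```

Running a config through any strict JSON parser like this before starting the scraper shows the line and column of the first mistake, which is easier to read than a raw character position.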

here you go: https://0bin.net/paste/OQrL2LQz0CIg5RKW#36MKy0P1szLB5xcusVJVKaPmbrhvnLkOrWmk0tDlg4R

>>87886 >>87887
oy link is 404

índice.png (48.8KiB, 653x309)

Hi thophys! I have a problem with the scraper

I was downloading the content of https://yiff.party/patreon/3278483 and managed to get around 163 files out of 218, but (I think) when it tried to download attachments, a file named "??1.png" came up, the progress went from 0% to about 3%, then it crashed. Any idea what could be happening? It crashes right on this file: http://prntscr.com/ta3awx

Also one thing to mention, which didn't bother me but just in case: this artist has some posts with files without an extension. Check the post named WIP (2020-06-14), it has a file with no extension just called "png". The scraper doesn't get them; I just got them manually and added a name to make them work

sem título 2.png (64.1KiB, 808x291)

Strange, i couldn't reproduce it here, i was able to download it normally

1 download failed, but that's not related to the downloader
- about the files without extensions: it does download those, they go to a "misc" folder. In your case it just hasn't reached that file because, as you said, it crashed before getting there
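
As a rough illustration of that fallback, here is a Python sketch of sorting files by extension with "misc" as the catch-all. Only the "misc" folder name comes from this thread; the other category names and the `category_for` helper are assumptions, not the scraper's actual layout:

```python
from pathlib import PurePosixPath

# Hypothetical mapping; the scraper's real categories are not shown in this thread.
CATEGORIES = {
    ".jpg": "images", ".jpeg": "images", ".png": "images", ".gif": "images",
    ".mp4": "videos", ".webm": "videos",
    ".zip": "archives", ".rar": "archives",
}


def category_for(filename: str) -> str:
    """Pick a folder from the file extension; unknown or missing ones go to "misc"."""
    suffix = PurePosixPath(filename).suffix.lower()
    return CATEGORIES.get(suffix, "misc")


print(category_for("01.jpg"))
print(category_for("Poppy_blacksmith_wip_6.PNG"))
print(category_for("png"))  # a file literally named "png" has no extension
```

A file named just "png" has an empty suffix, so it falls through to the default folder instead of being treated as an image.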

try deleting those files with ?? and running it again with "withVerbose": true and "saveLog": true. That may give a better explanation of the error. Also post your log here so I can take a look

Screenshot_8.png (14.5KiB, 652x46)

sorry for being dumb, but I got this when I set "saveLog": true, and then it crashes


>>90797 strange, it should have created the log file. Try only with verbose then, I'll see if there's a problem with saving the log

Hey this looks really cool! Thank you for making this! Any plans on adding Socks5 proxy support? I would love to use something like this through Tor and not being able to do so is the only thing preventing me from switching over to this program.

I deleted the "??" files and set "withVerbose": true and "saveLog": false because of the crash mentioned above. It just redownloads the deleted files, and whenever it gets back to download 164/218 it crashes again.

The file it's trying to download is a post attachment. I deleted the folder, and when it gets to 164/218 it creates the folder again.

feel free to open an issue here: https://github.com/thophys/yiff_scraper/issues/new/choose
i'll try to investigate it further, but it's hard because i can't reproduce that error here on my side

i have no plans for that, at least not for now..sorry

No need to apologize. The scraper is still pretty neat nonetheless. I hope all goes well with it!

thanks bro it worked

pretty based software
any way you could implement fetching the content of the descriptions as well? some creators put their content in the desc (MEGA links, Dropbox etc.) like him: https://yiff.party/patreon/4045168

bug report here

[7/10/2020, 7:44:00 AM] Loading creator lesdias posts...
[7/10/2020, 7:44:00 AM] Loading creator https://yiff.party/patreon/2755238
lesdias-nsfw (lesdias)|11-11 pages => [=========================================] 100%
[7/10/2020, 7:44:04 AM] Loading creator lesdias posts done!
[7/10/2020, 7:44:05 AM] {"message":"Request failed with status code 404","name":"Error","stack":"Error: Request failed with status code 404\n at createError (/snapshot/yiff_scraper/node_modules/axios/lib/core/createError.js:16:15)\n at settle (/snapshot/yiff_scraper/node_modules/axios/lib/core/settle.js:17:12)\n at IncomingMessage.handleStreamEnd (/snapshot/yiff_scraper/node_modules/axios/lib/adapters/http.js:236:11)\n at IncomingMessage.emit (events.js:327:22)\n at endReadableNT (_stream_readable.js:1218:12)\n at processTicksAndRejections (internal/process/task_queues.js:84:21)","config":{"url":"https://yiff.party/7330723.json","method":"get","headers":{"Accept":"application/json, text/plain, */*","User-Agent":"axios/0.19.2"},"transformRequest":[null],"transformResponse":[null],"timeout":0,"responseType":"json","xsrfCookieName":"XSRF-TOKEN","xsrfHeaderName":"X-XSRF-TOKEN","maxContentLength":-1}}
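
The log above ends with an unhandled 404 on a .json request. One common way to keep a run alive in that situation is to treat "not found" as a skip rather than a fatal error. The Python sketch below only illustrates that idea; it is not how the scraper (which uses axios on Node) handles it, and `fetch_or_skip` plus the injected `fetch` callable are hypothetical names:

```python
from typing import Callable, Optional
from urllib.error import HTTPError


def fetch_or_skip(fetch: Callable[[str], dict], url: str) -> Optional[dict]:
    """Call `fetch(url)`; return None on a 404 instead of crashing.

    Any other HTTP error is re-raised so real failures stay visible.
    """
    try:
        return fetch(url)
    except HTTPError as err:
        if err.code == 404:
            print(f"skipping {url}: not found (404)")
            return None
        raise


def fake_missing(url: str) -> dict:
    # Stand-in for a real network call that hits a missing resource.
    raise HTTPError(url, 404, "Not Found", None, None)


print(fetch_or_skip(fake_missing, "https://example.invalid/creator.json"))
```

Passing the fetch function in as an argument keeps the sketch runnable without network access; a real downloader would pass its own HTTP call there.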