hmm… apparently the invalid twts are the latest ones I’d posted from Timeline, most probably because I’d tried to restore them manually after unintentionally overwriting my twtxt file with one that was out of date 🤦
@kat@yarn.girlonthemoon.xyz (that is targeted only at me. i do not shower enough. exposing myself)
@kat@yarn.girlonthemoon.xyz though i feel like this doesn’t need to be said because if anyone is that pretty they are not a self hoster because they regularly shower
@lyse@lyse.isobeef.org Oh! no need to be sorry and feel free to keep at it if it helps, I don’t mind. It’s just that I’m always on the lookout for corpo-bots and crawlers slipping through the cracks (a fun little game of sorts) 😅 the only thing I let them see is a robots.txt telling them to :diffoff
Also, I’m curious about the invalid lines in my feed. Is it something I should look out for in the future?
@kat@yarn.girlonthemoon.xyz after some fighting with this janky software (that i still love despite the jank) we now have stupid tux as our logo. slayyy
@movq@www.uninformativ.de oh no good luck!!!
So I need to figure out how to block ASN(s)…
Additionally, I’m thinking about how to detect DDoS attacks.
Here’s one way I’ve come up with that’s quite simple:
Detecting DDoS attacks by tracking requests across multiple IPs in a sliding window. If total requests exceed a threshold in a given time, flag as potential DDoS.
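A minimal sketch of that sliding-window idea, in Python. The class name, the window length and the threshold are all made up for illustration, and timestamps are passed in explicitly so it is easy to try out. It only counts total requests across all IPs inside the window, which is the simple version described above, not a tuned detector.

```python
from collections import deque


class SlidingWindowDetector:
    """Counts requests from all IPs inside a sliding time window (hypothetical helper)."""

    def __init__(self, window_seconds: float, threshold: int):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.events = deque()  # (timestamp, ip) pairs, oldest first

    def record(self, ip: str, now: float) -> bool:
        """Record one request; return True if the window exceeds the threshold."""
        self.events.append((now, ip))
        # Evict requests that have fallen out of the window.
        cutoff = now - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            self.events.popleft()
        return len(self.events) > self.threshold

    def distinct_ips(self) -> int:
        """Number of different source IPs currently inside the window."""
        return len({ip for _, ip in self.events})
```

In practice `now` would come from the request timestamp in the access log, and `distinct_ips()` can help tell one noisy client apart from traffic spread over many addresses.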
@lyse@lyse.isobeef.org Cool 👌
Hmmm so I’ve sustained two DDoS attacks on my Gitea server today. A few hours apart. Still analyzing the traffic…
For the time being… I’ve just blocked all of OpenAI’s bots. They (thankfully) publish a JSON endpoint that you can use to block all OpenAI crawlers from reaching your server (in my case, blocking it at the edge). Example:
proxy-1:~# curl -qs https://openai.com/gptbot.json | jq -r '.prefixes[].ipv4Prefix' | xargs -I{} ./block-ip.sh {}
Where …
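For anyone who would rather do the extraction step in a script, here is a rough Python equivalent of the `jq -r '.prefixes[].ipv4Prefix'` part of that pipeline. It assumes only the document shape implied by that jq filter (a `prefixes` list whose entries may carry an `ipv4Prefix` key). The function name is hypothetical, and the actual blocking step (`block-ip.sh` above) is not reproduced. Unlike the jq one-liner, entries without an `ipv4Prefix` are skipped instead of printed as `null`.

```python
import ipaddress
import json


def ipv4_prefixes(document: str) -> list[str]:
    """Return the validated IPv4 prefixes found in a JSON document (hypothetical helper)."""
    data = json.loads(document)
    result = []
    for entry in data.get("prefixes", []):
        prefix = entry.get("ipv4Prefix")
        if prefix is None:
            continue  # entry has no IPv4 prefix, e.g. an IPv6-only one
        # Parse the prefix so nothing malformed is handed to a firewall script.
        network = ipaddress.ip_network(prefix, strict=False)
        result.append(str(network))
    return result
```

Each returned prefix could then be passed to whatever does the blocking at the edge, one call per prefix, the same way `xargs -I{}` does above.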
@aelaraji@aelaraji.com Yes! 👏 This is exactly what it is! 🤣 I will of course soon™ be hosting this service, likely at validator.twtxt.net 😅😅
Any idea what this "twtxtfeevalidator/0.0.1" UA is about? I thought I’d ask before throwing a 1000GB file at it 🪤 Could it be the same ‘xt’ thing @lyse@lyse.isobeef.org was talking about the other day?
@kat@yarn.girlonthemoon.xyz Haha 🤣 If someone figures this out, please let me know 🙏🙏 – In the meantime, I’m going to very soon™ write a daemon that will watch the audit log for repeated violations and add the offenders to the network firewall.
This is better:
proxy-1:~# ./audit-log-by-ip.sh 4.227.36.76 | coraza-log-formatter -m -
2025/01/04 23:17:04 4.227.36.76 58982 GET /external?aff-HY0BLO=&f=mediaonly&f=noreplies&nick=g1n&uri=https%3A%2F%2Fthe-president-codes.linegames.org null 0 On OWASP_CRS/4.7.0
Actionset: OWASP_CRS/4.7.0
Message: Bad User Agent
Severity: 0
Raw: SecRule REQUEST_HEADERS:User-Agent "@pmFromFile /etc/cadd …
Nice! I wrote another useful tool 👌
proxy-1:~# ./audit-log-by-ip.sh 4.227.36.76 | coraza-log-formatter -m -
Actionset: OWASP_CRS/4.7.0
Message: Bad User Agent
Severity: 0
Raw: SecRule REQUEST_HEADERS:User-Agent "@pmFromFile /etc/caddy/waf/bad_user_agents.txt" "id:2000,log,phase:1,deny,msg:'Bad User Agent'"
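To make it clearer what a rule like this is doing, here is a small Python sketch of the matching idea. It is not Coraza's code. It assumes ModSecurity-style `@pmFromFile` behaviour as I understand it: one phrase per line, blank lines and `#` comment lines ignored, and a case-insensitive substring match against the header value. The function names and the sample phrases are hypothetical; the real contents of `bad_user_agents.txt` are not shown in this thread.

```python
def load_phrases(text: str) -> list[str]:
    """Parse a phrase file: one phrase per line, skipping blanks and '#' comments."""
    phrases = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            phrases.append(line.lower())
    return phrases


def is_bad_user_agent(user_agent: str, phrases: list[str]) -> bool:
    """True if any phrase occurs anywhere in the User-Agent, ignoring case."""
    ua = user_agent.lower()
    return any(phrase in ua for phrase in phrases)
```

A real WAF uses a multi-pattern matcher rather than a loop over phrases, but the outcome for a single header value should be the same under the assumptions above.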
@prologic@twtxt.net we live in hell
How in da fuq do you _actually_ make these fucking useless AI bots go away?
proxy-1:~# jq '. | select(.request.remote_ip=="4.227.36.76")' /var/log/caddy/access/mills.io.log | jq -s '. | last' | caddy-log-formatter -
4.227.36.76 - [2025-01-05 04:05:43.971 +0000] "GET /external?aff-QNAXWV=&f=mediaonly&f=noreplies&nick=g1n&uri=https%3A%2F%2Fmy-hero-ultra-impact-codes.linegames.org HTTP/2.0" …
Done.
@lyse@lyse.isobeef.org Oh good! It works haha 🤣 I’ll bump it up a bit 👌
And now I’ve applied rate limits on every site to reasonable values 👌
@bender Isn’t that why I’m yarning my progress? 🤣
@movq@www.uninformativ.de woah it’s like a cheatsheet with explanations! java is kind of arcane magic sorcery to me so i’m having trouble understanding it but i have that with most programming languages. this is like so much easier to actually look at and read instead of my eyes glazing over lol
@andros@twtxt.andros.dev Sorry I missed your messages to #twtxt on IRC. There are people there, but it can take several hours to get a response. E.g. I check it every day or two. I recommend using an IRC bouncer. To answer your question about registries, I used a couple of registries when I first started out, to try to find feeds to follow, but haven’t since then. I don’t remember which ones, but they were easy to find with web searches.
@prologic@twtxt.net YEAH it’s so cool!!! i was thinking about trying it as sorta practice for golang lol
@kat@yarn.girlonthemoon.xyz I’ve actually moved most of my stuff off of Cloudflare now 🤣 I’m actually very happy with my edge proxy setup that reverse proxies, caches and acts as a web application firewall 🥳
@kat@yarn.girlonthemoon.xyz Have you seen the SSG that I built and use on all my static sites? zs 🤔
Oh gawd. I can’t enable caching on my edge proxy everywhere 😱 Some shit™ doesn’t deal with a caching reverse proxy in front of it very well for some reason I don’t have time to dig into right now 🤔
@prologic@twtxt.net that’s iconic af though like i should do the same bc i hate cloudflare that much i just refuse to use them
@lyse@lyse.isobeef.org oh nah it came out like that lol! i actually love how squished it looks it feels accurate lol
oh yeah i think i might have a tripod around but i do need a sandbag or something i could use as one. maybe yeah a giant bag of rice could work LOL. thanks for the tips!!! i took a video class last year in college and we worked with cameras and tripods with sandbags so it was on my mind
@lyse@lyse.isobeef.org yeah! as long as it’s fun :D experimenting with it like picking up the camera every once in a while to point somewhere else, or in editing inserting more video in between the static angles, that could be fun!
@movq@www.uninformativ.de this is why people like me can’t code this is boring eyes glazing over kinda stuff lol
What’s a reasonable per second or per minute rate limit that I could apply in general at my edge proxy for all clients? (no matter what) … Like a good reasonable upper bound? 🤔
@movq@www.uninformativ.de Yeah I swear to god the engineers that write this shit™ don’t know how to write distributed crawlers that don’t hammer the shit™ out of their targets 🤦‍♂️
@doesnm@doesnm.p.psf.lt No. I generally don’t put up any robots.txt files at all really, because they mostly get ignored. I don’t generally mind if “normal” web crawlers crawl things. But LLM(s) can go fuck themselves 🤣
Did you have a disallow rule in robots.txt? (I think not, because I can google several twtxt.net posts)
@movq@www.uninformativ.de Yeah it’s starting to piss me off too 🤣 Not nearly as much as that guy, but still. Anyway I’m having fun! Now I just need to find a good IP/Subnet list that I can blacklist entirely, ideally one that’s updated frequently so I can refresh firewall rules.
Bloody fucking hell. I think one of Google’s GenAI crawlers was just hitting my Gitea instance quite hard. Fuck 🤬 Geez
@movq@www.uninformativ.de Oh 🤦‍♂️
I just banned 41 bad user agents from accessing any of my services. 😱
@movq@www.uninformativ.de How do you manage to get those skylines on your photos? 🤔
@doesnm@doesnm.p.psf.lt No, it’s only designed for yarnd. What did you have in mind here? 🤔
@doesnm@doesnm.p.psf.lt It is the same API that yarnc, the command-line client, uses.
I want this API for Goryon, or just Goryon with support for plain twtxt.txt. I can’t read the timeline without visible replies and with missing twts
i.e.: Not much point in running a WAF on a static site. But OTOH if there’s enough abuse from shitty assholes, there might be 🤔🤔
I’m just basically learning now how ModSecurity rules work and how to write my own.
The builtin OWASP rules are already working nicely 👌 – And yeah I won’t include the WAF on every site block, probably just my main/primary domain where I tend to run demo services and other things.
@kat@yarn.girlonthemoon.xyz If you’ve been following my yarns the other day about me getting off of Clownflare and building my own WAF, Proxy and effectively my own Edge network, you’ll know I’m doing this at the very edge 🤣🤣
@prologic@twtxt.net oooh gonna have to look into this, doubt most of my sites need it but i’m thinking one or two could use it
Having a lot of fun with Coraza today. A Web Application Firewall library written in Go that also happens to have a Caddy module.