Fark me 🤦♂️ I woke up quite late today (after a long night assisting with a Mainframe migration for work) to abusive traffic and my alerts going off. The impact? My pod (twtxt.net) was being hammered by something at a request rate of 30 req/s (there are global rate limits in place, but still…). The culprit? It turned out to be a single IP, 43.134.51.191, and after looking into who owns that IP I discovered it was yet-another-bad-customer-or-whatever from Tencent, so that entire network (ASN) is now blocked from my Edge:
+# Who: Tencent
+# Why: Bad Bots
+132203
Total damage?
$ caddy-log-formatter twtxt.net.log | cut -f 1 -d ' ' | sort | uniq -c | sort -r -n -k 1 | head -n 5
61371 43.134.51.191
402 159.196.9.199
121 45.77.238.240
8 106.200.1.116
6 104.250.53.138
61k reqs over an hour or so (before I noticed), a bunch of CPU time burned, and a useless waste of my fucking time.
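For anyone who prefers a script over the shell pipeline, the same per-client tally can be sketched in Python. This is only a minimal sketch: it assumes, as the `cut -f 1 -d ' '` step above does, that the first space-separated field of each formatted log line is the client IP. The function name `top_clients` and the script name in the usage note are made up for illustration.

```python
from collections import Counter


def top_clients(lines, n=5):
    """Return the n busiest clients as (ip, request_count) pairs.

    Assumption: the first space-separated field of each line is the
    client IP, mirroring `cut -f 1 -d ' '` in the shell pipeline.
    Blank lines are skipped rather than counted.
    """
    counts = Counter()
    for line in lines:
        first_field = line.split(" ", 1)[0].strip()
        if first_field:
            counts[first_field] += 1
    # most_common sorts by count, highest first (like `sort -r -n | head`).
    return counts.most_common(n)


if __name__ == "__main__":
    import sys

    for ip, count in top_clients(sys.stdin):
        print(f"{count:7d} {ip}")
```

Saved as, say, `top_clients.py` (a hypothetical name), it would be fed the same way: `caddy-log-formatter twtxt.net.log | python3 top_clients.py`.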
Standard Ebooks: liberated ebooks, carefully produced for the true book lover
Article URL: https://standardebooks.org
Comments URL: https://news.ycombinator.com/item?id=43599637
Points: 505
# Comments: 106
reviewing logs this morning and found i have been spammed hard by bots not respecting the robots.txt file. only noticed it because the OpenAI bot was hitting me with a lot of nonsensical requests. here is the list from last month:
- (810) bingbot
- (641) Googlebot
- (624) http://www.google.com/bot.html
- (545) DotBot
- (290) GPTBot
- (106) SemrushBot
- (84) AhrefsBot
- (62) MJ12bot
- (60) BLEXBot
- (55) wpbot
- (37) Amazonbot
- (28) YandexBot
- (22) ClaudeBot
- (19) AwarioBot
- (14) https://domainsbot.com/pandalytics
- (9) https://serpstatbot.com
- (6) t3versionsBot
- (6) archive.org_bot
- (6) Applebot
- (5) http://search.msn.com/msnbot.htm
- (4) http://www.googlebot.com/bot.html
- (4) Googlebot-Mobile
- (4) DuckDuckGo-Favicons-Bot
- (3) https://turnitin.com/robot/crawlerinfo.html
- (3) YandexNews
- (3) ImagesiftBot
- (2) Qwantify-prod
- (1) http://www.google.com/adsbot.html
- (1) http://gais.cs.ccu.edu.tw/robot.php
- (1) YaK
- (1) WBSearchBot
- (1) DataForSeoBot
i have placed some middleware to reject these for now but it is not a foolproof solution.
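the post does not say what the middleware looks like or what stack it runs on, so the following is only a rough sketch of the general idea, assuming a Python WSGI app. the names `BLOCKED_AGENTS`, `is_blocked` and `RejectBots` are hypothetical, and the token list is just a few entries picked from the list above.

```python
# Hypothetical block list: a few user-agent tokens taken from the list above.
BLOCKED_AGENTS = ("GPTBot", "ClaudeBot", "SemrushBot", "AhrefsBot", "MJ12bot", "DotBot")


def is_blocked(user_agent, blocked=BLOCKED_AGENTS):
    """Case-insensitive substring match of the User-Agent against the block list."""
    ua = (user_agent or "").lower()
    return any(token.lower() in ua for token in blocked)


class RejectBots:
    """WSGI middleware that answers 403 for blocked user agents.

    Assumption: the site is a WSGI application; the original post does not
    say which framework or language its middleware uses.
    """

    def __init__(self, app, blocked=BLOCKED_AGENTS):
        self.app = app
        self.blocked = blocked

    def __call__(self, environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT"), self.blocked):
            body = b"Forbidden\n"
            headers = [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ]
            start_response("403 Forbidden", headers)
            return [body]
        # Anything not on the list is passed through untouched.
        return self.app(environ, start_response)
```

wrapping an existing app would be `app = RejectBots(app)`. a check like this only looks at the User-Agent header, which any client can set to whatever it likes, so a bot that lies about its identity walks straight through.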
6.1.107: longterm
Version: 6.1.107 (longterm)
Released: 2024-08-29
Source: linux-6.1.107.tar.xz
PGP Signature: linux-6.1.107.tar.sign
Patch: full (incremental)
ChangeLog: ChangeLog-6.1.107
6.1.106: longterm
Version: 6.1.106 (longterm)
Released: 2024-08-19
Source: linux-6.1.106.tar.xz
PGP Signature: linux-6.1.106.tar.sign
Patch: full (incremental)
ChangeLog: ChangeLog-6.1.106
Commemoration of the victims of the Kragujevac mutiny
Slovak Ambassador Michal Pavúk and Defence Attaché Col. Rastislav Skyva, together with Zoran Antić, State Secretary of the Ministry of Labour, Employment, Veteran and Social Affairs, and representatives of the Ministry of Defence and the City of Kragujevac, laid a wreath at the memorial on Stanovljansko polje to honour the memory of 44 executed Slovak soldiers, members of the 71st Infantry Regiment of the Austro-Hungarian Army, who 106 years ago paid …
Paradise Explained, Meaning Lost: A Nonsensically Annotated Edition of Milton’s Epic · Issue #106 · NaNoGenMo/2018 · GitHub https://github.com/NaNoGenMo/2018/issues/106