@prologic@twtxt.net Specifically, I could view yarnd's copy here, but only as rendered for a human to view: https://twtxt.net/twt/st3wsda
@movq@www.uninformativ.de thanks for getting to the bottom of it. @prologic@twtxt.net is there a way to view yarnd's copy of the raw twt? The edit didn't result in a visible change; being able to see what yarnd originally downloaded would have helped me debug.
The actual end-user problem is that I can't see the thread properly when using neomutt+jenny.
@prologic@twtxt.net One of your twts begins with (#st3wsda): https://twtxt.net/twt/bot5z4q
Based on the twtxt.net web UI, it seems to be in reply to a twt by @cuaxolotl@sunshinegardens.org which begins "I've been sketching out...".
But jenny thinks the hash of that twt is 6mdqxrq. At least, there's a twt in their feed with that hash that has the same text as appears on yarn.social (except with ' instead of ’).
Based on this, it appears jenny and yarnd disagree about the hash of the twt, or perhaps the twt was edited (though I can't see any difference, assuming ' vs ’ is just a rendering choice).
@prologic@twtxt.net I believe you when you say registries as designed today do not crawl. But when I first read the spec, it conjured in my mind a search engine. Now I don't know how things work out in practice, but just based on reading, I don't see why it can't be an API for a crawling search engine. (In fact I don't see anything in the spec indicating registry servers shouldn't crawl.)
(I also noticed that https://twtxt.readthedocs.io/en/latest/user/registry.html recommends "The registries should sync each others user list by using the users endpoint". If I understood that right, registering with one should be enough to appear on others, even if they don't crawl.)
Does yarnd provide an API for finding twts? Is it similar?
@prologic@twtxt.net I guess I thought they were search engines. Anyway, the registry API looks like a decent one for searching for tweets. Could/should yarn.social pods implement the same API?
I just manually followed the steps at https://dev.twtxt.net/doc/twthashextension.html and got 6mdqxrq. I wonder what happened. Did @cuaxolotl@sunshinegardens.org edit the twt in some subtle way after twtxt.net downloaded it? I couldn't spot a diff, other than ' appearing as ’ on yarn.social, which I assume is a transformation done by twtxt.net.
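For anyone who wants to repeat the manual steps, here is a minimal Python sketch of the hash calculation as I understand the Twt Hash extension: BLAKE2b-256 over the feed URL, timestamp and text joined by newlines, Base32-encoded without padding, lowercased, keeping the last 7 characters. The function name `twt_hash` and the sample inputs are made up for illustration, and the sketch assumes the timestamp is already in exactly the form the spec expects.

```python
import base64
import hashlib


def twt_hash(feed_url: str, timestamp: str, text: str) -> str:
    """Sketch of the Twt Hash steps; inputs are used exactly as given."""
    payload = "\n".join((feed_url, timestamp, text)).encode("utf-8")
    digest = hashlib.blake2b(payload, digest_size=32).digest()
    # Base32 without padding, lowercased; keep the last 7 characters.
    encoded = base64.b32encode(digest).decode("ascii").rstrip("=").lower()
    return encoded[-7:]


if __name__ == "__main__":
    # Hypothetical feed URL, timestamp and text, for illustration only.
    url = "https://example.org/twtxt.txt"
    ts = "2024-01-01T00:00:00Z"
    print(twt_hash(url, ts, "I've been sketching"))
    print(twt_hash(url, ts, "I\u2019ve been sketching"))
```

The two printed hashes should differ, which shows how a client that hashes the text after a ' to ’ transformation would end up disagreeing with one that hashes the raw feed line.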
@prologic@twtxt.net What's the difference between search.twtxt.net and the /api/plain/tweets endpoint of a registry? In my mind, a registry is a twtxt search engine. Or are registries not supposed to do their own crawling to discover new feeds?
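To make the comparison concrete, here is a rough Python sketch of how a client might use a registry's /api/plain/tweets endpoint. It assumes the `q` query parameter and the tab-separated response lines (nick, URL, timestamp, text) as I read them in the registry docs; the registry host and the helper names are hypothetical, and no network request is made.

```python
from typing import NamedTuple
from urllib.parse import urlencode


class RegistryTwt(NamedTuple):
    nick: str
    url: str
    timestamp: str
    text: str


def tweets_query_url(registry_base: str, query: str) -> str:
    """Build a search URL for a registry's plain-text tweets endpoint."""
    return registry_base.rstrip("/") + "/api/plain/tweets?" + urlencode({"q": query})


def parse_plain_response(body: str) -> list[RegistryTwt]:
    """Parse tab-separated lines; skip lines that do not have four fields."""
    twts = []
    for line in body.splitlines():
        fields = line.split("\t", 3)
        if len(fields) == 4:
            twts.append(RegistryTwt(*fields))
    return twts


if __name__ == "__main__":
    # registry.example.org is a placeholder, not a real registry.
    print(tweets_query_url("https://registry.example.org", "hex club"))
    sample = "alice\thttps://example.org/twtxt.txt\t2024-01-01T00:00:00Z\tHello twtxt!"
    print(parse_plain_response(sample))
```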
@prologic@twtxt.net How does yarn.social's API fix the problem of centralization? I still need to know whose API to use.
Say I see a twt beginning (#hash) and I want to look up the start of the thread. Is the idea that if that twt is hosted by a yarn.social pod, it is likely to know the thread start, so I should query that particular pod for the hash? But what if no yarn.social pods are involved?
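Whichever service ends up answering the lookup, the first step is pulling the hash out of the reply. A small Python sketch, assuming the subject looks like `(#xxxxxxx)` with 7 lowercase Base32 characters and may be preceded by raw `@<nick url>` mentions; the function name is made up.

```python
import re
from typing import Optional

# Optional leading mentions, then "(#hash)" with 7 lowercase Base32 characters.
SUBJECT_RE = re.compile(r"^\s*(?:@<[^>]*>\s*)*\(#([a-z2-7]{7})\)")


def subject_hash(twt_text: str) -> Optional[str]:
    """Return the thread hash a twt replies to, or None if it has no subject."""
    match = SUBJECT_RE.match(twt_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(subject_hash("(#st3wsda) I agree"))
    print(subject_hash("Hello twtxt!"))
```

The returned hash is what would be handed to a pod, a registry, or a local index of twts already fetched.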
The community seems small enough that a registry server should be able to keep up, and I can have a couple of others as backups. Or I could crawl the list of feeds followed by whoever emitted the twt that prompted my query.
I have successfully used registry servers a little bit, e.g. to find a feed that mentioned a tag I was interested in. Was even thinking of making my own, if I get bored of my too many other projects :-)
@movq@www.uninformativ.de Thanks, it works!
But when I tried it out on a twt from @prologic@twtxt.net, I discovered jenny and yarn.social seem to disagree about the hash of this twt: https://twtxt.net/twt/st3wsda . jenny assigned it a hash of 6mdqxrq but the URL and prologic's reply suggest yarn.social thinks the hash is st3wsda. (And as a result, jenny --fetch-context didn't work on prologic's twt.)
@movq@www.uninformativ.de Thanks! Looking forward to trying it out. Sorry for the silence; I have become unexpectedly busy so no time for twtxt these past few days.
@prologic@twtxt.net Yes, fetching the twt by hash from some service could be a good alternative, in case the twt I have does not @-mention the source. (Besides yarnd, maybe this should be part of the registry API? I don't see fetch-by-hash in the registry API docs.)
@movq@www.uninformativ.de I don't know if I'd want to discard the twts. I think what I'm looking for is a command "jenny -g https://host.org/twtxt.txt" to fetch just that one feed, even if it's not in my follow list. I could wrap that in a shell script so that when I see a twt in reply to a feed I don't follow, I can just tap a key and the feed will get added to my maildir. I guess the script would look for a mention at the start of a selected twt and call jenny -g on the feed.
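The "look for a mention at the start of a selected twt" step could be sketched like this in Python. It assumes the raw twtxt mention form `@<nick url>` (or `@<url>`), optionally preceded by a `(#hash)` subject; the function name is hypothetical, and the sketch only extracts the feed URL, leaving the actual fetch to whatever one-off fetch command ends up existing.

```python
import re
from typing import Optional

# Optional "(#hash)" subject, then a mention of the form @<nick url> or @<url>.
LEADING_MENTION_RE = re.compile(
    r"\s*(?:\(#[^)]*\)\s*)?@<(?:([^\s>]+)\s+)?([^\s>]+)>"
)


def leading_mention_url(twt_text: str) -> Optional[str]:
    """Return the feed URL of the mention that starts the twt, if any."""
    match = LEADING_MENTION_RE.match(twt_text)
    return match.group(2) if match else None


if __name__ == "__main__":
    twt = "@<slashdot https://feeds.twtxt.net/slashdot/twtxt.txt> interesting!"
    print(leading_mention_url(twt))
```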
(@anth@a.9srv.net's feed almost never works, but I keep it because they told me they want to fix their server some time.)
I guess I can configure neomutt to hide the feeds I don't care about.
@movq@www.uninformativ.de Is there a good way to get jenny to do a one-off fetch of a feed, for when you want to fill in missing parts of a thread? I just added @slashdot@feeds.twtxt.net to my private follow file just because @prologic@twtxt.net keeps responding to the feed :-P and I want to know what he's commenting on even though I don't want to see every new slashdot twt.
@bender@twtxt.net Based on my experience so far, as a user, I would be upset if my client dropped someone from my follower list, i.e. stopped fetching their feed, without me asking for that to happen.
@bender@twtxt.net I'm not a yarnd user, but automatically unfollowing on 404 doesn't seem right. Besides @lyse@lyse.isobeef.org's example, I could imagine just accidentally renaming my own twtxt file, or forgetting to push it when I point my DNS to a new web server. I'd rather not lose all my yarnd followers in a situation like that (and hopefully they feel the same).
@prologic@twtxt.net @bender@twtxt.net Exponential backoff? Seems like the right thing to do when a server isn't accepting your connections at all, and might also be a reasonable compromise if you consider 404 to be a temporary failure.
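A minimal Python sketch of what I mean by exponential backoff for a failing feed: double the wait after each consecutive failure, up to a cap, and go back to the normal interval on success. The function name and the numbers (10-minute base, one-day cap) are arbitrary choices for illustration, not anything yarnd actually does.

```python
def next_fetch_delay(base_seconds: int, consecutive_failures: int, max_seconds: int) -> int:
    """Delay before the next fetch attempt, doubling per failure up to a cap."""
    return min(max_seconds, base_seconds * 2 ** consecutive_failures)


if __name__ == "__main__":
    # Delays in seconds after 0..9 consecutive failures.
    for failures in range(10):
        print(failures, next_fetch_delay(600, failures, 86400))
```

With a cap, the feed is never dropped: a server that comes back after a long outage is picked up again within at most one capped interval.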
@prologic@twtxt.net The headline is interesting and sent me down a rabbit hole understanding what the paper (https://aclanthology.org/2024.acl-long.279/) actually says.
The result is interesting, but the Neuroscience News headline greatly overstates it. If I've understood right, they are arguing (with strong evidence) that the simple technique of making neural nets bigger and bigger isn't quite as magically effective as people say, if you use it on its own. In particular, they evaluate LLMs without two common enhancements, in-context learning and instruction tuning. Both of those involve using a small number of examples of the particular task to improve the model's performance, and they turn them off because they are not part of what is called "emergence": "an ability to solve a task which is absent in smaller models, but present in LLMs".
They show that these restricted LLMs only outperform smaller models (i.e. demonstrate emergence) on certain tasks, and then (end of Section 4.1) discuss the nature of those few tasks that showed emergence.
I'd love to hear more from someone more familiar with this stuff. (I've done research that touches on ML, but neural nets and especially LLMs aren't my area at all.) In particular, how compelling is this finding that zero-shot learning (i.e. without in-context learning or instruction tuning) remains hard as model size grows?
@movq@www.uninformativ.de Variable names used with -eq in [[ ]] are automatically expanded even without $ as explained in the "ARITHMETIC EVALUATION" section of the bash man page. Interesting. Trying this on OpenBSD's ksh, it seems "set -u" doesn't affect that substitution.
Morphotrophic by Greg Egan is built around an idea for how life on Earth could have worked out differently. It gets increasingly strange and interesting as the story progresses. My partner and I finished it last night and thoroughly enjoyed it. The beginning is free online: https://gregegan.net/MORPHOTROPHIC/00/MorphotrophicExcerpt.html #scifi #reading
@prologic@twtxt.net I don't know what you mean when you call them stochastic parrots, or how you define understanding. It's certainly true that current language models show an obvious lack of understanding in many situations, but I find the trend impressive. I would love to see someone achieve similar results with much less power or training data.
@prologic@twtxt.net I thought "stochastic parrot" meant a complete lack of understanding.
@movq@www.uninformativ.de The success of large neural nets. People love to criticize today's LLMs and image models, but if you compare them to what we had before, the progress is astonishing.
@prologic@twtxt.net Thanks. It's from a non-Euclidean geometry project: https://www.falsifian.org/blog/2022/01/17/s3d/
@prologic@twtxt.net Thanks for the invitation. What time of day?
@prologic@twtxt.net Fair enough! I just added some metadata.
Thanks @prologic@twtxt.net! I like the way Yarn.social is making all of twtxt stronger, not just Yarn.social pods.
Does anyone care about the 140-char limit recommended by the #twtxt spec? I have been trying to respect it but wonder if it's wasted effort.
I learned a #Toronto #hex club just started! I've played since '98 or '99, but rarely in person. https://www.hexwiki.net/index.php/Hex_clubs
@movq@www.uninformativ.de Thanks!
Hello twtxt! I'm James (or @falsifian@www.falsifian.org). I live in Toronto. Recent interests include space complexity, simple software, and science fiction.