[HN Gopher] First word discovered in unopened Herculaneum scroll... ___________________________________________________________________ First word discovered in unopened Herculaneum scroll by CS student Author : razin Score : 249 points Date : 2023-10-12 14:11 UTC (6 hours ago) (HTM) web link (scrollprize.org) (TXT) w3m dump (scrollprize.org) | sillysaurusx wrote: | See also Nat's twitter announcement: | https://twitter.com/natfriedman/status/1712470683207532906 | | $700k is a life changing amount of money. I admit, it's tempting | to drop everything and go devote myself like a monk to the | pursuit of ancient enlightenment via modern ML. I wonder where | we'd start... | | It's also funny that the scroll might just be a laundry list. | chakintosh wrote: | Or a customer complaint: | https://www.thearchaeologist.org/blog/complaint-tablet-to-ea... | Wojtkie wrote: | What I love about the Ea-Nasir story is the tablet was found | in a pile of other tablets, suggesting that Ea-Nasir saved | them. Why? Who knows, maybe he found them funny. | vimax wrote: | I heard somewhere it was common practice to reuse tablets. | It was easier to scrape the surface clean than to make a | new tablet. | | You'd save any tablets you have, and might wait until you | need it to scrape it clean. | | In Mesopotamia there was a period where it was fashionable | to use a more rare softer red clay on top of the white | clay. Your stylus would cut through the top layer leaving | nice white letters on a red background. It made it easier | to scrape clean and reuse, but much less durable over time. | jdminhbg wrote: | Yes, the clay tablets were used over and over. The ones | that are preserved have what was written on them when | they were fired, accidentally, by being in a building | that was destroyed by fire. | 0xf00ff00f wrote: | A laundry list with something purple... | seydor wrote: | purple was the color of nobility and rather rare. 
It might be | the description of a king or a room or roman fashion items. | riffraff wrote: | Or a complaint about bad writers | | https://en.m.wikipedia.org/wiki/Purple_prose | empath-nirvana wrote: | it might cost more than $700k in compute. | latchkey wrote: | It certainly did. | https://news.ycombinator.com/item?id=36312385 | cosmojg wrote: | Where do they say that the winners used that cluster? | latchkey wrote: | It is an assumption based on the fact that the codebase | uses cuda and the main backer of the project owns the | cluster. | terhechte wrote: | This is one of more than 600 scrolls that could be read | afterwards if the method becomes scalable. What's more: | "excavations were never completed, and many historians believe | that thousands more scrolls remain underground." [0] | | [0]: https://scrollprize.org | jdminhbg wrote: | > It's also funny that the scroll might just be a laundry list. | | Most likely not, I believe they're starting with scrolls that | were readable on the outside, which we know are minor works of | Greek stoic philosophy. Also a laundry list would be written on | a reusable wax tablet, rather than costly papyrus. | michael_nielsen wrote: | It's likely somehow a reference to the Emperor. Purple cloth | was extremely rare and expensive, and it was the colour worn by | the Emperors. Indeed, it eventually became a capital crime for | people outside the Emperor's family to wear it. I don't know if | that was yet true at the time of Vesuvius, although Wikipedia | claims Caligula may have had someone killed for wearing purple. | Arete314159 wrote: | The other word visible is "oino", wine. Wine can be described | as purple. | OfSanguineFire wrote: | While modern people make that connection, that is | culturally dependent. The color terms available to speakers | of a language, and what objects those terms can be | associated with, change over time. 
In the case of the Greek | word for "purple", it was connected to a dye and therefore | used for clothing, but one shouldn't expect it to be used | for wine. | dataflow wrote: | > $700k is a life changing amount of money | | Probably ~half of that will go to taxes? | thrway63245 wrote: | Not sure why this is downvoted. Yes, in California half will | go to taxes and the rest is enough for a downpayment on a | shack. Hardly life changing. | jedberg wrote: | > It's also funny that the scroll might just be a laundry list. | | Even if it were, a laundry list from 2000 years ago would be a | fascinating read. | davidw wrote: | That's extremely cool. I wonder what we'll learn. | | As an aside, the "Professor Seales and team scanning at the | particle accelerator" photo looks like it came from a TV show. | "If we keep telling the computer 'enhance', we'll be able to read | it". | adamlgerber wrote: | i love this project. i feel like this is going to be a great | source of interest and value over the next few years (and | potentially immeasurable value over longer time frames). | kelsey9876543 wrote: | I recently saw a wonderful youtube video on this: | https://www.youtube.com/watch?v=Z_L1oN8y7Bs | | Title: Herculaneum scrolls: A 20-year journey to read the | unreadable | | it goes a little bit into the technology of how this was done, | deep learning finally cracked the code. They had the scans for a | decade but it took ML training to be able to identify which parts | were paper and which parts were the ink on top. This had been | done on a different set of scrolls with easier to read higher | contrasting materials like the video says, 20 years ago. Deep | learning is cracking the code for these datasets we had | previously thought were impossible to algorithmically solve. | nulbyte wrote: | Thank you for sharing. It's a month old, but even so, I just | saw a pinned comment posted an hour ago about an announcement | coming later today. 
| versteegen wrote: | Can't speak for the video, but this is a bit misleading | actually. What cracked this was actually visual inspection | looking for patterns which could then be used as better | training data, which so far apparently hasn't found very many | letters that were too hard to see. Read the OP describing the | iterative process of hand-annotation guided by output of a | model, then retraining the model with the additional data, it's | a fascinating technique! Simply using deep learning on the | initially available ground truths without knowing what features | the models should be looking for actually pretty much didn't | work! | | Also, so far the process of virtually unrolling the scrolls is | mostly manual and extremely labour intensive. | kelsey9876543 wrote: | Thank you for adding the deeper insight! The competition and | the methods used are very fascinating indeed. | tclancy wrote: | Somewhat off-topic but if you clicked in here, you might be | interested in this book: "The Riddle of the Labyrinth: The Quest | to Crack an Ancient Code". | jdminhbg wrote: | This is the 21st-century equivalent of living through the opening | of Tut's tomb. Incredible to think there's a very real chance | that in the medium-term future you might be able to buy a copy of | a newly-translated work on Amazon that hasn't been read for | millennia. | carapace wrote: | Why the ad for Amazon? | jdminhbg wrote: | It's just a reference to making a boring, pervasive part of | culture. Please feel free to buy those translations at any | book company you feel like. | carapace wrote: | Sorry, I'm just cranky this morning. | alanbernstein wrote: | Surely they will be public domain by now?? | lexicality wrote: | It is disgraceful that the ancient Greek authors won't | see an obol that these so called "translators" and | "historians" make from reselling their work. | | They should sue! 
/s | jdminhbg wrote: | The original Greek text is, but I got a C in Greek so | I'll have to pay for a copyrighted English translation. | versteegen wrote: | The lettering was found by looking for 'crackle' texture on | papyrus segments from the CT scans which obviously were in the | shape of Greek letters, and annotating those as training data. | Unfortunately such crackle texture isn't visible, at least by | eye, on most of the papyrus. Probably it's only that visible | where the ink was very thick. You can easily see the difference | in texture in this electron microscope image [1] (far higher | resolution than the CT scans) but especially on the very edge of | the inked area (the narrow strip in the left image; I think the | whole right image is inked) where the ink was pushed to. I'm | surprised the crackle was discovered only after the Kaggle Ink | Detection contest. Looking at the CT-scanned fragments with | infrared ground truths, which were used in the Kaggle contest, | Casey Handmer wrote [2]: | | > The ongoing apparent failure of deep-learning based ink | detection based on the fragments indicated to me that direct | inspection of the actual data would be more fruitful, as it has | been here. | | > ... | | > I found similar "cracked mud" and "flake" textures | corresponding to known character ink, but only for perhaps 10% of | the known characters. It's been a long day, I can probably find | more on closer inspection, but that does make one wonder about | automated ink detection and what that is seeing. | | These new images are much better than I hoped for, but still only | in one small area, so I'm still pessimistic about more than an | odd sentence being readable. | | [1] https://scrollprize.org/img/tutorials/sem.png | | [2] https://caseyhandmer.wordpress.com/2023/08/05/reading- | ancien... | munificent wrote: | I love uses of machine learning like this a thousand times more | than generative LLMs spouting probable-sounding nonsense. 
| esafak wrote: | It is amazing what some college student can pull off with today's | technology. | Rallen89 wrote: | >Shortly after that, another contestant, Youssef Nader, | independently discovered the same word in the same area, with | even clearer results -- winning the second place prize of | $10,000. | | That's what u get for optimising your code | hansoolo wrote: | I thought the same. He had the better results, but too late. | QuercusMax wrote: | Or maybe the winner optimized his code, resulting in faster | time to get results. Either one is equally plausible! | zeteo wrote: | Not really: | | >Youssef used a model from the Kaggle competition and was | inspired by Luke's results to look in the same area. | autokad wrote: | imagine the person making this scroll 2,000 years ago wondering | 'I wonder if some kid 2000 years in the future is going to win a | boat load of money by reading this' | nataliste wrote: | I wrote this for a different community (filled with semiliterate | sophists), but this is absolutely huge and could upend huge | swathes of understanding about the last two thousand years. | | You can avoid the longform essay below if you want. The short of | it is there are several potentially common works possibly in the | library that could directly prove or disprove what is found in | the New Testament and the predicates of Rabbinic Judaism as | established at the Council of Jamnia. | | We could be seeing the beginning of conclusive proof that | invalidates the narratives of Christianity, Judaism, and Islam by | the end of the year. | | The Vesuvius Challenge isn't just an interesting contest in the | machine learning realm; it's a groundbreaking endeavor that could | redefine our understanding of the humanities if successful. The | opportunity to digitally unroll and read the Herculaneum Papyri | could offer unprecedented insights into ancient civilizations and | the total feedstock of civilization today. 
This is not merely | about filling in some historical gaps; it's about fundamentally | altering how we understand antiquity and, by extension, our own | intellectual heritage. | | The loss of the Library of Alexandria has long been considered a | "dark age" event for intellectual progress. Now, consider the | Herculaneum library--a collection of papyri from a villa once | owned by Julius Caesar's father-in-law, carbonized but preserved | by the Vesuvius eruption in 79 AD. Hundreds of these scrolls are | unreadable because their carbon-based ink blends in with the | carbonized papyrus, and thus are invisible to conventional | imaging techniques. Yet, these scrolls are quite possibly on the | cusp of revelation. | | Recent developments have introduced machine learning and high- | resolution X-ray scans as methods for reading these "unreadable" | scrolls. What texts do they contain? Treatises on science and | philosophy? The lost books of Livy? The epic cycle? Governmental | policies like the Twelve Tables? It's a tantalizing question | because whatever is locked in those scrolls could be an | unfiltered look at the Roman Empire--an empire that fundamentally | influenced the trajectory of Western culture, religion, | governance, and philosophy. | | Ponder a history of Rome that has not been retouched by myriadic | emperors, by Constantine's Christianity, or the interpretive lens | of the Roman Catholic Church. Unmediated accounts of Roman | society, unaltered by the layers of religious and political power | that came later, could rewrite our textbooks and shift the | justification of history. It's not just about enriching our | understanding of ancient civilizations; this could be a | cornerstone on which to build a fresh philosophical understanding | of human society. | | If the project succeeds, there will be repercussions in the | academic realm. 
The humanities have long struggled to justify | their existence in a world that increasingly prizes STEM and | lacks any novel sources for the classical world. Suddenly, there | could be a concrete, urgent task at hand: to decode, interpret, | and integrate an influx of new knowledge. The Vesuvius Challenge | could revitalize the field, offering an unforeseen but compelling | reason for its study. In essence, it provides a utilitarian | justification for the humanities, one that transcends 'cultural | enrichment' and enters the realm of 'historical redefinition.' | | The Vesuvius Challenge could be the hinge upon which history | swings, yielding intellectual treasure that could be as | groundbreaking as the writings that were lost in Alexandria. For | millennia, those scrolls have remained unread. Now, it's a | software problem. That's not just a challenge; it's an | imperative. | | The presence of specific works in the Herculaneum Papyri could | dramatically impact our understanding of major historical events. | | In particular for me, I pray that the biography of Herod the | Great by Nicholas of Damascus is discovered intact. While | mainstream accounts generally portray the life of Herod within | the context of Roman patronage and Judaean politics, uncovering a | contemporary account by a close intimate (and used as a primary | source by Josephus) would offer fresh, unmediated insights into | his rule and its socio-political intricacies. Chronologies of the | life of Jesus could be explicitly validated or disproved. | | The relevance here is far from academic. Consider the following | naturalistic hypothesis: that the inception and rise of | Christianity was entirely a dynastic struggle within the | Hasmonean-Herodian line. What if the tale of Jesus is, in | essence, a dramatized, mystified rendition of a 1st-century | dynastic conflict, one that was subsequently co-opted and | transformed into a religious narrative by an early form of | conspiratorial thinking? 
Something like a 1st-century version of | Q-anon, distorting real events to serve an alternative, concealed | agenda in the aftermath of the First Jewish-Roman War. | | Unveiling a document like Nicholas of Damascus' biography could | be groundbreaking in testing such a hypothesis. If Herod's life | and rule were detailed without the religious overlays that later | Christian interpretations bring into the picture, one could make | more definitive assertions about the socio-political environment | of the time. Furthermore, it could provide concrete evidence to | either substantiate or refute theories about Christianity's | emergence as a byproduct of a Herodian-Hasmonean power struggle. | | The fact that such a theory could be _tested_ is significant in | its own right. Traditionally, discussions about early | Christianity rely heavily on religious texts and subsequent | historical accounts, many of which are fraught with dogma and | ideological interpretations. A primary source devoid of such | influences would be a game-changer, offering a baseline of raw | data from which more accurate and reliable hypotheses could be | drawn. | | And it's not limited solely to Christianity. Rabbinic Judaism | could have equally monumental implications as a result. The owner | of the villa, likely a wealthy Roman, would be unlikely to have | had any primary Hebrew texts like the Pentateuch. However, that | doesn't rule out the possibility of possessing Greek or Latin | works discussing Jewish culture, beliefs, and politics. Given the | villa's historical context, it's conceivable that there might be | indirect ethnographic accounts from the period surrounding the | destruction of Jerusalem in 70 AD but before the Council of | Jamnia, traditionally dated around 90 AD, which helped canonize | Hebrew scriptures. | | Why is this important? The Council of Jamnia is often cited as a | crucial moment for the development of Rabbinic Judaism. 
It | allegedly led to the fixing of the Hebrew Bible canon and | crystallized what would become Talmudic tradition. If documents | were to surface that provide a snapshot of Judaic thought and | practice just before this council, it could upend millennia of | precedent and identity. | | In a broader context, discovering pre-Jamnia ethnographic sources | could significantly change our understanding of how Judaism | adapted and evolved in the aftermath of the Second Temple's | destruction. This could lead to far-reaching questions. How much | of the Talmudic tradition was actually a post-hoc rationalization | or systematization of beliefs and practices that were far more | fluid before the Council of Jamnia? How much anti-Romanism was | pared away to prevent suppression? Moreover, how would such a | revelation interact with or even challenge the validity of | current Rabbinic and Orthodox Jewish practices? | | The implications for the Judeo-Christian heritage as a whole are | staggering. If both Christianity and Judaism could be traced back | explicitly to politically or socially motivated machinations, | rather than divinely inspired or time-honored traditions, the | entire foundation of Judeo-Christian culture would come into | question. In essence, the Vesuvius Challenge has the potential to | destabilize two of the world's major religious traditions at | their historical roots. It is difficult to overstate the | potential impacts. | | The Vesuvius Challenge is not just an academic or technological | endeavor. Its success could instigate an unparalleled | epistemological crisis in religious studies and the humanities. | It provides the opportunity to re-examine, with primary sources, | the historical foundations of Western religious, cultural, and | ultimately political traditions. We're not just potentially | rewriting history here; we're reevaluating the very frameworks | through which that history has been understood. 
| narag wrote: | So this is just the very beginning? Will they be able to | decipher whole docs? I guess you wouldn't have written all that | otherwise! | | Anyway, if there's religion involved, I doubt any revelation | will shake anything. | 1vuio0pswjnm7 wrote: | "He found a few dozen ink strokes - and some complete letters - | that could be labeled and used as training data. | | Before long, the model was unveiling traces of crackle invisible | to his own eye. Soon, these traces began to form letters and | hints of actual words." | | This does not sound like a "Large Language Model (LLM)" or other | large set of training data, like the sort hyped by so-called | "tech" companies; this sounds relatively small. What am I | missing. (Besides brain cells.) ___________________________________________________________________ (page generated 2023-10-12 21:00 UTC)