[HN Gopher] First word discovered in unopened Herculaneum scroll...
       ___________________________________________________________________
        
       First word discovered in unopened Herculaneum scroll by CS student
        
       Author : razin
       Score  : 249 points
       Date   : 2023-10-12 14:11 UTC (6 hours ago)
        
 (HTM) web link (scrollprize.org)
 (TXT) w3m dump (scrollprize.org)
        
       | sillysaurusx wrote:
       | See also Nat's twitter announcement:
       | https://twitter.com/natfriedman/status/1712470683207532906
       | 
       | $700k is a life changing amount of money. I admit, it's tempting
       | to drop everything and go devote myself like a monk to the
       | pursuit of ancient enlightenment via modern ML. I wonder where
       | we'd start...
       | 
       | It's also funny that the scroll might just be a laundry list.
        
         | chakintosh wrote:
         | Or a customer complaint:
         | https://www.thearchaeologist.org/blog/complaint-tablet-to-ea...
        
           | Wojtkie wrote:
           | What I love about the Ea-Nasir story is the tablet was found
           | in a pile of other tablets, suggesting that Ea-Nasir saved
           | them. Why? Who knows, maybe he found them funny.
        
             | vimax wrote:
             | I heard somewhere it was common practice to reuse tablets.
             | It was easier to scrape the surface clean than to make a
             | new tablet.
             | 
             | You'd save any tablets you have, and might wait until you
             | need it to scrape it clean.
             | 
             | In Mesopotamia there was a period where it was fashionable
             | to use a more rare softer red clay on top of the white
             | clay. Your stylus would cut through the top layer leaving
             | nice white letters on a red background. It made it easier
             | to scrape clean and reuse, but much less durable over time.
        
               | jdminhbg wrote:
               | Yes, the clay tablets were used over and over. The ones
               | that are preserved have what was written on them when
               | they were fired, accidentally, by being in a building
               | that was destroyed by fire.
        
         | 0xf00ff00f wrote:
         | A laundry list with something purple...
        
           | seydor wrote:
           | purple was the color of nobility and rather rare. It might be
           | the description of a king or a room or roman fashion items.
        
             | riffraff wrote:
             | Or a complaint about bad writers
             | 
             | https://en.m.wikipedia.org/wiki/Purple_prose
        
         | empath-nirvana wrote:
         | it might cost more than $700k in compute.
        
           | latchkey wrote:
           | It certainly did.
           | https://news.ycombinator.com/item?id=36312385
        
             | cosmojg wrote:
             | Where do they say that the winners used that cluster?
        
               | latchkey wrote:
               | It is an assumption based on the fact that the codebase
               | uses cuda and the main backer of the project owns the
               | cluster.
        
         | terhechte wrote:
         | This is one of more than 600 scrolls that could be read
         | afterwards if the method becomes scalable. What's more:
         | "excavations were never completed, and many historians believe
         | that thousands more scrolls remain underground." [0]
         | 
         | [0]: https://scrollprize.org
        
         | jdminhbg wrote:
         | > It's also funny that the scroll might just be a laundry list.
         | 
         | Most likely not, I believe they're starting with scrolls that
         | were readable on the outside, which we know are minor works of
         | Greek stoic philosophy. Also a laundry list would be written on
         | a reusable wax tablet, rather than costly papyrus.
        
         | michael_nielsen wrote:
         | It's likely somehow a reference to the Emperor. Purple cloth
         | was extremely rare and expensive, and it was the colour worn by
         | the Emperors. Indeed, it eventually became a capital crime for
         | people outside the Emperor's family to wear it. I don't know if
         | that was yet true at the time of Vesuvius, although Wikipedia
         | claims Caligula may have had someone killed for wearing purple.
        
           | Arete314159 wrote:
           | The other word visible is "oino", wine. Wine can be described
           | as purple.
        
             | OfSanguineFire wrote:
             | While modern people make that connection, that is
             | culturally dependent. The color terms available to speakers
             | of a language, and what objects those terms can be
             | associated with, change over time. In the case of the Greek
             | word for "purple", it was connected to a dye and therefore
             | used for clothing, but one shouldn't expect it to be used
             | for wine.
        
         | dataflow wrote:
         | > $700k is a life changing amount of money
         | 
         | Probably ~half of that will go to taxes?
        
           | thrway63245 wrote:
           | Not sure why this is downvoted. Yes, in California half will
           | go to taxes and the rest is enough for a downpayment on a
           | shack. Hardly life changing.
        
         | jedberg wrote:
         | > It's also funny that the scroll might just be a laundry list.
         | 
         | Even if it were, a laundry list from 2000 years ago would be a
         | fascinating read.
        
       | davidw wrote:
       | That's extremely cool. I wonder what we'll learn.
       | 
       | As an aside, the "Professor Seales and team scanning at the
       | particle accelerator" photo looks like it came from a TV show.
       | "If we keep telling the computer 'enhance', we'll be able to read
       | it".
        
       | adamlgerber wrote:
       | i love this project. i feel like this is going to be a great
       | source of interest and value over the next few years (and
       | potentially immeasurable value over longer time frames).
        
       | kelsey9876543 wrote:
       | I recently saw a wonderful youtube video on this:
       | https://www.youtube.com/watch?v=Z_L1oN8y7Bs
       | 
       | Title: Herculaneum scrolls: A 20-year journey to read the
       | unreadable
       | 
       | It goes a little into the technology of how this was done:
       | deep learning finally cracked the code. They had the scans for a
       | decade, but it took ML training to identify which parts were
       | papyrus and which parts were the ink on top. As the video says,
       | this had been done 20 years ago on a different set of scrolls
       | with easier-to-read, higher-contrast materials. Deep
       | learning is cracking the code for these datasets we had
       | previously thought were impossible to algorithmically solve.
        
         | nulbyte wrote:
         | Thank you for sharing. It's a month old, but even so, I just
         | saw a pinned comment posted an hour ago about an announcement
         | coming later today.
        
         | versteegen wrote:
         | Can't speak for the video, but this is a bit misleading.
         | What cracked this was visual inspection, looking for patterns
         | that could then be used as better training data. So far the
         | model apparently hasn't found many letters that were too hard
         | to see by eye. Read the OP describing the iterative process of
         | hand-annotation guided by the output of a model, then
         | retraining the model with the additional data; it's a
         | fascinating technique! Simply using deep learning on the
         | initially available ground truths, without knowing what
         | features the models should be looking for, pretty much didn't
         | work!
         | 
         | Also, so far the process of virtually unrolling the scrolls is
         | mostly manual and extremely labour intensive.
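
The annotate-then-retrain loop described in this comment can be sketched roughly as follows. This is only a toy illustration, not the contestants' code: the nearest-centroid "model", the synthetic 2-D patch features, and the names `fit_centroids`, `ink_score`, `iterative_labeling`, and `annotate` are all assumptions made up for the example. The real pipeline trains a neural network on CT-scan volumes.

```python
import numpy as np

def fit_centroids(X, y):
    """Toy 'model': one mean feature vector per class (0 = bare papyrus, 1 = ink)."""
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def ink_score(centroids, X):
    """Score in [0, 1]; values near 1 mean a patch sits closer to the ink centroid."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d[:, 0] / (d[:, 0] + d[:, 1] + 1e-12)

def iterative_labeling(X, seeds, annotate, rounds=3, top_k=20):
    """Alternate between fitting the model and hand-checking its suggestions."""
    labeled = dict(seeds)  # patch index -> label confirmed by a human
    for _ in range(rounds):
        idx = np.array(sorted(labeled))
        y = np.array([labeled[i] for i in idx])
        centroids = fit_centroids(X[idx], y)
        unlabeled = np.array([i for i in range(len(X)) if i not in labeled])
        if unlabeled.size == 0:
            break
        # Surface the patches the current model is most confident about...
        confidence = np.abs(ink_score(centroids, X[unlabeled]) - 0.5)
        candidates = unlabeled[np.argsort(-confidence)[:top_k]]
        # ...and let the annotator confirm or correct each one.
        for i in candidates:
            labeled[int(i)] = annotate(int(i))
    # Refit once more so the returned model uses every confirmed label.
    idx = np.array(sorted(labeled))
    y = np.array([labeled[i] for i in idx])
    return fit_centroids(X[idx], y), labeled

# Synthetic stand-in data (an assumption): 200 papyrus and 200 ink patches.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (200, 2)), rng.normal(3.0, 1.0, (200, 2))])
truth = np.array([0] * 200 + [1] * 200)

# Start from four hand-labeled patches; the lambda simulates the human check.
seeds = {0: 0, 1: 0, 200: 1, 201: 1}
centroids, labeled = iterative_labeling(X, seeds, annotate=lambda i: int(truth[i]))

predicted = (ink_score(centroids, X) > 0.5).astype(int)
accuracy = float((predicted == truth).mean())
```

The point of the design is that the human only reviews patches the model surfaces, so each round grows the labeled set cheaply; in this sketch the simulated annotator is always right, which a real annotator would not be.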
        
           | kelsey9876543 wrote:
           | Thank you for adding the deeper insight! The competition and
           | the methods used are very fascinating indeed.
        
       | tclancy wrote:
       | Somewhat off-topic but if you clicked in here, you might be
       | interested in this book: "The Riddle of the Labyrinth: The Quest
       | to Crack an Ancient Code".
        
       | jdminhbg wrote:
       | This is the 21st-century equivalent of living through the opening
       | of Tut's tomb. Incredible to think there's a very real chance
       | that in the medium-term future you might be able to buy a copy of
       | a newly-translated work on Amazon that hasn't been read for
       | millennia.
        
         | carapace wrote:
         | Why the ad for Amazon?
        
           | jdminhbg wrote:
           | It's just a reference to it becoming a boring, pervasive part
           | of culture. Please feel free to buy those translations at any
           | book company you feel like.
        
             | carapace wrote:
             | Sorry, I'm just cranky this morning.
        
             | alanbernstein wrote:
             | Surely they will be public domain by now??
        
               | lexicality wrote:
               | It is disgraceful that the ancient Greek authors won't
               | see an obol that these so called "translators" and
               | "historians" make from reselling their work.
               | 
               | They should sue! /s
        
               | jdminhbg wrote:
               | The original Greek text is, but I got a C in Greek so
               | I'll have to pay for a copyrighted English translation.
        
       | versteegen wrote:
       | The lettering was found by looking for 'crackle' texture on
       | papyrus segments from the CT scans which obviously were in the
       | shape of Greek letters, and annotating those as training data.
       | Unfortunately such crackle texture isn't visible, at least by
       | eye, on most of the papyrus. Probably it's only that visible
       | where the ink was very thick. You can easily see the difference
       | in texture in this electron microscope image [1] (far higher
       | resolution than the CT scans) but especially on the very edge of
       | the inked area (the narrow strip in the left image; I think the
       | whole right image is inked) where the ink was pushed to. I'm
       | surprised the crackle was discovered only after the Kaggle Ink
       | Detection contest. Looking at the CT-scanned fragments with
       | infrared ground truths, which were used in the Kaggle contest,
       | Casey Handmer wrote [2]:
       | 
       | > The ongoing apparent failure of deep-learning based ink
       | detection based on the fragments indicated to me that direct
       | inspection of the actual data would be more fruitful, as it has
       | been here.
       | 
       | > ...
       | 
       | > I found similar "cracked mud" and "flake" textures
       | corresponding to known character ink, but only for perhaps 10% of
       | the known characters. It's been a long day, I can probably find
       | more on closer inspection, but that does make one wonder about
       | automated ink detection and what that is seeing.
       | 
       | These new images are much better than I hoped for, but still only
       | in one small area, so I'm still pessimistic about more than an
       | odd sentence being readable.
       | 
       | [1] https://scrollprize.org/img/tutorials/sem.png
       | 
       | [2] https://caseyhandmer.wordpress.com/2023/08/05/reading-
       | ancien...
        
       | munificent wrote:
       | I love uses of machine learning like this a thousand times more
       | than generative LLMs spouting probable-sounding nonsense.
        
       | esafak wrote:
       | It is amazing what some college student can pull off with today's
       | technology.
        
       | Rallen89 wrote:
       | >Shortly after that, another contestant, Youssef Nader,
       | independently discovered the same word in the same area, with
       | even clearer results -- winning the second place prize of
       | $10,000.
       | 
       | That's what u get for optimising your code
        
         | hansoolo wrote:
         | I thought the same. He had the better results, but too late.
        
           | QuercusMax wrote:
           | Or maybe the winner optimized his code, resulting in faster
           | time to get results. Either one is equally plausible!
        
         | zeteo wrote:
         | Not really:
         | 
         | >Youssef used a model from the Kaggle competition and was
         | inspired by Luke's results to look in the same area.
        
       | autokad wrote:
       | imagine the person making this scroll 2,000 years ago wondering
       | 'I wonder if some kid 2000 years in the future is going to win a
       | boat load of money by reading this'
        
       | nataliste wrote:
       | I wrote this for a different community (filled with semiliterate
       | sophists), but this is absolutely huge and could upend huge
       | swathes of understanding about the last two thousand years.
       | 
       | You can avoid the longform essay below if you want. The short of
       | it is there are several potentially common works possibly in the
       | library that could directly prove or disprove what is found in
       | the New Testament and the predicates of Rabbinic Judaism as
       | established at the Council of Jamnia.
       | 
       | We could be seeing the beginning of conclusive proof that
       | invalidates the narratives of Christianity, Judaism, and Islam by
       | the end of the year.
       | 
       | The Vesuvius Challenge isn't just an interesting contest in the
       | machine learning realm; it's a groundbreaking endeavor that could
       | redefine our understanding of the humanities if successful. The
       | opportunity to digitally unroll and read the Herculaneum Papyri
       | could offer unprecedented insights into ancient civilizations and
       | the total feedstock of civilization today. This is not merely
       | about filling in some historical gaps; it's about fundamentally
       | altering how we understand antiquity and, by extension, our own
       | intellectual heritage.
       | 
       | The loss of the Library of Alexandria has long been considered a
       | "dark age" event for intellectual progress. Now, consider the
       | Herculaneum library--a collection of papyri from a villa once
       | owned by Julius Caesar's father-in-law, carbonized but preserved
       | by the Vesuvius eruption in 79 AD. Hundreds of these scrolls are
       | unreadable because their carbon-based ink blends in with the
       | carbonized papyrus, and thus are invisible to conventional
       | imaging techniques. Yet, these scrolls are quite possibly on the
       | cusp of revelation.
       | 
       | Recent developments have introduced machine learning and high-
       | resolution X-ray scans as methods for reading these "unreadable"
       | scrolls. What texts do they contain? Treatises on science and
       | philosophy? The lost books of Livy? The epic cycle? Governmental
       | policies like the Twelve Tables? It's a tantalizing question
       | because whatever is locked in those scrolls could be an
       | unfiltered look at the Roman Empire--an empire that fundamentally
       | influenced the trajectory of Western culture, religion,
       | governance, and philosophy.
       | 
       | Ponder a history of Rome that has not been retouched by myriad
       | emperors, by Constantine's Christianity, or the interpretive lens
       | of the Roman Catholic Church. Unmediated accounts of Roman
       | society, unaltered by the layers of religious and political power
       | that came later, could rewrite our textbooks and shift the
       | justification of history. It's not just about enriching our
       | understanding of ancient civilizations; this could be a
       | cornerstone on which to build a fresh philosophical understanding
       | of human society.
       | 
       | If the project succeeds, there will be repercussions in the
       | academic realm. The humanities have long struggled to justify
       | their existence in a world that increasingly prizes STEM and
       | lacks any novel sources for the classical world. Suddenly, there
       | could be a concrete, urgent task at hand: to decode, interpret,
       | and integrate an influx of new knowledge. The Vesuvius Challenge
       | could revitalize the field, offering an unforeseen but compelling
       | reason for its study. In essence, it provides a utilitarian
       | justification for the humanities, one that transcends 'cultural
       | enrichment' and enters the realm of 'historical redefinition.'
       | 
       | The Vesuvius Challenge could be the hinge upon which history
       | swings, yielding intellectual treasure that could be as
       | groundbreaking as the writings that were lost in Alexandria. For
       | millennia, those scrolls have remained unread. Now, it's a
       | software problem. That's not just a challenge; it's an
       | imperative.
       | 
       | The presence of specific works in the Herculaneum Papyri could
       | dramatically impact our understanding of major historical events.
       | 
       | In particular for me, I pray that the biography of Herod the
       | Great by Nicholas of Damascus is discovered intact. While
       | mainstream accounts generally portray the life of Herod within
       | the context of Roman patronage and Judaean politics, uncovering a
       | contemporary account by a close intimate (and used as a primary
       | source by Josephus) would offer fresh, unmediated insights into
       | his rule and its socio-political intricacies. Chronologies of the
       | life of Jesus could be explicitly validated or disproved.
       | 
       | The relevance here is far from academic. Consider the following
       | naturalistic hypothesis: that the inception and rise of
       | Christianity was entirely a dynastic struggle within the
       | Hasmonean-Herodian line. What if the tale of Jesus is, in
       | essence, a dramatized, mystified rendition of a 1st-century
       | dynastic conflict, one that was subsequently co-opted and
       | transformed into a religious narrative by an early form of
       | conspiratorial thinking? Something like a 1st-century version of
       | Q-anon, distorting real events to serve an alternative, concealed
       | agenda in the aftermath of the First Jewish-Roman War.
       | 
       | Unveiling a document like Nicholas of Damascus' biography could
       | be groundbreaking in testing such a hypothesis. If Herod's life
       | and rule were detailed without the religious overlays that later
       | Christian interpretations bring into the picture, one could make
       | more definitive assertions about the socio-political environment
       | of the time. Furthermore, it could provide concrete evidence to
       | either substantiate or refute theories about Christianity's
       | emergence as a byproduct of a Herodian-Hasmonean power struggle.
       | 
       | The fact that such a theory could be _tested_ is significant in
       | its own right. Traditionally, discussions about early
       | Christianity rely heavily on religious texts and subsequent
       | historical accounts, many of which are fraught with dogma and
       | ideological interpretations. A primary source devoid of such
       | influences would be a game-changer, offering a baseline of raw
       | data from which more accurate and reliable hypotheses could be
       | drawn.
       | 
       | And it's not limited solely to Christianity. The implications
       | for Rabbinic Judaism could be equally monumental. The owner
       | of the villa, likely a wealthy Roman, would be unlikely to have
       | had any primary Hebrew texts like the Pentateuch. However, that
       | doesn't rule out the possibility of possessing Greek or Latin
       | works discussing Jewish culture, beliefs, and politics. Given the
       | villa's historical context, it's conceivable that there might be
       | indirect ethnographic accounts from the period surrounding the
       | destruction of Jerusalem in 70 AD but before the Council of
       | Jamnia, traditionally dated around 90 AD, which helped canonize
       | Hebrew scriptures.
       | 
       | Why is this important? The Council of Jamnia is often cited as a
       | crucial moment for the development of Rabbinic Judaism. It
       | allegedly led to the fixing of the Hebrew Bible canon and
       | crystallized what would become Talmudic tradition. If documents
       | were to surface that provide a snapshot of Judaic thought and
       | practice just before this council, it could upend millennia of
       | precedent and identity.
       | 
       | In a broader context, discovering pre-Jamnia ethnographic sources
       | could significantly change our understanding of how Judaism
       | adapted and evolved in the aftermath of the Second Temple's
       | destruction. This could lead to far-reaching questions. How much
       | of the Talmudic tradition was actually a post-hoc rationalization
       | or systematization of beliefs and practices that were far more
       | fluid before the Council of Jamnia? How much anti-Romanism was
       | pared away to prevent suppression? Moreover, how would such a
       | revelation interact with or even challenge the validity of
       | current Rabbinic and Orthodox Jewish practices?
       | 
       | The implications for the Judeo-Christian heritage as a whole are
       | staggering. If both Christianity and Judaism could be traced back
       | explicitly to politically or socially motivated machinations,
       | rather than divinely inspired or time-honored traditions, the
       | entire foundation of Judeo-Christian culture would come into
       | question. In essence, the Vesuvius Challenge has the potential to
       | destabilize two of the world's major religious traditions at
       | their historical roots. It is difficult to overstate the
       | potential impacts.
       | 
       | The Vesuvius Challenge is not just an academic or technological
       | endeavor. Its success could instigate an unparalleled
       | epistemological crisis in religious studies and the humanities.
       | It provides the opportunity to re-examine, with primary sources,
       | the historical foundations of Western religious, cultural, and
       | ultimately political traditions. We're not just potentially
       | rewriting history here; we're reevaluating the very frameworks
       | through which that history has been understood.
        
         | narag wrote:
         | So this is just the very beginning? Will they be able to
         | decipher whole docs? I guess you wouldn't have written all that
         | otherwise!
         | 
         | Anyway, if there's religion involved, I doubt any revelation
         | will shake anything.
        
       | 1vuio0pswjnm7 wrote:
       | "He found a few dozen ink strokes - and some complete letters -
       | that could be labeled and used as training data.
       | 
       | Before long, the model was unveiling traces of crackle invisible
       | to his own eye. Soon, these traces began to form letters and
       | hints of actual words."
       | 
       | This does not sound like a "Large Language Model (LLM)" or other
       | large set of training data, like the sort hyped by so-called
       | "tech" companies; this sounds relatively small. What am I
       | missing? (Besides brain cells.)
        
       ___________________________________________________________________
       (page generated 2023-10-12 21:00 UTC)