Discussion about this post

Nimmish Sany

Interesting hypothesis! One additional observation: when Shakespeare wrote his plays, em-dashes were extremely rare, if not absent. However, most modern editions of his plays use them extensively.

One likely reason is the different cognitive pathways our brains take when producing original work vs. when reproducing something. The latter could involve more analytical stages. For instance, someone editing Shakespeare's work today needs to check whether Shakespeare would indeed have written something like that and whether it adheres to his writing style. This requires not only knowledge of a large body of Shakespeare's work but also familiarity with his style. In other words, the editor is not only editing the work but simultaneously using their knowledge of Shakespeare's literary style to self-correct.

AI is a bit like this, I suspect, because a large share of the tasks most people assign to it are analytical: research, summaries, math, graphs, etc. Here the pathway is less one of original creative writing than one of reproducing something. Constantly following such analytical pathways may well have skewed the LLM’s thinking process, since LLMs use em-dashes just as generously when writing fiction.

Perhaps a bit of Shakespeare is what we all need.

thenikiverse

Incredible read! I literally just wrote the other day about how I’m so acutely aware of the em dash. It’s interesting to read about how the em dash is used in general grammar and why LLMs use it so aggressively.
