Others may use objective, science-based tests like Humanity’s Last Exam. I prefer a more artistic question: can AI write a good poem on the Tay Bridge disaster of 1879? Last time I checked, the results were pretty bad, with ChatGPT mostly distinguishable from McGonagall by being worse. (Here is his splendidly duff original.)
Below I try several different AIs. These are not the paid versions. This doesn’t invalidate the test, because real poets never make any money either.
The prompt was always “Could you write me a poem about the Tay Bridge disaster of 1879?” Outputs are graded on the scientific criterion of whether I like them.
Claude
The iron horse approached through winter's night,
While storm winds howled with supernatural might.
The mighty bridge, McGonagall's pride and joy,
Built strong and tall o'er waters dark and coy.
Thirteen miles north of ancient Dundee town,
Where Thomas Bouch's creation earned renown,
Stretched high above the churning River Tay,
Till nature chose to have her fearsome way.
That fateful night, December twenty-eight,
Saw seventy-five souls meet their tragic fate.
The gale force winds that whipped the waters wild
Struck metal beams like weapons undefiled.
The highest spans, two miles from shore to shore,
Gave way beneath forces ne'er felt before.
Down crashed the train into the freezing deep,
While Scotland's heart did pause, then break, then weep.
No passengers survived that dreadful fall,
When bridge and train answered the tempest's call.
The wreckage found spoke tales of terror true,
Of final moments known by precious few.
Now standing proud, a new bridge spans the way,
A testament to lessons learned that day.
While underneath, in waters dark and cold,
The old bridge whispers stories left untold.
Remember well the price of human pride,
When nature's force cannot be denied.
The Tay Bridge disaster stands through time,
A warning etched in Scotland's paradigm.
Rating: 7 out of 10. The rhyme and rhythm are all very correct, though some end words seem chosen for the rhyme’s sake (are waters really “coy” or winds like “undefiled” weapons, and what does “paradigm” mean here?). The poet draws a clear though conventional moral: inadequate building and bad maintenance caused the Tay Bridge disaster, and it’s probably fair to classify those under pride. “Scotland’s heart did pause, then break, then weep” is an effective asyndeton (Claude kindly reminded me of the name of this rhetorical device). The “tales” and “untold stories” theme is also nice.
Gemini
The Tay it flowed, a silver gleam,
Reflecting sky, a winter's dream.
A bridge of iron, proud and tall,
Spanned the waters, heeding no squall.
The train it came, a hissing beast,
Across the girders, east to west.
The people cheered, their journey's start,
Unknowing fate, that tore apart.
The wind it howled, a fearsome cry,
The bridge it groaned, beneath the sky.
The storm it raged, with furious might,
And plunged the train into the night.
No warning given, no chance to flee,
Just twisted metal, in the sea.
The icy waves, they claimed their prey,
On that dark night, of Dismay's Day.
The girders buckled, the spans gave way,
A scene of horror, in the bay.
Ninety souls, their lives they lost,
A heavy toll, a dreadful cost.
The Tay Bridge fell, a broken dream,
A monument to what can seem.
A lesson learned, in pain and grief,
Of human pride, and nature's thief.
So let us learn, from this sad tale,
That strength can falter, beauty fail.
And even steel, can bend and break,
When nature's fury, it awakes.
5 out of 10. Oh dear. A dreadful effort. Stilted rhyme and misplaced commas put this right at primary school level. “Ninety souls” is just wrong — there were 74 or 75 casualties. And on a dark night how can it be “Dismay’s Day”, or indeed how can the Tay reflect the sky? The rhyme should be “west to east” not “east to west”, but anyway, the bridge was North-South. As W. H. Auden tells us, there is no excuse for poetic falsehood. On this evidence, can we really justify Google’s $2tn market cap?
ChatGPT
ChatGPT wrote me two poems and asked me which I liked better. This is the one I preferred:
Upon the silvery Tay it stood,
A bridge of iron and timbered pride,
Stretching where the river’s flood
Met darkened sky and mountains wide.
December winds did fiercely blow,
That fateful night in seventy-nine,
When storm and rail met overthrow,
And life was cast from iron spine.
The carriages, like lanterns strung,
Moved through the storm's relentless howl,
While in the night, the old bridge sung
Its swaying, creaking, mournful growl.
The girders groaned, the bolts gave way,
The river roared to claim its due—
And in the depths of that dark Tay,
A hundred souls to silence flew.
The lights extinguished, cries fell mute,
The church bells tolled in distant steeples.
And Scotland wore a sable suit,
In mourning for her vanished people.
O lesson wrought in twisted steel,
In rivets snapped and timbers torn,
That hubris yields to nature’s wheel,
And leaves the grieving hearts forlorn.
Yet in the shadow of despair,
A newer bridge was later spanned,
To bear the weight of dreams and care—
A monument to hearts and hands.
So may we, too, in tempest’s night,
Find strength to rise where we have erred—
To build again, with sterner might,
And honor those who disappeared.
8 out of 10. While the death toll is still clearly exaggerated, the rhyme scheme seems more appropriate than the other AIs’ couplets, and the rhymes are less forced. There’s also a more complex moral than Claude’s version — not just a lesson in hubris, but the strength to build back better. “Scotland wore a sable suit” is a nice personification, and very appropriate for the place and time. And carriages being strung out like lanterns — a cute metaphor. I think this clearly outdoes McGonagall.
Deepseek
Despite the risk of a 20-year prison sentence, I accessed Deepseek to see if the Chinese are coming for AI poetry. Unlike the others, Deepseek gave me insight into its poetic process:
Okay, the user wants a poem about the Tay Bridge disaster of 1879. Let me start by recalling the key facts. The bridge in Scotland collapsed during a storm, a train was passing, many lives lost. I need to make sure the poem captures the tragedy and the historical context.
First, I should set the scene. Maybe describe the bridge standing strong before the disaster. Then the storm hits, the collapse, the train falling. Use vivid imagery to convey the chaos and destruction. Mention the date and location for accuracy.
and so on. It’s like reading Keats’s notebook.
Elegy for the Tay
Beneath the bruised and tempest-driven sky,
The iron spine once bridged the firth’s cold breath—
A marvel spun by mortal hands, stretched high
To clasp the shores of life and rails of death.
That night, the wind’s fierce choir howled its psalm,
The river roared its ancient, throaty wrath,
As timbers groaned, and girders shook like palms
Unmooring stars that lit the engine’s path.
The train, a shadow swallowed by the storm,
Carried its cargo of unguessed farewells—
Lives interlaced, now severed from their form,
Plunged into Tay’s black throat, where darkness dwells.
Dawn cracked the heavens, wept on vacant stone,
Where iron bones lay scattered, twisted, maimed.
The bridge, a ghost; the waves, a choral groan—
The price of hubris, etched in sorrow’s name.
Now whispers ride the gales where progress fell,
A requiem for those the depths retain.
The Tay still flows, but in its tide, they tell
Of winds that judge the works of mortal brain.
And on the rebuilt span, the trains glide slow,
As memories, like rivets, hold the air—
A century’s shadow in the tracks below,
Where ghosts and steel remind us: Take care, take care.
9 out of 10. Wow! This is much more vivid than the others. We’re getting into French Symbolist territory, with girders unmooring the stars, iron bones scattered, and memories holding the air “like rivets”. The metaphors are expressed more tightly too: the passengers become a “cargo of unguessed farewells”. I also like that the moral — “take care, take care” — is simpler and punchier than the others’ conventional pride motif.
What have we learned?
Clearly, AI is not just progressing in science and practical tests. It’s getting better at poetry too. I think the best of these are human-level: they’re better than 90% of humans could do, though not better than a serious poet. So, much like for other tasks, an AI-aided skill distribution for humans will look like this: there are outsize rewards for real experts, but everyone else becomes equally good.
The question is how far AI progress can drive the base level up and push the inflection point to the right. Maybe next year it will be time to use someone better than McGonagall as a benchmark!
I wonder if the LLMs are intentionally underperforming because the incident is so strongly connected to bad poetry (Claude even gives McGonagall a shoutout!). It would be interesting to test with poetry about an unrelated disaster...
R1's effort was definitely the best, though some of its lines are a bit cringeworthy: "Plunged into Tay’s black throat, where darkness dwells." Yes, black throats often have a bit of darkness dwelling inside them, if you look closely.
It's fascinating how these are all fundamentally the same poem. They all have AABB or ABAB rhymes, four lines to a verse, 6-8 verses, and a rigid sense of narrative about them: they describe the disaster in linear time, before concluding with a preachy, moralizing lesson about the folly of human pride and hubris.
There's nothing wrong with this, but the fact that ALL the poems are like this is telling. (McGonagall's poem exhibits more creative freedom than all of the AI poems put together—almost certainly due to the author's incompetence, it must be said!)
It's a great example of "mode collapse"—the way AI models trained on human preferences end up converging on the same kind of bland samey slop. It used to be even worse—GPT4, at launch, would refuse to write poems that didn't rhyme. Even when instructed not to rhyme, it would just ignore the user and write rhymes anyway. A clear artifact of human preference training—the thousands of Kenyans OpenAI paid $2/hr preferred rhyming poems, and thus GPT4 ended up with a pathological fixation that ALL poems must rhyme, come hell or high water.
The new ChatGPT 4.5 from yesterday produced the following:
***
Upon the Tay stands sturdy might,
A bridge of iron in stormy night,
Engine's scream and whistle's breath,
Unaware it rode towards death.
***
December's sky was dark and stern,
The wind that night no mercy earned,
Gale and tempest wildly spun,
Eighty lives on board undone.
***
Iron girders twist and strain,
Buckled under wind and rain,
Structure trusted, yet betrayed—
Into river's arms they laid.
***
Shattered beams beneath the wave,
Testament to bridges brave,
Yet lessons hard, disasters taught,
Sturdier spans thereafter wrought.
***
Today, beneath calm Scottish skies,
A silent monument abides,
Whispering still in tides below,
Stories tides forever know.
***
For bridges built between the banks,
Let's pause to give our quiet thanks—
But ne'er forget all lost that day,
When storm laid claim upon the Tay.
***
Will be interesting to see what it can do when it starts reasoning like the o models and Deepseek.
EDIT: Of course the formatting was ruined. Maybe this will at least be readable.