It has been a few weeks since I last posted here, and this recent project felt like a good reason to return. In one of my AI courses, I built a text-generation model using a Long Short-Term Memory (LSTM) neural network trained on the speeches and letters of Abraham Lincoln. The goal was simple: could a machine learn to write in a voice that feels like Lincoln?
At a high level, an LSTM is a type of deep learning model designed to work with sequences. In this case, the sequence is language. Rather than looking at a single word in isolation, the model learns how words relate to one another over time. It keeps track of what came before and uses that context to predict what should come next. This is what sets it apart from simpler models that treat each word independently: it has a kind of “memory” that allows it to follow patterns across sentences.
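To make the idea of “memory” a little more concrete, here is a minimal sketch of a single LSTM step using the standard gate equations. It is not the model from the project: the input and state are single numbers rather than vectors, the weights in `params` are arbitrary illustrative values, and the names `lstm_step` and `params` are made up for this example.

```python
import math

def sigmoid(z):
    """Squash a number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a single-unit LSTM with scalar input and state.

    params maps each gate name to a (w, u, b) tuple:
    input weight, recurrent weight, and bias.
    """
    pre = {}
    for name in ("i", "f", "o", "g"):
        w, u, b = params[name]
        pre[name] = w * x + u * h_prev + b

    i = sigmoid(pre["i"])    # input gate: how much new information to store
    f = sigmoid(pre["f"])    # forget gate: how much old memory to keep
    o = sigmoid(pre["o"])    # output gate: how much memory to expose
    g = math.tanh(pre["g"])  # candidate value to write into memory

    c = f * c_prev + i * g   # updated cell state (the "memory")
    h = o * math.tanh(c)     # updated hidden state (the output)
    return h, c

# Arbitrary illustrative weights, not learned values.
params = {name: (0.5, 0.5, 0.0) for name in ("i", "f", "o", "g")}

h, c = 0.0, 0.0
for x in [1.0, 0.0, 0.0]:    # one non-zero input followed by two zeros
    h, c = lstm_step(x, h, c, params)
```

Even though the last two inputs are zero, the cell state `c` still carries a trace of the first input, which is the sense in which the network remembers what came before.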
To train the model, I used a publicly available collection of Lincoln’s speeches and letters. The dataset contained about 86,000 total words, but only around 9,000 of those were unique. That turned out to be an important lesson. While the dataset felt large at first, it is actually quite small for training a language model. With roughly 9,000 distinct words spread over 86,000 total, the average word appears fewer than ten times, which gives the model few examples of each word in context and constrains how well it can generalize.
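As a rough illustration of how totals like these can be measured, the sketch below counts total and unique words in a piece of text. The tokenization rule (lowercase, letters and apostrophes only) and the function name `vocabulary_stats` are assumptions made for this example, so the numbers it produces on the real corpus could differ somewhat from the figures above.

```python
import re
from collections import Counter

def vocabulary_stats(text):
    """Return (total_words, unique_words, counts) for a simple tokenization."""
    # Assumption: words are runs of letters and apostrophes, case-insensitive.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return len(words), len(counts), counts

sample = ("government of the people, by the people, for the people, "
          "shall not perish from the earth")

total, unique, counts = vocabulary_stats(sample)
print(f"{total} total words, {unique} unique")
print(counts.most_common(2))
```

Dividing total words by unique words gives the average number of times each word appears, which is a quick way to judge how much evidence the model has per word.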
The training process works by feeding the model sequences of words and asking it to predict the next word. Each time it makes a prediction, it compares that prediction to the actual next word and adjusts its internal parameters. These parameters are essentially weights that determine how strongly each piece of the preceding context influences the prediction. Over time, the model improves its ability to assign higher “scores” to words that make sense in a given context.
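The two halves of that process can be sketched in a few lines. The first function slides a fixed-length window over the text to build (context, next word) training pairs. The second part turns raw scores into probabilities with a softmax and measures the error with cross-entropy, which is a common choice for next-word prediction but an assumption here, since the loss function used in the project is not stated. All names and the example scores are hypothetical.

```python
import math

def make_training_pairs(tokens, seq_len):
    """Each window of seq_len words is paired with the word that follows it."""
    pairs = []
    for start in range(len(tokens) - seq_len):
        context = tokens[start:start + seq_len]
        target = tokens[start + seq_len]
        pairs.append((context, target))
    return pairs

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target_index):
    """Loss is small when the true next word received a high probability."""
    return -math.log(probs[target_index])

tokens = "with malice toward none with charity for all".split()
pairs = make_training_pairs(tokens, seq_len=3)

# Made-up scores for three candidate next words.
scores = [2.0, 0.5, -1.0]
probs = softmax(scores)
loss_if_right = cross_entropy(probs, 0)  # true word had the highest score
loss_if_wrong = cross_entropy(probs, 2)  # true word had the lowest score
```

Training nudges the weights so that the loss shrinks, which is the same thing as pushing the score of the correct next word upward relative to the alternatives.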

Several choices affect how well this process works. The number of training cycles (epochs), the batch size, and the length of the input sequences all play a role. In my case, I saw improvement in the model’s predictions through about 13 epochs before the gains started to level off. This was a clear example of how training can improve performance up to a point, after which additional training may not add much value without more data.
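One way to spot that kind of leveling off automatically is to watch the loss after each epoch and stop once it no longer improves by a meaningful amount. The sketch below assumes a loss value is recorded per epoch. The function name, the `min_delta` and `patience` thresholds, and the loss values are all invented for illustration and are not measurements from the project.

```python
def first_plateau_epoch(losses, min_delta=0.01, patience=2):
    """Return the 1-based epoch of the best loss once `patience` epochs
    have passed without an improvement larger than `min_delta`.
    Returns None if the loss is still improving at the end."""
    best = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(losses, start=1):
        if best - loss > min_delta:
            best = loss
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            return best_epoch
    return None

# Hypothetical per-epoch losses: fast gains early, then almost flat.
example_losses = [5.0, 4.0, 3.5, 3.2, 3.199, 3.198]
plateau = first_plateau_epoch(example_losses)
print(plateau)
```

Many training frameworks offer a built-in version of this idea, usually called early stopping, so a hand-written check like this is mainly useful for understanding what the setting does.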
Despite the limitations, the most rewarding part of the project was generating new text. Given a short starting phrase, the model could produce the next 30 words in a sequence. The results were not perfect, but they often captured the rhythm and tone of Lincoln’s writing. That was the moment where the theory became real. A model trained only on historical text was able to produce something that felt stylistically consistent.
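The generation loop itself is simple: predict one word, append it to the text, and feed the longer text back in. The sketch below shows that loop with a stand-in predictor built from simple word-pair counts, because the trained LSTM is not available here. The names `generate`, `build_bigram_table`, and `predict_next`, the tiny corpus, and the choice to always pick the most frequent next word are all assumptions for illustration.

```python
from collections import Counter, defaultdict

def build_bigram_table(tokens):
    """Count which word follows which (a stand-in for a trained model)."""
    table = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        table[prev][nxt] += 1
    return table

def generate(seed_words, predict_next, num_words=30):
    """Repeatedly predict the next word and append it to the sequence."""
    words = list(seed_words)
    for _ in range(num_words):
        nxt = predict_next(words)
        if nxt is None:  # the predictor has nothing to suggest
            break
        words.append(nxt)
    return words

corpus = "of the people by the people for the people".split()
table = build_bigram_table(corpus)

def predict_next(words):
    """Pick the most frequent follower of the last word, if there is one."""
    followers = table.get(words[-1])
    if not followers:
        return None
    return followers.most_common(1)[0][0]

output = generate(["of"], predict_next, num_words=30)
print(" ".join(output))
```

With a real LSTM, `predict_next` would instead run the model on the most recent words and choose from its predicted probabilities. Always taking the top word tends to produce repetitive loops, as this toy example does, so sampling from the probabilities is a common alternative.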

This project helped me better understand what “deep learning” means in practice. It is not magic. It is a process of learning patterns from data, adjusting parameters, and improving predictions over time. It also reinforced the importance of data quality and scale. Even a well-designed model can only be as strong as the information it learns from.
There is still a long way to go before a model like this can reliably generate meaningful, historically accurate text. But this was a valuable step. It showed how machines can begin to understand language not just as words, but as patterns, structure, and context.
And in a small way, it showed how a voice from the past can be approximated through data, one word at a time.
