The History of Natural Language Processing


Hello! I am a student in the History of Computing class at San Jose State University.

My official topic is: "The history and development of text based natural language processing." 

If anyone has information they would like to share, please email me or start a discussion about it.

Thank you very much and all input is helpful!

- Ryan Lichtig, SJSU Computer Science Major

A Brief History of Natural Language Processing

Talos, in Greek mythology, is the guardian of Europa and her land of Crete. Forged by the divine smith Hephaistos, Talos is an automaton, an autonomous machine of bronze that patrolled Europa’s land, protecting it against enemies and invaders. This divine guardian gave rise to the idea of synthetic life and intelligence, but the idea was only that: a concept. The capability of creating such magnificent devices was left to the gods themselves, something no human could ever achieve. However, thousands of years later, in 1818, Mary Shelley immortalized Frankenstein’s monster and changed the idea from something divine to something human by creating artificial life and intelligence through the medium of science. This is where the modern idea of artificial life and intelligence begins: something man-made could be created, and it would have the ability to live, to learn, and, most importantly, to adapt. While Mary Shelley’s novel was purely fiction, it allowed the thought of human-made synthetic life to take over, rewriting the idea long instilled by the Greek myths.

In 1943-44, during World War II, another major advancement took place: the creation of Colossus. This computer, although kept secret for years by Great Britain, helped decrypt German messages encrypted with the Lorenz cipher. Colossus could be considered one of the first modern computers, a technology that allowed a superhuman number of calculations to occur in a relatively small amount of time. With biological science proving ineffective for creating synthetic life, humanity moved to technology and computers in its quest for artificial life and intelligence. Shortly after World War II ended came the Cold War with Soviet Russia. The fear of nuclear war and Soviet spies sparked the development of natural language processing, beginning with the translation of the Russian language, both spoken and written, to English and culminating with modern marvels such as Watson and the mobile device application Siri.

With tensions running high and missiles at the ready, natural language processing was born, focusing on machine translation. A formal definition of machine translation is “going by algorithm from machine-readable source text to useful target text, without recourse to human translation or editing” (Automatic 19). This tool was considered vital to the United States government because, when fully developed, it would enable the translation of Russian text to English with a low chance of error and at speeds faster than any human translator. Because the government needed it, funding was readily available.

“Research on NLP began in earnest in the 1950’s. Automatic translation from Russian to English, in a very rudimentary form and limited experiment, was exhibited in the IBM-Georgetown Demonstration of 1954” (Jones 2). The experiment converted more than sixty Russian sentences to English using the IBM 701 mainframe computer. The method of machine translation was computational linguistics, a combination of statistics and rules of language. The researchers acknowledged that problems with machine translation remained but claimed that they would be solved within three to five years. Things were looking good.

The years from 1954 to 1966 proved to be beneficial to the field of AI as a whole. In 1956 the Dartmouth Conference took place at Dartmouth College, New Hampshire. For this conference John McCarthy coined the term ‘Artificial Intelligence’, and the attendees held a month-long ‘brainstorm session’ on many topics related to AI. Only a year later, in 1957, Noam Chomsky published the book Syntactic Structures, in which he revolutionized previous linguistic concepts and concluded that in order for machines to understand language, the way sentence structure was described had to change. He formalized a type of grammar called Phrase-Structure Grammar that methodically converted natural language sentences to a form usable by computers. Then, in 1958, John McCarthy released the programming language LISP (short for LISt Processor), a language whose dialects are still in use today. ELIZA was created in 1964 and, by only rearranging sentences and following some relatively simple grammar rules, impersonated a psychiatrist. It was very successful, and some reports state that people told the ‘Doctor’ their personal secrets. However, overall progress was slow, since computers were still in their beginning phases. It “was the era of punched cards and batch processing” (Jones 2).
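To give a feel for how ELIZA-style sentence rearranging works, here is a minimal sketch in Python. It is not ELIZA's original script: the three rules, the pronoun table, and the function names `reflect` and `respond` are all assumptions made up for illustration. It only shows the general idea of matching a keyword pattern and echoing the rest of the sentence back with the pronouns swapped.

```python
import re

# Hypothetical rules (pattern, response template); not taken from the real ELIZA script.
RULES = [
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bmy (.+)", re.IGNORECASE), "Tell me more about your {0}."),
]

# Pronoun swaps so the echoed fragment reads from the program's point of view.
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are",
               "you": "I", "your": "my"}


def reflect(fragment):
    """Swap first- and second-person words in the captured fragment."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(sentence):
    """Return the response for the first rule that matches, else a stock reply."""
    cleaned = sentence.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(cleaned)
        if match:
            return template.format(reflect(match.group(1)))
    return "Please go on."


if __name__ == "__main__":
    print(respond("I am worried about my exams."))
    print(respond("The weather is nice"))
```

Nothing here understands the sentence. The program only rearranges what it was given, which is why a handful of rules could be enough to make the 'Doctor' seem attentive.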

The United States’ National Research Council (NRC) founded the Automatic Language Processing Advisory Committee (ALPAC) in 1964. ALPAC was the committee assigned to evaluate the progress of NLP research. In 1966 ALPAC and the NRC halted research on machine translation because progress, after twelve years and twenty million dollars, had slowed and machine translation had become more expensive than manual human translation. This first major setback in AI came down to funding: researchers could not get the money required to do their work, and the first attempt at machine translation had failed.

The 1966 ALPAC review caused a dark age for natural language processing. With funding halted and jobs disappearing, people lost hope in machine translation. It took nearly fourteen years for NLP to come back into the spotlight, and this time researchers had abandoned previous concepts of machine translation and started fresh. The combination of statistics and linguistics that had led the research and development of NLP in the previous years was now replaced by pure statistics. One pioneer, Fred Jelinek, had a major impact on this new and improved field. He had imagined using probability and statistics to process speech and language. He once said, “Every time I fire a linguist, the performance of our speech recognition system goes up” (IBM100 1).

This era, from the end of machine translation in 1966 to the early 1980s, “was one of growing confidence and consolidation, and also an expanding community” (Jones 7). The increase in computer power is what allowed this transformation to take place. New and more powerful computers were coming out faster, allowing more research to be done and more programs to be written for NLP. Take, for example, SHRDLU: a project finished in 1970 that rearranged blocks, cones, and pyramids according to user input. SHRDLU would understand sentences like ‘put the blue cube on top of the red cube’ and carry out that action in its simulated world of blocks. It introduced the idea of real-world applications as opposed to the previous ideas of translation or software control. Then in 1982 the chatbot project Jabberwacky began. The purpose of the project was to create an AI program that could simulate natural human chat in an interesting, entertaining, and humorous manner, in hopes of passing the long-sought-after Turing Test. This project created another use for NLP and further expanded the community of researchers as well as the funding opportunities.

Beginning in the early 1990s, NLP started growing faster than ever. “The vast quantities of text flooding the World Wide Web have in particular stimulated work on tasks for managing this flood, notably by information extraction and automatic summarizing” (Jones 8). The creation and public use of the internet, coupled with Canada’s enormous quantities of texts in both French and English, aided in the revival of machine learning and therefore machine translation. With all of this new information and computer-readable text, there was a major advancement in the use of spoken language and speech recognition (Jones 8). With major advances in the field of NLP, in both speech and text, the US government began taking interest once again. It began creating research programs for systems that could be easily customized and did not rely so heavily on database knowledge (Jones 8).

Presently, in the 21st century, NLP research and development is booming. Computing power and storage are so cheap and easily available that we can now buy hard drives that hold terabytes as opposed to mere kilobytes. Because so much information and so many programs can be stored so easily on modern computers, NLP transformed once again: basic grammar rules were added to the wealth of statistics found in the previous decades. These modern developments have spawned many new and complicated research topics, ranging from machine text reading, a project in which computers can read, summarize, and understand text, to Siri’s speech recognition. Computers around the globe are becoming more and more advanced. Take, for example, Watson. The artificial intelligence system, which IBM began developing in the mid-2000s, was put to the test in 2011 on the show ‘Jeopardy!’ against Brad Rutter, the biggest all-time money winner, and Ken Jennings, the holder of the longest winning streak on the show with 74 straight wins. Watson received each clue as text, interpreted it using the most modern methods of NLP, searched through its massive database, which was not connected to the internet, and found the most likely answer. Watson won.

Modern NLP consists of speech recognition, machine learning, machine text reading, and machine translation. These parts, when combined, would allow artificial intelligence to gain real knowledge of the world, not just play chess or move around an obstacle course. In the near future computers may be able to read all of the information online, learn from it, solve problems, and possibly cure diseases. The limit for NLP and AI is humanity; research will not stop until both are at a human level of awareness and understanding. With this level of continuous development, situations predicted by Isaac Asimov in the book I, Robot might become our future.


  • Automatic Language Processing Advisory Committee, National Academy of Sciences, National Research Council, and Division of Behavioral Sciences. Language and Machines - Computers in Translation and Linguistics. Washington, D.C.: National Academy of Sciences, National Research Council, 1966. The National Academies Press. Web. 9 Dec. 2011.
  • "IBM100 - Pioneering Speech Recognition." IBM - United States. Web. 10 Dec. 2011.

Please let me know if you have any suggestions!


Not quite getting the connection you're trying to establish between the quest for artificial life and natural language processing. Wouldn't the development of AI preclude the need for natural language processing, as a truly "intelligent" AI would implicitly understand language in a human way? Natural language processing would seem to only be relevant in a world where we are trying to make non-sentient machines process language in intelligent and natural ways.

Nor do I think there is a strong case that natural language processing, a la Watson, equates to Artificial intelligence/life in the broad sense you seem to imply?

--Srterpe 14:01, 11 December 2011 (EST)


Thank you for your input, and this is merely the Introduction. The main idea that I was trying to show was simply the development of artificial intelligence throughout time and the change of the concept from something divine to something possible for humans. I understand your point, but there seems to be a misunderstanding: AI is a very broad subject and NLP is just a tiny branch of it. There are AI machines that simply observe the world and find a way through an obstacle course, like Shakey, while there are others that learn chess or drive across deserts autonomously. In this introduction I am introducing a unification of artificial life and intelligence to show the idea of Machine Learning that will be brought up in the next paragraphs.

I believe Watson was actually a very important step in NLP because it took the questions asked, deciphered them, then found a response through its databases (not connected to the internet). All of these processes are found in NLP in one form or another and so it is an adequate model of a modern marvel. However if there is another one that you believe is better please do not hesitate to inform me!

I hope that clarifies things, and thank you very much for your comment!

- Ryan Lichtig

No, I agree entirely, Watson is an amazing example of natural language processing. Just not sure that we're any closer to truly sentient machines or synthetic life as a result.

Anyway I don't want to keep you from finishing your paper, I was just cruising around seeing what others in class were doing.

--Srterpe 21:14, 11 December 2011 (EST)

Oh, I see and you are completely right. We are not very close to sentient beings and truly artificial life. But, as will appear most likely at the end of my essay, we seem to be getting there and it might be sooner than we think.

Good luck on your project too.

- Ryan Lichtig