Text Mining
Most of the world's documents, and thus much of its data, are written
as text. There has been a lot of work in data mining on numbers and
database records, but much less on mining text. Why?
The answer is that text is hard for computers to understand. This seems
odd, since almost every human understands language. Perhaps this is
one of the chief differences between computers, in their current state,
and people. (Turing Test) (Semantics and Context)
Linguistics
Fortunately, there is a long history of linguistics. This work
has helped us develop systems for natural language processing
(understanding and generation). More recently (the last 50 years), there
has been a lot of work on getting computers to process language. This work
often uses linguistics, but also takes advantage of the strengths of the
computer: fast processing and a lot of memory.
This is an active area in both industry and research. To work in it, you
need to have the basic skills. You also need to know the limits of the
technology.