Text Processing
- Tags
- text-processing
The creation, storage and access of (massive) text in digital form by a computer.
Broca's area is the region of the brain associated with (spoken) language.
[see page 6, Applications]
- Information Retrieval Deciding the relevance of a set of documents to a search query, think Google.
- Information Extraction Recognise (specific) information from text, e.g. Foo IS A Dog.
- Text Categorisation Put text into discrete categories, e.g. email into spam.
- Summarisation Extract essential information from (one or more) text articles.
- Natural Language Generation Generate natural language text from an abstract, structured representation, e.g. generate a manual in multiple languages from a single abstract representation of the steps to be carried out.
- Machine Translation Translate text from one language to another, e.g. English to French and/or vice versa. (VERY difficult.)
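To make the Information Retrieval item concrete, here is a minimal sketch of ranking documents against a query. It is only an illustration: the `tokenize` and `rank` functions and the TF-IDF style weighting are assumptions chosen for the example, not a method taken from these notes, and real search engines are far more involved.

```python
import math
from collections import Counter

PUNCTUATION = ".,!?;:()\"'"


def tokenize(text):
    """Lowercase the text and split on whitespace, trimming punctuation."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]


def rank(query, documents):
    """Return (document index, score) pairs, best match first.

    Score = sum over query terms of (term count in document) * idf,
    where idf is a smoothed inverse document frequency (an assumed choice).
    """
    doc_counts = [Counter(tokenize(d)) for d in documents]
    n = len(documents)
    scores = []
    for i, counts in enumerate(doc_counts):
        score = 0.0
        for term in set(tokenize(query)):
            df = sum(1 for c in doc_counts if term in c)
            if df == 0:
                continue  # term appears in no document, contributes nothing
            idf = math.log((1 + n) / (1 + df)) + 1
            score += counts[term] * idf
        scores.append((i, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


docs = [
    "The dog chased the cat.",
    "Stock markets fell sharply today.",
    "My dog is a good dog.",
]
ranking = rank("dog", docs)
print([index for index, _ in ranking])
```

The document that mentions the query term most often comes first, and a document with no query terms scores zero.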
Considerations
Text isn't simply unstructured information [see page 4, unstructured]. It has structure: headings, tables, etc. These are clues that help determine the importance of terms.
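One way to picture structure acting as a clue is to weight a term by where it appears. The sketch below is hypothetical throughout: the `(kind, text)` block representation, the `WEIGHTS` values and the `term_importance` function are all assumptions made for illustration.

```python
from collections import Counter

# Assumed weights: a term in a heading counts more than one in body text.
WEIGHTS = {"heading": 3.0, "table": 2.0, "body": 1.0}


def term_importance(blocks):
    """Sum a structure-dependent weight for every occurrence of each term.

    `blocks` is a list of (kind, text) pairs, where kind is a key of WEIGHTS.
    Unknown kinds fall back to the body weight of 1.0.
    """
    scores = Counter()
    for kind, text in blocks:
        weight = WEIGHTS.get(kind, 1.0)
        for word in text.lower().split():
            scores[word] += weight
    return scores


document = [
    ("heading", "Machine Translation"),
    ("body", "translation between languages is hard"),
]
scores = term_importance(document)
print(scores.most_common(2))
```

Here "translation" scores higher than "hard" because it occurs in the heading as well as in the body.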