Developing an automatic document classification system – A review of current literature and future directions

Authors
  1. Brown, J.D.
Corporate Authors
Defence R&D Canada - Ottawa, Ottawa ONT (CAN)
Abstract
Assigning a security classification to a document is typically a labour-intensive manual process performed by a trained professional who must read and understand the document and subsequently apply the rules of an organization's security policy. Automating this process would increase organizational efficiency and would nicely complement newly proposed data-centric security systems where all data must be accurately labelled with the appropriate classification. This paper introduces some of the challenges faced in developing an automated security classification system and discusses current text categorization technologies (dimensionality reduction and machine learning techniques) which would be the key enablers of such a system. In addition to the technology review, several avenues of research are proposed to evaluate a number of potential solutions to the security classification problem.

A French-language abstract is also available.

Keywords
Natural language processing; Data-centric security systems; Algorithms; Classification; Text categorization; Trusted labelling
Report Number
DRDC-OTTAWA-TM-2009-269 — Technical Memorandum
Date of publication
01 Jan 2010
Number of Pages
42
DSTKIM No
CA033621
CANDIS No
533015
Format(s):
Electronic Document(PDF)
