USING TEMPORAL DIFFERENCE LEARNING FOR CLOSED-LOOP RESOURCE ALLOCATION

Authors
  1. Berger, J.
Corporate Authors
Defence Research Establishment Valcartier, Valcartier QUE (CAN)
Abstract
Closed-loop resource allocation in naval tactical defence is very complex and often relies on specific heuristics whose expected solution quality varies randomly. As a result, problems in which combinatorial issues over time are critical to resource allocation require proper models and suitable algorithms. A neural network algorithm using the temporal difference method as its learning technique is proposed to solve a closed-loop dynamic weapon-target allocation problem. Intended to select efficient strategies during the planning phase, the learning process is driven by an estimate of the predicted outcome of a battle scenario at a specific time, given the sequence of action decisions taken so far. Drawing on past experience, the neural network progressively modifies its internal representation of the solution space to improve decision quality in timely resource allocation planning. Partial results from computer simulations conducted in the context of naval anti-air warfare, for a single problem representation, show that the proposed approach fails to match the performance of a greedy heuristic but outperforms a random resource allocation policy.
Keywords
CLOSED LOOP RESOURCE ALLOCATION
Report Number
DREV-TM-9504 — Technical Memorandum
Date of publication
01 Nov 1995
Number of Pages
22
DSTKIM No
96-01014
CANDIS No
154741
Format(s):
Document Image stored on Optical Disk; Hardcopy
