unreal.LearningAgentsPolicy
¶
- class unreal.LearningAgentsPolicy(outer: Object | None = None, name: Name | str = 'None')¶
Bases:
LearningAgentsManagerListener
A policy that maps from observations to actions.
C++ Source:
Plugin: LearningAgents
Module: LearningAgents
File: LearningAgentsPolicy.h
Editor Properties: (see get_editor_property/set_editor_property)
decoder_network
(LearningAgentsNeuralNetwork): [Read-Only] The underlying decoder neural network.
encoder_network
(LearningAgentsNeuralNetwork): [Read-Only] The underlying encoder neural network.
interactor
(LearningAgentsInteractor): [Read-Only] The agent interactor this policy is associated with.
is_setup
(bool): [Read-Only] True if this object has been setup. Otherwise, false.
manager
(LearningAgentsManager): [Read-Only] The manager this object is associated with.
policy_network
(LearningAgentsNeuralNetwork): [Read-Only] The underlying policy neural network.
visual_logger_objects
(Map[Name, LearningAgentsVisualLoggerObject]): [Read-Only] The visual logger objects associated with this listener.
- decode_and_sample_actions(action_noise_scale=1.000000) None ¶
Decodes and samples action vectors using the Decoder network. This should be called after EvaluatePolicy and before PerformActions.
- Parameters:
action_noise_scale (float) – Scale of the action noise to use during sampling. Set this to zero to always sample the mean (expected) action.
- encode_observations() None ¶
Encodes the buffered observation vectors using the Encoder network. This should be called after GatherObservations and before EvaluatePolicy.
- evaluate_policy() None ¶
Calling this function will run the underlying neural network on the previously encoded observations to populate the encoded actions. This should be called after EncodeObservations and before DecodeAndSampleActions.
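Taken together, `encode_observations`, `evaluate_policy`, and `decode_and_sample_actions` form the middle of the per-tick inference sequence. A minimal sketch of driving the stages manually (the `interactor` and `policy` variables, and the interactor's `gather_observations`/`perform_actions` calls, are assumptions inferred from the method names referenced above; this only runs inside the Unreal Editor):

```python
# Sketch of the manual per-tick inference sequence. Assumes `interactor`
# (LearningAgentsInteractor) and `policy` (LearningAgentsPolicy) were
# already set up elsewhere, e.g. via make_policy.

def tick_policy(interactor, policy):
    interactor.gather_observations()       # collect observation vectors
    policy.encode_observations()           # Encoder: observations -> encodings
    policy.evaluate_policy()               # Policy network: encodings -> encoded actions
    policy.decode_and_sample_actions(0.0)  # Decoder: sample the mean action (no noise)
    interactor.perform_actions()           # apply the sampled actions to agents
```

Splitting the stages like this is only needed when you want to intervene between them; otherwise `run_inference` performs the whole sequence.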
- get_decoder_network_asset() LearningAgentsNeuralNetwork ¶
Gets the current Decoder Network Asset being used.
- Return type:
LearningAgentsNeuralNetwork
- get_encoder_network_asset() LearningAgentsNeuralNetwork ¶
Gets the current Encoder Network Asset being used.
- Return type:
LearningAgentsNeuralNetwork
- get_memory_state(agent_id=-1) Array[float] ¶
Gets the current memory state for a given agent as represented by an abstract vector learned by the policy.
- Return type:
Array[float]
- get_memory_state_size() int32 ¶
Gets the size of the memory state.
- Return type:
int32
- get_policy_network_asset() LearningAgentsNeuralNetwork ¶
Gets the current Policy Network Asset being used.
- Return type:
LearningAgentsNeuralNetwork
- classmethod make_policy(manager, interactor, class_=None, name='Policy', encoder_neural_network_asset=None, policy_neural_network_asset=None, decoder_neural_network_asset=None, reinitialize_encoder_network=True, reinitialize_policy_network=True, reinitialize_decoder_network=True, policy_settings=[], seed=1234) LearningAgentsPolicy ¶
Constructs this object to be used with the given agent interactor and policy settings.
- Parameters:
manager (LearningAgentsManager) – The input Manager
interactor (LearningAgentsInteractor) – The input Interactor component
name (Name) – The policy name
encoder_neural_network_asset (LearningAgentsNeuralNetwork) – Optional Encoder Network Asset to use. If this is not provided, the asset is empty, or bReinitializeEncoderNetwork is set, then a new neural network object will be created.
policy_neural_network_asset (LearningAgentsNeuralNetwork) – Optional Policy Network Asset to use. If this is not provided, the asset is empty, or bReinitializePolicyNetwork is set, then a new neural network object will be created according to the given PolicySettings.
decoder_neural_network_asset (LearningAgentsNeuralNetwork) – Optional Decoder Network Asset to use. If this is not provided, the asset is empty, or bReinitializeDecoderNetwork is set, then a new neural network object will be created.
reinitialize_encoder_network (bool) – Whether to reinitialize the encoder network
reinitialize_policy_network (bool) – Whether to reinitialize the policy network
reinitialize_decoder_network (bool) – Whether to reinitialize the decoder network
policy_settings (LearningAgentsPolicySettings) – The policy settings to use on creation of a new policy network
seed (int32) – Random seed to use for initializing network weights and policy sampling
- Return type:
LearningAgentsPolicy
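As an illustration, a policy might be constructed like this (a sketch; the `manager` and `interactor` variables are assumed to have been created beforehand, and the call only works inside the Unreal Editor's Python environment):

```python
import unreal

# Sketch: construct a policy with freshly initialized networks. Assumes
# `manager` (LearningAgentsManager) and `interactor`
# (LearningAgentsInteractor) already exist.
policy = unreal.LearningAgentsPolicy.make_policy(
    manager=manager,
    interactor=interactor,
    name="Policy",
    encoder_neural_network_asset=None,   # no asset given: new networks
    policy_neural_network_asset=None,    # are created from the policy
    decoder_neural_network_asset=None,   # settings
    reinitialize_encoder_network=True,
    reinitialize_policy_network=True,
    reinitialize_decoder_network=True,
    seed=1234,
)
```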
- run_inference(action_noise_scale=1.000000) None ¶
Calls GatherObservations, EncodeObservations, EvaluatePolicy, DecodeAndSampleActions, and PerformActions in sequence.
- Parameters:
action_noise_scale (float) – Scale of the action noise to use during sampling. Set this to zero to always sample the mean (expected) action.
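In other words, a per-tick update that does not need to intervene between the individual stages can be a single call (a sketch; the `policy` variable is assumed to exist):

```python
# Runs the full observation -> action sequence in one call. A noise
# scale of 0.0 always takes the mean (expected) action; 1.0 samples
# with the policy's full learned noise.
policy.run_inference(action_noise_scale=0.0)
```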
- set_memory_state(agent_id=-1, memory_state=[]) None ¶
Sets the current memory state for a given agent as represented by an abstract vector learned by the policy.
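The memory-state accessors can be used to snapshot and restore an agent's recurrent state, for example around a respawn (a sketch; the `policy` and `agent_id` variables are assumed to exist):

```python
# Snapshot an agent's recurrent memory state...
size = policy.get_memory_state_size()      # length of the abstract state vector
state = policy.get_memory_state(agent_id)  # Array[float] of that length
# ...and restore it later, e.g. after respawning the same logical agent.
policy.set_memory_state(agent_id, state)
```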
- setup_policy(manager, interactor, encoder_neural_network_asset=None, policy_neural_network_asset=None, decoder_neural_network_asset=None, reinitialize_encoder_network=True, reinitialize_policy_network=True, reinitialize_decoder_network=True, policy_settings=[], seed=1234) None ¶
Initializes this object to be used with the given agent interactor and policy settings.
- Parameters:
manager (LearningAgentsManager) – The input Manager
interactor (LearningAgentsInteractor) – The input Interactor component
encoder_neural_network_asset (LearningAgentsNeuralNetwork) – Optional Encoder Network Asset to use. If this is not provided, the asset is empty, or bReinitializeEncoderNetwork is set, then a new neural network object will be created.
policy_neural_network_asset (LearningAgentsNeuralNetwork) – Optional Policy Network Asset to use. If this is not provided, the asset is empty, or bReinitializePolicyNetwork is set, then a new neural network object will be created according to the given PolicySettings.
decoder_neural_network_asset (LearningAgentsNeuralNetwork) – Optional Decoder Network Asset to use. If this is not provided, the asset is empty, or bReinitializeDecoderNetwork is set, then a new neural network object will be created.
reinitialize_encoder_network (bool) – Whether to reinitialize the encoder network
reinitialize_policy_network (bool) – Whether to reinitialize the policy network
reinitialize_decoder_network (bool) – Whether to reinitialize the decoder network
policy_settings (LearningAgentsPolicySettings) – The policy settings to use on creation of a new policy network
seed (int32) – Random seed to use for initializing network weights and policy sampling