3. Week_2-4_David Silver _ Week 2 Summary
2021. 6. 21. 05:05
$\cdot$ David Silver $\quad : \;$ Deep learning + RL = AI ?
$\cdot$ Week 2 Summary
This week,
we discussed methods for representing large, possibly continuous state spaces !
$\rightarrow \quad$ Ways to construct features.
Week 2. Feature representation & update through Neural Network
A representation is an agent's internal encoding of the state !
The agent constructs features to summarize the current input.
Whenever we are talking about features and representation learning,
we are in the land of function approximation.
This brings us to the left side of our course-map.
Coarse Coding
First, we introduced Coarse Coding.
Coarse Coding is related to State-Aggregation.
It groups together neighboring states, where each grouping can have an arbitrary shape.
One particular example of a 2-dimensional Coarse Coding is represented by overlapping circles.
Each circle is a feature that is $1$ when the state is inside the circle, and $0$ when the state is outside the circle.
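The circle features above can be sketched in a few lines. This is only a minimal illustration: the function name `coarse_code`, the circle centers, and the shared radius are assumptions made for the example, not something given in the lecture.

```python
import numpy as np

def coarse_code(state, centers, radius):
    """Return a binary feature vector: 1 for every circle that contains the state."""
    state = np.asarray(state, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Distance from the state to the center of every circle.
    distances = np.linalg.norm(centers - state, axis=1)
    return (distances <= radius).astype(int)

# Hypothetical layout: two overlapping circles and one far-away circle.
centers = [(0.0, 0.0), (0.6, 0.0), (2.0, 2.0)]
features = coarse_code((0.3, 0.0), centers, radius=0.5)
print(features)  # [1 1 0]
```

The state lies in the overlap of the first two circles, so two features are active at once. This overlap is what lets nearby states share part of their representation.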
Tile Coding
Next, we discussed a particular type of Coarse Coding, called Tile Coding.
Tile Coding generates features using a set of overlapping grids.
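A rough sketch of overlapping grids, under some loud assumptions: the state is 2-dimensional and lies in $[0, 1) \times [0, 1)$, every grid is uniform, and each grid is shifted diagonally by an equal fraction of one tile width. The name `tile_code` and its parameters are hypothetical, chosen only for this example.

```python
import numpy as np

def tile_code(state, num_tilings=4, tiles_per_dim=4):
    """Return the index of the one active tile in each offset grid.

    Assumes a 2-D state in [0, 1) x [0, 1) (an assumption of this sketch).
    """
    state = np.asarray(state, dtype=float)
    tile_width = 1.0 / tiles_per_dim
    # One extra tile per dimension covers the area exposed by the shift.
    tiles_per_grid = (tiles_per_dim + 1) ** 2
    active = []
    for t in range(num_tilings):
        offset = t * tile_width / num_tilings  # diagonal shift of grid t
        row, col = np.floor((state + offset) / tile_width).astype(int)
        active.append(int(t * tiles_per_grid + row * (tiles_per_dim + 1) + col))
    return active

a = tile_code((0.10, 0.10))
b = tile_code((0.20, 0.20))
print(a, b)
print("shared tiles:", len(set(a) & set(b)))
```

With these settings the two nearby states share two of their four active tiles and differ in the other two, so the features both generalize between them and tell them apart.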
Each grid is called a tiling.
Within a single tiling there is no overlap or gap between the squares,
so only one feature per tiling can be active at a time !
By stacking multiple offset tilings,
we can discriminate between different states.
The shape, size, and number of tilings
help us balance generalization, discrimination, and computational efficiency.
Neural Networks
We then discussed
a way to learn the representation online with Neural Networks.
With Coarse Coding techniques, the representation is fixed before learning !
A feed-forward neural network uses a series of layers to produce a representation.
In each layer, multiple neurons receive the same input and produce distinct outputs.
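A layered computation of this kind can be sketched as follows. The sketch assumes each neuron takes a weighted sum of its inputs plus a bias and applies a ReLU activation; the bias term, the choice of ReLU, the layer sizes, and the random weights are all assumptions made for illustration.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def dense_layer(inputs, weights, biases, activation=relu):
    """Each column of `weights` is one neuron; all neurons see the same inputs."""
    return activation(inputs @ weights + biases)

def forward(inputs, layers):
    """Feed the outputs of each layer into the next one."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations

rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(2, 4)), np.zeros(4)),  # 2 inputs  -> 4 neurons
    (rng.normal(size=(4, 3)), np.zeros(3)),  # 4 neurons -> 3 neurons
]
representation = forward(np.array([0.5, -1.0]), layers)
print(representation.shape)  # (3,)
```

The final activations play the role of the learned features: unlike the fixed circles or tiles, they change whenever the weights change.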
These outputs are then fed to the next layer and the process repeats.
Each neuron computes its output by taking a weighted sum of the input,
and passing the sum through an activation function.
Training Neural Networks
To train a Neural Network,
we use an iterative process called Gradient Descent.
We pass the inputs into the network to produce predictions,
then we compare those predictions to the target outputs and compute our loss function.
Finally, we compute the derivative of the loss function,
and apply our learning rule to the weights.
Next week,
we'll talk about learning to maximize reward with function approximation !
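As a recap of the training steps summarized above (predict, compare, differentiate, update), here is a minimal Gradient Descent sketch. To keep the derivative explicit it trains a single linear layer rather than a deep network, uses a mean squared error loss, and fits made-up toy data; all of these are assumptions of the example, not choices from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
inputs = rng.uniform(-1.0, 1.0, size=(64, 1))
targets = 3.0 * inputs + 0.5  # toy targets, invented for this sketch

weights = np.zeros((1, 1))
bias = np.zeros(1)
step_size = 0.1

for _ in range(500):
    # 1. Pass the inputs through the model to produce predictions.
    predictions = inputs @ weights + bias
    # 2. Compare predictions to the targets and compute the loss.
    errors = predictions - targets
    loss = np.mean(errors ** 2)
    # 3. Compute the derivative of the loss with respect to each parameter.
    grad_weights = 2.0 * inputs.T @ errors / len(inputs)
    grad_bias = 2.0 * errors.mean(axis=0)
    # 4. Apply the learning rule: step against the gradient.
    weights -= step_size * grad_weights
    bias -= step_size * grad_bias

print(weights.ravel(), bias, loss)
```

After training, the weight and bias should sit close to the values used to generate the toy targets (3.0 and 0.5). In a multi-layer network the same loop applies, with the derivatives obtained by backpropagation.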