How can Sutton's mountain car problem be solved by iterative optimization, with Matlab code included?

2026-05-16 11:08

1 Introduction

Consider the task of driving an underpowered car up a steep mountain road. The difficulty is that gravity is stronger than the car's engine, and even at full throttle the car cannot accelerate up the steep slope. The only solution is to first move away from the goal and up the opposite slope on the left. Then, by applying full throttle, the car can build up enough inertia to carry it up the steep slope even though it is slowing down the whole way. This is a simple example of a continuous control task where things have to get worse in a sense (farther from the goal) before they can get better. Many control methodologies have great difficulties with tasks of this kind unless explicitly aided by a human designer.
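The claim that full throttle alone fails, while backing up first succeeds, can be checked with a short simulation. The following is a minimal sketch, not the blogger's code. The velocity update is the one quoted in the script comments in section 2; the position update (p = p + v), the bounds on position and velocity, the goal at p = 0.5, and the inelastic left wall are assumptions that follow the usual statement of Sutton's problem. The "rocking" rule (push in the direction of motion) is a hand-written heuristic used only for illustration, not an optimal policy.

```matlab
% Sketch: compare "always full throttle" with a simple rocking heuristic.
% Assumed (not given in the article): position update, bounds, goal at pMax.
pMin = -1.2;  pMax = 0.5;       % assumed position bounds, goal at pMax
vMin = -0.07; vMax = 0.07;      % assumed velocity bounds
maxSteps = 1000;

fullThrottle = @(p, v) 1;                   % always push right
rocking      = @(p, v) 2 * (v >= 0) - 1;    % push in the direction of motion
policies = {fullThrottle, rocking};
names    = {'full throttle', 'rocking'};

reached = false(1, numel(policies));
steps   = maxSteps * ones(1, numel(policies));

for k = 1:numel(policies)
    p = -0.52;  v = 0;                      % same start as x0 in section 2
    for t = 1:maxSteps
        u = policies{k}(p, v);
        v = v + 0.001 * u - 0.0025 * cos(3 * p);   % dynamics from the article
        v = min(max(v, vMin), vMax);
        p = p + v;                          % assumed position update
        if p <= pMin                        % assumed inelastic left wall
            p = pMin;  v = 0;
        end
        if p >= pMax
            reached(k) = true;  steps(k) = t;
            break
        end
    end
    fprintf('%-14s reached goal: %d (steps used: %d)\n', ...
        names{k}, double(reached(k)), steps(k));
end
```

Under these assumptions the full-throttle run never gets out of the valley within the step limit, while the rocking run first climbs the left slope and then reaches the goal.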

2 Partial Code

% This is the main file for the simulation.
% The helper functions mountainCarValIter, traceBack, and
% visualizeMountainCar are not included in this excerpt.
clear
close all
clc

%% Set simulation parameters.
% There is a minimum grid size here; otherwise the traceBack function will
% fail. The dynamic equation of the car is:
%     vNext = v + 0.001 * u - 0.0025 * cos(3 * p);
% With v(0) = 0, cos(3p) = 0, and u = 1, this gives vNext = 0.001. For this
% reason, the velocity grid spacing should be finer than 0.001.
% A bigger number means a finer grid.
% We create a grid/matrix whose rows are the discretized positions and
% whose columns are the discretized velocities.
gridSizePos = 400;
gridSizeVel = 400;

% x0 holds the initial position and velocity.
% Select a position between -0.6 and -0.4.
% The initial velocity should be zero, as described in the original problem.
x0 = [-0.52 0];
%x0 = [0.4 0];

%% Find the optimal policy.
% The output is named convError so that it does not shadow the built-in
% error function.
tic
[convError, predecessorP, predecessorV, policy] = ...
    mountainCarValIter(gridSizePos, gridSizeVel, 1000);
toc

%% Trace back the optimal policy for a given initial condition.
[XStar, UStar, TStar] = ...
    traceBack(predecessorP, predecessorV, policy, x0, gridSizePos, gridSizeVel);

%% Animation
visualizeMountainCar(gridSizePos, XStar, UStar)

%% Plot errors over iterations.
figure
plot(convError);
title('Convergence errors over iterations');

%% Plot the policy matrix.
figure
imagesc(policy)
title('Policy matrix')
xlabel('Position');
ylabel('Velocity');
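The script above calls mountainCarValIter without showing it. As a rough illustration of what a grid-based value iteration for this problem might look like, here is a minimal sketch. It is not the blogger's implementation: the variable names, the three-action set {-1, 0, 1}, the unit cost per step, the nearest-cell snapping of successor states, the state bounds, and the position update are all assumptions. A smaller grid (201 x 201) than the script's 400 x 400 is used to keep the run short; its velocity spacing is still finer than 0.001, in line with the comment in the script.

```matlab
% Sketch of minimum-time value iteration on a discretized state grid.
% All names and modelling choices here are assumptions for illustration.
nP = 201;  nV = 201;  maxIter = 1000;
pGrid = linspace(-1.2, 0.5, nP);          % assumed position range
vGrid = linspace(-0.07, 0.07, nV);        % assumed velocity range
dp = pGrid(2) - pGrid(1);
dv = vGrid(2) - vGrid(1);                 % 0.0007 here, finer than 0.001
[P, V] = ndgrid(pGrid, vGrid);            % rows: position, columns: velocity
actions = [-1 0 1];                       % assumed discrete action set

% Precompute the successor cell of every state for every action.
nextIdx = zeros(nP * nV, numel(actions));
for a = 1:numel(actions)
    vNext = V + 0.001 * actions(a) - 0.0025 * cos(3 * P);
    vNext = min(max(vNext, vGrid(1)), vGrid(end));
    pNext = P + vNext;                    % assumed position update
    hitWall = pNext <= pGrid(1);
    pNext = min(max(pNext, pGrid(1)), pGrid(end));
    vNext(hitWall) = 0;                   % assumed inelastic left wall
    iP = round((pNext - pGrid(1)) / dp) + 1;   % snap to nearest cell
    iV = round((vNext - vGrid(1)) / dv) + 1;
    nextIdx(:, a) = sub2ind([nP nV], iP(:), iV(:));
end

isGoal = false(nP, nV);
isGoal(end, :) = true;                    % right edge of the grid is the goal
isGoal = isGoal(:);

% Value iteration: J is the estimated number of steps to reach the goal.
J = zeros(nP * nV, 1);
convErr = zeros(maxIter, 1);
for k = 1:maxIter
    [best, bestA] = min(J(nextIdx), [], 2);
    Jnew = 1 + best;
    Jnew(isGoal) = 0;
    convErr(k) = max(abs(Jnew - J));
    J = Jnew;
    if convErr(k) < 1e-9
        convErr = convErr(1:k);
        break
    end
end
policy = reshape(actions(bestA), nP, nV);

% Roll the greedy policy forward from the cell nearest to x0 = [-0.52 0].
[~, iP0] = min(abs(pGrid - (-0.52)));
[~, iV0] = min(abs(vGrid - 0));
s = sub2ind([nP nV], iP0, iV0);
traj = s;
while ~isGoal(s) && numel(traj) <= maxIter
    s = nextIdx(s, bestA(s));
    traj(end + 1) = s;
end
fprintf('Rollout length: %d steps (reached goal: %d)\n', ...
    numel(traj) - 1, double(isGoal(s)));
```

Because successor states are snapped to the nearest cell, a grid that is too coarse can map a state back onto itself, which is the failure the script's comment about minimum grid size warns about. Cells that cannot reach the goal in the discretized model keep increasing their cost by one per sweep, so the loop may run to maxIter instead of stopping early.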

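3 Results

4 References

Some of the theory is drawn from online sources; if anything infringes, please contact the blogger to have it removed.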
