两端约束的最优控制问题及其数值解法
问题的基本形式
设 n n n维系统状态房产 x ˙ ( t ) = f [ x ( t ) , u ( t ) , t ] \dot{x}(t)=f[x(t),u(t),t] x˙(t)=f[x(t),u(t),t],控制向量 u ( t ) ∈ Ω u(t)\in\Omega u(t)∈Ω是分段连续函数, Ω ∈ R m \Omega\in R^m Ω∈Rm是有界闭集,满足约束 g [ x ( t ) , u ( t ) , t ] ≥ 0 g[x(t),u(t),t]\ge 0 g[x(t),u(t),t]≥0,终端时刻固定为 t f t_f tf。目标是使状态从初态 x ( t 0 ) = x 0 x(t_0)=x_0 x(t0)=x0转移到终态 x ( t f ) x(t_f) x(tf),其中 G [ x ( t f ) , t f ] = 0 G[x(t_f),t_f]=0 G[x(tf),tf]=0,且使得性能指标 J [ u ( t ) ] = Φ [ x ( t f ) , t f ] + ∫ t 0 t f L [ x ( t ) , u ( t ) , t ] d t J[u(t)]=\Phi[x(t_f),t_f]+\int_{t_0}^{t_f}L[x(t),u(t),t]dt J[u(t)]=Φ[x(tf),tf]+∫t0tfL[x(t),u(t),t]dt达到最小。
基本解法
构造Hamilton函数 H [ x ( t ) , u ( t ) , λ ( t ) , t ] = L [ x ( t ) , u ( t ) , t ] + λ ( t ) T f [ x ( t ) , u ( t ) , t ] H[x(t),u(t),\lambda(t),t]=L[x(t),u(t),t]+\lambda(t)^Tf[x(t),u(t),t] H[x(t),u(t),λ(t),t]=L[x(t),u(t),t]+λ(t)Tf[x(t),u(t),t] 。设 u ∗ ( t ) u^*(t) u∗(t)为最优控制, x ∗ ( t ) x^*(t) x∗(t)是最优轨线,则存在与 u = u ∗ ( t ) u=u^*(t) u=u∗(t)和 x = x ∗ ( t ) x=x^*(t) x=x∗(t)对应的最优伴随向量 λ = λ ∗ ( t ) \lambda=\lambda^*(t) λ=λ∗(t),使得: { x ˙ = ∂ H ∂ λ λ ˙ = − ∂ H ∂ x \begin{cases} \dot{x}=\frac{\partial H}{\partial \lambda} \\ \dot{\lambda}=-\frac{\partial H}{\partial x}\\ \end{cases} {x˙=∂λ∂Hλ˙=−∂x∂H
其中, u ∗ = arg min u ∈ Ω H [ x ∗ ( t ) , u ( t ) , λ ∗ ( t ) ] u^*=\arg\min_{u\in \Omega}H[x^*(t),u(t),\lambda^*(t)] u∗=argminu∈ΩH[x∗(t),u(t),λ∗(t)];
上述方程同时还满足边界条件 x ( t 0 ) = x 0 , G [ x ( t f ) , t f ] = 0 x(t_0)=x_0,G[x(t_f),t_f]=0 x(t0)=x0,G[x(tf),tf]=0;
横截条件 λ ( t f ) = ∂ Φ ( t f ) ∂ x + [ ∂ G ( t f ) ∂ x ] T v \lambda(t_f)=\frac{\partial \Phi(t_f)}{\partial x}+[\frac{\partial G(t_f)}{\partial x}]^Tv λ(tf)=∂x∂Φ(tf)+[∂x∂G(tf)]Tv。
数值解法
直接法
在考虑控制量约束 g [ x ( t ) , u ( t ) , t ] ≥ 0 g[x(t),u(t),t]\ge 0 g[x(t),u(t),t]≥0和终端约束 G [ x ( t f ) , t f ] = 0 G[x(t_f),t_f]=0 G[x(tf),tf]=0存在的条件下,需要对原来的性能指标 J [ u ( t ) ] J[u(t)] J[u(t)]加罚函数项得到 J ˉ [ u ( t ) ] \bar{J}[u(t)] Jˉ[u(t)]:
J ˉ [ u ( t ) ] = J [ u ( t ) ] + μ ∑ i = 1 r G i [ x ( t f ) , t f ] 2 + η ∫ t 0 t f ∑ i = 1 l min ( g i , 0 ) 2 d t \bar{J}[u(t)]=J[u(t)]+\mu\sum_{i=1}^rG_i[x(t_f),t_f]^2+\eta\int_{t_0}^{t_f}\sum_{i=1}^l\min(g_i,0)^2dt Jˉ[u(t)]=J[u(t)]+μi=1∑rGi[x(tf),tf]2+η∫t0tfi=1∑lmin(gi,0)2dt
直接法多采用梯度法及其变型进行求解,具体的计算步骤如下:
Step1. 根据经验选定初始控制 u 0 ( t ) u^0(t) u0(t),允许误差 ε > 0 \varepsilon>0 ε>0;
Step2. 将 u 0 ( t ) u^0(t) u0(t)代入状态方程并求解得到 x 0 ( t ) x^0(t) x0(t);
Step3. 计算 J ˉ [ u 0 ( t ) ] \bar{J}[u^0(t)] Jˉ[u0(t)],并根据协态方程从 t f t_f tf到 t 0 t_0 t0反向积分计算 λ 0 ( t ) \lambda^0(t) λ0(t);
Step4. 计算 u 0 u^0 u0处的梯度 ∇ J ˉ [ u 0 ( t ) ] = ∂ H [ x 0 ( t ) , u 0 ( t ) , λ 0 ( t ) , t ] ∂ u \nabla \bar{J}[u^0(t)]=\frac{\partial H[x^0(t),u^0(t),\lambda^0(t),t]}{\partial u} ∇Jˉ[u0(t)]=∂u∂H[x0(t),u0(t),λ0(t),t];
Step5. 确定搜索步长 α 0 = arg min α > 0 J ˉ [ u 0 − α ∇ J ˉ [ u 0 ( t ) ] ] \alpha^0=\arg\min_{\alpha >0} \bar{J}[u^0-\alpha\nabla \bar{J}[u^0(t)]] α0=argminα>0Jˉ[u0−α∇Jˉ[u0(t)]];
Step6. 修正控制向量 u 1 ( t ) = u 0 ( t ) − α 0 ∇ J ˉ [ u 0 ( t ) ] u^1(t)=u^0(t)-\alpha^0\nabla \bar{J}[u^0(t)] u1(t)=u0(t)−α0∇Jˉ[u0(t)];
Step7. 若满足终止条件 ∣ ∣ ∇ J ˉ [ u 0 ( t ) ] ∣ ∣ ≤ ε ||\nabla \bar{J}[u^0(t)]||\leq \varepsilon ∣∣∇Jˉ[u0(t)]∣∣≤ε,则结束循环;否则,令 u 0 = u 1 u^0=u^1 u0=u1回到Step2.
Step2和Step3往往是比较难计算的。
另外,若 u ( t ) u(t) u(t)满足上下界限约束,则在Step6中需要对 u ( t ) u(t) u(t)进行限幅。而针对横截条件中的 v v v可以采用 2 μ G 2\mu G 2μG估算:
λ i ( t f ) = ∂ Φ ( t f ) ∂ x i + ∑ j = 1 r 2 μ G j [ x ( t f ) , t f ] ∂ G j ( t f ) ∂ x i \lambda_i(t_f)=\frac{\partial \Phi(t_f)}{\partial x_i}+\sum_{j=1}^r2\mu G_j[x(t_f),t_f]\frac{\partial G_j(t_f)}{\partial x_i} λi(tf)=∂xi∂Φ(tf)+j=1∑r2μGj[x(tf),tf]∂xi∂Gj(tf)
间接法
直接法中修正后的控制向量 u u u不一定满足约束 g ≥ 0 g\geq 0 g≥0,而是通过施加罚函数,限幅等手段进行迭代。而间接法则是尽量充分保证 u u u能满足约束 g ≥ 0 g\geq 0 g≥0,这里给出间接法中的拟线性化方法实现逼近。该方法的核心是首先求出 u ( x , λ , t ) u(x,\lambda,t) u(x,λ,t)带入正则方程,引入增广状态 Y ( t ) = [ x ( t ) , λ ( t ) ] T , Y ( t ) ∈ R 2 n Y(t)=[x(t),\lambda(t)]^T,Y(t)\in R^{2n} Y(t)=[x(t),λ(t)]T,Y(t)∈R2n,将正则方程转化为 Y ˙ = g ( Y , t ) \dot{Y}=g(Y,t) Y˙=g(Y,t),再将该方程进一步线性化得到:
Y ˙ K + 1 = ( ∂ g ∂ Y ) K Y K + 1 + [ g ( Y K , t ) − ( ∂ g ∂ Y ) K Y K ] \dot{Y}^{K+1}=(\frac{\partial g}{\partial Y})_KY^{K+1}+[g(Y^K,t)-(\frac{\partial g}{\partial Y})_KY^{K}] Y˙K+1=(∂Y∂g)KYK+1+[g(YK,t)−(∂Y∂g)KYK]
其中, Y K Y^K YK代表第 K K K步迭代的解。若对于给定的 ε > 0 \varepsilon>0 ε>0,当 ∣ ∣ Y k + 1 ( t ) − Y k ( t ) ∣ ∣ ≤ ε ||Y^{k+1}(t)-Y^k(t)||\leq \varepsilon ∣∣Yk+1(t)−Yk(t)∣∣≤ε时停止计算。