# Nondifferentiable Optimization Problems Homework

Author: Nathanael Robinson

Steward: Dajun Yue and Fengqi You

## Introduction

Non-differentiable optimization is a category of optimization dealing with objective functions that, for a variety of reasons, are not differentiable everywhere. The functions in this class are generally non-smooth: although continuous, they often contain sharp points or corners at which no tangent exists, and so the gradient is undefined there. In practice, non-differentiable optimization encompasses a large variety of problems, and no single one-size-fits-all method applies; however, a solution is often reached through the subgradient method. Non-differentiable functions arise frequently in real-world applications, commonly in economics, where cost functions often include sharp points. Early work on the optimization of non-differentiable functions was begun by the Soviet scientists Dubovitskii and Milyutin in the 1960s and led to continued research by Soviet scientists. The subject has remained an active field of study since, with different theories and methods applied to different cases.

## Cost Functions

In many cases, particularly in economics, the cost function, which serves as the objective function of an optimization problem, is non-differentiable. These non-smooth cost functions may include discontinuities or discontinuous gradients and often arise from discontinuous physical processes. Optimal solution of these cost functions is a matter of importance to economists, but it presents a variety of issues for numerical methods, creating the need for special solution techniques.

## Solution Methods

Differentiable problems and differentiable cost functions can in general be solved with gradient-based analytical methods, such as the Karush-Kuhn-Tucker (KKT) conditions, and with numerical methods such as steepest descent and conjugate gradient. However, the introduction of non-differentiable points in the function invalidates these methods: a steepest-descent direction cannot be computed at a point where the gradient does not exist. A common method for solving a non-differentiable cost function is to transform the problem into a nonlinear programming model in which all of the new functions involved are differentiable, so that solution is then possible through ordinary means.
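As a small worked example of this kind of transformation (illustrative, not from the original text), minimizing the nondifferentiable term $|x|$ can be rewritten as a smooth, linearly constrained problem by introducing an auxiliary variable $t$:

\begin{alignat}{1} &\min_{x,\,t} \; t \\ \mathrm{s.t.} & \quad x \le t, \quad -x \le t \end{alignat}

At the optimum $t = |x|$, so the two problems have the same optimal value, and every function involved is differentiable (in fact linear).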

## Simple Kink Case

A common case of a non-differentiable function is one containing simple kinks: points at which the function is continuous but its graph has a corner, so that no tangent exists, as with the absolute-value term $|x|$ at $x = 0$.

If these simple kinks were removed, the function would be differentiable across the entire domain. Some other types of non-differentiable objective functions can also be modeled as simple kinks, allowing the same type of solution.

The approach to the simple kink case is to approximate each non-differentiable kink with a smooth function, which allows conventional solution of the entire problem. This requires that the kinks be the only factor rendering the function non-differentiable. A simple kink can be modeled by a two-parameter smooth approximation, with parameters $y$ and $c$.

Each kink in the function is replaced with its two-parameter approximation, so that the new function is differentiable in terms of the parameters $y$ and $c$. The problem can then be solved iteratively by adjusting the parameters $c$ and $y$ and solving the resulting optimization problem.

A solution to the approximated objective function is obtained. The problem is then re-solved with an updated parameter $c$, obtained by multiplying the previous value by a fixed factor ($y$ can also be updated if necessary), and a new minimization is carried out. The procedure is repeated until a solution consistent with the $c$ and $y$ parameters is reached.
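The iterative smooth-then-tighten scheme above can be sketched numerically. The sketch below is only an illustration, not the exact two-parameter approximation of [1] or [2]: it smooths the kink $|x|$ with the one-parameter approximation $\sqrt{x^2 + \epsilon^2}$ and tightens $\epsilon$ between solves, for the assumed toy objective $f(x) = |x| + (x - 1)^2$.

```python
import math

def minimize_smoothed(eps, x0, lr=0.05, iters=4000):
    """Gradient descent on the smoothed objective
    f_eps(x) = sqrt(x^2 + eps^2) + (x - 1)^2."""
    x = x0
    for _ in range(iters):
        grad = x / math.sqrt(x * x + eps * eps) + 2.0 * (x - 1.0)
        x -= lr * grad
    return x

# Continuation: solve a sequence of smoothed problems,
# shrinking the smoothing parameter each round.
x, eps = 0.0, 1.0
for _ in range(6):
    x = minimize_smoothed(eps, x)
    eps *= 0.1  # tighten the approximation

# The true minimizer of |x| + (x - 1)^2 is x = 0.5.
print(round(x, 3))
```

Each round starts from the previous solution, so as the smoothing is tightened the iterates track the minimizer of the true non-smooth objective.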

If the non-differentiable function is convex and subject to convex constraints, then the $\epsilon$-Subgradient Method can be applied. This method is a descent algorithm for minimization problems, provided they are convex.
With this method the constraints are not considered explicitly; instead the constrained problem is recast as an unconstrained one: minimizing $f$ over a set $X$ is equivalent to finding the minimum of the extended real-valued function $f + \delta_X$, where $\delta_X$ is the indicator function of $X$ (zero on $X$, $+\infty$ elsewhere). The solution converges through a four-step procedure; the basis of these steps is a series of propositions that are further detailed in [1].
Step 1: Select a starting vector $x_0$ at which the objective is finite, a scalar $\epsilon > 0$ controlling the accuracy of the subgradient approximation, and a scalar reduction factor $\beta \in (0, 1)$.
Step 2: Given the current iterate $x_k$, set $\epsilon_k = \beta^m \epsilon$, where $m$ is the smallest non-negative integer for which the descent test of [1] is satisfied.
Step 3: Find the direction vector $d_k$ as the negative of the projection of the origin onto the $\epsilon_k$-subdifferential of the objective at $x_k$.

Step 4: Set $x_{k+1} = x_k + \alpha_k d_k$, where the step size $\alpha_k$ is chosen so that the objective is sufficiently decreased.

Return to Step 2 and iterate until convergence. This method is not only guaranteed to converge, but progress toward convergence is made with each iteration.

## Cutting Plane Methods

Cutting planes were first utilized for the solution of convex non-differentiable problems. The application of cutting planes uses the subgradient inequality to approximate the function as

$$f(x) \approx \max_{i=1,\ldots,k} \left\{ f(x_i) + g_i^{T}(x - x_i) \right\}$$

where the $g_i$ are subgradients of $f$ at the points $x_i$. Thus, the original problem $\min_x f(x)$ is now formulated as

$$\min_{x} \; \max_{i=1,\ldots,k} \left\{ f(x_i) + g_i^{T}(x - x_i) \right\}$$

which is equivalent to the new problem

$$\min_{x,\,v} \; v \quad \mathrm{s.t.} \quad f(x_i) + g_i^{T}(x - x_i) \le v, \quad i = 1, \ldots, k.$$

This new minimization formulation is differentiable and easier to deal with; however, it is only an approximation of the original problem, and it becomes a better approximation as more constraints (cuts) are added to the model.
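A minimal numerical sketch of the cutting-plane (Kelley's) scheme, under illustrative assumptions: a one-dimensional convex objective $f(x) = |x - 0.3|$ on the interval $[-2, 2]$. In one dimension the piecewise-linear model can be minimized directly by checking the interval endpoints and the intersections of the accumulated cuts.

```python
def cutting_plane(f, subgrad, lo, hi, iters=10):
    """Kelley's cutting-plane method in one dimension.

    Builds a piecewise-linear under-approximation of a convex f
    from subgradient cuts and minimizes it over [lo, hi]."""
    cuts = []          # each cut is (f(x_i), g_i, x_i)
    x = lo             # initial query point
    for _ in range(iters):
        cuts.append((f(x), subgrad(x), x))

        def model(z):
            # max over all cuts: f(x_i) + g_i * (z - x_i)
            return max(fi + gi * (z - xi) for fi, gi, xi in cuts)

        # the model's minimizer lies at an endpoint or where two cuts cross
        candidates = [lo, hi]
        for i in range(len(cuts)):
            for j in range(i + 1, len(cuts)):
                f1, g1, x1 = cuts[i]
                f2, g2, x2 = cuts[j]
                if g1 != g2:
                    z = ((f2 - g2 * x2) - (f1 - g1 * x1)) / (g1 - g2)
                    if lo <= z <= hi:
                        candidates.append(z)
        x = min(candidates, key=model)
    return x

f = lambda x: abs(x - 0.3)
subgrad = lambda x: 1.0 if x >= 0.3 else -1.0  # a valid subgradient of f
x_star = cutting_plane(f, subgrad, -2.0, 2.0)
print(round(x_star, 6))
```

For this simple kink, two cuts already pin down the minimizer at their intersection, illustrating how the approximation tightens as constraints accumulate.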

The subgradient optimization method is among the most common methods for solving non-differentiable optimization problems. It extends the gradient methods used in smooth optimization but is more complicated, since the direction given by a subgradient is not necessarily a direction of improvement. Details are further enumerated in the Subgradient Optimization page of this Wiki Textbook.
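As an illustration (not from the original text), the basic subgradient iteration $x_{k+1} = x_k - \alpha_k g_k$ with diminishing step sizes $\alpha_k = \alpha/k$, applied to the assumed toy objective $f(x) = |x|$, looks like the sketch below. Because a subgradient step may increase the objective, the best point found so far is tracked rather than assuming monotone descent:

```python
def subgradient_method(f, subgrad, x0, alpha=1.0, iters=2000):
    """Subgradient method with diminishing step sizes alpha/k.

    Unlike gradient descent, an individual step may increase f,
    so the best point seen so far is recorded and returned."""
    x = x0
    best_x, best_f = x0, f(x0)
    for k in range(1, iters + 1):
        x = x - (alpha / k) * subgrad(x)
        if f(x) < best_f:
            best_x, best_f = x, f(x)
    return best_x, best_f

f = lambda x: abs(x)
subgrad = lambda x: 1.0 if x >= 0 else -1.0  # a subgradient of |x|
best_x, best_f = subgradient_method(f, subgrad, x0=3.0)
```

The iterates oscillate around the minimizer $x = 0$ with shrinking amplitude, which is exactly the behavior that distinguishes subgradient methods from true descent methods.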

## Illustrative Example

A simple example of non-differentiable optimization is the approximation of a kink originating from an absolute value function. The simple function $f(x) = |x|$, while continuous over its entire domain, is non-differentiable at $x = 0$ due to the presence of a "kink," a point at which no tangent exists. Since the non-differentiable point of the function is known, an approximation can be introduced to relax and smooth the function with a parameter $\epsilon > 0$. One such smooth approximation is $f_{\epsilon}(x) = \sqrt{x^2 + \epsilon^2}$, which agrees with $|x|$ to within $\epsilon$ and becomes exact as $\epsilon \to 0$.
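A quick numerical check of one common smoothing of the absolute value, $f_{\epsilon}(x) = \sqrt{x^2 + \epsilon^2}$ (an illustrative choice of approximation): the approximation error $\sqrt{x^2 + \epsilon^2} - |x|$ is positive everywhere and largest at the kink $x = 0$, where it equals exactly $\epsilon$.

```python
import math

def smooth_abs(x, eps):
    """Smooth approximation of |x| with parameter eps > 0."""
    return math.sqrt(x * x + eps * eps)

eps = 0.01
# sample the error on a grid spanning the kink at x = 0
errors = [smooth_abs(x / 100.0, eps) - abs(x / 100.0)
          for x in range(-300, 301)]
max_err = max(errors)
print(max_err)
```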

## References

1. Bertsekas, D. P., and Mitter, S. K. "A Descent Numerical Method for Optimization Problems with Nondifferentiable Cost Functionals." SIAM Journal on Control, Vol. 11, No. 4, 1973.
2. Bertsekas, D. P. "Nondifferentiable Optimization via Approximation." Mathematical Programming Study 3, pp. 1-25, 1975.
3. Elhedhli, S., Goffin, J.-L., and Vial, J.-P. "Nondifferentiable Optimization: Introduction, Applications and Algorithms." Encyclopedia of Optimization, 2000.

*Figure: An example of a non-differentiable cost function, such as one that may be seen in economics.*

*Figure: An example of a two-parameter kink approximation.*

## Smooth reformulation

As Sid points out, there's no need to treat this problem as non-smooth, since you'd just be making it harder on yourself.

Let's assume for the sake of notation that $\mathbf{x}_{1}, \ldots, \mathbf{x}_{15} \in [0,1]^{3} \subset \mathbb{R}^{3}$ are the coordinates of your 15 particles in the unit cube. A smooth formulation, as Sid suggests, presented in standard form (for nonlinear programming), would be:

\begin{alignat}{1} &\min_{\mathbf{x}_{1}, \ldots, \mathbf{x}_{15} \in [0,1]^{3}} -E \\ \mathrm{s.t.} & \quad E - \|\mathbf{x}_{i} - \mathbf{x}_{j}\|^{2} \leq 0, \,\, i, j = 1, \ldots, 15, \,\, i \neq j \end{alignat}

where $E$ is a proxy for the minimum distance, which I'm assuming is related to minimizing some sort of energy. There might be a way to reformulate this problem as an equivalent convex problem, but I don't think there is.
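The proxy-variable trick can be sanity-checked numerically: for any fixed configuration, the largest feasible $E$ is just the minimum pairwise squared distance, so maximizing $E$ maximizes the smallest separation. A toy random-search sketch (purely illustrative, with 4 points in the unit square rather than 15 in the cube, and random search standing in for a real NLP solver):

```python
import itertools
import random

def min_pairwise_sq_dist(points):
    """Largest feasible E for this configuration: the minimum
    squared distance over all point pairs."""
    return min(sum((a - b) ** 2 for a, b in zip(p, q))
               for p, q in itertools.combinations(points, 2))

random.seed(0)
best = -1.0
for _ in range(20000):
    pts = [(random.random(), random.random()) for _ in range(4)]
    best = max(best, min_pairwise_sq_dist(pts))

# For 4 points in the unit square the true optimum places them at the
# corners, giving a minimum squared distance of 1; random search only
# approaches this from below.
print(best)
```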

This formulation probably isn't convex, because the left-hand sides of the nonlinear constraints aren't convex, so you'll need to use a nonconvex nonlinear programming solver to be assured of a global optimal solution (unless you can prove convexity of the feasible set, but I doubt that). Deterministic global solvers that will work for nonconvex problems include (but aren't limited to):

• BARON (which is commercial, but you can submit jobs for free via the NEOS optimization server run by University of Wisconsin-Madison)
• LINDOGlobal (also commercial, also available through the NEOS optimization server)
• Couenne (open-source, part of the COIN-OR suite of open-source solvers)
• Bonmin (also part of COIN-OR)
• LaGO (again, part of COIN-OR)
• icos (available as open-source, or through NEOS)

It's important to note that one solver may work on your problem when others won't; BARON is generally considered the best, but it's fallible, and there are cases where, for example, Couenne will solve a problem to (epsilon) global optimality, but BARON won't (and vice versa).

## Solving nonsmooth problems

Let's suppose for the sake of argument that you (like Hans) want to solve a non-smooth nonlinear programming problem. This type of problem isn't my area of expertise, but I know of a couple references.

The most famous person in the field (who, as far as I can tell, developed the most important parts of the theory early on) is Frank H. Clarke. The gist of non-smooth optimization seems to be: replace gradients with Clarke's generalized gradients. Using Clarke's generalized gradients, you're supposed to be able to formulate a non-smooth analogue of Newton's method, as well as algorithms for optimization. His textbook on the theory, *Optimization and Nonsmooth Analysis*, is considered a classic.

In terms of software, the best links I can find are to Napsu Karmitsa's home page; she's developed a couple non-smooth optimization solvers, and she links to other non-smooth optimization solvers. The methods I've heard of most often are called bundle methods, and should be deterministic. (I favor deterministic methods over stochastic methods.) More links to non-smooth codes can be found here; your mileage may vary, because like I said, I don't work with these methods.

I do know that just because a method is developed for non-smooth problems does not mean it will work for non-smooth, non-convex problems, so you will need to make sure that the solver you choose can handle both non-smoothness and non-convexity.

Finally, as Hans points out in the comments, non-smooth formulations regularly appear in science and engineering. However, my first instinct as someone in the optimization field is to try to find an equivalent smooth reformulation, because methods for solving smooth problems are generally much faster than methods for solving non-smooth problems (a labmate uses non-smooth solvers, and has made this observation). If you can reformulate the problem as a smooth optimization problem, it generally behooves you to do so.