Nonlinear operators
YALMIP supports modeling of nonlinear, often non-differentiable,
operators that typically occur in convex programming. Nine simple operators
are currently supported: min, max, abs, sqrt,
norm, sumk, sumabsk, geomean and cpower,
and users can easily add their own (see the end of this page). The operators
can be used intuitively, and YALMIP will automatically try to find out
if they are used in a way that enables a convex representation. Although
this can simplify the modeling phase significantly in some cases, it
is recommended not to use these operators unless you know how to model
them by your self using epigraphs and composition rules of convex and
concave functions, why and when it can be done etc. The text-book
[S. Boyd and L. Vandenberghe]
should be a suitable introduction for the beginner.
In addition to modeling convex and concave operators and perform
automatic analysis and derivation of equivalent conic programs, YALMIP
also uses the nonlinear operator framework for implementing
logic expression involving or and
and, and in the same vein but on a higher level, to handle piecewise
functions in pwf.
The nonlinear operator framework was initially implemented for
functions that can be modelled rigorously using conic constraints
and additional variables. However, there are many functions that
cannot be exactly modelled using conic constraints, such as
exponential functions and logarithms, but are convex or
concave, and of course can be analyzed in terms of convexity
preserving operations. These function are supported in a framework
called evaluation based nonlinear operators. The models using these
general convex functions will be analysed for convexity, but the
resulting model will be a problem that only can be solved using
a general nonlinear solver, such as
fmincon. See evaluation based nonlinear operators. Note that
this extension still is experimental and not intended for large
problems.
Convexity analysis in 10 lines
Without going into theoretical details, the convexity analysis is
based on epi- and hypograph formulations, and composition rules. For
the compound expression f = h(g(x)), it holds that (For
simplicity, we write increasing, decreasing, convex and concave, but
the correct notation would be nondecreasing, nonincreasing, convex
or affine and concave or affine. This notation us used throughout
this manual and inside YALMIP)
f is convex if h is convex and
increasing and g is convex
f is convex if h is convex and
decreasing and g is concave
f is concave if h is concave
and increasing and g is concave
f is concave if h is concave
and decreasing and g is convex |
Based on this information, it is possible to recursively analyze
convexity of a complex expression involving convex and concave functions.
When solvesdp is called, YALMIP
checks the convexity of objective function and constraints by using
information about the properties of the operators. If YALMIP manage
to prove convexity, graph formulations of the operators are automatically
introduced. This means that the operator is replaced with a graph, i.e.,
a set of constraints.
epigraph: t represents convex function f(x)
: replace with t≥f(x)
hypograph: t represents concave function
f(x) : replace with t≤f(x) |
Of course, in order for this to be useful, the epigraph
representation has to be represented using standard constraints, such
as conic constraints.
The operators
The operators defined in the current release are
described in the table below. This information might be useful to
understand how and when YALMIP can derive convexity.
Name |
Convex/Concave |
Monotinicity |
Comments |
abs |
convex |
none |
|
min |
concave |
increasing |
|
max |
convex |
increasing |
|
norm |
convex |
none |
All standard norms (1,2, inf and Frobenius) can be used
(applicable on both vectors and matrices.) |
sumk |
convex |
See comment |
Defines the sum of the
k largest elements of a vector, or the sum of the k
largest eigenvalues of a symmetric matrix. Increasing
for vector arguments, no monotinicity defined for
eigenvalue operator. |
sumabsk |
convex |
none |
Defines the sum of the
k largest absolute value elements of a vector, or the
sum of the k largest absolute value eigenvalues of a
symmetric matrix. |
geomean |
concave |
See comment |
For vector arguments, the operator is increasing. For
symmetric matrix argument, the operator is defined as the
geometric mean of the eigenvalues. No monotinicity
defined for eigenvalue operator |
cpower |
See comment |
See comment |
Convexity-aware
version of power. For negative powers, the operator is convex and decreasing.
For positive powers less than one, the operator is concave
and increasing. Positive powers larger than 1 gives a convex
increasing operator. |
sqrt |
concave |
increasing |
Short for
cpower(x,0.5) |
Standard use
Consider once again the linear regression
problem.
a = [1 2 3 4 5 6]';
t = (0:0.2:2*pi)';
x = [sin(t) sin(2*t) sin(3*t) sin(4*t) sin(5*t) sin(6*t)];
y = x*a+(-4+8*rand(length(x),1));
a_hat = sdpvar(6,1);
residuals = y-x*a_hat;
|
Using abs and max, we can easily solve the L1
and the L∞ problem (Note that the abs operator currently
has performance issues and should be avoided for large arguments. Moreover, explicitly creating
absolute values when minimizing the L∞ error is
unnecessarily complicated).
solvesdp([],sum(abs(residuals)));
a_L1 = double(a_hat)
solvesdp([],max(abs(residuals)));
a_Linf = double(a_hat)
|
YALMIP automatically concludes that the objective functions can be
modeled using some additional linear inequalities, adds these, and solves
the problems. We can simplify the code even more by using the norm
operator (this is much faster for large-scale problems due to implementation
issues in YALMIP). Here we also compute the least-squares solution (note
that this norm will generate a second-order cone constraint).
solvesdp([],norm(residuals,1));
a_L1 = double(a_hat)
solvesdp([],norm(residuals,2));
a_L2 = double(a_hat)
solvesdp([],norm(residuals,inf));
a_Linf = double(a_hat)
|
The following piece of code shows how we easily can solve a regularized
problem.
solvesdp([],1e-3*norm(a_hat,2)+norm(residuals,inf));
a_regLinf = double(a_hat)
|
The norm operator is used exactly as the built-in norm function
in MATLAB, both for vectors and matrices. Hence it can be used also
to minimize the largest singular value (2-norm in matrix case), or the
Frobenious norm of a matrix.
The double command of-course applies also to the nonlinear
operators (double(OPERATOR(X)) returns OPERATOR(double(X)).
double(1e-3*norm(a_hat,2)+norm(residuals,inf))
ans =
3.1175
|
A construction useful for maximizing determinants
of positive definite matrices is the function (det P)1/m,
for positive definite matrix P, where m is the dimension of
P. This concave function, called geomean in YALMIP, is
supported as an extended operator. Note that the positive semidefiniteness
constraint on P is added automatically by YALMIP.
D = randn(5,5);
P = sdpvar(5,5);
solvesdp(set(P < D*D'),-geomean(P));
|
The command can be applied also on positive vectors, and will then
model the geometric mean of the elements. We can use this to find the analytic
center of a set of linear inequalities (note that this is absolutely
not the recommended way to compute the analytic center.)
A = randn(15,2);
b = rand(15,1)*5;
x = sdpvar(2,1);
solvesdp([],-geomean(b-A*x)); % Maximize product of elements in b-Ax, s.t Ax < b
|
Rather advanced constructions are possible, and YALMIP will try derive
an equivalent convex model.
sdpvar x y z
F = set(max(1,x)+max(y^2,z) < 3)+set(max(1,-min(x,y)) < 5)+set(norm([x;y],2) < z);
sol = solvesdp(F,max(x,z)-min(y,z)-z);
|
Polynomial and sigmonial expressions
By default, polynomial expressions (except quadratics) are not analyzed
with respect to convexity and conversion to a conic model is not performed.
Hence, if you add a constraint such as set(x^4 + y^8-x^0.5 < 10) ,
YALMIP may complain about convexity, even though we can see that the
expression is convex and can be represented using conic constraints.
More importantly, YALMIP will not try to derive an equivalent conic
model. However, by using the command cpower instead, rational
powers can be used.
To illustrate this, first note the difference between a monomial
generated using overloaded power and a variable generated cpower.
sdpvar x
x^4
Polynomial scalar (real, homogeneous, 1 variable)
cpower(x,4)
Linear scalar (real, derived, 1 variable)
|
The property derived indicates that YALMIP will try to replace the
variable with its epigraph formulation when the problem is solved. Working
with these convexity-aware monomials is no different than usual.
sdpvar x y
F = set(cpower(x,4) + cpower(y,4) < 10) + set(cpower(x,2/3) + cpower(y,2/3) > 1);
plot(F,[x y]);
|
Note that when you plot sets with constraints involving nonlinear
operators and polynomials, it is recommended that you specify the variables
of interest in the second argument (YALMIP may otherwise plot the set
with respect to auxiliary variables introduced during the construction
of the conic model.)
Do not use these operators unless you really need them. The conic
representation of rational powers easily grow large.
Limitations
If the convexity propagation fails, an error will be issued (error
code 14). Note that this does not imply that the model is nonconvex,
but only means that the simple sufficient conditions used for checking
convexity were violated. Failure is however typically an indication
of a bad model, and most often due to an actual nonconvex part in the
model. The problems above are all convex, but not this problem below,
due to the way min enters in the constraint.
sdpvar x y z
F = set(max(1,x)+max(y^2,z) < 3)+set(max(1,min(x,y)) < 5)+set(norm([x;y],2) < z);
sol = solvesdp(F,max(x,z)-min(y,z)-z);
sol.info
ans =
Convexity check failed (Expected convex function in constraint #2 at level 2)
|
In the same sense, this problem fails due to a nonconvex objective
function.
sdpvar x y z
F = set(max(1,x)+max(y^2,z) < 3);
sol = solvesdp(F,-norm([x;y]));
sol.info
ans =
Convexity check failed (Expected concave function in objective at level 1)
|
This following problem is however convex, but convexity propagation
fails.
sdpvar x
sol = solvesdp([],norm(max([1 1-x 1+x])))
sol.info
ans =
Convexity check failed (Monotonicity required at objective at level 1)
|
The described operators cannot be used in polynomial expressions
in the current implementation. The following problem is trivially convex
but fails.
sdpvar x y
sol = solvesdp([],norm([x;y])^2);
sol.info
ans =
Convexity check failed (Operator in polynomial in objective)
|
Another limitation is that the operators not are allowed in cone
and semidefinite constraints.
sdpvar x y
sol = solvesdp(set(cone(max(x,y,1),2)),x+y);
sol.info
ans =
Convexity propagation failed (YALMIP)
|
In practice, these limitations should not pose a major problem. A
better model is possible (and probably recommended) in most cases if
these situations occur.
Mixed integer models
In some cases when the convexity analysis fails, it is possible
to tell YALMIP to switch from a graph based approach to a mixed
integer model based approach. In other words, if no graph model can
be derived, YALMIP introduces integer variables to model the
operators. This is currently implemented for min,
max, abs and linear norm for real arguments. By default, this feature
is not invoked, but can be activated by sdpsettings('allowmilp',1) .
Consider the following simple example which fails due to the
non-convex use of the convex abs operator
sdpvar x y
F = set(abs(abs(x+1)+3) > y)+set(0<x<3);
sol = solvesdp(F,-y);
sol.info
Convexity check failed (Expected concave function in constraint #1 at level 1)
|
By turning on the mixed integer fall back model, a mixed integer
LP is generated and the problem is easily solved.
sdpvar x y
F = set(abs(abs(x+1)+3) > y)+set(0<x<3);
sol = solvesdp(F,-y,sdpsettings('allowmilp',1));
double([x y])
ans =
3.0000 7.0000
|
If you know that your model is non-convex and will require a
mixed integer model, you can bypass the initial attempt to generate
the graph model by using sdpsettings('allowmilp',2) .
Evaluation based nonlinear operators
YALMIP now also supports experimental support for general
convex/concave functions that cannot be modelled using conic
representations. The main difference when working with these
operators is that the problem always requires a general nonlinear
solver to solved, such as fmincon.
All convexity analysis is still performed tough.
sdpvar x
solvesdp(set(exp(2*x + 1) < 1),-x,sdpsettings('solver','fmincon'));
double(x)
ans =
-0.5000
double(exp(2*x + 1))
ans =
1
|
Note that this feature still is very limited and experimental.
Too see how you can add our own function, see the
example for scalar entropy.
As a word of caution, note that fmincon
performs pretty bad on problems with functions that aren't defined
everywhere, such as logarithms. Hence, solving problem involving
these functions can easily lead to problems. It is highly
recommended to at least provide a feasible solution, or even better,
to use the inverse operator to formulate the problem. Consider the
following trivial example of finding the analytic center of a unit
cube centered at the point (3,3,3)
x = sdpvar(3,1);
p = [1-(x-3);(x-3)+1]
% Not recommended
solvesdp([],-sum(log(p)));
% Better
assign(x,[3.1;3.2;3.3]);
solvesdp([],-sum(log(p)),sdpsettings('usex0',1));
% Best (well, adding initials on x and t would be even better)
t = sdpvar(3,1);
solvesdp(exp(t) < p ,-sum(t));
|
Behind the scene
If you want to look at the model that YALMIP generates, you can use
the two commands model and expandmodel . Please
note that these expanded models never should be used manually. The commands
described below should only be used for illustrating the process that
goes on behind the scenes.
With the command model , the epi- or hypograph model
of the variable is returned. As an example, to model the maximum of
two scalars x and y, YALMIP generates two linear inequalities.
sdpvar x y
t = max([x y]);
F = model(t)
++++++++++++++++++++++++++++++++++++++++++++
| ID| Constraint| Type|
++++++++++++++++++++++++++++++++++++++++++++
| #1| Numeric value| Element-wise 1x2|
++++++++++++++++++++++++++++++++++++++++++++
sdisplay(sdpvar(F(1)))
ans =
'-x+t' '-y+t'
|
For more advanced models with recursively used nonlinear operators,
the function model will not generate the complete model
since it does not recursively expand the arguments. For this case, use
the command expandmodel . This command takes two arguments,
a set of constraints and an objective function. To expand an expression,
just let the expression take the position as the objective function.
Note that the command assumes that the expansion is performed in order
to prove a convex function, hence if you expression is meant to be concave,
you need to negate it. To illustrate this, let us expand the objective
function in an extension of the geometric mean example above.
A = randn(15,2);
b = rand(15,1)*5;
x = sdpvar(2,1);
expandmodel([],-geomean([b-A*x;min(x)]))
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
| ID| Constraint| Type|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
| #1| Numeric value| Element-wise 2x1|
| #2| Numeric value| Second order cone constraint 3x1|
| #3| Numeric value| Second order cone constraint 3x1|
| #4| Numeric value| Second order cone constraint 3x1|
| #5| Numeric value| Second order cone constraint 3x1|
| #6| Numeric value| Second order cone constraint 3x1|
| #7| Numeric value| Second order cone constraint 3x1|
| #8| Numeric value| Second order cone constraint 3x1|
| #9| Numeric value| Second order cone constraint 3x1|
| #10| Numeric value| Second order cone constraint 3x1|
| #11| Numeric value| Second order cone constraint 3x1|
| #12| Numeric value| Second order cone constraint 3x1|
| #13| Numeric value| Second order cone constraint 3x1|
| #14| Numeric value| Second order cone constraint 3x1|
| #15| Numeric value| Second order cone constraint 3x1|
| #16| Numeric value| Second order cone constraint 3x1|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
The result is two linear inequalities related to the min operator
and 15 second order cone constraints used for the conic representation
of the geometric mean.
Adding new operators
If you want to add your own operator, all you need to do is to create
1 file. This file should be able to return the numerical value of the
operator for a numerical input, and return the epigraph (or hypograph)
and a descriptive structure of the operator when the first input is
'graph' . As an example, the following file implements the
nonlinear operator tracenorm. This convex operator returns
sum(svd(X)) for matrices X. This value can also be described
as the minimizing argument of the optimization problem mint,A,B
t subject to set([A X;X' B] > 0) + set(trace(A)+trace(B) < 2*t).
function varargout = tracenorm(varargin)
switch class(varargin{1})
case 'double' % What is the numerical value (needed for displays etc)
varargout{1} = sum(svd(varargin{1}));
case 'char' % YALMIP send 'graph' when it wants the epigraph or hypograph
switch varargin{1}
case 'graph'
t = varargin{2}; % 2nd arg is the extended operator variable
X = varargin{3}; % 3rd arg and above are args user used when defining t.
A = sdpvar(size(X,1));
B = sdpvar(size(X,2));
F = set([A X;X' B] > 0) + set(trace(A)+trace(B) < 2*t);
% Return epigraph model
varargout{1} = F;
% a description
properties.convexity = 'convex'; % convex | none | concave
properties.monotonicity = 'none'; % increasing | none | decreasing
properties.definiteness = 'positive'; % negative | none | positive
varargout{2} = properties;
% and the argument
varargout{3} = X;
case 'milp'
varargout{1} = [];
varargout{2} = [];
varargout{3} = [];
otherwise
error('Something is very wrong now...')
end
case 'sdpvar' % Always the same.
varargout{1} = yalmip('addextendedvariable',mfilename,varargin{:});
otherwise
end
|
The function sumk.m in YALMIP is implemented using this
framework and might serve as an additional fairly simple example. The
overloaded operator norm.m is also defined using this method,
but is a bit more involved, since it supports different norms. Note
that we with a slight abuse of notation use the terms increasing and
decreasing instead of nondecreasing and nonincreasing.
Adding new operators with mixed
integer models
If the convexity analysis fails, it is possible to have fall back
alternative models based on integer variables. If the operator is
called with the flag milp , a mixed integer exact model
can be
returned. As an illustration, here is a stripped down version of the
epigraph and MILP model of the absolute value of a real scalar.
function varargout = scalarrealabs(varargin)
switch class(varargin{1})
case 'double'
varargout{1} = abs(varargin{1});
case 'char' % YALMIP send 'graph' when it wants the epigraph or hypograph
switch varargin{1}
case 'graph'
t = varargin{2};
X = varargin{3};
varargout{1} = set(-t <= X <= t);
properties.convexity = 'convex';
properties.monotonicity = 'none';
properties.definiteness = 'positive';
varargout{2} = properties;
varargout{3} = X;
case 'milp'
t = varargin{2};
X = varargin{3};
d = binvar(1,1); % d=1 means x>0, d=0 means x<0
F = set([]);
M = 1e4; % Big-M constant
F = F + set(x >= -M*(1-d)) % d=1 means x >= 0
F = F + set(x <= M*d) % d=0 means x <= 0
F = F + set(-M*(1-d) <= t-x <= M*(1-d); % d=1 means t = X
F = F + set(-M*d <= t+x <= M*d; % d=0 means t = -X
varargout{1} = F;
properties.convexity = 'milp';
properties.monotonicity = 'milp';
properties.definiteness = 'milp';
varargout{2} = properties;
varargout{3} = X;
otherwise
error('Something is very wrong now...')
end
case 'sdpvar' % Always the same.
varargout{1} = yalmip('addextendedvariable',mfilename,varargin{:});
otherwise
end
|
MILP models are most often based on so called big-M models. For
these methods to work well, it is important to have as small
constants M as possible, but in the code above, we just picked a
number. For the MILP models defined by default in YALMIP (min,
max, abs and linear norms), more effort is spent on
choosing the
constants. To learn more about how you can do this for your model,
please check out the code for these models.
Adding evaluation based nonlinear
operators
General convex and concave functions are support in YALMIP by the
evaluation based nonlinear operator framework. The definition of
these operators are almost identical to the definition of standard
nonlinear operators. The following code implements a (simplified
version) of a scalar entropy measure -xlog(x).
function varargout = entropy(varargin)
switch class(varargin{1})
case 'double' % What is the numerical value of this argument (needed for displays etc)
varargout{1} = -varargin{1}*log(varargin{1});
case 'sdpvar' % Overloaded operator for SDPVAR objects.
varargout{1} = yalmip('addEvalVariable',mfilename,varargin{1});
case 'char' % YALMIP sends 'graph' when it wants the epigraph, hypograph or domain definition
switch varargin{1}
case 'graph'
t = varargin{2};
X = varargin{3};
% This is different from standard extended operators.
% Just do it!
F = SetupEvaluationVariable(varargin{:});
% Now add your own code, typically domain constraints
F = F + set(X > 0);
% Let YALMIP know about convexity etc
varargout{1} = F;
varargout{2} = struct('convexity','concave','monotonicity','none','definiteness','none');
varargout{3} = X;
case 'milp' % No MILP model available for entropy
varargout{1} = [];
varargout{2} = [];
varargout{3} = [];
otherwise
error('ENTROPY called with CHAR argument?');
end
otherwise
error('ENTROPY called with invalid argument.');
end
|
The evaluation based framework is primarily intended for scalar
functions, but can be extended to support element-wise vector functions. See the
implementation of the overloaded log operator for details.
|