In this post, we’ll talk about the trace parameterization of nonnegative Hermitian trigonometric polynomials, providing a proof which depends on the Kalman-Yakubovich-Popov lemma. This follows very closely along the lines of Chapter 2, Section 2.5 of Dumitrescu’s book “Positive Trigonometric Polynomials and Signal Processing Applications” (Springer, 2007).
The trace parameterization of a Hermitian trigonometric polynomial which is nonnegative on the unit circle is an analog of the Riesz-Fejér theorem: it connects the convex set of trigonometric polynomials which are nonnegative on the unit circle to the set of positive semidefinite matrices, much as the Riesz-Fejér theorem connects them to causal polynomials. For this reason, the trace parameterization is often useful in computational settings, as it allows one to bring the methods of semidefinite programming to bear on a problem involving these polynomials. We sketch an equivalent formulation of the trace parameterization, derived instead from the control-theoretic Kalman-Yakubovich-Popov lemma. This reveals a surprising connection between Hermitian trigonometric polynomials which are nonnegative on the unit circle and positive real transfer functions of linear time-invariant control systems with linear time-invariant feedback.
Basic Harmonic Analysis
Let us first establish some of the relevant harmonic analysis terminology, notation, and definitions before proceeding. First, let us define the general object with which we will concern ourselves.
A specific subset of trigonometric polynomials will be the primary item on which we focus in this post.
Letting , the final condition above then says that . This allows us to see the symmetry which makes these polynomials particularly interesting, namely:
It can be shown that Hermitian trigonometric polynomials are real-valued as well. For with , we may write the complex coefficients of as . This then allows us to see the following holds for values on the unit circle:
So, it is indeed the case that Hermitian trigonometric polynomials are real-valued.
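As a quick sanity check, the cancellation above is easy to verify numerically. This is a minimal sketch assuming the convention R(z) = sum_{k=-n}^{n} r_k z^{-k} with r_{-k} = conj(r_k) (as in Dumitrescu's book); the specific coefficients are invented for illustration.

```python
import numpy as np

# Invented coefficients for a degree-2 Hermitian trigonometric polynomial;
# only r_0, r_1, r_2 are stored, since r_{-k} = conj(r_k).  r_0 must be real.
r = {0: 3.0 + 0j, 1: 1.0 - 0.5j, 2: -0.25 + 2j}

def R(z):
    """Evaluate R(z) = r_0 + sum_{k>=1} (r_k z^{-k} + conj(r_k) z^{k})."""
    val = r[0]
    for k in (1, 2):
        val += r[k] * z**(-k) + np.conj(r[k]) * z**k
    return val

# On the unit circle z = e^{iw}, each pair r_k z^{-k} + conj(r_k) z^{k} is
# 2 Re(r_k e^{-ikw}), so the imaginary parts cancel and R is real-valued.
for w in np.linspace(0, 2 * np.pi, 7):
    z = np.exp(1j * w)
    assert abs(R(z).imag) < 1e-12
```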
A portion of a Hermitian trigonometric polynomial is, itself, a member of another important class of polynomials. In fact, this class of polynomials allows the polynomials we have seen above to be written in a very nice way.
That is, causal polynomials are simply polynomials with no terms for .
Notice then that the causal part of as given above (which we denote by ) is simply
As another notational item, for a polynomial , let us write .
Now, with this notation in mind, note that — as for a Hermitian trigonometric polynomial — we may observe
So, Hermitian trigonometric polynomials are necessarily the sum of two causal polynomials. Later, we will see that they are also products of causal polynomials, demonstrating that these two classes of polynomials are indeed intimately linked.
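The decomposition into causal pieces can also be checked numerically. This is a sketch under assumed conventions: the causal part is taken as Rc(z) = r_0/2 + sum_{k>=1} r_k z^{-k} (conventions for splitting r_0 vary), and on the unit circle the conjugated causal part reduces to conj(Rc(z)); the coefficients are invented.

```python
import numpy as np

# Invented coefficients; r_{-k} = conj(r_k) is implicit, r_0 is real.
r = {0: 4.0 + 0j, 1: 1.0 + 1j, 2: 0.5 - 0.25j}

def Rc(z):
    """Assumed causal part: half of r_0 plus the strictly causal terms."""
    return r[0] / 2 + sum(r[k] * z**(-k) for k in (1, 2))

def R(z):
    return r[0] + sum(r[k] * z**(-k) + np.conj(r[k]) * z**k for k in (1, 2))

# On |z| = 1 we have 1/conj(z) = z, so the conjugated causal part evaluates
# to conj(Rc(z)), and the two causal pieces sum back to R.
z = np.exp(1j * 0.7)
assert abs(R(z) - (Rc(z) + np.conj(Rc(z)))) < 1e-12
```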
Basic Control Theory
Control theory broadly concerns itself with control-parameterized dynamical systems of ODEs of the form
where denotes and is a sufficiently smooth vector field (which generally depends on the context of the problem). We say that is the state vector, which is point-wise in , and is the control vector, which is point-wise in . Both the state vector and the control vector can be thought of as bounded, continuous vector fields as well (for our purposes, at least), and we generally consider dynamics which are time-invariant (i.e. autonomous) and finite-dimensional.
Considering a fixed initial control and some initial value problem , if the correct assumptions are made (i.e. those of Picard-Lindelöf or the like), we can (in theory) find a solution . This is essentially the main problem addressed by the theory of ODEs. Control theory, very generally, is concerned with the reverse problem: given a desired trajectory , we seek a control vector field such that is a solution to the IVP.
Frequently, we will also concern ourselves with systems that have feedback and controls which may depend on that feedback. Intuitively, feedback can be thought of as a measurement of part of the state of the system — think of the cruise control in a car, which accelerates or decelerates in response to the measured speed of the vehicle and the control presently being applied. More precisely, let us make the following definitions:
We will consider a particularly nice class of control systems called linear, time-invariant systems, which we will now define.
For these systems, we will be particularly concerned with two concepts that are essentially dual (and are dual in general, not just in the linear case): controllability and observability. Before we see this link, let us first build up to each term's respective definition. First, we will need the notion of the reachable set.
That is, the reachable set at is everywhere you can steer the system to beginning at in a finite amount of time. Of particular interest is the case when we can steer the system anywhere we would like, which leads to the concept of controllability.
Essentially, the internal state of a controllable system can be moved from any initial state to any final state in finite time — it is both an intuitive notion and a rather strong condition!
The notion of observability can be seen in a relatively concrete fashion as well. Broadly, observability is a measure of how much of the internal state of a system can be accurately inferred from its output. Think of a nice thermostat — if you set the temperature to a certain level, the system will keep the room at that level by turning the heat on and off in response to its internal thermometer, which (barring any strange lacunae in the air flow of the room) precisely yields a current measurement of the internal state of the system. More precisely, let us make the following definition for a simple class of systems with which we will concern ourselves:
In more colloquial terms, this says that the behavior of the entire system — no matter what fashion it has been steered in previously — can be determined at any time, and in finite time, by the output of the system alone. The link between observability and controllability can be seen rather naturally in this context: Controllability allows one to steer the internal state of a system anywhere one wishes in a finite amount of time irrespective of the starting point, while observability allows one to determine the internal state of a system in a finite amount of time irrespective of the previous steering.
For the case of linear, time-invariant systems, controllability and observability have particularly nice characterizations which make this duality quite clear. The results can be summarized by the following theorems:
This result follows from the fact that the range of the Kalman matrix is the reachable set from , i.e. . While it can be shown without much work that , we must construct the so-called controllability Gramian of to achieve our desired equality. This is outside the scope of our interests here, so we will omit the proof at present.
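A small numerical illustration of the rank test may still be helpful. The sketch below builds the Kalman matrix K = [B, AB, ..., A^{n-1}B] for an invented two-dimensional system and checks that it has full rank.

```python
import numpy as np

# Invented system matrices for illustration.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
B = np.array([[0.0],
              [1.0]])
n = A.shape[0]

# Stack B, AB, A^2 B, ... column-wise to form the Kalman matrix.
K = np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])

# Full rank is equivalent to controllability of the pair (A, B).
assert np.linalg.matrix_rank(K) == n
```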
Now, we will see that a similar condition exists for observability.
Why this condition is necessary is perhaps less clear than the controllability criterion, and has to do with placing the eigenvalues of in the left half of the complex plane by a suitable choice of in order to make the estimation error of a so-called Luenberger observer go to zero as . We will not delve into this here, as it is somewhat lengthy. The basic gist is that, if the rows of the observability matrix are linearly independent, then each of the states of the system can be obtained through linear combinations of the output variable .
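As with controllability, the observability rank test is easy to illustrate numerically, along with the duality between the observability matrix of (A, C) and the Kalman matrix of the transposed pair; the system matrices are invented.

```python
import numpy as np

# Invented system matrices for illustration.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
C = np.array([[1.0, 0.0]])
n = A.shape[0]

# Stack C, CA, CA^2, ... row-wise to form the observability matrix.
O = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(n)])
assert np.linalg.matrix_rank(O) == n          # (A, C) is observable

# Duality: the observability matrix of (A, C) is the transpose of the
# Kalman (controllability) matrix of (A^T, C^T).
K_dual = np.hstack([np.linalg.matrix_power(A.T, i) @ C.T for i in range(n)])
assert np.allclose(O, K_dual.T)
```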
Finally, we must define the notion of a transfer function. In our context, a transfer function is simply a representation of the relation between the input and the output of a linear time-invariant system in terms of the temporal frequency. More precisely, in our case, the transfer function is the linear mapping of the Laplace transform of the input to the Laplace transform of the output. Let us now demonstrate what we mean.
Let denote the usual Laplace transform of a function (where is used here to represent a complex variable instead of the standard purely by historical convention).
Taking the Laplace transform of yields
Substituting this into the output, we obtain
Thus, if is the transfer function, it is a linear mapping of to , and we arrive at the following definition:
This, combined with the rest of the terminology given here, turns out (surprisingly) to be enough to give an alternate derivation of the trace parameterization of Hermitian trigonometric polynomials.
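Before moving on, the transfer function formula G(s) = C(sI - A)^{-1}B + D can be sanity-checked numerically. The sketch below evaluates the transfer function of an invented system in companion form, which realizes G(s) = 1/(s^2 + 3s + 2).

```python
import numpy as np

# Invented system in controllable companion form; its transfer function is
# G(s) = 1 / (s^2 + 3s + 2).
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

def G(s):
    """Evaluate G(s) = C (sI - A)^{-1} B + D at a complex point s."""
    n = A.shape[0]
    return (C @ np.linalg.solve(s * np.eye(n) - A, B) + D)[0, 0]

# Compare the state-space evaluation against the known rational form.
s = 1.0 + 2.0j
assert abs(G(s) - 1.0 / (s**2 + 3 * s + 2)) < 1e-12
```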
Trace Parameterization: A Harmonic Analysis Perspective
Let us begin with some notation. For shorthand, we will denote the vector containing the canonical basis elements for a polynomial of degree by
Noting that , a familiarity with basic linear algebra allows us to see that the action of a vector of dimension (i.e. a linear functional) on generates a polynomial. That is, if where for each , then a new polynomial is generated by the following action:
The standard proof of the Riesz-Fejér theorem uses this observation precisely to characterize nonnegative Hermitian trigonometric polynomials. The theorem is stated as follows:
Then, for a nonnegative Hermitian trigonometric polynomial, it is useful to note here that if (that is, if is restricted to the unit circle), the following holds:
That is, all nonnegative Hermitian trigonometric polynomials are the squared moduli of causal polynomials.
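This factorization is straightforward to verify in coefficient form: the coefficients of the nonnegative polynomial are the autocorrelation of the causal polynomial's coefficients. A sketch, assuming the convention p(z) = sum_j p_j z^{-j} and invented coefficients for p:

```python
import numpy as np

# Invented causal polynomial coefficients p_0, p_1, p_2.
p = np.array([1.0 + 0j, 0.5 - 0.5j, -0.25 + 0.1j])
n = len(p) - 1

# Autocorrelation coefficients r_k = sum_j p_{j+k} conj(p_j), k = 0..n.
r = [np.sum(p[k:] * np.conj(p[:len(p) - k])) for k in range(n + 1)]

def R(w):
    """R(e^{iw}) = r_0 + 2 Re sum_{k>=1} r_k e^{-ikw}."""
    val = r[0].real
    val += 2 * np.real(sum(r[k] * np.exp(-1j * k * w) for k in range(1, n + 1)))
    return val

# R agrees with |p(e^{iw})|^2 everywhere on the circle, hence is nonnegative.
for w in np.linspace(0, 2 * np.pi, 9):
    pw = np.sum(p * np.exp(-1j * np.arange(n + 1) * w))
    assert abs(R(w) - abs(pw)**2) < 1e-12
```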
We may also note that, as , we then have that
So, a characterization of all nonnegative Hermitian trigonometric polynomials is given by a choice of their respective causal polynomials. However, in some important computational circumstances, this characterization may not encode the coefficients in a particularly convenient way. In those circumstances, we may wish to consider a parameterization which relies on a particularly nice case of the observation — namely, when . This leads to a definition:
As suggested by the notation , a Gram matrix associated with a given Hermitian trigonometric polynomial need not be unique.
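Non-uniqueness is easy to exhibit concretely. Assuming the pairing R(e^{iw}) = psi(w)^H Q psi(w) with psi(w) = [1, e^{iw}, ..., e^{inw}]^T, any Hermitian perturbation S whose diagonals each sum to zero gives a second Gram matrix Q + S for the same polynomial; Q and S below are invented.

```python
import numpy as np

n = 2
# An invented Gram matrix Q.
Q = np.array([[2.0, 0.5, 0.25],
              [0.5, 2.0, 0.5],
              [0.25, 0.5, 2.0]], dtype=complex)

# S is Hermitian and each of its diagonals sums to zero, so it contributes
# nothing to any coefficient of the polynomial.
S = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, -1.0],
              [0.0, -1.0, 0.0]], dtype=complex)

def evaluate(G, w):
    """psi(w)^H G psi(w) for psi(w) = [1, e^{iw}, ..., e^{inw}]^T."""
    psi = np.exp(1j * w * np.arange(n + 1))
    return (np.conj(psi) @ G @ psi).real

# Q and Q + S are distinct matrices but define the same polynomial.
for w in np.linspace(0, 2 * np.pi, 7):
    assert abs(evaluate(Q, w) - evaluate(Q + S, w)) < 1e-12
```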
Following the definition of a Gram matrix, we may now provide a theorem yielding the so-called trace parameterization of a Hermitian trigonometric polynomial , which allows for a different method of encoding the coefficients .
First, let us observe that the following holds:
Note now that
Combining, this yields:
which does indeed yield the desired form of the coefficients .
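The trace parameterization r_k = tr(Theta_k Q) can also be checked numerically. Conventions vary as to which diagonal Theta_k selects and which power of z pairs with r_k, so the sketch below fixes one consistent choice and cross-checks it against direct evaluation of psi^H Q psi; the Gram matrix is invented.

```python
import numpy as np

n = 2
# An invented Hermitian Gram matrix Q.
Q = np.array([[2.0, 0.5 + 0.5j, 0.25],
              [0.5 - 0.5j, 2.0, 0.5 + 0.5j],
              [0.25, 0.5 - 0.5j, 2.0]])

def theta(k):
    """Elementary Toeplitz matrix: ones on the k-th superdiagonal."""
    return np.eye(n + 1, k=k)

# Trace parameterization: r_k = tr(Theta_k Q) picks out the sum of the
# k-th subdiagonal of Q.
r = [np.trace(theta(k) @ Q) for k in range(n + 1)]

def R(w):
    return r[0].real + 2 * np.real(sum(r[k] * np.exp(-1j * k * w)
                                       for k in range(1, n + 1)))

# Cross-check against direct evaluation with psi(w) = [1, e^{iw}, e^{2iw}]^T.
for w in np.linspace(0, 2 * np.pi, 7):
    psi = np.exp(1j * w * np.arange(n + 1))
    assert abs(R(w) - (np.conj(psi) @ Q @ psi).real) < 1e-10
```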
A keen observer might now conjecture that there may be a connection between the non-negativity of and the properties of a given Gram matrix associated with . This is indeed the case, and can be summarized by the following theorem (whose proof we omit, as this is not our central focus):
In essence, the characterizations of trigonometric polynomials which are nonnegative on the unit circle given by the trace parameterization and the Riesz-Fejér theorem yield a link between those polynomials and positive semidefinite matrices. This is particularly useful in a computational setting, as both of these sets are convex (a fact which can be easily checked, but which we will not demonstrate here). As convexity is what makes optimization over these sets tractable, this sheds light on the earlier comment that the trace parameterization may be a more useful way to encode these polynomials in some contexts.
Trace Parameterization: A Control Theory Perspective
Now, we will show that the trace parameterization of Hermitian trigonometric polynomials can, in fact, be derived from a control-theoretic perspective. This relies on the so-called Kalman-Yakubovich-Popov lemma (sometimes called the positive real lemma). A specific, nicer case of the lemma — which is important enough that we will think of it as a theorem here — can be stated as follows:
To use this, let us recall that the causal part of a Hermitian trigonometric polynomial which is nonnegative on the unit circle is positive real. Let be such a polynomial. We will now place its causal part into a linear time-invariant control system with a controllable pair and an observable pair. That is, we realize a controllable and observable linear time-invariant system such that . How this is done is slightly involved (we utilize the so-called Rosenbrock system matrix to realize our desired transfer function), but for our purposes it may be achieved by setting
Then, for some positive definite matrix , the matrix in the Kalman-Yakubovich-Popov Lemma for Observable Systems is positive definite and becomes
In fact, this turns out to be a Gram matrix of , which can be easily verified by simply computing .
Now, we may also note that
for . But this then yields that
which is precisely the trace parameterization of Hermitian trigonometric polynomials that we derived previously!
- B. Dumitrescu, Positive Trigonometric Polynomials and Signal Processing Applications. Springer (2007).
- R. Lozano, B. Brogliato, O. Egeland, B. Maschke, Dissipative Systems Analysis and Control: Theory and Applications. Springer (2000).
- F. Jafari, Topics in Harmonic Analysis. University of Wyoming, Lecture Notes, Spring 2017 (currently unpublished).
- D. Meyer, Nonlinear Trajectory Generation and Control: Course Notes from an Introduction. University of Wyoming, Lecture notes Spring 2017 (currently unpublished).