# Cost function features and VQE¶

This notebook showcases the more advanced features of our cost functions using the example of VQE. Since QAOA can be thought of as a special case of VQE, everything said here applies also to the QAOA cost functions, unless otherwise explicitly mentioned.

As with all the Demo Notebooks, you need to start the Simulator and the
Quil Compiler in the background by typing `qvm -S`

and `quilc -S`

in
two open and disposable terminal windows.

## Short Intro: The Variational Quantum Eigensolver (VQE)¶

We begin with a short introduction to the VQE to establish nomenclature. The aim of VQE is to find the ground state and/or ground state energy of a given cost hamiltonian \(\hat{H}_\mathrm{cost}\). To do so, one prepares a trial state \(\left| \psi (\vec{\gamma})\right> = \hat{U}(\vec{\gamma}) \left| 0 \right>\) by applying a parametric program \(\hat{U}(\vec{\gamma})\) to the initial state \(\left| 0 \right>\), and then measures its energy expectation value with respect to the cost Hamiltonian, \(\left<\hat{H}_\mathrm{cost}\right>(\vec{\gamma}) = \left< \psi(\vec{\gamma}) \right|\hat{H}\left| \psi(\vec{\gamma})\right>\). This expectation value is then minimized by optimizing the parameters \(\vec{\gamma}\) in a classical optimizer until a minimum of \(\left<\hat{H}_\mathrm{cost}\right>(\vec{\gamma})\) is found for a parameter set \(\vec{\gamma}^*\). The lowest energy eigenstate of \(\hat{H_\mathrm{cost}}\) can now be prepared by applying \(\hat{U}(\vec{\gamma}^*)\) to \(\left| 0 \right>\), and its energy is given by \(E_0 = \left< \psi(\vec{\gamma}^*) \right|\hat{H}_\mathrm{cost}\left| \psi(\vec{\gamma}^*)\right>\)

Now it should also be clear, that QAOA can be considered as a special case of VQE where the Ansatz is fixed to be of the form

with the free parameters \(\vec{\beta}, \vec{\gamma}\).

Before we begin, let us first import all neccesary libraries:

```
# The usual combination of scipy, numpy and matplotlib
from scipy.optimize import minimize
import numpy as np
import matplotlib.pyplot as plt
# The pyquil dependencies
from pyquil.paulis import PauliSum, PauliTerm
from pyquil.api import WavefunctionSimulator, get_qc
from pyquil.quil import Program
from pyquil.gates import RY, H
from pyquil.unitary_tools import lifted_pauli
# A finally the cost functions
from entropica_qaoa.vqe.cost_function import (PrepareAndMeasureOnWFSim,
PrepareAndMeasureOnQVM)
from entropica_qaoa.qaoa.cost_function import QAOACostFunctionOnWFSim
from entropica_qaoa.utilities import pauli_matrix
# And one our QAOA parameter classes
from entropica_qaoa.qaoa.parameters import StandardParams, FourierParams
import warnings
warnings.filterwarnings('ignore')
```

## Setting up the problem¶

We start by creating a cost hamiltonian `hamiltonian`

and state
preparation program `prepare_ansatz`

. For demonstration purposes we
will use the simplest possible problem for VQE: the hamiltonian is the
bit-flip operator \(X\) (Pauli operator \(\sigma_X\)) on a
single qubit, and the parametric program consists of a single
\(R_y\)-rotation. The parameter \(\gamma\) is the rotation angle
of this rotation.

**Note**

Besides an `RY`

gate, we also need to add a `declare`

instruction to
`prepare_ansatz`

to declare a classical memory register. Later the
rotation angle `gamma`

will be written (by the user/optimiser) and
read (by the WavefunctionSimulaor/QVM) from this register. This design
allows our VQE and QAOA cost functions to make use of Quil’s parametric
compilation,
and we don’t have to recompile the program every time we update the
parameters.

```
# create the cost hamiltonian
hamiltonian = PauliSum([PauliTerm("X", 0)])
# and the parametric state preparation program:
prepare_ansatz = Program() # builds an empty program
params = prepare_ansatz.declare("params", # add a classical register to store the values in
memory_type="REAL", memory_size=1)
#prepare_ansatz.inst(H(0))
prepare_ansatz.inst(RY(params[0], 0))
print("The hamiltonian\n"
"---------------\n", hamiltonian)
print("\nThe program\n"
"-----------\n", prepare_ansatz)
```

```
The hamiltonian
---------------
(1+0j)*X0
The program
-----------
DECLARE params REAL[1]
RY(params[0]) 0
```

Next we can create a cost function to be passed to a classical optimizer
later. We do this using `vqe.cost_function.PrepareAndMeasureOnWFSim`

,
a class that combines a cost hamiltonian `hamiltonian`

and a state
preparation circuit `prepare_ansatz`

. This cost function can
subsequently be passed to any classical optimizer - here we will use
methods available in `scipy.optimize.minimize`

.

```
# create the cost_function with our ansatz and hamiltonian:
cost_fun = PrepareAndMeasureOnWFSim(prepare_ansatz=prepare_ansatz,
make_memory_map=lambda p: {"params": p},
hamiltonian=hamiltonian)
```

With the cost function set up, let us have a look at it graphically:

```
gammas = np.linspace(-3,3,200)
exp_vals = np.empty_like(gammas)
for i, v in enumerate(gammas):
exp_vals[i] = cost_fun([v])
plt.plot(gammas, exp_vals)
plt.xlabel(r"$\gamma$")
plt.ylabel("Cost function")
plt.show()
```

We can also find the minimal function value and argument:

```
# the initial argument
gamma0 = [0]
# and minimization
out = minimize(cost_fun, gamma0, method="Cobyla")
print(out)
```

```
fun: -0.9999999954355244
maxcv: 0.0
message: 'Optimization terminated successfully.'
nfev: 24
status: 1
success: True
x: array([-1.57070078])
```

**Aside: Interpretation for the interested reader**

The plot above can be interpreted as follows. The qubit starts in the state \(|0\rangle\), and the operator \(RY(\gamma)\) rotates its Bloch vector about the Y-axis through an angle \(\gamma\). For \(\gamma = 0\) the qubit remains in \(|0\rangle\), which is an eigenstate of the Pauli \(Z\) operator, and therefore the expectation value of the cost function \(X\) is zero. For \(\gamma = \pm\pi\), we flip the qubit to the state \(|1\rangle\), which is again an eigenstate of the \(Z\) operator, and thus has zero expectation value in the \(x\)-basis. When \(\gamma = \pm \pi/2\), we create the superposition states \(|\pm\rangle = (|0\rangle \pm |1\rangle)/\sqrt{2}\), which are eigenstates of the \(X\) operator, and we therefore find the maximum and minimum values of the cost function, \(\pm 1\).

The result `out`

should now contain the minimal eigenvalue of
`hamiltonian`

as the minimum function value, and the correct
parameters for `prepare_ansatz`

to prepare the corresponding
eigenstate. We can compare this with the real minimum eigenvalue, by
printing `hamiltonian`

as a matrix:

```
print("The output of scipy.optimize.minimize:\n",out)
print("\n And the eigenvalues of the hamiltonian:\n", np.linalg.eigvalsh(pauli_matrix(hamiltonian)))
```

```
The output of scipy.optimize.minimize:
fun: -0.9999999954355244
maxcv: 0.0
message: 'Optimization terminated successfully.'
nfev: 24
status: 1
success: True
x: array([-1.57070078])
And the eigenvalues of the hamiltonian:
[-1. 1.]
```

This looks good - it seems we found the ground state almost exactly. On a real quantum computer, however, we can’t measure the expectation value of a state directly. Instead, we have to take multiple samples, and calculate the mean of the resulting output values. The more samples we take from the output state, the closer the sample mean of the ground state energy will come to its true mean. We can think of the effect of taking only a finite number of samples as adding uncertainty, or noise, to the true underlying probability distribution.

In practice, if we have only an estimate of the energy, but not its true value, this will influence the effectiveness of different optimisers. Gradient-based optimisers, for example, work well when we have exact function values, and we can compute gradients through finite differences to determine where to move in the next iteration. In the presence of noisy function value estimates, we may expect such methods to be less effective.

To effciently simulate the effect of this sampling noise, the
`PrepareAndMeasureOnWFSim`

has an argument `nshots`

that we will
demonstrate in the following section.

## Simulating sampling noise¶

To see how this sampling noise influences the behaviour of different
optimisers, we provide the `nshots`

option in the Wavefunction
Simulator cost function `PrepareAndMeasureOnWFSim`

. If we set
`nshots=N`

in either the cost function constructor or the cost
function call, it will return the true expectation value plus simulated
sampling noise with the (approximate) statistics of the true noise at
\(N\) samples. Note the word “approximate” here: to speed up
simulations, we use a Gaussian approximation of the noise, instead of
actually taking \(N\) samples. This works very well for large
\(N\) (where it also gives the largest speed up), but can lead to
unphysical results for small \(N\), such as expectation values lower
than the actual minimum expectation value. More details on the
implementation of this simulated sampling noise are given in the last
section below.

[For clarity, note that if we are using the QVM or QPU and corresponding
cost function `PrepareAndMeasureOnQVM`

, the argument `nshots`

is
compulsory, and the measurement noise is not mimicked using the Gaussian
approximation - it is the true noise obtained by taking the specified
finite number of samples.]

Let us now create a cost function that adds the simulated sampling noise for \(N=1000\) shots, and also \(N = 5\) shots.

```
# create the cost_function with our ansatz and hamiltonian:
cost_fun = PrepareAndMeasureOnWFSim(prepare_ansatz=prepare_ansatz,
make_memory_map=lambda p: {"params": p},
hamiltonian=hamiltonian,
nshots = 1000)
```

```
gammas = np.linspace(-3,2,200)
exp_vals1 = np.empty_like(gammas)
for i, v in enumerate(gammas):
exp_vals1[i] = cost_fun([v])
exp_vals2 = np.empty_like(gammas)
for i, v in enumerate(gammas):
exp_vals2[i] = cost_fun([v], nshots=5) # Note: by passing in nshots here, we override the value of 1000
# from above
```

Now plot the outcomes side by side:

```
plt.figure(figsize=(10,4))
plt.subplot(121)
plt.plot(gammas, exp_vals2, label="cost function")
plt.title("nshots=5")
plt.legend()
plt.xlabel(r"$\gamma$")
plt.subplot(122)
plt.plot(gammas, exp_vals1, label="cost function")
plt.xlabel(r"$\gamma$")
plt.legend()
plt.title("nshots=1000")
plt.tight_layout()
```

In the left panel we see that when `nshots`

is very small, we can
observe unphysical results where the outcome is higher (lower) than the
maximum (minimum) possible energy. This is because we are using a
Gaussian approximation of the real sampling noise, which should only be
valid when the number of shots is sufficiently large. In the right
panel, we see that for a much larger value of `nshots`

, the unphysical
results are suppressed.

## Getting the measurement variance¶

According to the central limit theorem, when `nshots`

=
\(\infty\), the sample mean converges to the true mean (i.e. the
mean obtained directly from the wavefunction), while the standard
deviation of the sample mean tends to zero. For a finite value of
`nshots`

, the sample mean itself has a non-zero standard deviation. We
can access this standard deviation by setting the flag
`scalar_cost_function=False`

in the constructor of the cost functions.
The cost function then returns both the sample mean and its standard
deviation.

As an implementation note, if we specify a sample size of `nshots=0`

,
no noise is added to the mean, and thus the standard deviation is also
set to zero. This case is therefore equivalent to setting `nshots`

=
\(\infty\).

Here is the above code again, with the flag
`scalar_cost_function=False`

:

```
# create the cost_function with our ansatz and hamiltonian:
cost_fun = PrepareAndMeasureOnWFSim(prepare_ansatz=prepare_ansatz,
make_memory_map=lambda p: {"params": p},
hamiltonian=hamiltonian,
nshots = 30,
scalar_cost_function=False)
# get the means and standard deviations
gammas = np.linspace(-3,3,200)
exp_vals = np.empty_like(gammas)
std_devs = np.empty_like(gammas)
for i, v in enumerate(gammas):
exp_vals[i], std_devs[i] = cost_fun([v])
# and plot both
plt.plot(gammas, exp_vals, label="cost function")
plt.fill_between(gammas, exp_vals - std_devs, exp_vals + std_devs,
label=r"$1\sigma$ interval", color='g',alpha=0.2)
plt.xlabel(r"$\gamma$")
plt.legend();
```

## Logging of the optimisation process¶

In the previous sections, we focused on single-shot measurements of the
cost function. We now turn our attention to features that facilitate
understanding of the process of cost function optimisation. For
debugging and benchmarking purposes, it is often interesting to
visualise how the optimisation progresses, and at which parameter values
the cost function is called. We therefore provide the option to create a
log of the cost function calls, `cost_function.log`

.

Since most typical optimisers (e.g. those contained in
`scipy.optimize.minimze`

) expect the objective function to return only
a scalar value (the function value), and not a tuple (i.e. the value and
standard deviation), we cannot use the same methods above to return the
standard deviation; we must set `scalar_cost_function = True`

(since
this is the default value, we need not actually specify it). Instead, we
will be able to access the standard deviation, as well as other
information, through the optimiser log.

To create the optimisation log, we set `enable_logging=True`

.

```
# create the cost_function with our ansatz and hamiltonian:
cost_fun = PrepareAndMeasureOnWFSim(prepare_ansatz=prepare_ansatz,
make_memory_map=lambda p: {"params": p},
hamiltonian=hamiltonian,
nshots=100,
enable_logging=True)
# and find the optimal value
gamma0 = [0]
# and minimization
out = minimize(cost_fun, gamma0, method="Cobyla")
print(out)
```

```
fun: -1.0077099207193003
maxcv: 0.0
message: 'Optimization terminated successfully.'
nfev: 21
status: 1
success: True
x: array([-1.484475])
```

We can examine the log as follows:

```
# extract gamma and function value for each of the function calls in the log
gamma_log = np.array([step.x for step in cost_fun.log])
fun_log = np.array([step.fun for step in cost_fun.log])
fun_log
```

```
array([[-2.32444817e-01, 1.00000000e-01],
[ 8.63425400e-01, 5.40302306e-02],
[-8.76703509e-01, 5.40302306e-02],
[-8.40539162e-01, 4.16146837e-02],
[-1.00781855e+00, 7.07372017e-03],
[-8.87544533e-01, 4.16146837e-02],
[-9.73698167e-01, 1.78246056e-02],
[-9.97341147e-01, 1.94547708e-02],
[-9.99675894e-01, 8.29623162e-04],
[-9.87282974e-01, 1.01869310e-02],
[-1.00573566e+00, 5.51433419e-03],
[-1.00865607e+00, 7.85278933e-03],
[-1.01279134e+00, 8.63137919e-03],
[-9.83984561e-01, 9.40944224e-03],
[-9.96383742e-01, 8.24214714e-03],
[-9.94521526e-01, 8.82594620e-03],
[-9.96615159e-01, 8.53408329e-03],
[-1.00429095e+00, 8.68002406e-03],
[-9.89020365e-01, 8.60705599e-03],
[-1.00707282e+00, 8.64134183e-03],
[-1.00770992e+00, 8.62141647e-03]])
```

Here, in each of the 22 function evaluations, we log the function value and its standard deviation. We can visualise this information, along with the parameter values at each function call:

```
# create an array for the x-axis:
x = np.arange(out["nfev"])
fig, ax = plt.subplots(1,2, figsize=(12,4))
ax[0].plot(x, fun_log[:,0], label="cost_function")
ax[0].fill_between(x, fun_log[:,0] - fun_log[:,1], fun_log[:,0] + fun_log[:,1],
alpha=0.2, color='g',label="standard deviation")
ax[0].legend()
ax[0].set_xlabel("step")
ax[1].plot(x, gamma_log, label=r"$\gamma$")
ax[1].legend()
ax[1].set_xlabel("step");
```

## Running on the QVM or QPU¶

So far, we have run all our experiments on the Wavefunction Simulator.
Eventually, however, we may also want to run them on the the QVM, or
even the real QPU. Since these don’t return a wavefunction, quantities
of interest such as the energy expectation value and its standard
deviation are determined by taking samples from the device. It is
therefore essential to provide a value for the argument `nshots`

, and
we instead use the cost function constructors `PrepareAndMeasureOnQVM`

(for general VQE) and `QAOACostFunctionOnQVM`

(for QAOA). These behave
mostly identically to `PrepareAndMeasureOnWFSim`

and
`QAOACostFunctionOnWFSim`

, with a few differences:

We must pass an argument

`qvm`

, which is either an identification string for a QVM type such as`2q-qvm`

, or a connection to a QVM or QPU (see Rigetti’s docs).There is an additional argument

`base_numshots`

, which acts as a multiplier of`nshots`

. This number is then hard-compiled into the circuit, whereas`nshots`

can be changed dynamically during the optimisation (however, to do so would require writing a custom optimiser). This may be of interest for users working with more sophisticated optimisers. A more detailed explanation can be found in the FAQs section of the documentation.

We now walk through the above example again, only this time running on the QVM. As in Simulating Sampling Noise, we will calculate the cost function twice - once with 5 samples per point, and once with 1000 samples per point.

```
# this time we really need a QVM
qvm = get_qc("2q-qvm")
# sample 5 times
cost_fun = PrepareAndMeasureOnQVM(prepare_ansatz=prepare_ansatz,
make_memory_map=lambda p: {"params": p},
hamiltonian=hamiltonian,
qvm = qvm,
base_numshots = 5,
nshots = 1,
scalar_cost_function=False)
gammas = np.linspace(-3,3,200)
exp_vals1 = np.empty_like(gammas)
std_devs1 = np.empty_like(gammas)
for i, v in enumerate(gammas):
exp_vals1[i], std_devs1[i] = cost_fun([v])
# sample 1000 times
cost_fun = PrepareAndMeasureOnQVM(prepare_ansatz=prepare_ansatz,
make_memory_map=lambda p: {"params": p},
hamiltonian=hamiltonian,
qvm = qvm,
base_numshots = 1000,
nshots = 1,
scalar_cost_function=False)
exp_vals2 = np.empty_like(gammas)
std_devs2 = np.empty_like(gammas)
for i, v in enumerate(gammas):
exp_vals2[i], std_devs2[i] = cost_fun([v])
```

Observe how these computations take appreciably longer than in the section Simulating Sampling Noise above, where we instead took account of the sampling noise using a Gaussian approximation. In the present case, we must actually take 1000 samples per point, which represents a significant computational overhead.

Let us plot the results:

```
plt.figure(figsize=(10,4))
plt.subplot(121)
plt.plot(gammas, exp_vals1, label="cost function")
plt.fill_between(gammas, exp_vals1 - std_devs1, exp_vals1 + std_devs1,
label=r"$1\sigma$ interval", color='g', alpha=0.2)
plt.xlabel(r"$\gamma$")
plt.legend()
plt.title("nshots=5")
plt.subplot(122)
plt.plot(gammas, exp_vals2, label="cost function")
plt.fill_between(gammas, exp_vals2 - std_devs2, exp_vals2 + std_devs2,
label=r"$1\sigma$ interval",color='g', alpha=0.2)
plt.title("nshots=1000")
plt.xlabel(r"$\gamma$")
plt.legend()
plt.tight_layout()
```

Evidently, this time there are no values outside of the range [-1,1], as
would also be the case on a real QPU. Furthermore, the plot in the right
panel for `nshots=5`

looks considerably different from the one earlier
with simulated sampling noise. The left plot for `nshots=1000`

, on the
other hand, looks very similar to the `nshots=1000`

plot with the
simulated sampling noise. This shows again that our simulated sampling
noise works well for larger sample numbers.

## Using other optimisers¶

In the above examples, we have used `scipy.optimize.minimize`

as our
optimiser, with COYBLA as the specific method. Even within this Scipy
package there are several different methods we could choose - for the
full list, see
here.
Alternatively, we could use a completely different optimisation package.

Let’s redo the above example using some different methods, comparing the performance and time taken of each. Here we’ll use the explicit calculation of the expected energy value.

```
import time
# create the cost_function with our ansatz and hamiltonian:
cost_fun = PrepareAndMeasureOnWFSim(prepare_ansatz=prepare_ansatz,
make_memory_map=lambda p: {"params": p},
hamiltonian=hamiltonian,
nshots = 0, # exact calculution of energy
scalar_cost_function = True,
enable_logging = True)
# the initial value
gamma0 = [0]
```

```
methods = ['Nelder-Mead', 'SLSQP', 'Cobyla']
for method in methods:
initial_time = time.time()
# minimization
out = minimize(cost_fun, gamma0, method=method)
final_time = time.time()
total_time = final_time - initial_time
print('Method: ', method)
print('Time taken: ', total_time)
print('Result:')
print(out)
print('\n')
```

```
Method: Nelder-Mead
Time taken: 0.8190350532531738
Result:
final_simplex: (array([[-1.5708125],
[-1.57075 ]]), array([-1., -1.]))
fun: -0.9999999998692137
message: 'Optimization terminated successfully.'
nfev: 52
nit: 26
status: 0
success: True
x: array([-1.5708125])
Method: SLSQP
Time taken: 0.16961216926574707
Result:
fun: -0.9999999984694352
jac: array([-5.53205609e-05])
message: 'Optimization terminated successfully.'
nfev: 10
nit: 3
njev: 3
status: 0
success: True
x: array([-1.57085165])
Method: Cobyla
Time taken: 0.4087040424346924
Result:
fun: -0.9999999954355244
maxcv: 0.0
message: 'Optimization terminated successfully.'
nfev: 24
status: 1
success: True
x: array([-1.57070078])
```

Clearly, all three tested optimisers found the same minimum. However, it took Nelder-Mead 52 function evaluations for convergence, while COBYLA and SLSQP needed only 10 and 24 function evaluations, respectively. The latter two therefore seem to be superior.

However, it is important to note that we set `nshots=0`

, meaning that
the exact expectation value is returned by the cost function. If we had
set `nshots`

to some finite value, SLSQP would have had a much harder
time, because it is a gradient-based algorithm: it decides where to move
in the next step by computing the gradient using finite differences. To
work effectively, this requires exact values of the cost function, which
is not possible when taking samples (unless, of course, an infinite
number of samples is taken).

We can also use a different optimisation library altogether. Here we
demonstrate a Bayesian optimisation function from `scikit-optimize`

(docs here).

**Note that to run the cell below you will need to install
scikit-optimize.**

```
from skopt import gp_minimize
import math
initial_time = time.time()
out = gp_minimize(cost_fun,
n_calls=15,
dimensions=[(0, 2*math.pi)],
x0=gamma0)
final_time = time.time()
total_time = final_time - initial_time
print('Method: ', 'Bayesian Optimisation')
print('Time taken: ', total_time)
print('Result:')
print(out)
print('\n')
```

```
Method: Bayesian Optimisation
Time taken: 3.203672409057617
Result:
fun: -0.999991600695217
func_vals: array([ 0. , 0.8768013 , -0.6238324 , 0.99655989, -0.39643009,
0.33248555, 0.9932913 , 0.2975288 , -0.98315755, -0.4315722 ,
0.84533808, -0.99477806, -0.99997487, -0.9999573 , -0.9999916 ])
models: [GaussianProcessRegressor(alpha=1e-10, copy_X_train=True,
kernel=1**2 * Matern(length_scale=1, nu=2.5) + WhiteKernel(noise_level=1),
n_restarts_optimizer=2, noise='gaussian', normalize_y=True,
optimizer='fmin_l_bfgs_b', random_state=758834708), GaussianProcessRegressor(alpha=1e-10, copy_X_train=True,
kernel=1**2 * Matern(length_scale=1, nu=2.5) + WhiteKernel(noise_level=1),
n_restarts_optimizer=2, noise='gaussian', normalize_y=True,
optimizer='fmin_l_bfgs_b', random_state=758834708), GaussianProcessRegressor(alpha=1e-10, copy_X_train=True,
kernel=1**2 * Matern(length_scale=1, nu=2.5) + WhiteKernel(noise_level=1),
n_restarts_optimizer=2, noise='gaussian', normalize_y=True,
optimizer='fmin_l_bfgs_b', random_state=758834708), GaussianProcessRegressor(alpha=1e-10, copy_X_train=True,
kernel=1**2 * Matern(length_scale=1, nu=2.5) + WhiteKernel(noise_level=1),
n_restarts_optimizer=2, noise='gaussian', normalize_y=True,
optimizer='fmin_l_bfgs_b', random_state=758834708), GaussianProcessRegressor(alpha=1e-10, copy_X_train=True,
kernel=1**2 * Matern(length_scale=1, nu=2.5) + WhiteKernel(noise_level=1),
n_restarts_optimizer=2, noise='gaussian', normalize_y=True,
optimizer='fmin_l_bfgs_b', random_state=758834708)]
random_state: RandomState(MT19937) at 0x7EFDD46975A0
space: Space([Real(low=0, high=6.283185307179586, prior='uniform', transform='normalize')])
specs: {'args': {'func': <entropica_qaoa.vqe.cost_function.PrepareAndMeasureOnWFSim object at 0x7efda5eedc50>, 'dimensions': Space([Real(low=0, high=6.283185307179586, prior='uniform', transform='normalize')]), 'base_estimator': GaussianProcessRegressor(alpha=1e-10, copy_X_train=True,
kernel=1**2 * Matern(length_scale=1, nu=2.5),
n_restarts_optimizer=2, noise='gaussian', normalize_y=True,
optimizer='fmin_l_bfgs_b', random_state=758834708), 'n_calls': 15, 'n_random_starts': 10, 'acq_func': 'gp_hedge', 'acq_optimizer': 'auto', 'x0': [0], 'y0': None, 'random_state': RandomState(MT19937) at 0x7EFDD46975A0, 'verbose': False, 'callback': None, 'n_points': 10000, 'n_restarts_optimizer': 5, 'xi': 0.01, 'kappa': 1.96, 'n_jobs': 1}, 'function': 'base_minimize'}
x: [4.708290366828594]
x_iters: [[0], [2.07242348685137], [5.6095486036037965], [1.6537672709750455], [3.5492177078893405], [0.33893784178429537], [1.6866947139371509], [2.8394894653558724], [4.528595933473349], [3.5878275735781333], [1.0071976821501234], [4.814628850783941], [4.705298919252173], [4.7031472197413935], [4.708290366828594]]
```

The usage of Bayesian optimisation to solve this very simple problem is slightly overkill. However, for VQE or QAOA problems where the cost function evaluations themselves are expensive, and the parameter space itself is not too large, Bayesian optimisation may be worth investigating. A good primer can be found here.

## Towards QAOA¶

A more detailled explanation of our QAOA library can be found in the Notebooks First steps: An example workflow, Working with the Parameter classes, and Advanced QAOA parameter classes. Here we simply explain how it can be regarded as a special case of VQE.

For QAOA - which was originally designed for solving classical optimization problems - the Hamiltonian is diagonal in the computational basis, and typically contains at most 2-qubit terms (there is nothing to prevent one from considering k-qubit terms, however the limitations of near-term hardware make the k = 2 case the most practically feasible).

Let’s set up a simple Hamiltonian.

```
hamiltonian = []
hamiltonian.append(PauliTerm("Z", 0, -1)*PauliTerm("Z", 1))
hamiltonian.append(PauliTerm("Z", 0, 0.8))
hamiltonian.append(PauliTerm("Z", 1, -0.5))
hamiltonian = PauliSum(hamiltonian)
print(hamiltonian)
```

```
(-1+0j)*Z0*Z1 + (0.8+0j)*Z0 + (-0.5+0j)*Z1
```

Since the parameters for a QAOA circuit have more structure than just a
flat array, and there exist multiple possible parametrisations, we
provide special classes to hold the parameters for a QAOA circuit. We
will use the `FourierParams`

class here. We can create these initial
parameters as follows:

```
params = FourierParams.linear_ramp_from_hamiltonian(hamiltonian, n_steps=5, q=3)
```

The QAOA cost function has a fixed structure, with a corresponding fixed
state preparation program. We therefore provide special cost functions
for QAOA, which inherit most of the behaviour from
`vqe.cost_functions.PrepareAndMeasure...`

. They are created via

```
qaoa_cost_fun = QAOACostFunctionOnWFSim(hamiltonian,
params,
nshots=1000,
enable_logging=True)
```

Unlike for `PrepareAndMeasureOnWFSim`

, we didn’t have to pass a state
preparation circuit `prepare_ansatz`

or function to generate memory
maps `make_memory_map`

to `QAOACostFunctionOnWFSim`

. These are
already fixed by the fact that we want to run QAOA with a given cost
Hamiltonian. Instead, we have to pass the QAOA parameters `params`

to
the cost function.

If we want to find the optimal parameters, we have to provide our
optimiser with some initial parameter set. The object `params`

contains information on both the problem hyperparameters, as well as the
variable parameters to be optimised - see Working with the Parameter classes
for further information.

```
params
```

```
Hyperparameters:
register: [0, 1]
qubits_singles[0, 1]
qubits_pairs[[0, 1]]
Parameters:
v: [0.35 0. 0. ]
u: [0.35 0. 0. ]
```

We can obtain a 1D array with all of our variable parameters - here
denoted `u`

and `v`

- using the `params.raw()`

method: we can
subsequently pass these to an optimiser, such as a method from
`scipy.optimize.minimize`

:

```
p0 = params.raw()
out = minimize(qaoa_cost_fun, p0, tol=1e-3, method="Cobyla", options={"maxiter": 500})
print("The output of scipy.optimize.minimize:\n",out)
print("\n And hamiltonian as a matrix:\n", lifted_pauli(hamiltonian, hamiltonian.get_qubits()))
```

```
The output of scipy.optimize.minimize:
fun: -1.0771291890106551
maxcv: 0.0
message: 'Optimization terminated successfully.'
nfev: 44
status: 1
success: True
x: array([ 0.37708678, 0.0304932 , 0.11447838, 0.40621426, -0.07116865,
-0.02260146])
And hamiltonian as a matrix:
[[-0.7+0.j 0. +0.j 0. +0.j 0. +0.j]
[ 0. +0.j -0.3+0.j 0. +0.j 0. +0.j]
[ 0. +0.j 0. +0.j 2.3+0.j 0. +0.j]
[ 0. +0.j 0. +0.j 0. +0.j -1.3+0.j]]
```

Examining the logs this time involves a little extra work. The logging functionality simply appends the array of current parameters (i.e. at any step in the optimisation) to the log. For instance, the 10th log entry reads:

```
qaoa_cost_fun.log[9]
```

```
LogEntry(x=array([ 0.57650078, -0.06356536, -0.05304437, 0.29571643, -0.03284953,
-0.01781406]), fun=(-0.6191671592796331, 0.02776909955792431))
```

The `LogEntry`

array contains the `u`

and `v`

parameters for all
three values of the Fourier parameter `q`

. Meanwhile, the `fun`

entry is of the form (function value, standard deviation).

To disentangle the `u`

and `v`

parameters, we can pipe them through
the `params`

instance again, using the method `.update_from_raw()`

:

```
# logs of the parameter values
u_log = []
v_log = []
for step in qaoa_cost_fun.log:
params.update_from_raw(step.x)
u_log.append(params.u)
v_log.append(params.v)
# create arrays from the lists
u_log = np.array(u_log)
v_log = np.array(v_log)
# log of the function values
fun_log = np.array([step.fun for step in qaoa_cost_fun.log])
# create an array for the x-axis:
x = np.arange(out["nfev"])
```

Now we can plot the information in the log:

```
fig, ax = plt.subplots(1,2, figsize=(12,4))
ax[0].plot(x, fun_log[:,0], label="cost_function(p)")
ax[0].fill_between(x, fun_log[:,0] - fun_log[:,1], fun_log[:,0] + fun_log[:,1], alpha=0.3)
ax[0].legend()
ax[0].set_xlabel("step")
for i in range(3):
ax[1].plot(x, v_log[:,i], label=f"v[{i}]")
for i in range(3):
ax[1].plot(x, u_log[:,i], label=f"u[{i}]")
ax[1].legend()
ax[1].set_xlabel("step");
```

We can also plot the final, optimal parameters alone, with the built-in
`.params.plot()`

function.

Note: In this case we are working with `Fourier`

parameters `u`

and
`v`

, but the actual circuit parameters `betas`

and `gammas`

are
generally those of interest. When we call the `.plot()`

function on a
set of `Fourier`

params, they are automatically converted back to the
`betas`

and `gammas`

.

```
params.update_from_raw(out["x"])
params.plot();
```

## Appendix: Simulated measurement noise - statistics implementation details¶

The attentive observer will have noticed that when we add simulated
measurement noise via the `nshots`

option on the Wavefunction
Simulator, we sometimes find function values below (above) the minimum
(maximum) eigenvalue of the Hamiltonian. As explained above, this is
because we “fake” the sampling noise when using the wavefunction-based
cost functions `PrepareAndMeasureOnWFSim`

and
`QAOACostFunctionOnWFSim`

. We first calculate the true energy
expectation value and variance via

and then return, in accordance with the central limit theorem, this energy expectation value plus appropriately scaled Gaussian noise, along with the standard deviation of the mean:

Now in some examples above, for the purposes of illustrating the
limitations of this method, we used extremely small numbers of shots
\(\leq 10\). For such small values of `nshots`

, the central limit
theorem does not hold, and we get the afore mentioned unphysical results
on occasion. In practice, in a real VQE or QAOA run, one would in any
case take much larger numbers of shots.

On the other hand, the sampling-based cost functions
`PrepareAndMeasureOnQVM`

and `QAOACostFunctionOnQVM`

don’t need to
‘fake’ the sampling noise, and we are guaranteed to get physical
results. This comes at the cost of much slower simulations, since many
random numbers have to be generated.