A paper on global optimization of a Lipschitz continuous function. Algorithms are provided. Useful when evaluation of the function is expensive.

Definition of Lipschitz continuity: a function $f : X \to Y$ is Lipschitz continuous with constant $k \ge 0$ if

$$d_Y(f(x), f(x')) \le k \, d_X(x, x') \quad \text{for all } x, x' \in X,$$

where $d_X$ and $d_Y$ are metrics on $X$ and $Y$ respectively. For example, $f : \mathbb{R}^d \to \mathbb{R}$ is Lipschitz continuous if

$$|f(x) - f(x')| \le k \, \|x - x'\| \quad \text{for all } x, x' \in \mathbb{R}^d.$$
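
As a quick numerical illustration (not from the paper), the Lipschitz constant can be bounded from below by the largest slope observed between sample points; a minimal Python sketch:

```python
import math

def pairwise_slope_bound(f, xs):
    """Largest |f(x) - f(x')| / |x - x'| over all sample pairs.

    This is an empirical lower bound on the Lipschitz constant of f."""
    ys = [f(x) for x in xs]
    best = 0.0
    for i in range(len(xs)):
        for j in range(i):
            best = max(best, abs(ys[i] - ys[j]) / abs(xs[i] - xs[j]))
    return best

# sin has Lipschitz constant 1 on the reals (its slopes are |cos| <= 1),
# so the empirical bound approaches 1 from below on a fine grid.
xs = [i / 100 for i in range(-300, 301)]
k_hat = pairwise_slope_bound(math.sin, xs)
```

The maximum observed slope can only underestimate the true constant; the paper's adaptive algorithm rounds this quantity up onto a fixed grid of candidate constants.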

## Set up

- $f$ defined on a compact convex set $X \subset \mathbb{R}^d$
- Global optimization: find $\max_{x \in X} f(x)$
- Search algorithm: produces a sequence $X_1, X_2, \ldots$, where the selection of $X_{t+1}$ depends only on $(X_1, f(X_1)), \ldots, (X_t, f(X_t))$
- Performance metric of an algorithm: speed of convergence of $\max_{i=1,\ldots,n} f(X_i)$ to $\max_{x \in X} f(x)$
- Function has a finite Lipschitz constant: $\inf \{ k \ge 0 : |f(x) - f(x')| \le k \|x - x'\|\ \forall x, x' \in X \} < \infty$

## Notations

- $X \subset \mathbb{R}^d$, i.e. $d$ is the dimension
- Standard Euclidean $\ell_2$-norm is used: $\|x\| = \sqrt{x \cdot x}$
- Ball centered at $x$ with radius $r$: $B(x, r) = \{x' \in \mathbb{R}^d : \|x' - x\| \le r\}$
- Inner radius of $X$: $\mathrm{rad}(X) = \max\{r > 0 : \exists x \in X,\ B(x, r) \subseteq X\}$
- Diameter of $X$: $\mathrm{diam}(X) = \max_{x, x' \in X} \|x - x'\|$
- Volume $\mu(X)$ of $X$ defined by the Lebesgue measure $\mu$
- Set of $k$-Lipschitz functions defined on $X$: $\mathrm{Lip}(k) = \{f : X \to \mathbb{R} \mid |f(x) - f(x')| \le k \|x - x'\|\ \forall x, x' \in X\}$
- Set of all Lipschitz continuous functions: $\mathrm{Lip} = \bigcup_{k \ge 0} \mathrm{Lip}(k)$
- Uniform distribution over a measurable set $A \subseteq X$: $\mathcal{U}(A)$

## Definitions and propositions

Optimization consistency of algorithm $A$, over a set of functions $F$, which generates $X_1, X_2, \ldots$: for every $f \in F$,

$$\max_{i=1,\ldots,n} f(X_i) \xrightarrow{\;p\;} \max_{x \in X} f(x) \quad \text{as } n \to \infty.$$

Necessary and sufficient condition for consistency over the set of Lipschitz functions (which also guarantees approaching a global maximizer): the evaluation points become dense in $X$, i.e.,

$$\sup_{x \in X} \min_{i=1,\ldots,n} \|X_i - x\| \xrightarrow{\;p\;} 0 \quad \text{as } n \to \infty.$$

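A small numerical illustration (not from the paper) of this density condition: the covering radius $\sup_{x} \min_i \|X_i - x\|$ of i.i.d. uniform points on $[0,1]$ shrinks as the sample grows, here approximated over a fine grid:

```python
import random

def covering_radius(points, grid):
    """Approximate sup_x min_i |X_i - x| by maximizing over a fine grid."""
    return max(min(abs(g - p) for p in points) for g in grid)

rng = random.Random(0)
grid = [i / 1000 for i in range(1001)]       # fine grid on [0, 1]
pts = [rng.random() for _ in range(500)]     # i.i.d. uniform points
r_20 = covering_radius(pts[:20], grid)
r_500 = covering_radius(pts, grid)
# Adding points can only shrink the covering radius, so r_500 <= r_20,
# and it tends to 0 as the sample grows (the density condition).
```
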
Convergence of pure random search: for any $f \in \mathrm{Lip}(k)$, with probability no less than $1 - \delta$,

$$\max_{x \in X} f(x) - \max_{i=1,\ldots,n} f(X_i) \le k \cdot \mathrm{diam}(X) \cdot \left( \frac{\ln(1/\delta)}{n} \right)^{1/d}.$$

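A minimal sketch of pure random search on a box domain (the domain and objective below are illustrative assumptions, not from the paper):

```python
import random

def pure_random_search(f, lo, hi, n, seed=0):
    """Evaluate f at n i.i.d. uniform points in the box [lo, hi]^d
    and return the best point found together with its value."""
    rng = random.Random(seed)
    d = len(lo)
    best_x, best_y = None, float("-inf")
    for _ in range(n):
        x = [rng.uniform(lo[j], hi[j]) for j in range(d)]
        y = f(x)
        if y > best_y:
            best_x, best_y = x, y
    return best_x, best_y

# Example: maximize a concave quadratic on [-1, 1]^2 (true max 1 at the origin).
f = lambda x: 1.0 - sum(v * v for v in x)
x_star, y_star = pure_random_search(f, [-1.0, -1.0], [1.0, 1.0], 2000)
```
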
Minimax rate of sequential algorithms over $k$-Lipschitz functions according to Bull (2011):

$$c_1 \, k \, n^{-1/d} \;\le\; \inf_{A} \sup_{f \in \mathrm{Lip}(k)} \mathbb{E} \left[ \max_{x \in X} f(x) - \max_{i=1,\ldots,n} f(X_i) \right] \;\le\; c_2 \, k \, n^{-1/d},$$

where $c_1, c_2 > 0$ are constants depending only on the domain $X$ (through quantities such as $\mathrm{rad}(X)$ and $\mathrm{diam}(X)$), and $\mathbb{E}$ is the expectation over the evaluation points $X_1, \ldots, X_n$. This suggests a convergence rate of order $k \, n^{-1/d}$.

## Algorithms

LIPO algorithm: for a function $f$ whose Lipschitz constant $k$ is known

- Input: $n \in \mathbb{N}$, $k \ge 0$, $X \subset \mathbb{R}^d$, $f \in \mathrm{Lip}(k)$
- Procedure
    - Let $X_1 \sim \mathcal{U}(X)$
    - Set $t = 1$, evaluate $f(X_1)$
    - While $t < n$:
        - Let $X_{t+1} \sim \mathcal{U}(X)$
        - If $\min_{i=1,\ldots,t} \left( f(X_i) + k \|X_{t+1} - X_i\| \right) \ge \max_{i=1,\ldots,t} f(X_i)$:
            - Evaluate $f(X_{t+1})$
            - Update $t \leftarrow t + 1$

- Output: $X_{\hat\imath}$ where $\hat\imath \in \arg\max_{i=1,\ldots,n} f(X_i)$

The if-statement inside the while loop is the decision rule that accepts only those $X_{t+1}$ that have the potential to maximize $f$. The RHS is the largest value encountered so far, while the LHS is the tightest upper bound on $f(X_{t+1})$ implied by the Lipschitz condition. A lemma in the paper shows that a point $x$ can be a maximizer of some function in $\mathrm{Lip}(k)$ consistent with the evaluations so far iff the inequality is satisfied.
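
The procedure above can be sketched directly in Python; the box domain, objective, and budget below are illustrative assumptions, with candidates drawn by simple rejection:

```python
import random

def lipo(f, k, lo, hi, n, seed=0):
    """LIPO on the box [lo, hi]^d for f with known Lipschitz constant k.

    A uniform candidate is evaluated only when its Lipschitz upper bound
    min_i f(X_i) + k * ||x - X_i|| reaches the best value seen so far."""
    rng = random.Random(seed)
    d = len(lo)

    def draw():
        return [rng.uniform(lo[j], hi[j]) for j in range(d)]

    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5

    xs = [draw()]
    ys = [f(xs[0])]
    while len(xs) < n:
        x = draw()
        upper = min(y + k * dist(x, xi) for xi, y in zip(xs, ys))
        if upper >= max(ys):  # decision rule: x is still a potential maximizer
            xs.append(x)
            ys.append(f(x))
    best = max(range(n), key=lambda i: ys[i])
    return xs[best], ys[best]

# Illustrative objective: f(x) = 1 - |x1| - |x2| has Lipschitz constant
# sqrt(2) in the Euclidean norm, so k = 2 is a valid (loose) choice.
f = lambda x: 1.0 - abs(x[0]) - abs(x[1])
x_star, y_star = lipo(f, 2.0, [-1.0, -1.0], [1.0, 1.0], 50)
```

Any $k$ at least as large as the true constant keeps the upper bound valid; a looser $k$ simply rejects fewer candidates.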

Adaptive LIPO (AdaLIPO) for an unknown Lipschitz constant:

- Input: $n \in \mathbb{N}$, $p \in (0, 1)$, a grid of constants $\{k_i\}_{i \in \mathbb{Z}}$ with $k_i = (1+\alpha)^i$, $X \subset \mathbb{R}^d$, $f \in \mathrm{Lip}$
- Procedure
    - Let $X_1 \sim \mathcal{U}(X)$
    - Set $\hat{k}_1 = 0$ and $t = 1$, evaluate $f(X_1)$
    - While $t < n$:
        - Draw a Bernoulli random variable $B_{t+1}$ with parameter $p$
        - If $B_{t+1} = 1$, it is the exploration step: $X_{t+1} \sim \mathcal{U}(X)$
        - If $B_{t+1} = 0$, it is the exploitation step: $X_{t+1} \sim \mathcal{U}(\mathcal{X}_{\hat{k}_t, t})$
        - Evaluate $f(X_{t+1})$
        - Update $\hat{k}_{t+1} = \inf \left\{ k_i : k_i \ge \max_{j \ne l} \frac{|f(X_j) - f(X_l)|}{\|X_j - X_l\|} \right\}$
        - Update $t \leftarrow t + 1$

- Output: $X_{\hat\imath}$ where $\hat\imath \in \arg\max_{i=1,\ldots,n} f(X_i)$

The Lipschitz constant estimate at step $t$ is $\hat{k}_t = \inf\{k_i : k_i \ge \max_{j \ne l} |f(X_j) - f(X_l)| / \|X_j - X_l\|\}$, i.e., the smallest grid value dominating every observed slope. The set $\mathcal{X}_{k,t}$ is the set of potential maximizers, i.e., those $x$ that satisfy the inequality above:

$$\mathcal{X}_{k,t} = \left\{ x \in X : \min_{i=1,\ldots,t} \left( f(X_i) + k \|x - X_i\| \right) \ge \max_{i=1,\ldots,t} f(X_i) \right\}$$

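A corresponding sketch of AdaLIPO under the same illustrative assumptions (box domain, grid $k_i = (1+\alpha)^i$); exploitation draws from the set of potential maximizers by rejection sampling, with a pragmatic fallback to a plain uniform draw after too many rejections, which is a deviation from the paper's idealized sampler:

```python
import math
import random

def adalipo(f, lo, hi, n, p=0.5, alpha=0.1, max_tries=2000, seed=0):
    """AdaLIPO sketch on the box [lo, hi]^d with grid k_i = (1 + alpha)^i.

    Explores uniformly with probability p; otherwise rejection-samples from
    the set of potential maximizers under the current estimate k_hat."""
    rng = random.Random(seed)
    d = len(lo)

    def draw():
        return [rng.uniform(lo[j], hi[j]) for j in range(d)]

    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5

    xs = [draw()]
    ys = [f(xs[0])]
    k_hat = 0.0
    while len(xs) < n:
        x = draw()
        if rng.random() >= p:               # exploitation step
            for _ in range(max_tries):
                if min(y + k_hat * dist(x, xi)
                       for xi, y in zip(xs, ys)) >= max(ys):
                    break                   # x is a potential maximizer
                x = draw()                  # else reject and redraw
        xs.append(x)
        ys.append(f(x))
        # Smallest grid value (1 + alpha)^i dominating every observed slope.
        s = max(abs(ys[i] - ys[j]) / dist(xs[i], xs[j])
                for i in range(len(xs)) for j in range(i))
        k_hat = (1 + alpha) ** math.ceil(math.log(s, 1 + alpha)) if s > 0 else 0.0
    best = max(range(n), key=lambda i: ys[i])
    return xs[best], ys[best]

f = lambda x: 1.0 - abs(x[0]) - abs(x[1])
x_star, y_star = adalipo(f, [-1.0, -1.0], [1.0, 1.0], 50)
```

Recomputing every pairwise slope at each step is $O(t^2)$; an incremental implementation would only compare the newest point against the previous ones.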
## Note

A naive implementation would not be any faster than Monte Carlo search if the inequality has to be evaluated for every candidate draw. A more efficient approach would be to modify the random sampler to draw exactly from the set of potential maximizers. How to do this depends on the shape of the function domain.


## Bibliographic data

```
@inproceedings{malherbe2017global,
title = "Global Optimization of Lipschitz Functions",
author = "Cedric Malherbe and Nicolas Vayatis",
year = "2017",
booktitle = "Proceedings of the 34th International Conference on Machine Learning",
address = "Sydney, Australia",
}
```