Parametric Portfolio Policies



In this chapter, we apply different portfolio performance measures to evaluate and compare portfolio allocation strategies. For this purpose, we introduce a direct way to estimate optimal portfolio weights for large-scale cross-sectional applications. More precisely, the approach of Brandt, Santa-Clara, and Valkanov (2009) parametrizes the optimal portfolio weights as a function of stock characteristics instead of estimating each stock’s expected return, variance, and covariances with other stocks in a prior step. We choose the weights, as a function of the characteristics, to maximize the expected utility of the investor. This approach remains feasible for large portfolio dimensions (such as the entire CRSP universe). See the review paper Brandt (2010) for an excellent treatment of related portfolio choice methods.

The current chapter relies on the following set of Python packages:

import pandas as pd
import numpy as np
import sqlite3
import statsmodels.api as sm

from itertools import product, starmap
from scipy.optimize import minimize

Compared to previous chapters, we introduce the scipy.optimize module from the scipy package (Virtanen et al. 2020) for solving optimization problems.

Data Preparation

To get started, we load the monthly CRSP file, which forms our investment universe. We load the data from our SQLite database introduced in the chapters Accessing and Managing Financial Data and WRDS, CRSP, and Compustat.

tidy_finance = sqlite3.connect(

crsp_monthly = (pd.read_sql_query(
  sql=("SELECT permno, month, ret_excess, mktcap, mktcap_lag " 
       "FROM crsp_monthly"),

To evaluate the performance of portfolios, we further use monthly market returns as a benchmark to compute CAPM alphas.

factors_ff_monthly = (pd.read_sql_query(
  sql="SELECT month, mkt_excess FROM factors_ff3_monthly",

Next, we retrieve some stock characteristics that have been shown to have an effect on the expected returns or expected variances (or even higher moments) of the return distribution. In particular, we record the lagged one-year return momentum (momentum_lag), defined as the compounded return between months \(t-13\) and \(t-2\) for each firm. In finance, momentum is the empirically observed tendency for rising asset prices to rise further, and falling prices to keep falling (Jegadeesh and Titman 1993). The second characteristic is the firm’s market equity (size_lag), defined as the log of the price per share times the number of shares outstanding (Banz 1981). To construct the correct lagged values, we use the approach introduced in WRDS, CRSP, and Compustat.

crsp_monthly_lags = (crsp_monthly
    month=lambda x: x["month"] + pd.DateOffset(months=13)
  .get(["permno", "month", "mktcap"])

crsp_monthly = (crsp_monthly
         on=["permno", "month"],
         suffixes=["", "_13"])

data_portfolios = (crsp_monthly
    momentum_lag=lambda x: x["mktcap_lag"] / x["mktcap_13"],
    size_lag=lambda x: np.log(x["mktcap_lag"])
  .dropna(subset=["momentum_lag", "size_lag"])
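
The lag construction can be checked on a small example. The sketch below does not use CRSP data: it builds a hypothetical single stock (`toy`, with made-up market caps growing 1 percent per month), shifts the dates forward by 13 months, merges, and forms the two characteristics in the same way as the pipeline above. All names starting with `toy` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one stock, 15 months, market cap growing 1% per month.
months = pd.date_range("2020-01-01", periods=15, freq="MS")
toy = pd.DataFrame({
    "permno": 1,
    "month": months,
    "mktcap": 100 * 1.01 ** np.arange(15),
})
toy["mktcap_lag"] = toy.groupby("permno")["mktcap"].shift(1)

# Shift dates forward by 13 months so that the market cap observed in t-13
# lines up with month t after the merge.
toy_lags = (toy
  .assign(month=lambda x: x["month"] + pd.DateOffset(months=13))
  .get(["permno", "month", "mktcap"])
)

toy_merged = (toy
  .merge(toy_lags, how="inner", on=["permno", "month"], suffixes=("", "_13"))
  .assign(
    momentum_lag=lambda x: x["mktcap_lag"] / x["mktcap_13"],
    size_lag=lambda x: np.log(x["mktcap_lag"])
  )
)
```

Only the last two months have a full 13-month history in this toy sample, so `toy_merged` keeps just those rows, and `momentum_lag` equals twelve months of compounded growth in market capitalization.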

Parametric Portfolio Policies

The basic idea of parametric portfolio weights is as follows. Suppose that at each date \(t\) we have \(N_t\) stocks in the investment universe, where each stock \(i\) has a return of \(r_{i, t+1}\) and is associated with a vector of firm characteristics \(x_{i, t}\) such as momentum or the market capitalization. The investor’s problem is to choose portfolio weights \(\omega_{i,t}\) to maximize the expected utility of the portfolio return: \[\begin{aligned} \max_{\omega} E_t\left(u(r_{p, t+1})\right) = E_t\left[u\left(\sum\limits_{i=1}^{N_t}\omega_{i,t}r_{i,t+1}\right)\right] \end{aligned}\] where \(u(\cdot)\) denotes the utility function.

Where do the stock characteristics show up? We parameterize the optimal portfolio weights as a function of the stock characteristics \(x_{i,t}\) with the following linear specification for the portfolio weights: \[\omega_{i,t} = \bar{\omega}_{i,t} + \frac{1}{N_t}\theta'\hat{x}_{i,t},\] where \(\bar{\omega}_{i,t}\) is a stock’s weight in a benchmark portfolio (we use the value-weighted or naive portfolio in the application below), \(\theta\) is a vector of coefficients which we are going to estimate, and \(\hat{x}_{i,t}\) are the characteristics of stock \(i\), cross-sectionally standardized to have zero mean and unit standard deviation.

Intuitively, the portfolio strategy is a form of active portfolio management relative to a performance benchmark. Deviations from the benchmark portfolio are derived from the individual stock characteristics. Note that by construction the weights sum up to one as \(\sum_{i=1}^{N_t}\hat{x}_{i,t} = 0\) due to the standardization. Moreover, the coefficients are constant across assets and over time. The implicit assumption is that the characteristics fully capture all aspects of the joint distribution of returns that are relevant for forming optimal portfolios.
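
The claim that the tilted weights still sum to one can be verified numerically. The following is a minimal sketch for a single cross-section, assuming five hypothetical stocks with randomly drawn characteristics, an equal-weighted benchmark, and an arbitrary coefficient vector; none of these values come from the CRSP sample.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stocks = 5

# Hypothetical raw characteristics (two columns, e.g., momentum and size).
x = rng.normal(size=(n_stocks, 2))

# Cross-sectional standardization: zero mean, unit standard deviation.
x_hat = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)

theta = np.array([1.5, -1.0])              # arbitrary coefficients
w_bench = np.full(n_stocks, 1 / n_stocks)  # naive benchmark weights

# Linear policy: benchmark weight plus characteristic tilt.
w = w_bench + x_hat @ theta / n_stocks
```

Because every column of `x_hat` sums to zero, the tilts cancel in the aggregate and `w.sum()` equals one up to floating-point error, whatever value `theta` takes.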

We first implement cross-sectional standardization for the entire CRSP universe. We also keep track of (lagged) relative market capitalization relative_mktcap, which will represent the value-weighted benchmark portfolio, while n denotes the number of traded assets \(N_t\), which we use to construct the naive portfolio benchmark.

data_portfolios = (data_portfolios
  .groupby("month", group_keys=True)
  .apply(lambda x: x.assign(
    relative_mktcap=x["mktcap_lag"] / x["mktcap_lag"].sum()
  .transform(lambda x: (x - x.mean()) / x.std()
                       if x.name.endswith("lag") else x)
  .drop(["mktcap_lag"], axis=1)

Computing Portfolio Weights

Next, we move on to identify optimal choices of \(\theta\). We rewrite the optimization problem together with the weight parametrization and can then estimate \(\theta\) to maximize the objective function based on our sample \[\begin{aligned} E_t\left(u(r_{p, t+1})\right) = \frac{1}{T}\sum\limits_{t=0}^{T-1}u\left(\sum\limits_{i=1}^{N_t}\left(\bar{\omega}_{i,t} + \frac{1}{N_t}\theta'\hat{x}_{i,t}\right)r_{i,t+1}\right). \end{aligned}\] The allocation strategy is straightforward because the number of parameters to estimate is small. Instead of a tedious specification of the \(N_t\) dimensional vector of expected returns and the \(N_t(N_t+1)/2\) free elements of the covariance matrix, all we need to focus on in our application is the vector \(\theta\). In our application, \(\theta\) contains only two elements: the relative deviations from the benchmark due to size and momentum.
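
To make the sample objective concrete, the sketch below evaluates it on simulated data. It assumes a balanced toy panel (the same number of stocks in every period), an equal-weighted benchmark, and an illustrative power utility; the helper names `sample_objective` and `toy_utility` are hypothetical and not part of the chapter's code.

```python
import numpy as np

def sample_objective(theta, x_hat, returns, w_bench, utility):
    """Average utility of the tilted portfolio over T periods.

    Shapes (balanced toy panel): x_hat (T, N, K), returns (T, N),
    w_bench (T, N), theta (K,).
    """
    n_stocks = returns.shape[1]
    weights = w_bench + x_hat @ theta / n_stocks         # (T, N)
    portfolio_returns = (weights * returns).sum(axis=1)  # (T,)
    return utility(portfolio_returns).mean()

def toy_utility(r, gamma=5):
    # Illustrative power utility; any increasing, concave u would do here.
    return (1 + r) ** (1 - gamma) / (1 - gamma)

rng = np.random.default_rng(1)
n_periods, n_stocks, n_chars = 24, 10, 2
x = rng.normal(size=(n_periods, n_stocks, n_chars))
# Standardize each characteristic within each period (cross-section).
x_hat = ((x - x.mean(axis=1, keepdims=True))
         / x.std(axis=1, ddof=1, keepdims=True))
returns = rng.normal(0.01, 0.05, size=(n_periods, n_stocks))
w_bench = np.full((n_periods, n_stocks), 1 / n_stocks)

value_at_zero = sample_objective(
    np.zeros(n_chars), x_hat, returns, w_bench, toy_utility)
value_at_guess = sample_objective(
    np.array([1.5, 1.5]), x_hat, returns, w_bench, toy_utility)
```

With `theta` set to zero the tilt vanishes, so `value_at_zero` is simply the average utility of the benchmark portfolio. This gives a natural reference point for any other choice of `theta`.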

To get a feeling for the performance of such an allocation strategy, we start with an arbitrary initial vector \(\theta_0\). The next step is to choose \(\theta\) optimally to maximize the objective function. We automatically detect the number of parameters by counting the number of columns with lagged values.

lag_columns = [
  i for i in data_portfolios.columns if "lag" in i
]
n_parameters = len(lag_columns)
theta = pd.DataFrame(
  {"theta": [1.5] * n_parameters},
  index=lag_columns
)

The function compute_portfolio_weights() below computes the portfolio weights \(\bar{\omega}_{i,t} + \frac{1}{N_t}\theta'\hat{x}_{i,t}\) according to our parametrization for a given value \(\theta_0\). Everything happens within a single pipeline. Hence, we provide a short walk-through.

We first compute characteristic_tilt, the tilting values \(\frac{1}{N_t}\theta'\hat{x}_{i, t}\) which represent the deviation from the benchmark portfolio. Next, we compute the benchmark portfolio weight_benchmark, which can be any reasonable set of portfolio weights. In our case, we choose either the value- or equal-weighted allocation. weight_tilt completes the picture and contains the final portfolio weights weight_tilt = weight_benchmark + characteristic_tilt which deviate from the benchmark portfolio depending on the stock characteristics.

The final few lines go a bit further and implement a simple version of a no-short-sale constraint. While it is generally not straightforward to ensure portfolio weight constraints via parameterization, we simply set negative weights to zero and then rescale such that the remaining weights sum up to one again: \[\omega_{i,t}^+ = \frac{\max(0, \omega_{i,t})}{\sum_{j=1}^{N_t}\max(0, \omega_{j,t})}.\]
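
As a quick illustration of this truncation and rescaling step, consider three hypothetical weights, one of which is negative. The numbers are made up for this sketch.

```python
import numpy as np

# Hypothetical tilted weights for one month; the third is a short position.
weights = np.array([0.5, 0.7, -0.2])

# Set negative weights to zero ...
weights_positive = np.maximum(0, weights)

# ... and rescale so that the remaining weights sum up to one again.
weights_no_short = weights_positive / weights_positive.sum()
```

The short position is dropped and the two long positions are scaled down proportionally, from 0.5 and 0.7 to 0.5/1.2 and 0.7/1.2.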

The following function computes the optimal portfolio weights in the way just described.

def compute_portfolio_weights(theta, data,

    lag_columns = [i for i in data.columns if "lag" in i]
    theta = pd.DataFrame(theta, index=lag_columns)

    data = (data
        .groupby("month", group_keys=True)
        .apply(lambda x: x.assign(
          weight_benchmark=lambda x: x["relative_mktcap"]
                if value_weighting else 1 / x.shape[0],
                weight_tilt=lambda x: x["weight_benchmark"]
                + x["characteristic_tilt"]

    if not allow_short_selling:
        data = (data
            weight_tilt=lambda x: np.maximum(0, x["weight_tilt"])

    # Normalize
    data = (data
            .groupby("month", group_keys=True)
            .apply(lambda x: x.assign(weight_tilt=lambda x: x["weight_tilt"]
                                      / x["weight_tilt"].sum()))

    return data

In the next step, we compute the portfolio weights for the arbitrary vector \(\theta_0\). In the example below, we use the value-weighted portfolio as a benchmark and allow negative portfolio weights.

weights_crsp = compute_portfolio_weights(
  theta, data_portfolios, value_weighting=True, allow_short_selling=True)

Portfolio Performance

Are the computed weights optimal in any way? Most likely not, as we picked \(\theta_0\) arbitrarily. To evaluate the performance of an allocation strategy, one can think of many different approaches. In their original paper, Brandt, Santa-Clara, and Valkanov (2009) focus on a simple evaluation of the hypothetical utility of an agent equipped with a power utility function \(u_\gamma(r) = \frac{(1 + r)^{(1-\gamma)}}{1-\gamma}\), where \(\gamma \neq 1\) is the coefficient of relative risk aversion.

def power_utility(r, gamma=5):
    return ((1 + r) ** (1 - gamma)) / (1 - gamma)
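
To see what this utility function rewards, the sketch below compares two hypothetical return series with the same average return but different volatility. The function is repeated (under the assumed name `power_utility_example`) so that the snippet runs on its own; the return values are made up.

```python
import numpy as np

def power_utility_example(r, gamma=5):
    # Same functional form as power_utility() above, repeated for a
    # self-contained example.
    return ((1 + r) ** (1 - gamma)) / (1 - gamma)

# Two hypothetical return series with an identical mean of 1 percent.
r_stable = np.array([0.01, 0.01])
r_volatile = np.array([0.11, -0.09])

utility_stable = np.mean(power_utility_example(r_stable))
utility_volatile = np.mean(power_utility_example(r_volatile))
```

Since the function is concave, the volatile series yields a lower average utility than the stable one, even though both earn the same mean return. This is how risk aversion enters the objective.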

We want to note that Gehrig, Sögner, and Westerkamp (2020) warn that, in the leading case of constant relative risk aversion (CRRA), strong assumptions on the properties of the returns, the variables used to implement the parametric portfolio policy, and the parameter space are necessary to obtain a well-defined optimization problem.

No doubt, there are many other ways to evaluate a portfolio. The function below provides a summary of all kinds of interesting measures that can be considered relevant. Do we need all these evaluation measures? It depends: the original paper Brandt, Santa-Clara, and Valkanov (2009) only cares about the expected utility to choose \(\theta\). However, if you want to choose optimal values that achieve the highest performance while putting some constraints on your portfolio weights, it is helpful to have everything in one function.

def evaluate_portfolio(weights_data,

    evaluation = (weights_data
        .groupby("month", group_keys=True)
        .apply(lambda x:
               pd.Series(np.average(x[["ret_excess", "ret_excess"]],
                         ["return_tilt", "return_benchmark"]))
              value_vars=["return_tilt", "return_benchmark"],
        .assign(model=lambda x: x["model"].str.replace("return_", ""))

    evaluation_stats = (evaluation
        .aggregate([("Expected utility", lambda x: np.mean(power_utility(x))),
                    ("Average return", lambda x: np.mean(length_year*x)*100),
                    ("SD return", lambda x: np.std(x) *
                    ("Sharpe ratio", lambda x: np.mean(x)/np.std(x) *

    if capm_evaluation:
        evaluation_capm = (evaluation
            .merge(factors_ff_monthly, how="left", on="month")
            .groupby("model", group_keys=True)
            .apply(lambda x: sm.OLS(x["portfolio_return"],
            .rename(columns={"const": "CAPM alpha",
                             "mkt_excess": "Market beta"})
        evaluation_stats = (evaluation_stats
          .merge(evaluation_capm, how="left", on="model")

    if full_evaluation:
        evaluation_weights = (weights_data
                  value_vars=["weight_benchmark", "weight_tilt"],
             .groupby(["model", "month"])["weight"]
               [("Mean Absolute weight", lambda x: np.mean(abs(x))),
               ("Max. weight", lambda x: max(x)),
               ("Min. weight", lambda x: min(x)),
               ("Avg. sum of negative weights",
               lambda x: -np.sum(x[x < 0])),
               ("Avg. fraction of negative weights",
               lambda x: np.mean(x < 0))]
             .aggregate(lambda x: np.average(x) * 100)
             .assign(model=lambda x: x["model"].str.replace("weight_", ""))
        evaluation_stats = (evaluation_stats
          .merge(evaluation_weights, how="left", on="model")
    evaluation_stats = (evaluation_stats            

    return evaluation_stats
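
The return-based measures can be illustrated on simulated data. The sketch below is not the function above: it assumes monthly returns (`length_year = 12`), annualizes the standard deviation and Sharpe ratio with the square root of `length_year`, and estimates the CAPM regression with NumPy least squares instead of statsmodels. The simulated portfolio is constructed to have an alpha of 0.001 and a beta of 1.2.

```python
import numpy as np

length_year = 12  # assumption: monthly observations

rng = np.random.default_rng(42)
mkt_excess = rng.normal(0.005, 0.04, size=240)
# Hypothetical portfolio with known alpha and beta (no idiosyncratic noise).
portfolio_return = 0.001 + 1.2 * mkt_excess

# Annualized summary statistics, in percent where applicable.
average_return = np.mean(length_year * portfolio_return) * 100
sd_return = np.std(portfolio_return) * np.sqrt(length_year) * 100
sharpe_ratio = (np.mean(portfolio_return) / np.std(portfolio_return)
                * np.sqrt(length_year))

# CAPM regression: portfolio_return = alpha + beta * mkt_excess + error.
X = np.column_stack([np.ones_like(mkt_excess), mkt_excess])
(capm_alpha, market_beta), *_ = np.linalg.lstsq(
    X, portfolio_return, rcond=None)
```

By construction the regression recovers the alpha and beta used in the simulation, and the Sharpe ratio equals the ratio of the annualized average return to the annualized standard deviation.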

Let us take a look at the different portfolio strategies and evaluation measures.

                                   benchmark       tilt
Expected utility                      -0.250     -0.261
Average return                         6.638      0.259
SD return                             15.440     21.112
Sharpe ratio                           0.430      0.012
CAPM alpha                             0.000     -0.005
Market beta                            0.992      0.952
Mean Absolute weight                   0.030      0.077
Max. weight                            4.054      4.218
Min. weight                            0.000     -0.174
Avg. sum of negative weights           0.000     78.061
Avg. fraction of negative weights      0.000     49.074

The value-weighted portfolio delivers an annualized return of more than 6 percent and clearly outperforms the tilted portfolio, irrespective of whether we evaluate expected utility, the Sharpe ratio, or the CAPM alpha. The market beta is close to one for both strategies (naturally almost identical to 1 for the value-weighted benchmark portfolio). When it comes to the distribution of the portfolio weights, we see that the benchmark portfolio takes less extreme positions (lower mean absolute weight and lower maximum weight). By definition, the value-weighted benchmark does not take any negative positions, while the tilted portfolio also takes short positions.

Optimal Parameter Choice

Next, we move to a choice of \(\theta\) that actually aims to improve some (or all) of the performance measures. We first define a helper function objective_function(), which we then pass to an optimizer.

def objective_function(theta,
                       objective_measure="Expected utility",

    processed_data = compute_portfolio_weights(
        theta, data, value_weighting, allow_short_selling)

    objective_function = evaluate_portfolio(

    objective_function = -objective_function.loc[objective_measure, "tilt"]

    return objective_function

You may wonder why we return the negative value of the objective function. This is simply due to the common convention for optimization procedures to search for minima as a default. By minimizing the negative value of the objective function, we get the maximum value as a result. In its most basic form, optimization in Python relies on the function minimize() from scipy.optimize. As main inputs, the function requires an initial guess of the parameters and the objective function to minimize. Now, we are fully equipped to compute the optimal values of \(\hat\theta\), which maximize the hypothetical expected utility of the investor.

optimal_theta = minimize(
  x0=[1.5] * n_parameters,
  args=(data_portfolios, "Expected utility", True, True),

  columns=["Optimal Theta"],
  index=["momentum_lag", "size_lag"]).T.round(3)
               momentum_lag  size_lag
Optimal Theta         0.361    -1.823

The resulting values of \(\hat\theta\) are easy to interpret: intuitively, expected utility increases by tilting weights from the value-weighted portfolio toward smaller stocks (negative coefficient for size) and toward past winners (positive value for momentum). Both findings are in line with the well-documented size effect (Banz 1981) and the momentum anomaly (Jegadeesh and Titman 1993).
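
The sign convention used with the optimizer can be illustrated in isolation. The sketch below maximizes a hypothetical concave function, chosen so that its maximum lies at (0.5, -1.0), by minimizing its negative with `scipy.optimize.minimize()`. The function `toy_objective` is an assumption for illustration and is unrelated to the portfolio objective.

```python
import numpy as np
from scipy.optimize import minimize

def toy_objective(theta):
    # Hypothetical concave function with its maximum at theta = (0.5, -1.0).
    return -(theta[0] - 0.5) ** 2 - (theta[1] + 1.0) ** 2

def negative_toy_objective(theta):
    # minimize() searches for minima, so we flip the sign.
    return -toy_objective(theta)

result = minimize(fun=negative_toy_objective, x0=[1.5, 1.5])
theta_hat = result.x
```

The minimizer of the negated function, `theta_hat`, is the maximizer of the original one. The same pattern carries over to any performance measure that should be as large as possible.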

More Model Specifications

How does the portfolio perform for different model specifications? For this purpose, we compute the performance of a number of different modeling choices based on the entire CRSP sample. The next code chunk performs all the heavy lifting.

def evaluate_optimal_performance(data,
                                 objective_measure="Expected utility",
    optimal_theta = minimize(
        args=(data, objective_measure, 
              value_weighting, allow_short_selling),

    processed_data = compute_portfolio_weights(
      optimal_theta, data, 
      value_weighting, allow_short_selling

    portfolio_evaluation = evaluate_portfolio(processed_data)

    weight_text = "VW" if value_weighting else "EW"
    short_text = "" if allow_short_selling else " (no s.)"

    strategy_name_dict = {
      "benchmark": weight_text,
      "tilt": f"{weight_text} Optimal{short_text}"

    portfolio_evaluation.columns = [
      for i in portfolio_evaluation.columns

Finally, we can compare the results. The table below shows summary statistics for all possible combinations: equal- or value-weighted benchmark portfolio, with or without short-selling constraints, and tilted toward maximizing expected utility.

data = [data_portfolios]
value_weighting = [True, False]
allow_short_selling = [True, False]
objective_measure = ["Expected utility"]

permutations = product(
  data, objective_measure,
  value_weighting, allow_short_selling
results = list(starmap(
performance_table = (pd.concat(results, axis=1)
column_names = sorted(performance_table.columns, key=len)
                                       VW      EW  VW Optimal  EW Optimal  VW Optimal (no s.)  EW Optimal (no s.)
Expected utility                   -0.250  -0.251      -0.261      -0.313              -0.250              -0.252
Average return                      6.638   9.991       0.259   -4294.913               7.291               7.959
SD return                          15.440  20.340      21.112   13666.276              16.687              19.120
Sharpe ratio                        0.430   0.491       0.012      -0.314               0.437               0.416
CAPM alpha                          0.000   0.002      -0.005      -3.191               0.000               0.000
Market beta                         0.992   1.124       0.952     -71.402               1.054               1.135
Mean Absolute weight                0.030   0.000       0.077      57.700               0.030               0.030
Max. weight                         4.054   0.000       4.218    1125.389               2.333               1.307
Min. weight                         0.000   0.000      -0.174    -205.898               0.000               0.000
Avg. sum of negative weights        0.000   0.000      78.061   72077.874               0.000               0.000
Avg. fraction of negative weights   0.000   0.000      49.074      52.041               0.000               0.000

The results indicate that the annualized Sharpe ratio of the equal-weighted portfolio (0.491) exceeds the Sharpe ratio of the value-weighted benchmark portfolio (0.430). In the table above, tilting without constraints does not pay off: the unconstrained tilted portfolios take extreme long and short positions and end up with lower expected utility and lower Sharpe ratios than their benchmarks. Imposing the no-short-sale constraint brings the tilted portfolios back close to the benchmarks: the value-weighted variant reaches a Sharpe ratio of 0.437, slightly above its benchmark, while the equal-weighted variant (0.416) stays below the equal-weighted benchmark.
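
The grid of specifications behind such a comparison can be enumerated with `product()` and `starmap()`. The sketch below only builds the column labels and does not run any optimization; the helper `describe_specification` is hypothetical and mirrors the naming logic of evaluate_optimal_performance().

```python
from itertools import product, starmap

def describe_specification(objective_measure, value_weighting,
                           allow_short_selling):
    # Hypothetical helper: returns the label of the tilted strategy only.
    weight_text = "VW" if value_weighting else "EW"
    short_text = "" if allow_short_selling else " (no s.)"
    return f"{weight_text} Optimal{short_text}"

# All combinations of objective, benchmark weighting, and short-selling rule.
permutations = product(["Expected utility"], [True, False], [True, False])

# starmap() unpacks each tuple into the function's positional arguments.
labels = list(starmap(describe_specification, permutations))
labels_sorted = sorted(labels, key=len)
```

Sorting by label length places the unconstrained strategies before the constrained ones, which matches the column order of the tilted strategies in the table.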


Exercises

  1. How do the estimated parameters \(\hat\theta\) and the portfolio performance change if your objective is to maximize the Sharpe ratio instead of the hypothetical expected utility?
  2. The code above is very flexible in the sense that you can easily add new firm characteristics. Construct a new characteristic of your choice and evaluate the corresponding coefficient \(\hat\theta_i\).
  3. Tweak the optimization behind optimal_theta such that you can impose additional performance constraints in order to determine \(\hat\theta\), which maximizes expected utility under the constraint that the market beta is below 1.
  4. Does the portfolio performance resemble a realistic out-of-sample backtesting procedure? Verify the robustness of the results by first estimating \(\hat\theta\) based on past data only. Then, use more recent periods to evaluate the actual portfolio performance.
  5. By formulating the portfolio problem as a statistical estimation problem, you can easily obtain standard errors for the coefficients of the weight function. Brandt, Santa-Clara, and Valkanov (2009) provide the relevant derivations in their paper in Equation (10). Implement a small function that computes standard errors for \(\hat\theta\).


References

Banz, Rolf W. 1981. “The relationship between return and market value of common stocks.” Journal of Financial Economics 9 (1): 3–18.
Brandt, Michael W. 2010. “Portfolio choice problems.” In Handbook of Financial Econometrics: Tools and Techniques, edited by Yacine Ait-Sahalia and Lars Peter Hansen, 1:269–336. Handbooks in Finance. North-Holland.
Brandt, Michael W, Pedro Santa-Clara, and Rossen Valkanov. 2009. “Parametric portfolio policies: Exploiting characteristics in the cross-section of equity returns.” Review of Financial Studies 22 (9): 3411–47.
Gehrig, Thomas, Leopold Sögner, and Arne Westerkamp. 2020. “Making portfolio policies work.” Working Paper.
Jegadeesh, Narasimhan, and Sheridan Titman. 1993. “Returns to buying winners and selling losers: Implications for stock market efficiency.” The Journal of Finance 48 (1): 65–91.
Virtanen, Pauli, Ralf Gommers, Travis E. Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, Evgeni Burovski, et al. 2020. “SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python.” Nature Methods 17: 261–72.