Forecasting S&P 500 Futures Repo Rates

During my time at HSBC, I worked on a Delta One trading team. The team had two objectives: fulfill as many client orders as possible (usually over $1 billion per day) and make as much profit as possible. One way the team could increase its profit was by taking bets on the repo rates of S&P 500 futures. The formula for pricing S&P 500 futures is:

F(t,T) = S(t)*(e^((r+u-q)*(T-t)))

Where:

  • t is the date the contract is made

  • T is the expiration date of the contract

  • S(t) is the current price of the S&P 500

  • F(t,T) is the price of the futures contract

  • r is the risk-free rate of return

  • q is the S&P 500 dividend yield

  • u is the repo rate
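
As a quick illustration of the formula, the sketch below computes F(t,T) and then inverts the same expression to recover the repo rate u implied by an observed futures price. The function names and sample inputs are made up for illustration, and it assumes T - t is measured in years with continuously compounded annual rates.

```python
import math


def futures_price(spot, r, u, q, tau):
    """F(t,T) = S(t) * exp((r + u - q) * (T - t)), with tau = T - t in years."""
    return spot * math.exp((r + u - q) * tau)


def implied_repo_rate(futures, spot, r, q, tau):
    """Solve the pricing formula for u given an observed futures price."""
    return math.log(futures / spot) / tau - r + q


# Illustrative inputs only, not market data.
spot, r, u, q, tau = 5000.0, 0.05, 0.004, 0.015, 0.25
f = futures_price(spot, r, u, q, tau)
u_back = implied_repo_rate(f, spot, r, q, tau)
print(round(f, 2), round(u_back, 6))
```

Because r, q, S(t), and F(t,T) are all observable, u can be backed out this way for each date, which is one plausible way to build the historical series that a forecasting model would be trained on.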

My goal was to forecast u for the next week using data my team had access to. Over a few months, I:

  • analyzed the data available to my team and engineered features for the model

  • trained and tested many different types of models, including traditional statistical and machine learning models (ARIMA, random forests, Gaussian processes, SVMs) and neural sequence models such as RNNs and LSTMs

  • created a pipeline that gathers data from an API, generates a prediction, and sends a report to the team

  • created a system that monitors the model’s performance and automatically retrains the model
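
The monitoring step could look something like the sketch below. This is not the production code: the choice of metric (the share of forecasts whose sign matches the realized change), the window length, the threshold, and the function names are all assumptions made for illustration.

```python
import numpy as np


def directional_hit_rate(actual_changes, predicted_changes):
    """Fraction of periods where predicted and realized changes share a sign.

    A zero change only counts as a hit if both series are exactly zero.
    """
    actual = np.sign(np.asarray(actual_changes, dtype=float))
    predicted = np.sign(np.asarray(predicted_changes, dtype=float))
    return float(np.mean(actual == predicted))


def needs_retraining(actual_changes, predicted_changes, window=20, threshold=0.5):
    """Flag a retrain when the hit rate over the latest `window` forecasts is below `threshold`."""
    recent_actual = actual_changes[-window:]
    recent_pred = predicted_changes[-window:]
    if len(recent_actual) < window:
        return False  # not enough history to judge yet
    return directional_hit_rate(recent_actual, recent_pred) < threshold


# Made-up weekly changes in the repo rate, for illustration only.
actual = [0.01, -0.02, 0.03, -0.01]
predicted = [0.02, -0.01, -0.01, -0.03]
print(directional_hit_rate(actual, predicted))
print(needs_retraining(actual, predicted, window=4, threshold=0.8))
```

A scheduled job could call a check like `needs_retraining` after each new observation and kick off a retrain whenever it returns True.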

By the end of the project, my team had a model that correctly predicted the direction of the repo rate's next move (increase or decrease) 60% of the time. My tech stack for this project was:

  • Python

  • Python’s data science libraries (pandas, NumPy, statsmodels, scikit-learn)

  • PyTorch