Exploring India Yield Curve in Python

Rahul N Sekar
Mar 4, 2023

Inflation has been a hot topic around the world of late. Governments and central banks have been trying to tame inflation since early 2022. US rates have increased from 0.25% to 4.75% within the last year. The Indian repo rate has moved up from 4.00% to 6.50% during the same period.

These are overnight rates. Banks get to lend or borrow at these rates for 1 day, i.e. overnight. However, as the duration of lending or borrowing increases, the rates also change, usually going up. The curve showing how the rate varies with maturity is called the yield curve. Wikipedia has a detailed article here: https://en.wikipedia.org/wiki/Yield_curve

This is a walkthrough of how this curve can be computed from Indian government bond prices. The result should look something like this:

http://www.worldgovernmentbonds.com/country/india/

To build the yield curve, we need:

  1. Source of bond prices
  2. Source of bond info; coupon rate, issue date, maturity date etc.
  3. Python & packages; numpy, scipy, pandas, requests etc.
  4. Complete understanding of the theory behind bond pricing & yield curves.

Let's jump right in.

Bond Prices & Info

The NSE website has the latest bond prices traded: https://www.nseindia.com/market-data/bonds-traded-in-capital-market

Poking around a bit, there also seems to be an API that returns JSON responses:

https://www.nseindia.com/api/liveBonds-traded-on-cm?type=gsec

Here is some Python code to parse the JSON & fill up the data into a nice pandas DataFrame:

import requests
import pandas as pd

def get_gsec_mktdata(use_prev_close: bool = False):
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36'}
    # Visit the home page first so that its cookies can be passed to the API call
    r1 = requests.get('https://www.nseindia.com', headers=headers)
    resp = requests.get('https://www.nseindia.com/api/liveBonds-traded-on-cm',
                        params={'type': 'gsec'}, headers=headers, cookies=r1.cookies)
    # print(resp.status_code)
    raw = resp.json()['data']
    # print(raw[0])
    # The tuple order below must match the column names given to the DataFrame
    filtered = [(x['symbol'],
                 x['series'],
                 x['faceValue'],
                 # fall back to the previous close (or 0) when there is no last traded price
                 x['lastPrice'] or (x['previousClose'] if use_prev_close else 0),
                 x['totalTradedVolume'],
                 x['averagePrice'],
                 x['buyPrice1'],
                 x['sellPrice1']
                 ) for x in raw]
    df = pd.DataFrame(filtered, columns=['sym', 'ser', 'fv', 'prc', 'vol', 'avg_prc', 'bid', 'ask'])
    return df

Good start! Now we need to know what the bond symbols mean, coupon rate, coupon dates etc. The NSE website also has a location where this information is available for all bonds. Some code to fetch that information:

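from io import StringIO

def get_gsec_info():
    resp = requests.get('https://archives.nseindia.com/content/equities/DEBT.csv')
    # Strip all spaces before parsing the CSV
    info_df = pd.read_csv(StringIO(resp.text.replace(' ', '')), index_col=False)
    # Keep only the GS and TB series
    filtered_df = info_df[info_df['SERIES'].isin(['GS', 'TB'])]
    filtered_df = filtered_df.copy()
    # add_info is a helper that is not shown in this article; it supplies
    # the maturity and issue dates for each row
    filtered_df[['MATURITY', 'ISSUEDATE']] = filtered_df.apply(add_info, axis=1)
    # Rows without a coupon rate are treated as zero coupon
    filtered_df['IPRATE'] = filtered_df['IPRATE'].fillna(0)
    # print(filtered_df[['SYMBOL', 'FACEVALUE', 'IPRATE', 'MATURITY', 'ISSUEDATE']])
    return filtered_df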

Ok, now we have both bond info & bond prices. Note that I’m assuming these prices are dirty prices. We will verify this later.

Bond Pricing

Let's start with some functions to price bonds. There are two types of Indian Govt Bonds: T-Bills & dated G-Secs. T-Bills are zero coupon bonds that trade at a discount to face value. Dated G-Secs have coupons paid at a fixed frequency, usually every six months. In order to price a bond we need to know the cashflows & discount rates. Please see here for an understanding of bond pricing: https://en.wikipedia.org/wiki/Bond_valuation
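As a minimal illustration of "cashflows & discount rates", here is a small standalone sketch that discounts a hypothetical 2-year, 7% semi-annual coupon bond at a single constant rate. The bond, the dates and the helper names (discount_factor, bond_npv) are made up for this example and are not part of the code used in the rest of the article.

```python
import datetime

def discount_factor(rate, cash_date, asof_date, days_in_year=365):
    """Discount factor for a constant, annually compounded rate."""
    years = (cash_date - asof_date).days / days_in_year
    return (1.0 + rate) ** (-years)

def bond_npv(cashflows, rate, asof_date):
    """Sum of discounted future cashflows; `cashflows` maps date -> amount."""
    return sum(
        amount * discount_factor(rate, cash_date, asof_date)
        for cash_date, amount in cashflows.items()
        if cash_date >= asof_date
    )

asof = datetime.date(2023, 3, 4)
# Hypothetical bond: face value 100, 7% coupon paid every six months
cashflows = {
    datetime.date(2023, 9, 4): 3.5,
    datetime.date(2024, 3, 4): 3.5,
    datetime.date(2024, 9, 4): 3.5,
    datetime.date(2025, 3, 4): 103.5,  # final coupon plus face value
}

npv_at_0pct = bond_npv(cashflows, 0.00, asof)  # no discounting: plain sum
npv_at_6pct = bond_npv(cashflows, 0.06, asof)
npv_at_8pct = bond_npv(cashflows, 0.08, asof)
print(npv_at_0pct, npv_at_6pct, npv_at_8pct)
```

The higher the discount rate, the lower the present value, which is the inverse price/yield relationship that everything below relies on.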

We will be using the net present value approach to pricing the bonds. Let's write some structured code:

import datetime
from abc import ABC, abstractmethod


class Product:
    def __init__(self, isin: str):
        self.isin = isin


class DiscountCurve(ABC):
    # Note: the from_date defaults are evaluated once, when the class is defined
    @abstractmethod
    def discount_factor(self, dt: datetime.date, from_date: datetime.date = datetime.date.today()):
        pass

    @abstractmethod
    def yld(self, dt, from_date: datetime.date = datetime.date.today()):
        pass


class Security:
    def __init__(self, p: Product, price: float, asof_date: datetime.date, volume: float = None, avg_price: float = None,
                 bid_price: float = None, ask_price: float = None):
        self.product = p
        self.price = price
        self.volume = volume
        self.avg_price = avg_price
        self.bid_price = bid_price
        self.ask_price = ask_price
        self.asof_date = asof_date

Product has information about a product identified by ISIN. Let's keep it general so it could be used for stocks, bonds, options etc. Security is a Product combined with market data i.e. price, volume, bid-ask, date etc.

DiscountCurve will have the discount factor or yield for a given future date. This can either be a simple constant rate or a parameterised curve, as our yield curve will be. Let's get to that later. Now let's implement the class Bond(Product):

import datetime
import math

from dateutil.relativedelta import relativedelta
from scipy import optimize

# The project's own modules; the package path is assumed from the import used later in this article
from app.fin import base, disc_curve


class Bond(base.Product):

    def __init__(self,
                 coupon_pct: float,
                 maturity_date: datetime.date,
                 issue_date: datetime.date,
                 coupon_freq: int = 2,
                 isin: str = None,
                 symbol: str = None,
                 face_value: float = 100.0,
                 ):

        super().__init__(isin)
        self.symbol = symbol
        self.coupon_pct = coupon_pct
        self.maturity_date = maturity_date
        self.issue_date = issue_date
        self.coupon_freq = coupon_freq
        self.face_value = face_value

    def is_price_sane(self, price, asof_date: datetime.date = datetime.date.today(),
                      y_min: float = -0.0001, y_max: float = 0.25) -> bool:
        if not 0.1 < price / self.face_value < 10.0:
            return False
        # The price is sane if some yield between y_min and y_max reproduces it
        y1 = self.npv(disc_curve.ConstDC(y_min), asof_date) / price - 1.0
        y2 = self.npv(disc_curve.ConstDC(y_max), asof_date) / price - 1.0
        return y1 * y2 < 0.0

    def _coupon_dates(self) -> tuple:
        if not self.coupon_freq:
            return ()
        ret = []
        i = 1
        mnths = int(12 / self.coupon_freq)
        # Step forward from the issue date, one coupon period at a time
        nxt = self.issue_date + relativedelta(months=i * mnths)
        while nxt <= self.maturity_date:
            ret.append(nxt)
            i += 1
            nxt = self.issue_date + relativedelta(months=i * mnths)
        return tuple(ret)

    def cashflows(self) -> dict:
        mat = self.maturity_date
        ret = {mat: self.face_value}
        if not self.coupon_freq:
            return ret
        dts = self._coupon_dates()
        cpn = self.face_value * self.coupon_pct / self.coupon_freq / 100.0
        for dt in dts:
            ret[dt] = ret.get(dt, 0) + cpn
        return ret

    def accrued_interest(self, dt: datetime.date) -> float:
        if not self.coupon_freq or dt <= self.issue_date:
            return 0

        # Most recent coupon date (or the issue date) before dt
        dts = list(self._coupon_dates()) + [self.issue_date]
        mx = max([d for d in dts if d < dt])
        cpn = self.face_value * self.coupon_pct / 100.0 / self.coupon_freq
        prd = 12 / self.coupon_freq * 30  # days in a coupon period, 30-day months
        return cpn * (dt - mx).days / prd

    def npv(self, dc: base.DiscountCurve, asof_date: datetime.date = datetime.date.today()):
        ret = 0
        for d, p in self.cashflows().items():
            if d >= asof_date:  # only future cashflows count
                ret += p * dc.discount_factor(d, asof_date)
        return ret

    def yield_to_maturity(self, dirty_price: float, asof_date: datetime.date = datetime.date.today()):
        if not self.is_price_sane(dirty_price, asof_date, -0.99, 1.00):
            return None
        # Solve npv(r) = dirty_price for the annually compounded rate r
        apr = optimize.brentq(
            lambda r: self.npv(disc_curve.ConstDC(r), asof_date) / dirty_price - 1.0,
            -0.99,  # -99%
            1.00,  # +100%
        )
        return 2.0 * (math.sqrt(1.0 + apr) - 1.0)  # BEY

    def modified_duration(self, dirty_price, asof_date):
        sar = self.yield_to_maturity(dirty_price, asof_date) / 2.0  # semi-annual rate
        ret = 0
        for d, p in self.cashflows().items():
            if d < asof_date:
                continue  # skip cashflows already paid, as npv() does
            t = (d - asof_date).days / 365.0  # yrs
            ret += t * p / math.pow((1.0 + sar), 2 * t)

        ret /= dirty_price * (1.0 + sar)

        return ret

    def z_spread(self, yield_curve):
        pass

Hope the function & variable names are self explanatory. Do read up on the wiki article on bond pricing if you are unsure of any of these terms. To generate the yield curve, we only need the npv() and its dependent functions.

The function yield_to_maturity() is only there to check if the prices of the bonds are reasonable & sane. The YTMs will also be useful to plot alongside the yield curve. Notice how the YTM calculation involves solving the equation

security.product.npv(discount_curve(x), pricing_date) = security.dirty_price
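To make the root-finding step concrete, here is a small dependency-free sketch of the same idea. It uses plain bisection as a stand-in for optimize.brentq, and a hypothetical 2-year bond whose price is generated from a known 7% rate so that the answer can be checked. The function names are illustrative only.

```python
def npv_const_rate(cashflows, rate):
    """cashflows: list of (years_from_now, amount); annual compounding."""
    return sum(amount * (1.0 + rate) ** (-t) for t, amount in cashflows)

def solve_ytm(cashflows, dirty_price, lo=-0.99, hi=1.00, tol=1e-10):
    """Bisection on npv(r) - price. NPV falls as the rate rises."""
    def f(r):
        return npv_const_rate(cashflows, r) - dirty_price
    if f(lo) * f(hi) > 0:
        return None  # the price is not bracketed by [lo, hi]
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid  # NPV still above price: the rate must be higher
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical bond: face value 100, 7% coupon paid every six months
cashflows = [(0.5, 3.5), (1.0, 3.5), (1.5, 3.5), (2.0, 103.5)]
price = npv_const_rate(cashflows, 0.07)   # price implied by a 7% annual rate

apr = solve_ytm(cashflows, price)         # recovers roughly 0.07
bey = 2.0 * ((1.0 + apr) ** 0.5 - 1.0)    # same conversion as in yield_to_maturity()
print(apr, bey)
```

The bond-equivalent yield comes out slightly below the annually compounded rate, since it is quoted on a semi-annual compounding basis.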

OK. Now we are ready to compute the yield curves. Before that, let's take a quick look at the yields on these bonds.

Yield To Maturity

Let's run the code!

def get_gsec_bonds():
    df = get_gsec_info()
    ret = []
    for idx, bnd in df.iterrows():
        # print(bnd[1])
        # Bonds without a coupon rate are treated as zero coupon (coupon_freq = 0)
        ret.append(Bond(bnd.IPRATE, bnd.MATURITY, bnd.ISSUEDATE, 2 if bnd.IPRATE > 0 else 0,
                        face_value=bnd.FACEVALUE, symbol=bnd.SYMBOL))
    return ret


def get_gsec_securities():
    bnds = get_gsec_bonds()
    mkt_df = get_gsec_mktdata()
    ret = []
    for bnd in bnds:
        # Match bond info to market data by symbol
        row = mkt_df[mkt_df.sym == bnd.symbol]
        if len(row) > 0:
            r = row.iloc[0]
            # print(row.iloc[0].prc)
            ret.append(Security(bnd, r.prc, datetime.date.today(), r.vol, r.avg_prc, r.bid, r.ask))
    return ret

gsecs = get_gsec_securities()
data = []
filtered_gsecs = []
for gsec in gsecs:
    p = gsec.product
    # Skip bonds that did not trade or whose price fails the sanity check
    if not p.is_price_sane(gsec.price) or gsec.volume <= 0:
        continue
    ytm = p.yield_to_maturity(gsec.price - p.accrued_interest(datetime.date.today()))
    # if not ytm or not (0.01 < ytm < 0.20):
    #     continue
    filtered_gsecs.append(gsec)
    mid_prc = (gsec.bid_price + gsec.ask_price) / 2.0
    data.append([p.symbol, p.coupon_pct, p.maturity_date, gsec.price, ytm,
                 gsec.volume, gsec.avg_price, mid_prc, gsec.bid_price, gsec.ask_price])

df = pd.DataFrame(data, columns=["sym", "cpn", "mat", "prc", "ytm", "vol", "avg_price", "mid_prc", "bid", "ask"])
df

A lot of the ~300 bonds don’t trade at all. The TB yields are very high; 364D150623 is at 12.78%. The 669GS2024 is yielding 5.6920%, which is lower than the overnight repo rate of 6.50%. At first glance some of these prices look suspect. Overall, the other prices look ok. Let's keep aside the data issues for now & proceed with the yield curve.

The Yield Curve

We are going to model the curve on the Nelson Siegel family of functions.

import datetime
import math

import numpy as np
from scipy import optimize

# The project's own module holding Product / DiscountCurve; package path assumed from the import used later
from app.fin import base

_DAYS_IN_YEAR = 365

class ConstDC(base.DiscountCurve):
    def __init__(self, r):
        self.r = r

    def yld(self, dt, from_date: datetime.date = datetime.date.today()):
        return self.r

    def discount_factor(self, dt: datetime.date, from_date: datetime.date = datetime.date.today()):
        # (1 + r) ** -(years between from_date and dt)
        return math.pow(1.0 + self.r, (from_date - dt).days / _DAYS_IN_YEAR)


class DCFromBonds(base.DiscountCurve):

    def __init__(self, bnd_secs: list):
        # The securities are wrapped in a dict so that curve_fit passes them
        # through to calibrate_func untouched as the "x data"
        popt, pcov = optimize.curve_fit(
            self.calibrate_func,
            {'bnd_secs': bnd_secs},
            np.array([bnd_sec.price for bnd_sec in bnd_secs]),
            self.init_params_guess()
        )
        self.set_params(*popt)

    def yld(self, dt, from_date: datetime.date = datetime.date.today()):
        yrs = (dt - from_date).days / _DAYS_IN_YEAR
        return math.pow(self.discount_factor(dt, from_date), -1.0 / yrs) - 1.0

    def calibrate_func(self, bnd_secs, *params):
        # Model prices of all bonds for a trial set of curve parameters
        self.set_params(*params)
        return [bnd_sec.product.npv(self, bnd_sec.asof_date) for bnd_sec in bnd_secs['bnd_secs']]

    def set_params(self, *params):
        raise NotImplementedError('Unimplemented')

    def init_params_guess(self):
        raise NotImplementedError('Unimplemented')


class NelsonSiegel(DCFromBonds):
    def set_params(self, b0, b1, b2, tau):
        self.b0 = b0
        self.b1 = b1
        self.b2 = b2
        self.tau = tau

    def init_params_guess(self):
        return [0.05, 0.0, 0.0, 2.0]

    def discount_factor(self, dt: datetime.date, from_date: datetime.date = datetime.date.today()):
        yrs = (dt - from_date).days / 365.0
        m_tau = yrs / self.tau  # m / tau
        exp_m_tau = np.exp(-m_tau)
        rate = self.b0 +\
            (self.b1 + self.b2) * (1.0 - exp_m_tau) / m_tau -\
            self.b2 * exp_m_tau

        # Guard against a non-positive base, which math.pow cannot handle here
        if 1.0 + rate <= 0.0:
            return 0
        return math.pow(1.0 + rate, -yrs)
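The calibration step in DCFromBonds can also be tried in isolation. The sketch below is a simplified stand-in and not the article's class: it fits the four Nelson Siegel parameters to synthetic zero coupon bond prices generated from made-up "true" parameters, so there is a known answer to recover. The bounds passed to curve_fit are an extra assumption added here to keep tau positive during the search; the code above does not use them.

```python
import numpy as np
from scipy import optimize

def ns_rate(m, b0, b1, b2, tau):
    """Nelson Siegel rate for maturity m (in years), same form as above."""
    m_tau = m / tau
    e = np.exp(-m_tau)
    return b0 + (b1 + b2) * (1.0 - e) / m_tau - b2 * e

def zero_price(m, b0, b1, b2, tau):
    """Price of a zero coupon bond with face value 100 maturing in m years."""
    return 100.0 * np.power(1.0 + ns_rate(m, b0, b1, b2, tau), -m)

maturities = np.array([0.25, 0.5, 1, 2, 3, 5, 7, 10, 15, 20, 30], dtype=float)
true_params = (0.075, -0.01, 0.01, 3.0)        # made-up values for the demo
prices = zero_price(maturities, *true_params)  # synthetic "market" prices

popt, pcov = optimize.curve_fit(
    zero_price, maturities, prices,
    p0=[0.06, -0.005, 0.005, 2.0],
    # Assumed bounds: keep tau > 0 and 1 + rate > 0 throughout the search
    bounds=([0.0, -0.2, -0.2, 0.1], [0.5, 0.2, 0.2, 30.0]),
)

fitted = zero_price(maturities, *popt)
max_err = float(np.max(np.abs(fitted - prices)))
print(popt, max_err)
```

With real bond prices there is noise and the fit will not be exact, which is why the quality of the input prices matters so much in what follows.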

The constant discount curve is straightforward. The next class is an abstract one that is a general parameterised discount curve based on bond prices. It doesn’t implement the parameters or the actual discount_factor() method. The subclass NelsonSiegel(DCFromBonds) does the necessary. On initialization, optimize.curve_fit() will calculate the NS parameters b0, b1, b2 & tau. The discount curve function will use these calibrated parameters to calculate the rates. Let's plot!

from app.fin import disc_curve
dc = disc_curve.NelsonSiegel(filtered_gsecs)
[dc.b0, dc.b1, dc.b2, dc.tau]

#[0.05054167463443448, 0.00790125064272829, 0.0699630912430819, 10.379672783155987]

The parameters are sane but not accurate. b0 + b1 should be around 6.50% but is 5.84%. This is how the curve looks:

import datetime, math
from matplotlib import pyplot as plt
from dateutil import relativedelta as rd
tdy = datetime.date.today()
days = [90, 180, 365, 365*3, 365*5, 365*10, 365*20, 365*30, 365*40]
rts = [dc.yld(tdy + rd.relativedelta(days=d))*100 for d in days]

plt.figure(figsize=(10,6))
plt.plot([d/365 for d in days], rts, marker="^", color="red")
plt.scatter(df.mat.apply(lambda x: (x-tdy).days/365), df.ytm*100)
plt.xlabel('maturity (years)')
plt.ylabel('yield (%)')
plt.show()
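The expectation that b0 + b1 should sit near the overnight rate follows from the limits of the Nelson Siegel function: as maturity goes to zero the rate tends to b0 + b1, and as maturity grows it tends to b0. A quick standalone check with the fitted parameters printed above (ns_rate is a helper written just for this check):

```python
import math

def ns_rate(m, b0, b1, b2, tau):
    """Nelson Siegel rate for maturity m in years."""
    m_tau = m / tau
    e = math.exp(-m_tau)
    return b0 + (b1 + b2) * (1.0 - e) / m_tau - b2 * e

# Fitted parameters from the calibration output above
b0, b1, b2, tau = (0.05054167463443448, 0.00790125064272829,
                   0.0699630912430819, 10.379672783155987)

short_end = ns_rate(1e-6, b0, b1, b2, tau)  # maturity close to zero
long_end = ns_rate(1e6, b0, b1, b2, tau)    # very long maturity
print(short_end, b0 + b1)
print(long_end, b0)
```

So the fitted curve starts at about 5.84% and eventually settles back towards b0, about 5.05%, in the very long run.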

Reasonable start. As suspected, the bond prices are off, particularly for the short maturity T-Bills. The near term rates are below the overnight repo rate. Another problem I see is that the YTMs of coupon paying bonds should be *below* the upward sloping yield curve, since the YTM is cashflow weighted while the yield curve rate is for a bullet or zero coupon bond. The curve itself is sloping down at the far end, which is another problem. Although theoretically possible, the yield curve of India doesn’t look like that at the moment.
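The claim about coupon bond YTMs sitting below an upward sloping curve can be checked with a toy example. The sketch below assumes a made-up linear zero curve, prices a hypothetical 10-year 7% coupon bond off it, and then solves for the single rate that reproduces that price. Bisection is used here only to keep the example free of dependencies.

```python
def zero_rate(t):
    """Hypothetical upward sloping zero curve: 6% rising by 20bps per year."""
    return 0.06 + 0.002 * t

def price_off_curve(cashflows):
    """Discount each cashflow at the zero rate for its own maturity."""
    return sum(cf * (1.0 + zero_rate(t)) ** (-t) for t, cf in cashflows)

def solve_ytm(cashflows, price, lo=-0.5, hi=1.0, tol=1e-12):
    """Single annually compounded rate that reprices the bond (bisection)."""
    def f(r):
        return sum(cf * (1.0 + r) ** (-t) for t, cf in cashflows) - price
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# 10-year bond, face value 100, 7% coupon paid every six months
coupon_bond = [(0.5 * k, 3.5) for k in range(1, 20)] + [(10.0, 103.5)]
# 10-year zero coupon bond for comparison
zero_bond = [(10.0, 100.0)]

coupon_ytm = solve_ytm(coupon_bond, price_off_curve(coupon_bond))
zero_ytm = solve_ytm(zero_bond, price_off_curve(zero_bond))
print(coupon_ytm, zero_ytm, zero_rate(10.0))
```

The zero coupon bond's YTM matches the 10-year zero rate, while the coupon bond's YTM comes out lower because its earlier cashflows are discounted at the lower short-end rates.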

Better Prices

It looks like the NSE prices are only for trades on the exchange. Almost all trades of Indian GSecs are over the counter (OTC). After asking around, I found another source for the price data that has OTC trades:

https://www.ccilindia.com/OMMWTB.aspx

The data here looks much better. The traded amounts are in crores of Rupees. Unfortunately this source doesn’t seem to have any APIs / JSON for the data. The best I could find is a daily dump in a zip file that has a csv inside with the trades and prices. The zip looks very interesting, with lots of other information on IR Swaps etc. Let's explore that later.

Meanwhile, code to extract prices from CCIL:

import re
import zipfile
from io import BytesIO

import numpy as np

def get_gsec_mktdata_ccil():
    res = requests.get("https://www.ccilindia.com/Research/Statistics/Pages/Infovendors.aspx")
    # Find the link to the latest daily zip on the page (raw string keeps the regex escapes intact)
    zip_url = re.findall(
        r'https://www\.ccilindia\.com/Research/Statistics/Lists/DailyDataForInfoVendors/Attachments/\d+/DFIV_\w+\.zip',
        res.text)[0]
    res = requests.get(zip_url)
    zf = zipfile.ZipFile(BytesIO(res.content))
    # The outright trades are in the csv whose name starts with 'outright'
    outright_csv = zf.read([x for x in zf.namelist() if x.startswith('outright')][0])
    # Select the columns explicitly so the rename below cannot depend on the csv's column order
    cols = ['ISIN', 'Volume (Cr.)', 'Last price', 'Wtd Avg Price']
    df = pd.read_csv(BytesIO(outright_csv))[cols]
    df.columns = ['isin', 'vol_cr', 'prc', 'avg_prc']
    # This source has no bid / ask quotes
    df['bid'] = np.nan
    df['ask'] = np.nan
    return df

The source here also has the YTM corresponding to each price. The YTMs computed with these prices are much higher, so the prices must be clean prices. Let's incorporate all of this into the yield curve calculations; a small change in the optimiser function to add accrued_interest:

    def __init__(self, bnd_secs: list):
        popt, pcov = optimize.curve_fit(
            self.calibrate_func,
            {'bnd_secs': bnd_secs},
            np.array([bnd_sec.price + bnd_sec.product.accrued_interest(bnd_sec.asof_date) for bnd_sec in bnd_secs]),
            self.init_params_guess()
        )
        self.set_params(*popt)
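To see what the clean to dirty adjustment amounts to, here is a small worked example for a hypothetical 7.26% bond quoted at a clean price of 99.50. It mirrors the day count used in accrued_interest() above, i.e. actual days since the last coupon over a 180-day half year. The bond, the dates and the price are made up for illustration.

```python
import datetime

def accrued_interest(face_value, coupon_pct, coupon_freq, last_coupon, asof):
    """Coupon accrued since the last coupon date, using 30-day months for the period length."""
    coupon = face_value * coupon_pct / 100.0 / coupon_freq
    period_days = 12 / coupon_freq * 30
    return coupon * (asof - last_coupon).days / period_days

asof = datetime.date(2023, 3, 4)
last_coupon = datetime.date(2022, 12, 4)   # 90 days before asof

clean_price = 99.50                        # hypothetical quoted price
accrued = accrued_interest(100.0, 7.26, 2, last_coupon, asof)
dirty_price = clean_price + accrued        # what the buyer actually pays
print(accrued, dirty_price)
```

The dirty price is the one that should equal the NPV of the remaining cashflows, which is why the accrued interest is added to the quoted price before fitting.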

While trying to match the YTMs, I also found a bug in the coupon date computation: it missed the last coupon on the maturity date. After fixing this, the YTMs are within 2–3bps for coupon bonds. The TB YTMs are still way off; our calculations are underestimating them. Let's leave this for now.
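One way the last coupon can go missing is when stepping forward from the issue date never lands exactly on the maturity date. The sketch below compares that forward stepping with stepping backward from maturity, which always includes the maturity date. This is only one possible fix and not necessarily the one applied in the repository; the dates are made up and shift_months is a small helper written for this example.

```python
import calendar
import datetime

def shift_months(d, months):
    """Move date d by `months` (may be negative), clamping the day to the month end."""
    year, month0 = divmod(d.year * 12 + (d.month - 1) + months, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return datetime.date(year, month, day)

def coupon_dates_forward(issue, maturity, freq=2):
    """Step forward from the issue date, as in _coupon_dates() above."""
    step, dates, i = 12 // freq, [], 1
    nxt = shift_months(issue, i * step)
    while nxt <= maturity:
        dates.append(nxt)
        i += 1
        nxt = shift_months(issue, i * step)
    return tuple(dates)

def coupon_dates_backward(issue, maturity, freq=2):
    """Step backward from maturity, so the maturity date is always a coupon date."""
    step, dates, i = 12 // freq, [], 0
    nxt = maturity
    while nxt > issue:
        dates.append(nxt)
        i += 1
        nxt = shift_months(maturity, -i * step)
    return tuple(sorted(dates))

# Hypothetical bond whose issue day-of-month differs from its maturity day-of-month
issue = datetime.date(2020, 1, 15)
maturity = datetime.date(2025, 1, 10)

fwd = coupon_dates_forward(issue, maturity)
bwd = coupon_dates_backward(issue, maturity)
print(fwd[-1], bwd[-1])
```

Here the forward version stops at the coupon before maturity, while the backward version ends on the maturity date itself.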

Better Yield Curve

Here we go!

Much better. The 3m yield is 7.03% which is 53bps above the 6.50% repo rate. The external website mentioned in the beginning has this rate at 6.95% or 45bp above repo rate. My guess is that the difference is because of Nelson Siegel vs some other function family. Anyway I’m done with the yield curve for now.

Future

Now that we have the yield curve, we can price corporate bonds with spreads & option adjustments. That will be fun!

We can also look at IR Swaps and other interest rate derivatives. The CCIL data source does have trades on those.

This yield curve can now be used to price equity derivatives. We can use a time varying short term rate, instead of a constant rate assumption as in Black-Scholes pricing, for a more sophisticated option pricing model.

Source Code

All of the source code in this article is available at https://github.com/rahulsekar/prism

I hope to add more analytics on financial markets into prism.
