The Apply Function – Making Pandas Work Like Excel


In our last post we walked through the shift function so that, for each of our defined groups, we have a column that contains a ‘lagged’ value.  We can think of this as operating much like LAG in SQL.  But now what?  We have an extra column that we wouldn’t have needed if we were using Excel.  Well, we aren’t trying to mirror Excel, we are trying to mimic the behavior.  So, the next step is to apply a function to the dataframe using pandas’ .apply().  But we can’t do that without a helper function defining what we need to do.

I’m assuming if you are reading my blog, you understand how to define a function in Python, so our walkthrough there will be very, very brief.

Let’s say that in the dataframe we used before, we want to see a month-over-month change in subscription amount.  This would let us track revenue growth as well as customer level growth.


To do that, we can use a function like the one below.  One thing to note here is that you MUST `return` the result of the function or the apply will not work the way we want it to.

  def growth(last_month, this_month):
      diff = this_month - last_month
      return diff

Now we can get pretty fancy in our function and the apply will still work.  But, for the purpose of illustrating we will keep it simple.  

Well if we have this great new metric we want to track (and maybe pivot on it later!) we should store it in the dataframe.  So what does that mean? You guessed it: a new series!  We’ll name our new column ‘monthly_growth’.

df['monthly_growth'] = pd.Series(dtype=float)

And now we have this:

Index  Activity Date  Customer Name  Subscription Amount  subscription_lag  monthly_growth
2      2020-02        Risk & Data    75                   N/A               N/A
3      2020-03        Risk & Data    125                  75                N/A
1      2020-02        XYZ, Co.       100                  N/A               N/A

The next part utilizes a lambda function in order to apply the helper and populate the new series all in one line (note the axis=1, which tells pandas to apply the function row by row).

df['monthly_growth'] = df.apply(lambda x:
                          growth(x['subscription_lag'], x['Subscription Amount']),
                          axis=1)

Resulting in this!

Index  Activity Date  Customer Name  Subscription Amount  subscription_lag  monthly_growth
2      2020-02        Risk & Data    75                   N/A               N/A
3      2020-03        Risk & Data    125                  75                50
1      2020-02        XYZ, Co.       100                  N/A               N/A

Great!  We have the month-over-month change in subscription amount in that column.  Setting things up this way also lets us see customers who decreased their subscription amount, i.e. negative growth.  So there you have it.  The offset functionality in Excel can be mimicked in pandas, making it much easier to work within a dataframe, especially for large datasets.
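As an aside, for a simple difference like this you don’t strictly need apply at all; pandas can subtract two columns directly.  A minimal sketch of that alternative, using the same columns as above:

    df['monthly_growth'] = df['Subscription Amount'] - df['subscription_lag']

The apply route earns its keep once the helper function grows beyond simple arithmetic.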


Happy Modeling!

The Shift Function – Making Pandas Work Like Excel

Using the toolkit of a data scientist, business intelligence analysts translate business needs and requirements into actionable insights for their stakeholders.  Built into this framework is the need to help business users replicate the models they build in Excel in a more scalable way.


The most common approach I’ve encountered in Python is to use Pandas DataFrames.  They offer a similar tabular view, and a much simpler way to look at summary statistics.  Additionally, DataFrames can be easily exported to .csv  or .xls(x) file formats for stakeholder consumption.

What happens, though, when you can’t do the same sorts of operations in pandas that you can in Excel?  Referencing a previous cell in another column, as offset-like functions do, doesn’t have an obvious clean solution in pandas.  You could export your dataframe to a temporary table in a SQL database (still using Python libraries), but you’re now calling in additional information to be stored in memory, sacrificing runtime and simplicity.  Add to that: if a business user will be running a Python script, one wants to minimize the number of dependencies needed to run it.

There is no explicit offset function in pandas, and iterating over a large dataframe can take serious computing power and huge amounts of time.  After trying a number of different methods, the cleanest and fastest way to work with an offset calculation is shifting the column and applying a function to the shifted column.


Let’s get into the technical details here.  Say you have a dataframe that includes monthly revenue data, like the small example below.

Index Activity Date Customer Name Subscription Amount
1 2020-02 XYZ, Co. 100.00
2 2020-02 Risk & Data 75.00
3 2020-03 Risk & Data 125.00

 

So if you wanted to look at changes over time for a customer, here’s how we do that.  First, we can make a new blank column to hold the data we will be moving around.

df['subscription_lag'] = pd.Series(dtype=float)

Since we want to look at changes by customer, and not in general, we’ll need to sort and group the data before we invoke the shift function in pandas.  Fortunately, you can do this all in one line.

df['subscription_lag'] = (df.sort_values(by=['Activity Date'], ascending=True)
                            .groupby(['Customer Name'])['Subscription Amount']
                            .shift(1))

The code result will end up looking something like this.

Index Activity Date Customer Name Subscription Amount subscription_lag
2 2020-02 Risk & Data 75.00 N/A
3 2020-03 Risk & Data 125.00 75.00
1 2020-02 XYZ, Co. 100.00 N/A

 

Now, we have an offset column that we can use to operate on row by row.  
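For anyone following along at home, here is a minimal, self-contained sketch of the whole thing, using made-up data matching the table above:

    import pandas as pd

    df = pd.DataFrame({
        'Activity Date': ['2020-02', '2020-02', '2020-03'],
        'Customer Name': ['XYZ, Co.', 'Risk & Data', 'Risk & Data'],
        'Subscription Amount': [100.00, 75.00, 125.00],
    })

    # Previous month's subscription amount for each customer
    df['subscription_lag'] = (df.sort_values(by=['Activity Date'], ascending=True)
                                .groupby(['Customer Name'])['Subscription Amount']
                                .shift(1))
    print(df)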

In the next post, we’ll add onto this and work with the apply function to see how the operations on the offset columns will work.  

Thanks for your patience everyone, I’ve taken some time off from updating and plan on getting back into a more regular rhythm, though less frequently than weekly.  I’m looking forward to getting back into things.

Happy Modeling!

What About Derivatives?

We know how to model some of them but what are they?

Options are the right to purchase or sell a set number of financial assets at a fixed price for a fixed amount of time.  They are a subset of financial instruments called derivatives, of which there are several other types.  Derivatives get their name from the way they ‘derive’ their value from some underlying asset.  We’ll talk a little about derivatives in general then go through a few common types.  


So we are deriving the value of an instrument based on some other asset or instrument.  That is an annoyingly vague description.  How do we make that idea more tangible?  

Well, imagine that we want to buy some stock.  And our analysis (or gut – you can sort of gamble with these things, and that is called speculation) tells us that the stock’s price is going to go way up soon.  So there’s some upside we think we can make money on.  But we also know that the stock market is unpredictable and we don’t want to just buy the stock right now.  Maybe it’s something like Apple, where a single share costs over $300 (at the time of writing this post) and buying 20 shares is a little out of our price range.  Now, we do have some money, and we’re pretty sure the price will go up, so we choose to buy a call option.  That call option has a strike price of $320 and will expire in 3 months.  Instead of buying the stock now, we are buying the right to buy the stock later; the money we make is still any upside of the stock, but with a smaller cash outflow.  The downside here is that we totally lose our initial investment if we buy the calls and the price doesn’t go up enough to make our cash back.  Whereas if we bought the stock, we would still have it even if the price didn’t go up as much as we’d hoped.


So how much do we pay for these ‘rights’?  The cost (i.e. value) is based on the price of the stock.  It will cost more to buy this option if the price of the stock is $321 than it would if the price of the stock is $315.  We saw this in action when we reviewed the option pricing models in previous posts.  Recall that current stock price is included in all iterations of the model.  Therefore we are deriving the call value based on the stock price, but not just any stock, the underlying asset related to our call.
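To make the payoff side concrete, here is a tiny sketch (the $320 strike comes from the example above; the sample prices are hypothetical):

    def call_payoff(stock_price, strike):
        """Intrinsic value of a call at expiration: exercise only if it pays."""
        return max(stock_price - strike, 0.0)

    # A few hypothetical stock prices around the $320 strike
    for price in (315, 321, 350):
        print(price, call_payoff(price, 320))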

Ok, we understand the idea that a derivative is based on an underlying asset.  Now we can describe some common types of derivatives.  We’ve discussed a stock option already.  A call being the right to buy and a put being the right to sell.  

In a previous post we briefly mentioned the idea of a forward contract.  Forwards are agreements – obligations, in fact, rather than rights like options – to buy or sell an underlying asset at an agreed-upon price at some point in the future.  However, these contracts are drawn up as custom documents between the party selling the asset and the party buying the asset.  They are not standard in any way and are therefore not sold on any exchange.

Futures contracts operate similarly to forward contracts; however, futures are standardized and exchange traded.  They therefore include margin requirements, and ‘true-ups’ in the accounts holding the asset over the life of the contract.


A swap agreement is another type of derivative where the two parties in the agreement exchange, or swap, the cash flows of two different financial assets/instruments.  They could swap a payment of a fixed interest rate for a floating interest rate, the most common form of interest rate swap.  They could also swap one currency for another, a cross-currency swap.  There are several other types as well, but these are the common ones.  Swaps generally do not trade on any exchange.

There is another type of swap that deserves its own paragraph: the Credit Default Swap.  These instruments essentially allow you to buy insurance on a company against any default of that company, such as bankruptcy.  Like other forms of insurance, one pays a regular premium in exchange for a large payout if the company were to default.  You are ‘swapping’ some cash flow for insurance-like protection.  These contributed to the magnitude of the financial crisis behind the Great Recession, and mismanagement of these instruments brought down the financial giant AIG.

I think I’ll leave additional discussion for another day.  More detail on how to model derivatives will be coming soon, and perhaps some additional analysis on how stock price movement affects the derivative prices.  In the meantime, stay safe and healthy.

Happy Modeling!

BSM

Structure and Components of Neo4j Databases

In the last post,  we introduced the concept of graph databases.  They give users a relationship first structure to the data as well as clear visuals outlining those relationships.  We also introduced Neo4j as a graph database technology.  This post will outline the components and general features of Neo4j as a tool for data storage and manipulation.  If you are not familiar with the idea of a graph database or you just want a high-level refresher, I recommend that you read the previous post here before continuing.  Please note these Neo4j posts are essentially my training notes from their introductory class module.  If you want to study along with me you can find a link here.   


In order to fully utilize the tool, an understanding of the structure of the platform and its available tools is needed.  I’m going to walk through the six main components each in turn.

Since we are working with a graph database, the first and probably most important component to understand is the Database itself.  What sets the Neo4j database apart?  

Firstly, the database operates with index-free adjacency, which means that every node in the database directly points to its adjacent nodes, without the need for indexes.  When a node is written into the database, it is stored as connected.  Any subsequent access to that particular node is done using pointer navigation, which is very fast.  The relationships are stored directly, and therefore do not need to be reconstructed by a query, speeding up access to the data.  The data itself can be traversed without needing to be indexed.


Another feature of the database is its emphasis on transactionality, which is important for applications that require ACID guarantees.  If a relationship between nodes is created, the nodes themselves are also updated.  This preserves the integrity of the information while keeping it flexible and easy to update, allowing the database to evolve with the engineer’s needs.  Note also that clustering is supported.

The second primary component is the graph engine.  The engine interprets Cypher statements and then executes kernel-level code to store or retrieve data.  Cypher is the query language used in Neo4j databases.  It is similar to SQL in its simplicity of structure, but goes further in allowing very natural, almost plain-language statements and queries.
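As a quick illustration of where the engine sits, here is a sketch of sending a Cypher statement from Python with the official neo4j driver (the URI and credentials are placeholders, not a real instance):

    from neo4j import GraphDatabase

    # Hypothetical local instance; swap in your own URI and credentials
    driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

    with driver.session() as session:
        # The graph engine interprets this Cypher and executes kernel-level code
        result = session.run("MATCH (n) RETURN count(n) AS node_count")
        print(result.single()["node_count"])

    driver.close()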


One of the great things about Neo4j is that it is open source.  It would be very easy to launch and use without delving deeper, but for some institutions it may be necessary to peek under the hood, and the option is available.  Out of the box, Neo4j supports several other languages: Python, Java, JavaScript, and Go, among others.

Again, because of the open-source nature of Neo4j, and of Cypher as well, there are several libraries developed by the community to address needs that do not come standard issue.  If you want to integrate with R or Ruby, there are libraries you can use.  An example of a really powerful library is Graph Algorithms, which is designed to help analyze the data in the graph database directly.  You can also integrate with the powerful Bolt protocol and with GraphQL.

In addition to these libraries, there are some tools available that make Neo4j databases easy to use.  You can use a standard web browser to access data in Neo4j or to test out Cypher statements.  There is also a Neo4j Browser that can access the graph engine.  A newer tool created by Neo4j is called Bloom.  It essentially allows you to navigate the graph database without knowing very much about Cypher.  It’s a very powerful tool and gives users accessibility right out of the box.

This video was a great way to see Bloom in action.  If you’d rather read than watch, here are my takeaways from the video demo of Bloom.  Bloom makes it very easy to navigate the data visually.  It is designed so you can zoom in and out on the data with easy-to-see, color-coded filters, all without even opening a search window.  Bloom is a very appropriate name for the tool, since the graphs appear to bloom like flowers as you move in and out of granularity.  I mentioned that Bloom also has a search feature.  It enables power users to save queries and filters to the interface, which in turn allows business users to access those queries and filters themselves.  The menu in the feature was easy to understand and required no knowledge of Cypher if the search was already saved.  Additionally, the search feature allows for user-input parameters, such as searching for data above or below a threshold.  Again, this requires no knowledge of Cypher if the filter was already saved.


Finally, because it is a graph database, the ‘white board modeling’ makes it easy to understand the structure of the data.  The flowcharts and arrows scribbled on the white board during the planning sessions with business users are exactly what gets coded into the database.  This structure, given that the nodes and relationships update together, allows for extreme flexibility and easy updates.

So there you have it folks.  The components of a Neo4j graph database are very powerful, flexible, and clear.  I think I might have drunk some Neo4j Kool-Aid.  Anyone else with me?

Happy Modeling Everyone!

 

Graph Databases: Understanding the Basics and Intro to Neo4j and Cypher


Databases are such a powerful tool.  They help us store all sorts of information in a very logical fashion and in a way that allows us to access that information later.  However, we as a global society have reached a point in data collection that makes it very difficult to march forward to exactly the same tune, with the same instruments, as before.  So relational databases are not nearly as fashionable as they used to be, with newfangled NoSQL databases emerging.  These changes to the way we think about data storage come at a cost: ACID* is no longer a standard.  If we’re working with ‘Big Data’ and we lose a few rows, it’s not material, right?  It’s like the accountant who sees a $1,500 error on a balance sheet where total assets are in the trillions.  Does that actually matter?

*Quick Sidebar on ACID – Atomicity, Consistency, Isolation, and Durability – is a standard set of properties that guarantee that database transactions are processed in a reliable manner and is especially concerned with how a database may recover from a failure during transaction processing.

So now we have databases that can handle huge amounts of data, but we lose something in the process.  Additionally, some of the queries that handle information requests, from both SQL and NoSQL databases, can be so complicated that they need to be run as a batch job instead of in real time.  This can present problems for a company.  For example, imagine an individual at a broker/dealer bank who is being moved to a different part of the organization.  The person worked with sensitive, non-public information that could be used illegally, for things like insider trading.  Now they’re moving ‘over the wall’ to a customer-facing position that should have only the same information at its fingertips as the general public.  The person could have to wait 12 hours for a batch job to run to revoke their old security permissions.  Those are a very dangerous 12 hours for the bank, and the person could easily access something sensitive or confidential inadvertently.  Think about how BIG the organization of a broker/dealer bank is and how often this could happen.  Wouldn’t it be great for there to be a real-time switch instead?


This is where graph databases come in.  They are a relationship-first approach to storing and querying data.  Graph databases are often used with online transaction processing, and are optimized for transactional performance.  They don’t infer connections between data using foreign keys; they store the connections at creation.  This makes working with the database much quicker and more flexible, allowing real-time updates even within large organizations dealing with Big Data.  Another perk of graph databases is that ACID can once again be brought back to ensure the reliability of the database in the case of transaction failure.


Neo4j, an example of an off-the-shelf graph database, is a very powerful tool.  It uses CRUD (Create, Read, Update, Delete) operations to work on the model, and is schema-optional.  Neo4j databases were created with the following goals in mind.  Firstly, they are intended to be intuitive, i.e. to have less transactional friction, by storing relationships at creation rather than inferring them at query time.  Secondly, they aim for speed: speed in development, with clear and very light coding required, as well as speed in execution, a function of that intuitiveness, which enables real-time decisions.  Finally, they are agile, meaning the database can flexibly adapt to the changing business.  It is common, for example, for the schema to evolve over time; it is not necessarily set at creation the way the schema for a relational database is.

 

This all sounds very fancy, and exciting, but why would we use them?  One can see the application in a social network, visualizing you and your friends on Facebook, for example.  But what else?  The use cases definitely don’t end there.  They can be used for real-time recommendations.  Say you want to show a customer recommendations while they are shopping online between Thanksgiving and Christmas; it becomes important to make sure they’re seeing recommendations for products that can ship before the big holiday.  That is the power of real-time decisions.  Graph-based search is another application.  Lufthansa uses a Neo4j graph database system for their inflight entertainment system.  In addition to these, there are several other real-world use cases, including (and certainly not limited to!) master data management, fraud detection, network & IT management, and identity & access management, like our bank example above.

We’ve been talking so much about graph databases that we neglected to define what we mean.  We need to understand what a graph IS when it comes to graph databases.  A graph consists of two elements: a node and a relationship.  A node is an entity that can also have labels.  A relationship is how the nodes are connected.  Both nodes and relationships can have properties.  Graph databases are a representation of the real world so they can easily be described using common language.  


Cypher is the language developed to work with Neo4j.  It’s a declarative language that focuses on describing WHAT you are interested in as opposed to HOW it is acquired.  Cypher is a very light language, meaning it can accomplish the same sorts of queries in way less code than standard SQL.  Because there is less code, and the code is less complicated, developers can spend way less time debugging the code.  On top of all of that, it’s very easy to read.  
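For a flavor of the syntax, here is a small, hypothetical Cypher query (the labels and property names are made up for illustration).  Nodes sit in parentheses, relationships in square brackets, so the pattern reads almost like the whiteboard drawing itself:

    MATCH (p:Person {name: 'Alice'})-[:WORKS_AT]->(c:Company)
    RETURN c.name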

Well, that’s all for today everyone.  I hope you enjoyed walking through this new(ish) technology with me.  I think it’s super interesting so I expect that there will be more Graph database, Neo4j, and Cypher posts in the future.  Happy Modeling!


Stay safe and healthy!

 

Binomial Models in Option Pricing, Part 5

Here we are again discussing the updates to the binomial option pricing model that will converge to the Black-Scholes-Merton Model.  Really, in order to understand what is going to be discussed here, the other parts of the series are recommended. You can find them here: Part 1, Part 2, Part 3, Part 4, and a side discussion on features.  


Let’s back up and talk theory for a minute.  We know that in finance and statistics (and the world!) we have much more predictive power when we have more data points to work with.  In dealing with an instrument that has a fixed time frame, how do we create more data points?  Well, if you recall the 2-step binomial model we have discussed already, the second step didn’t expand the expiration.  We semi-arbitrarily added a step at the half-way mark.  It had no impact on the price of the option; we are just allowing movement in the underlying asset price up to expiration for the European option we are working with.  [With American options, it’s a bit more complex.  Digging into those won’t really add to the discussion at hand, so we can revisit them, perhaps, at a later date.]  What is there to stop us from adding 4 steps, or 10, or 50?  Nothing but the complexity of the model.  The additional steps allow for more movement in the underlier, and more chances for realistic movement up and down, without changing any of the economics of the instrument – simply by expanding on the model we made.

Wonderful, we have a conceptual idea of what is going to happen here.  Now, we need to implement it.  Recall the one-step model and the two-step model we built over the last few weeks.


These are functional models that work well for one and two steps, but you can imagine this type of programming would get out of hand very quickly.  Picture even 10 steps.  What a mess.  Fortunately for us, there is an easier way to accomplish this, and it will allow scaling: we can use it for 2 steps, or 1,000.  How do we accomplish this?  We need to add another variable to our formula: n, the number of steps we will slice the time to expiration into.  I am also choosing to add a ‘show tree’ flag so that, if we want the model to output the decision tree, it can do so easily.  Up- and down-move sizes are still calculated the same way, but are now based on the length of a step rather than the full time to expiration.  The probability of an up or down move in one step is also calculated the same as before.  The next innovation in the model is to initialize the stock and call trees as matrices of zeros.  We can iterate over a matrix of zeros and fill it in based on other values in the matrix.  This is also what we will print out if we want to see the tree.  The new model looks a little more complex, but once you realize we’re using the matrix position of each value to do the right calculation, it’s easier to see.

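Here is a minimal sketch of an n-step pricer along the lines described above (the function and variable names are mine, not necessarily those in the original):

    import numpy as np

    def binomial_call(S, K, T, r, sigma, n, show_tree=False):
        """n-step binomial price of a European call."""
        dt = T / n                              # length of one time step
        u = np.exp(sigma * np.sqrt(dt))         # up-move factor per step
        d = 1 / u                               # down-move factor per step
        pi_u = (np.exp(r * dt) - d) / (u - d)   # risk-neutral up probability
        pi_d = 1 - pi_u

        # Initialize the stock and call trees as matrices of zeros
        stock = np.zeros((n + 1, n + 1))
        call = np.zeros((n + 1, n + 1))

        # Fill the stock tree: column j is the step, row i the number of down moves
        for j in range(n + 1):
            for i in range(j + 1):
                stock[i, j] = S * u ** (j - i) * d ** i

        # Payoffs at expiration, then discount back one step at a time
        call[:, n] = np.maximum(stock[:, n] - K, 0)
        for j in range(n - 1, -1, -1):
            for i in range(j + 1):
                call[i, j] = np.exp(-r * dt) * (pi_u * call[i, j + 1]
                                                + pi_d * call[i + 1, j + 1])

        if show_tree:
            print(np.round(stock, 2))
            print(np.round(call, 2))
        return call[0, 0]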

What about a test with our same data from last time?  Looks good to me, especially when you look at the tree we made.

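Something like this, with the Risk and Data Co. numbers used earlier in the series:

    # S=20, K=20, T=1 year, r=4%, sigma=14% – the example inputs from Part 2
    price = binomial_call(S=20, K=20, T=1, r=0.04, sigma=0.14, n=2, show_tree=True)
    print(round(price, 2))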

We’ve been talking about convergence for our model; what does it look like as we add more steps?  Here, let’s print a few prices and take a look at how the price changes as we increase the time steps.

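A quick way to see the convergence numerically (a sketch; the exact figures depend on the inputs above):

    for n in (1, 2, 5, 10, 50, 100, 500):
        print(n, round(binomial_call(20, 20, 1, 0.04, 0.14, n), 4))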

As we make the time steps arbitrarily small, the math approaches a continuous-time model.  When that happens, the binomial model converges to the Black-Scholes-Merton option pricing model, or BSM model.  The BSM model is based on the underlying assumption that stock prices follow a lognormal distribution.  So if we’re going to move on to create this, we’ll need a statistical package imported.  I prefer the stats module from the SciPy library.  Because of our assumptions, we’ll need a normal distribution, with mean m and standard deviation s, for the log of the stock price at time T.

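Written out (with S₀ as the price today, S_T the price at time T, and μ the stock’s mean annual return), the standard statement of that assumption is:

    ln(S_T / S₀) ~ N( (μ – σ²/2)·T , σ²·T )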

If we divide the output of this formula by T, we get the continuously compounded annual return of the stock price.  The continuously compounded annual returns are normally distributed with the mean and standard deviation below.

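That is, the annualized return ln(S_T / S₀) / T over a horizon of T years has:

    mean = μ – σ²/2          standard deviation = σ / √T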

Now that we have all this we can create an expected value for the stock price at time, T.

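The standard result is:

    E(S_T) = S₀ · e^(μ·T)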

Notice that this is similar to the difference between the arithmetic mean and the geometric mean: like the geometric mean, the mean return (μ) will always be a little less than the expected annual return (like the arithmetic mean)ⁱ.  We need to recall here that the binomial model creates a replicating portfolio that is designed to yield the risk-free rate.  To derive the BSM in continuous time, we create an instantaneously riskless portfolio (i.e. one that is risk-free over the next instant) using similar logic.

Before we do that we need to address the other assumptions for the BSM model.

  1. No arbitrage (like the Binomial model)
  2. Price of underlying asset follows a lognormal distribution
  3. The continuous risk free rate is constant and known
  4. Volatility of the underlier is constant and known
  5. Markets are frictionless 
  6. The underlying asset has no cash flow – we did talk about dividends, but this is not part of the BSM model assumptions – we can try to model the scenario but that’s out of scope for this discussion
  7. Options are valued as European options – again American style is out of scope for now

Ok!  Now we can move on to building more of the BSM model.  Because this is a continuous time model, we need a continuous distribution to correctly work with the formula.  SciPy will help us here.  

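For reference, the two distribution factors are the familiar d₁ and d₂, and the call price follows directly from them (N(·) is the standard normal CDF):

    d₁ = [ ln(S/K) + (r + σ²/2)·T ] / (σ·√T)
    d₂ = d₁ – σ·√T

    call = S·N(d₁) – K·e^(–r·T)·N(d₂)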

The math for the distribution factors seems scary, but really with SciPy doing the heavy lifting we don’t need to worry.  (And we won’t be discussing the derivation of those formulas, it’s not important for understanding the BSM model).

That’s it.  Those are all the pieces for the BSM.  We just need to build it.

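A minimal sketch of the pricer, using scipy.stats for the normal CDF (the function name is mine):

    import numpy as np
    from scipy import stats

    def bsm_price(S, K, T, r, sigma):
        """Black-Scholes-Merton price of a European call and put."""
        d1 = (np.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * np.sqrt(T))
        d2 = d1 - sigma * np.sqrt(T)
        call = S * stats.norm.cdf(d1) - K * np.exp(-r * T) * stats.norm.cdf(d2)
        put = K * np.exp(-r * T) * stats.norm.cdf(-d2) - S * stats.norm.cdf(-d1)
        return call, put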

If we input the same data for the stock and option details that we have used before we should get the same price that we converged to, $1.53 for the call price (we included the put price as well for kicks).  

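For example:

    call, put = bsm_price(S=20, K=20, T=1, r=0.04, sigma=0.14)
    print(round(call, 2), round(put, 2))   # the call comes out to ≈ $1.53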

Whoohoo!!! Five installments and we were able to prove that the binomial model converges to the BSM as we add further time steps.  And we made Python do all the tough math so we could focus on the fun stuff.


BSM

Glad you all stuck it out with me for this series, and this particularly long post.  Stay tuned for more Risk and Data topics in the future.  

Hope you all are still staying safe and healthy.  Happy Modeling!  

 

ⁱKaplan, Inc. FRM 2015 Part I Book4: Valuation and Risk Models (La Crosse: Kaplan Schweser 2014), 113.

 

Binomial Models in Option Pricing, Part 4

We’re going to continue our discussion today on the Binomial option pricing model.  The last installment was a bit of a sidebar on how we can adjust the model for a put option or for other scenarios that could affect the cash flows, dividends, etc.  If you are just joining us for this series, it is recommended that you review at least part 1, part 2, and part 3. The side discussion is not necessary background for this post, but it is recommended for a thorough understanding of the topic.  


So last time we went through a binomial model that we generated to create a simple decision tree.  We calculated the risk-neutral probability of a move upward, and the same for a move down.  Then the ending cash flows were discounted back at the risk-free rate to arrive at the current call value.  How does this work if we break things down into more time steps?  Essentially, you need to address an up and a down move at each node.  We’ll build a model for a two-step scenario before we get into a deeper discussion of what happens as we increase the time steps (Spoiler!  They converge to the Black-Scholes-Merton model).

Again, we’re going to rely on NumPy for our linear algebra in the model.  We will need NumPy’s matrix math later, so it’s better to have our formulas built on them now.  I’m a fan of not reinventing the wheel every time I need a new custom function. So as a quick reminder, here’s our old model:

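In sketch form (variable names are mine):

    import numpy as np

    def binomial_call(S, K, T, r, sigma):
        """One-step binomial value of a European call."""
        u = np.exp(sigma * np.sqrt(T))          # up-move factor
        d = 1 / u                               # down-move factor
        pi_u = (np.exp(r * T) - d) / (u - d)    # risk-neutral up probability
        ev = pi_u * max(S * u - K, 0) + (1 - pi_u) * max(S * d - K, 0)
        return np.exp(-r * T) * ev              # discount back to today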

So how do we go about adding additional decision nodes?  Think about the stock price tree we had before.  We essentially need to layer two more one-step trees onto the end of it, so it looks like the below.  Each time we move, we need a value for the up risk-neutral probability and the down risk-neutral probability.  Then we use the values at the previously calculated nodes, starting in the future and working our way back to the present, as inputs for the next calculation.

                   S·u·u
          S·u <
                   S·u·d
    S <
                   S·d·u
          S·d <
                   S·d·d

The tree also gives an idea of the math we need to work through.  Let’s take a look at what the formula for a two step tree looks like and then we’ll break it down.

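A sketch of the two-step version, written to mirror the walkthrough below (again, the names are mine):

    def binomial_call_two_step(S, K, T, r, sigma):
        """Two-step binomial value of a European call."""
        dt = T / 2                                # step size as a slice of maturity
        u = np.exp(sigma * np.sqrt(dt))
        d = 1 / u
        pi_u = (np.exp(r * dt) - d) / (u - d)     # same formula, now per step
        pi_d = 1 - pi_u

        # Payoffs at maturity for each of the four paths
        c_uu = max(S * u * u - K, 0)
        c_ud = max(S * u * d - K, 0)              # up then down
        c_du = max(S * d * u - K, 0)              # down then up (same price)
        c_dd = max(S * d * d - K, 0)

        # Probability-weighted expected payoff, discounted back to today
        ev = (pi_u * pi_u * c_uu + pi_u * pi_d * c_ud
              + pi_d * pi_u * c_du + pi_d * pi_d * c_dd)
        return np.exp(-r * T) * ev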

The first thing we notice is that the time-step size is now a function of the requested number of time steps, two in our case, and the time to maturity of the option.  We can think about the size of the time step as a percentage of the time to maturity.  This idea will make further model discussions easier.  The formulas for the risk-neutral probabilities of an up or a down move do not change.  Only the application changes; we’ll need to use them twice.

Next we look at the price of the stock at each node.  You’ll see that the first time step is the same.  At the next time step, we are evaluating the same up and down price moves as before, as if it were the beginning of a new one-step tree, conditional on the first step having happened.  I liken this to conditional probability, because it makes the math conceptually clear in my opinion.  Look closer at the first up scenario: we again need to make a decision on the size of the up move and the size of the down move, given that we already made an up move.  Hopefully this makes sense.  It’s important to grasp this so we get what’s going on ‘under the hood’ later on.


With the call values in the future it’s a little easier.  We’re only working with European-style calls, so we only need to evaluate the payoff at maturity.  Instead of two separate payoffs, we now need to calculate four.  The basic math is the same: the maximum of the stock price minus the strike price, or 0.

The expected value of the call can now be calculated.  Let’s walk through the up-up value.  Instead of multiplying the ending payoff by one probability, we need to address each step in the tree.  So, the expected value of the up-up node is the probability of one up move, multiplied by the probability of a subsequent up move, multiplied by the payoff in that scenario.  We do this three more times, once for each of the other payoff scenarios, and sum to arrive at the expected value of the call.

Then again simply, we take the present value of the expected value, discounting it at the risk-free rate of interest to arrive at the call value today.


Congratulate yourself on making it through to this point in the series.  This is a very important topic in the FRM curriculum and the basis for modern option pricing theory.  It is also not terribly simple, so kudos for sticking with me.

The next section builds on what we’ve done already.  If you look at the 2-step binomial model – it looks clunky and you can see that it would be really cumbersome to add a third step (not to mention 50 steps!).  How do we clean the programming up to address the possibility of more than 2 time-steps?  As I alluded to earlier, we’re going to take a look at some matrix math using NumPy. But, since I already took up a bunch of time, that is a discussion we’ll save for next week.  

In the meantime, stay safe and healthy everyone.  Happy Modeling!

 

Binomial Model Feature Changes

Before we continue the series on the Binomial option pricing model that started here, we are going to briefly turn our attention to a concept called the put-call parity and some of the factors we held constant for the previous model iterations.  For those just joining, I recommend that you start reviewing this series from the beginning before digging into this article: Part 1, Part 2, and Part 3.


Previously, we have really only discussed pricing when it comes to a European-style call.  And a simple call at that: no dividends, domestic currency, no other option styles.  This article will describe how to handle these other facets, among others.

The first thing we will address is the put-call parity.  The concept gives us a framework for valuing a put or a call option based on the price of the other.  As a reminder, a call option is a derivative security where you purchase the right to BUY a set number of shares at an agreed-upon price.  A put option is the right to SELL a set number of shares at an agreed-upon price.  You can think about a put like you are ‘putting’ the security back on the company’s balance sheet.  Since a put and a call on the same stock share the same underlying asset, it makes sense that the prices of the options are related.  The price of a put is equal to the price of a call, less the current stock price, plus the strike price discounted at the risk-free rate.

put = call price – stock price + strike price × e^(–risk-free rate × time)

This is easy to program in Python using NumPy.  In the example that we worked through previously, Risk and Data Co. had a current call value of $1.76; therefore the price of the put is $0.98.

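A sketch of that calculation:

    import numpy as np

    def put_from_call(call, S, K, T, r):
        """Put price implied by put-call parity."""
        return call - S + K * np.exp(-r * T)

    print(round(put_from_call(call=1.76, S=20, K=20, T=1, r=0.04), 2))  # 0.98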

What about the other scenarios that we ignored or held constant for the model?  The model is fortunately straightforward to calculate, and therefore relatively straightforward to modify.  We will address these scenarios each in turn by altering the risk-neutral probabilities.

In the case of a stock that pays a dividend, we need to remove the dividend yield from the risk-free rate.  Why?  Well, the payment of dividends is baked into the price of a stock, but we are not interested in the stock itself in this case; we want to price a stock derivative.  The owner of the derivative is not entitled to the dividends, and since the dividend is included in the price, which in turn is part of the expected return of the stock, we need to back it out.  Computationally, this is much simpler to see.  We alter our up-move probability as follows, remembering that the probabilities must sum to 1, so the down-probability formula remains unchanged.

𝛑ᵤ = (e^((r–q)t) – D) / (U – D),  where q is the dividend yield

Options on stock indices are valued the same way, because we assume that the stocks in the index pay dividends.


What about options on foreign currency assets?  The rationale and computation are very similar to the dividend yield case.  Because the expected return of the option is the risk-free rate in the domestic currency, we need to base our probability on the difference between the domestic and foreign risk-free rates.  Again, we’re adjusting the risk-free rate in our risk-neutral probability.

𝛑ᵤ = (e^((R_DC – R_FC)t) – D) / (U – D),

where R_DC is the domestic risk-free rate and R_FC is the foreign currency risk-free rate

One additional modification to note is the ability of this model to value options on futures contracts.  Since futures are costless to enter into, they are considered to be zero-growth instruments in our risk-neutral setting.  Therefore the growth factor e^(rt) is replaced with 1.

𝛑ᵤ = (1 – D) / (U – D)

These are all the function changes included in the Python models we made in Part 3.  As you may have guessed from our discussion here, the implementation is clean and simple.

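Here is a sketch of how those tweaks can look in Python – a single helper with an optional yield parameter (the names are mine, not necessarily those used in Part 3):

    import numpy as np

    def pi_up(r, t, U, D, q=0.0):
        """Risk-neutral up-move probability.

        q = 0               : plain stock (the base model)
        q = dividend yield  : dividend-paying stock or stock index
        q = R_FC            : foreign currency asset (with r = R_DC)
        """
        return (np.exp((r - q) * t) - D) / (U - D)

    def pi_up_future(U, D):
        """Risk-neutral up-move probability for an option on a futures contract."""
        return (1 - D) / (U - D)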

What about other styles of options, mainly American-style?  American-style options can be exercised at any time up to the expiration date, whereas European-style options can only be exercised on the expiration date.  In order to incorporate this into the model, we simply evaluate whether to exercise the option at each time step.  If the payoff from early exercise is greater than the present value of the option, then it is optimal to exercise early.  This evaluation should also be done at time 0, i.e. now.  If the price today is less than the value of early exercise today, the option should be exercised today.  Forgetting to assess time 0 can lead to errors in expected value and payoffs.

There we have it, a closer look at option pricing.  The next stage of the game will include the long-awaited 2-Step Binomial Option Pricing model.  In the meantime, Happy Modeling!

 

Stay Safe and Stay Healthy!   

 

Binomial Models in Option Pricing, Part 3

We have previously reviewed the basis for the binomial option pricing model, and then expanded our understanding by introducing the concepts of risk neutrality and risk-neutral probabilities.  This installment in our option pricing series will walk through some Python code that enables us to run the model very easily and efficiently.  If you have not read Part 1 and Part 2, I recommend you do so before you read this installment.


Let’s do a little recap of the various components of the model.  Essentially, the whole model is a series of small calculations that are applied to a set of key variables: current stock price, option strike price, time to maturity, stock price volatility, and the risk-free interest rate.  Fortunately for us, all of these calculations are fairly straightforward and can be done using either Python’s built-in math library or NumPy.  Later on in our analysis, in Part 4, we will create a function that works over more time steps (no peeking!), and we will need NumPy for that, so that is what we’ll use now.

After importing NumPy with the standard alias np, we can begin our programming.  Recall that the first step in the calculation of an option pricing tree is to calculate the up and down factors.  These factors allow us to predict, based on time and volatility, how much the stock price will move in one step; in the example case we’ll see later, one step is one year.  Since an option is a derivative, and its value is therefore based on the underlying asset, in this case the stock, starting with the stock prices is natural.

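In sketch form (my function names):

    import numpy as np

    def up_move(sigma, t):
        """Up-move factor: e^(sigma * sqrt(t))."""
        return np.exp(sigma * np.sqrt(t))

    def down_move(sigma, t):
        """Down-move factor: the reciprocal of the up move."""
        return 1 / up_move(sigma, t)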

Now that we know how to calculate the up and down move factors for stock price, we also need to calculate the risk neutral probabilities that each of those moves will be realized.  These probabilities are not the true probabilities of the up and down moves but rather the probabilities of the up and down moves if investors were essentially indifferent to risk. These probabilities must sum to 1, and in a two-state scenario, this means we only need to spend time calculating one of those values.  In our case, we’ll calculate the up move and then subtract it from 1 to get the down move.  

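Something like:

    def prob_up(r, t, sigma):
        """Risk-neutral probability of an up move."""
        u, d = up_move(sigma, t), down_move(sigma, t)
        return (np.exp(r * t) - d) / (u - d)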

Up and down move factors? Check.  Up and Down Risk Neutral Probabilities? Check.  Next would be to take the move factors and figure out the actual stock price in the future in each state.

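For example:

    def future_prices(S, sigma, t):
        """Stock price in the up and down states after one step."""
        return S * up_move(sigma, t), S * down_move(sigma, t)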

At this point our stock price tree is complete.  We can see the structure that I used in the previous post below, and if we run these formulas in order, you can see we arrive at the same values from Part 2, with a little rounding error.

            Su = S·U = $23.00
    S = $20 <
            Sd = S·D = $17.40

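Running the helpers with the Part 2 inputs as a check:

    S, K, T, r, sigma = 20, 20, 1, 0.04, 0.14

    print(up_move(sigma, T), down_move(sigma, T))   # ≈ 1.15, 0.87
    print(prob_up(r, T, sigma))                     # ≈ 0.61
    print(future_prices(S, sigma, T))               # ≈ (23.01, 17.39)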

We walk through this process in order to calculate the value of the option using recursion.  Effectively, we start at the point farthest out from today and work our way backwards to now.  So, the option payoff in the future states is next: it is the maximum of $0 (the option expires with no value and is not exercised) and the difference between the stock price and the strike price (a positive gain when exercised).

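A sketch:

    def call_payoffs(S, K, sigma, t):
        """Call payoff at expiration in the up and down states."""
        s_up, s_down = future_prices(S, sigma, t)
        return max(s_up - K, 0), max(s_down - K, 0)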

            c_up = max(0, Su – K)
    c <
            c_down = max(0, Sd – K)

With the option payoffs and the probabilities we calculated earlier, we can calculate the expected value of the option in the future.  The expected value is then discounted back, according to the time value of money, at the risk-free interest rate to arrive at the current value of the option.  If this is a little confusing, take another look at the trees and remember the basic tenet of the time value of money: a bird in hand is worth two in the bush.

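Putting that into code:

    def call_value(S, K, T, r, sigma):
        """Discounted, probability-weighted option value today."""
        pi_u = prob_up(r, T, sigma)
        c_up, c_down = call_payoffs(S, K, sigma, T)
        ev = pi_u * c_up + (1 - pi_u) * c_down   # expected value at expiration
        return np.exp(-r * T) * ev               # discount at the risk-free rate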

Great, so we know that our math works, there’s no reason why we can’t make this all one function.  With only a few inputs that are re-used for all the calculations it would be relatively easy.  

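A single, self-contained version might look like this; with the Part 2 inputs it comes back to roughly $1.76:

    def binomial_call(S, K, T, r, sigma):
        """One-step binomial price of a European call, start to finish."""
        u = np.exp(sigma * np.sqrt(T))
        d = 1 / u
        pi_u = (np.exp(r * T) - d) / (u - d)
        ev = pi_u * max(S * u - K, 0) + (1 - pi_u) * max(S * d - K, 0)
        return np.exp(-r * T) * ev

    print(round(binomial_call(20, 20, 1, 0.04, 0.14), 2))  # ≈ 1.76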

So there we have it: a quick function that will allow us to use the binomial model to value a call option on a stock.  Wait, wait, we’ve only been talking about CALL options.  What about a PUT option?!  No fear!  We’ll take a quick peek at the put-call parity before we manage a two-step model in our next installment.

 

Happy Modeling!  And Please everyone, Stay Healthy and Stay Home!

 

Binomial Models in Option Pricing, Part 2

If you are just joining this particular conversation I encourage you to read part 1 here before continuing.  This post assumes a basic knowledge of the time value of money and knowledge of a basic binomial model. The description of the model is based on the Financial Risk Manager certification curriculumⁱ.  


The previous post walked through the simplest example of a binomial option pricing model.  Before we move on, it is important to note that the binomial model is only applicable where the decision to exercise your option does not depend on interest rate movement.  For example, the model would fail for mortgage-backed securities because of the dependency of prepayment on interest rate movements.  One thing to note as we start throwing around risk terms like baseballs: in building this model we are really focusing on creating a bankruptcy-free portfolio, which in the course of this tutorial we have referred to, and may continue to refer to, as risk-free.  This may not be the case for all risk modeling, so I wanted to clarify the distinction here.

Now as we develop the model further, this post will introduce the concept of risk neutrality and provide some additional clarity around the model. 

So what exactly is risk-neutrality?  It is the indifference between various levels of risk.  In our case, we will be looking at risk-neutral probabilities.  The probabilities we will be working with in this post are not the actual probabilities of an up or down move; they are the probabilities if investors were totally risk neutral.  

We’re introducing some new variables into our calculations for this walkthrough.  

e = Euler’s constant

t = length of the step in the model

σ = annual volatility of the underlying asset’s returns

U = the size of the up-move factor = e^(σ√t)

D = the size of the down-move factor = e^(–σ√t) = 1/e^(σ√t) = 1/U

 

Therefore, if we have a value for r, the continuously compounded risk-free rate we can calculate the risk neutral probabilities for both upward and downward movements.

𝝅ᵤ = probability of an up move = (e^(rt) – D) / (U – D)

𝝅_d = probability of a down move = 1 – 𝝅ᵤ


 Now that we know the formulas we’ll be working with we can walk through the steps of valuing an option just as we did before.  We can go back to our Risk and Data Company for the example. Suppose the current stock price is $20 and the annual volatility is 14%.  Risk and Data is new and growing so we know they don’t pay dividends. We also know our risk-free rate is 4% per year. What should the value of a 1-year European Call option with a strike price of $20 be if we use a one-step binomial model?

First, we’ll need to calculate the payoff for the option at maturity in both the up-move and down-move scenarios.  Then we will calculate the expected value of the option in one year as the risk-neutral-probability-weighted payoff in each state.  Then finally we can discount that expected value back to today at the risk-free rate.

Step 1: We need the up-move and down-move factors to find the price of the stock in both the up state and the down state.  

U = e^(0.14×√1) = 1.15

D = 1/1.15 = 0.87

 

Step 2: Next, is calculating the risk-neutral probabilities.

𝝅ᵤ = (e^(0.04×1) – 0.87) / (1.15 – 0.87) = 0.61

𝝅_d = 1 – 0.61 = 0.39

 

Step 3: The stock prices in 1 year need to be calculated for the up and the down states.

Su = $20 × 1.15 = $23.00

Sd = $20 × 0.87 = $17.40

 

Step 4: Now that we know the expected price of a share of Risk and Data Co. stock one year in the future, we can find the payoff of the European call option in both the up and down states.  Recall that the payoff from a call option is max(0, current price – exercise price).

c-up = max($0, $23.00 – $20) = $3.00

c-down = max($0, $17.40 – $20) = $0

Step 5: Now that we have the payoffs, and we have the risk-neutral probabilities we calculated in Step 2, we can calculate the expected value of the option.

EV = ($3 × 0.61) + ($0 × 0.39) = $1.83

Step 6: The final step is discounting the future payoff back to today using the risk free interest rate of 4%.

PV = EV × e^(–rt) = $1.83 × e^(–0.04×1) = $1.76

There we have it; the current risk-neutral price of the call option is $1.76.  
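If you’d like to sanity-check the arithmetic, the six steps translate almost line for line into Python (a sketch; tiny differences from the hand-rounded values above are expected):

    import numpy as np

    S, K, T, r, sigma = 20, 20, 1, 0.04, 0.14

    U = np.exp(sigma * np.sqrt(T))          # Step 1: up-move factor    ≈ 1.15
    D = 1 / U                               #         down-move factor  ≈ 0.87
    pi_u = (np.exp(r * T) - D) / (U - D)    # Step 2: up probability    ≈ 0.61
    pi_d = 1 - pi_u                         #         down probability  ≈ 0.39
    s_up, s_down = S * U, S * D             # Step 3: prices in 1 year  ≈ 23.01, 17.39
    c_up = max(s_up - K, 0)                 # Step 4: payoffs           ≈ 3.01, 0
    c_down = max(s_down - K, 0)
    ev = pi_u * c_up + pi_d * c_down        # Step 5: expected value    ≈ 1.83
    pv = np.exp(-r * T) * ev                # Step 6: discount to today ≈ 1.76
    print(round(pv, 2))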

These models are also great visual tools when drawn as trees.  You can see the tree structure of the model below.

                         Su = $23.00 → c-up = $3.00      (prob 0.61)
    S = $20, c = $1.76 <
                         Sd = $17.40 → c-down = $0       (prob 0.39)

 

These trees can also be expanded to multiple time-steps.  The next part of the option series will walk through some python code for manually constructing a binomial tree.

Happy Modeling!

 

ⁱKaplan, Inc. FRM 2015 Part I Book4: Valuation and Risk Models (La Crosse: Kaplan Schweser 2014), 95-97.