What Does Index Sampling Mean?

Index sampling is a little-known technique that ETF providers across the industry apply in different ways. It can affect you as an investor.

Indices come in all sizes. Some have a few dozen constituents, some have thousands. For ETF fund providers, it may be costly or impractical to fully replicate a large basket of constituents or to purchase small positions that carry minor weights. Typically, ETF fund providers seek to track an underlying index as closely as possible. Sometimes, however, they use a technique called index sampling, in which they purchase only a subset of the constituents. Often, this means purchasing the largest stocks, which drive most of the index’s performance.

Index sampling is a little-known feature, but it may affect the performance of ETFs. If done right, it helps ETF fund providers achieve the optimal balance between tracking error and costs, ultimately benefiting investors.


ETF fund providers, who are fiduciaries, aim to deliver the performance of the index as closely as they can, typically by purchasing the index constituents. An ETF provider tracking an index has several ways to deliver returns:

  1. Full replication
  2. Synthetic replication (via futures, options or swap transactions)
  3. Sampling
  4. A mixture (the most common way, e.g. IWM is 90% invested + futures + options)

Full Replication

Full replication occurs when the ETF fund provider physically purchases all or most of the index constituents to match the index’s performance. It is usually chosen when all the constituents are easy to trade, or when the number of constituents is small and the fund provider must be mindful of the impact of individual stocks on the index’s performance. There is no hard rule on where full replication stops, but when an index has fewer than 2,000 constituents an ETF provider may purchase all of them. If there are more, the provider may turn to index sampling.

ETF fund providers want to undertake full replication to minimize tracking risk between the index and the fund. An example illustrates this. Assume a thematic index has 50 equally weighted constituents, i.e. 2% each. Now imagine an extreme scenario: the ETF fund provider has purchased only 49 constituents, and the lone left-out stock becomes the takeover target of a large-cap company. The stock receives a 50% takeover premium, which makes its 2% position in the index worth 3%. The ETF fund provider is now 1% off in tracking the index, while still holding 2% in cash (the unpurchased constituent). The ETF’s value would be 1% (less fees) off the index value. The tracking error would rise, and the ETF fund provider would most likely face severe criticism for not tracking the index properly, calling its competence into question and raising the ire of its investors.
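The takeover scenario above can be checked with a few lines of Python. This is only a minimal sketch under the article's simplifying assumptions: all other stocks are flat, the unpurchased 2% sits in cash earning nothing, and fees are ignored. The function name portfolio_return is made up for illustration.

```python
def portfolio_return(weights, returns):
    """Weighted sum of position returns for a one-period move."""
    return sum(w * r for w, r in zip(weights, returns))

n = 50
weights = [1.0 / n] * n                    # 50 equally weighted positions, 2% each

# Index: 49 flat stocks plus one takeover target that gains 50%.
index_moves = [0.0] * (n - 1) + [0.50]

# ETF: the same 49 flat stocks, but the 50th slot is cash (assumed 0% return).
etf_moves = [0.0] * n

index_return = portfolio_return(weights, index_moves)
etf_return = portfolio_return(weights, etf_moves)
gap = index_return - etf_return

print(f"Index return: {index_return:.2%}")
print(f"ETF return:   {etf_return:.2%}")
print(f"Tracking gap: {gap:.2%}")
```

Changing the premium or the number of constituents shows how quickly the gap shrinks as the omitted stock's weight gets smaller.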

Given the financial and reputational risks at hand, ETF fund providers have to strike the optimal balance between tracking errors and costs.

Synthetic Replication

More commonly found in Europe, synthetic replication refers to the practice in which the ETF fund provider holds the investors’ money in cash or fixed income instruments and enters into a swap transaction (with a bank) to get exposure to an index.

In this case, a swap is a contract in which two parties agree, for a fee, to pay each other the difference in index price moves, typically without exchanging any notional. The ETF fund provider has a long position in the index and the bank has a short position. If the reference index is at $100 and rises by $1, the bank will owe the ETF fund provider $1 less applicable fees. The ETF fund provider will pass this $1 (minus fees) to investors. The bank will replicate the basket of index constituents and can do so efficiently with its large trading operations. It can also seek efficiencies through securities lending.
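As a rough illustration of the settlement described above, the sketch below nets the index move against a fee. The function name and the $0.05 fee are hypothetical values chosen for the example, and real swap terms (reset dates, collateral, fee calculation) are more involved.

```python
def swap_payment_to_etf(index_start, index_end, fee):
    """Net amount the bank owes the ETF provider for one period.

    A negative result means the ETF provider owes the bank instead.
    The flat fee is an assumption for illustration only.
    """
    return (index_end - index_start) - fee

# Index rises from $100 to $101: the bank owes $1 less the (hypothetical) fee.
gain_case = swap_payment_to_etf(100.0, 101.0, fee=0.05)

# Index falls from $100 to $99: the ETF provider owes the move plus the fee.
loss_case = swap_payment_to_etf(100.0, 99.0, fee=0.05)

print(f"Index up $1:   net to ETF = {gain_case:+.2f}")
print(f"Index down $1: net to ETF = {loss_case:+.2f}")
```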

Index Sampling

Index sampling is an efficient and cost-effective technique for tracking an index with a large number of constituents. The largest and best-known indices can have thousands of constituents, but they are also typically weighted by market cap, which means some of the constituents have a very small weight. In such cases, the product provider will find it easier to purchase the larger constituents, as they tend to account for the majority of the index’s movements. The provider then either invests in a futures contract to track the remaining constituents or leaves the remainder of the fund in cash. Apart from size, the ETF fund provider also needs to balance factors such as sectors, countries or regions, minimizing large exposures in order to avoid tracking error.

Purchasing small constituents can be highly inefficient, as the underlying assets may have large bid/offer spreads or be difficult to purchase due to infrequent trading, as is the case with corporate bonds, for example. The provider may elect to run the risk of missing out on the performance of the very small constituents, as the impact on the ETF would be negligible anyway.

Consider an extreme example of a market cap weighted index with five constituents: two stocks with a $200 billion market cap each, one with $97 billion, one with $2.9 billion and one with $100 million, totaling $500 billion. When using index sampling to replicate the index, the ETF fund provider would usually purchase the four largest stocks, with a total market cap of $499.9 billion. It would exclude the $100 million stock, since it is not cost-effective to buy and may have a large bid/offer spread.

If each of the five stocks goes up by 1%, what is the difference between the values of the index and the ETF?

The index:       $500 billion x 1.01 = $505 billion

The ETF:         $499.9 billion x 1.01 = $504.899 billion

If the ETF fund provider had to replicate the index at a size of $100,000, its position would be $99,980, i.e. $20 short of every $100,000. So, the difference is minimal.
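The arithmetic of this five-stock example can be reproduced with the short sketch below. The stock labels A to E and the helper sample_largest are hypothetical; the market caps are the ones from the example, and "keep the n largest by market cap" is the simplest possible sampling rule, not necessarily what any provider actually uses.

```python
# Market caps in $ billions, taken from the example above.
market_caps_bn = {"A": 200.0, "B": 200.0, "C": 97.0, "D": 2.9, "E": 0.1}

def sample_largest(caps, n):
    """Keep the n largest constituents by market cap (a naive sampling rule)."""
    ranked = sorted(caps.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:n])

sampled = sample_largest(market_caps_bn, 4)

index_value = sum(market_caps_bn.values())
sampled_value = sum(sampled.values())
coverage = sampled_value / index_value           # share of the index actually held

move = 0.01                                      # every stock rises by 1%
index_after = index_value * (1 + move)
sampled_after = sampled_value * (1 + move)

shortfall_per_100k = 100_000 * (1 - coverage)    # dollars not invested per $100,000

print(f"Index after move:   ${index_after:.3f} billion")
print(f"Sample after move:  ${sampled_after:.3f} billion")
print(f"Coverage:           {coverage:.4%}")
print(f"Shortfall per $100,000: ${shortfall_per_100k:.2f}")
```

Lowering n in sample_largest shows how the shortfall grows once the sample starts to exclude stocks with meaningful weights.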

The table below highlights the number of constituents of some large, well-known indices, compared to the number that some ETF fund providers purchase when they use these indices as the underlying indices. 

Index Name                                       Approximate Index Constituents    Holdings Where Index Sampling Is Used
Bloomberg Barclays Aggregate Index               Over 10,900
FTSE Global All Capitalization Index             Over 8,900
MSCI All Country World Index Ex-USA IMI Index    Over 6,500
S&P National AMT-Free Municipal Bond Index       Over 12,000


Index sampling is a technique that many large ETF fund providers use as an efficient, time-saving way to replicate large indices with thousands of constituents. The approach to replicating an index varies among ETF fund providers, and each usually chooses one approach across its ETF range. Neither approach is right or wrong, as all firms are acting in their fiduciary capacity to deliver the optimal balance between tracking error and costs.

For example, Vanguard typically tries to replicate all indices as fully as possible, whereas iShares takes a sampling approach. A firm that fully replicates a basket captures the full returns from all index constituent movements, but this may cost more to trade. A firm using index sampling, by contrast, may have a cheaper basket to trade, with the savings passed on to investors, but it risks missing out on some index movers. Ultimately, the choice varies from firm to firm. As an investor, it is important to be aware of this choice and to decide whether you want lower costs or the opportunity to capture the full index returns.

