Computing Minimum Sample Size for A/B Tests in Statsmodels: How and Why | by Jason Jia

A popular, high-performing numerical method is Brent’s method. Brent’s method is a root-finding algorithm that combines the bisection method, the secant method and inverse quadratic interpolation: it takes the fast interpolation steps when they behave well and falls back to bisection when they do not. Further details of its implementation in Statsmodels can be found here.
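To get a feel for what a Brent-style root finder does, here is a minimal sketch using SciPy's `brentq`. The toy function `f` and the bracket `[2, 3]` are my own choices for illustration and have nothing to do with power calculations yet.

```python
from scipy.optimize import brentq


def f(x):
    # Toy function (an assumption for illustration) with a single root between 2 and 3
    return x**3 - 2*x - 5


# brentq needs a bracket [a, b] where f(a) and f(b) have opposite signs
root, info = brentq(f, 2.0, 3.0, full_output=True)

print(f"root = {root:.6f}")
print(f"converged = {info.converged}, function calls = {info.function_calls}")
```

The key requirement is the bracket: the function must change sign between the two end points, which is why the sample-size search needs sensible lower and upper starting values.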

In Python, the implementation looks like this:

from statsmodels.tools.rootfinding import brentq_expanding

def solve_power(self, effect_size=None, nobs1=None, alpha=None, power=None,
                ratio=1., alternative='two-sided'):
    print('--- Arguments: ---')
    print('effect_size:', effect_size, 'nobs1:', nobs1, 'alpha:', alpha,
          'power:', power, 'ratio:', ratio, 'alternative:', alternative, '\n')

    # Check that only nobs1 is None
    kwds = dict(effect_size=effect_size, nobs1=nobs1, alpha=alpha,
                power=power, ratio=ratio, alternative=alternative)
    key = [k for k, v in kwds.items() if v is None]
    assert key == ['nobs1']

    # Check that the effect_size is not 0
    if kwds['effect_size'] == 0:
        raise ValueError('Cannot detect an effect-size of 0. Try changing your effect-size.')

    # Initialize the counter
    self._counter = 0

    # Define the function that we want to find the root of
    # We want to find nobs1 s.t. current power = target power, i.e. current power - target power = 0
    # So func = current power - target power
    def func(x):
        kwds['nobs1'] = x
        # Always the same target power specified in keywords, e.g. 0.8
        target_power = kwds.pop('power')
        # Current power given the current nobs1; note that self.power does not take power as an argument
        current_power = self.power(**kwds)
        # Add power back to kwds for the next iteration
        kwds['power'] = target_power
        fval = current_power - target_power
        print(f'Iteration {self._counter}: nobs1 = {x}, current power - target power = {fval}')
        self._counter += 1
        return fval

    # Get the starting values for nobs1, given the brentq_expanding algorithm
    # In the original code, this is the self.start_bqexp dictionary set up in the __init__ method
    bqexp_fit_kwds = {'low': 2., 'start_upp': 50.}

    # Solve for nobs1 using brentq_expanding
    print('--- Solving for optimal nobs1: ---')
    val, _ = brentq_expanding(func, full_output=True, **bqexp_fit_kwds)
    return val
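The `low` and `start_upp` values hint at how the search copes with not knowing an upper bound on the sample size in advance: start with a guess and widen the bracket until the function changes sign, then hand the bracket to Brent's method. The following is a rough sketch of that idea only, not Statsmodels' actual `brentq_expanding`. The names `expanding_brentq`, `factor` and `max_expansions` are hypothetical, and the sketch assumes the function is increasing and negative at `low`.

```python
from scipy.optimize import brentq


def expanding_brentq(func, low, start_upp, factor=10.0, max_expansions=50):
    """Hypothetical helper: widen the upper bound until func changes sign,
    then solve with Brent's method. Assumes func is increasing and func(low) < 0."""
    upp = start_upp
    for _ in range(max_expansions):
        if func(upp) >= 0:
            # Sign change found: the root lies in [low, upp]
            break
        # Root must be above upp, so move the bracket up and widen it
        low, upp = upp, upp * factor
    else:
        raise RuntimeError("No sign change found; the root may not exist")
    return brentq(func, low, upp)


# Toy increasing function (an assumption for illustration) with its root at x = 400
toy = lambda x: x**0.5 - 20

root = expanding_brentq(toy, low=2.0, start_upp=50.0)
print(f"root = {root:.4f}")
```

Here the first upper guess of 50 is too small, so the bracket moves to [50, 500] before the root finder runs. Power increases with sample size, which is what makes this kind of one-directional expansion a reasonable fit for the problem.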

1.2. Writing a stripped-down version of tt_ind_solve_power that is an exact implementation of the statistical derivation and produces the same output as the original function

The source file in Statsmodels is available here. While the original function is written to be more powerful, its generalizability also makes it harder to gain intuition on how the code works.

I thus looked through the source code line-by-line and simplified it down from 1,600 lines of code to 160, and from 10+ functions to just 2, while ensuring that the implementation remains identical.

The stripped-down code contains just two functions under the TTestIndPower class, exactly following the statistical derivation explained in Part 1:

power, which computes power given a sample size
solve_power, which finds the minimum sample size that achieves a target power using Brent’s method
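To make the division of labour between the two functions concrete, here is a compact sketch of the same idea written directly against SciPy. It is not the stripped-down code itself. It assumes a two-sided test with equal group sizes and the standard noncentral t formulation of power, and the names `two_sample_power` and `min_sample_size` are my own.

```python
import math

from scipy import stats
from scipy.optimize import brentq


def two_sample_power(effect_size, nobs1, alpha=0.05):
    """Power of a two-sided two-sample t-test with equal group sizes (sketch)."""
    df = 2 * nobs1 - 2                         # degrees of freedom for two groups of size nobs1
    ncp = effect_size * math.sqrt(nobs1 / 2)   # noncentrality parameter under H1
    t_crit = stats.t.isf(alpha / 2, df)        # upper critical value under H0
    # Probability of landing in either rejection region when H1 is true
    return stats.nct.sf(t_crit, df, ncp) + stats.nct.cdf(-t_crit, df, ncp)


def min_sample_size(effect_size, alpha=0.05, power=0.8):
    """Smallest (fractional) nobs1 whose power reaches the target (sketch)."""
    gap = lambda n: two_sample_power(effect_size, n, alpha) - power
    # Widen the upper bound until the target power is exceeded
    upp = 50.0
    while gap(upp) < 0:
        upp *= 10
    return brentq(gap, 2.0, upp)


n = min_sample_size(effect_size=0.5)
print(f"nobs1 = {n:.2f}, rounded up = {math.ceil(n)}")
```

The root finder returns a fractional sample size, so in practice you round up to the next whole number per group.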

This is the full code for the stripped-down version with a test to check that it produces the same output as the original function:
