From Punch Cards to ChatGPT. My Grandfather’s Glimpse into… | by Ty Stephens | Nov, 2023


Enter the experiment.

Over the weekend, I decided to show, not tell. I wanted Skip to experience the new technology for himself, not just hear me talk about it. What if we could build a business plan for a hypothetical new farm in under 40 minutes? I would drive ChatGPT 4 with Advanced Data Analysis enabled, screen sharing to his iPhone, and we would build the plan together.

We briefly discussed our assumptions:

  1. We have 1,000 acres to farm in East Texas
  2. We have $1M in startup capital to purchase equipment
  3. We need to secure a line of credit from a bank to cover year 1 operating expenses
  4. We are interested in growing wheat and soybeans (what Skip actually farmed back in the day)

We started simple:

Prompt 1: “Write me an outline for a business plan to start a new farm in East Texas to grow wheat and soybeans”.

Skip was immediately impressed with the long-form results. But wait, we were just getting started…

Prompt 2: “Give me a list of equipment I need to get started”
Prompt 3: “Put it in a table and prioritize numerically based on what is most urgently needed to least urgent”
Prompt 4: “now create a new column for budget allocation, Assume I’m going use a combination of leases and purchases, create an additional column to recommend lease vs purchase, assume I have $1M in starting capital.”

(Time elapsed 4 minutes)

Result:

Data Generated from GPT-4, Image by Author.

Prompt 5: “How much should I set aside to ask for a line of credit for variable cost (seed, fertilizer, labor, etc.), assume 1000 acres. Create a new table.”

Result:

Data Generated from ChatGPT 4, Image by Author

By this point, I felt like Skip had just witnessed Apollo 11 all over again. He could barely contain his excitement. But we still weren’t done…

Prompt 6: “Write me a Monte Carlo simulation that I can adjust my input assumptions and test the outcome on my my budget”

Results:

import numpy as np
import matplotlib.pyplot as plt

# Define number of simulations
n_simulations = 10000

# Assumptions with [mean, std deviation]
wheat_seed_cost = [20, 2] # Cost per acre
soybean_seed_cost = [50, 5]
fertilizer_wheat_cost = [50, 5]
fertilizer_soybean_cost = [40, 4]
labor_cost = [25, 3]
pesticide_cost = [30, 3]
fuel_maintenance_cost = [15, 2]
miscellaneous_cost = [10, 1]

def simulate_farm_costs():
    wheat_seed = np.random.normal(wheat_seed_cost[0], wheat_seed_cost[1])
    soybean_seed = np.random.normal(soybean_seed_cost[0], soybean_seed_cost[1])
    fertilizer_wheat = np.random.normal(fertilizer_wheat_cost[0], fertilizer_wheat_cost[1])
    fertilizer_soybean = np.random.normal(fertilizer_soybean_cost[0], fertilizer_soybean_cost[1])
    labor = np.random.normal(labor_cost[0], labor_cost[1])
    pesticide = np.random.normal(pesticide_cost[0], pesticide_cost[1])
    fuel_maintenance = np.random.normal(fuel_maintenance_cost[0], fuel_maintenance_cost[1])
    miscellaneous = np.random.normal(miscellaneous_cost[0], miscellaneous_cost[1])

    total_cost = (wheat_seed + soybean_seed + fertilizer_wheat + fertilizer_soybean +
                  labor + pesticide + fuel_maintenance + miscellaneous) * 1000 # for 1000 acres

    return total_cost

results = [simulate_farm_costs() for _ in range(n_simulations)]

# Visualization
plt.hist(results, bins=50, edgecolor='black', alpha=0.7)
plt.title('Distribution of Total Farm Costs')
plt.xlabel('Total Cost')
plt.ylabel('Frequency')
plt.grid(True, alpha=0.2)
plt.show()

Now for the moment of truth: will the code run? We hopped over to a Jupyter notebook to run it. And there it was, on the first try:

Data Generated from ChatGPT 4, Image Generated by Python, Image from Author
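The histogram shows the shape of the distribution, but a line-of-credit request needs a number. As a minimal sketch (not part of the original session), the simulation above can be vectorized and summarized with a mean and a few percentiles. The names `ASSUMPTIONS`, `simulate_total_costs`, and `summarize`, the fixed `seed`, and the choice of the 5th and 95th percentiles are illustrative assumptions; the cost figures are the ones from the generated script.

```python
import numpy as np

# Per-acre cost assumptions as (mean, std deviation), copied from the script above
ASSUMPTIONS = {
    "wheat_seed": (20, 2),
    "soybean_seed": (50, 5),
    "fertilizer_wheat": (50, 5),
    "fertilizer_soybean": (40, 4),
    "labor": (25, 3),
    "pesticide": (30, 3),
    "fuel_maintenance": (15, 2),
    "miscellaneous": (10, 1),
}
ACRES = 1000


def simulate_total_costs(n_simulations=10_000, seed=0):
    """Return an array of simulated total costs, one entry per simulation."""
    rng = np.random.default_rng(seed)  # fixed seed (an assumption) for repeatable runs
    means = np.array([m for m, _ in ASSUMPTIONS.values()], dtype=float)
    stds = np.array([s for _, s in ASSUMPTIONS.values()], dtype=float)
    # One row per simulation, one column per cost category
    draws = rng.normal(means, stds, size=(n_simulations, len(means)))
    return draws.sum(axis=1) * ACRES


def summarize(costs):
    """Summarize simulated costs with the mean and a few percentiles."""
    p5, p50, p95 = np.percentile(costs, [5, 50, 95])
    return {"mean": costs.mean(), "p5": p5, "median": p50, "p95": p95}


costs = simulate_total_costs()
summary = summarize(costs)
for name, value in summary.items():
    print(f"{name}: ${value:,.0f}")
```

Asking the bank for something near the upper percentile rather than the mean is one conservative way to read the output, though the right cutoff is a judgment call.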

By this point, we were only 10–15 minutes in. We still had some time to spare. Could we make it interactive?

We needed a list of parameters that the user could adjust dynamically with sliders. Again we turned to ChatGPT for recommendations on what those parameters should be, based on the Monte Carlo simulation we built earlier:

Data Generated by GPT 4, Image by Author

Once we had our list of parameters, we created a ‘measures’ table in Power BI tied to 16 slicer visuals that let the user select inputs manually and dynamically update the Monte Carlo simulation. To do this, we created a ‘Python Visual’ in Power BI, dragged in all of our measures, and then updated the code as follows:

# The following code to create a dataframe and remove duplicated rows is always executed and acts as a preamble for your script: 

# dataset = pandas.DataFrame(fertilizer_soybean_cost_avg Value, fertilizer_soybean_cost_std Value, fertilizer_wheat_cost_avg Value, fertilizer_wheat_cost_std Value, fuel_maintenance_cost_avg Value, fuel_maintenance_cost_std Value, labor_cost_avg Value, labor_cost_std Value, miscellaneous_cost_avg Value, miscellaneous_cost_std Value, pesticide_cost_avg Value, pesticide_cost_std Value, soybean_seed_cost_avg Value, soybean_seed_cost_std Value, wheat_seed_cost_avg Value, wheat_seed_cost_std Value)
# dataset = dataset.drop_duplicates()

# Paste or type your script code here:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Assuming the data from Power BI is passed as 'dataset'
df = dataset

# Fetch values from the dataset
wheat_seed_cost_avg = df['wheat_seed_cost_avg Value'].iloc[0]
wheat_seed_cost_std = df['wheat_seed_cost_std Value'].iloc[0]
soybean_seed_cost_avg = df['soybean_seed_cost_avg Value'].iloc[0]
soybean_seed_cost_std = df['soybean_seed_cost_std Value'].iloc[0]
fertilizer_wheat_cost_avg = df['fertilizer_wheat_cost_avg Value'].iloc[0]
fertilizer_wheat_cost_std = df['fertilizer_wheat_cost_std Value'].iloc[0]
fertilizer_soybean_cost_avg = df['fertilizer_soybean_cost_avg Value'].iloc[0]
fertilizer_soybean_cost_std = df['fertilizer_soybean_cost_std Value'].iloc[0]
labor_cost_avg = df['labor_cost_avg Value'].iloc[0]
labor_cost_std = df['labor_cost_std Value'].iloc[0]
pesticide_cost_avg = df['pesticide_cost_avg Value'].iloc[0]
pesticide_cost_std = df['pesticide_cost_std Value'].iloc[0]
fuel_maintenance_cost_avg = df['fuel_maintenance_cost_avg Value'].iloc[0]
fuel_maintenance_cost_std = df['fuel_maintenance_cost_std Value'].iloc[0]
miscellaneous_cost_avg = df['miscellaneous_cost_avg Value'].iloc[0]
miscellaneous_cost_std = df['miscellaneous_cost_std Value'].iloc[0]

# Define number of simulations
n_simulations = 10000

# Assumptions with [mean, std deviation]
wheat_seed_cost = [wheat_seed_cost_avg, wheat_seed_cost_std]
soybean_seed_cost = [soybean_seed_cost_avg, soybean_seed_cost_std]
fertilizer_wheat_cost = [fertilizer_wheat_cost_avg, fertilizer_wheat_cost_std]
fertilizer_soybean_cost = [fertilizer_soybean_cost_avg, fertilizer_soybean_cost_std]
labor_cost = [labor_cost_avg, labor_cost_std]
pesticide_cost = [pesticide_cost_avg, pesticide_cost_std]
fuel_maintenance_cost = [fuel_maintenance_cost_avg, fuel_maintenance_cost_std]
miscellaneous_cost = [miscellaneous_cost_avg, miscellaneous_cost_std]

def simulate_farm_costs():
    wheat_seed = np.random.normal(wheat_seed_cost[0], wheat_seed_cost[1])
    soybean_seed = np.random.normal(soybean_seed_cost[0], soybean_seed_cost[1])
    fertilizer_wheat = np.random.normal(fertilizer_wheat_cost[0], fertilizer_wheat_cost[1])
    fertilizer_soybean = np.random.normal(fertilizer_soybean_cost[0], fertilizer_soybean_cost[1])
    labor = np.random.normal(labor_cost[0], labor_cost[1])
    pesticide = np.random.normal(pesticide_cost[0], pesticide_cost[1])
    fuel_maintenance = np.random.normal(fuel_maintenance_cost[0], fuel_maintenance_cost[1])
    miscellaneous = np.random.normal(miscellaneous_cost[0], miscellaneous_cost[1])

    total_cost = (wheat_seed + soybean_seed + fertilizer_wheat + fertilizer_soybean +
                  labor + pesticide + fuel_maintenance + miscellaneous) * 1000 # for 1000 acres

    return total_cost

results = [simulate_farm_costs() for _ in range(n_simulations)]

# Convert results into a dataframe
df_simulated_results = pd.DataFrame(results, columns=['Total Cost'])

# Calculate the Interquartile Range (IQR)
Q1 = df_simulated_results['Total Cost'].quantile(0.25)
Q3 = df_simulated_results['Total Cost'].quantile(0.75)

# Plotting the histogram
plt.figure(figsize=(10, 6))
n, bins, patches = plt.hist(df_simulated_results['Total Cost'], bins=50, color='blue', edgecolor='black', alpha=0.7)
plt.title('Distribution of Year 1 Variable Farm Costs from Simulation')
plt.xlabel('Year 1 Variable Cost')
plt.ylabel('Frequency')
plt.grid(True, which='both', linestyle='--', linewidth=0.5)

# Shade the IQR
for i in range(len(patches)):  # bins has one more entry than patches
    if bins[i] > Q1 and bins[i] < Q3:
        patches[i].set_facecolor('green')

plt.axvline(Q1, color='red', linestyle='dashed', linewidth=1)
plt.axvline(Q3, color='red', linestyle='dashed', linewidth=1)
plt.tight_layout()
plt.savefig('simulated_costs_histogram.png') # This will save the figure as an image file
plt.show()
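To try the script above outside Power BI, one option is to build a stand-in for the `dataset` DataFrame before running it. The sketch below is an assumption about how that could be done, not something from the original session: it reuses the default means and standard deviations from the first version of the script as hypothetical slicer values and follows the `<name>_avg Value` / `<name>_std Value` column naming the script reads.

```python
import pandas as pd

# Hypothetical slicer values as (mean, std deviation) per acre. These reuse the
# defaults from the first script; in Power BI they would come from the slicers.
defaults = {
    "wheat_seed_cost": (20, 2),
    "soybean_seed_cost": (50, 5),
    "fertilizer_wheat_cost": (50, 5),
    "fertilizer_soybean_cost": (40, 4),
    "labor_cost": (25, 3),
    "pesticide_cost": (30, 3),
    "fuel_maintenance_cost": (15, 2),
    "miscellaneous_cost": (10, 1),
}

# Build one row whose column names match what the script looks up
row = {}
for name, (avg, std) in defaults.items():
    row[f"{name}_avg Value"] = avg
    row[f"{name}_std Value"] = std

# The script only reads the first row (.iloc[0]), so a single row is enough
dataset = pd.DataFrame([row])
print(dataset.T)
```

With `dataset` defined this way, the rest of the script can be pasted into a notebook and run unchanged.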

Just for fun, we prompted ChatGPT to define the interquartile range (IQR) and shade it a different color. We also manually updated the chart labels and x-axis. The rest was just cleaning up the visuals a bit in Power BI to make the dashboard more user friendly. The end result:

Data Generated by Monte Carlo Simulation based on User Selected Parameter Inputs, Python Code Generated by ChatGPT 4, Dashboard built in MS PowerBI, Image by Author

Now we have a dynamic Monte Carlo simulation that lets us play with different input cost assumptions and project the variable operating expense we would need to get our farming operation off the ground. Leveraging ChatGPT 4, we wrote virtually no code and only tweaked what it generated. We did most of the work from an iPhone over screen share, built the last piece in Power BI Desktop, and shared it through the Power BI iPhone app. All in about 30–40 minutes of work.

My grandfather’s verdict? We accomplished in 40 minutes what would have taken him 2 years to do on his own “back in the day.” Yes, I recognize there is a lot more we could do, and the “simulation” is far from perfect. (For example, we do not delineate between the share of acres devoted to soybeans vs. wheat.) But for 40 minutes? Even I was impressed. And this is the promise of Gen AI: democratizing data science, encouraging experimentation, and speeding up development, all within the palm of your hand. It also gave a grandfather and his grandson an opportunity to reconnect over some statistics and to leverage technology in new and unexpected ways.
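For anyone who wants to close the crop-split gap mentioned above, here is one possible sketch. It is an illustration under stated assumptions, not what we built: `wheat_share`, `CROP_COSTS`, `SHARED_COSTS`, and `simulate_with_split` are hypothetical names, and treating labor, pesticide, fuel, and miscellaneous costs as shared across every acre is a simplification. Note that the generated script charges both crops' seed and fertilizer costs to all 1,000 acres, while this version charges each crop's costs only to its own acres.

```python
import numpy as np

ACRES = 1000

# (mean, std deviation) per acre, copied from the original simulation
CROP_COSTS = {
    "wheat": {"seed": (20, 2), "fertilizer": (50, 5)},
    "soybean": {"seed": (50, 5), "fertilizer": (40, 4)},
}
# Assumed to apply to every acre regardless of crop
SHARED_COSTS = {
    "labor": (25, 3),
    "pesticide": (30, 3),
    "fuel_maintenance": (15, 2),
    "miscellaneous": (10, 1),
}


def simulate_with_split(wheat_share, n_simulations=10_000, seed=0):
    """Simulate total variable cost when `wheat_share` of the acres grow wheat."""
    if not 0.0 <= wheat_share <= 1.0:
        raise ValueError("wheat_share must be between 0 and 1")
    rng = np.random.default_rng(seed)

    def draw_sum(costs):
        # Sum of independent normal draws, one per cost category (per acre)
        return sum(rng.normal(m, s, n_simulations) for m, s in costs.values())

    wheat_acres = ACRES * wheat_share
    soybean_acres = ACRES - wheat_acres
    return (
        draw_sum(CROP_COSTS["wheat"]) * wheat_acres
        + draw_sum(CROP_COSTS["soybean"]) * soybean_acres
        + draw_sum(SHARED_COSTS) * ACRES
    )


for share in (0.0, 0.5, 1.0):
    costs = simulate_with_split(share)
    print(f"wheat share {share:.0%}: mean cost ${costs.mean():,.0f}")
```

In Power BI, `wheat_share` could become one more slicer alongside the existing cost parameters.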


