1. Create your track folder
mkdir -p ~/dev/data-science
cd ~/dev/data-science
2. Data science tools: let Claude Code do it
Open Claude Code in your track folder:
claude
Then give it this prompt:
I'm setting up a data science environment. Please:
1. Install Python 3.11+ via Miniconda, then create a conda environment called "ds"
2. Install core packages in the ds environment: pandas, jupyter, matplotlib, seaborn,
scipy, statsmodels, scikit-learn, plotly
3. Check if Docker is installed. If not, tell me how to install it (it needs admin access)
After each step, verify it worked and show me the result.
Verify
Once Claude Code finishes:
conda activate ds
python --version
python -c "import pandas, matplotlib, seaborn, scipy, statsmodels, sklearn, plotly; print('All packages installed')"
jupyter notebook --version
You should see a Python version of 3.11 or later, the message "All packages installed", and a Jupyter version number.
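If you want more detail than a single pass/fail line, a short script can report each package separately. This is a minimal sketch, not part of the track's official tooling: the file name check_ds_env.py and the helper check_packages are hypothetical, and it assumes each package exposes a __version__ attribute (it prints "unknown" otherwise).

```python
import importlib

# Import names for the packages from the setup prompt.
# Note that scikit-learn is imported as "sklearn".
PACKAGES = ["pandas", "matplotlib", "seaborn", "scipy", "statsmodels", "sklearn", "plotly"]


def check_packages(names):
    """Return {name: version string, or None if the import fails}."""
    results = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # Any failure to import counts as missing.
            results[name] = None
        else:
            results[name] = getattr(module, "__version__", "unknown")
    return results


if __name__ == "__main__":
    for name, version in check_packages(PACKAGES).items():
        print(f"{name:12} {version or 'MISSING'}")
```

Save it as check_ds_env.py and run it with python check_ds_env.py while the ds environment is active. Any line marked MISSING names a package to reinstall.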
3. Your first look
Everything is installed. Before you start Project 1, see what Claude Code can do when you point it at a data science problem. Give it this prompt:
Create a small CSV dataset of 300 hospital appointments with columns: patient_age,
day_of_week, lead_time_days, sms_reminder_sent, no_show. About 20% should be no-shows.
Then explore the data: profile it, check for patterns in no-shows by age group and
day of week, run a chi-squared test on sms_reminder vs no_show, and produce 3
visualizations that tell the story. Summarize the findings in plain language.
As you work through the track, you'll learn why a single prompt isn't enough: why that chi-squared test might have violated its assumptions, why those visualizations might be misleading, why a "20% no-show rate" might hide important subgroup differences, and why a client would need you to explain what the findings actually mean for their business.
But for now, look at what just happened. That's the starting point.
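To make one of those pitfalls concrete: a common rule of thumb for the chi-squared test asks for an expected count of at least 5 in every cell of the contingency table. The sketch below checks that by hand using only the standard library. The function names and the counts are made up for illustration (they assume SMS reminders were sent rarely) and are not output from the prompt above. In the ds environment, scipy.stats.chi2_contingency returns the same expected counts alongside the test statistic.

```python
def expected_counts(table):
    """Expected cell counts under independence: row_total * col_total / grand_total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def small_cells(table, threshold=5):
    """Return (row, col, expected) for every cell whose expected count is below threshold."""
    return [
        (i, j, e)
        for i, row in enumerate(expected_counts(table))
        for j, e in enumerate(row)
        if e < threshold
    ]


# Made-up counts for illustration only, not results from any real dataset.
# Rows: sms_reminder_sent (no, yes); columns: [showed up, no-show].
observed = [
    [230, 55],
    [10, 5],
]

for i, j, e in small_cells(observed):
    print(f"row {i}, col {j}: expected count {e:.1f} is below 5")
```

With these illustrative counts, the overall no-show rate is still 20%, yet one cell falls below the threshold. In that case the p-value from a standard chi-squared test deserves caution, and an exact test may be the better choice.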