Real patterns from 88,000+ patient records. Cleaned via dataset.py, served by Flask.
Charts call GET /api/vizdata?dataset=cardio. Data is cleaned in dataset.py, which removes impossible blood pressure, height, weight, and BMI values. The ML model (GradientBoosting) trains on the combined 88k dataset: accuracy ~73.3%, AUC ~0.80.
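The cleaning step in dataset.py is not shown here, so the following is only a minimal sketch of what removing impossible values could look like. The field names (ap_hi, ap_lo, height, weight) and every numeric bound are assumptions chosen for illustration, not the project's actual rules.

```python
# Hypothetical sketch of the cleaning described for dataset.py.
# Field names and thresholds below are assumptions, not the real implementation.

def bmi(height_cm, weight_kg):
    """Body mass index in kg/m^2 from height in cm and weight in kg."""
    height_m = height_cm / 100
    return weight_kg / (height_m ** 2)


def is_plausible(record):
    """Return True if a patient record passes every sanity check."""
    systolic = record["ap_hi"]   # assumed column name for systolic BP
    diastolic = record["ap_lo"]  # assumed column name for diastolic BP

    # Blood pressure must be in a humanly possible range (assumed bounds).
    if not (60 <= systolic <= 250 and 30 <= diastolic <= 200):
        return False
    # Systolic pressure must exceed diastolic pressure.
    if systolic <= diastolic:
        return False
    # Adult height in cm and weight in kg (assumed bounds).
    if not (120 <= record["height"] <= 220):
        return False
    if not (30 <= record["weight"] <= 250):
        return False
    # BMI derived from the two fields above (assumed bounds).
    return 10 <= bmi(record["height"], record["weight"]) <= 80


def clean(records):
    """Keep only the records that pass is_plausible."""
    return [r for r in records if is_plausible(r)]


if __name__ == "__main__":
    sample = [
        {"ap_hi": 120, "ap_lo": 80, "height": 170, "weight": 70},
        {"ap_hi": 16020, "ap_lo": 80, "height": 165, "weight": 60},  # impossible BP
        {"ap_hi": 110, "ap_lo": 70, "height": 55, "weight": 64},     # impossible height
    ]
    print(len(clean(sample)), "of", len(sample), "records kept")
```

Keeping each rule as a separate check makes it easy to tighten or loosen a single bound without touching the others.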
The Hypertension Dataset contains 175k records from 175 countries with additional fields: family history, stress level, and salt intake. Target: Hypertension (High/Low).
How age correlates with high vs low risk diagnosis
Proportion of high vs low risk patients in dataset
Breakdown of risk by biological sex
1 = Normal (<200), 2 = Above Normal (200–239), 3 = High (240+) mg/dL
Among high-risk patients, what fraction smokes?
200 sampled patients: red = high risk, blue = low risk
Risk count across BMI categories
Physically active vs inactive risk comparison
cardio (0=no disease, 1=disease present).
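As a rough illustration of the cholesterol codes and BMI categories used in the chart captions, the sketch below buckets raw values in Python. The cholesterol cutoffs come from the caption (<200, 200–239, 240+ mg/dL); the BMI cutoffs are the standard WHO adult ranges, which is an assumption because the dashboard's own BMI bins are not stated. All function names are hypothetical.

```python
from collections import Counter


def cholesterol_category(mg_dl):
    """Map total cholesterol in mg/dL to the 1/2/3 code from the caption."""
    if mg_dl < 200:
        return 1  # Normal
    if mg_dl < 240:
        return 2  # Above Normal
    return 3      # High


def bmi_category(height_cm, weight_kg):
    """Bucket BMI using WHO adult cutoffs (assumed, not confirmed by the source)."""
    bmi = weight_kg / (height_cm / 100) ** 2
    if bmi < 18.5:
        return "Underweight"
    if bmi < 25:
        return "Normal"
    if bmi < 30:
        return "Overweight"
    return "Obese"


def risk_count_by_bmi(patients):
    """Count patients per (BMI category, cardio label) pair.

    Each patient is a dict with hypothetical keys height, weight, and cardio,
    where cardio is 0 (no disease) or 1 (disease present).
    """
    return Counter(
        (bmi_category(p["height"], p["weight"]), p["cardio"]) for p in patients
    )


if __name__ == "__main__":
    patients = [
        {"height": 170, "weight": 70, "cardio": 0},
        {"height": 170, "weight": 95, "cardio": 1},
        {"height": 160, "weight": 85, "cardio": 1},
    ]
    for key, count in sorted(risk_count_by_bmi(patients).items()):
        print(key, count)
```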