Data Insights

Real patterns from 88,000+ patient records. Cleaned via dataset.py, served by Flask.

📊 About this data

Charts call GET /api/vizdata?dataset=cardio. Data is cleaned in dataset.py: impossible BP, height, weight and BMI values are removed. The ML model (GradientBoosting) trains on the combined 88k dataset and reaches accuracy ~73.3% and AUC ~0.80.
The Hypertension Dataset contains 175k records from 175 countries with additional fields: family history, stress level, and salt intake. Target: Hypertension (High/Low).
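
The cleaning step described above for the cardio data can be sketched as a simple plausibility filter. This is a minimal illustration, not the contents of dataset.py: the bounds are hypothetical, and the field names (ap_hi, ap_lo, height, weight) are assumed to follow the Kaggle Cardio Train columns.

```python
# Hypothetical plausibility bounds; the page does not state the real cut-offs.
BOUNDS = {
    "ap_hi": (70, 250),    # systolic BP, mmHg
    "ap_lo": (40, 150),    # diastolic BP, mmHg
    "height": (120, 220),  # cm
    "weight": (30, 200),   # kg
    "bmi": (12, 60),       # kg/m^2
}


def bmi(weight_kg, height_cm):
    """Body mass index in kg/m^2."""
    height_m = height_cm / 100
    return weight_kg / (height_m ** 2)


def is_plausible(record):
    """True if every checked field lies inside its assumed bounds."""
    if record["height"] <= 0:
        return False  # avoid dividing by zero when computing BMI
    values = dict(record, bmi=bmi(record["weight"], record["height"]))
    in_range = all(lo <= values[key] <= hi for key, (lo, hi) in BOUNDS.items())
    # Systolic pressure must exceed diastolic pressure.
    return in_range and values["ap_hi"] > values["ap_lo"]


def clean(records):
    """Keep only the records that pass the plausibility check."""
    return [r for r in records if is_plausible(r)]
```

Rows are dropped rather than corrected, which matches the "removed" wording above; whether dataset.py also repairs obvious unit errors is not stated.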

Age Distribution by Cardiovascular Risk

How age correlates with high vs low risk diagnosis

Overall Risk Prevalence

Proportion of high vs low risk patients in dataset

Risk by Gender

Breakdown of risk by biological sex

Cholesterol Distribution

1 = Normal (<200), 2 = Above Normal (200–239), 3 = High (240+) mg/dL
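
The category codes in the caption above map onto a measured value as follows. This is a small sketch; the function name is hypothetical, and the dataset itself stores only the code, not the mg/dL reading.

```python
def cholesterol_category(mg_dl):
    """Map total cholesterol in mg/dL to the 1/2/3 code used by the chart."""
    if mg_dl < 200:
        return 1  # Normal
    if mg_dl < 240:
        return 2  # Above Normal
    return 3      # High
```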

Smoker vs Non-Smoker (High Risk)

Among high-risk patients, what fraction smokes?
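
As a sketch of the quantity this chart reports, the share of smokers is computed within the high-risk group only. The field names smoke and cardio (both 0/1) are assumed from the Kaggle Cardio Train columns, and the function name is hypothetical.

```python
def smoker_share_among_high_risk(records):
    """Fraction of high-risk patients (cardio == 1) who smoke (smoke == 1)."""
    high_risk = [r for r in records if r["cardio"] == 1]
    if not high_risk:
        return None  # undefined when there are no high-risk patients
    return sum(r["smoke"] for r in high_risk) / len(high_risk)
```

Note that this is the share of smokers among high-risk patients, which is not the same as the share of smokers who are high risk.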

BMI vs Systolic Blood Pressure

200 sampled patients: red = high risk, blue = low risk
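
One way to prepare the points for such a scatter plot is sketched below. The sampling method, the seed, and the output shape are assumptions; the page only says that 200 patients are sampled and coloured by risk.

```python
import random


def scatter_points(records, n=200, seed=0):
    """Sample up to n patients and return BMI / systolic BP points with a colour."""
    rng = random.Random(seed)  # fixed seed so the chart is reproducible
    chosen = rng.sample(records, min(n, len(records)))
    points = []
    for r in chosen:
        bmi = r["weight"] / (r["height"] / 100) ** 2
        points.append({
            "x": round(bmi, 1),  # BMI in kg/m^2
            "y": r["ap_hi"],     # systolic BP in mmHg
            "color": "red" if r["cardio"] == 1 else "blue",
        })
    return points
```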

BMI Category Distribution

Risk count across BMI categories
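
The counts behind this chart can be sketched as follows. The page does not say which BMI categories it uses, so the standard WHO adult cut-offs (18.5, 25, 30) are assumed here, along with height in cm and weight in kg.

```python
from collections import Counter


def bmi_category(bmi):
    """WHO adult BMI category (assumed; the page does not name its cut-offs)."""
    if bmi < 18.5:
        return "Underweight"
    if bmi < 25:
        return "Normal"
    if bmi < 30:
        return "Overweight"
    return "Obese"


def risk_counts_by_bmi_category(records):
    """Count patients per (BMI category, cardio label) pair."""
    counts = Counter()
    for r in records:
        bmi = r["weight"] / (r["height"] / 100) ** 2
        counts[(bmi_category(bmi), r["cardio"])] += 1
    return counts
```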

Activity Level & Risk

Physically active vs inactive risk comparison
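
A group comparison like this one reduces to the high-risk rate within each group. The sketch below assumes 0/1 fields named active and cardio, as in the Kaggle Cardio Train columns; the function name is hypothetical.

```python
from collections import defaultdict


def risk_rate_by_group(records, field):
    """Share of high-risk patients (cardio == 1) within each value of `field`."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for r in records:
        totals[r[field]] += 1
        positives[r[field]] += r["cardio"]
    return {group: positives[group] / totals[group] for group in totals}
```

The same function would serve the gender breakdown by passing a different field name.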

Data sources: Cardio Train dataset (Kaggle, Russian cohort) + Shanxi Cardiovascular dataset (Chinese cohort). 88,202 records after cleaning. Target: cardio (0=no disease, 1=disease present).