Open-Source ADHD Objective Assessment: Design + AI Interpretation
0) Core idea (what we’re replicating—and improving)
- Task engine (CPT family): Go/No-Go + Stop-Signal + Sustained Attention (CPT), with visual + optional auditory streams.
- Activity capture: webcam-based micromovement tracking (pose/landmark kinetics) ± an optional IMU (phone in pocket) to approximate “QbActivity”.
- Outcome metrics: standard CPT metrics (omission/commission errors, reaction time (RT), RT variability), lapses, post-error slowing, and movement indices (fidget index, head sway, body displacement, micro-saccade proxy).
- Normative scoring: age/sex-adjusted Q-scores and percentiles via normative modeling (see §5).
- AI layer: probabilistic severity estimates + explanations (feature importances, counterfactuals), not a standalone diagnosis.
1) Stack & licenses (fully open)
- Front-end task runtime: jsPsych or lab.js (browser-based, low-latency; works on clinic PCs).
- Motion capture: MediaPipe Face/Hands/Pose (WebAssembly in the browser) or OpenCV + MediaPipe (Python desktop).
- Back-end & analytics: FastAPI (Python), pydantic, uvicorn.
- Stats/ML: scikit-learn, statsmodels, PyMC (Bayesian), xgboost/lightgbm, shap (explainability).
- Visualization & reports: plotly, weasyprint/reportlab → PDF.
- Packaging: Docker for one-click local installs.
- License: AGPL-3.0 (or Apache-2.0 if you prefer more permissive).
2) Task battery (replicable + extensible)
A. Sustained Attention CPT (15–20 min, adjustable)
- Rare target (e.g., press only on “X”), with a 1:3 to 1:4 target-to-non-target ratio.
- Outputs: omissions, commissions, d′ (signal detection), RT mean/SD/CV, lapses (RT > mean + 2 SD; see §13), drift (time-on-task decline). A metrics sketch follows.
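A minimal sketch of these summary metrics, assuming a pandas DataFrame of trials with hypothetical columns `is_target`, `responded`, and `rt_ms`; the log-linear correction for d′ is one common convention, not the only option.

```python
import numpy as np
import pandas as pd
from scipy.stats import norm

def cpt_summary(trials: pd.DataFrame) -> dict:
    """Summarize one CPT run. Expects columns: is_target (bool),
    responded (bool), rt_ms (float, NaN when no response)."""
    targets = trials[trials["is_target"]]
    foils = trials[~trials["is_target"]]
    # Log-linear correction keeps hit/false-alarm rates away from 0 and 1.
    hit = (targets["responded"].sum() + 0.5) / (len(targets) + 1)
    fa = (foils["responded"].sum() + 0.5) / (len(foils) + 1)
    rts = targets.loc[targets["responded"], "rt_ms"].dropna()
    return {
        "omissions_pct": 100 * (1 - targets["responded"].mean()),
        "commissions_pct": 100 * foils["responded"].mean(),
        "d_prime": norm.ppf(hit) - norm.ppf(fa),
        "rt_mean_ms": rts.mean(),
        "rt_sd_ms": rts.std(),
        "rt_cv": rts.std() / rts.mean(),
        "lapses_pct": 100 * (rts > rts.mean() + 2 * rts.std()).mean(),
    }
```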
B. Go/No-Go (5–7 min)
- Outputs: inhibitory failure rate, prepotent response bias, post-error slowing, ex-Gaussian RT parameters (μ, σ, τ) for variability.
C. Stop-Signal Task (optional, 8–10 min)
- Outputs: SSRT (stop-signal reaction time) and the inhibition function (by SSD); a sketch of the standard integration method follows.
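A minimal sketch of SSRT via the integration method with go-omission replacement, assuming hypothetical arrays of go-trial RTs, stop-signal delays, and stop-failure flags.

```python
import numpy as np

def ssrt_integration(go_rts_ms: np.ndarray, ssd_ms: np.ndarray,
                     stop_failed: np.ndarray) -> float:
    """SSRT = nth go RT - mean SSD, where n = P(respond | stop signal)."""
    # Replace go omissions (NaN) with the slowest observed RT, per convention.
    go = np.where(np.isnan(go_rts_ms), np.nanmax(go_rts_ms), go_rts_ms)
    p_respond = float(np.mean(stop_failed))
    nth_go_rt = np.quantile(go, p_respond)
    return float(nth_go_rt - np.mean(ssd_ms))
```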
D. Distractor CPT (MOXO-style, optional)
- Background audiovisual distractors (classroom noise, moving objects) with culturally neutral assets.
- Outputs: performance deltas with vs. without distractors.
All tasks log millisecond timestamps, stimulus IDs, and raw keypress/mouse events.
3) Activity/movement capture (your “QbActivity” analogue)
Webcam-only baseline (no wearables):
- Pose landmarks: head, shoulders, wrists; compute per-frame displacement, velocity, jerk, and the spectral power of micro-movements (see the kinetics sketch after this list).
- Fidget index: weighted sum of head yaw/pitch variance, wrist-trajectory entropy, and trunk sway.
- Stationarity drift: ratio of long-window to short-window variance.
- Motion-noise guardrails: face-occlusion detection and confidence thresholds; flag invalid segments.
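A minimal sketch of per-landmark kinetics from the §12 pose table; the confidence threshold is an illustrative placeholder, and the finite-difference jerk is a crude estimate that assumes a near-uniform frame rate.

```python
import numpy as np
import pandas as pd

CONF_MIN = 0.5  # illustrative confidence guardrail; lower frames are dropped

def landmark_kinetics(pose: pd.DataFrame, landmark_id: int) -> dict:
    """Kinetics for one landmark from the §12 schema (t, landmark_id, x, y, conf)."""
    lm = pose[(pose["landmark_id"] == landmark_id)
              & (pose["conf"] >= CONF_MIN)].sort_values("t")
    t, x, y = lm["t"].to_numpy(), lm["x"].to_numpy(), lm["y"].to_numpy()
    dt = np.diff(t)
    vel = np.hypot(np.diff(x), np.diff(y)) / dt          # px/s
    acc = np.diff(vel) / dt[1:]                          # px/s^2
    jerk = np.diff(acc) / dt[2:]                         # px/s^3
    # Spectral power of micro-movements (assumes near-uniform sampling).
    fs = 1.0 / np.median(dt)
    power = np.abs(np.fft.rfft(vel - vel.mean())) ** 2
    freqs = np.fft.rfftfreq(len(vel), d=1.0 / fs)
    return {
        "mean_velocity": float(vel.mean()),
        "jerk_rms": float(np.sqrt(np.mean(jerk ** 2))),
        "spectral_centroid_hz": float((freqs * power).sum() / power.sum()),
        "valid_frame_pct": 100.0 * len(lm)
            / max(int((pose["landmark_id"] == landmark_id).sum()), 1),
    }
```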
Optional IMU add-on (smartphone in pocket):
- A lightweight PWA reads the accelerometer/gyroscope and syncs via WebRTC timestamping.
- Derive RMS acceleration, jerk, and burst frequency, and correlate them with the webcam metrics (sketch below).
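A minimal sketch of the IMU-derived metrics, assuming raw accelerometer axes at a fixed sample rate; the burst threshold is an illustrative placeholder, not a validated cutoff.

```python
import numpy as np

def imu_activity(ax, ay, az, fs_hz: float, burst_g: float = 0.05) -> dict:
    """RMS acceleration, jerk, and burst frequency from accelerometer axes (m/s^2)."""
    mag = np.sqrt(np.asarray(ax)**2 + np.asarray(ay)**2 + np.asarray(az)**2)
    mag = mag - np.mean(mag)                 # crude gravity/offset removal
    jerk = np.diff(mag) * fs_hz              # derivative at fixed sample rate
    above = mag > burst_g * 9.81             # illustrative threshold, in g
    bursts = int(np.sum(np.diff(above.astype(int)) == 1))  # rising edges
    minutes = len(mag) / fs_hz / 60.0
    return {"rms_accel_ms2": float(np.sqrt(np.mean(mag**2))),
            "jerk_rms": float(np.sqrt(np.mean(jerk**2))),
            "bursts_per_min": bursts / minutes}
```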
4) Feature set (for both human interpretation and AI)
Attention/Impulsivity (CPT family)
- Omissions, commissions, RT mean/SD/CV, ex-Gaussian τ, sequential effects (AR(1) in RT), post-error slowing, vigilance slope, d′, β. (A τ/AR(1) sketch follows.)
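A minimal sketch of the ex-Gaussian and sequential-effect features, using scipy's exponnorm parameterization (shape K, with τ = K·σ); the RT array is assumed to be already cleaned of omissions.

```python
import numpy as np
from scipy.stats import exponnorm

def ex_gaussian_params(rts_ms: np.ndarray) -> dict:
    # scipy parameterizes the ex-Gaussian via shape K, where tau = K * sigma.
    K, mu, sigma = exponnorm.fit(rts_ms)
    return {"mu": float(mu), "sigma": float(sigma), "tau": float(K * sigma)}

def rt_ar1(rts_ms: np.ndarray) -> float:
    # Lag-1 autocorrelation of the RT series as a sequential-effect index.
    return float(np.corrcoef(rts_ms[:-1], rts_ms[1:])[0, 1])
```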
Activity
- Fidget index, head/torso sway (px/s), movement bursts per minute, micro-pause frequency, entropy of wrist paths, spectral centroid of movement.
Ecological robustness
- Performance with vs. without distractors; delta scores summarize susceptibility.
5) Scoring: percentiles & Q-scores via normative modeling
- Normative dataset schema: age, sex, handedness, task version, device, lighting, education band.
- Model: hierarchical Bayesian normative modeling (HBM) or Gaussian-process normative modeling:
  - Fits the expected value and variance of each feature as a function of covariates (age/sex/device).
  - Individual deviation (a z-like Q-score): Q = (x − μ̂) / σ̂, where μ̂ and σ̂ are the model’s predicted mean and SD given age, sex, and the other covariates.
  - Multi-feature fusion: robust Mahalanobis distance → domain Q-scores (Activity, Inattention, Impulsivity) + Total Q (see the scoring sketch below).
- Interpretive bands (proposed, modifiable after validation):
  - 0.0–1.0: within typical limits
  - 1.1–1.4: mildly atypical
  - ≥ 1.5: atypical / clinically concerning
- Change metric: ΔQ ≥ 0.5 (half an SD) ⇒ flag clinically meaningful improvement.
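A minimal sketch of the scoring step, assuming the normative model has already produced μ̂ and σ̂ per feature. The fusion uses scikit-learn's robust MinCovDet covariance estimator, and the √(d²/k) rescaling is one convention for making domain Q-scores comparable across different feature counts.

```python
import numpy as np
from sklearn.covariance import MinCovDet

def q_score(x: float, mu_hat: float, sigma_hat: float) -> float:
    # Deviation from the covariate-adjusted normative expectation, in SD units.
    return (x - mu_hat) / sigma_hat

def domain_q(patient_q: np.ndarray, norm_q: np.ndarray) -> float:
    """Fuse per-feature Q-scores into a domain Q-score via a robust
    Mahalanobis distance against the normative Q distribution.
    patient_q: shape (k,); norm_q: shape (n_norms, k)."""
    mcd = MinCovDet(random_state=0).fit(norm_q)
    d2 = mcd.mahalanobis(patient_q.reshape(1, -1))[0]  # squared distance
    # Divide by k so a uniform 1-SD deviation across features maps to Q ≈ 1.
    return float(np.sqrt(d2 / patient_q.size))
```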
6) AI-assisted interpretation (decision support, not diagnosis)
- Models: gradient-boosted trees or calibrated logistic regression, outputting the probability of a clinically significant deviation in each domain.
- Calibration: Platt/isotonic per domain; reliability curves surfaced in the UI.
- Explainability:
  - Global: feature-importance bars, partial dependence.
  - Local: SHAP values for the patient; counterfactuals (“If RT variability dropped by 20%, Total Q would fall by 0.35”).
- Narrative engine: template + LLM fill for patient-friendly and clinician-grade summaries, gated by strict guardrails (facts only, cite metrics, surface uncertainty). A calibration-and-SHAP sketch follows.
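A minimal sketch of the decision-support layer for one domain: a cross-validated isotonic-calibrated logistic regression plus local SHAP attributions. Explaining an uncalibrated twin of the model is an assumption made here for simplicity; monotone calibration preserves attribution direction and ranking.

```python
import shap
from sklearn.calibration import CalibratedClassifierCV
from sklearn.linear_model import LogisticRegression

def fit_domain_model(X_train, y_train):
    """Calibrated probability model for one domain; use method='sigmoid'
    (Platt) instead of isotonic when samples are scarce."""
    model = CalibratedClassifierCV(
        LogisticRegression(max_iter=1000), method="isotonic", cv=5)
    model.fit(X_train, y_train)
    # Uncalibrated twin for explanation only.
    base = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    explainer = shap.LinearExplainer(base, X_train)
    return model, explainer

# Usage (illustrative):
# p_atypical = model.predict_proba(x_patient)[0, 1]
# local_attributions = explainer.shap_values(x_patient)
```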
7) Reporting (clear, audit-safe)
- One-page patient summary: percentile dials for the three domains, green/amber/red bands, plain-language explanation.
- Clinician report (PDF): task plots (RT time series, error raster), movement spectrograms, domain Q-scores, ΔQ vs. prior visits, QC flags (lighting, % face lost, dropped frames).
- CSV/Parquet export: raw + derived features for research.
8) Data, privacy, consent
- Local-first processing: on-device by default; optional encrypted sync to a clinic server.
- De-identification: face embeddings discarded; store only numerical landmarks/velocities; keep a SHA-256 digest of each video chunk (no video retention by default; hashing sketch below).
- Consent tiers: (a) care only; (b) anonymized norms; (c) research sharing (with IRB approval).
- Audit logs: every model version & parameter set stamped into each report.
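A minimal sketch of the chunk-hashing step in the de-identification path: the digest goes to the audit trail while the raw bytes are discarded.

```python
import hashlib

def deidentify_chunk(chunk: bytes) -> str:
    """Return the SHA-256 digest of a video chunk for the audit trail;
    the raw bytes are dropped immediately afterwards."""
    return hashlib.sha256(chunk).hexdigest()
```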
9) Validation plan (publishable, pragmatic)
- Feasibility & reliability (n ≈ 60): test–retest ICC for key features and domain Q-scores over 1–2 weeks.
- Construct validity (n ≈ 150): correlations with CPT-3/TOVA/IVA-2 and rating scales (ADHD-RS-5, Conners 3).
- Known-groups validity (n ≈ 200): ADHD vs. non-ADHD; compute AUC per domain.
- Sensitivity to change (n ≈ 80): pre/post stimulant titration; ΔQ distribution; anchor with clinician CGI-I.
- Device-variance study: cheap webcam vs. HD webcam; adjust the normative model if needed.
Power calcs & pre-registration recommended; share scripts to keep it open and credible.
10) Regulatory stance (clear and safe)
- Position the tool as clinical decision support / a research tool, not a diagnostic.
- Add an Intended Use statement, risk controls, and QC gates.
- If later seeking medical-device classification, align code and process with IEC 62304, ISO 14971, and ISO 13485.
11) MVP roadmap (12 weeks)
Weeks 1–2:
- jsPsych CPT + Go/No-Go; FastAPI backend; JSON event logs; webcam capture + MediaPipe pose; minimal UI.
Weeks 3–4:
- Feature-extraction pipeline; initial normative bootstrap (healthy volunteers across age bands); simple z/Q-scores; PDF report v1.
Weeks 5–6:
- Fidget index; distractor module; optional IMU; QC metrics (lighting/face loss).
Weeks 7–8:
- AI v1 (calibrated LR / XGBoost); SHAP explanations; clinician/patient report split.
Weeks 9–10:
- Test–retest study; bug bash; Dockerized installer; offline mode.
Weeks 11–12:
- Documentation, unit tests, opt-in telemetry, methods preprint, invite collaborators.
12) Data schemas (practical)
/data/raw/
- events_{uuid}.parquet → trial_id, stim_type, is_target, t_on, t_resp, correct, rt_ms
- pose_{uuid}.parquet → t, landmark_id, x, y, conf
- imu_{uuid}.parquet → t, ax, ay, az, gx, gy, gz
/data/derived/
- features_{uuid}.parquet → flat feature vector
- qc_{uuid}.json → dropped_frames, face_lost_pct, light_score
/reports/
- report_{uuid}.pdf, report_{uuid}.json (for EHR ingestion)
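A minimal sketch of enforcing the events schema at write time with pandas (Parquet I/O assumes pyarrow is installed); the dtype map is an assumption, with the nullable "boolean" dtype allowing a missing correct value on omission trials.

```python
import pandas as pd

EVENT_SCHEMA = {
    "trial_id": "int64", "stim_type": "string", "is_target": "bool",
    "t_on": "float64", "t_resp": "float64",  # t_resp stays NaN on omissions
    "correct": "boolean", "rt_ms": "float64",
}

def write_events(df: pd.DataFrame, uuid: str, root: str = "data/raw") -> None:
    # Coerce column order and dtypes so files are identical across sites.
    df = df[list(EVENT_SCHEMA)].astype(EVENT_SCHEMA)
    df.to_parquet(f"{root}/events_{uuid}.parquet", index=False)
```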
13) Example feature definitions (succinct)
- RT variability (CV): std(RT) / mean(RT)
- Lapses: % of trials with RT > mean + 2·SD
- Post-error slowing: mean(RT after errors) − mean(RT after correct trials)
- Fidget index: w1·var(head_yaw) + w2·var(head_pitch) + w3·path_entropy(wrists) + w4·jerk_rms, with weights learned from training data (code sketch below).
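The same definitions as code, a minimal sketch with hypothetical argument names; the unit weights in fidget_index are placeholders for the trained w1..w4.

```python
import numpy as np

def rt_cv(rt: np.ndarray) -> float:
    return float(np.std(rt) / np.mean(rt))

def lapse_pct(rt: np.ndarray) -> float:
    return float(100 * np.mean(rt > np.mean(rt) + 2 * np.std(rt)))

def post_error_slowing(rt: np.ndarray, prev_error: np.ndarray) -> float:
    # prev_error: True where the *preceding* trial was an error.
    return float(np.mean(rt[prev_error]) - np.mean(rt[~prev_error]))

def fidget_index(head_yaw, head_pitch, wrist_path_entropy, jerk_rms,
                 w=(1.0, 1.0, 1.0, 1.0)) -> float:
    # Placeholder unit weights; the real w1..w4 come from training data.
    return (w[0] * np.var(head_yaw) + w[1] * np.var(head_pitch)
            + w[2] * wrist_path_entropy + w[3] * jerk_rms)
```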
14) AI interpretation snippet (logic you can ship)
- Domain probabilities: P(atypical domain) = calibrated_model(features).
- Total Q: robust combination of domain Q-scores (e.g., trimmed mean).
- Narrative rules (a rule-layer sketch follows):
  - If Q_inattention ≥ 1.5 and RT_CV is high → “Marked variability consistent with inattention.”
  - If Q_activity ≥ 1.5 with a high fidget index and frequent bursts → “Elevated motor restlessness.”
  - Always surface confidence and QC notes.
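A minimal sketch of that rule layer; the ≥ 1.5 cutoffs mirror the §5 bands, while rt_cv_hi, fidget_hi, and bursts_hi are illustrative placeholders pending validation.

```python
def narrative_flags(q: dict, feats: dict, qc: dict, rt_cv_hi: float = 0.35,
                    fidget_hi: float = 1.5, bursts_hi: float = 10.0) -> list:
    """Rule layer gating the narrative engine; thresholds beyond the Q
    bands are illustrative placeholders, to be set during validation."""
    notes = []
    if q["inattention"] >= 1.5 and feats["rt_cv"] > rt_cv_hi:
        notes.append("Marked variability consistent with inattention.")
    if (q["activity"] >= 1.5 and feats["fidget_index"] > fidget_hi
            and feats["bursts_per_min"] > bursts_hi):
        notes.append("Elevated motor restlessness.")
    # Confidence and QC context accompany every claim, per the rules above.
    notes.append(f"QC: face lost {qc['face_lost_pct']:.1f}% of frames; "
                 f"{qc['dropped_frames']} dropped frames.")
    return notes
```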
15) Ethics & equity
- Bias checks: stratify performance by age, sex, language, and device quality.
- Transparent thresholds: publish how bands map to Q-scores/percentiles.
- No black-box edicts: every AI claim is paired with visible metrics.
16) Where this goes next (your clinic’s edge)
- Clinic-ready: run assessments before and after stimulant trials, neurofeedback blocks, or school-term changes; track ΔQ.
- Research: rapid piloting for ADHD subtyping, digital phenotyping, and treatment personalization.
- Education: patient-friendly visuals demystify “brain fog” vs. inattention, reduce stigma, and anchor shared decisions.
About the Author
✦ Dr. Srinivas Rajkumar T, MD (AIIMS, New Delhi)
Assistant Professor of Psychiatry, Sree Balaji Medical College & Hospital, Chennai
Consultant Psychiatrist, Mind and Memory Clinic, Apollo Clinic, Velachery, Chennai (Opp. Phoenix Mall)
My expertise spans ADHD, neurodevelopmental disorders, and neuromodulation therapies (rTMS, tDCS, neurofeedback, and digital brain-based tools). I am also passionate about integrating AI and open-source methods into clinical psychiatry to enhance diagnostic objectivity and patient outcomes.
📧 srinivasaiims@gmail.com
📍 Chennai, India
Call for Collaboration
If this blueprint nudges your curiosity, don’t wait for perfect—pick a module, build a scrappy v0, and ship. I’m actively collaborating with clinicians, engineers, and researchers on open, auditable ADHD assessment tools.
Email me at srinivasaiims@gmail.com with the subject “Open ADHD Tool — Collab” and a 3–5 line summary of what you’re tackling. When you make progress—code, dataset, validation, or even a negative result—send an update. I’ll credit contributors, review useful PRs, and help test promising ideas in clinic.