Methodology

How QuantAI computes your statistics

Every number QuantAI reports comes from a deterministic mathematical function — not an AI model. This page documents exactly which libraries, functions, and decision rules back each analysis, so you can cite your tools accurately and verify any result independently.

Computation Engine

Statistical libraries

scipy.stats

SciPy ≥ 1.11

Core statistical tests — t-tests, ANOVA, Mann-Whitney, Kruskal-Wallis, Wilcoxon, Pearson & Spearman correlation, chi-square, normality and homogeneity checks.

Virtanen et al. (2020). SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nature Methods, 17, 261–272.

statsmodels

statsmodels ≥ 0.14

OLS and logistic regression, VIF computation, Durbin-Watson statistic, Breusch-Pagan homoscedasticity test, Tukey HSD post-hoc.

Seabold & Perktold (2010). Statsmodels: Econometric and statistical modeling with Python. Proceedings of the 9th Python in Science Conference.

pandas

pandas ≥ 2.0

Data loading, missing data computation, listwise deletion, variable type detection, and Cronbach's alpha via covariance matrix.

McKinney (2010). Data structures for statistical computing in Python. Proceedings of the 9th Python in Science Conference.

Anthropic Claude

claude-sonnet-4-6

Writes the APA Results paragraph only. Claude receives computed statistics and variable names — never raw data. It cannot modify or fabricate statistical values.

Anthropic (2024). Claude: A next-generation AI assistant. anthropic.com

12 Analyses

Function-level documentation

The exact function called for each analysis, the effect size reported, and any automatic corrections applied.

| Analysis | Library | Function / Method | Effect Size | Auto-correction |
|---|---|---|---|---|
| Independent Samples t-Test | scipy.stats | ttest_ind(equal_var=False if Levene p ≤ .05) | Cohen's d | Welch's applied automatically when Levene's p ≤ .05 |
| Paired Samples t-Test | scipy.stats | ttest_rel | Cohen's d | |
| One-Way ANOVA | scipy.stats | f_oneway + Tukey HSD post-hoc | η² | |
| Mann-Whitney U | scipy.stats | mannwhitneyu | rank-biserial r | |
| Kruskal-Wallis | scipy.stats | kruskal | η² | |
| Wilcoxon Signed-Rank | scipy.stats | wilcoxon | effect size r | |
| Pearson Correlation | scipy.stats | pearsonr | r, 95% CI | |
| Spearman Correlation | scipy.stats | spearmanr | ρ | |
| OLS Linear Regression | statsmodels | OLS.fit() + VIF + Durbin-Watson + Breusch-Pagan | R², Adjusted R² | Heteroscedasticity flagged via Breusch-Pagan |
| Binary Logistic Regression | statsmodels | Logit.fit() | OR, 95% CI, McFadden R² | |
| Chi-Square Test | scipy.stats | chi2_contingency | Cramér's V | Yates' correction for 2×2 tables |
| Cronbach's Alpha | numpy / pandas | Covariance matrix approach (k/(k-1)) × (1 − Σσᵢ²/σₜ²) | α, 95% CI | |

Note. Every function is called with its library-default parameters unless an assumption check result explicitly triggers a correction.
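The Cronbach's alpha formula in the table can be sketched as follows. This is a minimal illustration, not QuantAI's actual code: the function name `cronbach_alpha` is hypothetical, and it assumes one column per scale item, one row per respondent, and listwise deletion of incomplete rows.

```python
import numpy as np
import pandas as pd


def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha via the item covariance matrix (hypothetical helper)."""
    complete = items.dropna()            # listwise deletion (assumed here)
    k = complete.shape[1]                # number of items
    cov = complete.cov().to_numpy()      # sample covariance matrix (ddof=1)
    item_var_sum = np.trace(cov)         # sum of item variances
    total_var = cov.sum()                # variance of the summed scale score
    return float((k / (k - 1)) * (1 - item_var_sum / total_var))
```

The sum of all entries of the covariance matrix equals the variance of the total score, which is why no separate total-score column is needed.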

Assumption Checking

Automatic, before every analysis

QuantAI runs a suite of assumption checks before computing any statistic. Each check returns a green / yellow / red flag with a plain-language explanation and, where applicable, a recommendation. Flags are advisory — they do not block analysis, but they are reported alongside results so you can address them in your Methods section.

Missing Data

All analyses

Method: pandas .isna() count across all analysis columns

Thresholds: 0% → green; more than 0% but < 5% → yellow (listwise deletion noted); ≥ 5% → red (imputation recommended)
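A minimal sketch of this check, under one stated assumption: the percentage is taken as the share of missing cells across the analysis columns (the page does not say whether cells or rows are counted). The function name `missing_data_flag` is hypothetical.

```python
import pandas as pd


def missing_data_flag(df: pd.DataFrame, columns: list) -> dict:
    """Flag missing data in the analysis columns (hypothetical helper)."""
    subset = df[columns]
    n_missing = int(subset.isna().sum().sum())     # missing cells
    pct = 100.0 * n_missing / subset.size          # assumed: share of cells
    if n_missing == 0:
        flag = "green"
    elif pct < 5:
        flag = "yellow"   # listwise deletion noted
    else:
        flag = "red"      # imputation recommended
    return {"missing_cells": n_missing, "percent": pct, "flag": flag}
```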

Normality

t-tests, ANOVA, regression residuals

Method: Shapiro-Wilk (scipy.stats.shapiro) for n < 50; D'Agostino-Pearson (scipy.stats.normaltest) for n ≥ 50

Thresholds: p > .05 → assumption met (green); p ≤ .05 → assumption violated (red), non-parametric alternative recommended
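The sample-size switch between the two tests might look like the sketch below. The function name `normality_check` and the returned dictionary layout are assumptions made for illustration; the two SciPy calls are the ones named above.

```python
import numpy as np
from scipy import stats


def normality_check(values, alpha: float = 0.05) -> dict:
    """Pick the normality test by sample size (hypothetical helper)."""
    x = np.asarray(values, dtype=float)
    x = x[~np.isnan(x)]                          # drop missing values first
    if len(x) < 50:
        name = "Shapiro-Wilk"
        stat, p = stats.shapiro(x)
    else:
        name = "D'Agostino-Pearson"
        stat, p = stats.normaltest(x)
    flag = "green" if p > alpha else "red"       # red: consider a non-parametric test
    return {"test": name, "statistic": float(stat), "p": float(p), "flag": flag}
```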

Homogeneity of Variance

Independent t-test, One-Way ANOVA

Method: Levene's test (scipy.stats.levene)

Thresholds: p > .05 → equal variances (green); p ≤ .05 → unequal variances (yellow), Welch's correction applied automatically for t-tests
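A hedged sketch of how Levene's result can drive the choice between Student's and Welch's t-test. The function name `independent_ttest` is hypothetical, and the pooled-standard-deviation form of Cohen's d is an assumption, since the page does not state which variant is reported.

```python
import numpy as np
from scipy import stats


def independent_ttest(a, b, alpha: float = 0.05) -> dict:
    """Levene-gated independent-samples t-test (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # SciPy's default centers on the median (Brown-Forsythe variant of Levene's test).
    _, levene_p = stats.levene(a, b)
    equal_var = levene_p > alpha
    t, p = stats.ttest_ind(a, b, equal_var=equal_var)   # Welch's test when equal_var is False
    n1, n2 = len(a), len(b)
    # Cohen's d with a pooled SD (assumed variant).
    pooled_sd = np.sqrt(((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1)) / (n1 + n2 - 2))
    d = (a.mean() - b.mean()) / pooled_sd
    return {"t": float(t), "p": float(p), "cohens_d": float(d),
            "levene_p": float(levene_p), "welch_applied": not equal_var}
```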

Sample Size

OLS linear regression, logistic regression

Method: 10:1 rule — minimum n = number of predictors × 10

Thresholds: Met → green; Not met → red with specific minimum reported
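The 10:1 rule reduces to a single comparison. The sketch below assumes the sample size is counted after listwise deletion; the function name `sample_size_flag` is hypothetical.

```python
def sample_size_flag(n_observations: int, n_predictors: int) -> dict:
    """Apply the 10:1 rule and report the minimum n (hypothetical helper)."""
    minimum_n = 10 * n_predictors
    met = n_observations >= minimum_n
    return {"minimum_n": minimum_n, "flag": "green" if met else "red"}
```

For example, a model with 4 predictors needs at least 40 complete observations to be flagged green.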

Multicollinearity (VIF)

OLS linear regression, logistic regression

Method: statsmodels variance_inflation_factor per predictor

Thresholds: VIF < 5 → acceptable (green); 5–10 → moderate (yellow); > 10 → severe (red), removal/combination recommended

Independence of Errors

OLS linear regression

Method: Durbin-Watson statistic (statsmodels.stats.stattools.durbin_watson)

Thresholds: 1.5 < DW < 2.5 → errors independent (green); outside range → potential autocorrelation (yellow)
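Applied to the residuals of a fitted model, the check can be sketched as below. The function name `independence_flag` is hypothetical; `durbin_watson` is the statsmodels function named above.

```python
import numpy as np
from statsmodels.stats.stattools import durbin_watson


def independence_flag(residuals) -> dict:
    """Durbin-Watson statistic with the 1.5 to 2.5 band (hypothetical helper)."""
    dw = float(durbin_watson(np.asarray(residuals, dtype=float)))
    flag = "green" if 1.5 < dw < 2.5 else "yellow"
    return {"durbin_watson": dw, "flag": flag}
```

Values near 2 indicate no first-order autocorrelation; values toward 0 suggest positive autocorrelation and values toward 4 suggest negative autocorrelation.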

Homoscedasticity

OLS linear regression

Method: Breusch-Pagan test (statsmodels.stats.diagnostic.het_breuschpagan)

Thresholds: p > .05 → homoscedastic (green); p ≤ .05 → heteroscedasticity detected (yellow), robust standard errors recommended

AI Narrative Generation

What Claude sees — and what it does not

Claude receives

  • Analysis type name (e.g., "independent_samples_ttest")
  • Computed statistics (t, df, p, Cohen's d — numbers only)
  • Variable names as you assigned them
  • Assumption check results (flag names and status)
  • Whether any automatic correction was applied

Claude never receives

  • Your raw data file or any rows from it
  • Individual participant responses
  • Any personally identifiable information
  • Email addresses, IDs, or demographic raw values
  • File contents or file names
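The two lists above imply a summary-only payload. The sketch below is purely illustrative: the function name, field names, and the scalar-only guard are all hypothetical and are not taken from QuantAI's code. It shows one way to make sure only computed scalars, never arrays of raw values, are serialized for the narrative step.

```python
import json
from numbers import Real


def build_narrative_payload(analysis_type, statistics, variable_names,
                            assumption_flags, correction_applied) -> str:
    """Serialize summary statistics only (hypothetical field names)."""
    clean_stats = {}
    for key, value in statistics.items():
        # Reject anything that is not a single number, e.g. a list of raw scores.
        if isinstance(value, bool) or not isinstance(value, Real):
            raise TypeError(f"statistic {key!r} must be a single number")
        clean_stats[key] = float(value)
    payload = {
        "analysis_type": analysis_type,
        "statistics": clean_stats,
        "variable_names": list(variable_names),
        "assumption_flags": dict(assumption_flags),
        "correction_applied": bool(correction_applied),
    }
    return json.dumps(payload)
```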

Claude system prompt constraints (enforced on every call)

  • Write in past tense, third person, formal academic register
  • Report all statistics in APA 7 format: F(df1, df2) = X.XX, p = .XXX, Cohen's d = X.XX
  • Never fabricate statistics — only report values provided in the input
  • State what was tested, what was found, and what the effect size indicates
  • If an assumption was violated, note the corrective action taken
  • Keep the paragraph between 80 and 150 words

APA 7th Edition Formatting

Table and notation standards

Table borders: Horizontal rules at top, below column headers, and at bottom only. No vertical lines. Follows APA 7 §7.13.

Statistical notation: Test statistics italicized (t, F, U, H, W, r, χ²). Degrees of freedom in parentheses. p-values reported exactly to three decimal places (p = .003, not p < .05) unless p < .001.

Effect sizes: Always reported. Cohen's d for t-tests; η² for ANOVA and Kruskal-Wallis; rank-biserial r for Mann-Whitney and Wilcoxon; Cramér's V for chi-square; R² and Adjusted R² for regression.

Table notes: Appear below every table. Begin with "Note." (italic). Report key statistics not included in table columns (e.g., t, p, d values below a group-means table).

Confidence intervals: Reported for Pearson r (95% CI via Fisher z-transform) and logistic regression odds ratios (95% CI via profile likelihood).
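Two of these rules are easy to check by hand: the p-value format and the Fisher z confidence interval for Pearson r. The sketch below uses hypothetical function names and assumes the standard error 1/sqrt(n − 3) for the transformed correlation.

```python
import math
from scipy import stats


def format_p(p: float) -> str:
    """APA-style p-value: exact to three decimals, no leading zero."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")


def pearson_ci(r: float, n: int, confidence: float = 0.95) -> tuple:
    """Confidence interval for Pearson r via the Fisher z-transform."""
    z = math.atanh(r)                               # Fisher z
    se = 1.0 / math.sqrt(n - 3)                     # assumed standard error of z
    crit = stats.norm.ppf(1 - (1 - confidence) / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)
```

Because the interval is built on the z scale and transformed back, it is asymmetric around r whenever r is not zero.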

Reference: American Psychological Association. (2020). Publication manual of the American Psychological Association (7th ed.).

Verification

How to verify any result

The verification section of the QuantAI homepage publishes reference datasets with known expected values — computed independently using SPSS and R — alongside the QuantAI output for the same data. You can download any reference dataset and run the same analysis in your preferred tool to confirm the values match.

Citing QuantAI in your Methods section

"Statistical analyses were conducted using QuantAI (quantai.study), which uses scipy.stats (Virtanen et al., 2020) and statsmodels (Seabold & Perktold, 2010) for statistical computation and Claude Sonnet (Anthropic, 2024) for APA 7th edition narrative generation. All analyses followed APA 7th edition reporting standards."

Data Sources

Works with any tabular data

QuantAI analyzes any well-structured CSV, Excel, or SPSS file — survey exports, secondary databases (IPEDS, Census, administrative records), experimental data, or manually entered spreadsheets. Survey platform exports from Qualtrics, SurveyMonkey, Prolific, and Google Forms are cleaned automatically (extra header rows and metadata columns removed). For all other sources, upload your file as-is. The only requirement is one row per observation and numeric values in the columns you intend to analyze.
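Before uploading, the numeric-column requirement can be checked locally with pandas. This is a sketch under assumptions: the function name `non_numeric_columns` is hypothetical, and it only inspects column dtypes, so numbers stored as text will be reported as non-numeric.

```python
import pandas as pd


def non_numeric_columns(df: pd.DataFrame, analysis_columns: list) -> list:
    """Return the analysis columns whose dtype is not numeric (hypothetical helper)."""
    return [col for col in analysis_columns
            if not pd.api.types.is_numeric_dtype(df[col])]
```

An empty list means every column intended for analysis already holds numeric values.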
