JSON includes full metadata, per-trial responses, and summary statistics. CSV is flat trial-by-trial for R/Python/MATLAB. PNG exports the staircase chart.
Generates a URL that restores all control values when opened.
Clears all trial data, staircase state, and charts. Cannot be undone.
Standards governing stimulus presentation, measurement protocols, and analysis methods in psychophysics research. This engine follows the principles below; calibrated hardware is required for publication-grade compliance.
CIE Standards for Visual Threshold Measurement
CIE 017:2023 — International Lighting Vocabulary. Defines luminance threshold, contrast threshold, adaptation luminance, and related photometric quantities used in psychophysical measurement. Supersedes CIE S 017/E:2011.
CIE 041-1978 — Light as a true visual quantity. Establishes principles for photometric measurements consistent with human visual response, including spectral luminous efficiency functions V(λ) (photopic) and V′(λ) (scotopic).
CIE 159:2004 — CIECAM02 colour appearance model. Provides the chromatic adaptation and appearance framework underlying colour-appearance psychophysics. Used when experiments cross viewing conditions.
CIE 224:2017 — Colour fidelity index for accurate scientific use (R_f). Defines how faithfully a light source renders test colours, critical for colour naming and discrimination experiments.
CIE S 026:2018 — α-opic action spectra and toolbox for measuring ipRGC-influenced responses. Required for flicker and temporal sensitivity experiments involving melanopsin pathways.
ISO Standards for Psychophysical Testing
ISO 3664:2009 — Viewing conditions for graphic technology and photography. Specifies D50 and D65 reference illuminants, illuminance levels (2 000 lx for critical appraisal, 500 lx for practical appraisal), and surround conditions for consistent psychophysical evaluation.
ISO 12646:2015 — Displays for colour proofing. Specifies white point (D50 ±3 MIRED), luminance (≥80 cd/m²), uniformity (≤10% variation), and colour gamut for visual colour-matching experiments.
ISO/CIE 11664-4:2019 — CIE 1976 L*a*b* colour space; paired with ISO/CIE 11664-6 (CIEDE2000, ΔE00), the standard metric for perceptual colour differences in discrimination experiments. Threshold for just-noticeable difference: ΔE00 ≈ 1.0–2.3 under controlled conditions.
ISO 14253-1:2017 — Test uncertainty for measurement conformance. Sets the framework for reporting measurement uncertainty, applicable when psychophysical thresholds inform acceptance criteria.
IEEE Std 1789-2015 — Recommended practices for modulating current in high-brightness LEDs. Defines flicker metrics and health-risk limits; relevant for flicker sensitivity and temporal contrast measurements.
Signal Detection Theory (SDT) Framework
Equal-variance Gaussian SDT (Green & Swets, 1966) assumes signal and noise distributions are unit-normal with equal variance. Sensitivity d′ = z(H) − z(FA). Criterion c = −½[z(H) + z(FA)]. Likelihood ratio β = exp(−½[z(H)² − z(FA)²]).
Unequal-variance SDT relaxes the equal-variance (σ_S = σ_N = 1) assumption. Necessary when perceptual variance differs between signal and noise (e.g., recognition memory, familiarity paradigms). An ROC slope ≠ 1 is diagnostic.
Non-parametric index A′ provides a distribution-free sensitivity estimate when Gaussian assumptions cannot be verified. Ranges from 0.5 (chance) to 1.0 (perfect).
Multi-interval SDT: in m-AFC, d′_m = d′ · √(m / 2(m−1)) under the equal-variance Gaussian model. Corrects for interval count.
ROC analysis plots H vs FA across all criterion levels. AUROC = P(X_S > X_N) = Φ(d′/√2) under the equal-variance Gaussian model. Provides a sensitivity measure independent of observer criterion.
Hautus corrections (1995) for extreme rates: apply 0.5/N and 1−0.5/N corrections to avoid infinite z-scores at H = 0/1 or FA = 0/1.
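A minimal sketch of the equal-variance metrics above, including the extreme-rate clamping just described. Function names (sdtMetrics, phiCdf, z) are illustrative, not necessarily the engine's own; the inverse CDF uses simple bisection, which is plenty accurate for analysis.

```javascript
function phiCdf(x) { // standard normal CDF (Zelen & Severo approximation)
  const t = 1 / (1 + 0.2316419 * Math.abs(x));
  const d = 0.3989423 * Math.exp(-x * x / 2);
  const p = d * t * (0.3193815 + t * (-0.3565638 + t * (1.781478 + t * (-1.821256 + t * 1.330274))));
  return x >= 0 ? 1 - p : p;
}

function z(p) { // inverse normal CDF by bisection on phiCdf
  let lo = -6, hi = 6;
  for (let i = 0; i < 60; i++) {
    const mid = (lo + hi) / 2;
    if (phiCdf(mid) < p) lo = mid; else hi = mid;
  }
  return (lo + hi) / 2;
}

function sdtMetrics(hits, misses, fas, crs) {
  const nS = hits + misses, nN = fas + crs;
  let H = hits / nS, FA = fas / nN;
  // clamp extreme rates to [0.5/N, 1 - 0.5/N] to avoid infinite z-scores
  H = Math.min(Math.max(H, 0.5 / nS), 1 - 0.5 / nS);
  FA = Math.min(Math.max(FA, 0.5 / nN), 1 - 0.5 / nN);
  const zH = z(H), zFA = z(FA);
  return {
    dprime: zH - zFA,                              // sensitivity
    c: -0.5 * (zH + zFA),                          // criterion
    beta: Math.exp(-0.5 * (zH * zH - zFA * zFA))   // likelihood ratio
  };
}
```

With H = 0.8 and FA = 0.2 this gives d′ ≈ 1.68, c = 0, β = 1; a perfect 50/50 session hits the clamp and returns d′ ≈ 4.65.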
Adaptive Procedure Standards & Terminology
Classical threshold (Fechner, 1860): absolute threshold = lowest stimulus intensity detected on 50% (or 75%) of presentations; difference threshold = smallest detectable change (ΔI/I = Weber fraction).
Transformed up-down rules (Levitt, 1971): step up after 1 wrong, step down after N correct in a row. Convergence point p satisfies p^N = 0.5.
- 1-up / 1-down: p = 50.0%
- 1-up / 2-down: p = 70.7%
- 1-up / 3-down: p = 79.4%
- 1-up / 4-down: p = 84.1%
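The table above follows directly from the equilibrium condition p^N = 0.5: the level steps down only after N consecutive correct responses, so at convergence P(correct)^N = 0.5. A one-line check (function name illustrative):

```javascript
// Convergence point of a 1-up / N-down transformed staircase (Levitt, 1971)
function convergencePoint(nDown) {
  return Math.pow(0.5, 1 / nDown); // p* = 0.5^(1/N)
}
```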
QUEST (Watson & Pelli, 1983): Bayesian posterior on a grid of θ values. Next stimulus = posterior mean or MAP. Updates analytically via Bayes’ rule. Achieves 95% of maximum efficiency after ∼30 trials.
PEST (Taylor & Creelman, 1967): parameter estimation by sequential testing. Step doubles in the same direction; halves on reversal. Uses Wald’s sequential probability ratio test to determine termination.
Constant stimuli: fixed intensity levels presented in random order. No adaptive rule. Requires ≥100 trials per level; most accurate for psychometric curve shape but least efficient.
Minimum trial recommendations: QUEST ≥40; transformed staircase ≥60–80; constant stimuli ≥100 per level. Bootstrap CI with ≥1 000 resamples for threshold confidence intervals.
Display & Timing Requirements
Refresh rate: 60 Hz provides 16.67 ms frame resolution; suitable for stimuli ≥50 ms. 120 Hz (8.33 ms) or 240 Hz (4.17 ms) for temporal contrast & flicker work. CRT phosphor decay <1 ms was the historical gold standard.
Luminance calibration: Use a photometer (e.g., Konica Minolta LS-100) to verify display gamma. sRGB assumes γ = 2.2 but individual panels deviate. Incorrect gamma introduces non-linear contrast errors of up to ±30%.
Bit depth: 8-bit: 256 grey levels; insufficient for <5% Michelson contrast. 10-bit: 1 024 levels; acceptable for most CSF work. 12-bit: 4 096 levels; required for sub-threshold masking experiments. DVI/HDMI 8-bit typically limits low-contrast Gabors even on 10-bit panels.
Spatial calibration: Viewing distance and display pixel pitch determine cycles per degree. At 57 cm, 1° ≈ 1 cm. Typical 24-inch 1080p at 57 cm: ∼37 pixels per degree. Set spatial frequency fields accordingly.
Temporal timing (web): This engine uses rAF + performance.now() sub-millisecond timestamps. Browser compositing adds ∼1 frame latency. Tab-switching or background processes degrade timing. For publication, use PsychoPy, PsychToolbox, or jsPsych with hardware-verified onset timestamps.
Viewing conditions (ISO 3664): Surround luminance 10–15% of display white; room illuminance ≤64 lx; no specular reflections. Maintain ≥20 min dark adaptation for scotopic threshold work.
Experimental Design & Ethics
Observer blinding: Where possible use a blinded operator and randomised trial order to minimise experimenter and observer expectation bias.
Practice trials: Always include practice (minimum 10 trials) before data collection. Practice trials are not analysed. This engine’s Practice mode supports this.
Breaks: Recommend a 5-minute break every 15–20 minutes of continuous testing. Eye fatigue affects thresholds by up to 0.1 log units.
Ethics: Photosensitive epilepsy risk is present with flicker stimuli at 3–30 Hz. Do not use flicker paradigms without informed consent screening. This tool includes a flicker stimulus type; use responsibly.
Reporting standards: Report mean threshold ± bootstrap 95% CI, staircase type, number of reversals, reversals discarded, trials per block, viewing distance, and display specifications (make, model, refresh rate, bit depth, luminance).
All formulas implemented in psychophysical-experiment.js.
Notation follows standard psychophysics convention (Green & Swets 1966; Wichmann & Hill 2001; Watson & Pelli 1983).
Psychometric Functions
Weibull Psychometric Function:
ψ(x; α, β, γ, δ) = γ + (1 − γ − δ) · [1 − exp(−(x/α)^β)]
α = threshold (≈63% point after guess/lapse correction)
β = shape / slope (typically 2–4 for detection)
γ = guessing rate (0.5 for 2AFC; 1/m for m-AFC; ~0 for yes/no)
δ = lapse rate (typically 0.01–0.04; models inattention)
At x = α: ψ = γ + (1−γ−δ)(1−e^−1), where 1 − e^−1 ≈ 63.2% uncorrected
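The Weibull form above transcribes directly; the function name weibullPsi is illustrative, with parameters named as in the text.

```javascript
// Weibull psychometric function: guess rate gamma, lapse rate delta,
// threshold alpha, shape beta
function weibullPsi(x, alpha, beta, gamma, delta) {
  return gamma + (1 - gamma - delta) * (1 - Math.exp(-Math.pow(x / alpha, beta)));
}
```

At x = α the bracketed term equals 1 − e^−1 ≈ 0.632, independent of β.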
Logistic Psychometric Function (standard in this engine):
ψ(x; θ, s) = γ + (1−γ−δ) / [1 + exp(−s(x − θ))]
θ = threshold (50% point of the sigmoid, corrected for γ/δ)
s = logistic slope ≈ 3.5 / (α · σ) where σ is the psychometric width
Probit (Normal) Function:
ψ(x) = Φ((x − μ)/σ) — used when Gaussian (probit) assumptions are justified
Maximum Likelihood Estimation (MLE):
log L(θ, s | data) = ∑ᵢ [kᵢ log ψ(xᵢ) + (nᵢ − kᵢ) log(1 − ψ(xᵢ))]
Grid search over θ × s (31 × 21 = 651 evaluations in this engine).
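The grid-search MLE can be sketched as below. logisticPsi and fitMLE are illustrative names; the 31 × 21 grid mirrors the engine's description, but the grid bounds and the γ = 0.5, δ = 0.02 defaults are assumptions for the example.

```javascript
// Logistic psychometric function (2AFC defaults assumed)
function logisticPsi(x, theta, s, gamma = 0.5, delta = 0.02) {
  return gamma + (1 - gamma - delta) / (1 + Math.exp(-s * (x - theta)));
}

// Exhaustive grid search maximising the binomial log-likelihood
function fitMLE(trials, thetaGrid, slopeGrid) {
  // trials: array of { x: intensity, correct: boolean }
  let best = { theta: NaN, s: NaN, logL: -Infinity };
  for (const theta of thetaGrid) {
    for (const s of slopeGrid) {
      let logL = 0;
      for (const t of trials) {
        const p = Math.min(Math.max(logisticPsi(t.x, theta, s), 1e-9), 1 - 1e-9);
        logL += t.correct ? Math.log(p) : Math.log(1 - p);
      }
      if (logL > best.logL) best = { theta, s, logL };
    }
  }
  return best;
}
```

Clamping ψ away from 0 and 1 keeps the log-likelihood finite for any grid point.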
Goodness of fit:
G² = 2 ∑ [k ln(k/k̂) + (n−k) ln((n−k)/(n−k̂))]
Approximately χ²-distributed with df = (number of bins) − 2 fitted parameters.
Staircase Convergence & Threshold Estimation
Target performance p* = 0.5^(1/N_down)
1-up / 2-down: p* = 0.5^(1/2) = √0.5 ≈ 70.71%
1-up / 3-down: p* = 0.5^(1/3) ≈ 79.37%
1-up / 4-down: p* = 0.5^(1/4) ≈ 84.09%
Threshold Estimation from Reversals (Levitt, 1971):
θ̂ = (1/M) ∑_{i=k+1}^{N} x_rev,i
Discard first k = 4 warmup reversals; use M = N−4 stable reversals.
SE = SD(xrev) / √M
Need ≥6 total reversals (≥2 stable) for a valid estimate.
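The reversal-averaging rule above, in runnable form; thresholdFromReversals is an illustrative name, and the discard count defaults to the k = 4 warmup reversals specified in the text.

```javascript
// Threshold and SE from staircase reversal intensities (Levitt, 1971)
function thresholdFromReversals(reversals, discard = 4) {
  const stable = reversals.slice(discard);    // drop warmup reversals
  const M = stable.length;
  if (M < 2) return null;                     // need >= 2 stable reversals
  const mean = stable.reduce((a, b) => a + b, 0) / M;
  const sd = Math.sqrt(stable.reduce((a, b) => a + (b - mean) ** 2, 0) / (M - 1));
  return { threshold: mean, se: sd / Math.sqrt(M), nStable: M };
}
```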
Geometric Step Reduction (this engine):
s_n = s_0 × (1/√2)^⌊R/4⌋
R = number of reversals; reduces step by ~29% every 4 reversals.
Precision & efficiency:
Variance of threshold estimate ∝ (step size)² + observer variability
Smaller final step size → lower variance but requires more trials.
QUEST — Bayesian Adaptive Estimation
ψ(x | θ) = γ + (1−γ−δ) [1 − exp(−10^(β(x−θ)))]
Bayesian update:
p(θ | data) ∝ p(θ) · ∏ᵢ ψ(xᵢ|θ)^rᵢ · (1−ψ(xᵢ|θ))^(1−rᵢ)
Normalised: p_{n+1}(θ) = p_n(θ) · L(θ) / Z, where Z = ∑ p(θ)L(θ)
Next stimulus selection:
MAP: x* = argmax_θ p(θ | data)
Posterior mean (this engine): x* = E[θ | data] = ∑ θᵢ p(θᵢ | data)
Posterior SD: σ_post = √E[(θ−μ)²] — decreases with trials
QUEST+ entropy minimisation (Watson 2017):
H[p] = −∑ p(θ) log₂ p(θ) (bits)
Choose x* = argmin_x E_r[H[p(θ | data, x, r)]]
Simultaneously estimates α, β, δ with ≥3D grid.
Grid implementation (this engine):
300-point uniform grid on [minInt, maxInt]
γ = 0.5 (2AFC), 0.01 (yes/no); β = 3.5; δ = 0.02
Posterior mean used as next intensity (more stable than MAP).
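A single-trial QUEST update over a discrete grid can be sketched as follows. questPsi uses the log-intensity Weibull form and the β = 3.5, γ = 0.5, δ = 0.02 values the engine states; questUpdate and posteriorMean are illustrative names.

```javascript
// QUEST likelihood: Weibull in log-intensity units (Watson & Pelli, 1983)
function questPsi(x, theta, beta = 3.5, gamma = 0.5, delta = 0.02) {
  return gamma + (1 - gamma - delta) * (1 - Math.exp(-Math.pow(10, beta * (x - theta))));
}

// One Bayesian update: multiply posterior by the trial likelihood, renormalise
function questUpdate(grid, posterior, x, correct) {
  const updated = grid.map((theta, i) => {
    const p = questPsi(x, theta);
    return posterior[i] * (correct ? p : 1 - p);
  });
  const Z = updated.reduce((a, b) => a + b, 0);
  return updated.map(v => v / Z);
}

// Next stimulus = posterior mean over the grid
function posteriorMean(grid, posterior) {
  return grid.reduce((a, theta, i) => a + theta * posterior[i], 0);
}
```

A correct response makes low-threshold values more likely, so the posterior mean (and hence the next intensity) moves down; an error moves it up.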
PEST — Parameter Estimation by Sequential Testing
W = Wald’s SPRT statistic
After 4 consecutive steps in the same direction: s ← min(2s, s_max)
On reversal: s ← max(s/2, s_min)
This engine’s PEST implementation:
newDir = correct ? −1 : +1
Reversal ⇒ pestStep ×= 0.5; pestConsecutive = 0
Same dir ⇒ pestConsecutive++; if pestConsecutive ≥ 4: pestStep ×= 2
x ← clamp(x + newDir × pestStep, minInt, maxInt)
Convergence target: 50% correct (yes/no) or 75% (2AFC with appropriate prior)
Efficiency: PEST typically reaches a stable estimate in 20–40 trials; similar efficiency to QUEST for single-parameter estimation.
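The update rules listed above can be sketched as a pure state-transition function. pestStep and the state field names are illustrative, not the engine's own identifiers; step bounds live in the state object.

```javascript
// One PEST update: halve the step on reversal, double after 4 same-direction steps
// state: { x, step, minStep, maxStep, lastDir?, consecutive? }
function pestStep(state, correct, minInt = 0, maxInt = 1) {
  const dir = correct ? -1 : +1;                 // correct => make it harder
  const s = { ...state };
  if (s.lastDir !== undefined && dir !== s.lastDir) {
    s.step = Math.max(s.step / 2, s.minStep);    // reversal: halve
    s.consecutive = 0;
  } else {
    s.consecutive = (s.consecutive || 0) + 1;
    if (s.consecutive >= 4) {                    // 4 in a row: double
      s.step = Math.min(s.step * 2, s.maxStep);
      s.consecutive = 0;
    }
  }
  s.lastDir = dir;
  s.x = Math.min(Math.max(s.x + dir * s.step, minInt), maxInt);
  return s;
}
```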
Signal Detection Theory Metrics
d′ = z(H) − z(FA) [sensitivity; 0 = chance, 4.65 ≈ 99% hits with 1% false alarms]
c = −½[z(H) + z(FA)] [criterion; 0 = unbiased]
β = exp(−½[z(H)² − z(FA)²]) [likelihood ratio]
A′ = ½ + sign(H−FA)·[|H−FA| + (H−FA)²] / [4·max(H,FA)·(1−min(H,FA))]
Hautus (1995) corrections for extreme rates:
H = 1 ⇒ H′ = 1 − 1/(2NS)
FA = 0 ⇒ FA′ = 1/(2NN)
m-AFC correction (Hacker & Ratcliff, 1979):
d′_m = d′_2AFC × √2 × f(m)
f(m): integration of multivariate normal CDF
Approximate d′ ↔ percent correct (2AFC):
PC = Φ(d′ / √2)
d′ = 0 → 50%; d′ = 1 → 76%; d′ = 2 → 92%; d′ = 3 → 98.3%
ROC area (AUROC):
A = Φ(d′ / √2) (equal-variance Gaussian model)
Trapezoidal rule for empirical ROC from rated/multiple-criterion data.
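The non-parametric index and the 2AFC mapping can be checked directly; aPrime, normCdf, and pc2afc are illustrative names. The A′ form used here is the symmetric generalisation that also handles below-chance responding.

```javascript
// Non-parametric sensitivity A' (symmetric form; handles H < FA)
function aPrime(H, FA) {
  if (H === FA) return 0.5;
  const d = H - FA, s = Math.sign(d);
  return 0.5 + s * (Math.abs(d) + d * d) / (4 * Math.max(H, FA) * (1 - Math.min(H, FA)));
}

function normCdf(x) { // standard normal CDF (Zelen & Severo approximation)
  const t = 1 / (1 + 0.2316419 * Math.abs(x));
  const d = 0.3989423 * Math.exp(-x * x / 2);
  const p = d * t * (0.3193815 + t * (-0.3565638 + t * (1.781478 + t * (-1.821256 + t * 1.330274))));
  return x >= 0 ? 1 - p : p;
}

// 2AFC percent correct under the equal-variance Gaussian model
function pc2afc(dprime) {
  return normCdf(dprime / Math.SQRT2);
}
```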
Contrast, Sensitivity & Stimulus Metrics
Michelson Contrast (for gratings):
C_M = (L_max − L_min) / (L_max + L_min)
Range [0, 1]; C = amplitude / mean luminance for sinusoidal gratings
Weber Contrast (for uniform patches or spots on backgrounds):
C_W = ΔL / L_b = (L_target − L_background) / L_background
RMS Contrast:
C_RMS = √(mean[(L(x,y) / ⟨L⟩ − 1)²])
Used for broadband stimuli (noise, natural images)
Gabor stimulus (this engine):
G(x,y) = ½ + ½C · cos(2πf(x cosθ + y sinθ)) · exp(−(x²+y²)/(2σ²))
f = spatial frequency (cpd converted to cycles/px at runtime)
σ = Gaussian envelope SD = stimSize/4
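The Gabor formula transcribes to a per-pixel luminance sample in [0, 1]; gaborLum is an illustrative name, with (x, y) in stimulus-centred pixel coordinates.

```javascript
// Gabor luminance at pixel (x, y): cosine carrier times Gaussian envelope,
// offset around mid-grey 0.5
function gaborLum(x, y, contrast, freq, thetaRad, sigma) {
  const carrier = Math.cos(2 * Math.PI * freq * (x * Math.cos(thetaRad) + y * Math.sin(thetaRad)));
  const envelope = Math.exp(-(x * x + y * y) / (2 * sigma * sigma));
  return 0.5 + 0.5 * contrast * carrier * envelope;
}
```

At the centre the value is 0.5 + C/2; far from the centre the envelope drives it back to mid-grey.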
Contrast Sensitivity Function (Mannos & Sakrison, 1974):
CSF(f) = a · f^c · exp(−b · f); a = 2.6, b = 0.0192, c = 1.1
CS(f) = 1/C_threshold(f); peak ≈ 3–5 cpd
Cycles per degree conversion:
cpd = cycles/px × (px/deg); px/deg ≈ 2 · viewing_distance_mm · tan(0.5°) / pixel_pitch_mm ≈ viewing_distance_mm / (57.3 × pixel_pitch_mm)
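The degree-to-pixel geometry, with the tan(0.5°) factor written out explicitly (it is what makes "1° ≈ 1 cm at 57 cm" come out right); function names are illustrative.

```javascript
// Pixels subtended by one degree of visual angle
function pixelsPerDegree(viewDistMm, pixelPitchMm) {
  return (2 * viewDistMm * Math.tan(Math.PI / 360)) / pixelPitchMm; // tan(0.5 deg)
}

// Convert a stimulus frequency in cycles/pixel to cycles/degree
function cyclesPerDegree(cyclesPerPx, viewDistMm, pixelPitchMm) {
  return cyclesPerPx * pixelsPerDegree(viewDistMm, pixelPitchMm);
}
```

A 24-inch 1080p panel has a pixel pitch near 0.277 mm, giving roughly 36 px/deg at 57 cm, in line with the figure quoted above.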
Bootstrap Confidence Intervals
1. Fit psychometric function to observed data → obtain θ̂, ŝ
2. For b = 1…B:
a. Resample data with replacement (non-parametric) or simulate from ψ(x;θ̂)
b. Re-fit logistic MLE → θ̂_b
3. 95% CI: [percentile_2.5(θ̂_b), percentile_97.5(θ̂_b)]
This engine uses non-parametric resampling (B = 1000 by default) with MLE fit per resample.
BCa (bias-corrected accelerated) CI (Efron, 1987):
More accurate than percentile CI when distribution is skewed.
z₀ = Φ⁻¹(#{θ̂_b < θ̂} / B)
â = ∑(Uᵢ − Ū)³ / (6 · [∑(Uᵢ − Ū)²]^(3/2)), where Uᵢ are the jackknife (leave-one-out) estimates
Recommended B ≥ 2000 for BCa; B = 1000 sufficient for percentile CI.
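Steps 1–3 above reduce to a short routine when the estimator is passed in as a function, so the same code serves a reversal-mean or a full MLE refit; bootstrapCI is an illustrative name and uses unseeded Math.random by default.

```javascript
// Non-parametric percentile bootstrap: resample with replacement B times,
// re-estimate, take the 2.5th and 97.5th percentiles
function bootstrapCI(data, estimator, B = 1000, rng = Math.random) {
  const stats = [];
  for (let b = 0; b < B; b++) {
    const resample = Array.from(data, () => data[Math.floor(rng() * data.length)]);
    stats.push(estimator(resample));
  }
  stats.sort((a, b) => a - b);
  return {
    lo: stats[Math.floor(0.025 * B)],
    hi: stats[Math.ceil(0.975 * B) - 1]
  };
}
```

For the engine's use case the estimator would be the logistic MLE refit; the sketch below just bootstraps a mean.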
Reaction Time Analysis
Ex-Gaussian fit (Luce, 1986): RT distribution often modelled as convolution of Gaussian and exponential:
f(t) = (λ/2) exp[(λ/2)(2μ + λσ² − 2t)] × erfc[(μ + λσ² − t)/(√2 σ)]
μ = Gaussian mean, σ = Gaussian SD, λ = exponential rate
Race model / Poffenberger:
Mean RT = sensory latency + motor latency + decision time
Speed-accuracy trade-off (SAT): d′ ∝ √T_observation under diffusion models
Outlier trimming: standard practice is to exclude RT < 150 ms (anticipations) and RT > 3 SD above mean or > 3000 ms (lapses). This engine filters RT < 0 or > 5000 ms.
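The literature convention just described (not the engine's looser 0–5000 ms filter) can be sketched as a two-pass screen; trimRTs is an illustrative name.

```javascript
// Two-pass RT screening: absolute bounds first, then a 3-SD upper cut
function trimRTs(rts) {
  const plausible = rts.filter(t => t >= 150 && t <= 3000); // drop anticipations and long lapses
  const mean = plausible.reduce((a, b) => a + b, 0) / plausible.length;
  const sd = Math.sqrt(plausible.reduce((a, b) => a + (b - mean) ** 2, 0) / (plausible.length - 1));
  const kept = plausible.filter(t => t <= mean + 3 * sd);    // drop slow outliers
  return { kept, mean: kept.reduce((a, b) => a + b, 0) / kept.length };
}
```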
Canonical literature underlying the methods implemented in this engine. Sorted by topic; numbered for cross-reference with the Formulas and Standards tabs.
Foundational Texts
[2] Green, D.M. & Swets, J.A. (1966). Signal Detection Theory and Psychophysics. New York: Wiley. Definitive treatment of SDT, d′, ROC, and criteria bias; republished 1974 by Krieger.
[3] Luce, R.D. (1986). Response Times: Their Role in Inferring Elementary Mental Organisation. Oxford University Press. RT models, ex-Gaussian distributions, diffusion models.
[4] Kingdom, F.A.A. & Prins, N. (2010). Psychophysics: A Practical Introduction. London: Academic Press. Most accessible modern textbook; covers adaptive methods, fitting, and SDT for practitioners.
Adaptive Staircase Methods
[5] Levitt, H. (1971). Transformed up-down methods in psychoacoustics. Journal of the Acoustical Society of America, 49(2), 467–477. doi:10.1121/1.1912375
→ Derivation of convergence points for m-down/1-up rules; reversal averaging; discard-first-N protocol implemented in this engine.
[6] Watson, A.B. & Pelli, D.G. (1983). QUEST: A Bayesian adaptive psychometric method. Perception & Psychophysics, 33(2), 113–120. doi:10.3758/BF03202828
→ Full QUEST algorithm; prior/posterior update; MAP/mean selection; efficiency analysis vs. constant stimuli.
[7] Taylor, M.M. & Creelman, C.D. (1967). PEST: Efficient estimates on probability functions. Journal of the Acoustical Society of America, 41(4A), 782–787. doi:10.1121/1.1910407
→ PEST step-doubling and halving rules; Wald SPRT termination criterion; implemented with geometric step bounds in this engine.
[8] Watson, A.B. (2017). QUEST+: A general multidimensional Bayesian adaptive psychometric method. Journal of Vision, 17(3), 10. doi:10.1167/17.3.10
→ Extension of QUEST to estimate α, β, δ jointly; entropy-minimisation stimulus selection. Basis for future QUEST+ integration.
[9] García-Pérez, M.A. (1998). Forced-choice staircases with fixed step sizes: Asymptotic and small-sample properties. Vision Research, 38(12), 1861–1881. doi:10.1016/S0042-6989(97)00340-4
→ Bias in fixed-step staircases; argues for variable-step methods and large-reversal discard; supports ≥40 trials recommendation.
Psychometric Function Fitting
[10] Wichmann, F.A. & Hill, N.J. (2001). The psychometric function: I. Fitting, sampling, and goodness of fit. Perception & Psychophysics, 63(8), 1293–1313. doi:10.3758/BF03194544
→ MLE fitting of Weibull/logistic; bootstrap CI (parametric and non-parametric); goodness-of-fit G²; lapse-rate handling. This engine implements MLE grid search and non-parametric bootstrap.
[11] Wichmann, F.A. & Hill, N.J. (2001). The psychometric function: II. Bootstrap-based confidence intervals and sampling. Perception & Psychophysics, 63(8), 1314–1329. doi:10.3758/BF03194545
→ Systematic evaluation of bootstrap CI validity; minimum sample sizes; bias-corrected intervals.
[12] Prins, N. & Kingdom, F.A.A. (2018). Applying the model-comparison approach to test specific research hypotheses in psychophysical research using the Palamedes toolbox. Frontiers in Psychology, 9, 1250. doi:10.3389/fpsyg.2018.01250
→ Palamedes MATLAB/Python toolbox; model comparison with AIC/BIC; recommended for formal analysis after data collection with this engine.
[13] King-Smith, P.E. & Rose, D. (1997). Principles of an adaptive method for measuring the slope of the psychometric function. Vision Research, 37(12), 1595–1604. doi:10.1016/S0042-6989(96)00310-0
→ Adaptive slope estimation; importance of β for characterising threshold steepness beyond the 50%/75% point estimation.
Signal Detection Theory
[14] Macmillan, N.A. & Creelman, C.D. (2005). Detection Theory: A User's Guide (2nd ed.). Mahwah, NJ: Lawrence Erlbaum.
→ Comprehensive SDT reference covering all paradigms; unequal variance, A′, rating ROC, m-AFC corrections. Chapter 4 covers the 2AFC design matched to this engine.
[15] Hautus, M.J. (1995). Corrections for extreme proportions and their effect on estimated values of d′. Behavior Research Methods, Instruments & Computers, 27(1), 46–51. doi:10.3758/BF03203619
→ 0.5/N correction for H = 1 and FA = 0; implemented in calcSDT() in this engine.
[16] Hacker, M.J. & Ratcliff, R. (1979). A revised table of d′ for M-alternative forced choice. Perception & Psychophysics, 26(2), 168–170. doi:10.3758/BF03208311
→ Tabulated d′ values for m-AFC paradigms; correction factors for mAFC experiments.
Contrast Sensitivity & Spatial Vision
[17] Campbell, F.W. & Robson, J.G. (1968). Application of Fourier analysis to the visibility of gratings. Journal of Physiology, 197(3), 551–566. doi:10.1113/jphysiol.1968.sp008574
→ Original CSF paper; demonstrates the visual system as a multi-channel spatial frequency filter; basis for Gabor stimulus design and CSF sweep preset.
[18] Mannos, J.L. & Sakrison, D.J. (1974). The effects of a visual fidelity criterion on the encoding of images. IEEE Transactions on Information Theory, 20(4), 525–536. doi:10.1109/TIT.1974.1055250
→ Analytical CSF formula CS(f) = a·f^c·exp(−b·f); parameter values: a=2.6, b=0.0192, c=1.1 used in CSF sweep preset.
[19] Watson, A.B. & Ahumada, A.J. (2005). A standard model for foveal detection of spatial contrast. Journal of Vision, 5(9), 717–740. doi:10.1167/5.9.6
→ Unified foveal detection model incorporating optics, neural noise, uncertainty, and CSF; provides normative reference for Gabor detection thresholds.
[20] De Valois, R.L. & De Valois, K.K. (1988). Spatial Vision. Oxford University Press.
→ Comprehensive review of spatial frequency selectivity in V1 complex/simple cells; theoretical basis for grating orientation experiments.
Temporal Sensitivity & Flicker
→ Temporal contrast sensitivity function; sensitivity peaks at 8–16 Hz; basis for the 8 Hz flicker parameter in this engine’s flicker stimulus.
[22] de Lange, H. (1958). Research into the dynamic nature of the human fovea-cortex systems with intermittent and modulated light. I. Attenuation characteristics with white and coloured light. Journal of the Optical Society of America, 48(11), 777–784. doi:10.1364/JOSA.48.000777
→ de Lange curves; critical fusion frequency (CFF) ≈ 50–60 Hz at high luminance; temporal contrast sensitivity function model.
Software & Toolboxes
[23] Brainard, D.H. (1997). The Psychophysics Toolbox. Spatial Vision, 10(4), 433–436. doi:10.1163/156856897X00357
→ Psychtoolbox (MATLAB) — hardware-synchronised timing; recommended for publication-grade data. Used as the gold standard to validate timing in this engine.
[24] Pelli, D.G. (1997). The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10(4), 437–442. doi:10.1163/156856897X00366
→ VideoToolbox; CLUT manipulations for 10-bit contrast; historical reference for canvas-based contrast generation in this engine.
[25] Peirce, J.W. (2007). PsychoPy — Psychophysics software in Python. Journal of Neuroscience Methods, 162(1–2), 8–13. doi:10.1016/j.jneumeth.2006.11.017
→ PsychoPy — open-source Python alternative to Psychtoolbox; validated timing, hardware integration; recommended for follow-up laboratory replication of experiments prototyped in this engine.
[26] Prins, N. (2012). The psychometric function: The lapse rate revisited. Journal of Vision, 12(6), 25. doi:10.1167/12.6.25
→ Argues δ should be estimated jointly with α and β via MLE; uninformed δ fixing leads to threshold bias at low PCs.
About this engine
This tool implements methods from references [5–16] in a zero-network, client-side JavaScript engine. It is designed for rapid prototyping, educational demonstrations, and pilot data collection. It is not a substitute for calibrated laboratory equipment or validated software (PsychoPy [25], Psychtoolbox [23–24], Palamedes [12]). When reporting results from this tool, cite the underlying methods ([5], [6], [10], [14]) and disclose participant viewing conditions and display specifications.
Advanced analysis tools, batch processing, threshold distributions, and research guidance for interpreting psychophysical data. Run an experiment in the Lab tab first.
Non-parametric bootstrap (Wichmann & Hill, 2001): resamples trial data B times, fits MLE logistic per resample, plots the distribution of θ̂ estimates. 95% CI = [2.5th, 97.5th] percentiles of bootstrap distribution.
Interpreting Your Results
Staircase trace (Lab tab): Gold line = trial-by-trial intensity. Hollow red circles = warmup reversals (first 4, discarded). Filled red circles = stable reversals (used for threshold). Blue dashed = threshold estimate. Expect an oscillating pattern converging on the threshold region. If the trace hits min or max intensity repeatedly, adjust initial intensity or step size.
Psychometric function (Lab tab): Gold dots = binned proportion correct (size ∝ √n trials in bin). Blue curve = MLE logistic fit. Threshold θ is where the curve crosses ~75% for 2AFC. A steep slope (high s) indicates sharp threshold; a shallow slope suggests variability or criterion inconsistency.
RT histogram (Lab tab): Green bars = correct trials; red bars = error trials. Gold dashed = mean RT. Correct-trial RT should be shorter than error RT for difficult stimuli (speed-accuracy consistency check). Very fast responses (<150 ms) may be anticipations; very slow (>2000 ms) may be lapses.
d′ interpretation: d′ = 0 is chance; d′ = 1 is ~76% 2AFC accuracy; d′ = 2 is ~92%. For yes/no, c = 0 means unbiased responding; positive c = conservative, negative c = liberal criterion placement.
Bootstrap CI width: Wide CI (>0.2 in normalised intensity) indicates insufficient trials or high within-session variability. Aim for CI width <0.1. Run more trials or increase B to 2000–5000 for publication.
Normative Reference Values
Ageing (60+ years): contrast sensitivity reduced ≈ 0.5–1 log unit [Watson & Ahumada 2005]
Critical fusion frequency (flicker): CFF at 500 cd/m² ≈ 55–60 Hz (Ferry–Porter law: CFF grows with log luminance)
CFF at 20 cd/m² ≈ 35–40 Hz
Orientation discrimination: Just-noticeable difference ≈ 1–3° at 90° (cardinal); 5–7° at 45° (oblique effect)
Weber fraction (luminance increment): ΔI/I ≈ 0.01–0.02 in photopic range (Weber’s law)
Typical 2AFC threshold accuracy: At 70.7% (1-up/2-down): ~30–50 trials to stabilise within ±0.5 SD
QUEST posterior SD < 0.1 log units after ∼30 trials
Reaction times (detection tasks, young adults): Simple RT: 180–250 ms | Choice RT: 250–400 ms | Discrimination: 300–600 ms
Paste JSON trial data from the Export JSON button (Actions tab) — the trials array. Computes accuracy, mean RT, d′, criterion, β, and MLE threshold/slope from the pasted data.
Method Comparison: Staircase vs QUEST vs PEST
| Property | 1-up/2-down | 1-up/3-down | QUEST | PEST | Constant |
|---|---|---|---|---|---|
| Convergence | 70.7% | 79.4% | Any (via prior) | Target p (via Wald test) | All levels |
| Min trials | ≥60 | ≥60 | ≥30–40 | ≥20–40 | ≥100/level |
| Estimates slope? | No | No | With QUEST+ | No | Yes (MLE) |
| Robust to lapses? | Moderate | Low | High (δ prior) | Moderate | High (MLE δ) |
| Best for | Threshold tracking | High accuracy | Fast, Bayesian | Speed | Full curve shape |
Recommended Experimental Workflow
- Pilot: Run 20–30 practice trials (Practice button) to verify response key configuration, timing, and initial intensity range. Adjust Initial intensity so practice accuracy is 60–80%.
- Choose method: For rapid threshold estimate use QUEST (30–50 trials). For reproducibility and conventional reporting use 1-up/2-down with 60–80 trials and 12 reversals (discard first 4). For full psychometric curve use Constant with ≥5 levels ×≥20 trials.
- Data collection: Encourage the observer to respond quickly but accurately. Remind them to press practice keys before the main run. Offer breaks every 15–20 minutes.
- Check staircase trace: Confirm the trace oscillates (reversals visible) and has not saturated at min/max intensity >50% of trials. If saturated, abort and adjust initial intensity & step size.
- Export: Actions tab → Export JSON (full metadata) and Export CSV (for R/Python/MATLAB). JSON includes MLE fit, reversal statistics, and session metadata.
- Bootstrap CI: Research tab → Run Bootstrap with B = 1000. Report mean threshold ± 95% CI. Aim for CI < 0.1 log units for publication.
- Batch analysis: For multi-session data, paste each session’s trials array into Batch Analysis; compare MLE thresholds and slopes across conditions. Use the Session Comparison (Actions tab) for quick within-session summary.
- Replicate in lab software: Use PsychoPy or Psychtoolbox to replicate key findings with hardware-verified timing, calibrated display output, and chin-rest-controlled viewing distance.