GLM

Functions and classes related to basic GLMs.

class lazyfmri.glm.ANOVA[source]

Bases: Posthoc

Runs an ANOVA using pingouin and subsequently performs posthoc tests. This class allows for immediate visualization of significant results using a Matplotlib axis. Unlike lazyfmri.glm.Posthoc, arguments for the ANOVA test are provided at initialization. If covar is specified, an ANCOVA is run instead of an ANOVA.

Parameters:

alpha (float, optional) – Alpha value to determine statistical significance. Default is 0.05.
axs (mpl.axes._axes.Axes, optional) – Matplotlib axis on which to plot significance bars. Default is None.
posthoc_kw (dict, optional) – Dictionary of arguments passed to lazyfmri.glm.Posthoc.plot_bars(). Default is {}.
plot_kw (dict, optional) – Additional keyword arguments for customizing the plot. Default is {}.
bar_kw (dict, optional) – Additional keyword arguments for customizing the significance bars. Default is {}.
*args (tuple) – Additional positional arguments for pingouin.anova() or pingouin.ancova().
**kwargs (dict) – Additional keyword arguments for pingouin.anova() or pingouin.ancova().

Example

from lazyfmri import glm

# Run an ANOVA on a dataset
aov = glm.ANOVA(
    data=df,
    dv="dependent_variable",
    between="grouping_variable",
    posthoc_kw={
        "effsize": "cohen",
        "test": "t-test",
        "paired": True,
        "subject": "vox",
        "padjust": "holm"
    }
)

# If using LazyBar, add significance bars
aov.plot_bars(
    axs=bar.axs,
    ast_frac=0,
    y_pos=1.15,
    line_separate_factor=-0.075
)

run_anova(alpha: float = 0.05, axs: Axes = None, parametric='auto', posthoc_kw=None, plot_kw=None, bar_kw=None, *args, **kwargs)[source]

Runs an ANOVA or ANCOVA analysis.

Uses the pingouin package to perform an ANOVA, automatically choosing between parametric (ANOVA/ANCOVA) and non-parametric (Friedman test) based on normality testing.

Parameters:

alpha (float, optional) – Significance level for statistical tests. Default is 0.05.
axs (mpl.axes._axes.Axes, optional) – Matplotlib axis on which to plot significance bars. Default is None.
parametric (str or bool, optional) –
Determines whether to run a parametric or non-parametric test:
- “auto” (default): Tests normality and selects automatically.
- True: Runs a parametric ANOVA.
- False: Runs a non-parametric Friedman test.
posthoc_kw (dict, optional) – Dictionary of arguments for posthoc testing. Default is {}.
plot_kw (dict, optional) – Additional parameters for customizing plots. Default is {}.
bar_kw (dict, optional) – Parameters for significance bars. Default is {}.
args (tuple) – Additional positional arguments for pingouin.anova() or pingouin.ancova().
kwargs (dict) – Additional keyword arguments for pingouin.anova() or pingouin.ancova().

Raises:

ImportError – If pingouin is not installed.

Example

aov.run_anova(
    data=df,
    dv="response",
    between="condition",
    posthoc_kw={"effsize": "cohen", "padjust": "holm"}
)

print(aov.ano)  # Print ANOVA table
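
The three settings of parametric can be read as a small decision rule. The sketch below is illustrative only: choose_test and its is_normal argument are hypothetical names, and the normality check that run_anova actually performs is not shown here.

```python
def choose_test(parametric="auto", is_normal=True):
    """Return which family of test to run (hypothetical helper)."""
    if parametric == "auto":
        # defer to the outcome of a normality test computed elsewhere
        parametric = bool(is_normal)
    return "anova" if parametric else "friedman"


print(choose_test("auto", is_normal=False))
print(choose_test(True))
```

Passing True or False skips the normality check entirely, which is useful when the test family is fixed by the study design.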

class lazyfmri.glm.GenericGLM[source]

Bases: object

Main class for fitting a simple GLM in Python. It handles most steps internally and lets you plot intermediate results along the way.

Parameters:

onset (pandas.DataFrame) – Dataframe containing the onset times for all events in an experiment. Specifically designed to work smoothly with lazyfmri.dataset.ParseExpToolsFile. You should insert the output from lazyfmri.dataset.ParseExpToolsFile.get_onset_df() as onset.
data (numpy.ndarray, pandas.DataFrame) – <time,voxels> numpy array or pandas DataFrame; required for creating the appropriate length of the stimulus vectors
hrf_pars (dict, optional) –
Dictionary collecting the parameters required for lazyfmri.glm.double_gamma(); the defaults are generally fine:
```
pars = {
    'lag': 6,
    'a2': 12,
    'b1': 0.9,
    'b2': 0.9,
    'c': 0.35,
    'scale': True
}
```
TR (float) – repetition time of acquisition
osf (int, optional) – Oversampling factor used to account for decimal onset times, by default None. The larger this factor, the more accurately decimal onset times are represented, but the larger the upsampled stimulus vector becomes, which makes the convolution slower.
type (str, optional) – Use block design or event-related design, by default ‘event’. If set to ‘block’, block_length is required.
block_length (int, optional) – Duration of block in seconds, by default None
amplitude (int, list, optional) – Amplitude to be used when creating the stimulus vector, by default None. If nothing is specified, the amplitude will be set to ‘1’, like you would in a regular FSL 1-/3-column file. If you want variable amplitudes for different events in a simulation, you can specify a list with a length equal to the number of events present in onset_df.
regressors (pandas.DataFrame, numpy.ndarray, optional) – Add a bunch of regressors to the design
make_figure (bool, optional) – Create overview figure of HRF, stimulus vector, and convolved stimulus vector, by default False
scan_length (int) – number of volumes in data (= scan_length in lazyfmri.glm.make_stimulus_vector())
xkcd (bool, optional) – Plot the figure in XKCD-style (cartoon), by default False
plot_vox (int, optional) – Instead of plotting the best-fitting voxel, specify which voxel to plot the timecourse and fit of, by default None
plot_event (str, int, list, optional) – If a larger design matrix was inputted with multiple events, you can specify here the name of the event you’d like to plot the betas from. It also accepts a list of indices of events to plot, so you could plot the first two events by specifying plot_event=[1,2]. Remember, the 0th index is the intercept! By default we’ll plot the event right after the intercept
contrast_matrix (numpy.ndarray, optional) – contrast array for the event regressors. If None, we’ll create a contrast matrix that estimates the effect of each regressor and the baseline
nilearn (bool, optional) – use nilearn implementation of FirstLevelModel (True) or bare python (False). The latter gives easier access to betas, while the former allows implementation of AR-noise models.

Returns:

dict – Dictionary collecting outputs under the following keys
- “betas”: <n_regressors (+intercept), n_voxels> beta values
- “tstats”: <n_regressors (+intercept), n_voxels> t-statistics (FSL-way)
- “x_conv”: <n_timepoints, n_regressors (+intercept)> design matrix
- “resids”: <n_timepoints, n_voxels> residuals
matplotlib.pyplot – plots along the process if make_figure=True

Example

# import modules
from lazyfmri.glm import GenericGLM
from lazyfmri import dataset

# define file with fMRI-data and the output from Exptools2
func_file = "some_func_file.mat"
exp_file = "some_exp_file.tsv"

# load in functional data
obj = dataset.Dataset(
    func_file,
    exp_file=exp_file,
    subject=1,
    run=1,
    deleted_first_timepoints=200,
    deleted_last_timepoints=200)

# fetch HP-filtered, percent-signal changed data
data = obj.fetch_fmri()
onsets = obj.fetch_onsets()

# do the fitting
fitting = GenericGLM(
    onsets,
    data.values,
    TR=0.105,  # repetition time in seconds; replace with your acquisition's TR
    osf=1000
)

convolve_stims(**kwargs)[source]

create_design(make_figure=False)[source]

define_hrf()[source]

define_stimvector()[source]

fit(nilearn_method=False, make_figure=False, xkcd=False, plot_vox=None, plot_event=None, cmap='inferno', copes=None, plot_full_only=False, plot_full=False, save_as=None, **kwargs)[source]

plot_contrast_matrix(save_as=None)[source]

plot_design_matrix(save_as=None)[source]

resample_stimvector()[source]

class lazyfmri.glm.Posthoc[source]

Bases: Defaults

Initializes the Posthoc class for conducting posthoc statistical tests following an ANOVA or ANCOVA analysis. This class is designed for simple pairwise comparisons using pingouin.pairwise_tukey() or pingouin.pairwise_tests(). It also provides visualization support for significance bars on a given matplotlib axis.

Parameters:

**kwargs (dict) – Additional parameters that will be passed to lazyfmri.plotting.Defaults(), allowing customization of plotting settings.

Example

from lazyfmri import glm
posth = glm.Posthoc()
posth.run_posthoc(
    data=df,
    dv="dependent_variable",
    between="grouping_variable"
)
posth.plot_bars(axs=axs)

plot_bars()[source]

Plots significance bars on a matplotlib axis based on sorted posthoc results. The function determines significance levels using asterisks (“*” for p < 0.05, “**” for p < 0.01, “***” for p < 0.001) and positions bars accordingly.

Parameters:

axs (mpl.axes._axes.Axes, optional) – Axis to plot significance bars on. Default is None.
alpha (float, optional) – Significance threshold. Default is 0.05.
y_pos (float, optional) – Starting position of the top significance line in axis proportions (1 = top of plot). Default is 0.95. This factor is reduced incrementally using line_separate_factor.
line_separate_factor (float, optional) – Factor by which subsequent significance bars are shifted downward. Default is -0.065.
ast_frac (float, optional) – Distance between significance line and annotation (e.g., asterisks or “ns” for non-significance). Default is 0.2.
ns_annot (bool, optional) – If True, non-significant contrasts are annotated with “ns”. Default is False.
ns_frac (float, optional) – Additional factor to scale the distance between significance lines and “ns” annotations. Default is 5.
leg_size (float, optional) – Size of the overhang from the significance bars, defined as a fraction of the total y-axis limit. Default is 0.02.
color (str, optional) – Color of significance bars. Default is “black”.
*args (tuple) – Additional arguments.
**kwargs (dict) – Additional keyword arguments.

Example

fig, ax = plt.subplots()
posth.plot_bars(axs=ax, alpha=0.05, y_pos=1.1, color="red")
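
The asterisk thresholds and the stacking of bars described above can be sketched in plain Python. Both helpers are hypothetical names, not part of lazyfmri; in particular, the additive use of line_separate_factor is an assumption about how successive bars are offset.

```python
def p_to_annotation(p, alpha=0.05, ns_annot=False):
    """Map a p-value to the asterisk labels described above."""
    if p >= alpha:
        # not significant: annotate only if requested
        return "ns" if ns_annot else ""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    return "*"


def bar_positions(n_bars, y_pos=0.95, line_separate_factor=-0.065):
    """Vertical position (axis fraction) of each successive bar.

    Assumption: each bar is shifted by line_separate_factor relative
    to the previous one.
    """
    return [y_pos + i * line_separate_factor for i in range(n_bars)]


labels = [p_to_annotation(p) for p in (0.0004, 0.003, 0.03, 0.2)]
positions = bar_positions(3)
```

With the default negative factor, later bars sit lower in the axis; a y_pos above 1 (as in the example above) places the first bar outside the plotting area.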

run_posthoc()[source]

Runs the posthoc test. By default, a Tukey test is executed (“tukey”), but other tests from pingouin.pairwise_tests() can be used. The function supports both parametric and non-parametric comparisons.

Parameters:

test (str, optional) – Type of posthoc test to execute. Default is “tukey”. If another value is provided, pingouin.pairwise_tests() is used.
ano (dict, optional) – Dictionary containing ANOVA results. If provided, posthoc p-values can be inherited from the ANOVA output.
paired (bool, optional) – If True, assumes a paired comparison (e.g., within-subject analysis). Default is False.
*args (tuple) – Additional positional arguments for the pingouin posthoc functions.
**kwargs (dict) – Additional keyword arguments for the pingouin posthoc functions.

Raises:

ImportError – If pingouin is not installed.

Example

posth.run_posthoc(
    data=df,
    dv="response",
    between="condition",
    test="tukey"
)

print(posth.posthoc)  # Print posthoc test results

sort_posthoc()[source]

Sorts the output of posthoc tests based on the distance between compared conditions. The function ensures that the longest significance bar spans the largest distance in the plot.

Parameters:

df (pd.DataFrame) – Dataframe containing posthoc test results, with columns “A” and “B” indicating the conditions being compared.

Returns:

The sorted dataframe with an additional “distances” column indicating the distance between compared conditions.

Return type:

pd.DataFrame

Example

sorted_df = posth.sort_posthoc(posth.posthoc)
print(sorted_df)
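
As a rough illustration of the sorting idea, the sketch below works on a list of dicts rather than a dataframe. sort_by_distance and the order argument are hypothetical, and the ascending sort direction is an assumption; the library may order the rows differently.

```python
def sort_by_distance(rows, order):
    """Add a 'distances' entry to each comparison and sort on it.

    rows  : list of dicts with keys "A" and "B" (the compared conditions)
    order : condition names in the order they appear on the x-axis
    """
    position = {name: i for i, name in enumerate(order)}
    annotated = [
        {**row, "distances": abs(position[row["A"]] - position[row["B"]])}
        for row in rows
    ]
    # assumption: shortest spans first, so the widest bar comes last
    return sorted(annotated, key=lambda row: row["distances"])


rows = [
    {"A": "a", "B": "c"},
    {"A": "a", "B": "b"},
    {"A": "b", "B": "c"},
]
sorted_rows = sort_by_distance(rows, order=["a", "b", "c"])
```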

lazyfmri.glm.calculate_r2(data, sse)[source]

lazyfmri.glm.calculate_tstats(dm=None, C=None, betas=None, sse=None, rank=None)[source]

lazyfmri.glm.convolve_hrf()[source]

Convolve the HRF from lazyfmri.glm.double_gamma() with the stimulus vector(s) from lazyfmri.glm.make_stimulus_vector(). The result can optionally be plotted in an overview figure, though Python-wise it’s not the prettiest.

Parameters:

hrf (numpy.ndarray) – HRF across given timepoints with shape (x.shape[0],)
stim_v (numpy.ndarray, list) – Stimulus vector as per lazyfmri.glm.make_stimulus_vector() or numpy array containing one stimulus vector (e.g., the array stored under one key of the dictionary returned by lazyfmri.glm.make_stimulus_vector())
TR (float) – repetition time of acquisition
make_figure (bool, optional) – Create overview figure of HRF, stimulus vector, and convolved stimulus vector, by default False
scan_length (int) – number of volumes in data (= scan_length in lazyfmri.glm.make_stimulus_vector())
xkcd (bool, optional) – Plot the figure in XKCD-style (cartoon), by default False
add_array1 (numpy.ndarray, optional) – additional stimulus vector to be plotted on top of stim_v, by default None
add_array2 (numpy.ndarray, optional) – additional convolved stimulus vector to be plotted on top of stim_v, by default None
regressors (pandas.DataFrame) – add a bunch of regressors with shape <time,voxels> to the design matrix. Should be in the dimensions of the functional data, not the oversampled data.

Returns:

matplotlib.plot – if make_figure=True, a figure will be displayed
pandas.DataFrame – if osf > 1, then resampled stimulus vector DataFrame is returned. If not, the convolved stimulus vectors are returned in a dataframe as is

Example

from lazyfmri.glm import convolve_hrf
convolved_stim_vector_left = convolve_hrf(hrf_custom, stims, make_figure=True, xkcd=True) # creates figure too
convolved_stim_vector_left = convolve_hrf(hrf_custom, stims) # no figure
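
To make the convolution step concrete, here is a dependency-free sketch of a discrete convolution truncated to the length of the stimulus vector. convolve_truncated is a hypothetical helper, the toy HRF values are made up for illustration, and the truncation is an assumption about how the output length is handled.

```python
def convolve_truncated(stim, hrf):
    """Discrete convolution of stim with hrf, cut to len(stim)."""
    out = [0.0] * len(stim)
    for i, s in enumerate(stim):
        if s == 0:
            continue  # no event at this sample
        for j, h in enumerate(hrf):
            if i + j < len(out):
                out[i + j] += s * h
    return out


stim = [0, 1, 0, 0, 0, 1, 0, 0]   # two events
hrf = [0.0, 0.5, 1.0, 0.25]       # toy response, not a real HRF
predicted = convolve_truncated(stim, hrf)
```

Each event simply pastes a scaled copy of the HRF into the output, starting at the event's sample; overlapping responses add up.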

lazyfmri.glm.define_hrf(hrf_pars='glover', TR=0.105, osf=1, dispersion=False, derivative=False)[source]

lazyfmri.glm.design_variance(X, which_predictor=1)[source]

Returns the design variance of a predictor (or contrast) in X.

Parameters:

X (numpy array) – Array of shape (N, P)
which_predictor (int or list/array) – The index of the predictor you want the design variance from. Note that 0 refers to the intercept! Alternatively, which_predictor can be a contrast vector.

Returns:

des_var – Design variance of the specified predictor/contrast from X.

Return type:

float
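
For reference, the textbook GLM definition of design variance for a contrast row vector c (a one-hot vector when a single predictor index is given) is written out below, together with the t-statistic it enters. Whether design_variance computes it in exactly this form is an assumption. Here SSE is the sum of squared residuals, and the degrees of freedom N - P assume a full-rank X of shape (N, P).

```latex
\[
\operatorname{desvar}(c) = c\,(X^{\top}X)^{-1}c^{\top},
\qquad
t = \frac{c\,\hat{\beta}}{\sqrt{\hat{\sigma}^{2}\; c\,(X^{\top}X)^{-1}c^{\top}}},
\qquad
\hat{\sigma}^{2} = \frac{\mathrm{SSE}}{N - P}
\]
```

A lower design variance gives a larger t-statistic for the same effect size and noise level.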

lazyfmri.glm.double_gamma()[source]

Create a double gamma hemodynamic response function (HRF).

Parameters:

x (numpy.ndarray) – timepoints along the HRF
lag (int, optional) – duration until peak of HRF is reached, by default 6
a2 (int, optional) – second determinant of the HRF drop, by default 12
b1 (float, optional) – first determinant of HRF rise, by default 0.9
b2 (float, optional) – second determinant of HRF rise, by default 0.9
c (float, optional) – constant for HRF drop, by default 0.35
scale (bool, optional) – normalize course of HRF, by default True

Returns:

HRF across given timepoints with shape (x.shape[0],)

Return type:

numpy.ndarray

Example

import numpy as np
from lazyfmri import glm

dt = 1
time_points = np.linspace(0, 36, np.rint(float(36) / dt).astype(int))
hrf_custom = glm.double_gamma(time_points, lag=6)
hrf_custom = hrf_custom[np.newaxis,...]
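
For intuition about the parameters, the sketch below implements one common double-gamma parametrisation using only the standard library. It is an assumption that lazyfmri.glm.double_gamma() follows this exact formula; double_gamma_sketch and the peak normalisation are illustrative choices.

```python
import math


def double_gamma_sketch(times, lag=6, a2=12, b1=0.9, b2=0.9, c=0.35, scale=True):
    """Difference of two gamma-shaped curves: a peak minus a later undershoot."""
    d1 = lag * b1   # time at which the first (positive) term peaks
    d2 = a2 * b2    # time at which the second (undershoot) term peaks
    hrf = [
        (t / d1) ** lag * math.exp(-(t - d1) / b1)
        - c * (t / d2) ** a2 * math.exp(-(t - d2) / b2)
        for t in times
    ]
    if scale:
        peak = max(hrf)
        hrf = [h / peak for h in hrf]   # assumption: scale so the peak equals 1
    return hrf


times = [i * 0.1 for i in range(300)]   # 0 to 29.9 s in 0.1 s steps
hrf = double_gamma_sketch(times)
peak_time = times[hrf.index(max(hrf))]
```

In this form, lag and b1 together set when the response peaks, while a2, b2, and c control the timing and depth of the post-stimulus undershoot.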

lazyfmri.glm.first_level_matrix(stims_dict, regressors=None, add_intercept=True, names=None)[source]

lazyfmri.glm.fit_first_level()[source]

First level models are, in essence, linear regression models run at the level of a single session or single subject. The model is applied on a voxel-wise basis, either on the whole brain or within a region of interest. The timecourse of each voxel is regressed against a predicted BOLD response created by convolving the haemodynamic response function (HRF) with a set of predictors defined within the design matrix (source: https://nilearn.github.io/glm/first_level_model.html)

Parameters:

stim_vector (pandas.DataFrame, numpy.ndarray) – either the output from lazyfmri.glm.resample_stim_vector() (convolved stimulus vector in fMRI-acquisition time domain) or a pandas.DataFrame containing the full design matrix as per the output of lazyfmri.glm.first_level_matrix().
data (numpy.ndarray) – <time,voxels> numpy array; same input as data from lazyfmri.glm.make_stimulus_vector()
make_figure (bool, optional) – Create a figure of best-voxel fit, by default False
copes ([type], optional) – [description], by default None
xkcd (bool, optional) – Plot the figure in XKCD-style (cartoon), by default False
plot_vox (int, optional) – Instead of plotting the best-fitting voxel, specify which voxel to plot the timecourse and fit of, by default None
plot_event (str, int, list, optional) – If a larger design matrix was inputted with multiple events, you can specify here the name of the event you’d like to plot the betas from. It also accepts a list of indices of events to plot, so you could plot the first two events by specifying plot_event=[1,2]. Remember, the 0th index is the intercept! By default we’ll plot the event right after the intercept

Returns:

numpy.ndarray – betas for each voxel for the intercept and the number of stim_vectors used (in case you also add regressors)
numpy.ndarray – the design matrix X_conv

Example

from lazyfmri.glm import fit_first_level

# plots first event
betas_left,x_conv_left = fit_first_level(
    convolved_stim_vector_left_ds,
    data,
    make_figure=True
)

# plots first two events
betas_left,x_conv_left = fit_first_level(
    convolved_stim_vector_left_ds,
    data,
    make_figure=True,
    plot_event=[1,2]
)
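
The regression at the heart of a first-level model can be shown for the simplest case of one voxel, one predicted BOLD regressor, and an intercept. This is a minimal ordinary-least-squares sketch with a hypothetical helper name and toy data; it does not reproduce what fit_first_level() does for full design matrices.

```python
def fit_single_regressor(y, x):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


predicted_bold = [0.0, 0.5, 1.0, 0.25, 0.0, 0.5, 1.0, 0.25]
voxel = [2.0 + 3.0 * p for p in predicted_bold]   # noiseless toy timecourse
intercept, beta = fit_single_regressor(voxel, predicted_bold)
```

Because the toy timecourse is noiseless, the fit recovers the intercept and beta used to generate it.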

lazyfmri.glm.get_event_prediction(ev, X, betas, ev_names, rf_mode=False, include_intercept=True, include_derivatives=True)[source]

Get prediction for one event by selecting matching design columns and betas.

Parameters:

ev (str) – Event name to match in ev_names, e.g. “CSm”, “CSpr”, “CSpu”.
X (ndarray) – Convolved design matrix, shape (n_timepoints, n_regressors).
betas (ndarray) – Beta vector, shape (n_regressors,) or compatible.
ev_names (list of str) – Column names corresponding to the columns in X.
include_intercept (bool) – Whether to always include the intercept column.
include_derivatives (bool) – Whether to include columns containing “regressor” representing the derivatives.
rf_mode (bool) – If True, interpret X as basis set and return RF.

Returns:

ev_preds (ndarray) – Predicted signal for this event.
event (ndarray) – Event-specific design matrix.
betas (ndarray) – Event-specific beta values.
beta_idx (list[int]) – Selected column indices.
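
The column selection can be sketched with plain lists. The substring matching rule and the "intercept" column name are assumptions made for illustration, event_prediction is a hypothetical helper, and the rf_mode and include_derivatives options are left out.

```python
def event_prediction(ev, X, betas, ev_names, include_intercept=True):
    """Predicted signal from the columns whose name contains `ev`."""
    idx = [i for i, name in enumerate(ev_names) if ev in name]
    if include_intercept and "intercept" in ev_names:
        icpt = ev_names.index("intercept")
        if icpt not in idx:
            idx.insert(0, icpt)
    # weighted sum of the selected columns, one value per timepoint
    preds = [sum(row[i] * betas[i] for i in idx) for row in X]
    return preds, idx


ev_names = ["intercept", "CSm", "CSpr"]
X = [[1, 0.0, 0.2], [1, 1.0, 0.0], [1, 0.5, 0.8]]
betas = [10.0, 2.0, -1.0]
preds, idx = event_prediction("CSm", X, betas, ev_names)
```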

lazyfmri.glm.get_event_predictions(events, X, betas, ev_names, **kwargs)[source]

lazyfmri.glm.glover_hrf(osf=1, TR=0.105, dispersion=False, derivative=False, time_length=25)[source]

lazyfmri.glm.make_stimulus_vector()[source]

Creates a stimulus vector for each of the conditions found in onset_df. To account for decimal onset times, use the oversampling factor osf. This returns an upsampled stimulus vector, which should be convolved with an equally upsampled HRF. This can be ensured by using the same osf in lazyfmri.glm.double_gamma().

Parameters:

onset_df (pandas.DataFrame) – onset times as read in with lazyfmri.dataset.ParseExpToolsFile
scan_length (float, optional) – number of volumes in the scan, by default None
TR (float, optional) – Repetition time, by default 0.105. Will be used to calculate the required length of the stimulus vector
osf (int, optional) – Oversampling factor used to account for decimal onset times, by default None
type (str, optional) – Use block design or event-related design, by default ‘event’. If set to ‘block’, block_length is required.
block_length (int, optional) – Duration of block in seconds, by default None
amplitude (int, list, optional) – Amplitude to be used when creating the stimulus vector, by default None. If nothing is specified, the amplitude will be set to ‘1’, like you would in a regular FSL 1-/3-column file. If you want variable amplitudes for different events in a simulation, you can specify a list with a length equal to the number of events present in onset_df.

Returns:

Dictionary collecting numpy array stimulus vectors for each event present in onset_df under the keys <event name>

Return type:

dict

Raises:

ValueError – onset_df should contain event names
ValueError – if multiple amplitudes are requested but the length of amplitude does not match the number of events
ValueError – block_length should be an integer

Example

from lazyfmri import dataset, glm

exp_file = 'path/to/exptools2_file.tsv'
exp_df = dataset.ParseExpToolsFile(exp_file, subject=1, run=1)
times = exp_df.get_onset_df()

# oversample with factor 1000 to get rid of 3 decimals in onset times
osf = 1000

# make stimulus vectors
stims = glm.make_stimulus_vector(
    times,
    scan_length=400,
    osf=osf,
    type='event'
)

# stims
{
    'left': array([0., 0., 0., ..., 0., 0., 0.]),
    'right': array([0., 0., 0., ..., 0., 0., 0.])
}
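
The role of osf is easiest to see in a stripped-down version for a single event type. The sketch assumes the oversampled vector has osf samples per second and that each onset marks a single sample; event_vector is a hypothetical helper and block designs are not covered.

```python
def event_vector(onsets, scan_length, TR, osf=1, amplitude=1.0):
    """Zeros everywhere, `amplitude` at each (oversampled) onset sample."""
    n_samples = int(round(scan_length * TR * osf))
    vector = [0.0] * n_samples
    for onset in onsets:
        sample = int(round(onset * osf))   # seconds -> oversampled index
        if 0 <= sample < n_samples:
            vector[sample] = amplitude
    return vector


# onsets at 1.5 s and 4.25 s; osf=100 resolves two decimals
vec = event_vector([1.5, 4.25], scan_length=10, TR=1.0, osf=100)
```

With osf=1 the two onsets would be rounded to whole seconds, which is the loss of timing precision that oversampling avoids.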

lazyfmri.glm.resample_stim_vector()[source]

Resample the oversampled stimulus vector back into the functional time domain.

Parameters:

convolved_array (dict, numpy.ndarray) – oversampled convolved stimulus vector as per lazyfmri.glm.convolve_hrf()
scan_length (int) – number of volumes in data (= scan_length in lazyfmri.glm.make_stimulus_vector())
interpolate (str, optional) – interpolation method, by default ‘nearest’

Returns:

convolved stimulus vector in time domain that matches the fMRI acquisition

Return type:

dict, numpy.ndarray

Example

from lazyfmri.glm import resample_stim_vector
scan_length = 230
convolved_stim_vector_left_ds = resample_stim_vector(convolved_stim_vector_left, scan_length)
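
A nearest-sample downsampling, in the spirit of the default interpolate=‘nearest’, can be sketched as follows. downsample_nearest is a hypothetical helper; the library's actual resampling routine may differ.

```python
def downsample_nearest(oversampled, scan_length):
    """Pick `scan_length` evenly spaced samples from the oversampled vector."""
    step = len(oversampled) / scan_length
    last = len(oversampled) - 1
    return [
        oversampled[min(int(round(i * step)), last)]
        for i in range(scan_length)
    ]


oversampled = list(range(1000))   # stand-in for a convolved, oversampled vector
resampled = downsample_nearest(oversampled, scan_length=10)
```

The output has one value per volume, so it can be placed directly into a design matrix alongside the functional data.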

lazyfmri.glm.spm_hrf(osf=1, TR=0.105, dispersion=False, derivative=False, time_length=25)[source]