Preproc
The functions and classes in this file cover basic preprocessing of dataframes, including computing frequency spectra, ICA, regressing out confounds, and filtering.
- class lazyfmri.preproc.DataFilter[source]
Bases: object
A class for filtering functional fMRI data based on subject, task, and run identifiers. It supports multiple filtering strategies, including high-pass and low-pass filtering.
- Parameters:
func (pd.DataFrame) – The input functional data as a Pandas DataFrame.
**kwargs (dict) – Additional filtering parameters.
Example
from lazyfmri.preproc import DataFilter

obj = DataFilter(
    func=df_func,
    filter_strategy="hp",
    hp_kw={"cutoff": 0.01},
)
filtered_df = obj.get_result()
- filter_input(**kwargs)[source]
Filter input data
Filters the input data by applying subject-level, task-level, and run-level filtering.
- Parameters:
**kwargs (dict) – Additional parameters for filtering.
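The subject-, task-, and run-level passes can be pictured as a group-wise apply. The following is a minimal sketch in plain Python, not lazyfmri's implementation: the record layout, the `demean` stand-in filter, and the helper name `filter_by_group` are all hypothetical and only illustrate filtering each (subject, task, run) group independently.

```python
from collections import defaultdict


def demean(values):
    """Stand-in 'filter': remove the mean of one run (hypothetical)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def filter_by_group(records, filt=demean):
    """Apply `filt` separately to every (subject, task, run) group.

    `records` is a list of dicts with keys subject, task, run, t, value.
    Returns new records in the original order with filtered values.
    """
    groups = defaultdict(list)
    for idx, rec in enumerate(records):
        groups[(rec["subject"], rec["task"], rec["run"])].append(idx)

    out = [dict(rec) for rec in records]
    for indices in groups.values():
        # Sort by time so the filter sees a proper time series
        indices.sort(key=lambda i: records[i]["t"])
        filtered = filt([records[i]["value"] for i in indices])
        for i, v in zip(indices, filtered):
            out[i]["value"] = v
    return out


records = [
    {"subject": 1, "task": "A", "run": 1, "t": 0, "value": 10.0},
    {"subject": 1, "task": "A", "run": 1, "t": 1, "value": 12.0},
    {"subject": 1, "task": "A", "run": 2, "t": 0, "value": 100.0},
    {"subject": 1, "task": "A", "run": 2, "t": 1, "value": 104.0},
]
result = filter_by_group(records)
```

Each run is filtered on its own, so the large baseline difference between run 1 and run 2 does not leak into either result.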
- filter_runs(df_func, **kwargs)[source]
Filter runs
Extracts and processes functional data for each unique run in the dataset.
- Parameters:
df_func (pd.DataFrame) – Functional data to be filtered.
**kwargs (dict) – Additional parameters for filtering.
- Returns:
Filtered functional data, concatenated across runs.
- Return type:
pd.DataFrame
- filter_subjects(df_func, **kwargs)[source]
Filter subjects
Extracts and processes functional data for each unique subject in the dataset.
- Parameters:
df_func (pd.DataFrame) – Functional data to be filtered.
**kwargs (dict) – Additional parameters for filtering.
- Returns:
Filtered functional data, concatenated across subjects.
- Return type:
pd.DataFrame
- filter_tasks(df_func, **kwargs)[source]
Filter tasks
Extracts and processes functional data for each unique task in the dataset.
- Parameters:
df_func (pd.DataFrame) – Functional data to be filtered.
**kwargs (dict) – Additional parameters for filtering.
- Returns:
Filtered functional data, concatenated across tasks.
- Return type:
pd.DataFrame
- get_result()[source]
Get filtered result
Returns the final filtered DataFrame.
- Returns:
Filtered functional data.
- Return type:
pd.DataFrame
- plot_task_avg(orig=None, filt=None, t_col='t', avg=True, plot_title=None, incl_task=None, sf=None, use_cols=['#cccccc', 'r'], power_kws={}, make_figure=True, **kwargs)[source]
Plot task-averaged time series
Plots the original and filtered time series averaged across tasks.
- Parameters:
orig (pd.DataFrame, optional) – Original unfiltered data. Defaults to self.func.
filt (pd.DataFrame, optional) – Filtered data. Defaults to self.df_filt.
t_col (str, optional) – Column name representing time. Default is “t”.
avg (bool, optional) – Whether to compute the average time series across subjects. Default is True.
plot_title (str or dict, optional) – Title for the plot. If dict, it should contain additional title arguments.
incl_task (str or list, optional) – Specific tasks to include. If None, all tasks are included.
sf (matplotlib.figure.SubFigure, optional) – SubFigure object for multiple plots.
use_cols (list, optional) – Colors to use for the original and filtered data. Default is [“#cccccc”, “r”].
power_kws (dict, optional) – Additional parameters for power spectrum computation.
make_figure (bool, optional) – Whether to create a new figure. Default is True.
**kwargs (dict) – Additional plotting parameters.
- Returns:
If make_figure=True, returns a figure. Otherwise, returns a DataFrame of task-averaged time series.
- Return type:
matplotlib.figure.Figure or pd.DataFrame
- classmethod power_spectrum(tc1, tc2, axs=None, TR=0.105, figsize=(5, 5), **kwargs)[source]
Compute power spectrum
Computes and plots the power spectrum of two time series.
- Parameters:
tc1 (pd.DataFrame) – First time series.
tc2 (pd.DataFrame) – Second time series.
axs (matplotlib.axes._axes.Axes, optional) – Matplotlib axis object for plotting. If None, a new figure is created.
TR (float, optional) – Repetition time (TR) of the fMRI scan. Default is 0.105 seconds.
figsize (tuple, optional) – Figure size for plotting. Default is (5, 5).
**kwargs (dict) – Additional parameters.
- Returns:
Power spectrum plot.
- Return type:
matplotlib.figure.Figure
- classmethod single_filter(func, filter_strategy='hp', hp_kw={}, lp_kw={}, **kwargs)[source]
Apply a single filtering step
Performs high-pass or low-pass filtering on the input data.
- Parameters:
func (pd.DataFrame) – Functional data to be filtered.
filter_strategy (str or list, optional) – Filtering strategy to apply. Options: [“hp”, “lp”]. Default is “hp”.
hp_kw (dict, optional) – Parameters for high-pass filtering.
lp_kw (dict, optional) – Parameters for low-pass filtering.
**kwargs (dict) – Additional parameters.
- Returns:
Filtered data.
- Return type:
pd.DataFrame
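How a strategy given as a string or a list might be dispatched can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the two stand-in filters, the `FILTERS` table, and the rule that a list is applied in the order given are all hypothetical.

```python
def highpass_demean(ts):
    """Stand-in high-pass step: subtract the mean (hypothetical)."""
    mean = sum(ts) / len(ts)
    return [v - mean for v in ts]


def lowpass_moving_average(ts, width=3):
    """Stand-in low-pass step: centered moving average (hypothetical)."""
    half = width // 2
    out = []
    for i in range(len(ts)):
        window = ts[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


FILTERS = {"hp": highpass_demean, "lp": lowpass_moving_average}


def apply_filters(ts, filter_strategy="hp", hp_kw=None, lp_kw=None):
    """Apply one strategy, or a list of strategies in the order given."""
    if isinstance(filter_strategy, str):
        filter_strategy = [filter_strategy]
    kwargs = {"hp": hp_kw or {}, "lp": lp_kw or {}}
    for name in filter_strategy:
        if name not in FILTERS:
            raise ValueError(f"Unknown filter strategy: {name!r}")
        ts = FILTERS[name](ts, **kwargs[name])
    return ts


ts = [1.0, 2.0, 3.0, 4.0, 5.0]
hp_only = apply_filters(ts, "hp")
both = apply_filters(ts, ["hp", "lp"], lp_kw={"width": 3})
```

The sketch uses `None` instead of `{}` as the default for the keyword dictionaries, which avoids sharing one mutable default between calls.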
- class lazyfmri.preproc.EventRegression[source]
Bases: InitFitter
Performs event regression on functional fMRI data. This class takes functional time series and event onsets to regress out specific event-related activity.
- Parameters:
func (pd.DataFrame) – Functional time series data.
onsets (pd.DataFrame) – Event onsets with associated event types.
TR (float, optional) – Repetition time (TR) of the fMRI scan. Default is 0.105 seconds.
merge (bool, optional) – Whether to merge event-related regressors. Default is False.
evs (list, str, optional) – List of event types to regress out. If None, all event types will be used.
ses (int, optional) – Session identifier, if applicable.
prediction_plot (bool, optional) – Whether to generate plots for predicted timecourses. Default is False.
result_plot (bool, optional) – Whether to generate plots for the final regression results. Default is False.
save_ext (str, optional) – File extension for saved plots (e.g., “svg” or “png”). Default is “svg”.
reg_kw (dict, optional) – Keyword arguments for regression.
**kwargs (dict) – Additional keyword arguments for processing.
Example
from lazyfmri.preproc import EventRegression

obj = EventRegression(
    func=df_func,
    onsets=df_onsets,
    TR=0.105,
    evs=["stimulus", "response"],
    result_plot=True
)
regressed_df = obj.df_regress
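The idea of regressing out event-related activity can be illustrated with a single regressor and ordinary least squares. This is a deliberately simplified sketch, not what EventRegression does internally: the boxcar regressor, the helper names, and the choice to keep the intercept are assumptions, and a real model would typically use a convolved or FIR design with several regressors.

```python
def boxcar(onsets, duration, n_samples, TR):
    """Build a 0/1 regressor that is 1 while an event is 'on'."""
    reg = [0.0] * n_samples
    for onset in onsets:
        start = int(round(onset / TR))
        stop = int(round((onset + duration) / TR))
        for i in range(start, min(stop, n_samples)):
            reg[i] = 1.0
    return reg


def regress_out(y, x):
    """Fit y = a + b*x by ordinary least squares and remove b*x.

    The intercept a is kept, so the cleaned series retains its baseline.
    """
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    cleaned = [yi - beta * xi for xi, yi in zip(x, y)]
    return cleaned, beta


TR = 0.5  # seconds per sample (assumed for this toy example)
x = boxcar(onsets=[2.0, 6.0], duration=1.0, n_samples=20, TR=TR)
y = [5.0 + 2.0 * xi for xi in x]  # baseline of 5 plus an event response of 2
cleaned, beta = regress_out(y, x)
```

In this noise-free toy case the estimated `beta` recovers the simulated response amplitude, and `cleaned` is flat at the baseline.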
- classmethod plot_model_fits(model, save=False, fig_dir=None, basename=None, TR=0.105, cm='inferno', ext='svg', time_col='time', w_ratio=[0.8, 0.2], evs=None, loc=[0, 1], **kwargs)[source]
Plot model fits
Visualizes model-predicted and observed time series for different voxels.
- Parameters:
model (object) – Fitted model object.
save (bool, optional) – Whether to save the plot. Default is False.
fig_dir (str, optional) – Directory to save figures.
basename (str, optional) – Basename for saved figures.
TR (float, optional) – Repetition time (TR) of the fMRI scan. Default is 0.105 seconds.
cm (str, optional) – Colormap for plotting.
ext (str, optional) – File extension for saving plots.
**kwargs (dict) – Additional plotting parameters.
- Return type:
None
- plot_power_spectrum(tc2, axs=None, TR=0.105, figsize=(5, 5), **kwargs)[source]
Plot power spectrum
Computes and plots the power spectrum before and after regression.
- Parameters:
tc1 (pd.DataFrame) – Original time series.
tc2 (pd.DataFrame) – Regressed time series.
axs (matplotlib.axes._axes.Axes, optional) – Matplotlib axis object for plotting.
TR (float, optional) – Repetition time (TR) of the fMRI scan. Default is 0.105 seconds.
figsize (tuple, optional) – Figure size. Default is (5, 5).
**kwargs (dict) – Additional plotting parameters.
- Returns:
Power spectrum plot.
- Return type:
matplotlib.figure.Figure
- classmethod plot_result(raw, regr, avg=True, save=False, fig_dir=None, basename=None, TR=0.105, ext='svg', w_ratio=[0.8, 0.2], cols=['#cccccc', 'r'], evs=None, **kwargs)[source]
- plot_timecourse_prediction(tc2, axs=None, figsize=(16, 4), time_col='t', t_axis=None, TR=0.105, **kwargs)[source]
Plot timecourse prediction
Plots original and predicted timecourses to visualize regression results.
- Parameters:
tc1 (pd.DataFrame) – Original time series.
tc2 (pd.DataFrame) – Predicted time series from the regression model.
axs (matplotlib.axes._axes.Axes, optional) – Matplotlib axis object for plotting.
figsize (tuple, optional) – Figure size. Default is (16, 4).
time_col (str, optional) – Column name for time axis. Default is “t”.
t_axis (list or np.ndarray, optional) – Time axis values.
TR (float, optional) – Repetition time (TR) of the fMRI scan. Default is 0.105 seconds.
**kwargs (dict) – Additional plotting parameters.
- Returns:
Timecourse prediction plot.
- Return type:
matplotlib.figure.Figure
- regress_input(**kwargs)[source]
Perform event regression on input data
Runs event regression for all subjects in the dataset.
- Parameters:
**kwargs (dict) – Additional keyword arguments for processing.
- regress_runs(df_func, df_onsets, basename=None, final_ev=True, make_figure=False, plot_kw={}, reg_kw={}, **kwargs)[source]
Regress out events per run
Performs event regression separately for each run.
- Parameters:
df_func (pd.DataFrame) – Functional time series data.
df_onsets (pd.DataFrame) – Event onsets for each run.
basename (str, optional) – Basename for saving figures. Default is None.
final_ev (bool, optional) – Whether this is the final event to be regressed. Default is True.
make_figure (bool, optional) – Whether to generate plots. Default is False.
plot_kw (dict, optional) – Additional plotting parameters.
reg_kw (dict, optional) – Additional regression parameters.
**kwargs (dict) – Additional keyword arguments.
- Returns:
Functional data with event regressors removed.
- Return type:
pd.DataFrame
- regress_subjects(df_func, df_onsets, evs=None, ses=None, reg_kw={}, **kwargs)[source]
Regress out events per subject
Performs event regression separately for each subject.
- Parameters:
df_func (pd.DataFrame) – Functional time series data.
df_onsets (pd.DataFrame) – Event onsets for each subject.
evs (list, str, optional) – List of event types to regress out. Default is None (all events).
ses (int, optional) – Session identifier, if applicable.
reg_kw (dict, optional) – Additional regression parameters.
**kwargs (dict) – Additional keyword arguments.
- Returns:
Functional data with event regressors removed.
- Return type:
pd.DataFrame
- regress_tasks(df_func, df_onsets, basename=None, reg_kw={}, **kwargs)[source]
Regress out events per task
Performs event regression separately for each task.
- Parameters:
df_func (pd.DataFrame) – Functional time series data.
df_onsets (pd.DataFrame) – Event onsets for each task.
basename (str, optional) – Basename for saving figures. Default is None.
reg_kw (dict, optional) – Additional regression parameters.
**kwargs (dict) – Additional keyword arguments.
- Returns:
Functional data with event regressors removed.
- Return type:
pd.DataFrame
- classmethod single_regression(func, onsets, reg_kw={}, **kwargs)[source]
Regress out events per subject
Performs event regression separately for each subject.
- Parameters:
func (pd.DataFrame) – Functional time series data.
onsets (pd.DataFrame) – Event onsets for each subject.
evs (list, str, optional) – List of event types to regress out. Default is None (all events).
ses (int, optional) – Session identifier, if applicable.
reg_kw (dict, optional) – Additional regression parameters.
**kwargs (dict) – Additional keyword arguments.
- Returns:
Functional data with event regressors removed.
- Return type:
pd.DataFrame
- class lazyfmri.preproc.ICA[source]
Bases: object
Wrapper around scikit-learn’s FastICA, with a few visualization options. The basic input needs to be a pandas.DataFrame or numpy.ndarray describing a 2D dataset (e.g., the output of linescanning.dataset.Dataset or linescanning.dataset.ParseFuncFile).
- Parameters:
subject (str, optional) – Subject ID to use when saving figures (e.g., sub-001)
data (pd.DataFrame, np.ndarray) – Dataset to be ICA’d, in the format <time,voxels>
n_components (int, optional) – Number of components to use, by default 10
filter_confs (float, optional) – Specify a high-pass frequency cut off to retain task-related frequencies, by default 0.02. If you do not want to high-pass filter the components, set filter_confs=None and keep_comps to the components you want to retain (e.g., keep_comps=[0,1] to retain the first two components)
keep_comps (list, optional) – Specify a list of components to keep from the data, rather than all high-pass components. If filter_confs=None, but keep_comps is given, no high-pass filtering is applied to the components. If filter_confs=None and keep_comps=None, an error will be thrown. You must specify filter_confs and/or keep_comps
verbose (bool, optional) – Turn on verbosity; prints some stuff to the terminal, by default False
TR (float, optional) – Repetition time or sampling rate, by default 0.105
save_as (str, optional) – Path pointing to the location where to save the figures (sub-<subject>_run-{self.run}_desc-ica.{self.save_ext}), by default None
session (int, optional) – Session ID to use when saving figures (e.g., 1), by default 1
run (int, optional) – Run ID to use when saving figures (e.g., 1), by default 1
summary_plot (bool, optional) – Make a figure regarding the efficacy of the ICA denoising, by default False
melodic_plot (bool, optional) – Make a figure regarding the information about the components themselves, by default False
ribbon (tuple, optional) – Range of gray matter voxels. If None, we’ll check the efficacy of ICA denoising over the average across the data, by default None
save_ext (str, optional) – Extension to use when saving figures, by default “svg”
Example
from lazyfmri.preproc import ICA

# initialize
ica_obj = ICA(
    data_obj.hp_zscore_df,
    subject=f"sub-{sub}",
    session=ses,
    run=3,
    n_components=10,
    TR=data_obj.TR,
    filter_confs=0.18,
    keep_comps=1,
    verbose=True,
    ribbon=None
)

# actually run the regression
ica_obj.regress()
- melodic()[source]
Plot information about the components from the ICA. For each component until plot_comps, plot the 2D spatial profile of the component, its timecourse, and its power spectrum. If zoom_freq=True, we’ll add an extra subplot next to the power spectrum which contains a zoomed in version of the power spectrum with zoom_lim as limits.
- Parameters:
color (str, tuple, optional) – Color for all subplots, by default “#6495ED”
zoom_freq (bool, optional) – Add a zoomed in version of the power spectrum, by default False
task_freq (float, optional) – If zoom_freq=True, add a vertical line where the task-frequency (task_freq) should be, by default 0.05
zoom_lim (list, optional) – Limits for the zoomed in power spectrum, by default [0,0.5]
plot_comps (int, optional) – Limit the number of plots being produced in case you have a lot of components, by default 10
Example
ica_obj.melodic(
    # color="r",
    zoom_freq=True,
    zoom_lim=[0,0.25]
)
- lazyfmri.preproc.get_freq()[source]
Create power spectra of input timeseries with the ability to select implementations from nitime. Fourier transform is implemented as per J. Siero’s implementation.
- Parameters:
func (np.ndarray) – Array of shape(timepoints,)
TR (float, optional) – Repetition time, by default 0.105
spectrum_type (str, optional) – Method for extracting power spectra, by default ‘psd’. Must be one of ‘mtaper’, ‘fft’, ‘psd’, or ‘periodogram’, as per nitime’s implementations.
clip_power (_type_, optional) – _description_, by default None
- Returns:
freq – numpy.ndarray representing the frequencies
power – numpy.ndarray representing the power spectra
- Raises:
ValueError – If invalid spectrum_type is given. Must be one of psd, mtaper, fft, or periodogram.
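To make the relation between TR, frequency bins, and power concrete, here is a minimal sketch of a plain Fourier-based spectrum. It is not the nitime-backed implementation described above: the function name `power_spectrum`, the mean removal, and the scaling by the number of samples are assumptions chosen for illustration, and the direct O(n^2) transform is used only to keep the example dependency-free.

```python
import cmath
import math


def power_spectrum(ts, TR):
    """Return (freqs, power) for the non-negative frequencies of `ts`.

    Uses a direct discrete Fourier transform for clarity; an FFT routine
    would be the practical choice for real data.
    """
    n = len(ts)
    mean = sum(ts) / n
    centered = [v - mean for v in ts]  # drop the DC offset
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coef = sum(
            v * cmath.exp(-2j * math.pi * k * t / n)
            for t, v in enumerate(centered)
        )
        freqs.append(k / (n * TR))  # bin k corresponds to k / (n * TR) Hz
        power.append(abs(coef) ** 2 / n)
    return freqs, power


TR = 0.1  # seconds per sample (assumed for this toy example)
n = 100
ts = [math.sin(2 * math.pi * 1.0 * t * TR) for t in range(n)]  # 1 Hz sine
freqs, power = power_spectrum(ts, TR)
peak_freq = freqs[power.index(max(power))]
```

The frequency resolution is 1 / (n * TR) and the highest resolvable frequency is 1 / (2 * TR), so a short TR such as 0.105 s gives access to frequencies up to roughly 4.8 Hz.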
- lazyfmri.preproc.highpass_dct()[source]
The discrete cosine transform (DCT) provides a basis set of cosine regressors of varying frequencies up to a filter cutoff of a specified number of seconds. Many software packages use 100s or 128s as a default cutoff, but take care that the filter cutoff is not too short for your specific experimental design. Longer trials will require longer filter cutoffs. See this paper for a more technical treatment of using the DCT as a high-pass filter in fMRI data analysis (https://canlab.github.io/_pages/tutorials/html/high_pass_filtering.html).
- Parameters:
func (np.ndarray) – <n_voxels, n_timepoints> representing the functional data to be filtered
lb (float, optional) – cutoff frequency for the high-pass filter (default = 0.01 Hz)
TR (float, optional) – Repetition time of functional run, by default 0.105
modes_to_remove (int, optional) – Remove first X cosines
- Returns:
dct_data (np.ndarray) – array of shape(n_voxels, n_timepoints)
cosine_drift (np.ndarray) – Cosine drifts of shape(n_scans, n_drifts) plus a constant regressor at cosine_drift[:, -1]
Notes
High-pass filters remove low-frequency (slow) noise and pass high-frequency signals.
Low-pass filters remove high-frequency noise and thus smooth the data.
Band-pass filters allow only certain frequencies and filter everything else out.
Notch filters remove certain frequencies.
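The DCT high-pass idea described for highpass_dct can be sketched in a few lines. This is an illustration for a single time series, not the function's actual code: the name `dct_highpass`, the rule used to count the regressors (a common convention of keeping every cosine slower than the cutoff), and the toy signal are assumptions.

```python
import math


def dct_highpass(ts, cutoff_hz=0.01, TR=0.105):
    """Remove slow drifts by projecting out low-frequency DCT regressors.

    Returns (filtered, n_regressors). The constant term (k = 0) is always
    removed, so the output has zero mean.
    """
    n = len(ts)
    # Number of cosines slower than the cutoff (assumed convention)
    order = min(n - 1, int(math.floor(2 * n * TR * cutoff_hz)))
    filtered = list(ts)
    for k in range(order + 1):  # k = 0 is the constant regressor
        basis = [math.cos(math.pi * k * (t + 0.5) / n) for t in range(n)]
        # DCT-II regressors are mutually orthogonal, so each one can be
        # fitted and subtracted independently of the others.
        norm = sum(b * b for b in basis)
        coef = sum(v * b for v, b in zip(ts, basis)) / norm
        filtered = [f - coef * b for f, b in zip(filtered, basis)]
    return filtered, order


TR, n = 0.5, 200  # toy acquisition: 200 samples, 100 s in total
fast = [math.sin(0.4 * math.pi * t) for t in range(n)]  # 0.4 Hz signal
drift = [10.0 + 3.0 * math.cos(math.pi * 2 * (t + 0.5) / n) for t in range(n)]
ts = [f + d for f, d in zip(fast, drift)]
filtered, order = dct_highpass(ts, cutoff_hz=0.033, TR=TR)
```

Here the offset and the 0.01 Hz drift fall below the cutoff and are removed, while the 0.4 Hz component passes through almost unchanged.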
- lazyfmri.preproc.lowpass_savgol()[source]
The Savitzky-Golay filter is a low-pass filter used to smooth data. It takes the original noisy signal (as a one-dimensional array), the window size (the number of points used to calculate each fit), and the order of the polynomial function used to fit the signal. Smoothing approximates the original function, keeping the important features and discarding meaningless fluctuations. To do this, successive subsets of points are fitted with a polynomial function that minimizes the fitting error.
The procedure is repeated across all data points, yielding a new series of data points that fits the original signal. A comprehensive description of the Savitzky-Golay filter can be found [here](https://en.wikipedia.org/wiki/Savitzky%E2%80%93Golay_filter).
- Parameters:
func (np.ndarray) – <n_voxels, n_timepoints> representing the functional data to be filtered
window_length (int) – Length of window to use for filtering. Must be an odd number according to the scipy documentation (default = 7)
poly_order (int) – Order of polynomial fit to employ within window_length. Default = 3
- Returns:
<n_voxels, n_timepoints> from which high frequencies have been removed
- Return type:
np.ndarray
Notes
High-pass filters remove low-frequency (slow) noise and pass high-frequency signals.
Low-pass filters remove high-frequency noise and thus smooth the data.
Band-pass filters allow only certain frequencies and filter everything else out.
Notch filters remove certain frequencies.
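For the documented defaults (window_length=7, poly_order=3), the local polynomial fit reduces to a fixed set of weights, which makes the filter easy to sketch by hand. The code below is an illustration only: it hard-codes the classic 7-point weights, leaves the edges untouched, and uses the hypothetical name `savgol_7_3`. For arbitrary windows and orders, `scipy.signal.savgol_filter` is the general routine.

```python
# Classic Savitzky-Golay smoothing weights for a 7-point window and a
# quadratic/cubic polynomial (they sum to 1).
SG_7_3 = [w / 21.0 for w in (-2, 3, 6, 7, 6, 3, -2)]


def savgol_7_3(ts):
    """Smooth `ts` with a 7-point, order-3 Savitzky-Golay filter.

    Interior points are replaced by the value of the local cubic fit at the
    window centre; the first and last three samples are left untouched to
    keep the sketch short.
    """
    half = len(SG_7_3) // 2
    out = list(ts)
    for i in range(half, len(ts) - half):
        window = ts[i - half: i + half + 1]
        out[i] = sum(w * v for w, v in zip(SG_7_3, window))
    return out


# A cubic is reproduced exactly, because the local fit is itself cubic.
cubic = [0.5 * t ** 3 - 2.0 * t + 1.0 for t in range(12)]
smoothed_cubic = savgol_7_3(cubic)

# Sample-to-sample jitter is strongly attenuated.
jitter = [(-1.0) ** t for t in range(12)]
smoothed_jitter = savgol_7_3(jitter)
```

The two toy inputs show the trade-off: smooth structure up to the polynomial order is preserved, while alternating noise shrinks to about a quarter of its amplitude.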