Neuroglia is a Python machine learning library for neurophysiology data. It provides scikit-learn compatible transformers for extracting features from extracellular electrophysiology & optical physiology data for machine learning pipelines.
For a brief introduction to the ideas behind the package, you can read the introductory notes. If you want to get started, see the installation page, then check out the API reference to learn how to use the package.
To see the code or report a bug, please visit the GitHub repository.
Neuroglia is a suite of scikit-learn transformers that facilitate converting between the canonical data structures used in ephys & ophys: tables of spike times, time-indexed traces, and event-aligned response tensors.
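For concreteness, here is a minimal sketch of what two of these structures look like as pandas objects (the names and toy values are illustrative, not part of the library):

import numpy as np
import pandas as pd

# "spike times" structure: one row per spike, with the spike time and the neuron that fired
spikes = pd.DataFrame({
    'time': [0.001, 0.003, 0.004],
    'neuron': ['n1', 'n2', 'n1'],
})

# "traces" structure: one row per sample time, one column per neuron
traces = pd.DataFrame(
    np.zeros((5, 2)),
    columns=['n1', 'n2'],
    index=np.arange(0.0, 0.005, 0.001),
)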
Transformations between these representations are implemented as scikit-learn transformers. This means that they are all defined as objects with “fit” and “transform” methods so that, for example, applying Gaussian smoothing to a population of spiking data means transforming from a “spike times” structure to a “traces” structure, like so:
import numpy as np
import neuroglia.spike

smoother = neuroglia.spike.Smoother(
    sample_times=np.arange(0, MAX_TIME, 0.001),  # <- the time base that the smoothed traces will be cast onto
    kernel='gaussian',  # <- the kernel that will be used
    tau=0.005,  # <- the width of the kernel, in whatever time base the spike times are in
)
smoothed_traces = smoother.fit_transform(SPIKES)
Conforming to the syntax expected by the scikit-learn API turns these transformers into building blocks that can plug into a scikit-learn pipeline. For example, let’s say you wanted to do some dimensionality reduction on the smoothed traces.
from sklearn.decomposition import NMF
nmf = NMF(n_components=10)
reduced_traces = nmf.fit_transform(smoothed_traces)
You could also chain these together like so:
from sklearn.pipeline import Pipeline

pipeline = Pipeline([
    ('smooth', smoother),
    ('reduce', nmf),
])
reduced_traces = pipeline.fit_transform(SPIKES)
And if you wanted to change an analysis step, it just becomes a matter of replacing that piece of the pipeline:
from sklearn.decomposition import PCA

pipeline = Pipeline([
    ('smooth', smoother),
    ('reduce', PCA(n_components=10)),
])
reduced_traces = pipeline.fit_transform(SPIKES)
I’ve also implemented annotating events with event-aligned responses, so I can build an entire decoding pipeline that decodes the stimulus that was presented to a population from (for example) the peak response in 10 ms bins within a 250 ms window after stimulus onset:
import numpy as np
from neuroglia.event import PeriEventSpikeSampler
from neuroglia.spike import Binner
from neuroglia.tensor import ResponseReducer
from sklearn.neighbors import KNeighborsClassifier

pipeline = Pipeline([
    ('sample', PeriEventSpikeSampler(
        spikes=SPIKES,
        sample_times=np.arange(0.0, 0.25, 0.01),  # 10 ms bins spanning 250 ms after each event
        tracizer=Binner,
    )),
    ('reduce', ResponseReducer(func=np.max)),  # peak response across bins
    ('classify', KNeighborsClassifier()),
])
Then, once this pipeline has been defined, we can take advantage of scikit-learn’s infrastructure for cross-validation to do a 4-fold cross-validation across stimulus presentations:
from sklearn.model_selection import cross_val_score

X = EVENTS['times']
y = EVENTS['image_id']
scores = cross_val_score(pipeline, X, y, cv=4)
These examples illustrate the major features of the package & how the API works.
The sources for neuroglia can be downloaded from the `Github repo`_.
$ git clone git://github.com/AllenInstitute/neuroglia
$ cd neuroglia
$ pip install -r requirements.txt
$ pip install ./
This is an example of how to create synthetic calcium traces
from neuroglia.datasets import make_calcium_traces
calcium = make_calcium_traces(duration=10.0,oscillation=False)
calcium['traces'].plot()
The synthetic spikes that underlie the synthetic calcium traces are also available
calcium['spikes'].plot()
We can also generate synthetic calcium traces where a gamma oscillation provides an input to the population
calcium = make_calcium_traces(duration=10.0,oscillation=True)
calcium['traces'].plot()
This is an example of how to decode natural images from ophys traces in V1
OEID = 541206592  # Brain Observatory ophys experiment ID
First, let’s download an experiment from the Allen Institute Brain Observatory
from allensdk.core.brain_observatory_cache import BrainObservatoryCache

boc = BrainObservatoryCache()
nwb_dataset = boc.get_ophys_experiment_data(OEID)
Next, we’ll load the dF/F traces and put them in a DataFrame
import pandas as pd

timestamps, dff = nwb_dataset.get_dff_traces()
neuron_ids = nwb_dataset.get_cell_specimen_ids()
traces = pd.DataFrame(
    dff.T,
    columns=neuron_ids,
    index=timestamps,
)
print(traces.head())
Next, we’ll load the stim_table
stim_table = nwb_dataset.get_stimulus_table('natural_scenes')
print(stim_table.head())
The stim_table lists stimulus times in terms of the start and end frames of the calcium traces, but we need start times and durations for neuroglia, so we’ll need to reshape it
stim_table['time'] = timestamps[stim_table['Start']]
stim_table['End'] = timestamps[stim_table['End']]
stim_table['duration'] = stim_table['End'] - stim_table['time']
print(stim_table.head())
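The published example ends here, but from this point the recipe mirrors the spike-based pipeline shown earlier: sample event-aligned traces into a tensor, reduce each response, and classify. Here is a minimal, hypothetical sketch using the PeriEventTraceSampler from the API reference; the sample times, the mean reduction, the classifier, and the assumption that the image identity lives in the stim_table's 'frame' column are all illustrative choices, not from the original example:

import numpy as np
from neuroglia.event import PeriEventTraceSampler
from neuroglia.tensor import ResponseReducer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_val_score

pipeline = Pipeline([
    # sample a window of dF/F around each stimulus onset (sample times are illustrative)
    ('sample', PeriEventTraceSampler(traces=traces, sample_times=np.arange(0.0, 0.5, 1 / 30.))),
    # reduce each event-aligned response to its mean across time
    ('reduce', ResponseReducer(func=np.mean)),
    ('classify', KNeighborsClassifier()),
])

# assumes the image identity is stored in the 'frame' column
scores = cross_val_score(pipeline, stim_table, stim_table['frame'], cv=4)
print(scores.mean())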
This is an example of how to infer spike events
First, we’ll generate some fake data
import numpy as np
import pandas as pd
from neuroglia.datasets import make_calcium_traces
data = make_calcium_traces(duration=10.0,oscillation=False)
traces = data['traces']
spikes = data['spikes']
Let’s plot the data
import matplotlib.pyplot as plt
traces.plot()
plt.show()
Now, we’ll deconvolve the data
from neuroglia.calcium import CalciumDeconvolver

deconvolver = CalciumDeconvolver()
detected_events = deconvolver.fit_transform(traces)
neuron_ids = traces.columns
for neuron in neuron_ids:
    y_true = spikes[neuron]
    y_pred = detected_events[neuron]
    corr = np.corrcoef(y_pred, y_true)[0, 1]
    print("{}: {:0.2f}".format(neuron, corr))

detected_events.plot()
plt.show()
Now, we’ll predict spikes
spikes_pred = deconvolver.predict(traces)
spikes_true = (spikes > 0).astype(int)
for neuron in neuron_ids:
    y_true = spikes_true[neuron]
    y_pred = spikes_pred[neuron]
    corr = np.corrcoef(y_pred, y_true)[0, 1]
    print("{}: {:0.2f}".format(neuron, corr))
This is an example of how to decode natural images from spikes recorded in V1
from __future__ import print_function
First, we need to load the data
data_path = '/allen/aibs/mat/RamIyer/frm_Dan/NWBFilesSev/V1_NI_pkl_data/'
import pandas as pd
ephys_data = pd.read_pickle(data_path+'M15_ni_data.pkl')
Let’s get the dataframe of image presentations and rename the columns
events = ephys_data['stim_table'].rename(
    columns={
        'Start': 'time',
        'Frame': 'image_id',
    },
)
print(events.head())
Next, let’s reformat the spike times into a single table
from neuroglia.nwb import SpikeTablizer
spikes = SpikeTablizer().fit_transform(ephys_data['spiketimes'])
print(spikes.head())
Now, we’ll sample spikes near each event & build them into an xarray 3D tensor
from neuroglia.event import PeriEventSpikeSampler
import numpy as np
spike_sampler = PeriEventSpikeSampler(
    spikes=spikes,
    sample_times=np.arange(0.1, 0.35, 0.01),
)
tensor = spike_sampler.fit_transform(events)
print(tensor)
We can get the average elicited spike count with the ResponseReducer
from neuroglia.tensor import ResponseReducer
import numpy as np
reducer = ResponseReducer(func=np.mean)
means = reducer.fit_transform(tensor)
print(means)
Let’s use a scikit-learn Pipeline to chain these steps into a single decoding pipeline
from sklearn.pipeline import Pipeline
from sklearn.neighbors import KNeighborsClassifier
pipeline = Pipeline([
    ('spike_sampler', PeriEventSpikeSampler(spikes=spikes, sample_times=np.arange(0.1, 0.35, 0.01))),
    ('extract', ResponseReducer(func=np.mean)),
    ('classify', KNeighborsClassifier()),
])
Now we can train the full pipeline on the training set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    events, events['image_id'], test_size=0.33,
)
pipeline.fit(X_train,y_train)
Finally, we’ll test the pipeline on the held-out data

score = pipeline.score(X_test, y_test)
n_images = len(events['image_id'].unique())
print(score * n_images)  # chance accuracy is 1/n_images, so values above 1 indicate above-chance decoding
spike.Binner(sample_times)
    Bin a population of spike events into an array of spike counts.
spike.Smoother(sample_times[, kernel, tau])
    Smooth a population of spike events into an array.
nwb.SpikeTablizer()
    Convert a dictionary of spike times to a dataframe of spike times.
trace.Binarizer([threshold, copy])
    Binarize data (set feature values to 0 or 1) according to a threshold.
trace.EdgeDetector([falling])
    Detect rising or falling edges in a trace.
trace.WhenTrueFinder()
    Finds times when a trace is non-negative.
calcium.CalciumDeconvolver
    Infer spike events from calcium traces.
calcium.MedianFilterDetrender
    Detrend calcium traces with a median filter.
calcium.SavGolFilterDetrender
    Detrend calcium traces with a Savitzky-Golay filter.
calcium.EventRescaler
event.PeriEventSpikeSampler(spikes, sample_times)
    Take event-aligned samples of spikes from a population of neurons.
event.PeriEventTraceSampler(traces, sample_times)
    Take event-aligned samples of traces from a population of neurons.
event.PeriEventTraceReducer(traces, sample_times)
    Take event-aligned samples of traces from a population of neurons.
epoch.EpochTraceReducer(traces, func)
    Take event-aligned samples of traces from a population of neurons.
tensor.ResponseReducer(func[, dim])
    Reduces a response tensor by performing a function along one dimension.
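As a quick illustration of how entries in this reference are used, here is a minimal sketch of spike.Binner, which follows the same pattern as the Smoother example at the top of this page (MAX_TIME and SPIKES are the same placeholders used there, and the 10 ms bin width is an arbitrary choice):

import numpy as np
from neuroglia.spike import Binner

# cast spike times onto a common time base as binned spike counts
binner = Binner(sample_times=np.arange(0, MAX_TIME, 0.01))
binned_spikes = binner.fit_transform(SPIKES)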