DSIMPORT - load data into DynaSim formatted data structure.

Usage:
  [data,studyinfo] = dsImport(data_file)
  [data,studyinfo] = dsImport(studyinfo)
  data = dsImport(data_file)

Inputs:
  - First input/argument:
    - data_file: data file name in accepted format (csv, mat, ...)
    - cell array of data files
    - study_dir
    - studyinfo structure
    - studyinfo file
  - options:
    'verbose_flag': {0,1} (default: 1)
    'process_id'  : process identifier for loading studyinfo if necessary
    'time_limits' : [beg,end] ms (see NOTE 2)
    'variables'   : cell array of matrix names, or a single name as a string
                    (see NOTE 2)
    'simIDs'      : array of simIDs to import (default: [])

Outputs:
  - DynaSim data structure:
    data.labels           : list of state variables and monitors recorded
    data.(state_variables): state variable data matrix [time x cells]
    data.(monitors)       : monitor data matrix [time x cells]
    data.time             : time vector [time x 1]
    data.simulator_options: simulator options used to generate simulated data
    data.model            : model used to generate simulated data
    [data.varied]         : list of varied model components
    [data.results]        : list of derived data sets created by post-processing
  - studyinfo: DynaSim studyinfo structure (see dsCheckStudyinfo)
    Note: if data is missing, studyinfo.simulations will only show found data

Notes:
  - NOTE 1: CSV import assumes the file is organized like the output of
    dsWriteDynaSimSolver: time points along rows; state variables and
    monitors in columns; the first column is the time vector; the next
    columns are state variables; the final columns are monitors. The first
    row holds a header for each column. If a population has more than one
    cell, its cells occupy sequential columns with the same header repeated
    for each cell.

  - NOTE 2: DynaSim data exported to MAT-files are HDF-compatible. To obtain
    partial data sets without having to load the entire file, use dsImport
    with options 'time_limits' and/or 'variables'. Alternatively, the entire
    data set can be loaded using dsImport with default options, then subsets
    extracted using dsSelect with appropriate options.

Examples:
  - Example 1: full data set
      data=dsImport('data.mat'); % load single data set
      data=dsImport(studyinfo);  % load all data sets in studyinfo.study_dir
  - Example 2: partial data set with HDF-style loading
      data=dsImport('data.mat','variables','pop1_v','time_limits',[1000 4000])

TODO:
  - specify subsets to return in terms of varied parameters, time_limits, ROIs,
    etc. Possible format for specifying range_varied: {'E','gNa',[.1 .3];
    'I->E','tauI',[15 25]; 'I','mechanism_list','+iM'}
  - achieve by calling function dsSelect() at end of this function.
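NOTE 2 relies on MATLAB being able to read part of a variable from a -v7.3 (HDF5-based) MAT-file. The following is a minimal, self-contained sketch of that mechanism using only built-in MATLAB functions (save, matfile). The file name, the variable name pop1_v, and the row range are hypothetical and chosen purely for illustration; this is not DynaSim's own export code.

```matlab
% Write a small -v7.3 MAT-file with hypothetical contents.
fname  = [tempname '.mat'];
time   = (0:0.1:100)';             % time vector [time x 1], 1001 points
pop1_v = rand(numel(time), 4);     % state variable [time x cells]
save(fname, 'time', 'pop1_v', '-v7.3');

% Open the file without loading it, then read only a block of rows.
obj      = matfile(fname);         % MAT-file object
rows     = 101:401;                % contiguous time indices to read
v_subset = obj.pop1_v(rows, :);    % only these rows are read from disk
t_subset = obj.time(rows, 1);      % matfile indexing needs every dimension

delete(fname);                     % clean up the temporary file
```

Requesting a contiguous block of rows, as here, is the same access pattern dsImport uses when 'time_limits' is given.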
function [data,studyinfo] = dsImport(file,varargin)
%DSIMPORT - load data into DynaSim formatted data structure.
%
% Usage:
%   [data,studyinfo] = dsImport(data_file)
%   [data,studyinfo] = dsImport(studyinfo)
%   data = dsImport(data_file)
%
% Inputs:
%   - First input/argument:
%     - data_file: data file name in accepted format (csv, mat, ...)
%     - cell array of data files
%     - study_dir
%     - studyinfo structure
%     - studyinfo file
%   - options:
%     'verbose_flag': {0,1} (default: 1)
%     'process_id'  : process identifier for loading studyinfo if necessary
%     'time_limits' : [beg,end] ms (see NOTE 2)
%     'variables'   : cell array of matrix names, or a single name as a string
%                     (see NOTE 2)
%     'simIDs'      : array of simIDs to import (default: [])
%
% Outputs:
%   - DynaSim data structure:
%     data.labels           : list of state variables and monitors recorded
%     data.(state_variables): state variable data matrix [time x cells]
%     data.(monitors)       : monitor data matrix [time x cells]
%     data.time             : time vector [time x 1]
%     data.simulator_options: simulator options used to generate simulated data
%     data.model            : model used to generate simulated data
%     [data.varied]         : list of varied model components
%     [data.results]        : list of derived data sets created by post-processing
%   - studyinfo: DynaSim studyinfo structure (see dsCheckStudyinfo)
%     Note: if data is missing, studyinfo.simulations will only show found data
%
% Notes:
%   - NOTE 1: CSV import assumes the file is organized like the output of
%     dsWriteDynaSimSolver: time points along rows; state variables and
%     monitors in columns; the first column is the time vector; the next
%     columns are state variables; the final columns are monitors. The first
%     row holds a header for each column. If a population has more than one
%     cell, its cells occupy sequential columns with the same header repeated
%     for each cell.
%
%   - NOTE 2: DynaSim data exported to MAT-files are HDF-compatible. To obtain
%     partial data sets without having to load the entire file, use dsImport
%     with options 'time_limits' and/or 'variables'. Alternatively, the entire
%     data set can be loaded using dsImport with default options, then subsets
%     extracted using dsSelect with appropriate options.
%
% Examples:
%   - Example 1: full data set
%       data=dsImport('data.mat'); % load single data set
%       data=dsImport(studyinfo);  % load all data sets in studyinfo.study_dir
%   - Example 2: partial data set with HDF-style loading
%       data=dsImport('data.mat','variables','pop1_v','time_limits',[1000 4000])
%
% TODO:
%   - specify subsets to return in terms of varied parameters, time_limits, ROIs,
%     etc. Possible format for specifying range_varied: {'E','gNa',[.1 .3];
%     'I->E','tauI',[15 25]; 'I','mechanism_list','+iM'}
%   - achieve by calling function dsSelect() at end of this function.

% See also: dsSimulate, dsExportData, dsCheckData, dsSelect
%
% Author: Jason Sherfey, PhD <jssherfey@gmail.com>
% Copyright (C) 2016 Jason Sherfey, Boston University, USA


% Check inputs
options=dsCheckOptions(varargin,{...
  'verbose_flag',1,{0,1},...
  'process_id',[],[],... % process identifier for loading studyinfo if necessary
  'time_limits',[],[],...
  'variables',[],[],...
  'simIDs',[],[],...
  'auto_gen_test_data_flag',0,{0,1},...
  },false);

%% auto_gen_test_data_flag argin
if options.auto_gen_test_data_flag
  varargs = varargin;
  varargs{find(strcmp(varargs, 'auto_gen_test_data_flag'))+1} = 0;
  varargs(end+1:end+2) = {'unit_test_flag',1};
  argin = [{file}, varargs]; % specific to this function
end

% allow a single variable name to be given as a string
if ischar(options.variables)
  options.variables = {options.variables};
end

% check if input is a DynaSim study_dir or path to studyinfo
if ischar(file)
  if isdir(file) % study directory
    study_dir = file;
    clear file
    file.study_dir = study_dir;
  elseif ~isempty(strfind(file, 'studyinfo')) % path to a studyinfo file
    filePath = fileparts2(file);
    if isempty(filePath)
      filePath = pwd;
    end
    study_dir = filePath;
    clear file
    file.study_dir = study_dir;
  end
end

if isstruct(file) && isfield(file,'study_dir')
  % "file" is a studyinfo structure.
  % retrieve most up-to-date studyinfo structure from studyinfo.mat file
  studyinfo = dsCheckStudyinfo(file.study_dir,'process_id',options.process_id, varargin{:});

  % select simulations to import: all of them by default, otherwise those
  % whose sim_id matches one of the requested simIDs
  if isempty(options.simIDs)
    simsInds = 1:length(studyinfo.simulations);
  else
    [~,~,simsInds] = intersect(options.simIDs, [studyinfo.simulations.sim_id]);
  end

  % get list of data_files from studyinfo
  data_files = {studyinfo.simulations(simsInds).data_file};
  success = cellfun(@exist,data_files)==2;

  if ~all(success)
    % convert original absolute paths to paths relative to study_dir
    for i = 1:length(data_files)
      [~,fname,fext] = fileparts2(data_files{i});
      data_files{i} = fullfile(file.study_dir,'data',[fname fext]);
    end

    success = cellfun(@exist,data_files)==2;
  end

  data_files = data_files(success);
  % index through simsInds so that sim_info stays aligned with data_files
  % when only a subset of simIDs was requested
  simsInds = simsInds(success);
  sim_info = studyinfo.simulations(simsInds);
  studyinfo.simulations = sim_info; % remove missing data

  % load each data set recursively
  keyvals = dsOptions2Keyval(options);
  num_files = length(data_files);

  for i = 1:num_files
    fprintf('loading file %g/%g: %s\n',i,num_files,data_files{i});
    tmp_data=dsImport(data_files{i},keyvals{:});
    num_sets_per_file=length(tmp_data);
    modifications=sim_info(i).modifications;

    if ~isfield(tmp_data,'varied') && ~isempty(modifications)
      % add varied info
      % this is necessary here when loading .csv data lacking metadata
      tmp_data.varied={};
      modifications(:,1:2) = cellfun( @(x) strrep(x,'->','_'),modifications(:,1:2),'UniformOutput',0);

      for j=1:size(modifications,1)
        varied=[modifications{j,1} '_' modifications{j,2}];
        for k=1:num_sets_per_file
          tmp_data(k).varied{end+1}=varied;
          tmp_data(k).(varied)=modifications{j,3};
        end
      end
    end

    % store this data
    if i==1
      total_num_sets=num_sets_per_file*num_files;
      set_indices=0:num_sets_per_file:total_num_sets-1;

      % preallocate full data matrix based on first data file
      data(1:total_num_sets)=tmp_data(1);
      %   data(1:length(data_files))=tmp_data;
      % else
      %   data(i)=tmp_data;
    end
    % replace i-th set of data sets by these data sets
    data(set_indices(i)+(1:num_sets_per_file))=tmp_data;
  end

  return;
else
  studyinfo=[];
end

% check if input is a list of data files (TODO: eliminate duplicate code by
% combining with the above recursive loading for studyinfo data_files)
if iscellstr(file)
  data_files=file;
  success=cellfun(@exist,data_files)==2;
  data_files=data_files(success);
  keyvals=dsOptions2Keyval(options);

  % load each data set recursively
  for i=1:length(data_files)
    tmp_data=dsImport(data_files{i},keyvals{:});
    % store this data
    if i==1
      % preallocate full data matrix based on first data file
      data(1:length(data_files))=tmp_data;
    else
      % replace i-th data element by this data set
      data(i)=tmp_data;
    end
  end
  return;
end

if ischar(file)
  [~,~,ext]=fileparts2(file);
  switch lower(ext)
    case '.mat'
      % MAT-file contains data fields as separate variables (-v7.3 for HDF)
      if isempty(options.time_limits) && isempty(options.variables)
        % load full data set
        data=load(file);

        % if file only contains a structure called 'data' then return that
        if isfield(data,'data') && length(fieldnames(data))==1
          data=data.data;
        end
      else
        % load partial data set
        % use matfile() to load HDF subsets given varargin options...
        obj=matfile(file); % MAT-file object
        varlist=who(obj);  % variables stored in mat-file
        labels=obj.labels; % list of state variables and monitors

        if iscellstr(options.variables) % restrict variables to load
          labels=labels(ismember(labels,options.variables));
        end

        % rebuild the (downsampled) time vector from the simulator options
        simulator_options=obj.simulator_options;
        time=(simulator_options.tspan(1):simulator_options.dt:simulator_options.tspan(2))';
        time=time(1:simulator_options.downsample_factor:length(time));

        if ~isempty(options.time_limits)
          % determine time indices to load
          time_indices=nearest(time,options.time_limits(1)):nearest(time,options.time_limits(2));
        else
          % load all time points
          time_indices=1:length(time);
        end

        % create DynaSim data structure:
        data=[];
        data.labels=labels;

        % load state variables and monitors
        for i=1:length(labels)
          data.(labels{i})=obj.(labels{i})(time_indices,:);
        end

        data.time=time(time_indices);
        data.simulator_options=simulator_options;

        if ismember('model',varlist)
          data.model=obj.model;
        end

        if ismember('varied',varlist)
          varied=obj.varied;
          data.varied=varied;
          for i=1:length(varied)
            data.(varied{i})=obj.(varied{i});
          end
        end

        if ismember('results',varlist)
          results=obj.results;
          if iscellstr(options.variables)
            results=results(ismember(results,options.variables));
          end
          data.results=results;

          % load results
          for i=1:length(results)
            data.(results{i})=obj.(results{i})(time_indices,:);
          end
        end
      end
    case '.csv'
      % assumes CSV file contains data organized according to output from dsWriteDynaSimSolver:
      data=dsImportCSV(file);

      if ~(isempty(options.time_limits) && isempty(options.variables))
        % limit to select subsets
        data=dsSelect(data,varargin{:}); % todo: create dsSelect()
      end
    otherwise
      error('file type not recognized. dsImport currently supports DynaSim data structure in MAT file, data values in CSV file.');
  end
end

%% auto_gen_test_data_flag argout
if options.auto_gen_test_data_flag
  argout = {data, studyinfo}; % specific to this function

  dsUnitSaveAutoGenTestData(argin, argout);
end

end % main fn
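
The partial-loading branch in the listing does not read a stored time vector. It rebuilds one from simulator_options (tspan, dt, downsample_factor) and then maps 'time_limits' to row indices. The sketch below walks through that arithmetic with hypothetical option values. The listing calls a helper named nearest whose implementation is not shown here, so the sketch substitutes a plain min(abs(...)) search on the assumption that the helper returns the index of the closest time point.

```matlab
% Hypothetical simulator options; field names follow the listing above.
simulator_options.tspan = [0 100];           % ms
simulator_options.dt = 0.5;                  % ms
simulator_options.downsample_factor = 2;     % keep every 2nd sample
time_limits = [10 40];                       % ms, hypothetical request

% Rebuild the stored (downsampled) time vector.
time = (simulator_options.tspan(1):simulator_options.dt:simulator_options.tspan(2))';
time = time(1:simulator_options.downsample_factor:length(time));

% Stand-in for the nearest() helper: index of the closest time point.
[~, first_idx] = min(abs(time - time_limits(1)));
[~, last_idx]  = min(abs(time - time_limits(2)));
time_indices = first_idx:last_idx;           % rows to read from each variable
```

With these values the downsampled vector has 101 points spaced 1 ms apart, and the request for 10 to 40 ms maps to rows 11 through 41.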