2. Dataframe creator class (DldFlashDataframeCreatorExpress)
- class processor.DldFlashDataframeCreatorExpress.DldFlashProcessorExpress(runNumber=None, channels=None, settings=None, beamtime_dir=None, parquet_path=None, parquet_dir=None, beamtime_id=None, year=None, daq='fl1user2', silent=False)[source]
The class generates multi-indexed, multidimensional pandas DataFrames from the new FLASH data format, resolved by macrobunch, microbunch and electron.
- property addChannels
Add new channels using a dict of the form:
"channel_name": {
"format": "per_pulse" | "per_train" | "per_electron",
"group_name": "channel_group_path",
"slice": ":"
}
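As an illustration of the dict format described here, the following is a minimal sketch of a channel definition and a small sanity check. The channel name, the group path and the `validate_channels` helper are hypothetical and not part of the documented class; treating all three keys as required is also an assumption.

```python
# Hypothetical channel definition in the documented dict format.
# The channel name and group path are placeholders, not real DAQ paths.
new_channels = {
    "exampleChannel": {
        "format": "per_pulse",
        "group_name": "/hypothetical/group/path/",
        "slice": ":",
    }
}

# The three format values listed in the documentation.
ALLOWED_FORMATS = {"per_pulse", "per_train", "per_electron"}


def validate_channels(channels):
    """Check that each entry has the documented keys and an allowed format.

    Assumption: "format", "group_name" and "slice" are all required.
    """
    for name, spec in channels.items():
        missing = {"format", "group_name", "slice"} - spec.keys()
        if missing:
            raise ValueError(f"{name}: missing keys {sorted(missing)}")
        if spec["format"] not in ALLOWED_FORMATS:
            raise ValueError(f"{name}: unknown format {spec['format']!r}")
    return True
```

Checking the dict before handing it to the processor makes a mistyped format string fail early, rather than partway through reading a file.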
- property availableChannels
Returns the channel names available for use, as defined by the JSON file, excluding pulseId.
- property channelsPerElectron
Returns a list of channels with per_electron format
- property channelsPerPulse
Returns a list of channels with per_pulse format, including all auxiliary channels
- property channelsPerTrain
Returns a list of channels with per_train format
- concatenateChannels(h5_file, format_=None)[source]
Returns a concatenated pandas DataFrame for either all pulse resolved or all electron resolved channels.
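The idea of concatenating channels can be sketched with plain pandas. This is not the class's implementation: the index level names (`trainId`, `pulseId`), the channel names and the values are made up for illustration, and the sketch assumes that all per-channel DataFrames share the same MultiIndex.

```python
import pandas as pd

# Assumed shared index for two pulse-resolved channels (illustrative values).
index = pd.MultiIndex.from_tuples(
    [(1001, 0), (1001, 1), (1002, 0)], names=["trainId", "pulseId"]
)

# One single-column DataFrame per channel, as a per-channel reader might return.
channel_a = pd.DataFrame({"channelA": [1.0, 2.0, 3.0]}, index=index)
channel_b = pd.DataFrame({"channelB": [10.0, 20.0, 30.0]}, index=index)

# Column-wise concatenation aligns rows on the shared MultiIndex.
combined = pd.concat([channel_a, channel_b], axis=1)
```

Because `pd.concat(..., axis=1)` aligns on the index, channels with the same resolution end up as columns of one DataFrame.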
- createDataframePerChannel(h5_file, channel)[source]
Returns a pandas DataFrame for a given channel name for a given file. The DataFrame carries the MultiIndex, and what is returned depends on the channel’s format.
- createDataframePerFile(file_path)[source]
Returns two pandas DataFrames constructed for the given file. The DataFrames contain the datasets from the iterable in the order opposite to that specified by the channel names. One DataFrame is pulse resolved and the other is electron resolved.
- createMultiIndexPerElectron(h5_file)[source]
Creates an index per electron using pulseId for usage with the electron resolved pandas dataframe
- createMultiIndexPerPulse(train_id, np_array)[source]
Creates an index per pulse using a pulse resolved channel’s macrobunch ID, for usage with the pulse resolved pandas dataframe
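A per-pulse MultiIndex of this kind could look roughly like the sketch below. It is not the method's actual code. It assumes that `np_array` is two-dimensional with shape (trains, pulses), that pulses are numbered from 0 within each train, and that the index levels are called `trainId` and `pulseId`; the function name and channel name are hypothetical.

```python
import numpy as np
import pandas as pd


def multi_index_per_pulse(train_id, np_array):
    """Build a (trainId, pulseId) MultiIndex for a 2D per-pulse array.

    Assumption: np_array has shape (n_trains, n_pulses).
    """
    train_id = np.asarray(train_id)
    n_pulses = np_array.shape[1]
    # Cartesian product: every train paired with every pulse position.
    return pd.MultiIndex.from_product(
        [train_id, np.arange(n_pulses)], names=["trainId", "pulseId"]
    )


# Illustrative data: two trains with three pulses each.
train_id = np.array([1001, 1002])
values = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])

index = multi_index_per_pulse(train_id, values)
# Row-major flattening matches the order produced by from_product.
df = pd.DataFrame({"exampleChannel": values.ravel()}, index=index)
```

Flattening in row-major order keeps each value next to its (train, pulse) pair, so a single pulse can be looked up with `df.loc[(train, pulse)]`.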
- createNumpyArrayPerChannel(h5_file, channel)[source]
Returns a NumPy array for a given channel name for a given file
- readData(runs=None, ignore_missing_runs=False, settings=None, channels=None, beamtime_dir=None, parquet_path=None, beamtime_id=None, year=None, daq='fl1user2')[source]
Read express data from DAQ, generating a parquet in between.
- Args:
- runs: int | list
run number or list of run numbers to load
- ignore_missing_runs: bool
if False, raises FileNotFoundError if the files for a run are not available.
- settings: str | path
pointer to the ini settings file, handled by the dldProcessor class. It can be the name of a default settings file found in the settings dir of the repo, or the path to a specific settings file.
- channels: list
list of channel names to include in the dataframe. If None, defaults to all available channels
- beamtime_dir: str | path
path to the raw data. If None, it is inferred from the settings file
- parquet_path: str | path
path relative to beamtime_dir where the parquet files are stored. Defaults to “beamtime_dir/processed/parquet”
- beamtime_id: int
the id of the beamtime. If None, it is inferred from settings
- year: int
the year of the beamtime. If None, it is inferred from settings
- daq: str
the daq containing the data. If None, it is inferred from settings
- returns:
- prc: DldProcessor
returns an instance of the processor class, with electron and pulse dataframes loaded.
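Two of the argument conventions above, `runs` accepting either an int or a list and `parquet_path` defaulting to a location under `beamtime_dir`, can be sketched as follows. Both helper functions and the example path are hypothetical illustrations of the documented behaviour, not functions provided by the package.

```python
from pathlib import Path


def normalize_runs(runs):
    """Accept a single run number or a list of run numbers; return a list."""
    if isinstance(runs, int):
        return [runs]
    return list(runs)


def resolve_parquet_dir(beamtime_dir, parquet_path=None):
    """Resolve the parquet directory relative to beamtime_dir.

    Mirrors the documented default of "beamtime_dir/processed/parquet"
    when parquet_path is not given.
    """
    base = Path(beamtime_dir)
    if parquet_path is None:
        return base / "processed" / "parquet"
    return base / parquet_path


# Illustrative values only; the beamtime directory is a made-up path.
runs = normalize_runs(12345)
parquet_dir = resolve_parquet_dir("/data/beamtime")
```

Normalising `runs` up front lets the rest of a loader iterate over run numbers without special-casing the single-run call.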
- property removeChannels
Removes unnecessary channels from the available channels, given a list of the channels to remove