Generic
- read_bufr(path, reader='generic', columns=[], filters={}, required_columns=True)
Extract the specified columns from BUFR as a pandas.DataFrame using a hierarchical collector.
- Parameters:
path (str, bytes, os.PathLike or a message list object) – path to the BUFR file or a message list object
columns (str, sequence[str]) – a list of BUFR keys and computed keys to extract from each BUFR message/subset. Please note that computed keys do not preserve their position in columns but are placed at the end of the resulting DataFrame.
filters (dict) – defines the conditions under which the specified columns are extracted. The individual conditions are combined with the logical AND operator to form the filter. See Filters for details.
required_columns (bool, iterable[str]) – the list of ecCodes BUFR keys that are required to be present in the BUFR message/subset. It has a twofold meaning:
- if any of the keys in required_columns is missing in the message/subset, the whole message/subset is skipped
- if all the keys in required_columns are present, the message/subset is processed even if some keys from columns are missing (supposing the filter conditions are met)
Bool values are interpreted as follows:
- True means all the keys in columns are required: if any of the keys in columns is missing in the message/subset, the whole message/subset is skipped.
- False means no columns are required.
- Return type:
pandas.DataFrame
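The required_columns rules above can be illustrated with a small pure-Python sketch. It is not pdbufr's implementation: the helper name is_processed and the idea of passing the set of keys present in a message/subset are assumptions made purely for illustration, and filters are left out.

```python
def is_processed(present_keys, columns, required_columns=True):
    """Return True if a message/subset passes the required_columns check.

    Hypothetical helper: `present_keys` is the set of keys found in one
    message/subset. This mirrors the documented rules only.
    """
    if isinstance(columns, str):
        columns = [columns]  # a single key may be given as a plain string
    if isinstance(required_columns, bool):
        # True: every key in columns is required; False: nothing is required.
        required = set(columns) if required_columns else set()
    else:
        required = set(required_columns)
    # The message/subset is skipped as soon as one required key is missing.
    return required.issubset(present_keys)


columns = ["latitude", "longitude", "airTemperature"]
present = {"latitude", "longitude"}  # airTemperature is missing

print(is_processed(present, columns, required_columns=True))          # False
print(is_processed(present, columns, required_columns=False))         # True
print(is_processed(present, columns, required_columns=["latitude"]))  # True
```

With the default required_columns=True the message/subset is skipped because airTemperature is absent, while the other two settings let it through.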
How the generic reader works
The generic reader interprets each BUFR message/subset as a hierarchical structure (see Hierarchical structure for details). During data extraction pdbufr traverses this hierarchy, and when all the columns are collected and all the filters match, a new record is added to the output. In this way several records can be extracted from the same message/subset.
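The traversal described above can be pictured with a toy model. The sketch below is not pdbufr's collector: the nested "values"/"children" layout, the collect and matches helpers, and the treatment of a non-list filter value as an equality test are all assumptions made for illustration. The data values are taken from the first rows of the example output further down.

```python
def matches(value, condition):
    # Assumption: a list/tuple/set condition means membership,
    # anything else means equality.
    if isinstance(condition, (list, tuple, set)):
        return value in condition
    return value == condition


def collect(node, columns, filters, inherited=None, records=None):
    """Depth-first walk that emits one record wherever the columns become complete."""
    inherited = inherited or {}
    if records is None:
        records = []
    # Keys seen higher up in the hierarchy stay visible to the nodes below.
    context = {**inherited, **node.get("values", {})}
    complete = all(c in context for c in columns)
    already_complete = all(c in inherited for c in columns)
    if complete and not already_complete:
        # All conditions must hold (logical AND).
        if all(matches(context.get(k), cond) for k, cond in filters.items()):
            records.append({c: context[c] for c in columns})
    for child in node.get("children", []):
        collect(child, columns, filters, context, records)
    return records


# Hypothetical message: one location with several pressure levels.
message = {
    "values": {"latitude": 58.47, "longitude": -78.08},
    "children": [
        {"values": {"pressure": 100300.0, "airTemperature": 258.3}},
        {"values": {"pressure": 100000.0, "airTemperature": 259.7}},
        {"values": {"pressure": 99800.0, "airTemperature": 261.1}},
    ],
}
columns = ("latitude", "longitude", "pressure", "airTemperature")

records = collect(message, columns, filters={})
filtered = collect(message, columns, filters={"pressure": [100000.0]})
print(len(records), len(filtered))
```

Each pressure level completes the set of columns once, so the single message yields one record per level, all sharing the same latitude and longitude.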
Example
The input is one of the test data files with classic radiosonde observations, where each message contains a single location (“latitude”, “longitude”) with several pressure levels of temperature, dewpoint etc. The message hierarchy is shown in the following snapshot:
[Snapshot of the message hierarchy; image not available]
To extract the temperature profile for the first two stations we can use this code:
    import pdbufr

    df = pdbufr.read_bufr(
        "tests/sample_data/temp.bufr",
        columns=("latitude", "longitude", "pressure", "airTemperature"),
        filters={"count": [1, 2]},
    )

which results in the following DataFrame:
        latitude  longitude  pressure  airTemperature
    0      58.47     -78.08  100300.0           258.3
    1      58.47     -78.08  100000.0           259.7
    2      58.47     -78.08   99800.0           261.1
    ...
    46     53.75     -73.67   25000.0           221.1
    47     53.75     -73.67   23200.0           223.1
    48     53.75     -73.67   20500.0           221.5

    [48 rows x 4 columns]