Utils

cloudcasting.utils.find_contiguous_t0_time_periods(contiguous_time_periods: DataFrame, history_duration: timedelta, forecast_duration: timedelta) → DataFrame

Get all time periods which contain valid t0 datetimes. t0 is the datetime of the most recent observation.

Parameters:
  • contiguous_time_periods (pd.DataFrame) – DataFrame of contiguous time periods, with columns start_dt and end_dt

  • history_duration (timedelta) – Duration of the history

  • forecast_duration (timedelta) – Duration of the forecast

Returns:

A DataFrame with two columns start_dt and end_dt (where ‘dt’ is short for ‘datetime’). Each row represents a single time period.

Return type:

pd.DataFrame
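One plausible reading of this function, sketched below: a t0 is valid only if a full history window fits before it and a full forecast window fits after it, so each period is shrunk by history_duration at its start and by forecast_duration at its end. The name sketch_t0_periods is hypothetical, and dropping periods that become empty is an assumption; the library's actual handling of too-short periods may differ.

```python
from datetime import timedelta

import pandas as pd


def sketch_t0_periods(
    periods: pd.DataFrame,
    history_duration: timedelta,
    forecast_duration: timedelta,
) -> pd.DataFrame:
    """Hypothetical illustration only; not the cloudcasting implementation."""
    t0 = periods.copy()  # leave the caller's DataFrame untouched
    # The earliest valid t0 needs a full history window before it.
    t0["start_dt"] = t0["start_dt"] + history_duration
    # The latest valid t0 needs a full forecast window after it.
    t0["end_dt"] = t0["end_dt"] - forecast_duration
    # Assumption: periods too short to hold any t0 are dropped.
    return t0[t0["start_dt"] <= t0["end_dt"]].reset_index(drop=True)


periods = pd.DataFrame(
    {
        "start_dt": [pd.Timestamp("2024-01-01 00:00")],
        "end_dt": [pd.Timestamp("2024-01-01 06:00")],
    }
)
t0_periods = sketch_t0_periods(periods, timedelta(hours=1), timedelta(hours=3))
print(t0_periods)
```

With a one-hour history and a three-hour forecast, the six-hour period above leaves t0 values between 01:00 and 03:00.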

cloudcasting.utils.find_contiguous_time_periods(datetimes: DatetimeIndex, min_seq_length: int, max_gap_duration: timedelta) → DataFrame

Return a pd.DataFrame where each row records the boundary of a contiguous time period.

Parameters:
  • datetimes (pd.DatetimeIndex) – Must be sorted.

  • min_seq_length (int) – Sequences of min_seq_length or shorter will be discarded. Typically, this would be set to the total_seq_length of each machine learning example.

  • max_gap_duration (timedelta) – If any pair of consecutive datetimes is more than max_gap_duration apart, then this pair of datetimes will be considered a “gap” between two contiguous sequences. Typically, max_gap_duration would be set to the sample period of the timeseries.

Returns:

The DataFrame has two columns start_dt and end_dt (where ‘dt’ is short for ‘datetime’). Each row represents a single time period.

Return type:

pd.DataFrame
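The behaviour described above can be sketched as follows: split the index wherever two consecutive datetimes are more than max_gap_duration apart, then discard runs of min_seq_length or fewer timestamps. The name sketch_contiguous_periods is hypothetical, and this is an illustration of the documented behaviour under those assumptions, not the library's code.

```python
from datetime import timedelta

import numpy as np
import pandas as pd


def sketch_contiguous_periods(
    datetimes: pd.DatetimeIndex,
    min_seq_length: int,
    max_gap_duration: timedelta,
) -> pd.DataFrame:
    """Hypothetical illustration only; not the cloudcasting implementation."""
    # True where the step from datetimes[i] to datetimes[i + 1] is a gap.
    gap_mask = (datetimes[1:] - datetimes[:-1]) > pd.Timedelta(max_gap_duration)
    # Each run starts at index 0 or just after a gap; ends are exclusive.
    starts = np.concatenate(([0], np.flatnonzero(gap_mask) + 1))
    ends = np.concatenate((starts[1:], [len(datetimes)]))
    rows = []
    for start, end in zip(starts, ends):
        # Runs of min_seq_length or shorter are discarded, as documented.
        if end - start > min_seq_length:
            rows.append(
                {"start_dt": datetimes[start], "end_dt": datetimes[end - 1]}
            )
    return pd.DataFrame(rows, columns=["start_dt", "end_dt"])


# Five timestamps at 5-minute spacing, a gap, then two more timestamps.
times = pd.date_range("2024-01-01 00:00", periods=5, freq="5min").append(
    pd.date_range("2024-01-01 01:00", periods=2, freq="5min")
)
found = sketch_contiguous_periods(
    times, min_seq_length=3, max_gap_duration=timedelta(minutes=5)
)
print(found)
```

Here only the first run survives: it has five timestamps, while the second has two and falls under the min_seq_length threshold.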

cloudcasting.utils.lon_lat_to_geostationary_area_coords(x: Sequence[float], y: Sequence[float], xr_data: Dataset | DataArray) → tuple[Sequence[float], Sequence[float]]

Load the geostationary area from xr_data and convert lon-lat coordinates to geostationary coordinates.

Parameters:
  • x (Sequence[float]) – Longitude east-west

  • y (Sequence[float]) – Latitude north-south

  • xr_data (xr.Dataset | xr.DataArray) – xarray object with geostationary area

Returns:

x, y in geostationary coordinates

Return type:

tuple[Sequence[float], Sequence[float]]

cloudcasting.utils.numpy_validation_collate_fn(samples: list[tuple[Float[ndarray, 'channels time height width'], Float[ndarray, 'channels rollout_steps height width']]]) → tuple[Float[ndarray, 'batch channels time height width'], Float[ndarray, 'batch channels rollout_steps height width']]

Collate a list of data + targets into a batch.

Parameters:

samples (list[tuple[SampleInputArray, SampleOutputArray]]) – List of (X, y) samples, where each X has shape (channels, time, height, width) and each y has shape (channels, rollout_steps, height, width)

Returns:

The collated batch of X samples in the form (batch, channels, time, height, width) and the collated batch of y samples in the form (batch, channels, rollout_steps, height, width)

Return type:

tuple[np.ndarray, np.ndarray]
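As a minimal sketch of what collating means here, assuming it amounts to stacking the per-sample arrays along a new leading batch axis: the name sketch_collate and the array sizes below are hypothetical, and the library function may do more than this.

```python
import numpy as np


def sketch_collate(
    samples: list[tuple[np.ndarray, np.ndarray]],
) -> tuple[np.ndarray, np.ndarray]:
    """Hypothetical illustration only; not the cloudcasting implementation."""
    # Separate the (X, y) pairs into one tuple of inputs and one of targets.
    X_list, y_list = zip(*samples)
    # np.stack adds a new leading axis, which becomes the batch dimension.
    return np.stack(X_list), np.stack(y_list)


# Three samples: 11 channels, 4 history steps, 12 rollout steps, 8x8 pixels.
samples = [
    (
        np.zeros((11, 4, 8, 8), dtype=np.float32),
        np.zeros((11, 12, 8, 8), dtype=np.float32),
    )
    for _ in range(3)
]
X_batch, y_batch = sketch_collate(samples)
print(X_batch.shape, y_batch.shape)
```

A function of this shape can be passed as the collate_fn of a data loader when batches should stay as NumPy arrays rather than being converted to tensors.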