This Python script contains three functions (`rank_corr_grid`, `anomly_cal`, and `averaged_rank_corr`)...

August 25, 2025 at 07:17 PM

import pandas as pd
from scipy.stats import rankdata, spearmanr

def rank_corr_grid(data, verifier, subject, name):
    print(verifier+ '-' +subject)
    # 2. Prep columns
    data['Date'] = pd.to_datetime(data['Date'])
    data['Month'] = data['Date'].dt.month
    data['Year'] = data['Date'].dt.year

    # 3. Filter to summer months (June, July, August); copy so the rank columns
    #    below are added to an independent DataFrame, not a slice of the input
    data = data[data['Month'].isin([6, 7, 8])].copy()
    print(data[[verifier, subject]])
    # 4. Rank verifier and subject by grid and month, across years
    data[verifier + '_rank'] = data.groupby(['PageName', 'Month'])[verifier].transform(
        lambda x: rankdata(x, method='average'))
    data[subject + '_rank'] = data.groupby(['PageName', 'Month'])[subject].transform(
        lambda x: rankdata(x, method='average'))

    # 7. Compute Pearson correlation for each grid cell (PageName).
    #    grid_correlation is a helper that is not defined in this snippet.
    result = data.groupby('PageName').apply(lambda subdf: grid_correlation(subdf, verifier, subject)).reset_index()
    result.columns = ['PageName', 'R']
    print(result)
    # 8. Save result
    result.to_csv(name+'_'+verifier+'-'+subject+'_correlation.csv', index=False)
    return result

def anomly_cal(data, verifier, subject):
  data.dropna(inplace=True)
  data['Date'] = pd.to_datetime(data['Date'], format="%Y-%m-%d")
  data['Month'] = data['Date'].dt.month
  data['Year'] = data['Date'].dt.year
  ##################################################
  month_ave = data[[verifier, subject, 'Year', 'Month', 'PageName']].groupby(
      ['Year', 'Month', 'PageName']).mean()
  month_ave.reset_index(inplace=True)
  ##################################################
  hist_month_ave = month_ave[[verifier, subject, 'Month', 'PageName']].groupby(['Month', 'PageName']).mean()
  hist_month_ave.reset_index(inplace=True)
  ##################################################
  hist_month_ave.rename(columns={verifier: verifier+'_M'}, inplace=True)
  hist_month_ave.rename(columns={subject: subject+'_M'}, inplace=True)

  data_t = data.merge(hist_month_ave, on=['PageName', 'Month'])
  data_t[verifier+'_A'] = data_t[verifier] - data_t[verifier+'_M']
  data_t[subject+'_A'] = data_t[subject] - data_t[subject+'_M']
  return data_t

def averaged_rank_corr(data_t, verifier, subject):
    ##################################################
    d = data_t.groupby(['Month', 'Year']).agg(verifier_A=(verifier, 'mean'),
                                              subject_A=(subject, 'mean')).reset_index()
    print(d)
    spearman_corr, spearman_p_value = spearmanr(d['verifier_A'], d['subject_A'])
    return spearman_corr, spearman_p_value

This Python script contains three functions (rank_corr_grid, anomly_cal, and averaged_rank_corr) that perform data analysis tasks such as ranking, anomaly calculation, and correlation analysis, primarily involving data grouped by spatial grids and time. Here's the breakdown:

1. Function: `rank_corr_grid(data, verifier, subject, name)`

This function calculates rank correlation between two variables (verifier and subject) for specific geographical grid cells, considering only summer months. Here's what it does:

Prepares data columns: The Date column is converted to a datetime object, and new columns for Month and Year are derived.
Filters summer months: The function filters the data to only include entries for June, July, and August (Month values 6, 7, 8).
Ranks data values: Within each (PageName, Month) group, the function ranks the verifier and subject values across years using SciPy's rankdata with method='average', so tied values share the mean of their ranks.
Computes correlation: The rows of each grid cell (PageName) are passed to the helper function grid_correlation, which is not defined in the script shown. The code comment describes the result as a Pearson correlation, presumably computed on the rank columns.
Saves results: The results, one row per grid cell with its correlation value (R), are saved to a CSV file named name_verifier-subject_correlation.csv, built from the input arguments.
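
Because grid_correlation is not part of the script, the ranking and per-grid correlation steps can only be sketched under an assumption. The version below is hypothetical: it assumes the helper computes Pearson's r between the two _rank columns of one grid cell. The column names NDVI and SM and the data values are made up for illustration.

```python
import pandas as pd
from scipy.stats import pearsonr, rankdata


def grid_correlation(subdf, verifier, subject):
    """Hypothetical helper (an assumption, not the original code):
    Pearson r between the two rank columns of a single grid cell."""
    x = subdf[verifier + '_rank']
    y = subdf[subject + '_rank']
    # Pearson r is undefined for fewer than two points or a constant series
    if len(subdf) < 2 or x.nunique() < 2 or y.nunique() < 2:
        return float('nan')
    r, _ = pearsonr(x, y)
    return r


# Synthetic data: two grid cells, one summer month, four years each
data = pd.DataFrame({
    'PageName': ['A'] * 4 + ['B'] * 4,
    'Month': [6] * 8,
    'NDVI': [0.2, 0.4, 0.6, 0.8, 0.9, 0.7, 0.5, 0.3],
    'SM': [10, 20, 30, 40, 1, 2, 3, 4],
})

# Rank each variable within its (grid, month) group, as the script does
for col in ['NDVI', 'SM']:
    data[col + '_rank'] = data.groupby(['PageName', 'Month'])[col].transform(
        lambda x: rankdata(x, method='average'))

# One correlation per grid cell
rows = [(page, grid_correlation(subdf, 'NDVI', 'SM'))
        for page, subdf in data.groupby('PageName')]
result = pd.DataFrame(rows, columns=['PageName', 'R'])
print(result)
```

In this toy data the two variables rise together in cell A and move in opposite directions in cell B, so R is close to 1 for A and close to -1 for B.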

2. Function: `anomly_cal(data, verifier, subject)`

This function calculates anomalies for two variables (verifier and subject). Anomalies quantify the deviation of a value from its historical average. Here's what the function does:

Cleans and prepares data: It drops rows with missing values in place (so the caller's DataFrame is modified), parses Date as a datetime in %Y-%m-%d format, and derives Month and Year columns.
Calculates monthly averages:
- Computes the average of each variable by year, month, and grid cell.
- Averages those year-month means over all years to get the historical mean for each month and grid cell, stored in columns with an _M suffix.
Calculates anomalies: It merges the historical means back onto the data and, for each row, subtracts them from verifier and subject. The anomalies are stored in columns with an _A suffix.
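
The anomaly steps can be traced on a tiny example. This is a sketch on made-up data for a single variable (the column name NDVI and the values are assumptions), following the same group, average, merge, and subtract sequence.

```python
import pandas as pd

# Synthetic data: one grid cell, two June observations in each of two years
data = pd.DataFrame({
    'Date': pd.to_datetime(['2020-06-01', '2020-06-15',
                            '2021-06-01', '2021-06-15']),
    'PageName': ['A'] * 4,
    'NDVI': [0.2, 0.4, 0.6, 0.8],
})
data['Month'] = data['Date'].dt.month
data['Year'] = data['Date'].dt.year

# Mean per (year, month, grid): about 0.3 for June 2020 and 0.7 for June 2021
month_ave = data.groupby(['Year', 'Month', 'PageName'],
                         as_index=False)['NDVI'].mean()

# Historical mean per (month, grid): the mean of the yearly means, about 0.5
hist = month_ave.groupby(['Month', 'PageName'], as_index=False)['NDVI'].mean()
hist = hist.rename(columns={'NDVI': 'NDVI_M'})

# Anomaly = observation minus the historical mean for its month and grid
out = data.merge(hist, on=['PageName', 'Month'])
out['NDVI_A'] = out['NDVI'] - out['NDVI_M']
print(out[['Date', 'NDVI', 'NDVI_M', 'NDVI_A']])
```

The anomalies come out as roughly -0.3, -0.1, 0.1, and 0.3. Note that the historical mean is an average of yearly means, so years with more observations do not weigh more heavily than years with fewer.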

3. Function: `averaged_rank_corr(data_t, verifier, subject)`

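This function computes the Spearman rank correlation between two columns after averaging them over all grid cells. It is intended for the anomalies, but it aggregates whichever columns the verifier and subject arguments name, so pass the anomaly column names (those ending in _A produced by anomly_cal) to correlate anomalies. Here's what it does:

Aggregates values: Averages the two columns over all rows for each (Month, Year) pair, giving one value per month of each year.
Computes correlation: Uses the spearmanr function from SciPy to compute the Spearman rank correlation between the two averaged series, and returns the coefficient together with its p-value.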
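
As a minimal sketch of that aggregation on made-up data, the example below passes the anomaly columns explicitly. The column names NDVI_A and SM_A and all values are assumptions for illustration.

```python
import pandas as pd
from scipy.stats import spearmanr

# Synthetic anomaly table: two grid cells, June of three years
data_t = pd.DataFrame({
    'Year': [2020, 2020, 2021, 2021, 2022, 2022],
    'Month': [6, 6, 6, 6, 6, 6],
    'PageName': ['A', 'B'] * 3,
    'NDVI_A': [-0.2, -0.1, 0.0, 0.1, 0.2, 0.3],
    'SM_A': [-3.0, -1.0, 0.5, 0.5, 4.0, 2.0],
})

# Average over grid cells for each (Month, Year) pair, using named aggregation
d = data_t.groupby(['Month', 'Year']).agg(
    verifier_A=('NDVI_A', 'mean'),
    subject_A=('SM_A', 'mean')).reset_index()

# Spearman correlation between the two domain-averaged series
rho, p_value = spearmanr(d['verifier_A'], d['subject_A'])
print(d)
print(rho, p_value)
```

Both averaged series increase from year to year here, so the coefficient is 1. With only three points the p-value carries little meaning; real use would need many more month-year pairs.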

Summary of Workflow:

Use rank_corr_grid for ranking and finding correlation over spatial grids in summer months.
Use anomly_cal to generate anomaly data by comparing values to historical averages.
Use averaged_rank_corr to compute the correlation between anomalies of two variables.

Each function serves a specific purpose in spatial-temporal analysis and provides intermediate or final correlation metrics.

Built by @thebuilderjr
