This Python script contains three functions (`rank_corr_grid`, `anomly_cal`, and `averaged_rank_corr`)...

August 25, 2025 at 07:17 PM

import pandas as pd
from scipy.stats import rankdata, spearmanr

# NOTE: grid_correlation is called below but is not defined in this snippet;
# it has to be provided elsewhere.


def rank_corr_grid(data, verifier, subject, name):
    print(verifier + '-' + subject)
    # 2. Prep columns
    data['Date'] = pd.to_datetime(data['Date'])
    data['Month'] = data['Date'].dt.month
    data['Year'] = data['Date'].dt.year
    # 3. Filter to summer months (June, July, August)
    # .copy() so the new rank columns are not written to a view of the input
    data = data[data['Month'].isin([6, 7, 8])].copy()
    print(data[[verifier, subject]])
    # 4. Rank NDVI and SMERGE by grid and month, across years
    data[verifier + '_rank'] = data.groupby(['PageName', 'Month'])[verifier].transform(
        lambda x: rankdata(x, method='average'))
    data[subject + '_rank'] = data.groupby(['PageName', 'Month'])[subject].transform(
        lambda x: rankdata(x, method='average'))
    # 7. Compute Pearson correlation for each grid cell (PageName)
    result = data.groupby('PageName').apply(
        lambda subdf: grid_correlation(subdf, verifier, subject)).reset_index()
    result.columns = ['PageName', 'R']
    print(result)
    # 8. Save result
    result.to_csv(name + '_' + verifier + '-' + subject + '_correlation.csv', index=False)
    return result


def anomly_cal(data, verifier, subject):
    data.dropna(inplace=True)  # modifies the caller's DataFrame
    data['Date'] = pd.to_datetime(data['Date'], format="%Y-%m-%d")
    data['Month'] = data['Date'].dt.month
    data['Year'] = data['Date'].dt.year
    # Mean per year, month, and grid cell
    month_ave = data[[verifier, subject, 'Year', 'Month', 'PageName']].groupby(
        ['Year', 'Month', 'PageName']).mean()
    month_ave.reset_index(inplace=True)
    # Historical mean per month and grid cell (mean of the yearly monthly means)
    hist_month_ave = month_ave[[verifier, subject, 'Month', 'PageName']].groupby(
        ['Month', 'PageName']).mean()
    hist_month_ave.reset_index(inplace=True)
    # Attach the historical means and subtract them to get anomalies
    hist_month_ave.rename(columns={verifier: verifier + '_M'}, inplace=True)
    hist_month_ave.rename(columns={subject: subject + '_M'}, inplace=True)
    data_t = data.merge(hist_month_ave, on=['PageName', 'Month'])
    data_t[verifier + '_A'] = data_t[verifier] - data_t[verifier + '_M']
    data_t[subject + '_A'] = data_t[subject] - data_t[subject + '_M']
    return data_t


def averaged_rank_corr(data_t, verifier, subject):
    # Mean of each column per (Month, Year), pooled over all grid cells
    d = data_t.groupby(['Month', 'Year']).agg(
        verifier_A=(verifier, 'mean'), subject_A=(subject, 'mean')).reset_index()
    print(d)
    spearman_corr, spearman_p_value = spearmanr(d['verifier_A'], d['subject_A'])
    return spearman_corr, spearman_p_value

This Python script contains three functions (rank_corr_grid, anomly_cal, and averaged_rank_corr) that perform data analysis tasks such as ranking, anomaly calculation, and correlation analysis, primarily involving data grouped by spatial grids and time. Here's the breakdown:


1. Function: rank_corr_grid(data, verifier, subject, name)

This function calculates rank correlation between two variables (verifier and subject) for specific geographical grid cells, considering only summer months. Here's what it does:

  • Prepares data columns: The Date column is converted to a datetime object, and new columns for Month and Year are derived.
  • Filters summer months: The function filters the data to only include entries for June, July, and August (Month values 6, 7, 8).
  • Ranks data values: Within each grid cell (PageName) and month, the verifier and subject values are ranked across years using the rankdata function from scipy.stats with method='average', so tied values share the mean of their ranks. The ranks are stored in new columns with a _rank suffix.
  • Computes correlation: For each grid cell (PageName), it calls the helper function grid_correlation, which is not defined in the snippet; according to the code comment, it computes the Pearson correlation between the two rank columns.
  • Saves results: The grid cell name and its correlation value (R) are written to a CSV file named <name>_<verifier>-<subject>_correlation.csv, and the result DataFrame is returned.
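
The ranking and per-grid correlation steps can be tried on a toy table. This is only a sketch: the grid_correlation helper is not shown in the original script, so the version below is a hypothetical stand-in that assumes it takes the Pearson correlation of the two _rank columns. The column names NDVI and SM and the toy values are also made up for illustration.

```python
import pandas as pd
from scipy.stats import rankdata, pearsonr


def grid_correlation(subdf, verifier, subject):
    """Hypothetical stand-in: Pearson r between the two rank columns of one grid cell."""
    x = subdf[verifier + "_rank"]
    y = subdf[subject + "_rank"]
    # Pearson r is undefined for fewer than two points or a constant series
    if len(subdf) < 2 or x.nunique() < 2 or y.nunique() < 2:
        return float("nan")
    r, _ = pearsonr(x, y)
    return float(r)


# Toy data: two grid cells, one summer month, four observations each
toy = pd.DataFrame({
    "PageName": ["A"] * 4 + ["B"] * 4,
    "Month": [6] * 8,
    "NDVI": [0.1, 0.2, 0.3, 0.4, 0.4, 0.3, 0.2, 0.1],
    "SM": [10, 20, 30, 40, 10, 20, 30, 40],
})

# Rank each variable within (PageName, Month), as the original function does
for col in ("NDVI", "SM"):
    toy[col + "_rank"] = toy.groupby(["PageName", "Month"])[col].transform(
        lambda x: rankdata(x, method="average"))

# One correlation value per grid cell
result = {page: grid_correlation(sub, "NDVI", "SM")
          for page, sub in toy.groupby("PageName")}
print(result)
```

In this toy table the two variables rise together in cell A and move in opposite directions in cell B, so the two cells should come out at opposite ends of the correlation range.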

2. Function: anomly_cal(data, verifier, subject)

This function calculates anomalies for two variables (verifier and subject). Anomalies quantify the deviation of a value from its historical average. Here's what the function does:

  • Prepares and cleans data: It drops rows with missing values (in place, so the caller's DataFrame is modified), converts Date to datetime using the %Y-%m-%d format, and derives Month and Year columns.
  • Calculates monthly averages:
    • Computes the average values by year, month, and grid cell.
    • Computes the historical average values by month and grid cell.
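  • Calculates anomalies: The historical monthly means are merged back onto the data on PageName and Month as columns with an _M suffix. Each anomaly is the original value minus that mean, stored in a column with an _A suffix. The function returns the merged DataFrame.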
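
A small worked example of the anomaly arithmetic, as a sketch with made-up data: one grid cell, one variable (hypothetically named NDVI), and two June observations in each of two years. It follows the same two-stage averaging as the function above (yearly monthly means first, then their mean across years).

```python
import pandas as pd

toy = pd.DataFrame({
    "Date": pd.to_datetime(["2020-06-01", "2020-06-02", "2021-06-01", "2021-06-02"]),
    "PageName": ["A"] * 4,
    "NDVI": [0.2, 0.4, 0.6, 0.8],
})
toy["Month"] = toy["Date"].dt.month
toy["Year"] = toy["Date"].dt.year

# Stage 1: mean per (Year, Month, PageName): 0.3 for June 2020, 0.7 for June 2021
month_ave = toy.groupby(["Year", "Month", "PageName"], as_index=False)["NDVI"].mean()

# Stage 2: historical mean per (Month, PageName): (0.3 + 0.7) / 2 = 0.5
clim = (month_ave.groupby(["Month", "PageName"], as_index=False)["NDVI"].mean()
        .rename(columns={"NDVI": "NDVI_M"}))

# Anomaly = observed value minus the historical mean for that month and cell
out = toy.merge(clim, on=["PageName", "Month"])
out["NDVI_A"] = out["NDVI"] - out["NDVI_M"]
print(out[["Date", "NDVI", "NDVI_M", "NDVI_A"]])
```

Because the historical mean is a mean of yearly means, years with many observations do not outweigh years with few.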

3. Function: averaged_rank_corr(data_t, verifier, subject)

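This function computes the Spearman rank correlation between the month-by-year means of two columns. It averages the columns named by verifier and subject exactly as passed, so it correlates anomalies only when the caller passes the anomaly column names produced by anomly_cal (for example, verifier + '_A'). Here's what it does:

  • Aggregates values: Groups the data by Month and Year, pooling all grid cells, and takes the mean of each column. The results are stored under the fixed names verifier_A and subject_A.
  • Computes correlation: Uses the spearmanr function from SciPy to compute the Spearman rank correlation between the two averaged series, and returns the coefficient together with its p-value.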
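
The aggregation and correlation steps can be sketched as follows. The data and the column names NDVI_A and SM_A are invented for illustration; they stand for the anomaly columns that would be passed in as the verifier and subject arguments.

```python
import pandas as pd
from scipy.stats import spearmanr

# Toy anomalies: two observations per June, four years
anoms = pd.DataFrame({
    "Year": [2019, 2019, 2020, 2020, 2021, 2021, 2022, 2022],
    "Month": [6] * 8,
    "NDVI_A": [-0.3, -0.1, -0.1, 0.1, 0.0, 0.2, 0.3, 0.5],
    "SM_A": [-4, -2, 0, 2, -1, 1, 3, 5],
})

# Mean anomaly per (Month, Year), using the same named aggregation as the script
d = anoms.groupby(["Month", "Year"]).agg(
    verifier_A=("NDVI_A", "mean"), subject_A=("SM_A", "mean")).reset_index()

# Spearman correlation between the two averaged series
rho, p_value = spearmanr(d["verifier_A"], d["subject_A"])
print(d)
print(rho, p_value)
```

With only four month-year points the p-value carries little weight; the example is meant to show the mechanics, not a meaningful significance test.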

Summary of Workflow:

  • Use rank_corr_grid for ranking and finding correlation over spatial grids in summer months.
  • Use anomly_cal to generate anomaly data by comparing values to historical averages.
  • Use averaged_rank_corr to compute the Spearman correlation between month-by-year averages of two columns; pass the anomaly column names from anomly_cal to correlate anomalies.

Each function serves a specific purpose in spatial-temporal analysis and provides intermediate or final correlation metrics.
