This code defines a function `show_correlation(df)` that performs the following...

May 18, 2025 at 05:33 AM

def show_correlation(df):
    print("\nCalculating correlation matrix...")
    corr_matrix = df.corr(numeric_only=True)

    print("\nCorrelation Matrix:")
    print("=" * 90)
    print(corr_matrix.round(2).to_string())
    print("=" * 90)

    print("\nGenerating correlation heatmap...")
    plt.figure(figsize=(10, 8))
    sns.heatmap(corr_matrix, annot=True, fmt=".2f", cmap='coolwarm',
                center=0, vmin=-1, vmax=1, linewidths=0.5)
    plt.title("Correlation Heatmap")
    plt.tight_layout()
    plt.show()

show_correlation(df)

This code defines a function show_correlation(df) that performs the following actions:

  1. Input: It accepts a pandas DataFrame df as input.

  2. Calculates Correlation Matrix:

    • It computes the pairwise correlation matrix (Pearson by default) with df.corr(numeric_only=True), so non-numeric columns are excluded.

  3. Prints Correlation Matrix:

    • It prints the correlation matrix to the console with values rounded to 2 decimal places. The matrix is framed above and below by a line of 90 equals (=) signs to improve readability.

  4. Generates a Heatmap:

    • It creates a 10x8 inch figure with plt.figure(figsize=(10, 8)) and draws the correlation matrix on it using seaborn (sns.heatmap).
    • The options include:
      • annot=True: Annotates the heatmap cells with correlation values.
      • fmt=".2f": Displays the annotated values with 2 decimal places.
      • cmap='coolwarm': Uses a color scale ranging from blue (negative correlations) to red (positive correlations).
      • center=0: Centers the colormap at 0 (no correlation).
      • vmin=-1 and vmax=1: Fix the color scale to the full correlation range of -1 to 1, so colors are comparable across datasets.
      • linewidths=0.5: Adds thin lines between cells for better visual separation.
    • The title is set to "Correlation Heatmap" and the layout is adjusted using plt.tight_layout().

  5. Displays the Heatmap:

    • The heatmap is displayed using plt.show().
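To make steps 2 and 3 concrete, here is a minimal sketch that runs only the calculation and printing part on a small DataFrame. The sample data and its column names are hypothetical and are not part of the original code; they are chosen so that one column is non-numeric and the two numeric columns are perfectly linearly related.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
sample = pd.DataFrame({
    "height_cm": [150, 160, 170, 180],
    "weight_kg": [50, 58, 66, 74],   # exactly linear in height_cm
    "name": ["a", "b", "c", "d"],    # non-numeric, skipped by numeric_only=True
})

# Step 2: only the numeric columns take part in the correlation.
corr_matrix = sample.corr(numeric_only=True)

# Step 3: same console formatting as show_correlation.
print("=" * 90)
print(corr_matrix.round(2).to_string())
print("=" * 90)
```

The "name" column should not appear in the printed matrix, and because weight_kg is an exact linear function of height_cm here, their correlation should come out as 1.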

Intended Purpose:
This function helps in analyzing and visualizing the correlation between numeric columns of a DataFrame by:

  • Printing the raw correlation matrix for numeric inspection.
  • Providing a heatmap for intuitive visual analysis of correlations.

Usage:
You can call show_correlation(df) with any pandas DataFrame df containing numeric columns to analyze and visualize correlations.

Note: The libraries matplotlib.pyplot as plt and seaborn as sns need to be imported for this code to work.
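Putting the usage and the import note together, a self-contained script might look like the sketch below. The DataFrame contents are hypothetical, and the matplotlib.use("Agg") line is an assumption made so the script also runs on a machine without a display; remove it to get an interactive window from plt.show().

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for an interactive window

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def show_correlation(df):
    print("\nCalculating correlation matrix...")
    corr_matrix = df.corr(numeric_only=True)

    print("\nCorrelation Matrix:")
    print("=" * 90)
    print(corr_matrix.round(2).to_string())
    print("=" * 90)

    print("\nGenerating correlation heatmap...")
    plt.figure(figsize=(10, 8))
    sns.heatmap(corr_matrix, annot=True, fmt=".2f", cmap='coolwarm',
                center=0, vmin=-1, vmax=1, linewidths=0.5)
    plt.title("Correlation Heatmap")
    plt.tight_layout()
    plt.show()


# Hypothetical data; any DataFrame with at least two numeric columns works.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5],
    "exam_score": [52, 60, 63, 71, 80],
    "hours_gaming": [6, 5, 5, 3, 1],
    "student": ["ann", "bob", "cy", "dee", "eli"],  # ignored by numeric_only=True
})

show_correlation(df)
```

With the Agg backend nothing is shown on screen; calling plt.savefig with a file name before plt.show() inside the function is one way to keep the image in that case.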
