
August 29, 2025 at 08:08 PM

#!/usr/bin/python
# Need to install requests package for python
# easy_install requests
import requests
import psycopg2
import psycopg2.extras
from pprint import pprint
from datetime import datetime, date, time, timedelta

SN_RECORDS_LIMIT = 200
DB_NAME = 'servs_edw'
DB_USER_NAME = 'ed_dmin'
DB_HOST_NAME = '1.0.1'
DB_HOST_NAME = 'su2'
DB_PASSWD = 'N5'
DB_PORT = 5432

# Init database connection
conn = psycopg2.connect(dbname=DB_NAME, user=DB_USER_NAME, host=DB_HOST_NAME,
                        password=DB_PASSWD, port=DB_PORT)
cursor = conn.cursor(cursor_factory=psycopg2.extras.DictCursor)
cursor = conn.cursor()


def getRecords(first, last):
    # Set the request parameters
    url = 'https://inteleraddev.service-now.com/api/int25/ims_finance_rion?first=' + str(first) + '&last=' + str(last)
    print url
    # Eg. User name="admin", Password="admin" for this code sample.
    user = 'finanpi_ro'
    pwd = 'gs$w'
    # Set proper headers
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    # Do the HTTP request
    response = requests.get(url, auth=(user, pwd), headers=headers)
    # Check for HTTP codes other than 200
    if response.status_code != 200:
        print('Status:', response.status_code, 'Headers:', response.headers,
              'Error Response:', response.json())
        exit()
    # Decode the JSON response into a dictionary and use the data
    data = response.json()
    resultsReturned = 0
    for d in data['result']['response']:
        resultsReturned += 1
        user_name = d['name']
        user_email = d['email']
        user_manager = d['manager']
        if user_manager == '':
            user_manager = None
        user_active = d['active']
        cursor.execute("insert into report_change_request_state_duration \
                        (user_name, user_email, user_manager, user_active) \
                        values (%s, %s, %s, %s)",
                       (user_name, user_email, user_manager, user_active))
        conn.commit()
    return resultsReturned


cursor.execute('TRUNCATE report_change_request_state_duration')
conn.commit()

nbRecordsFetch = 5000
firstRecord = 1
continueFetching = True
while continueFetching:
    print "First: " + str(firstRecord) + ", nbRecordsFetch: " + str(nbRecordsFetch)
    nbResults = getRecords(firstRecord, nbRecordsFetch)
    print "Nb results: " + str(nbResults)
    if nbResults != 4999:
        continueFetching = False
    firstRecord += 5000
    nbRecordsFetch += 5000
    print " --> next round: " + str(firstRecord)

cursor.close()
conn.close()

This Python script is designed to:

  1. Fetch data from a REST API endpoint.
  2. Parse and process the retrieved JSON response.
  3. Insert the extracted data into a PostgreSQL database table.
  4. Repeat this process until all records are retrieved from the API.

Detailed Explanation:

Key Components of the Code:

  1. Database connection setup:

    • Connects to a PostgreSQL database using credentials specified in the variables (DB_NAME, DB_USER_NAME, DB_PASSWD, etc.).
    • Two database cursors are created, but one of them (cursor = conn.cursor()) overwrites the first one, making the first cursor redundant.
  2. getRecords function:

    • Sends an HTTP GET request to a ServiceNow REST API endpoint (https://inteleraddev.service-now.com/api/...) to fetch user data for a specific record range (first and last).
    • Authenticates the API request using basic authentication (user and pwd).
    • Decodes the JSON response and extracts specific fields like name, email, manager, and active.
    • Inserts the extracted data into the PostgreSQL table report_change_request_state_duration.
  3. Main Logic:

    • Clears (truncates) the target table report_change_request_state_duration before inserting new data.
    • Iteratively fetches records 5,000 at a time from the REST API.
    • The getRecords function is called in a loop until an API response returns a number of records other than 4,999 (one less than the 5,000-record window), which the script treats as the end of the available data.
    • The loop increments both firstRecord and nbRecordsFetch to fetch the next batch of records on subsequent API calls.
  4. API Error Handling:

    • If the API request fails (status code other than 200), it prints the error details and exits the script.
  5. Commit changes and close connections:

    • After inserting data into the database in each batch, the script commits the transaction.
    • Closes the cursor and database connection at the end.
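
The pagination in the main loop can be made less brittle by stopping when a page comes back shorter than the requested window, instead of comparing against the magic number 4,999. A minimal Python 3 sketch of that idea (fetch_page and fake_page are hypothetical stand-ins for the ServiceNow call, not part of the original script):

```python
def fetch_all(fetch_page, page_size=5000):
    """Pull records page by page until a short page signals the end.

    fetch_page(first, last) stands in for the HTTP call and must
    return the list of records in that range.
    """
    first = 1
    records = []
    while True:
        batch = fetch_page(first, first + page_size - 1)
        records.extend(batch)
        if len(batch) < page_size:  # short page: no more data
            break
        first += page_size
    return records

# Fake backend with 12 records and a page size of 5, for demonstration.
data = list(range(1, 13))

def fake_page(first, last):
    return data[first - 1:last]

result = fetch_all(fake_page, page_size=5)  # three calls: 5, 5, then 2 records
```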

Key Functionality:

  1. Fetches data from an external API:

    • The script interacts with the REST API using the requests library to fetch paginated data.
  2. Processes the data:

    • Iterates through the records in the JSON payload, extracting key information about each user and checking for blank values (e.g., setting user_manager to None when it is an empty string).
  3. Stores the data in a PostgreSQL database:

    • Inserts records into a table report_change_request_state_duration using SQL INSERT statements.
  4. Handles large datasets:

    • Implements pagination to handle large datasets by fetching data in chunks of 5,000 records.
  5. Error management:

    • Provides rudimentary error handling for API errors, but lacks robust exception handling for database operations.
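
The field extraction and blank-value handling described above can be isolated in a small helper. A sketch in Python 3 (normalize_user is a hypothetical name; the original script performs this inline inside getRecords):

```python
def normalize_user(d):
    """Extract the fields the script stores, mapping a blank manager to None."""
    manager = d.get("manager") or None  # '' (or a missing key) becomes None
    return (d.get("name"), d.get("email"), manager, d.get("active"))

row = normalize_user({"name": "jdoe", "email": "jdoe@example.com",
                      "manager": "", "active": "true"})
```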

Potential Issues:

  1. Security:

    • The script hardcodes credentials (DB_PASSWD, user, pwd), which is a security risk.
    • Sensitive details like database login info and API credentials should not be written in plain text; secure mechanisms such as environment variables or vaults should be used.
  2. Code Redundancy:

    • The second initialization of cursor overwrites the first, making the previous cursor assignment redundant.
  3. Performance:

    • A conn.commit() is issued after every single INSERT inside the loop. Batching the inserts (e.g., with executemany) and committing once per page would reduce round trips and improve performance.
  4. Error Handling:

    • The script lacks exception handling for database queries. If an INSERT statement fails, the script may terminate without cleaning up the database connection or rolling back the transaction.
  5. Pagination Logic:

    • Assumes 4,999 results per API response to determine when to stop fetching. If the response size varies or the endpoint's behavior changes, this assumption may fail.
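
Several of these issues can be addressed together: read credentials from the environment (os.environ) instead of hardcoding them, insert each page with one batched statement, and roll back on failure. The sketch below uses the standard-library sqlite3 module as a portable stand-in for psycopg2; with psycopg2 the same pattern would use executemany or psycopg2.extras.execute_values with %s placeholders:

```python
import sqlite3

def load_batch(conn, rows):
    """Insert one page of rows in a single transaction; roll back on failure."""
    try:
        conn.executemany(
            "INSERT INTO report_change_request_state_duration "
            "(user_name, user_email, user_manager, user_active) "
            "VALUES (?, ?, ?, ?)",
            rows,
        )
        conn.commit()  # one commit per page, not per record
    except sqlite3.Error:
        conn.rollback()
        raise

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE report_change_request_state_duration "
    "(user_name TEXT, user_email TEXT, user_manager TEXT, user_active TEXT)"
)
load_batch(conn, [("a", "a@x.com", None, "true"),
                  ("b", "b@x.com", "a", "true")])
count = conn.execute(
    "SELECT count(*) FROM report_change_request_state_duration"
).fetchone()[0]
```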

Summary:

The script is designed for data ingestion from a REST API endpoint into a PostgreSQL database in chunks. It processes paginated API responses, extracts and transforms user data, and inserts the results into a database table. However, improvements can be made in terms of security, robustness, and performance optimization.
