We always hear about Kenya’s rising external debt in headlines, but how fast is it growing? And what are the trends year over year?
As a data engineer, I wanted to answer these questions with data, not just opinions. So I built a pipeline that connects the dots: pulling debt data from the World Bank API, transforming it with pandas, storing it in a PostgreSQL database, and visualizing the story through Grafana.
You can get the project code on [GitHub](https://github.com/dkkinyua/KenyanExternalDebtAnalysis).
Project requirements
- Python for scripting and data processing
- Pandas for data transformation
- PostgreSQL for data storage
- Grafana for interactive dashboards
- World Bank API as the data source
Project Breakdown
1. Extracting Data from the World Bank API
I used the World Bank API to fetch Kenya’s external debt stock (indicator DT.DOD.DECT.CD) from 2010 to 2023, using the requests library.
```python
import requests

def extract():
    url = "https://api.worldbank.org/v2/country/KE/indicator/DT.DOD.DECT.CD"
    params = {
        'date': '2010:2024',
        'format': 'json'
    }
    response = requests.get(url, params=params)
    if response.status_code == 200:
        rel_data = []
        data = response.json()
        # data[0] is pagination metadata; data[1] holds the actual records
        for item in data[1]:
            rel_data.append({
                'countryName': item['country']['value'],  # 'value' is the country name, 'id' is the code ('KE')
                'year': item['date'],
                'debtValue': item['value'],
                'indicator': item['indicator']['value']
            })
        return rel_data
    else:
        print(f"Error fetching data: {response.status_code}, {response.text}")
```
2. Transforming Data with Pandas
I cleaned and processed the data by dropping NaN/missing values:
```python
import pandas as pd

def transform_load(data):
    df = pd.DataFrame(data)
    df.dropna(inplace=True)  # drop years with missing debt values
```
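The API returns the year as a string, so it also helps to cast types and sort chronologically before loading. The lines below are a minimal sketch of that extra cleaning (the column names come from extract() above; the exact steps in the repo may differ):

```python
# Optional extra cleaning inside transform_load (a sketch, not verbatim repo code)
df['year'] = df['year'].astype(int)               # API returns years as strings
df['debtValue'] = pd.to_numeric(df['debtValue'])  # ensure numeric debt values
df = df.sort_values('year').reset_index(drop=True)
```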
3. Loading into a PostgreSQL database
Load the data into a PostgreSQL database for easier integration with Grafana dashboards:
```python
from sqlalchemy import create_engine

def transform_load(data):
    # ... transformation steps from above ...
    try:
        # url is the PostgreSQL connection string, e.g. postgresql://user:password@host:5432/dbname
        engine = create_engine(url)
        df.to_sql('external_data', con=engine, index=False, if_exists='replace')
        print('Data loaded successfully!')
    except Exception as e:
        print(f"Error loading into db: {e}")
```
4. Building Grafana dashboards
A. Creating and configuring Grafana.
- Go to Grafana Cloud and create a new account.
- Head over to Dashboards and click on Data Sources
- On your right-hand side, click on Add Data Source and connect your PostgreSQL database
- Click on Create a dashboard and add panels backed by SQL queries against the external_data table (a quick way to sanity-check such a query is sketched below)
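Each Grafana panel is backed by a SQL query against the external_data table created in the load step. Before wiring it into a panel, I find it useful to run the query from Python first; this is a quick sketch with a placeholder connection string:

```python
import pandas as pd
from sqlalchemy import create_engine

# The same SQL can be pasted into a Grafana PostgreSQL panel.
query = 'SELECT "year", "debtValue" FROM external_data ORDER BY "year";'

engine = create_engine("postgresql://user:password@host:5432/dbname")  # placeholder connection string
print(pd.read_sql(query, engine).head())
```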
B. Dashboards
Insights Derived
- External debt grew steadily between 2010 and 2023, a 383% increase over 13 years (the sketch after this list shows how to reproduce the figure), pointing to an overreliance on external borrowing to finance development.
- From the bar chart in the dashboard, 2014-2015 and 2017-2018 stand out as periods of heavy borrowing. The 2017 spike was the largest of all and coincided with the 2017 election period.
- Debt growth slowed between 2021 and 2023, which suggests either efforts to rein in external borrowing or pressure from debt servicing.
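For reference, the headline growth figure can be reproduced from the cleaned data. A rough sketch, assuming df is the frame built in transform_load:

```python
# Reproduce the 2010-2023 growth figure from the dataframe (sketch)
years = df['year'].astype(int)
start = df.loc[years == 2010, 'debtValue'].iloc[0]
end = df.loc[years == 2023, 'debtValue'].iloc[0]
print(f"Growth 2010-2023: {(end - start) / start * 100:.0f}%")
```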
Technical or logical challenges encountered
Some of the technical or logical challenges encountered include:
- Inconsistent time formats
- Dealing with missing/NaN values
- The API returns nested JSON (or XML), which is not always straightforward for pandas ingestion (pandas.json_normalize helps here; see the sketch after this list)
- The data covers external debt only and does not include other indicators that could reveal broader debt patterns
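On the nested-JSON point, pandas.json_normalize is one way to flatten the response without the manual loop used in extract(). A small sketch, assuming the same response shape as above:

```python
import pandas as pd

# data = response.json(); data[1] holds the records, data[0] is pagination metadata
df = pd.json_normalize(data[1])

# Nested fields become dotted columns, e.g. 'country.value' and 'indicator.value'
df = df.rename(columns={
    'country.value': 'countryName',
    'date': 'year',
    'value': 'debtValue',
    'indicator.value': 'indicator'
})[['countryName', 'year', 'debtValue', 'indicator']]
```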
Conclusion
This project gives hands-on experience with end-to-end ETL pipeline design: extracting data from the World Bank API with requests, transforming it with pandas, loading it into PostgreSQL, and visualizing it with Grafana. It's a solid foundation for building more robust data pipelines.
For more blogs like this, please like, comment and follow me to stay updated in the data engineering world!