Skip to content

TaiNguyen42/workforce-operations-sql

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Workforce Operations Analytics — SQL Portfolio Project

Tools: SQL (PostgreSQL) · Data Analysis · KPI Reporting
Domain: HR Operations · Workforce Management · Data Quality
Author: Tai Nguyen | GitHub


Overview

This project simulates a real-world workforce operations dataset and answers 10 business questions a Data Analyst would face in an operations or HR analytics role. It covers employee attendance, shift scheduling, task productivity, and performance tracking — mirroring the KPI workflows I support in my current role.

The dataset is synthetic but modeled after realistic operational patterns: scheduling gaps, late arrivals, quality variance across departments, and productivity trends.


Business Questions Answered

# Question SQL Skills Used
01 How many active employees does each department have, and what is the average hourly rate? JOIN, GROUP BY, aggregation
02 Which employees have the highest and lowest attendance rates? LEFT JOIN, CASE, calculated fields
03 How often is each employee late, and by how many minutes on average? TIMESTAMP math, filtering
04 Who are the top performers based on task completion and quality score? Multi-aggregate, ORDER BY
05 How many hours did each employee work, and who is under target? TIMESTAMP math, CASE
06 What is the cumulative task output day-over-day? Window: SUM() OVER (ORDER BY)
07 Who is the top performer in each department? Window: RANK() OVER (PARTITION BY)
08 Which employees score below the company-wide average quality score? CTE, CROSS JOIN, variance
09 Which scheduled shifts had no attendance record (coverage gaps)? Anti-join: LEFT JOIN + IS NULL
10 Full operational dashboard: attendance + quality + hours + performance flag Multi-CTE chain, COALESCE, CASE

Database Schema

departments        employees           shifts
-----------        ---------           ------
department_id  --> department_id       shift_id
department_name    employee_id    -->  employee_id
location           first_name         shift_date
manager_id         last_name          scheduled_start
                   job_title          scheduled_end
                   hire_date          shift_type
                   hourly_rate
                   employment_type    attendance              tasks
                   is_active          ----------              -----
                                      attendance_id           task_id
                                      shift_id <--            employee_id
                                      employee_id             task_date
                                      clock_in                task_type
                                      clock_out               tasks_assigned
                                      status                  tasks_completed
                                                              quality_score

Key Findings

  • 1 coverage gap identified — Employee 109 (Taylor Johnson, Research) had a scheduled shift with no attendance record on Jan 11
  • Top performer: Jordan Kim (Data Operations) — 99% quality score, 100% attendance rate
  • Highest late-arrival rate: Multiple employees in QA averaged 15–32 minutes late across tracked shifts
  • Below-average quality flag: Employees scoring below the company average (93.7%) identified and ranked by variance
  • Department with highest avg hourly rate: Engineering ($46.67/hr avg)

How to Run

  1. Clone this repo
  2. Open any PostgreSQL-compatible editor (pgAdmin, DBeaver, TablePlus, or DB Fiddle online)
  3. Run data/seed_data.sql first to create and populate all tables
  4. Run any query from queries/analysis_queries.sql

No setup required for quick testing: paste the seed data + any single query into db-fiddle.com (select PostgreSQL 15) and run instantly.


Skills Demonstrated

  • JOIN types: INNER, LEFT, CROSS
  • Aggregation: COUNT, SUM, AVG, MIN, MAX, ROUND
  • Window functions: RANK(), SUM() with OVER, PARTITION BY, ORDER BY
  • CTEs: single and chained (WITH statements)
  • CASE WHEN logic for business rules and flags
  • TIMESTAMP arithmetic with EXTRACT(EPOCH FROM ...)
  • Anti-join pattern with LEFT JOIN + IS NULL
  • NULLIF and COALESCE for safe division and null handling
  • ROUND, DECIMAL precision handling

Project Structure

workforce-ops-sql/
├── data/
│   └── seed_data.sql          # Table definitions + sample records
├── queries/
│   └── analysis_queries.sql   # All 10 business questions + solutions
└── README.md

About This Project

This project was built to demonstrate SQL skills applicable to Data Analyst roles in operations, HR analytics, and business intelligence. The dataset design mirrors real patterns I've worked with in data operations — scheduling workflows, data quality tracking, and KPI reporting against acceptance rate thresholds.


Connect with me on LinkedIn or check out my other projects on GitHub.

About

SQL analysis of workforce KPIs — attendance, productivity, and performance. PostgreSQL, CTEs, window functions.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors