Education Access and Enrollment Patterns for Children with Disabilities (2022-2023)¶

Nina Flores & Steven Lora¶

Hypothesis:¶

Enrollment of children with disabilities differs significantly across states and disability categories, indicating varying levels of accessibility and support within special education programs.

Project Overview¶

This project will analyze the 2022-2023 IDEA Section 618 Child Count and Educational Environment dataset, which documents the number and characteristics of children with disabilities served under Part B of the Individuals with Disabilities Education Act (IDEA). The focus of this analysis is to explore enrollment patterns across states, disability categories, age groups, and demographic factors such as gender and race. By examining how different populations are represented in special education services, the project aims to identify states with the largest served populations and reveal areas where gaps in inclusion and representation persist.

Building on the project hypothesis, the objective is to identify and interpret patterns in enrollment equity and service accessibility. Through visual analytics, this project seeks to uncover meaningful differences in how students with disabilities are supported across the United States and to promote data-driven discussion about improving educational access and inclusion. The analysis emphasizes trends across age, disability type, and educational environment to provide a comprehensive view of service distribution.

To accomplish this, the project incorporates a suite of visualizations designed to provide both broad and detailed insights into the data:

• Choropleth maps will show geographic variation in the number and proportion of students served under IDEA Part B.

• Stacked bar charts will compare educational environments across disability types to highlight inclusion levels.

• Box plots will display the distribution of enrollment rates across states and identify statistical outliers.

• Bubble charts will visualize relationships between inclusion rates, total enrollment, and regional grouping, offering a multidimensional perspective.

• Heatmaps will depict demographic representation by race, gender, and disability category.

Mount Drive¶

In [ ]:
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive

Import Libraries¶

In [ ]:
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.io as pio
from IPython.display import display  # used to show intermediate tables

pio.renderers.default = 'colab'

Get Data¶

In [ ]:
datadir = '/content/drive/My Drive/Colab Notebooks/'

filepath = datadir + 'bchildcountandedenvironment2022-23.csv'

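# Raw read with no header row, useful for inspecting the preamble lines at the top of the file
df_raw = pd.read_csv(filepath, header=None)

# The column names sit on row index 4 (the fifth line), so the lines above it are skipped
df = pd.read_csv(filepath, header=4)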

Understanding the Data Frame¶

In [ ]:
df.head()
Out[ ]:
Year State Name SEA Education Environment SEA Disability Category Age 3 Age 4 Age 5 (Early Childhood) American Indian or Alaska Native - Early Childhood Asian - Early Childhood Black or African American - Early Childhood ... EL No - School Age Female - School Age Male - School Age American Indian or Alaska Native - School Age Asian - School Age Black or African American - School Age Hispanic/Latino - School Age Native Hawaiian or Other Pacific Islander - School Age Two or more races - School Age White - School Age
0 2022 Alabama Correctional Facilities All Disabilities - - - - - - ... 26 2 24 0 0 21 1 0 0 4
1 2022 Alabama Home All Disabilities 52 60 6 1 1 21 ... - - - - - - - - - -
2 2022 Alabama Homebound/Hospital All Disabilities - - - - - - ... 409 151 268 1 9 165 26 0 10 208
3 2022 Alabama Inside regular class 40% through 79% of the day All Disabilities - - - - - - ... 6032 2004 4332 33 79 2497 502 5 235 2985
4 2022 Alabama Inside regular class 80% or more of the day All Disabilities - - - - - - ... 74908 27634 51089 567 484 26555 6287 62 2530 42238

5 rows × 53 columns

In [ ]:
df.shape
Out[ ]:
(16231, 53)
In [ ]:
num_cols = [ "Age 3 to 5 (Early Childhood)", "Ages 6-21"]

for col in num_cols:
  df[col] = pd.to_numeric(df[col], errors="coerce")

df[num_cols].describe().round()
Out[ ]:
Age 3 to 5 (Early Childhood) Ages 6-21
count 8241.0 542.0
mean 517.0 50252.0
std 7769.0 361424.0
min 0.0 0.0
25% 0.0 119.0
50% 1.0 976.0
75% 17.0 15458.0
max 535392.0 6809208.0
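
The two `count` values above differ because `describe()` only counts non-missing cells. The `df.head()` output shows `-` in cells that do not apply to a row, and `errors="coerce"` presumably turns those placeholders into `NaN`. A minimal sketch of that behavior, using a made-up three-value series rather than the real file:

```python
import pandas as pd

# Hypothetical column values: two numeric strings and the "-" placeholder
raw = pd.Series(["52", "-", "409"])

# errors="coerce" converts anything unparseable to NaN instead of raising
clean = pd.to_numeric(raw, errors="coerce")

print(clean.isna().sum())  # placeholder cells that became NaN
print(clean.count())       # non-missing cells, which describe() reports as count
```

Checking `isna().sum()` per column before aggregating is a quick way to see how much of a column is placeholder rather than data.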

Visualizations¶

The following visualizations will be used to provide insights into the data.

In [ ]:
mask = (
    (df["SEA Disability Category"] == "All Disabilities") &
    (df["SEA Education Environment"] == "Total, School Age")
)

df_state_total = df[mask].copy()

df_state_total["Ages 6-21"] = pd.to_numeric(df_state_total["Ages 6-21"], errors="coerce")

state_counts = (
    df_state_total.groupby("State Name", as_index=False)["Ages 6-21"]
    .sum()
    .dropna()
)

state_counts = state_counts.sort_values("Ages 6-21", ascending=False)
In [ ]:
fig = px.bar(
    state_counts,
    x="State Name",
    y="Ages 6-21",
    title="Total School Age Children with Disabilities (Ages 6-21) by State - 2022-23",
    labels={"Ages 6-21": "Number of Children", "State Name": "State"}
)

fig.update_layout(xaxis_tickangle=-45)
fig.show()

Choropleth Map - Geographic Distribution of Students Served¶

The purpose of this map is to visualize the number of school-age students (ages 6-21) served under IDEA Part B across U.S. states and territories.

Key Findings:

  • States such as California, Texas, New York, Georgia, and Florida show the highest enrollment counts.

  • States such as Wyoming, Vermont, Montana, and Delaware, along with the U.S. territories, show the lowest enrollment counts.

  • The South and West tend to have higher counts, while the Northeast and Mountain states show more moderate levels.

This map highlights how population size, and possibly resource distribution, influence the number of students served. High-enrollment states may require more infrastructure, while low-enrollment regions may face challenges in access or staffing.
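
Because the map colors states by raw counts, large states dominate it. One way to read the same numbers as proportions is to divide each count by a total. The sketch below uses the five `Ages 6-21` totals printed in the bubble chart section of this notebook; the `share_of_total` helper is hypothetical, and the shares are relative to this five-entry subset only, not to the national total. A per-capita rate would need population figures that are not part of this dataset.

```python
def share_of_total(counts):
    """Return each entry's share of the summed counts (0.0 if the sum is 0)."""
    total = sum(counts.values())
    return {name: (n / total if total else 0.0) for name, n in counts.items()}

# Ages 6-21 totals as printed in this notebook's bubble chart output
counts = {
    "Alabama": 91686,
    "Alaska": 17466,
    "American Samoa": 431,
    "Arizona": 134762,
    "Arkansas": 67505,
}

shares = share_of_total(counts)
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {share:.1%}")
```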

In [ ]:
#Choropleth Map
state_abbrev = {
    "Alabama": "AL",
    "Alaska": "AK",
    "Arizona": "AZ",
    "Arkansas": "AR",
    "California": "CA",
    "Colorado": "CO",
    "Connecticut": "CT",
    "Delaware": "DE",
    "Florida": "FL",
    "Georgia": "GA",
    "Hawaii": "HI",
    "Idaho": "ID",
    "Illinois": "IL",
    "Indiana": "IN",
    "Iowa" : "IA",
    "Kansas": "KS",
    "Kentucky": "KY",
    "Louisiana": "LA",
    "Maine": "ME",
    "Maryland": "MD",
    "Massachusetts": "MA",
    "Michigan": "MI",
    "Minnesota": "MN",
    "Mississippi": "MS",
    "Missouri": "MO",
    "Montana": "MT",
    "Nebraska": "NE",
    "Nevada": "NV",
    "New Hampshire": "NH",
    "New Jersey": "NJ",
    "New Mexico": "NM",
    "New York": "NY",
    "North Carolina": "NC",
    "North Dakota": "ND",
    "Ohio": "OH",
    "Oklahoma": "OK",
    "Oregon": "OR",
    "Pennsylvania": "PA",
    "Rhode Island": "RI",
    "South Carolina": "SC",
    "South Dakota": "SD",
    "Tennessee": "TN",
    "Texas": "TX",
    "Utah": "UT",
    "Vermont": "VT",
    "Virginia": "VA",
    "Washington": "WA",
    "West Virginia": "WV",
    "Wisconsin": "WI",
    "Wyoming": "WY",
    "District of Columbia": "DC",
    "American Samoa": "AS",
    "Guam": "GU",
    "Northern Mariana Islands": "MP",
    "Puerto Rico": "PR",
    "United States Minor Outlying Islands": "UM",
    "U.S. Virgin Islands": "VI",
}

state_counts_usa = state_counts[state_counts["State Name"].isin(state_abbrev.keys())].copy()
state_counts_usa["state_code"] = state_counts_usa["State Name"].map(state_abbrev)

fig = px.choropleth(
    state_counts_usa,
    locations="state_code",
    locationmode="USA-states",
    color="Ages 6-21",
    scope="usa",
    title="Total School Age Children with Disabilities (Ages 6-21) by State - 2022-23",
    labels={"Ages 6-21": "Number of Children"}
)

fig.show()

Stacked Bar Chart - Educational Environment by Disability Category¶

This chart compares how students across different disability categories are served in various educational environments.

Key Findings:

  • Students with specific learning disabilities or speech or language impairments are most frequently served inside the regular class.

  • Low-incidence disabilities show small but consistent values.

  • Residential and separate school placements are minimal, consistent with national trends toward inclusive practices.

The stacked bars expose disparities in inclusion levels by disability category. Some student groups consistently receive more inclusive placements, while others experience more restrictive settings. This aligns with IDEA's emphasis on the Least Restrictive Environment (LRE) but also highlights areas where improvement is needed.

In [ ]:
# Stacked Bar Chart
def stacked_bar_by_state(state):
  # Labels must match the dataset exactly or isin() drops them silently.
  # The two "Seperate ..." entries never show up in the printed output,
  # so they probably need the dataset's own spelling ("Separate ...").
  env_keep = [
      "Inside regular class 80% or more of the day",
      "Inside regular class 40% through 79% of the day",
      "Inside regular class less than 40% of the day",
      "Seperate Class",
      "Seperate School, School Age",
      "Residential Facility, School Age"
  ]

  # One state, individual disability categories only, selected environments
  subset = df[
      (df["State Name"] == state) &
      (df["SEA Disability Category"] != "All Disabilities") &
      (df["SEA Education Environment"].isin(env_keep))
  ].copy()

  # Substring matching selects the single-year columns "Age 6" and "Age 12"
  # as well as the bands "Age 6-11", "Age 12-17", and "Age 18-21", so ages 6
  # and 12 are counted twice in the sum; the three bands alone cover 6-21.
  age_cols_6_21 = [c for c in subset.columns
                   if ("Age 6" in c) or ("Age 12" in c) or ("Age 18-21" in c)]
  print("Using these columns to compute Ages 6-21:", age_cols_6_21)

  for col in age_cols_6_21:
    subset[col] = pd.to_numeric(subset[col], errors="coerce")

  subset["Ages_6_21_calc"] = subset[age_cols_6_21].sum(axis=1)

  grouped = (
      subset
      .groupby(["SEA Disability Category", "SEA Education Environment"], as_index=False)["Ages_6_21_calc"]
      .sum()
  )

  display(grouped.head())

  fig = px.bar(
      grouped,
      x="SEA Disability Category",
      y="Ages_6_21_calc",
      color="SEA Education Environment",
      title=f"Educational Environments by Disability Category in {state} (Ages 6-21, 2022-23)",
      labels={
          "Ages_6_21_calc": "Number of Children",
          "SEA Disability Category": "Disability Category",
          "SEA Education Environment": "Educational Environment"
      }
  )

  fig.update_layout(xaxis_tickangle=-45)
  fig.show()
In [ ]:
stacked_bar_by_state("Florida")
Using these columns to compute Ages 6-21: ['Age 6', 'Age 12', 'Age 6-11', 'Age 12-17', 'Age 18-21']
SEA Disability Category SEA Education Environment Ages_6_21_calc
0 Autism Inside regular class 40% through 79% of the day 2255.0
1 Autism Inside regular class 80% or more of the day 11738.0
2 Autism Inside regular class less than 40% of the day 7764.0
3 Autism Residential Facility, School Age 5.0
4 Deaf-blindness Inside regular class 40% through 79% of the day 8.0
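
The column list printed above mixes single-year columns (`Age 6`, `Age 12`) with the banded columns that already contain them. The sketch below reproduces that substring match on the dataset's school-age column names and compares it with naming the three bands explicitly. The row of counts is hypothetical (10 children at every age) and is there only to make the double count visible.

```python
# School-age column names as listed in this notebook
columns = [f"Age {a}" for a in range(6, 22)] + ["Age 6-11", "Age 12-17", "Age 18-21"]

# Substring matching, as in stacked_bar_by_state: single-year columns slip in
substring_cols = [c for c in columns
                  if ("Age 6" in c) or ("Age 12" in c) or ("Age 18-21" in c)]

# Exact band names: every age from 6 to 21 is covered exactly once
band_cols = ["Age 6-11", "Age 12-17", "Age 18-21"]

# Hypothetical row: 10 children at each single age, bands are the matching sums
row = {f"Age {a}": 10 for a in range(6, 22)}
row.update({"Age 6-11": 60, "Age 12-17": 60, "Age 18-21": 40})

substring_total = sum(row[c] for c in substring_cols)  # ages 6 and 12 counted twice
band_total = sum(row[c] for c in band_cols)            # 16 ages x 10 children

print(substring_cols)
print(substring_total, band_total)
```

Listing the bands by name is less compact than a substring filter, but it cannot pick up a column by accident.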
In [ ]:
stacked_bar_by_state("New York")
Using these columns to compute Ages 6-21: ['Age 6', 'Age 12', 'Age 6-11', 'Age 12-17', 'Age 18-21']
SEA Disability Category SEA Education Environment Ages_6_21_calc
0 Autism Inside regular class 40% through 79% of the day 4920.0
1 Autism Inside regular class 80% or more of the day 7171.0
2 Autism Inside regular class less than 40% of the day 6782.0
3 Autism Residential Facility, School Age 443.0
4 Deaf-blindness Inside regular class 40% through 79% of the day 0.0
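
Raw counts make the two states hard to compare because their totals differ. Converting each environment to a share of its category total puts them on the same footing. The sketch below does this for the four Autism rows that appear in each output above; the `environment_shares` helper is hypothetical, and because only the first rows of each table are shown, the percentages are illustrative rather than complete.

```python
def environment_shares(counts):
    """Return each environment's share of the summed counts."""
    total = sum(counts.values())
    return {env: n / total for env, n in counts.items()}

# Autism counts copied from the Florida and New York outputs above
florida = {
    "80% or more": 11738,
    "40% through 79%": 2255,
    "less than 40%": 7764,
    "Residential Facility": 5,
}
new_york = {
    "80% or more": 7171,
    "40% through 79%": 4920,
    "less than 40%": 6782,
    "Residential Facility": 443,
}

fl_shares = environment_shares(florida)
ny_shares = environment_shares(new_york)

for env in florida:
    print(f"{env}: Florida {fl_shares[env]:.1%}, New York {ny_shares[env]:.1%}")
```

On these rows, Florida places a larger share of students with autism in the regular class for 80% or more of the day than New York does, which is the kind of state-level difference the hypothesis describes.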

Box Plot - Enrollment Distribution Across States¶

The purpose of this box plot is to show the spread, median, and variability of enrollment totals in special education.

Key Findings:

  • The median state enrollment is in the mid-80,000s, but the upper whisker stretches to around 725,000, showing strong right skew. A few large states pull the distribution upward significantly.

  • Small territories appear as extreme low outliers, while a handful of states act as high outliers.

  • The interquartile range (IQR) is fairly narrow compared to the overall spread, suggesting most states cluster in a mid-range enrollment zone.

The box plot emphasizes disparities in total enrollment, which often reflect differences in population rather than accessibility. Still, extreme outliers may justify further examination of support systems and demographic trends.
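
The outlier language above follows the usual box plot convention: points beyond 1.5 times the IQR from the quartiles are flagged. The sketch below applies that rule to the five state totals printed in the bubble chart section. The `tukey_fences` helper is hypothetical, and the quartile method chosen here (linear interpolation via `statistics.quantiles`) may not match the one Plotly uses internally, so the fences are an approximation of what the chart draws.

```python
import statistics

def tukey_fences(values, k=1.5):
    """Return (q1, q3, lower_fence, upper_fence) for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1, q3, q1 - k * iqr, q3 + k * iqr

# Ages 6-21 totals for five entries, as printed in this notebook
totals = [91686, 17466, 431, 134762, 67505]

q1, q3, low, high = tukey_fences(totals)
outliers = [v for v in totals if v < low or v > high]

print("Q1:", q1, "Q3:", q3)
print("fences:", low, high)
print("outliers:", outliers)
```

With only five values none are flagged; the same function run on the full state list would show which entries fall outside the fences.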

In [ ]:
#Box plot
fig = px.box(
    state_counts_usa,
    y="Ages 6-21",
    title= "Distribution of School Age Enrollment (Ages 6-21) Across States - 2022-23",
    points="all",
    labels={"Ages 6-21": "Number of Children"}
)

fig.update_yaxes(type="log")
fig.show()

Bubble Chart - Inclusion Rate vs. Total Enrollment by Region¶

This bubble chart illustrates the relationship between inclusion rate and total enrollment, with bubble size indicating the number of students served in the regular class at least 80% of the day and color representing each region.

Key Findings:

  • The South and West contain many of the highest population states. Northeast and Midwest states cluster toward smaller enrollment and mid-range inclusion rates.

  • Most states fall between 0.55 and 0.85 inclusion rates. Very few fall below 0.5, indicating a push toward inclusion.

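  • One extremely large bubble, at an inclusion rate near 0.67 and enrollment near 7M, dominates the chart. No single state is that large (the upper whisker in the box plot sits around 725,000), and the maximum Ages 6-21 value in the dataset is 6,809,208, so this bubble most likely represents a national aggregate row rather than California.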

  • Higher inclusion does not necessarily correlate with higher enrollment. Some states with low total enrollment outperform larger states in inclusion.

The bubble chart helps reveal whether heavily populated states are keeping pace with inclusive education practices. Larger states may face structural or capacity challenges, while smaller states may achieve higher inclusion due to more manageable caseloads. The visualization provides evidence that both supports and questions the hypothesis, depending on how disparities align with regional patterns, population size, or disability characteristics.
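
The inclusion rate on the x-axis is a simple ratio: students served in the regular class 80% or more of the day, divided by total school-age enrollment. The sketch below recomputes it for Alabama and Alaska using the counts shown in this notebook's bubble table; the `inclusion_rate` helper is hypothetical and returns `None` when the total is zero so that an empty entry does not raise a division error.

```python
def inclusion_rate(included, total):
    """Share of students in regular class 80%+ of the day; None if total is 0."""
    return included / total if total > 0 else None

# (included, total) pairs as printed in the bubble table
alabama = inclusion_rate(75905, 91686)
alaska = inclusion_rate(11698, 17466)

print(f"Alabama: {alabama:.6f}")
print(f"Alaska: {alaska:.6f}")
```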

In [ ]:
# Bubble Chart
total_mask = (
    (df["SEA Disability Category"] == "All Disabilities") &
    (df["SEA Education Environment"] == "Total, School Age")
)

# Denominator: total school-age enrollment per state
df_total = df[total_mask].copy()
df_total["Ages 6-21"] = pd.to_numeric(df_total["Ages 6-21"], errors="coerce")

print("df_total rows:", len(df_total))
print(df_total[["State Name", "Ages 6-21"]].head())

total_by_state = (
    df_total
    .groupby("State Name", as_index=False)["Ages 6-21"]
    .sum()
    .rename(columns={"Ages 6-21": "total_6_21"})
)

# Numerator: rows for the "Inside regular class 80% or more of the day" environment
incl_mask = df["SEA Education Environment"].fillna(" ").str.contains("80%")
df_incl = df[incl_mask].copy()
df_incl["Ages 6-21"] = pd.to_numeric(df_incl["Ages 6-21"], errors="coerce")

print("df_incl rows:", len(df_incl))
print(df_incl[["State Name", "SEA Education Environment", "Ages 6-21"]].head())

# In the sample printed by this cell, only the first 80% row per state carries
# an "Ages 6-21" value (the rest are NaN), so the sum equals that single row
incl_by_state = (
    df_incl
    .groupby("State Name", as_index=False)["Ages 6-21"]
    .sum()
    .rename(columns={"Ages 6-21": "incl_6_21"})
)

bubble = total_by_state.merge(incl_by_state, on="State Name", how="left")
bubble["incl_6_21"] = bubble["incl_6_21"].fillna(0)

# Guard against division by zero for entries with no enrollment
bubble["inclusion_rate"] = np.where(
    bubble["total_6_21"] > 0,
    bubble["incl_6_21"] / bubble["total_6_21"],
    np.nan
)

print("bubble rows:", len(bubble))
print(bubble.head())
df_total rows: 61
          State Name  Ages 6-21
18           Alabama    91686.0
284           Alaska    17466.0
550   American Samoa      431.0
816          Arizona   134762.0
1082        Arkansas    67505.0
df_incl rows: 854
   State Name                    SEA Education Environment  Ages 6-21
4     Alabama  Inside regular class 80% or more of the day    75905.0
23    Alabama  Inside regular class 80% or more of the day        NaN
42    Alabama  Inside regular class 80% or more of the day        NaN
61    Alabama  Inside regular class 80% or more of the day        NaN
80    Alabama  Inside regular class 80% or more of the day        NaN
bubble rows: 61
       State Name  total_6_21  incl_6_21  inclusion_rate
0         Alabama     91686.0    75905.0        0.827880
1          Alaska     17466.0    11698.0        0.669758
2  American Samoa       431.0      356.0        0.825986
3         Arizona    134762.0    92711.0        0.687961
4        Arkansas     67505.0    43714.0        0.647567
In [ ]:
region_map = {
    "Maine": "Northeast", "New Hampshire": "Northeast", "Vermont": "Northeast",
    "Massachusetts": "Northeast", "Rhode Island": "Northeast", "Connecticut": "Northeast",
    "New York": "Northeast", "New Jersey": "Northeast", "Pennsylvania": "Northeast",
    "Ohio": "Midwest", "Indiana": "Midwest", "Illinois": "Midwest", "Michigan": "Midwest",
    "Wisconsin": "Midwest", "Minnesota": "Midwest", "Iowa": "Midwest",
    "Missouri": "Midwest", "North Dakota": "Midwest", "South Dakota": "Midwest", "Nebraska": "Midwest", "Kansas": "Midwest",
    "Delaware": "South", "Maryland": "South", "District of Columbia": "South",
    "Virginia": "South", "West Virginia": "South", "North Carolina": "South", "South Carolina": "South", "Georgia": "South", "Florida": "South",
    "Kentucky": "South", "Tennessee": "South", "Alabama": "South", "Mississippi": "South", "Arkansas": "South", "Louisiana": "South", "Texas": "South",
    "Montana": "West", "Idaho": "West", "Wyoming": "West", "Colorado": "West", "New Mexico": "West", "Arizona": "West", "Utah": "West", "Nevada": "West",
    "Washington": "West", "Oregon": "West", "California": "West", "Alaska": "West", "Hawaii": "West"

}

bubble["region"] = bubble["State Name"].map(region_map).fillna("Other")

bubble_plot = bubble.copy()

fig = px.scatter(
    bubble_plot,
    x="inclusion_rate",
    y="total_6_21",
    size="incl_6_21",
    color="region",
    hover_name="State Name",
    title="Inclusion Rate VS Total Enrollment by Region (Ages 6-21, All Disabilities, 2022-23)",
    labels={
        "inclusion_rate": "Inclusion Rate (80%+ in Regular Class)",
        "total_6_21": "Total Enrollment (Ages 6-21)",
        "incl_6_21": "Included in Regular Class ≥ 80% of Day",
        "region": "Region"
    },
    size_max=40,
    opacity=0.8
)

fig.update_layout(
    height=600,
    width=900,
    template="plotly_white",
    legend_title_text="Region"
)

fig.show()

Heatmap - Demographic Representation (Age, Gender, Race, Disability Type)¶

This heatmap highlights demographic patterns using the available age-level counts from the 2022-23 school year.

Key Findings:

  • States such as California, Texas, New York, and Florida show the brightest values across nearly every age group, while small-population states and territories show significantly lower enrollment.

  • Most states show a steady increase from early childhood (Ages 3-5) into the school-age ranges, peaking around ages 10-14. Enrollment tends to narrow slightly in the later teen years (Ages 17-21).

  • The South and West tend to have higher numbers across more age groups. The Northeast and Mountain states have more moderate levels.

Overall, the heatmap illustrates the age-based demographic profile of students receiving special education services, revealing both concentrations and underrepresented age groups. These patterns may reflect differences in early identification, population distribution, or referral trends.
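
The heatmap colors cells by log10(count + 1) rather than by the raw count. The sketch below shows why, using the smallest and largest of the five state totals printed in the bubble chart section: the raw values differ by a factor of several hundred, while their log-scaled values differ by less than 3, so small entries stay visible next to large ones. Adding 1 keeps a count of zero defined, since log10(0) is not. The `log_scaled` helper is hypothetical.

```python
import math

def log_scaled(count):
    """Log-scale a non-negative count; the +1 maps a count of 0 to 0.0."""
    return math.log10(count + 1)

# 0 plus the smallest and largest Ages 6-21 totals printed in this notebook
for count in (0, 431, 134762):
    print(count, round(log_scaled(count), 2))
```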

In [ ]:
[col for col in df.columns if "Age" in col]
Out[ ]:
['Age 3',
 'Age 4',
 'Age 5 (Early Childhood)',
 'Age 3 to 5 (Early Childhood)',
 ' Age 5 (School Age)',
 'Age 6',
 'Age 7',
 'Age 8',
 'Age 9',
 'Age 10',
 'Age 11',
 'Age 12',
 'Age 13',
 'Age 14',
 'Age 15',
 'Age 16',
 'Age 17',
 'Age 18',
 'Age 19',
 'Age 20',
 'Age 21',
 'Age 5 (School Age)-11',
 'Age 6-11',
 'Age 12-17',
 'Age 18-21',
 'Age 5 (School Age)-21',
 'Ages 6-21',
 'EL Yes - School Age',
 'EL No - School Age',
 'Female - School Age',
 'Male - School Age',
 'American Indian or Alaska Native - School Age',
 'Asian - School Age',
 'Black or African American - School Age',
 'Hispanic/Latino - School Age',
 'Native Hawaiian or Other Pacific Islander - School Age',
 'Two or more races - School Age',
 'White - School Age']
In [ ]:
age_cols = [c for c in df.columns if c.startswith("Age ")]

age_cols_clean = [
    col for col in age_cols
    if "School Age" not in col
    and "-" not in col
    and "Race" not in col
    and "Male" not in col
    and "Female" not in col
]

age_cols_clean
Out[ ]:
['Age 3',
 'Age 4',
 'Age 5 (Early Childhood)',
 'Age 3 to 5 (Early Childhood)',
 'Age 6',
 'Age 7',
 'Age 8',
 'Age 9',
 'Age 10',
 'Age 11',
 'Age 12',
 'Age 13',
 'Age 14',
 'Age 15',
 'Age 16',
 'Age 17',
 'Age 18',
 'Age 19',
 'Age 20',
 'Age 21']
In [ ]:
# Heat Map
# Keep single-year age columns only; the " to " check drops the
# "Age 3 to 5 (Early Childhood)" subtotal so ages 3-5 are not shown twice
age_cols_clean = [
    col for col in age_cols
    if "School Age" not in col and "-" not in col and " to " not in col
]

for col in age_cols_clean:
  df[col] = pd.to_numeric(df[col], errors="coerce")

state_age_matrix = (
    df.groupby("State Name")[age_cols_clean]
    .sum()
    .sort_index()
)

rows_to_drop = [
    "U.s. Outlying Areas and Freely Associated States",
    "United States",
]

state_age_matrix = state_age_matrix.drop(
    [r for r in rows_to_drop if r in state_age_matrix.index],
    errors="ignore"
)


fig = px.imshow(
    np.log10(state_age_matrix +1),
    x=state_age_matrix.columns,
    y=state_age_matrix.index,
    labels=dict(
        x="Age Group",
        y="State",
        color="log10(Number of Children +1)"
    ),
    aspect="auto",
    title="Log Scaled Enrollment Heatmap by Age Group and State (2022-23)",
    color_continuous_scale="Viridis"
)

fig.update_layout(
    height=900,
    xaxis_tickangle=45,
    margin=dict(l=120, r=20, t=60, b=120))
fig.show()

Conclusion¶

The combined visual analyses provide strong evidence in support of the hypothesis. Across states and categories, the findings reveal:

  • Clear geographic disparities in enrollment

  • Disability-specific differences in educational placement

  • Significant statistical variability in state service levels

  • Notable regional patterns in inclusion

  • Demographic trends suggesting uneven representation across age groups

Together, these results confirm that access, representation, and inclusion for students with disabilities are not uniformly distributed across the U.S. The visuals underscore the need for targeted policy interventions, improved support systems, and ongoing monitoring of IDEA compliance to ensure equitable service delivery nationwide.

References¶

U.S. Department of Education – IDEA Section 618: Child Count and Educational Environments (Part B)

https://catalog.data.gov/dataset/idea-section-618-state-part-b-child-count-and-educational-environments-0ae56