Constructing a Fair Congressional Map: A Statistical Analysis
In this analysis, a ‘fair map’ is defined as one where expected seat outcomes align with national vote share under a probabilistic district model.
Gerrymandering has been a fact of American political life almost since the founding of the country. The term was coined in 1812, after Massachusetts Governor Elbridge Gerry approved a blatantly biased redrawing of the state’s senate districts, and it describes the practice of redrawing electoral boundaries to favor one political party or group. The tactic has been used in American politics ever since, perhaps numbing the country to its frankly disastrous effects on the health and legitimacy of our democracy. This article does not delve any deeper into the history of the subject or its numerous potential yet unlikely remedies; rather, it is a statistical analysis of maps I drew for the 2020-2030 election cycles.
Explore the constructed fair map (click to open interactive map)
Chapter 1: Partisan Fairness
To evaluate fairness, we use a model built on a core principle that more rudimentary fairness measures often ignore: each district is treated as a weighted coin flip, not a guaranteed outcome. A district that voted for a Democrat with 50.01% of the vote should be, and here is, treated differently from one that gave the Democratic candidate 70% of the vote, a key distinction most models miss. To formalize the model, we use a logistic function:
P(Dem win in district i) = 1 / (1 + e^{−k(Vᵢ − 0.5)})
Where:
Vᵢ = Democratic vote share in district i
k = sensitivity parameter (a larger k makes the win probability respond more sharply to vote share)
A “success” in the statistical sense is defined as a Democratic victory.
Writing S for the total number of Democratic seats and treating districts as independent, its expectation and variance are:
E(S) = Σ pᵢ
Var(S) = Σ pᵢ(1 − pᵢ)
Where:
pᵢ = P(Dem win in district i)
To express competitiveness more intuitively, we rescale the variance into the number of effective competitive seats:
C = 4 Σ pᵢ(1 − pᵢ)
While these formulas define the model formally, it is worth spelling out how each statistic captures a different aspect of how a district map behaves. Expected seats represent what the map tends to produce on average over many elections. Rather than assigning each district a fixed outcome, as many models do, this model allows for the nuance that comes from different levels of competitiveness in different districts. Variance measures how much uncertainty exists in the system. Districts that are close to evenly split contribute more to overall variance, while heavily partisan districts contribute very little. As a result, variance serves as a basic statistic reflecting overall competitiveness. The effective number of competitive seats rescales this uncertainty into a more intuitive quantity. Instead of thinking in terms of abstract variance, this measure approximates how many districts are meaningfully competitive in practice.
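As a rough illustration of these three statistics, here is a minimal Python sketch. The value k = 30 and the five district vote shares are hypothetical placeholders chosen only for illustration (the article does not state the k it uses), and the function names are my own.

```python
import math

def win_probability(vote_share, k=30.0):
    """Logistic win probability; k = 30 is an assumed, illustrative value."""
    return 1.0 / (1.0 + math.exp(-k * (vote_share - 0.5)))

def map_statistics(vote_shares, k=30.0):
    """Return E(S), Var(S) and C = 4 * Var(S) for a list of district vote shares."""
    probs = [win_probability(v, k) for v in vote_shares]
    expected_seats = sum(probs)
    variance = sum(p * (1.0 - p) for p in probs)
    return {
        "expected_seats": expected_seats,
        "variance": variance,
        "effective_competitive_seats": 4.0 * variance,
    }

# Hypothetical Democratic vote shares for a five-district map.
example_shares = [0.5001, 0.70, 0.45, 0.30, 0.55]
stats = map_statistics(example_shares)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

With these inputs the 50.01% district contributes almost exactly half a seat to the expectation, while the 70% district contributes nearly a full seat and almost nothing to the variance.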
To formalize the probabilistic framework, we model each district outcome as a random variable. The outcome in each district i is defined as:
Sᵢ ~ Bernoulli(pᵢ), with S = Σ Sᵢ
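Using this framework, we can isolate structural bias in the map by evaluating outcomes under a standardized condition. To do this, we conceptually re-center the national vote so that both parties receive exactly 50% of the total vote. We then compute the expected number of seats under this condition, convert it to a share of the 435 seats, and subtract one half:
B = E[S ∣ national vote = 50%] / 435 − 0.5
A positive B indicates a map tilted toward Democrats; a negative B indicates one tilted toward Republicans.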
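One way to sketch this bias calculation in Python is shown below. It rests on several assumptions of mine that the article does not spell out: the re-centering is done as a uniform swing applied to every district, the national vote is approximated by the unweighted mean of district vote shares (equal turnout), the bias is read as expected seat share at a 50% national vote minus one half, and k = 30 is an illustrative value. The example maps are hypothetical.

```python
import math

def win_probability(vote_share, k=30.0):
    """Logistic win probability; k = 30 is an assumed, illustrative value."""
    return 1.0 / (1.0 + math.exp(-k * (vote_share - 0.5)))

def partisan_bias(vote_shares, k=30.0):
    """Expected Democratic seat share at a 50% national vote, minus 0.5.

    Assumes a uniform swing and equal turnout in every district.
    """
    national_share = sum(vote_shares) / len(vote_shares)
    shift = 0.5 - national_share  # swing needed to re-center the vote
    probs = [win_probability(v + shift, k) for v in vote_shares]
    return sum(probs) / len(probs) - 0.5

# Hypothetical maps: one symmetric, one that packs Democrats into one district.
symmetric_map = [0.40, 0.50, 0.60]
packed_map = [0.80, 0.45, 0.45, 0.45, 0.45]

print(f"symmetric map bias: {partisan_bias(symmetric_map):+.3f}")
print(f"packed map bias:    {partisan_bias(packed_map):+.3f}")
```

The symmetric map comes out unbiased, while the packed map shows a negative value, meaning Democrats would be expected to win well under half the seats with half the vote.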
This model shows that the district map used has been very fair across the past three presidential elections. As is natural, it shows a slight bias in favor of the winner of each of those elections, but the expected and actual seat counts match almost exactly in each one.
Chapter 2: A deeper dive into overall competitiveness
Recall from the first chapter that by definition,
Var(S) = Σ pᵢ(1 − pᵢ)
and the number of effective competitive seats is
C = 4 Σ pᵢ(1 − pᵢ) = 4 Var(S)
The factor of 4 rescales variance so that a perfectly competitive district (50–50) counts as one full competitive seat, while a less competitive district counts as only a fraction of a seat, 4pᵢ(1 − pᵢ).
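To make the rescaling concrete, the short sketch below evaluates the per-district term 4p(1 − p) for a few win probabilities. The probabilities are arbitrary illustrative values and the function name is my own.

```python
def competitive_weight(p):
    """Contribution of one district to C = 4 * sum(p_i * (1 - p_i))."""
    return 4.0 * p * (1.0 - p)

# Arbitrary win probabilities, from a toss-up to a near-certain seat.
for p in (0.5, 0.75, 0.9, 0.99):
    weight = competitive_weight(p)
    print(f"p = {p:.2f} counts as {weight:.3f} of a competitive seat")
```

A toss-up counts as one full seat, a district with a 90% win probability as about 0.36 of a seat, and one with a 99% win probability as about 0.04.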
The number of effective competitive seats is quite high, representing about 40% of congressional districts, and it rises with each election analyzed. Increasing the number of competitive districts is generally regarded as a core principle of fair redistricting.
If you want a more rudimentary measure of competitiveness, we can count the seats that fall within certain margins of victory. Here we see the same trend: the number of highly competitive seats (margins below 5%) increased from 78 to 88 to 97, and the number of generally competitive seats (margins below 10%) increased from 164 to 175 to 189.
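The margin-based count can be sketched as follows. This assumes the margin of victory is the two-party margin, |2V − 1|, where V is the Democratic two-party vote share. The vote shares are hypothetical, not the article's data.

```python
def count_within_margin(vote_shares, max_margin):
    """Count districts whose two-party margin of victory is below max_margin."""
    return sum(1 for v in vote_shares if abs(2.0 * v - 1.0) < max_margin)

# Hypothetical Democratic two-party vote shares.
example_shares = [0.51, 0.53, 0.56, 0.70, 0.48, 0.44]

highly_competitive = count_within_margin(example_shares, 0.05)
generally_competitive = count_within_margin(example_shares, 0.10)
print(f"margins below 5%:  {highly_competitive}")
print(f"margins below 10%: {generally_competitive}")
```

Unlike the effective-seats measure, this count treats a district just inside the cutoff and a district just outside it very differently, which is why it is the more rudimentary of the two.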
Chapter 3: Demographics
As mandated by Section 2 of the Voting Rights Act of 1965, it is vital that racial minority groups receive adequate representation, which in practice means ensuring they make up the majority or plurality of the voting-age population (VAP) in a proportionately appropriate number of districts. In effect, this is the only law that limits gerrymandering, barring Republicans from gerrymandering the Deep South, though this may change with the recent redistricting wars and a conservative Supreme Court that may be skeptical of explicit racial voting protections. Regardless, in drawing my maps I paid special attention to ensuring adequate racial representation.
Given that the nation’s voting-age population (VAP) is 60% white, and that racial groups do not sort neatly along precinct boundaries, it is impossible in practice for seat representation to exactly match national VAP. As a result, redistricting is not a question of perfectly reproducing population shares, but of how representation is distributed: whether minority voters are concentrated into a smaller number of districts or spread across a broader set of districts where they may exert influence. These statistics show that my map successfully increased racial minority representation (by 4.6%), not perfectly matching but better reflecting our actual population makeup. As shown in chapters one and two, this was done without sacrificing overall fairness and competitiveness, making it a win-win situation.
This section is included less as a tool for evaluating map fairness and more as an exploration of the broader electoral landscape. The correlations shown here reflect underlying patterns in how demographic composition relates to voting behavior, offering a psephological perspective rather than a direct critique of the maps themselves.
The results align with well-established patterns in American politics. Districts with higher white population shares are associated with lower Democratic vote share (r ≈ −0.63), while Black population share shows a moderate positive relationship (r ≈ 0.42). Latino share exhibits a weaker positive correlation (r ≈ 0.23), reflecting greater heterogeneity in voting behavior across Latino communities. Educational attainment, as measured by the share of residents with a bachelor’s degree or higher, also shows a notable positive relationship with Democratic vote share (r ≈ 0.48).
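For readers who want to see how a correlation coefficient of this kind is computed, here is a minimal sketch of the Pearson r formula. The five districts are made-up toy values, not the article's data, so the resulting number has no bearing on the figures quoted here.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Toy data (hypothetical): white VAP share and Democratic vote share by district.
white_share = [0.90, 0.80, 0.60, 0.40, 0.30]
dem_share = [0.35, 0.42, 0.50, 0.61, 0.70]

r = pearson_r(white_share, dem_share)
print(f"r = {r:.2f}")
```

A value near −1 or +1 indicates a tight linear relationship, and a value near 0 indicates little linear relationship, which is the scale on which the coefficients in this section should be read.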
Model Notes
This model treats districts as independent Bernoulli trials with win probabilities derived from vote share using a logistic function. In reality, election outcomes are correlated and constrained by geography, so results should be interpreted as illustrating statistical tradeoffs—particularly between fairness and competitiveness—rather than as a fully realistic redistricting simulation.
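The effect of the independence assumption can be illustrated with a small simulation. In this sketch, a shared national swing is drawn in each simulated election and added to every district before the coin flips, which is one simple way (my assumption, not the article's method) to introduce correlation. The 21-district map, k = 30, the 3-point swing standard deviation and the trial count are all hypothetical.

```python
import math
import random

def win_probability(vote_share, k=30.0):
    """Logistic win probability; k = 30 is an assumed, illustrative value."""
    return 1.0 / (1.0 + math.exp(-k * (vote_share - 0.5)))

def independent_variance(vote_shares, k=30.0):
    """Var(S) under the independence assumption: sum of p_i * (1 - p_i)."""
    probs = [win_probability(v, k) for v in vote_shares]
    return sum(p * (1.0 - p) for p in probs)

def simulated_variance_with_swing(vote_shares, swing_sd=0.03, trials=5000,
                                  k=30.0, seed=0):
    """Sample variance of seat totals when all districts share a national swing."""
    rng = random.Random(seed)
    totals = []
    for _ in range(trials):
        swing = rng.gauss(0.0, swing_sd)  # same shift applied to every district
        seats = sum(1 for v in vote_shares
                    if rng.random() < win_probability(v + swing, k))
        totals.append(seats)
    mean = sum(totals) / trials
    return sum((t - mean) ** 2 for t in totals) / (trials - 1)

# Hypothetical map: 21 districts spread evenly from 40% to 60% Democratic.
example_shares = [0.40 + 0.01 * i for i in range(21)]

var_independent = independent_variance(example_shares)
var_correlated = simulated_variance_with_swing(example_shares)
print(f"independent model variance: {var_independent:.2f}")
print(f"with shared swing:          {var_correlated:.2f}")
```

Because a shared swing moves many close districts in the same direction at once, the seat total varies more than the independent model suggests, so the variance reported by the model is better read as a competitiveness index than as a forecast of election-to-election volatility.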
Chapter 4: An unfortunate conclusion
To be quite frank, there is no moral defense of gerrymandering. So to examine why it exists, and why there isn’t some common-sense ban on it, it may be helpful to approach it from a very simple game theory perspective. If you are a member of Congress (the person who would need to vote in favor of such a ban), there is a very high likelihood that your seat is gerrymandered (and, by extension, that your job security and political survival depend on the existence of this phenomenon). If so, in what world would you vote to ban gerrymandering? Ask yourself, and let’s be real: in what world would you voluntarily give up job security for a mere moral victory in return?
Given that we live in a 50/50 nation, gerrymandering tends to cancel itself out in aggregate. In 2022, Republicans won 51.3% of the two-party vote and 51.0% of seats; in 2024, Republicans won 50.5% of the vote and 50.6% of the seats. What a remarkably fair result! Unfortunately, as stated before, this is because the extreme gerrymandering on each side canceled out: Republican gerrymandering offset Democratic gerrymandering. The recent redistricting wars show this trend will only continue, to the detriment of the American voter.
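The seat percentages quoted here can be checked with a few lines of arithmetic. The seat counts used below (222 Republican seats after 2022 and 220 after 2024, out of 435) are not given in the text; they are the election-night House totals as I understand them. The vote shares are the two-party figures quoted in this chapter.

```python
HOUSE_SEATS = 435

def seat_share(seats, total=HOUSE_SEATS):
    """Fraction of House seats held."""
    return seats / total

# Republican seat totals (added for illustration) and the quoted vote shares.
republican_seats = {2022: 222, 2024: 220}
republican_vote_share = {2022: 0.513, 2024: 0.505}

for year in sorted(republican_seats):
    share = seat_share(republican_seats[year])
    gap = share - republican_vote_share[year]
    print(f"{year}: seat share {share:.1%}, vote share "
          f"{republican_vote_share[year]:.1%}, gap {gap:+.1%}")
```

In both years the gap between seat share and vote share is a fraction of a percentage point, which is the near-proportional aggregate outcome described here.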
So, let’s define some clear parameters: gerrymandering benefits Republican and Democratic politicians, neither benefits nor harms the Republican or Democratic parties at large, but massively mutilates the voting rights of both Republican and Democratic voters. A plausible solution would therefore have to shift these parameters so that neither Republican nor Democratic politicians, nor their parties at large, are helped or harmed, while Republican and Democratic voters are protected. I am going to throw a bit of a curveball idea:
Ban gerrymandering by mandating independent redistricting commissions and expanding the size of the House of Representatives from 435 members to 699 members
The banning gerrymandering part is obvious, but why expand the size of the House of Representatives? Simply, to ensure politicians can keep their jobs in Congress: not a noble goal, but a realistic one that must be satisfied for any solution to have a chance of passing. My point is best illustrated with an example; take New Mexico. Despite being a state that typically votes for Democrats by about 10 points, it has been so brutally gerrymandered that all three of its congressional districts vote Democratic. Under my proposal, with an expanded House, New Mexico would have five seats, three voting Democratic and two voting Republican, fairly reflecting the state’s partisan makeup. Now let’s examine our parties. Are Republican and Democratic voters better represented? Yes, they have fair maps now. Are Republican or Democratic politicians harmed? No, the three current Democrats still have three Democratic seats they can represent in the state. Are the Republican or Democratic parties harmed? In this state, yes: the Democratic seat share would go from 100% to 60%. But this would be balanced out by the ungerrymandering of Republican states (South Carolina, for example), so nationwide neither party would be helped nor harmed. From a game theory perspective, this seems to be the only plausible solution that is fair, protects the interests of voters, and, importantly for the purpose of passing rather than for moral reasons, protects the interests of current members of Congress.
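The New Mexico arithmetic can be sketched roughly as follows. This is a simplification of mine: it scales the state's current seat count in proportion to the House size and splits the seats in proportion to a 55% Democratic two-party share (a 10-point margin). Actual apportionment is computed from census populations using the method of equal proportions, so the real seat count under a 699-member House could differ.

```python
def scaled_seats(current_seats, old_house=435, new_house=699):
    """Rough proportional scaling of a state's seats to a larger House."""
    return round(current_seats * new_house / old_house)

def split_seats(total_seats, dem_two_party_share):
    """Split seats between the parties in proportion to the two-party vote."""
    dem_seats = round(total_seats * dem_two_party_share)
    return dem_seats, total_seats - dem_seats

# New Mexico: 3 seats today; a 10-point Democratic margin is a 55% share.
new_mexico_seats = scaled_seats(3)
dem_seats, rep_seats = split_seats(new_mexico_seats, 0.55)
print(f"seats: {new_mexico_seats}, Democratic: {dem_seats}, Republican: {rep_seats}")
```

Under these assumptions the state goes from three seats to five, split three to two, so the three sitting Democrats keep a seat each while Republican voters gain representation.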
Quite frankly, do I think this (or something like it) will happen in my lifetime (of around 60 remaining years)? Actually, yes. Do I think it will happen any time soon? A resigned no.