Automating and Optimizing Building Design with Deep Reinforcement Learning

Urban redevelopment requires smart, compliant, and economically viable building designs. In this project, we developed an end-to-end system that automates building design under legal constraints and user preferences, while maximizing financial value through reinforcement learning.
Mission Statement
- Urban areas increasingly require reconstruction.
- Accurate valuation of new buildings is crucial for land investment decisions.
- If we can automatically estimate and generate optimal designs, identifying undervalued land becomes much easier and more scalable.
Problem Definition
Input Data

- Parcel polygons
- Legal boundary polygons
- Road network (as LineStrings)
- Numerical constraints:
  - Road width
  - Parking requirements (as equations)
  - BCR (Building Coverage Ratio) & FAR (Floor Area Ratio)
- User preferences (e.g., unit mix, aesthetics)
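
To make the BCR and FAR constraints concrete, here is a minimal sketch of a compliance check. It relies on the standard definitions (BCR = footprint area / parcel area, FAR = total floor area / parcel area). The names `Limits`, `polygon_area`, and `check_ratios` are hypothetical, and the sketch assumes every floor shares the same footprint, which a real design need not do.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    max_bcr: float  # maximum Building Coverage Ratio, e.g. 0.6
    max_far: float  # maximum Floor Area Ratio, e.g. 2.0


def polygon_area(points):
    """Area of a simple polygon [(x, y), ...] via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def check_ratios(parcel, footprint, floors, limits):
    """Return (bcr, far, compliant) for a building of identical floors."""
    parcel_area = polygon_area(parcel)
    footprint_area = polygon_area(footprint)
    bcr = footprint_area / parcel_area
    far = footprint_area * floors / parcel_area  # assumption: same footprint on every floor
    return bcr, far, bcr <= limits.max_bcr and far <= limits.max_far


# Illustrative numbers only: a 20 m x 10 m parcel and a 10 m x 6 m footprint.
parcel = [(0, 0), (20, 0), (20, 10), (0, 10)]
footprint = [(2, 2), (12, 2), (12, 8), (2, 8)]
bcr, far, ok = check_ratios(parcel, footprint, floors=4, limits=Limits(0.6, 2.0))
```

Road width and parking rules would enter as further checks of the same shape; they are left out here because the source does not give their equations.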
Objective

Generate a building design that:
- Complies with legal codes
- Maximizes total building value (unit area × revenue per area)
- Ensures livable and practical unit design
- Is visually appealing
Problem Structure

The problem is decomposed into three stages:
- Parameterization: Represent the building design as a set of adjustable parameters.
- Estimation: Evaluate the design value based on quantitative and qualitative factors.
- Optimization: Search for the best parameter set to maximize the estimated value.
1. Parameterization

What We Did
- Defined core design components: massing, core, corridor, unit, parking
- Parameterized each component to make the design space computable
- Prevented invalid designs to reduce search space complexity

My Contribution
- While my colleague Tzung-Kuan Hsu built the initial proof of concept (PoC) in Rhino, I refined the logic and implemented the entire system as software.
Key Principles
- Compact parameter space: Easier to search and optimize.
- Intuitive mapping: Enables agent generalization and policy learning.
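
One way to get a compact parameter space in which invalid designs cannot occur is to keep every parameter in [0, 1] and decode it into a range that is already known to be legal. The sketch below illustrates that idea for massing only. `MassingParams`, `lerp`, `decode`, and the `min_side` default are hypothetical names and values, not the project's actual parameterization, and the sketch assumes the maximum width and depth are at least `min_side`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MassingParams:
    depth_ratio: float   # each field is a fraction of its allowed range
    width_ratio: float
    floors_ratio: float


def lerp(lo, hi, t):
    """Interpolate between lo and hi, clamping t to [0, 1]."""
    t = min(1.0, max(0.0, t))
    return lo + (hi - lo) * t


def decode(params, max_width, max_depth, max_floors, min_side=6.0):
    """Map normalized parameters to concrete dimensions inside the legal envelope."""
    width = lerp(min_side, max_width, params.width_ratio)
    depth = lerp(min_side, max_depth, params.depth_ratio)
    floors = 1 + round(lerp(0, max_floors - 1, params.floors_ratio))
    return width, depth, floors


# Any parameter vector, even an out-of-range one, decodes to a valid massing.
design = decode(
    MassingParams(depth_ratio=0.5, width_ratio=0.5, floors_ratio=1.0),
    max_width=20.0, max_depth=12.0, max_floors=5,
)
```

Because the decoder clamps its inputs, the optimizer never has to spend samples on designs that violate the envelope.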
2. Estimation

What We Did
- Created metrics to evaluate both quantitative (e.g., unit area) and qualitative (e.g., livability, beauty) factors.
- Designed a weighted objective function combining multiple value dimensions.
My Contribution
- Integrated and extended the metric set.
- Developed the final scoring function used for optimization.
Key Challenges
- Conflicting goals: e.g., floor area vs. livability
- Need to quantify abstract values: beauty, openness, utility
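
A weighted objective of this kind can be sketched as a weighted sum over metrics that have each been normalized to [0, 1]. The metric names, weights, and scores below are made-up illustrations, not the values used in the project; they only show how a design with less floor area can still score higher once livability is weighted in.

```python
def score(metrics, weights):
    """Weighted sum of normalized metrics; higher is better."""
    return sum(weights[name] * metrics[name] for name in weights)


# Hypothetical weights and metric values, all in [0, 1].
weights = {"floor_area": 0.5, "livability": 0.3, "aesthetics": 0.2}
design_a = {"floor_area": 0.9, "livability": 0.4, "aesthetics": 0.5}  # more area
design_b = {"floor_area": 0.7, "livability": 0.8, "aesthetics": 0.6}  # more livable

score_a = score(design_a, weights)
score_b = score(design_b, weights)
best = "a" if score_a > score_b else "b"
```

The choice of weights encodes the trade-off between conflicting goals, so in practice they would need tuning against real valuations.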
3. Optimization

What We Did
- Framed the design process as a combinatorial optimization problem.
- Applied Deep Reinforcement Learning (DRL) to explore and generalize over large search spaces.
My Contribution
- Designed the DRL agent architecture and training pipeline.
- Selected REINFORCE as the learning algorithm for its simplicity and direct return optimization.
- Implemented masking and pointer networks to handle dynamic action/state spaces.
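
The combination of REINFORCE and action masking can be illustrated on a toy one-step problem. This is a minimal sketch using only the standard library and a tabular softmax policy, not the project's network or environment. `masked_softmax`, `reinforce_step`, the constant baseline, and the reward table are all assumptions made for the example.

```python
import math
import random


def masked_softmax(logits, mask):
    """Softmax over valid actions only; masked actions get probability 0."""
    top = max(l for l, m in zip(logits, mask) if m)
    exps = [math.exp(l - top) if m else 0.0 for l, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, mask, reward_fn, rng, lr=0.1, baseline=0.0):
    """Sample one action and apply the REINFORCE update in place."""
    probs = masked_softmax(logits, mask)
    action = rng.choices(range(len(logits)), weights=probs)[0]
    advantage = reward_fn(action) - baseline
    # For a softmax policy: d log pi(a) / d logit_i = 1[i == a] - pi(i).
    # Masked logits receive no gradient.
    for i, valid in enumerate(mask):
        if valid:
            grad = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * advantage * grad
    return action


rng = random.Random(0)
logits = [0.0, 0.0, 0.0, 0.0]
mask = [True, True, False, True]  # action 2 is illegal in this state
rewards = [0.2, 1.0, 5.0, 0.5]    # the illegal action would pay the most
for _ in range(2000):
    reinforce_step(logits, mask, lambda a: rewards[a], rng, baseline=0.5)
final_probs = masked_softmax(logits, mask)
```

The mask guarantees the illegal action is never sampled, so the policy concentrates on the best legal action instead of the one with the highest nominal reward.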
Why DRL?
- Traditional search methods (e.g., greedy, exhaustive) break down because design decisions are highly interdependent: a choice made for one component constrains the others.
- DRL allows feature-based decision making, enabling generalization across parcels.
Technical Insights
- Used autoregressive policies to model joint action distributions.
- Incorporated pointer networks so that actions select elements by their features rather than by fixed indices.
- Inspired by AlphaStar's architecture for structured state/action encoding.
- Designed a modular RL agent architecture for adaptive environments.
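
The two ideas above, autoregressive action heads and pointer-style selection, can be sketched together. In this toy version the first head picks an action type and the second head points at one of a variable number of candidates by scoring each candidate's feature vector against a query chosen by the first head. The joint log-probability is the sum of the per-head log-probabilities. All names (`pointer_probs`, `sample_autoregressive`, `type_queries`) and all numbers are hypothetical, and the fixed query vectors stand in for what would be learned network outputs.

```python
import math
import random


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_probs(query, candidates):
    """Dot-product score per candidate; works for any number of candidates."""
    scores = [sum(q * c for q, c in zip(query, cand)) for cand in candidates]
    return softmax(scores)


def sample_autoregressive(candidates, type_logits, type_queries, rng):
    """Sample (action type, target) and return the joint log-probability."""
    # Head 1: choose the action type.
    type_probs = softmax(type_logits)
    a_type = rng.choices(range(len(type_probs)), weights=type_probs)[0]
    # Head 2: point at a candidate, conditioned on the chosen type.
    target_probs = pointer_probs(type_queries[a_type], candidates)
    target = rng.choices(range(len(candidates)), weights=target_probs)[0]
    # Chain rule: log p(type, target) = log p(type) + log p(target | type).
    log_prob = math.log(type_probs[a_type]) + math.log(target_probs[target])
    return a_type, target, log_prob


rng = random.Random(0)
candidates = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # one feature vector per element
type_queries = [[0.5, -0.5], [-0.5, 0.5]]          # stand-ins for learned queries
a_type, target, log_prob = sample_autoregressive(candidates, [0.0, 0.0], type_queries, rng)
```

Because the pointer head scores candidates by their features, reordering the candidates only reorders the probabilities, and the same policy can be applied to parcels with different numbers of elements.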

Real-World Deployment
- Launched as a mobile app service: Landbook Premium (Sep 2021)
- Generates viable building designs within 3 minutes
Retrospective
Strengths
- Successfully automated a multi-step architectural design process
- Achieved generalization and rapid generation over various parcels
Limitations
- Some action heads showed low entropy and weak training signals
- Requires further tuning and experimentation for better convergence