Kevin Lynch, 1960
Visual representation in architecture, urban design, and planning is critical to both design and decision-making processes. Despite major advances in computer graphics, crafting visual representations remains a complex and costly task, usually carried out by highly trained professionals. This is particularly true during preliminary design stages, such as zoning exercises or schematic design, in which key decisions are made yet only partial information about the final design is available. This work proposes replacing common urban-design visualization practices with a machine-learned, generative approach. By coupling a deep convolutional generative adversarial network (DCGAN) with a tangible user interface (TUI), this work aims to allow for real-time urban prototyping and visualization.
The DCGAN model was trained on Cityscapes, a semantic street-view dataset. A version of CityScope (CS), a rapid urban-prototyping platform, is used as the tangible design interface. After each design iteration on CS, the DCGAN model generates a rendering of the selected street view in the design space. A lightweight, web-based, and platform-agnostic tool was also created for visualization and UI. Unlike traditional rendering techniques, this tool could help designers focus on spatial organization, urban programming, and massing exercises without the need for detailed design, complex visualization processes, or costly setups. This approach could support early-stage urban design processes that are enriched by the visual atmosphere, impression, and discussion around "The Image of the City."
CS is designed to allow playful, unrestricted interaction with a tangible environment that is augmented by real-time analysis, simulations, and predictions. This differentiates CS from other design tools, in which even basic interaction requires experience and feedback is rarely produced in real time.
The DeepScope platform was designed with a 16x16 grid of 4x4 Lego tiles that are randomly populated with clones of five patterns. Each pattern is a unique 16-bit matrix of black and white studs facing downward, to be scanned by the CS scanner. Each tile and pattern represents a different land use or streetscape element: roads, buildings, green spaces, parking, and sidewalks. In addition, a single "observer" pattern was designed to mimic a virtual pedestrian and set its angle of view, and each tile is associated with virtual parameters (such as z-axis height, rotation, and density) at a scale of 10x10 meters.
As users manipulate the tiles via the TUI and design the streetscape, their interactions are visualized on the CS tabletop surface and on a vertically mounted display. Each design iteration creates a new array of digital patterns. This array is then translated into a 3D environment in which each grid cell is represented by its label and predefined parameters. For example, a vegetation pattern yields a flat green rectangle colored with the RGB values of the corresponding Cityscapes label.
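A minimal sketch of this grid-cell-to-geometry translation is shown below: each labeled cell becomes a flat rectangle in the Cityscapes label palette at the 10x10 meter cell scale. The RGB values follow the published Cityscapes label definitions; the function and its return format are illustrative.

```python
# Sketch: translate one scanned grid cell into a flat colored rectangle.
# Colors follow the Cityscapes label palette; the geometry format is an
# illustrative assumption, not the actual DeepScope data structure.
CITYSCAPES_COLORS = {
    "road":       (128, 64, 128),
    "building":   (70, 70, 70),
    "vegetation": (107, 142, 35),
    "sidewalk":   (244, 35, 232),
    "car":        (0, 0, 142),
}

def cell_geometry(label, gx, gy, cell_m=10):
    """Return a flat rectangle for grid cell (gx, gy) at 10x10 m scale."""
    return {
        "color": CITYSCAPES_COLORS[label],
        "bounds": (gx * cell_m, gy * cell_m,          # min x, min y (m)
                   (gx + 1) * cell_m, (gy + 1) * cell_m),  # max x, max y
    }

print(cell_geometry("vegetation", 3, 2))
```

Keeping the rendered colors identical to the training labels matters here: the DCGAN was trained to map Cityscapes label colors to photographic imagery, so any deviation in the palette would degrade the generated output.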
In addition, this pattern triggers an algorithm that proliferates small-scale streetscape objects associated with that label, so that in the case of the vegetation label, trees, bushes, and live fences emerge in the virtual space. The algorithm also controls the position, rotation, shape, and height of these generated objects. Thus, a sidewalk pattern will yield pedestrians or street signs, and a parking-lot pattern will be populated with parked vehicles. As users design the overall urban structure, the environment is autonomously filled with small-scale street elements.
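The proliferation step described above can be sketched as a seeded, per-cell scatter of label-appropriate objects. The object types, densities, and attribute ranges below are illustrative assumptions; seeding the generator per cell keeps the scenery stable across re-scans so that the scene only changes where the user actually moved a tile.

```python
import random

# Hypothetical sketch of the proliferation algorithm: each tile label
# spawns small-scale streetscape objects at pseudo-random positions,
# rotations, and heights inside its 10x10 m cell. Object types and
# value ranges are illustrative, not the actual DeepScope parameters.
OBJECTS_BY_LABEL = {
    "vegetation": ["tree", "bush", "live_fence"],
    "sidewalk":   ["pedestrian", "street_sign"],
    "parking":    ["parked_car"],
}

def proliferate(label, cell_x, cell_y, n=3, seed=0):
    """Scatter n label-matching objects inside one grid cell."""
    rng = random.Random(hash((cell_x, cell_y, seed)))  # stable per cell
    kinds = OBJECTS_BY_LABEL.get(label, [])
    objects = []
    for _ in range(n):
        if not kinds:
            break  # labels like "road" spawn no scatter objects
        objects.append({
            "kind": rng.choice(kinds),
            "x": cell_x * 10 + rng.uniform(0, 10),  # metres
            "y": cell_y * 10 + rng.uniform(0, 10),
            "rotation": rng.uniform(0, 360),        # degrees
            "height": rng.uniform(2, 8),            # e.g. tree height, m
        })
    return objects
```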
This composed 3D environment is captured via the "observer" pattern. This one-off plate controls the 3D camera position, so that each move or rotation creates a new perspective viewport. The camera's baseline parameters, such as FOV and height, were approximated from the Cityscapes camera-calibration appendix, so that the observer would produce input consistent with the training dataset. Additional camera controls were also implemented, allowing CS users to rotate, pan, and zoom the observer via a custom TUI built from a gamepad joystick.
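A possible shape for the observer's camera state and its gamepad controls is sketched below. The baseline height and FOV values are placeholders, not the actual Cityscapes calibration numbers; the clamping ranges are likewise illustrative.

```python
from dataclasses import dataclass

# Sketch of the observer camera state driven by the gamepad TUI.
# Baseline values are placeholders, not the Cityscapes calibration.
@dataclass
class ObserverCamera:
    x: float = 0.0        # position on the table grid, metres
    y: float = 0.0
    height: float = 1.2   # illustrative pedestrian eye height, metres
    yaw: float = 0.0      # view direction, degrees
    fov: float = 50.0     # illustrative horizontal FOV, degrees

    def rotate(self, d_deg):
        self.yaw = (self.yaw + d_deg) % 360

    def pan(self, dx, dy):
        self.x += dx
        self.y += dy

    def zoom(self, d_deg):
        # Narrowing the FOV zooms in; clamp to a usable range.
        self.fov = max(20.0, min(90.0, self.fov - d_deg))
```

Fixing the height and FOV near the dataset's calibration, while leaving position and yaw free, keeps every captured viewport inside the distribution the DCGAN was trained on.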
With every change to the CS grid array, the observer's current viewpoint is captured as a raster image. This image is then converted into an input vector and fed to the DCGAN model, converted to TensorFlow.js. The model then generates an image corresponding to the input label map, and its output is drawn onto the CS canvas and displayed on a vertical monitor.
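The conversion from captured raster to input vector can be illustrated with the normalization step that pix2pix-style generators typically expect: scaling 8-bit label colors to the [-1, 1] range. The deployed model runs in TensorFlow.js in the browser, so this Python fragment is only a sketch of the idea, not the actual pipeline code.

```python
# Illustrative normalization of the captured label raster before it is
# fed to the generator. pix2pix-style GANs commonly expect inputs in
# [-1, 1]; this is a sketch, not the TensorFlow.js production code.
def to_input_tensor(pixels):
    """pixels: H x W x 3 nested lists of 0-255 label colors."""
    return [[[c / 127.5 - 1.0 for c in px] for px in row]
            for row in pixels]

frame = [[[128, 64, 128]]]   # a single "road"-colored pixel
tensor = to_input_tensor(frame)
```

The inverse mapping, `(v + 1) * 127.5`, takes the generator's output back to displayable 8-bit values before it is drawn onto the CS canvas.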
The horizontal CS tabletop serves as both the design space and a canvas for visualization. After each design move, a schematic diagram of the Cityscapes labels is projected onto the canvas as a reference for the design. The observer's position is displayed using a colored tile with a perspective cone indicating its direction of view. Together, the HCI components of this system allow users to effortlessly design and amend the urban streetscape and observe the effects of different scenarios in real time.