1 Introduction

Urbanization is a global issue characterized by continuous urban land expansion and rural–urban migration (Alcock et al. 2017; Seto et al. 2012). Urban development has brought social, economic, and technological changes, particularly in developing countries, where cities are sprawling at high rates and metropolitan areas are emerging (Bai et al. 2012; Shahbaz et al. 2016; Zhou et al. 2004). However, large-scale population growth often pushes urban development beyond the carrying capacity of cities. Most urban development in developing countries takes the form of sprawl at the urban fringe, with many negative consequences for urban development and the eco-environment at unparalleled scales (Burak et al. 2017; Weeberb 2015). Thus, research into the mechanisms of urban expansion is of great significance in helping planners and governments to understand urban sustainability.

Cellular automata (CA), which provide a powerful simulation tool for predicting and understanding urban transformation over space and time, have been among the most prevalent methods for modeling the complexity of urban systems in recent years (Aburas et al. 2016; Santé et al. 2010; Musa et al. 2017). CA offer governments, planners, and stakeholders a tool to forecast and evaluate potential social benefits and environmental outcomes of urban development before implementation. CA also advance our fundamental understanding of urban dynamics and the complex relationships among urban changes, socio-economic development, and sustainable systems.

CA are a kind of discrete dynamic model with unique advantages for simulating complex nonlinear problems. CA originated in the 1940s, when S. Ulam and J. von Neumann considered the possibility of a self-replicating machine. Subsequently, many scholars undertook further studies of CA and contributed to their advancement (Codd 1968; Gardner 1971). Wolfram (1984) demonstrated the capacity of CA for modeling complicated natural processes and generating spatio-temporal global changes through local interactions among components. The application of cellular-space models in geographic research was first proposed by Tobler in 1979. The first theoretical approaches to urban CA modeling then emerged in the 1980s and 1990s (Batty and Xie 1994; Couclelis 1985; White and Engelen 1994). The integration of CA and geographic information systems (GIS) led to the simulation of real-world urban development. After the initial wave of urban CA modeling led by Batty, Couclelis, Clarke, and Tobler, research on urban CA spread quickly to China (Li et al. 2017; Zhuang et al. 2017). Since the end of the 1990s, Yeh and Li have developed a series of CA techniques, mainly combining CA with other models and extending cellular states, neighborhood definitions, and transition rules (Yeh and Li 2001; Li and Yeh 2002a). These models have been successfully applied to solving the environmental and ecological problems of rapid urban development in China.

The increasing popularity of CA in urban modeling can be largely attributed to their simplicity, flexibility, controllability, and ability to incorporate the spatial and temporal dimensions of urban development processes. CA can simulate complex dynamic urban systems through simple rules that work with remotely sensed data and GIS (Santé et al. 2010; Musa et al. 2017). CA are more convenient than other models, such as agent-based models, because of methodologies developed in the past two decades. Another reason why CA have been widely applied in urban modeling is that they can be easily integrated with GIS. This integration provides a tool for performing complicated computations based on local information, thus producing better results than differential equations (Musa et al. 2017). However, despite the popular use of CA in urban modeling, errors in input spatial data sources and uncertainty in policies (Yeh and Li 2006) pose challenges in using CA to solve real planning problems (Poelmans and Rompaey 2010).

CA are increasingly being used to simulate spatio-temporal urban expansion and to address many environmental problems. However, defining the most suitable model structures for a specific application problem is difficult. To help users who are not familiar with CA, this chapter provides an overview of the basic and state-of-the-art concepts and methods in urban CA modeling, as well as the latest studies, applications, and current problems. The aim of this chapter is to provide an overview of defining, modifying, and applying CA for urban studies and planning from the perspectives of cell, cell space, neighborhood, time step, and transition rule, along with the collection of required data sources. The different types of CA and their characteristics are described, and the applications and urban issues involved in CA modeling are presented. These discussions attempt to answer the question, “what can and cannot CA provide for the modeler?” In addition, the strengths and weaknesses of CA are identified and common problems of current studies are discussed.

2 Methodology and Data Collection

2.1 Urban CA for Formulating Urban and Regional Planning Scenarios

The basic components of CA include cell space, cell, neighborhood, time steps, and transition rules. In an urban CA model, each component has geographic implications (Triantakonstantis and Mountrakis 2012). The cell space represents the two-dimensional geographic space composed of regular cells, and the states of cells represent different land uses. The core of a CA model is formed by transition rules. Over time, each cell changes in accordance with its state and the transition rules, and the aggregate of these local changes represents the evolution of the system as a whole.

A formal cell space can be a regular grid of square cells, which is particularly suitable for computer processing and compatible with remotely sensed data. Scholars have also defined a hexagonal cell space so that the neighborhood is homogeneous (Iovine et al. 2005). Besides, a cell space can be three-dimensional to represent the vertical growth of urban areas. To bring the simulation process closer to the real world, both the cell and the cell space need to be relaxed. The modified cell space can be based on irregular spatial units, such as Voronoi polygons (Shi and Pang 2000) or graphs (O’Sullivan 2001). Irregular cell space is sometimes presented as a patch-based space (Chen et al. 2014; Wang and Marceau 2013). The irregular spatial unit, such as a cadastral parcel or a census block, is usually represented as a polygon that reflects land use, population, and economic conditions. Compared with regular cells, parcels or blocks provide a good representation of reality, but lead to complicated definitions of neighborhood. Cell space is normally assumed to be homogeneous in standard CA, meaning that cells are identical and are distinguished only by their states. Nevertheless, land attributes such as transport accessibility or physical conditions strongly influence land-use changes, so the suitability of cells for certain land uses varies from cell to cell. This creates the need for a non-uniform cell space.

As for neighborhood, there are often two kinds of relaxations. In standard CA, the neighborhood is isotropic and homogeneous for each cell (Wu 2002; Xie 1996) and consists of a fixed set of geometrically closest cells (e.g., the Moore neighborhood). In urban applications, an extended neighborhood is adopted to consider the neighboring effect of geographic entities (White and Engelen 2000). Neighborhood size can be extended to a specified distance, and a weight can be introduced according to the distance to account for distance decay. If the neighborhood is based on irregular units, adjacent units within a certain distance or degree of proximity are used to represent it (Shi and Pang 2000). Another widely acknowledged modification is the non-stationary neighborhood, which defines different neighborhood spaces for different cells (Couclelis 1985). However, this relaxation has seldom been applied because it is difficult to implement and its geographic meaning is vague.
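The extended, distance-weighted neighborhood described above can be sketched as follows. This is a minimal illustration and not the formulation of any cited model: the function names, the circular radius, the inverse-distance weight `1 / d**decay`, and the binary coding (1 = urban, 0 = non-urban) are all assumptions made for the example.

```python
import math


def neighborhood_weights(radius, decay=1.0):
    """Map each (d_row, d_col) offset within a circular radius to 1 / d**decay.

    The inverse-distance form is an assumed decay function, chosen only
    to illustrate the idea of distance decay.
    """
    weights = {}
    for d_row in range(-radius, radius + 1):
        for d_col in range(-radius, radius + 1):
            distance = math.hypot(d_row, d_col)
            if 0 < distance <= radius:  # exclude the cell itself
                weights[(d_row, d_col)] = 1.0 / distance ** decay
    return weights


def neighborhood_effect(grid, row, col, weights):
    """Weighted share of urban cells (state 1) around (row, col), in [0, 1].

    Offsets that fall outside the grid are ignored, so edge cells are
    normalized by the weights of the neighbors they actually have.
    """
    total = 0.0
    urban = 0.0
    for (d_row, d_col), weight in weights.items():
        r, c = row + d_row, col + d_col
        if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
            total += weight
            urban += weight * grid[r][c]
    return urban / total if total else 0.0
```

With these assumptions, an urban cell one step away contributes more to the effect than an urban cell two steps away, which is the qualitative behavior that distance decay is meant to capture.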

As the core of a CA model, transition rules usually entail substantial modifications to account for the particularities and complexity of specific applications. Original transition rules depend only on the states of a cell and its neighborhood. Given that urban processes are influenced by numerous factors, such as transport accessibility and physical conditions, urban CA models are modified to consider external effects. As CA are flexible, transition rules can be defined in different ways according to the preferences of modelers. Randomness and uncertainty of urban growth, as well as many urban theories, can be reflected in the model structure. Besides, in standard CA, transition rules are static and the same at every time step. However, urban processes and determinants change over time and space, which makes it necessary to calibrate transition rules based on the specific characteristics of different periods and areas (Clarke et al. 1997; Geertman et al. 2007; Li et al. 2008). For example, Clarke et al. (1997) proposed a self-modifying CA in which transition rules vary over time. The time steps in a formal CA are discrete and uniform, which assumes that urban growth occurs simultaneously in all cells. Many urban CA models apply time steps of different lengths, or different time steps for different cells, to reflect the influence of specific events of different duration. However, compared with other components of CA, fewer relaxations have been implemented for time steps.

The future state of a cell depends on the transition rules and its state in the previous moment. A standard CA can be mathematically expressed as follows (Ahmed and Ahmed 2012):

$$S^{t + 1} = f(S^{t}, N) \tag{45.1}$$

where $t$ and $t + 1$ represent discrete time points, $S^{t}$ and $S^{t + 1}$ represent the states of the cell at time $t$ and $t + 1$, respectively, $N$ represents the set of states of neighborhood cells, and $f$ is a transition rule.
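Eq. 45.1 can be read as a synchronous grid update. The sketch below is a minimal illustration under stated assumptions: a binary state (1 = urban, 0 = non-urban), a Moore neighborhood truncated at the grid edges, and a hypothetical rule `f` in which a non-urban cell converts once at least `threshold` neighbors are urban. Neither the rule nor the threshold comes from the cited literature.

```python
def moore_neighbors(grid, row, col):
    """States of the up-to-eight Moore neighbors of (row, col).

    Cells on the grid edge simply have fewer neighbors.
    """
    rows, cols = len(grid), len(grid[0])
    return [
        grid[r][c]
        for r in range(max(0, row - 1), min(rows, row + 2))
        for c in range(max(0, col - 1), min(cols, col + 2))
        if (r, c) != (row, col)
    ]


def rule(state, neighbors, threshold=3):
    """Hypothetical f: urban stays urban; non-urban converts when at
    least `threshold` neighbors are urban."""
    if state == 1:
        return 1
    return 1 if sum(neighbors) >= threshold else 0


def step(grid, f=rule):
    """Return S^{t+1}: every cell is updated from the grid at time t,
    so updates within one step do not influence each other."""
    return [
        [f(grid[r][c], moore_neighbors(grid, r, c)) for c in range(len(grid[0]))]
        for r in range(len(grid))
    ]
```

Calling `step` repeatedly advances the simulation one time step at a time; swapping in a different `f` changes the transition rule without touching the rest of the loop.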

The straightforward nature of standard CA limits the ability to represent real-world geographic phenomena (Couclelis 1985). To adapt standard CA in urban applications, the particularities of geographic processes should be included for representing geographic heterogeneity, which leads to the relaxation of original CA components (Couclelis 1997). For example, geographic features in the neighborhood can be embodied in a simplified CA using rule-based structures (Batty 1997; Fig. 45.1):

Fig. 45.1 Neighborhood and basic transition rules of cellular automata

By integrating CA with GIS databases, a constrained urban CA can be further developed for formulating planning scenarios. It is assumed that the evolution of real cities is influenced by a series of complicated factors that can be defined at local, regional, and global levels. Constraints should be used to regulate the simulation and improve modeling performance. Without constraints, urban simulation simply extrapolates historical trends and generates business-as-usual patterns. Constraints can be added to urban CA models to reflect environmental and sustainable development considerations, and they are important factors in the formation of idealized patterns. The generic constrained CA model takes into account not only the influences of neighboring states, but also a series of economic and environmental constraints. These constraints may include environmental suitability, urban forms, and development density (Yeh and Li 2001, 2002; Li and Yeh 2000; Fig. 45.2).

Fig. 45.2 Constrained CA with GIS and planned development database
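The idea of a constrained CA can be sketched as a neighborhood effect that is scaled down by constraint layers. The example below is only an illustration under assumptions that are not taken from the cited models: binary states, a Moore neighborhood, constraint scores in [0, 1] (for instance 0 for protected land), a multiplicative combination, and a stochastic conversion test. All function and parameter names are hypothetical.

```python
import random


def development_probability(neighborhood_share, constraint_scores):
    """Scale the neighborhood effect by every constraint score in [0, 1].

    A score of 0 from any layer (e.g. protected farmland) blocks
    development; the multiplicative form is an assumption.
    """
    probability = neighborhood_share
    for score in constraint_scores:
        probability *= score
    return probability


def constrained_step(grid, constraint_layers, rng=None):
    """One update: a non-urban cell converts to urban when a random draw
    falls below its constrained development probability."""
    rng = rng or random.Random()
    rows, cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1:
                continue  # urban cells are kept as they are
            neighbors = [
                grid[i][j]
                for i in range(max(0, r - 1), min(rows, r + 2))
                for j in range(max(0, c - 1), min(cols, c + 2))
                if (i, j) != (r, c)
            ]
            share = sum(neighbors) / len(neighbors) if neighbors else 0.0
            scores = [layer[r][c] for layer in constraint_layers]
            if rng.random() < development_probability(share, scores):
                new_grid[r][c] = 1
    return new_grid
```

Different planning scenarios can then be expressed by passing different sets of constraint layers to `constrained_step` while leaving the growth mechanism unchanged.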

2.2 Data Collection and Model Calibration

As bottom-up models, urban CA models are data hungry and usually require a large set of input data for real-world simulation. Remotely sensed data are often used for monitoring and measuring land-use changes on the Earth’s surface. Time series of historical remotely sensed images or land-use maps of the same area at different dates can be used for model calibration and validation. In addition, traffic networks, natural attributes (e.g., elevation), and other physical factors are commonly used to evaluate the suitability of land for development. Land-use plans can provide land-development information, such as a planned regional development center, which is crucial for considering the effects of urban planning on future development. Many studies have used fine-grained socio-economic data, such as population density, to produce more realistic simulation results.

The quality of these input data sources is a concern in urban CA applications (Aburas et al. 2016). Supervised classification is adopted to classify remote-sensing images into different land-use types: for example, urban and non-urban. Moreover, GIS software tools are used to create maps with different spatial resolutions for comparative analysis. Errors and uncertainty can be produced by these common operations and by the input data sources themselves, thus influencing the results of urban simulation (Yeh and Li 2006). There are debates on whether urban CA models can provide meaningful results, especially for urban planning, due to inherent errors and uncertainty. Overall, considering both model formulation and data collection, modelers can follow the flow chart in Fig. 45.3 to create an urban CA model.

Fig. 45.3 Flow chart of urban CA modeling

3 Types of Urban CA Models

The model developed by Batty and Xie (1994) in Amherst, New York was one of the first applications of urban CA in real-world simulation. However, the first widespread empirical applications of urban CA were carried out by White et al. (1997) and Clarke et al. (1997). The application of White et al. built on the earlier work of White and Engelen (1993, 1997). In the model of White et al., the transition potential of conversion into different land uses is calculated for each cell, which can be regarded as a function of various factors, including suitability for different land uses, neighborhood and inertia effects, and stochastic disturbance. Several models of this functional type were applied to Cincinnati (White et al. 1997), the Netherlands (Engelen et al. 1999), Tokyo (Arai and Akiyama 2004), Dublin (Barredo et al. 2003), Lagos (Barredo et al. 2004), and San Diego (Kocabas and Dragicevic 2006). These applications confirmed the capacity of urban CA models for highly realistic simulation of urban transformation. Several improvements have been proposed to reinforce the methodological and theoretical basis of this type of model (Arai and Akiyama 2004; Caruso et al. 2005). Another application is the SLEUTH model, whose name is an acronym of its input maps: slope, land use, exclusion, urban extent, transportation, and hillshade (Clarke et al. 1997). SLEUTH considers four types of growth behavior: spontaneous, diffusive, organic, and road-influenced. This model is designed to learn from the feedback of its local settings over time through self-modification, and its calibration is based on combining different metrics of the goodness-of-fit between observed and simulated results. SLEUTH has been applied to many cities, initially in North America (Berling-Wolff and Wu 2004; Clarke and Gaydos 1998; Dietzel and Clarke 2006; Herold et al. 2003; Yang and Lo 2003), and later in Europe (Silva and Clarke 2002), South America (Leao et al. 2004), and Asia (Feng et al. 2012; Mahiny and Gholamalifard 2007). Efforts have been made to improve SLEUTH, such as introducing new metrics and functionality (Guan and Clarke 2010; Jantz et al. 2010; Liu et al. 2012).

Other early urban CA models include those developed by Wu (1998, 2002), Wu and Webster (1998), and Wu and Martin (2002), in which the probability of urban development for each cell was calculated based on a group of factors, such as neighborhood. The first urban planning CA models, proposed by Li and Yeh (2002b) and Yeh and Li (2001, 2002), adopted gray cells to represent continuous cell states and cumulative degrees of development. They developed a family of constrained CA urban planning models that can generate different planning options according to different environmental considerations, urban forms, and densities, for the evaluation of urban development and planning for sustainable development. They added constraint functions to CA modeling that incorporate environmental and urban-form data obtained from GIS.

The methods of multi-criteria evaluation and logistic regression were first introduced by Wu and Webster (1998) and Wu (2002) to allocate weights to different factors; these methods are simpler and require less computation than Monte Carlo approaches (Chen et al. 2002). As urban development is a complicated and nonlinear process, Yeh and Li (2003) proposed defining transition rules using a neural network as a black box. Instead of mathematical transition rules, Li and Yeh (2004) defined explicit transition rules using IF–THEN statements, which are straightforward and intuitive. Several statistical, probabilistic, and artificial-intelligence algorithms have been used to calibrate these types of urban CA models (Wu and Martin 2002; Almeida et al. 2008; Li and Liu 2006; Feng and Liu 2013).
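As a rough illustration of how logistic regression can turn weighted factors into a development probability, the sketch below evaluates a standard logistic function. The intercept, coefficients, and factor values are hypothetical placeholders; in practice they would be estimated from historical land-use change, and the exact model form used in the cited studies may differ.

```python
import math


def logistic_probability(intercept, coefficients, factors):
    """Development probability from a logistic model.

    p = 1 / (1 + exp(-z)), with z = intercept + sum(b_k * x_k).
    The two branches are algebraically identical and only avoid
    overflow when z is a large negative number.
    """
    z = intercept + sum(b * x for b, x in zip(coefficients, factors))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


# Hypothetical example: one factor, normalized distance to the city
# center, with an assumed negative coefficient (farther = less likely).
p_near = logistic_probability(1.0, [-2.0], [0.1])
p_far = logistic_probability(1.0, [-2.0], [0.9])
```

Under these assumed coefficients, `p_near` is larger than `p_far`, so cells closer to the center are more likely to be developed.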

Other popular urban CA models were derived from other research fields, such as DINAMICA, which is a CA-based model originally designed for deforestation simulation (Soares-Filho et al. 2002; Almeida et al. 2003, 2005). As a bottom-up dynamic model, urban CA can be integrated with top-down models to gain complexity and power. The integration with the Markov approach compensates for its growth constraints and thus has received much attention recently (Al-Shalabi et al. 2013; Araya and Cabral 2010; Arsanjani et al. 2011; Li et al. 2014; Memarian et al. 2012; Samat et al. 2011; Deep and Saklani 2014; Olusina et al. 2014).
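In a Markov-CA coupling, the Markov chain typically supplies the quantity of each land use expected in the future, and the CA decides where that change is located. The sketch below covers only the Markov part and is an illustration under assumptions: two co-registered land-use maps, a first-order chain, and a projection interval equal to the interval between the two maps. The function names are hypothetical and do not correspond to any particular software package.

```python
from collections import Counter


def transition_matrix(map_t0, map_t1, classes):
    """Row-normalized share of cells moving from each class at t0 to
    each class at t1, estimated by overlaying the two maps."""
    counts = Counter()
    for row0, row1 in zip(map_t0, map_t1):
        for before, after in zip(row0, row1):
            counts[(before, after)] += 1
    matrix = {}
    for a in classes:
        row_total = sum(counts[(a, b)] for b in classes)
        matrix[a] = {
            # A class absent at t0 is assumed to persist unchanged.
            b: (counts[(a, b)] / row_total if row_total else float(a == b))
            for b in classes
        }
    return matrix


def project_demand(areas, matrix):
    """Expected cell count per class after one more interval, given the
    current counts `areas` and the estimated transition matrix."""
    classes = list(matrix)
    return {b: sum(areas[a] * matrix[a][b] for a in classes) for b in classes}
```

The projected counts can then serve as the demand that a CA allocation has to meet, for example by converting the most suitable cells until the projected urban area is reached.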

4 Applications of Urban CA in Urban Planning

The development of CA for urban and regional applications is considerably influenced by the intended use and functionality of models. Urban CA models are applied for exploring spatial complexity, testing urban theories and ideas, and as planning support tools (Fig. 45.4).

Fig. 45.4 Potential applications of urban CA modeling

For exploring spatial complexity, urban CA models are used to advance the understanding of cities as complex adaptive and dynamic systems. Limited adjustments in the CA formalism are required for the models applied in exploring the principles governing urban spatial development. CA are the combination of a spatial structure and a set of states and transition rules. The idea behind CA is to find simple elements of complexity in cities and to compare these elements with similar models in other fields. The original work by Tobler and Couclelis in the 1970s and 1980s emphasized the conceptual and theoretical aspects of CA and related them to the theory of complex systems (Tobler 1979; Couclelis 1985). CA were taken as an epistemological tool to show how spatial development can be produced out of simple rules. CA for exploring spatial complexity were further developed along with fractal theory, chaos, nonlinearity, computer graphics, and complexity (Batty 2007; Torrens and O’Sullivan 2001).

CA can be used to test theories and ideas of urban development, examining the roles of complexity in the driving dynamics of urban processes, such as urban sprawl, diffusion and coalescence, and polycentricism. CA models are used as laboratories to test theories and ideas in urban economics, geography, and sociology. The formulation of transition rules is the key to developing close and direct links between urban CA models and urban theories. The transition rules derived from urban theories can help to explore various hypothetical ideas about cities. The complex relationships between physical and socio-economic processes and urban environments have been explored (Alberti 1999; Dietzel et al. 2005). Efforts have been extended to embrace other urban theories, including urban ecology, design, and sociology (Batty 1998; Benati 1997; Portugali et al. 1997). These studies have advanced the theoretical basis of urban CA models. However, CA models of urban theories are often concerned with details on how to build the model, but fail to explain the theories that they intended to explore (Torrens and O’Sullivan 2001). Thus, they are interesting but not well explored in urban CA modeling.

The use of urban CA models as planning support systems requires modifications of the above two applications of CA models to produce more realistic results relevant to urban planning, management, and policies. These CA models serve as planning support tools that can assist governments, planners, and stakeholders in evaluating the social benefits and environmental and ecological consequences of different urban planning goals, options, and policies. Various urban issues have been addressed in these types of urban CA models, including the delineation of urban growth boundaries, assessment of urban planning options, and prevention of illegal development (Jantz et al. 2010; Xia et al. 2020a). Despite the fact that urban CA models are increasingly developed in applied research, a gap exists in supporting practical planning of urban spaces and land uses (Santé et al. 2010).

In addition to using CA as a planning support system to (1) construct baseline growth simulation and prediction; (2) evaluate existing development as compared with optimal development; and (3) simulate development alternatives according to different planning objectives for assisting the urban planning process (Yeh and Li 2009), another example of using CA in urban planning is to delineate urban growth boundaries (UGBs). UGBs have become an important part of territorial planning in China. The objective is to ensure smart urban growth, which can increase the density of urban services and protect surrounding natural ecosystems (Jun 2004). UGBs have been regarded as an important element in designing land-use plans in China, although the concept can be traced to Great Britain’s green belts in the 1930s (Nelson and Moore 1993). China needs to restrain its chaotic urban expansion via the delineation of UGBs to sustain its shrinking farmland stock.

The designers of UGBs should understand the mechanism of urban dynamics and consider various geographic factors. Traditionally, evaluation models for land-use suitability have provided a simple way of delimiting UGBs (Bhatta 2009). A major problem is that cities are dynamic systems influenced by anthropogenic activities and natural processes, and these suitability-based methods ignore landscape characteristics during the delineation of UGBs (Santé et al. 2008). Delineation therefore requires efficient and feasible techniques. Urban CA models can assist planners in delimiting optimal UGBs for directing future urban expansion from a spatial optimization perspective. CA can satisfy multiple objectives in the delineation of UGBs, including maximum urban suitability, preservation of high-quality farmland to the greatest extent, and the most compact landscape pattern (Ma et al. 2017; Liang et al. 2018).

One example is the GeoSOS-FLUS software (https://www.geosimulation.cn), which is available online and serves as an effective tool for delineating UGBs. Delineating UGBs with GeoSOS-FLUS involves several steps. First, various spatial variables and historical land-use data are retrieved to estimate the transition probability of each land-use type. Second, the simulation is defined under different planning visions according to a number of scenarios, such as baseline, economic zoning development, and excessive urban growth scenarios. Third, UGBs are simulated on the basis of the above urban development probability and multi-scenario constraints, as well as other constraint factors. Fourth, the simulated UGBs can be further modified by using two common morphological operators, namely dilation and erosion.
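The final step, refining a simulated urban extent with dilation and erosion, can be sketched on a binary mask. This is a generic illustration of the two morphological operators and not the GeoSOS-FLUS implementation: the 3 × 3 window, the truncation of windows at the grid edge, and the function names are assumptions.

```python
def _window(mask, row, col):
    """Values in the 3 x 3 window centered on (row, col), truncated at
    the grid edge."""
    rows, cols = len(mask), len(mask[0])
    return [
        mask[i][j]
        for i in range(max(0, row - 1), min(rows, row + 2))
        for j in range(max(0, col - 1), min(cols, col + 2))
    ]


def dilate(mask):
    """A cell becomes 1 if any cell in its window is 1."""
    return [
        [1 if any(_window(mask, r, c)) else 0 for c in range(len(mask[0]))]
        for r in range(len(mask))
    ]


def erode(mask):
    """A cell stays 1 only if every cell in its window is 1."""
    return [
        [1 if all(_window(mask, r, c)) else 0 for c in range(len(mask[0]))]
        for r in range(len(mask))
    ]


def closing(mask):
    """Dilation followed by erosion: fills small holes inside the extent."""
    return erode(dilate(mask))


def opening(mask):
    """Erosion followed by dilation: removes small isolated patches."""
    return dilate(erode(mask))
```

Applied to a simulated urban mask, `closing` tends to absorb small enclosed gaps and `opening` tends to drop scattered single cells, which gives a smoother outline from which a boundary can be traced.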

Figure 45.5 shows an example of using GeoSOS-FLUS to simulate UGBs, projected to 2030, in the Guangdong-Hong Kong-Macau Bay Area (GHMBA), which is one of the fastest-developing urban agglomerations in China. GeoSOS-FLUS has also been applied to the delineation of UGBs in other fast-growing cities of China, such as Foshan, Zhengzhou, and Chongqing. The simulated UGBs can be used to guide future urban master plans, which can prevent wastage of land resources.

Fig. 45.5 Simulation of UGBs in the study area of GHMBA in 2030

5 Discussion and Conclusion

5.1 Current Issues in Urban CA Modeling

Urban CA models have strengths and weaknesses. The fast development of urban CA models is mainly due to their simplicity. However, simplicity often limits the capacity of CA to represent realistic urban phenomena, leading to extensive modifications and the introduction of complexity into the model. If the relaxations go too far, questions arise over whether these elaborated models still constitute CA at all. Another strength of urban CA models is flexibility, which allows them to be adapted to different applications. However, flexibility may cause confusion and difficulties for users if there is no standard definition of transition rules. Although difficult, finding the balance between simplicity and realism, as well as between flexibility and standardization, is needed. As descriptive models, urban CA models have the ability to examine hypothetical ideas related to cities. In terms of data requirements, input data collected for different models can vary greatly. In the past, the software available for implementing general urban CA models has been very limited and inconvenient to use; users are usually required to modify or re-design their models for specific purposes (Xia et al. 2018, 2020b).

In recent years, more user-friendly CA packages have been developed to solve various simulation and planning problems, such as the CA_MARKOV module in IDRISI, and GeoSOS. The CA_MARKOV module in IDRISI adopts a hybrid Markov-CA model to allocate land use until the areas predicted by a Markov chain are reached (Yang et al. 2014). GeoSOS provides a variety of CA models (e.g., neural network CA, logistic regression CA, decision tree CA) and can be freely downloaded at https://www.geosimulation.cn. Moreover, GeoSOS for ArcGIS (a software add-in that runs in ArcGIS Desktop) has been developed to provide the full functions of simulating, predicting, optimizing, and displaying a variety of geographic patterns and dynamic processes, such as land-use changes, urban evolution, zoning of natural areas for protection, and facility siting. As the only software integrating spatial simulation and optimization capabilities, GeoSOS for ArcGIS comprises a geographic simulator and an optimizer, which use multiple CA models and an ACO-based model, respectively, and couple their results to solve complex spatial simulation and optimization problems. GeoSOS for ArcGIS is free and open-source software and is freely available for download at the GeoSOS Web site (https://www.geosimulation.cn). So far, this ArcGIS Desktop add-in has been downloaded by users in 46 countries around the world.

The current literature on CA applications reflects problems that arise when researchers apply CA without being familiar with the models themselves. First, many users have claimed that their simulation results can support urban planning and management without offering good examples of real-world applications. Successful applications should demonstrate that governments or planners can make better decisions due to the use of CA models. Second, many users have difficulty in obtaining details of the input data, especially their acquisition dates. In some cases, a present-day road network that was built after the simulated period was used in the simulation, making the simulation somewhat questionable. Third, they evaluated their simulation results by comparing the simulated map to the reference map of the entire study area, but failed to compare the percentage of errors to the percentage of converted areas (Liu et al. 2014; Pontius and Millones 2011). As a result, they used flawed metrics, such as the goodness-of-fit, for assessing model performance (Pontius and Millones 2011). Finally, they separated calibration information from validation information only through space (by selecting pixels randomly), rather than through time (by using an urban map from another year), leading to overestimation of the accuracy of the model.
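The third problem can be made concrete by counting errors against the amount of observed change instead of against the whole map. The sketch below is an illustration under assumptions: binary urban/non-urban maps, an observed map from a later date than the calibration data, and the use of the figure of merit (hits divided by hits, misses, and false alarms), which is one change-based measure from the land-change modeling literature and is not named in this chapter. The function name and dictionary keys are hypothetical.

```python
def change_assessment(initial, observed, simulated):
    """Compare simulated change with observed change, cell by cell.

    Assumes binary maps of identical, non-empty shape.
    hits: change observed and simulated
    misses: change observed but not simulated
    false_alarms: change simulated but not observed
    """
    hits = misses = false_alarms = total = 0
    for row_i, row_o, row_s in zip(initial, observed, simulated):
        for start, obs, sim in zip(row_i, row_o, row_s):
            total += 1
            obs_change = obs != start
            sim_change = sim != start
            if obs_change and sim_change:
                hits += 1
            elif obs_change:
                misses += 1
            elif sim_change:
                false_alarms += 1
    errors = misses + false_alarms
    involved = hits + errors
    return {
        "hits": hits,
        "misses": misses,
        "false_alarms": false_alarms,
        "overall_accuracy": (total - errors) / total,
        "error_share": errors / total,
        "observed_change_share": (hits + misses) / total,
        "figure_of_merit": hits / involved if involved else 1.0,
    }
```

Because most cells do not change over a typical simulation period, overall accuracy can be high even when the error share exceeds the observed change share; reporting both shares side by side makes that situation visible.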

5.2 Summary and Future Research Directions

This chapter has summarized the basic concepts and techniques of CA modeling for urban and regional planning from the perspectives of basic CA components, formulation of urban CA, and data collection. Urban CA were classified into different types, and systematic and critical reviews on previous and recent studies and applications were provided. Finally, the strengths and weaknesses of urban CA models were pointed out for new modelers, along with current problems in the literature.

Further studies are needed to provide new insights into the uses of CA in geographic and urban theories, which would advance the theoretical basis of urban CA. The integration of urban CA models and other models may overcome the weaknesses of CA, such as with economic models, thus improving model performance. More effort should be made on improving CA by incorporating microlevel interactions and multiple processes. So far, the calibration is often based on two years of land-use maps. There is an issue of over-calibration because of bifurcation effects inherited from complex systems. Bifurcation refers to the fact that a small smooth change in the parameter values may cause a sudden change in the model’s behavior. Finally, elaboration is also required to demonstrate how urban CA models can support planning and management in practice. Urban CA models should not be used to provide exact predictions of urban systems, but to simulate interactively different what-if scenarios for policy implementation through the modification of transition rules.

Concern for global changes has grown tremendously in recent years. CA should incorporate factors of climate change in urban planning, such as the effects of urban heat islands, changes in agricultural production, and changes in land-use patterns. CA simulation could be integrated with climate and hydrological models in future studies (Chen et al. 2020). For example, urban simulation could incorporate the universal climate scenarios developed by the Intergovernmental Panel on Climate Change, such that future land use can meet the demand required by economic and social development. This integration can facilitate the simulation of future changes in global and regional land covers. For example, the simulation of urban evolution with finer urban land categories should be attractive for actual planning practice. This requires the integration of current CA with big data or social media data.