The geographic visualization of data makes up one of the major branches of the Digital Humanities toolkit. There are a plethora of tools that can visualize geographic information, from full-scale GIS applications such as ArcGIS and QGIS, to web-based tools like Google Maps, to any number of programming languages. There are advantages and disadvantages to these different types of tools. Using a command-line interface has a steep learning curve, but it has the benefit of enabling approaches to analysis and visualization that are customizable, transparent, and reproducible. My own interest in coding and R began with my desire to dip my toes into geographic information systems (GIS) and create maps of an early modern correspondence network. The goal of this post is to introduce the basic landscape of working with spatial data in R from the perspective of a non-specialist.

Since the early 2000s, an active community of R developers has built a wide variety of packages to enable R to interface with geographic data. The extent of the geographic capabilities of R is readily apparent from the many packages listed in the CRAN task view for spatial data. In my previous post on geocoding with R I showed the use of the ggmap package to geocode data and create maps using the ggplot2 system. This post will build off of the location data obtained there to introduce the two main R packages that have standardized the use of spatial data in R. The sp and sf packages use different methodologies for integrating spatial data into R. The sp package introduced a coherent set of classes and methods for handling spatial data in 2005 and remains the backbone of many packages that provide GIS capabilities in R. The sf package implements the simple features open standard for the representation of geographic vector data in R. It first appeared on CRAN at the end of 2016 and is under very active development.
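To give a taste of the sf approach described above, here is a minimal sketch of turning geocoded point data into an sf object. The data frame, place names, and coordinates are hypothetical stand-ins for the geocoded locations from the previous post:

```r
library(sf)

# Hypothetical geocoded correspondence data (longitude/latitude in WGS84)
letters_df <- data.frame(
  place = c("London", "Amsterdam"),
  lon   = c(-0.1276, 4.9041),
  lat   = c(51.5072, 52.3676)
)

# st_as_sf() converts a plain data frame into a simple features object:
# a data frame with an attached geometry list-column and a CRS
letters_sf <- st_as_sf(letters_df, coords = c("lon", "lat"), crs = 4326)

print(letters_sf)
```

Because an sf object is still a data frame, it works with familiar tools like dplyr and ggplot2, which is one reason sf has become the standard for vector data in R.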
Welcome to the second part of my tutorial on how to increase the processing speed of R. In the previous part I showed you how to speed up R by increasing the maxmemory limit. This part will be a little more "sophisticated", since we will have a look at how to parallelise R processes.

"Parallelise" is such a fancy word, but what does it mean? Well, you probably know that your CPU (if you didn't get stuck in the 90s) has multiple cores to process your requests. My laptop for example has 4 cores and my desktop PC has 8. If you use R for your calculations, usually only one core is used to handle the calculation, and the other ones are basically sleeping or handling some overhead operations like copying data and making sure your other programs are running properly. The following picture shows the CPU usage across my 4 cores during a typical R session: you can see that only one core (in this case core 3) is being used and the other ones are basically at 0% usage.

Imagine we have a big processing task that we have to perform over and over again (for example inside a for loop). This is what R is doing in this situation: it activates only one core and lets it handle the iterations of the loop, step by step, one iteration at a time. This means that one core will be doing all the work and the other cores will essentially be doing (almost) nothing, because R by default uses single-core processing. What a waste, right? Why not use all cores for the loop? For example, core 1 processes iteration 1, while core 2 processes iteration 2, while core 3 processes iteration 3, etc. This is essentially what parallelisation means: using multiple cores, at the same time (in parallel), for repetitive tasks.

Parallelisation – Step by Step

So how do we implement this in R? Let's say we have a folder with eight layerstacks and we would like to calculate the NDVI for every stack.

Example without parallelisation

This is how I would calculate the NDVI using a for loop without parallel processing:

library(raster) #load raster package
stack_list <- list.files(path, pattern=".tif$", full.names=T) #get file names using list.files() function

This code will work just fine and will need approximately 2 min to execute for smaller rasters (4000x4000 px). However, imagine you have 100 rasters you would like to process. Here using multiple cores will save you a lot of time.

Example with parallelisation

And this is how you do it:

library(foreach) #provides foreach looping construct
library(doParallel) #foreach parallel adaptor
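The post only shows the setup lines, so here is a minimal end-to-end sketch of both the sequential and the parallel version. The folder path, the band order (red = band 3, NIR = band 4), and the output file naming are assumptions for illustration, not details from the original post:

```r
library(raster)     # raster data handling
library(foreach)    # foreach looping construct
library(doParallel) # parallel backend for foreach (also loads the parallel package)

# Assumed: a folder of GeoTIFF layerstacks, red = band 3, NIR = band 4
path <- "data/stacks"
stack_list <- list.files(path, pattern = ".tif$", full.names = TRUE)

# --- Sequential version: one core processes one stack at a time ---
for (f in stack_list) {
  s    <- stack(f)
  ndvi <- (s[[4]] - s[[3]]) / (s[[4]] + s[[3]])
  writeRaster(ndvi, filename = sub(".tif$", "_ndvi.tif", f), overwrite = TRUE)
}

# --- Parallel version: each core processes a different stack ---
cl <- makeCluster(detectCores() - 1) # leave one core free for the OS
registerDoParallel(cl)
foreach(f = stack_list, .packages = "raster") %dopar% {
  s    <- stack(f)
  ndvi <- (s[[4]] - s[[3]]) / (s[[4]] + s[[3]])
  writeRaster(ndvi, filename = sub(".tif$", "_ndvi.tif", f), overwrite = TRUE)
}
stopCluster(cl) # always release the workers when done
```

Note the `.packages = "raster"` argument: each worker runs in a fresh R session, so the packages used inside the loop body have to be loaded on the workers explicitly.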