Geospatial Intelligence plays a critical supporting role in human security operations. In this space, preparation typically serves as a precursor to success. Emergency managers can limit the impact of disaster through mitigation and contingencies. These actions are enabled from a thorough understanding of the operating environment, which is facilitated by geospatial intelligence. At times there are obstacles to using geospatial intelligence effectively, such as the various methods of collection that can make interpreting the intelligence a slow process. There are also various databases and organizations that maintain data but do not readily share it or adhere to the same standards, limiting its broad application. Additionally, the quantity of aerial imagery collected creates storage issues. In response, the Institute for Creative Technologies has pursued the creation of OWT, which aims to provide a set of 3D global terrain capabilities and services that can replicate the coverage and complexities of the operational environment . This would permit more rapid planning and decision making in support of human security operations, whether that is the management of a natural disaster area or military operations to secure high-value targets. With this ambitious project, come the issues associated with storing and compiling vast quantities of aerial imagery, collected via different sources.
This research supports OWT by narrowing down various delivery methods for raster mosaics via cloud hosting services through implementation of file compression utilizing the OptimizeRasters Geoprocessing Toolbox and the Mosaic Dataset Configuration Script (MDCS). The two applications are used in tandem to prepare data for cloud delivery to the end-user, which would likely be an emergency responder operating in a low-bandwidth environment . Because of the austere environments and high-risk situations that users would likely find themselves in the compression method ultimately chosen for ICT’s data needs to be rapidly accessible, facilitating hasty mission planning.
Mosaicking of raster data presents challenges not only in the development of the mosaic, but also in its delivery . Delivery of the mosaic to the end user requires a delivery system compatible with various machines and bandwidths. Users will potentially be in austere environments with limited IT support, necessitating a delivery method that limits requirements of the user. The final application and how it is delivered will be dependent upon mission requirements, the number of persons accessing the data, how it is stored, and how it is to be used.
Historically, remotely sensed images were preprocessed with software packages before use in GIS applications. This process involved orthorectification, pan-sharpening, image enhancement, mosaicking, and clipping. Each of these steps results in intermediary results that take up disk space, in addition to taking up time to write the intermediary results to a disk . This traditional workflow is inefficient in disaster scenarios where rapid response is crucial. For this reason, the geospatial intelligence industry has transitioned away from local storage options to cloud environments.
There is a significant body of literature related to the efficiency of producing map tiles for web services under limited bandwidths . Various mosaicking and pyramid geoprocessing toolsets exist, such as the Mosaic Dataset Toolset. However, performance of pyramids and mosaicked images is largely dependent on server capabilities. Therefore cloud storage is generally preferred, since server capabilities are maintained by an outside party. The OptimizeRaster toolset is primarily cloud-based rather than dependent on local storage and a local server, which makes it a better option than other toolsets for ICT’s needs . Another option for cloud-based mosaicking is ESRI’s ArcGIS Image Server . This tool, however, is older than OptimizeRasters and is designed primarily for ArcGIS Desktop, versions including and before 10.3 . OptimizeRasters can be used in ArcGIS Desktop or ArcGIS Pro, and is available for versions after 10.3.
3. Information and Methods
For the purposes of this project, two real-world datasets were provided by the Institute for Creative Technologies, in order to construct raster mosaics. These included orthorectified imagery of USC’s Catalina Science Center (Figure 1), and the Rose Bowl Stadium (Figure 2). Although smaller than the requirements of the planned 3D global terrain, these two data sets allow us to test the effectiveness of various compression methods, and have many of the same characteristics, such as large structures in an urban environment (Rose Bowl), or remote locations (Catalina Science Center).
Figure 1. Full extent of the Catalina Science Center.
Figure 2. Full extent of the Rose Bowl dataset.
The Catalina Science Center dataset contains orthorectified imagery, meaning it is remotely sensed imagery that has been geometrically adjusted for lens distortion, camera tilt, etc. to increase accuracy. It was collected in 2016 with a DJI Phantom 4. The data was collected in a WGS84 coordinate system in UTM 11N. TIF file sizes range from 19 MB to 49 MB. The overall file size for the Science Center was 1.35 GB. The data was collected in 3 bands (RGB). The pixel type is unsigned integer and the pixel depth is 8 bit.
The Rose Bowl data also contains orthorectified imagery with fifty-nine TIF and TFW files, collected in 2017 with a DJI Phantom 4. It was collected in a WGS84 coordinate system. This data was collected in 3 bands. The pixel cell size is 0.028, 0.028. The pixel depth is 8 bit. The overall file size of the Rose Bowl dataset before compression is 2.71 GB.
3.2. Mosaicking Tools: OptimizeRasters
OptimizeRasters is an open-source geoprocessing toolbox created by ESRI to optimize file conversion and compression. It converts a variety of file formats to more optimized formats such as MRF, tiled TIFF, and Cloud Optimized GeoTIFF. This allows for more efficient and scalable data access. It also transfers to and from an enterprise storage system and/or cloud storage, such as Amazon S3, Microsoft Azure, and Google Cloud Storage. It also creates raster proxies which simplifies data management by storing small files on local systems that reference much larger files stored on a cloud . Since ICT considers itself “data agnostic”, the OptimizeRasters tools are useful in converting multiple file types to one uniform and optimized file type . We consider MRF to be the most appropriate file type for ICT’s purposes as it is optimized for both enterprise storage and cloud storage, thus supporting its accessibility for those operating outside of standard working conditions .
3.3. Meta Raster Format
MRF data is tiled and has pyramids, like Tiled TIFFs and Cloud Optimized GeoTIFFs, and was developed by NASA for the explicit purpose of storing and indexing rasters more efficiently than other file types. This efficiency comes from the ability to individually store the pyramids, index, and metadata as separate files, which means they can be read faster .
MRF also supports JPEG and LZW compression in addition to Limited Error Rate Compression (LERC). LERC is an even more efficient compression method than JPEG and LZW because it speeds up data access while also saving additional storage space. Large-bit-depth rasters, such as elevation and digital camera imagery are more prone to lossy compression, which is supported by LERC . Since ICT considers itself “data agnostic,” and some of the data may be collected by digital imagery, using the LERC extension during MRF conversion would be a good option for ensuring all data is uniform in the OptimizeRaster process. The MRF file is also relevant because we are moving into a time where our industry is transitioning away from the traditional desktop environment to a cloud-based platform . However, if downloading data is an important component for the final product application’s use, then MRF may not be an ideal file type. MRF data takes longer for users to download and takes more space to save .
3.4. Mosaicking Tools: Mosaic Dataset Creation Scripts
When discussing mosaic dataset creation scripts (MDCS), it is important to consider that a mosaic dataset is a dataset which allows you to store collections of raster/image data, essentially a matrix of cells and pixels organized into a grid, with each “cell” containing a value which represents information, for example temperature. As such, these datasets allow you to manage, query, and view these data collections . MDCS enables the creation of these datasets by automatically adding the raster and dealing with all the desired parameters executed in a single step. This saves the user a lot of time and makes the creation much simpler via an efficient and consistent automation, as well as ensuring best practices are used .
Since these workflows are parameterized, MDCS is also very useful for documentation purposes, particularly for quality assurance and control. The main use for MDCS is to avoid having to create each dataset one at a time. MDSC is compatible with ArcMap, a user interface from which MDCS can be called, which is developed to make the UI process much simpler. It does, however, require knowledge of XML files and a base understanding of the creation and usage of mosaic datasets. The reason this automation is so important, providing the user has this foundation of understanding, is because manual creation is prone to error. MDSC provides a convenient solution to make the process of creating mosaic rasters repeatable and sufficiently documented .
The OptimizeRasters Toolbox was first used to standardize and compress the Rose Bowl and Catalina Science Center datasets. The “Input Path” and “Output Path” parameters are the user’s opportunity to add either local or cloud-based sources. See Figure 3 below for the various “Configuration File” options. In this study, both “Imagery to MRF_JPEG” and “Imagery to MRF_LERC” were used to compare the results of the two options. The MRF_LERC files were then further compressed through the MDCS windows batch file.
ICT was upgrading their infrastructure at the time of this project, so there was no access to ICT’s Enterprise Server environment, Microsoft Azure or AWS S3 services. Had there been access, the compressed files would have been uploaded to an enterprise environment and more accessibly downloadable. Instead, the MRF LERC compressed data produced from MDCS was published as a service from ArcCatalog to an ArcGIS Online account (Figure 4) . In ArcGIS Online this data is accessible only among USC Spatial Sciences Institute users, but
Figure 3. Configuration templates .
Figure 4. Workflow methodology used for this project.
could be shared with a broader audience if desired through permissions . This change from Azure/AWS S3 to ArcGIS Online provided the opportunity to assess the viability of ArcGIS Online as a web hosting proxy.
Through the use of OptimizeRasters and MDCS, we were able to generate raster mosaic files at a much smaller file size than their original, in theory making them ideal for delivery via cloud systems such as AWS S3 and Microsoft Azure . Reduction of file size was key, as although the cloud systems have very impressive speeds; the download of data to physical workstations or mobile devices is limited by the bandwidth of the end user. While MDCS does compress the data well, if the client requires just one specific square raster as opposed to a greater viewing area (raster mosaic), then MDCS may not be ideal. In hosting to ArcGIS Online, we found our hosted file size comparable to its original size prior to being processed in OptimizeRasters and MDCS, making ArcGIS Online an insufficient alternative for hosting data (Table 1).
Due to questions about file sizes produced from JPEG and MRF_LERC compression and their ease of download, we performed both compression methods for both sets of data to discover the output file sizes. We discovered that JPEG compression, which is known to be lossy, removed tiles from the outer edges of the larger Rose Bowl dataset, though the compression size was smaller than MRF_LERC. Due to this, when running MDCS, we chose to use the MRF_LERC compressed files as our input.
4.1. Catalina Science Center
Table 1 provides information on the size of the Catalina Science Center dataset subject to the compression method used, as described previously. However, it is important to note that the file size increased over 700 MBs after the compressed files were hosted to ArcGIS Online (Figures 5-7).
Figure 5. Illustrates the OptimizeRasters tool with MRF_JPEG compression that resulted in a file size of 148 MB.
Figure 6. Illustrates the OptimizeRasters tool with MRF_LERC compression that resulted in a file size of 753 MB.
Figure 7. Illustrates the OptimizeRasters output of MRF_LERC compression wth MDCS that resulted in a file size of 1.4 MB.
Table 1. File size changes with different compression methods.
4.2. Rose Bowl
The changes in file size of the Rose Bowl dataset subject to the compression method used are shown in Table 1. These results provide two significant take-aways. Firstly, Figure 8 illustrates the known lossiness of the MRF_JPEG method. This outcome resulted in the use of the MRF_LERC method for comparison. Secondly, while the combination of MRF_LERC and MDCS greatly reduced the file size, the result was rendered useless when hosted to ArcGIS Online (Figure 9 and Figure 10).
This project was ultimately successful in developing a raster mosaic and hosting it online, though not without the identification of some key concerns. We found JPEG compression’s lossiness to be an issue with the larger Rose Bowl dataset. If ICT’s other datasets are of comparable or larger size, it is reasonable to assume that this would be a problem for those datasets as well. For this reason, we used MRF_LERC compression, and would recommend that future research attempts follow this procedure. We also identified imagery hosted in ArcGIS Online to
Figure 8. Illustrates the OptimizeRasters tool with MRF_JPEG compression that resulted in a file size of 282 MB.
Figure 9. Illustrates the OptimizeRasters tool with MRF_LERC compression that resulted in a file size of 1.4 MB.
Figure 10. Illustrates the OptimizeRasters output of MRF_LERC compression wth MDCS that resulted in a file size of 2.07 MB.
increase the compressed file size close to the original file size of the data. This project successfully identified MRF_LERC compression as the ideal file configuration, and was successful in eliminating ArcGIS Online as a hosting medium.
5.1. Cloud Environment
Due to ICT’s infrastructure update at the time of this study, we were not able to integrate their cloud infrastructure into our project via AWS or Azure. For this reason, all inputs and outputs for the OptimizeRasters geoprocessing tool were located in local folders. For projects with large datasets especially, utilizing web hosting would be beneficial. If a study such as this were to be repeated, these authors would recommend utilizing web hosting capabilities rather than ArcGIS Online.
5.2. Mosaic Dataset Creation Scripts
In ModelBuilder, OptimizeRasters can be added to the MDCS script to automate the entire two-step process. The benefit of automating both OptimizeRasters and MDCS together is of course, that it is a valuable way to save time and increase efficiency when working with large datasets. These authors attempted to automate the two processes together but were unsuccessful due to the time constraints of the project. Future research on this topic should also involve adding a script so that all batch files can be run at the same time. This way, the script can be automated to run daily, weekly, etc. Temporal automation would be particularly useful in the case of satellite imagery collection. Both of these refinements to the script are areas where we suggest future work be focused.
This project aimed to identify successful file compression and hosting methods for imagery data. By utilizing the OptimizeRasters toolbox and MDCS, development of the raster mosaic and online hosting were ultimately successful. However, JPEG compression lossiness of the Rose Bowl dataset indicated that MRF_LERC compression was the ideal configuration. Additionally, hosting the imagery via ArcGIS Online was found to increase the compressed file size close to the original file size of the data, making the compression obsolete. Future testing should consider usage of an enterprise cloud environment to avoid this issue.
Beyond the scope of ICT’s One World Terrain, file compression and hosting have other Human Security applications. Any organization with large amounts of imagery data, require an effective way to store their data and share it within their organization, particularly in times of crisis when data needs to be quickly accessible. Humanitarian, defense, and environmental entities face data emergencies during natural and human imposed disasters. During these emergencies, geospatial personnel need access to imagery data, downloaded to their devices quickly and efficiently so that reputable decisions can be made with this information.
This project is the result of research done for a MS degree at the University of Southern California focused on Human Security and Geospatial Intelligence. The authors would like to thank USC for their training in geospatial intelligence, GIS, database management, and remote sensing. We likewise thank our advisors (Col. retired) Dr. Steven Fleming and Dr. Jason Knowles and acknowledge their guidance and review of this project. We would also like to acknowledge the Institute for Creative Technology for providing us with the data and framework necessary for completing this project.
 Xu, H. and Becker, P. (2012) ArcGIS Data Models for Managing and Processing Imagery. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XXXIX-B4.
 Huang, W., Wang, C. and Tang, D. (2018) Design and Applications of Rapid Image Tile Producing Software Based on Mosaic Dataset. The International Archives of the Photo-Grammetry, Remote Sensing, and Spatial Information Sciences, XLII-3.