GIS System Performance Optimization Strategy

I. Hardware Configuration

  • Memory

    Memory should generally be at least 1 GB, and more memory generally means better performance, but it must be configured and managed sensibly for that capacity to translate into real gains.
    One scheme that fully exploits large memory is an in-memory database, which can improve speed by roughly two orders of magnitude, i.e. about 100 times.
    The two key levers for performance optimization are reducing disk reads and writes (I/O) and reducing network traffic. Hard disk speed has grown far more slowly than memory and CPU speed, so keeping data in memory is an effective solution.
    A 32-bit operating system can theoretically address at most 4 GB of RAM, and in practice only about 3.7 GB or less (an untuned general configuration may use as little as 1.7 GB).
    For servers with more than 4 GB of RAM, it is recommended to install a 64-bit operating system and 64-bit Oracle.
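    The 4 GB ceiling mentioned above follows directly from 32-bit addressing; a quick check of the arithmetic:

```python
# A 32-bit pointer can distinguish 2**32 byte addresses, which caps the
# directly addressable memory at 4 GiB regardless of installed RAM.
addressable = 2**32            # bytes
print(addressable / 2**30)     # → 4.0 (GiB)
# The OS reserves part of this range for devices and the kernel, which
# is why applications in practice see roughly 3.7 GB or less.
```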

  • CPU

    It is recommended to use servers with multi-core CPUs and multi-socket motherboards whenever possible; they will generally outperform servers with a single-core CPU.

  • Hard drive

    The hard disk remains a bottleneck for performance. Comparing, say, a 7,200 RPM IDE drive with a 10,000 RPM SCSI drive: although the rotational speed differs by only tens of percent, overall performance can differ by a factor of 2 to 3.

    RAID (Redundant Array of Inexpensive Disks), i.e. disk array technology, is adopted by many servers today and brings advantages such as high availability, data safety, and better performance. However, RAID must also be configured appropriately, or it will not deliver the expected results.

    RAID 0, RAID 1, and RAID 5 are currently the most widely used levels. RAID can also be implemented in software or hardware; hardware RAID is more expensive but performs better than software RAID. See the related materials to learn more about the characteristics and use of disk array technology.

  • Network

    The network is another bottleneck that affects performance, yet it is the most cost-effective link to improve and also the most easily overlooked. The nature of GIS data puts more pressure on the network than ordinary data does, so a relatively high investment in the network is well worthwhile.

  • Choice of Brand and Compatible Machines

    The advantages of well-known brand-name machines are good stability and good after-sales service: when a problem arises, the manufacturer can help solve it. However, brand-name machines cost more, especially in the server segment, where they can be more than twice the price of compatible machines.
    Compatible machines are relatively cheap and perform well; with an optimized configuration they often outperform brand-name machines. Their disadvantage is that stability cannot be guaranteed, and problems are harder to resolve when they occur.
    Choosing between the two requires weighing all factors. For large projects funded by government departments, brand-name machines are generally recommended; for small projects with a limited budget, compatible machines are recommended.

II. Selection of Operating System and Database

  • Operating System

    This section weighs the three major operating systems: Windows, Linux, and Solaris.
    Windows' advantage is ease of use: it does not require a very senior system administrator. However, its stability is relatively poor, its performance falls well short of Linux and Unix, and in terms of security it has many vulnerabilities and backdoors of its own, making it an easy target for viruses and hackers.
    Linux's advantages are that it is open source (saving project costs), its performance is excellent, its security is much higher than Windows', and its stability is good; an ordinary Linux installation on a compatible machine can often rival a brand-name machine costing hundreds of thousands. Linux is still slightly behind Windows in ease of use, although it has improved greatly in recent years; its increasingly polished graphical interfaces, however, cost some performance. A system with only the Linux kernel and no graphical interface, operated from the command line, achieves the best performance. On the other hand, because most Linux distributions are free, there is no well-known large vendor behind them guaranteeing technical support, so problems can be very difficult to resolve.
    Solaris offers stability, compatibility, and performance, and integrates best with matching brand-name hardware, as with IBM Power and AIX, or Sun and Solaris. Such systems, however, depend heavily on the administrator's individual skill. The combination of Solaris and brand-name servers suits large government and enterprise projects; Windows Server can also be considered for small and medium projects with limited budgets.
    In our tests, single-client access performed better on Windows than on a default-configured Linux, probably because Linux's defaults are not optimal: unnecessary services and the graphical interface consume resources. It follows that a stripped-down system with only the Linux kernel and no graphical interface should outperform Windows. Under concurrent access, Linux's advantage is not obvious below about 20 concurrent users; it gradually emerges above about 50.

  • Database Software

    The currently popular database software packages are Microsoft SQL Server and Oracle. Most users are more familiar with the former because it is easy to use, but in the long run we recommend Oracle: once you master Oracle-related technology, it can be applied across a much wider range of environments.

III. Server Optimization

The following are several principles of server optimization for reference:

  • Install only required options
  • Turn off the interface beautification option

    For example, the Windows operating system has a Performance Options page in the System Properties; generally, select performance priority rather than display priority. If the application does not need 3D graphics, you can set Hardware Acceleration to None under Display Properties > Settings > Advanced > Troubleshoot.

  • Set background service priority

    In System Properties > Performance Options > Advanced, set Processor Scheduling to prioritize background services.

  • Close unnecessary services and ports

    Check all the services and ports in your operating system. Stop or close some of the services and ports you do not need.
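    As a starting point for the audit above, a small sketch (host and port list are purely illustrative) that probes which TCP ports are accepting connections locally:

```python
import socket

def open_ports(host="127.0.0.1", ports=(21, 22, 25, 80, 135, 443, 445, 1521, 3389)):
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(0.2)                    # keep the scan quick
            if s.connect_ex((host, port)) == 0:  # 0 means the connect succeeded
                found.append(port)
    return found

print(open_ports())   # any port listed here backs a service worth reviewing
```

    Each reported port corresponds to a listening service; stop or firewall the ones the server does not need.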

  • Keep the database server separate from the IS server

    In most cases, optimizing the database server conflicts with optimizing the IS server: running both on one machine causes resource contention, and the IS server consumes a lot of memory. It is also worth noting that the two servers should not be split across two subnets, i.e. avoid placing a router between them.

  • Other advanced optimizations

    Beyond the suggestions above, there are many advanced server optimization methods, which require experience and real-world cases to apply well. In short, internalize the idea that a server is not production-ready in its default configuration.

  • About Antivirus Software

    Some antivirus software consumes substantial resources while scanning. Avoid installing antivirus software on the server if possible; with measures such as closing unnecessary services and ports in place, viruses generally will not infect it. If you must install antivirus software under Windows Server 2003, schedule scans for idle periods. Under Solaris, AIX, or Linux, it is better not to install antivirus software at all.

IV. Optimization of Database

  • Install only required packages
  • Select only the required options when creating a library
  • Allocate appropriate memory
  • Set reasonable parameters
  • Distribute data files sensibly

    Place the most heavily accessed table spaces on the disk with the most free space and the best performance.

  • Use the RAC (Real Application Cluster) option if necessary

    Early "dual hot standby" setups were built to guard against outages, but the standby node mostly sat idle, wasting capacity. RAC appeared later: it is essentially a shared-disk architecture, closely related to GRID (grid) computing. Disks and machine nodes are connected by optical fiber, every node is active, and cache can be shared across nodes. Its great advantage is high availability: when a node fails, clients hardly notice the disconnection from the server, and processing is truly parallel. RAC is recommended for enterprises, although it is more complex to install and manage.
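    For the "allocate appropriate memory" item above, a hedged starting-point calculation (the 50% split is a common rule of thumb for dedicated database servers, not an Oracle-documented formula; always confirm against monitoring):

```python
def suggest_db_memory_gb(ram_gb, db_fraction=0.5):
    """Split physical RAM between the database and the OS/file cache.

    db_fraction=0.5 is only a conventional starting point for a
    dedicated database server; tune it from observed behaviour.
    """
    db_gb = ram_gb * db_fraction
    return {"database": db_gb, "os_and_cache": ram_gb - db_gb}

print(suggest_db_memory_gb(16))   # → {'database': 8.0, 'os_and_cache': 8.0}
```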

V. Optimization of Vector Data

  • Coding Technology

    Data encoding is a form of compression, similar to ZIP, RAR, etc. Reducing the data volume can greatly improve the efficiency of disk reads/writes and network transmission, and hence performance. If a Dataset was created without an encoding, one can be specified when copying it (Copy Dataset). A Vector Dataset supports multiple encode types; for details, see the constant seEncodedType in the Online Help of SuperMap Objects. The two typical ones are SWC and SDC.
    SWC (WORD encoding) applies to line and region Vector Datasets and has no effect on Point Datasets. Its compression ratio is 4:1 and its precision loss is 1/2^16 of the coordinate range. This encoding is recommended when the objects in the Dataset are of average size.
    SDC (DWORD encoding) likewise applies to line and region Vector Datasets and has no effect on Point Datasets. Its compression ratio is 2:1 and its precision loss is 1/2^32 of the coordinate range, which at global extents works out to roughly millimeter scale. In summary: compression is fast and loss is tiny; browsing the original map and the encoded map side by side, the difference is essentially invisible. SDC is thus almost lossless, and is recommended for general data.
    For region data that are spatially adjacent and share edges, SDC encoding causes no problems at all, whereas SWC produces visible gaps at very high magnification. For vector data scattered in space, such as real-estate parcels, SWC is fine and the accuracy loss is negligible.
    In principle, compressed data occupies less space and is faster to query and browse.
    The costs of compression: encoding is irreversible, i.e. the data cannot be restored to its pre-compression state, and there is some loss of precision, which affects query and analysis. For very precise queries and Spatial Analysis, SWC introduces small errors into the results, while SDC does not: its precision loss stays within the tolerance of Spatial Analysis.
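    The precision figures above come from snapping each coordinate to one of 2^16 (WORD) or 2^32 (DWORD) evenly spaced steps across the dataset bounds; a quick illustration (the coordinate and bounds are made up):

```python
def quantize(x, lo, hi, bits):
    """Snap x to the nearest of 2**bits evenly spaced values in [lo, hi]."""
    step = (hi - lo) / (2**bits - 1)
    return lo + round((x - lo) / step) * step

x, lo, hi = 123.456789, 0.0, 1000.0     # illustrative coordinate and bounds
for bits in (16, 32):
    err = abs(quantize(x, lo, hi, bits) - x)
    print(bits, err <= (hi - lo) / 2**bits)   # error stays under range/2**bits
```

    With 32-bit steps the error over a 1,000-unit extent is on the order of 10^-7 units, which is why SDC reads as effectively lossless.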

  • Spatial Index

    At present, there are mainly four types of Spatial indexes: R-tree, quadtree, Dynamic Index and Mapsheet Index. In practical application, the appropriate index type should be selected according to the specific situation and the characteristics of various indexes.
    The characteristics and shortcomings of each index type, in brief:

    • R-tree Index
      Advantages: Generally the highest query performance (although for some specially organized data, a Mapsheet Index plus a local cache is faster still).
      Disadvantages: Poor concurrency support, so it is generally used only for exclusive File Databases, not for database-type Datasources that are frequently edited concurrently; maintenance cost is high and index creation is slow.
      In short, R-tree has a clear advantage for static data.
    • Quadtree Index
      Advantages: Better concurrency support and faster index creation.
      Disadvantages: Query performance is lower than R-tree.
    • Dynamic Index
      Advantages: Good concurrency support, fast index creation, and support for large data volumes. The size of each grid level can be customized and is not limited by the Dataset Bounds. It is suitable for database-type Datasources; with large, read-only data its performance is slightly lower than the Mapsheet Index.
    • Mapsheet Index
      Presented separately in the next subsection.
  • Mapsheet Index

    For some applications, the Mapsheet Index is the fastest of all. It suits regularly divided data best, such as 1:250,000 or 1:100,000 national Basic Scale topographic maps numbered by standard map sheets, but it also brings a performance improvement for ordinary data.
    The Mapsheet Index is for database Datasources. A Mapsheet Index is created for a Vector Dataset based either on a range or on a field: use a field that records which sheet each object belongs to, or, when creating by range, choose the range so that the number of records per block is as even as possible (for example, 2,000 to 20,000 objects per block) and neither too large nor too small. A generally recommended grid is 30 × 30, i.e. 900 cells, which suits a Dataset of about 100,000 records; adjust the width and height of the range for different data volumes.
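    The grid-sizing guideline can be turned into a small helper; the target-per-cell value below is illustrative:

```python
import math

def grid_dims(total_records, target_per_cell):
    """Suggest a square mapsheet grid so each cell holds ~target_per_cell records."""
    cells = max(1, total_records // target_per_cell)
    side = max(1, round(math.sqrt(cells)))
    return side, side

print(grid_dims(100_000, 111))   # → (30, 30), i.e. 900 cells
```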

  • Cache

    For static data, a Mapsheet Index combined with a local file cache is the fastest and most effective option. The cache can be enabled through the Dataset Properties, and the location of the local cache can be specified or modified through the configuration file SuperMap.xml under the Bin directory.
    After a Mapsheet Index is built on a large Dataset, the first full-extent browse caches all server-side data locally; this first pass is slightly slower (though still faster than ArcSDE), but once the local cache exists, subsequent browsing and panning operations are noticeably faster.

VI. Optimization of Raster Data

  • Stitching of Framed Images

    Satellite Remote Sensing Imagery is generally acquired scene by scene. It is better to mosaic these scenes before working with them, which lets the Image Pyramid and data encoding work to full effect.

  • Encoding

    DCT (Discrete Cosine Transform) is the most suitable encode type for Remote Sensing Imagery, with a compression ratio of about 20:1. Other encode types, such as SPC, SGL, and LZW, target Grid or DEM datasets, but the first two are no longer maintained. LZW is lossless, and its compression ratio depends on the specific data.
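    To illustrate the idea behind DCT encoding (a toy sketch, not SuperMap's actual codec): transform a smooth signal, keep only the largest coefficients, and invert. Smooth imagery concentrates its energy in a few coefficients, which is what makes high compression ratios possible:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix: rows are frequencies, columns are samples."""
    k = np.arange(n)
    m = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0] /= np.sqrt(2)
    return m * np.sqrt(2.0 / n)

n = 64
row = np.sin(np.linspace(0, 3 * np.pi, n))   # stand-in for a smooth image row
D = dct_matrix(n)
coeffs = D @ row
coeffs[np.argsort(np.abs(coeffs))[: n - 8]] = 0.0   # drop all but 8 coefficients
restored = D.T @ coeffs                              # orthonormal => inverse is D.T
print(float(np.max(np.abs(restored - row))))         # small reconstruction error
```

    Dropping 56 of 64 coefficients still reconstructs the smooth row closely; the retained-coefficient budget is what a real codec tunes to trade size against loss.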

  • Image Pyramid

    For Image Data, create a series of layers with gradually reduced resolution; when displaying a given extent at a given scale, the closest layer of images is called automatically, improving overall browsing speed. For images with a large amount of data, building a pyramid is essential.
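    A minimal sketch of the pyramid idea, halving resolution by averaging 2×2 blocks (the stopping size is arbitrary):

```python
import numpy as np

def build_pyramid(img, min_side=64):
    """Return [full-res, half-res, quarter-res, ...] down to min_side pixels."""
    levels = [img]
    while min(levels[-1].shape) // 2 >= min_side:
        a = levels[-1]
        h, w = (a.shape[0] // 2) * 2, (a.shape[1] // 2) * 2   # trim odd edges
        a = a[:h, :w]
        levels.append((a[0::2, 0::2] + a[1::2, 0::2] +
                       a[0::2, 1::2] + a[1::2, 1::2]) / 4.0)
    return levels

pyr = build_pyramid(np.zeros((1024, 1024)))
print([lvl.shape for lvl in pyr])   # → [(1024, 1024), (512, 512), (256, 256), (128, 128), (64, 64)]
```

    The display engine then picks whichever level is closest to the requested scale, instead of resampling the full-resolution image every time.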

  • SIT

    (This data format is obsolete, and as of the SuperMap iDesktopX 11i (2023) release, import and export of this type of data is no longer supported)

    It stands for SuperMap Image Tower and is by far the fastest Image File format available; ECW and MrSID compressed images display more than ten times slower than SIT.
    SIT uses DCT encoding internally, which gives a compression ratio of about 20:1 before any pyramid is added; that is, the data volume shrinks to roughly 1/20 of the original. Since a pyramid is usually built for images and occupies some space of its own, the overall ratio after compressing an Image Dataset into SIT is often somewhat less than 20:1. A Dataset exported to SIT can be imported again, but because DCT is lossy, the re-imported data is not exactly the same as the Original Image.

VII. Optimization of Drawing

  • Set reasonable visible scales

    When a map is displayed at a given scale, not every layer needs to be drawn. Setting a sensible maximum and minimum visible scale for each layer, according to actual needs, greatly improves map browsing speed.
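    The visible-scale rule can be sketched as a simple filter; layer names and thresholds below are illustrative:

```python
def visible_layers(layers, scale):
    """Return the layers whose [min_scale, max_scale] window contains scale."""
    return [name for name, (smin, smax) in layers.items() if smin <= scale <= smax]

layers = {
    "buildings": (1 / 5_000, 1 / 500),           # only drawn at large scales
    "roads":     (1 / 250_000, 1 / 500),
    "countries": (1 / 50_000_000, 1 / 1_000_000),
}
print(visible_layers(layers, 1 / 10_000))        # → ['roads']
```

    At 1:10,000 only the road layer is drawn; buildings wait for a larger scale, and country outlines appear only at small scales.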

  • Simplify data

    For example, resample the data, or, for data displayed over a relatively large area, use a small-scale version instead of a large-scale one.

  • Use simple line styles where possible

    The line style setting also affects display efficiency. Unless a complex style is necessary, use a simple one.

  • Minimize the number of datasets contained in a map

    If the Dataset behind a layer no longer exists, remove the layer; it is also advisable to remove layers that are never displayed. Even empty layers consume cursor resources.

  • Turn off snappable and selectable options for layers that are not edited

    By default, a layer's snappable and selectable options are on, which consumes resources whenever the mouse moves over the map. It is recommended to turn these options off for layers that do not need editing.

  • Use a fixed-scale map cache

    If the map is to be published with iServer, first publish a map cache from iDesktop; this speeds up map browsing for iServer clients.

VIII. Business optimization

  • Index business fields

    For fields used in the WHERE condition or grouping clauses of query statements, it is recommended to create a field index.

  • Use the database's grouping and join functions when computing statistics
  • Use database performance monitoring and analysis tools
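    The indexing and in-database aggregation points above can be sketched with sqlite3 standing in for Oracle or SQL Server; table and field names are illustrative. Index the business field used in WHERE/GROUP BY, verify the planner uses it, and push the aggregation into the database instead of grouping rows in application code:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (id INTEGER PRIMARY KEY, district TEXT, area REAL)")
conn.executemany("INSERT INTO parcels (district, area) VALUES (?, ?)",
                 [("east", 10.0), ("east", 20.0), ("west", 5.0)])

# Index the field that the WHERE condition and GROUP BY clause will use.
conn.execute("CREATE INDEX idx_parcels_district ON parcels (district)")

# Confirm the planner actually uses the index for the filter.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM parcels WHERE district = 'east'").fetchall()
print(plan)

# Aggregate inside the database rather than in client code.
totals = conn.execute(
    "SELECT district, SUM(area) FROM parcels GROUP BY district").fetchall()
print(sorted(totals))   # → [('east', 30.0), ('west', 5.0)]
```

    The same pattern carries over to Oracle or SQL Server: one round trip returning grouped totals, instead of shipping every row to the client.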