Performance Tips

 


Manifold is designed for an era where processor cores, disk drives and memory are cheap but human time is expensive.    

 

That's why a Manifold .map file will generally be significantly larger than the combined size of the files that have been imported into it:  by pre-computing essential structures and storing data for speed rather than compactness, the larger .map file provides significantly faster performance and saves us a lot of time.

Systems

 

Recent versions of Windows such as Windows 10 are much faster and more stable than earlier versions of Windows.  Not only is Windows itself faster, but the accessory subsystems used by Manifold, such as DirectX rendering, are also much faster than their predecessors.

 

Modern Windows editions provide many power saving options, even on desktop machines that are plugged in all the time.  Surprisingly, the default choice might be a Balanced plan that reduces performance to reduce energy consumption.   By running the processor slower or by turning off disk drives a power plan can significantly reduce energy consumption, but at the cost of significantly reduced computer performance.

 

For maximum performance with Manifold, make sure your power plan is set to High performance.  Ensure that in whatever power plan settings have been specified, all disk drives that you will be using are never turned off, since the time to turn on a disk drive and spin it up for full readiness can result in dramatically slower operation.   External, portable drives, for example, are often turned off to save power after a period of inactivity.
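
The power plan advice above can be checked from a command prompt with the Windows powercfg tool.  The following is a minimal sketch in Python, not part of Manifold: it parses the output of powercfg /getactivescheme and compares the plan GUID against the GUID of the built-in High performance plan.  The function names are illustrative, and machines with custom or vendor-supplied plans will report other GUIDs, so a mismatch is only a hint to look at the power settings.

```python
import re
import subprocess
import sys
from typing import Optional

# GUID of the built-in Windows "High performance" plan. Custom or
# vendor-supplied plans use other GUIDs, so a mismatch is only a hint.
HIGH_PERFORMANCE_GUID = "8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c"

GUID_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"
)


def active_scheme_guid(powercfg_output: str) -> Optional[str]:
    """Extract the plan GUID from `powercfg /getactivescheme` output."""
    match = GUID_PATTERN.search(powercfg_output)
    return match.group(0).lower() if match else None


def is_high_performance(powercfg_output: str) -> bool:
    """Return True if the reported plan is the built-in High performance plan."""
    return active_scheme_guid(powercfg_output) == HIGH_PERFORMANCE_GUID


if __name__ == "__main__" and sys.platform == "win32":
    # Only attempt to run powercfg on Windows, where the tool exists.
    result = subprocess.run(
        ["powercfg", "/getactivescheme"],
        capture_output=True, text=True, check=True,
    )
    if is_high_performance(result.stdout):
        print("Active plan: High performance")
    else:
        print("Active plan is not the built-in High performance plan:")
        print(result.stdout.strip())
```

Comparing GUIDs rather than plan names keeps the check working on non-English editions of Windows, where the plan name is localized.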

 

Virus checkers and other programs running in background can be performance killers.  Learn how to prevent them from checking and re-checking the work you do in Manifold.

 

Manifold utilizes NVIDIA GPUs for GPGPU processing to achieve much better performance through massively parallel computation, where such computation makes sense.  When a GPU makes a difference, almost any NVIDIA GPU delivers big results.   Therefore, all systems for Manifold should have at least one GPGPU-capable NVIDIA GPU in them.  Even the cheapest GPGPU-capable card is better than none.

 

Modern GPUs are so fast, however, that even with Manifold parallelism only very computationally intensive, large jobs will be limited by GPU performance.   If we purchase a "good value" GPU card instead of the most expensive, latest and hottest GPU card we usually will get the best price/performance ratio.   If funds are unlimited then, of course, we can spend big and buy whatever is the latest, hottest GPU card or, better still, buy as many of them as will fit into our computer.   But in most cases having one or two relatively inexpensive but also relatively recent GPU cards will provide performance that is not noticeably slower.   

Memory

 

Although Manifold has remarkably good performance in systems which are under-equipped with memory, the system can do better if we have installed lots of cheap memory in our computer.   Think big: install the maximum memory that your system can host.  Memory is cheap: your time is expensive.  Burn memory, not time.

 

If we have lots of memory in our system Manifold can minimize the need to utilize much slower disk drive storage.   Disk will still be used: there is no getting around the need to get data off disk and into memory, and no matter how fast Manifold may be Windows will impose a limit on how fast disk drives can perform.  Therefore it pays to use faster disk drives and to use solid state disk, SSD, drives for maximum speed.  More memory is more important than faster disk drives, so it is better to spend a given budget on increasing RAM from 32 GB to 128 GB than on replacing larger, reasonably fast disk drives with a smaller but faster SSD.

 

Having more memory is also usually a higher priority than buying the fastest possible CPU.   More CPU cores are usually better than faster, but fewer, cores: in many tasks a CPU with many cores that run at average speed outperforms a CPU with fewer cores that run faster.  The classic example is installing an inexpensive 8 or 10 core processor that ends up delivering better performance than an expensive 4 core processor.

 

It should go without saying that we should be working with 64-bit Windows on a 64-bit computer.   Manifold can be surprisingly quick even within the limitations of 32-bit Windows but in modern times we should try to avoid crippling our productivity by utilizing 32-bit architectures that became obsolete almost 20 years ago.

The User

 

The user is usually the key factor in performance.  The greatest gains in performance are usually achieved by using a better method or algorithm, not by throwing money at faster hardware. More often than not the sole factor in whether a better method is used is the expertise and clarity of mind that can be mustered by the user. A healthy, well-rested, expert user is the best performance accelerator around.

Threads


Tech Tip: In queries, use THREADS to make full use of the available processor cores.  This is automatic in pre-built templates in the Transform panel, but it is something we must add when writing queries from scratch.

 

The THREADS command takes a value for the number of threads to use.  For example, if we know we have six CPU cores but we only want to use four threads we could write...

 

THREADS 4

 

To automatically use however many CPUs we have available we can add...

 

THREADS SystemCpuCount() BATCH 1

 

...to the end of a query.  Doing so tells Manifold to check how many CPUs are available, the result of SystemCpuCount(), and to use that many threads.   The result can be dramatic: a query may run several times faster than it would without launching multiple threads.
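
When query text is generated by a script rather than typed by hand, the same clause can be appended programmatically.  The following is a minimal sketch in Python, not a Manifold API: the helper name with_threads and the [Parcels] table are hypothetical, and the only Manifold syntax used is the THREADS, SystemCpuCount() and BATCH text shown above.  It simply places the clause at the end of the statement text, ahead of any trailing semicolon.

```python
from typing import Optional


def with_threads(query: str,
                 threads: Optional[int] = None,
                 batch: Optional[int] = None) -> str:
    """Append a THREADS clause to the end of a query string.

    threads=None emits THREADS SystemCpuCount(), so that Manifold decides
    the thread count when the query runs; an integer emits a fixed count.
    """
    if threads is not None and threads < 1:
        raise ValueError("threads must be at least 1")
    if batch is not None and batch < 1:
        raise ValueError("batch must be at least 1")

    body = query.rstrip()
    has_semicolon = body.endswith(";")
    if has_semicolon:
        # Keep the clause inside the statement, before the terminator.
        body = body[:-1].rstrip()

    clause = "THREADS SystemCpuCount()" if threads is None else f"THREADS {threads}"
    if batch is not None:
        clause += f" BATCH {batch}"

    return f"{body} {clause}" + (";" if has_semicolon else "")


if __name__ == "__main__":
    # [Parcels] is a hypothetical table name used only for illustration.
    print(with_threads("SELECT * FROM [Parcels];", batch=1))
    print(with_threads("SELECT * FROM [Parcels];", threads=4))
```

The helper only edits text; whether a given query benefits from multiple threads still depends on the work the query does.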

Must Have Free Space on Disk

It should go without saying that we need enough free space on disk to work with data of the desired size.  We cannot work with 100 GB of data if we have only 20 GB of free space on disk.  That much is obvious, but there are nuances that are easy to forget.    For example, we may use a smaller, faster SSD drive for TEMP folder and pagefile storage.   In such cases it is easy to forget that while we may have terabytes of free space on our primary disk, we might have only a few gigabytes of free space on the SSD drive we use for TEMP storage.   Both Windows and Manifold make heavy use of TEMP storage and, potentially, pagefile storage, so if there is not enough space on the SSD drive we might not have the free disk space to work with a project of the desired size.

 

As a general rule of thumb, many Manifold users like to have free space in TEMP storage equal to three times the maximum size of the project.    That is larger than necessary for simple operations but may come into play with more complicated tasks.

 

Another nuance is forgetting that compressed file formats can result in much larger data sizes when the data is decompressed into working form.   Classic examples are image and LiDAR storage formats, which often store data in compressed form.   Compressed formats may be compact, but they are very slow, and in any event the data will be decompressed into its true, binary form for use within Manifold.  Compression factors of 10 or 20 are common, so a 10 GB file in compressed form will decompress into 100 GB or 200 GB of real data.    Multiply by three and we need 300 GB to 600 GB of free space, which we might not have available on disk.
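
The arithmetic above is easy to script as a sanity check before starting a large import.  The following is a minimal sketch in Python, not part of Manifold: it multiplies the file size by an assumed compression factor and by the three-times rule of thumb, then compares the result with the free space on the drive that holds the TEMP folder.  The function names and the default factors are illustrative assumptions, and GB here means 1024^3 bytes.

```python
import shutil
import tempfile

GB = 1024 ** 3  # binary gigabyte, an assumption made for this sketch


def required_temp_bytes(file_bytes: int,
                        compression_factor: float = 1.0,
                        safety_multiple: float = 3.0) -> int:
    """Estimate the free TEMP space wanted for a file of the given size.

    compression_factor is how much the data grows when decompressed
    (1.0 for uncompressed data); safety_multiple is the rule-of-thumb
    factor of three from the text above.
    """
    return int(file_bytes * compression_factor * safety_multiple)


def has_enough_temp_space(file_bytes: int,
                          free_bytes: int,
                          compression_factor: float = 1.0,
                          safety_multiple: float = 3.0) -> bool:
    """Return True if free_bytes covers the estimated requirement."""
    needed = required_temp_bytes(file_bytes, compression_factor, safety_multiple)
    return free_bytes >= needed


if __name__ == "__main__":
    # Free space on the drive that holds the current TEMP folder.
    free = shutil.disk_usage(tempfile.gettempdir()).free
    # Example from the text: a 10 GB file with a compression factor of 20.
    needed = required_temp_bytes(10 * GB, compression_factor=20)
    print(f"TEMP free space: {free / GB:.1f} GB")
    print(f"Estimated need:  {needed / GB:.1f} GB")
    print("Enough space" if free >= needed else "Not enough space")
```

The compression factor has to be estimated by the user, since it depends on the format and the data; the sketch only does the multiplication and the comparison.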

 

The strategy to deal with such issues is simple and inexpensive:  disk drives providing many terabytes of storage are now cheap.   Use larger disk drives so that several terabytes of free space are always available, even when working with very large data.    Free space on disk drives is absurdly inexpensive compared to the cost of your time.   Spend disk drive space, not time.

Performance and File Formats

Manifold can seem to work miracles even with slow formats.  For example, Manifold .MAPCACHE technology allows Manifold to link large ESRI shapefiles or MapInfo MIF files into a project and to open and render them even faster than ESRI's own ArcGIS shapefile rendering software or MapInfo's own software.  At times Manifold can even render and edit large linked shapefiles or MIF files with speed approaching that of Manifold's own .map format.

 

In most cases the fastest possible setup is to import data into a project and to work with the .map project file in local storage, such as fast, local RAID hard disk storage or a fast, local SSD.   Manifold working with Manifold .map format usually runs faster than disks can serve data, so in the case of projects involving lots of data the faster the disk storage, the better.  Even better is to have lots of cheap main memory so the system does not have to touch disk drives as often.

 

Manifold can also do read/write work with many data formats without importing the data, working with it while it is linked from, and still resides in, the original file format.  That's great for casual use like editing some data "in place" in the original file, but it may not be as fast as native Manifold .map storage.

 

The convenience of editing an older format "in place" may make the limitations imposed by that format tolerable for small projects; however, as a general rule of thumb it is a better idea to import the data into the Manifold project, save it in a Manifold .map file and then, if need be, export it back out to the older format.   That's true even though importing the data into Manifold, and later exporting it back to the original format, both take time.

 

Better still is to import the data sets that we may frequently use into Manifold .map format and then work exclusively with .map format as much as possible, exporting to other formats only when necessary to send data to someone who does not have Manifold.

 

See Also

Getting Started

 

How to Edit a Single File

 

The MAPCACHE File