Compute each pixel value of all the images in the input stack for generate the composed of the median, maximum value, pixels valid, last pixel valid and others.
The StackComposed compute the stack composed of a specific statistic of band values for several time series of georeferenced data (such as Landsat images), even if these are in different scenes or tiles. The result is a output of statistic compute for all valid pixels values across the time axis (z-axis), in the wrapper extent for all input data in parallels process.
The main aim of this app are:
Improve the velocity of compute the stack composed
Compute several statistics in the stack composed easily.
Compute a stack composed for data in different position/scenes using a wrapper extent.
Include the overlapping areas for compute the statistics, e.g. two adjacent scenes with overlapping areas.
Compute some statistics that depend of time data order (such as last valid pixel, pearson correlation) using the filename for parse metadata (for now only for Landsat images)
The general process flow is:
Read all input data (but not load the raster in memory)
Calculate the wrapper extent for all input data
Position each data in the wrapper extent (the app does not exactly do this, use a location for extract the chunk in the right position in wrapper, this is only for understand the process)
Make the calculation of the statistic in parallel process by chunks
Save result with the same projection with the wrapper extent
The wrapper extent is the minimum extent that cover all input images, in this example there are 3 scenes of the images with different position, the wrapper extent is shown in dotted line:
The wrapper extent is the size for the result.
With the wrapper extent then the images are located in a right position in it and put all images in a stack for process, the images are ordered across the time like a cube or a 3D matrix. When compute a statistic, it process all pixel for the wrapper extent, first extract all pixel values in all images in their corresponding position across the z-axis, for some images this position don’t have data, then it return a NaN value that is not included for the statistic.
There are mainly two problems for serial process (no parallel):
For solved it, the StackComposed divide the data cube in equal chunks, each chunks are processes in parallel depends of the number of process assigned. When one chunk is being process, it load only the chunk part for all images and not load the entire image for do it, with this the StackComposed only required a ram memory enough only for the sizes and the number of chunks that are currently being processed in parallel.
This is a mini guide step by step for use the StackComposed
There are some recommendation for the data input for process, all input images need:
For the moment, the image formats support are: tif
, img
and ENVI
(hdr)
StackComposed
takes some command-line options:
stack-composed -stat STAT -preproc PREPROC -bands BANDS [-p P] [-chunks CHUNKS] [-start DATE] [-end DATE] [-o OUTPUT]
[-ot dtype] inputs
-stat
STAT (required)
extract_NN
: extract from the inputs the value NN, any other value will be ignored, overlapped values NN
remain NN, for example, for to extract the value 2 put “extract_2”median
: compute the medianmean
: compute the arithmetic meangmean
: compute the geometric mean, that is the n-th root of (x1 * x2 * … * xn)max
: compute the maximum valuemin
: compute the minimum valuestd
: compute the standard deviationvalid_pixels
: compute the count of valid pixelslast_pixel
: return the last valid pixel base on the date of the raster image, required filename as
metadata (extra metadata)jday_last_pixel
: return the julian day of the last valid pixel base on the date of the raster image,
required filename as metadata (extra metadata)jday_median
: return the julian day of the median value base on the date of the raster image, required
filename as metadata (extra metadata)percentile_nn
: compute the percentile nn, for example, for percentile 25 put “percentile_25” (must be in the
range 0-100)trim_mean_LL_UL
: compute the truncated mean, first clean the time pixels series below to percentile LL (
lower limit) and above the percentile UL (upper limit) then compute the mean, e.g. trim_mean_25_80. This
statistic is not good for few time series datalinear_trend
: compute the linear trend (slope of the line) using least-squares method of the valid pixels
time series ordered by the date of images. The output by default is multiply by 1000 in signed integer.
required filename as metadata (extra metadata)-preproc
PREPROC (optional)
>N
>=N
<N
<=N
: conditionals, e.g. “>0” (remember, here you are defining the valid data)>A and <B
: between conditionals, e.g. “>0 and <=1000” (or
is not supported)percentile_LL_UL
: define the valid data between the percentile LL and UL values and remove outside this
limit,
for example, for define the valid data between the percentile 25 and 75 put “percentile_25_75”NN_std_devs
: define the valid data between the mean minus and plus NN standard deviations and remove
outside this limit, for example, for define the valid data between the mean minus and plus 2.5 standard
deviations put “2.5_std_devs”NN_IQR
: define the valid data between the NN IQR (interquartile range) and remove outside this limit,
for example, for define the valid data between the 1.5 IQR put “1.5_IQR”-bands
BANDS (required)
-nodata
NODATA (optional)
-p
P (optional)
-chunks
CHUNKS (optional)
-o
OUTPUT (optional)
-ot
DTYPE (optional)
-start
DATE (optional)
-end
DATE (optional)
inputs
(required)
Choosing good values for chunks can strongly impact performance. StackComposed only required a ram memory enough only for the sizes and the number of chunks that are currently being processed in parallel, therefore the chunks sizes going together with the number of process. Here are some general guidelines. The strongest guide is memory:
The size of your blocks should fit in memory.
Actually, several blocks should fit in memory at once, assuming you want multi-core
The size of the blocks should be large enough to hide scheduling overhead, which is a couple of milliseconds per task
Some statistics or arguments required extra information for each image to process. The StackComposed acquires this extra metadata using parsing of the filename. Currently support two format:
For them extract: landsat version, sensor, path, row, date and julian day.
Issues, ideas and enhancements: https://github.com/SMByC/StackComposed/issues
StackComposed was developing, designed and implemented by the Group of Forest and Carbon Monitoring System (SMByC), operated by the Institute of Hydrology, Meteorology and Environmental Studies (IDEAM) - Colombia.
Author and developer: Xavier C. Llano
Theoretical support, tester and product verification: SMByC-PDI group
Xavier C. Llano: xavier.corredor.llano@gmail.com
SMByC: smbyc (a) ideam.gov.co
StackComposed is a free/libre software and is licensed under the GNU General Public License.