Performance and accuracy of the automated measurement software: Simple Online Automated Plant Phenomics (SOAPP)
Article category: Research Note
Published online: 08 August 2025
Pages: 51 - 64
DOI: https://doi.org/10.2478/gsr-2025-0004
© 2025 Ariel M. Hughes et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.
Phenotypic analysis is an essential component for studying the growth and development of plants in a number of areas of research, including studies of tropisms and plant responses to microgravity (Vandenbrink et al., 2016; Medina et al., 2022). However, many of the traits analyzed in phenotypic studies, such as root and hypocotyl length, leaf area, secondary root count, and number of root hairs, currently rely upon measurements collected using software like ImageJ, which relies upon the user to manually designate the lengths and sizes to be measured (this method will subsequently be referred to as “manual measurement”) (Kiss et al., 2012; Shymanovich et al., 2022). New advances in phenotypic image analysis technology utilizing automated analysis software built upon the open-source OpenCV (OpenCV.org) and PlantCV (plantcv.danforthcenter.org) packages offer an opportunity for more efficient phenotypic analyses and a reduction of time spent in the data collection phase. Taking advantage of these new data analysis methodologies will be crucial for increasing the speed of data collection moving forward, as current methods are both time-consuming and susceptible to analyst bias.
Multiple automated and AI-powered measurement software projects dedicated to extracting data from images of plants have been developed, many of which are dedicated to analysis of roots from image data, such as the Root Measurement System, RootTrace, RootReader2D, IJ_Rhizo, RhizoVision Explorer, My ROOT, DynamicRoots, and GiA Roots (Ingram and Leers, 2001; French et al., 2009; Galkovskyi et al., 2012; Clark et al., 2013; Pierret et al., 2013; Symonova et al., 2015; Betegón-Putze et al., 2019; Seethepalli et al., 2021). Other automated measurement software, such as HYPOTrace, can measure other aspects of experimental plant phenotypes, such as cotyledon and hypocotyl area, hypocotyl length and bending (Wang et al., 2009).
One of the benefits of using automated measurement software to collect data is the lack of human input bias and variations in manual measurements made during the data collection phase (Jacob et al., 2011). When a plant is measured manually, deviations in measurements due to naturally occurring differences in human sight and hand-eye coordination result in data that are inherently biased in favor of what the analyst deems fit for the target of interest, producing fluctuating values across multiple measurement trials. As a result, a measurement made for an individual plant or other assay subjects will vary from person to person and can never be truly reproducible. With automated measurements, all measurements are made using image analysis algorithms and are less susceptible to the inherent differences in measurement values with manual analysis. Furthermore, these measurements are reproducible due to reliance on programmed instructions rather than human manipulation, resulting in more consistent datasets that do not deviate across multiple measurement trials. By utilizing automated measurement software for isolating different plant tissues of interest, it is possible to analyze large amounts of experimental image data quickly, and without human input bias for a multitude of phenotypic traits.
Automated and artificially intelligent measurement software is still in its infancy for use in scientific data collection applications. The greatest hurdle in developing and perfecting automated data collection software currently is determining the most effective data to train the software upon to produce results that most closely isolate the targets of interest (Sugali et al., 2021). This process can be very time consuming, but once perfected, should result in robust analysis code that can readily analyze images with both accuracy and reproducibility. However, if training data contain deviations or irregular example imagery, the resulting algorithms may produce erroneous results, eliminating the benefit of efficiency that comes with making software perform measurements on behalf of the user. Therefore, the onus is on the software developers to find and determine which training data will produce the most accurate and dependable measurement algorithm. This necessitates real-time, consecutive analysis studies to determine whether the software is producing results that align with what can be observed and measured manually to confirm the accuracy of the algorithm for isolating the items of interest.
Thus, the present study examines the ability of a new data collection software, SOAPP (Simple Online Automated Plant Phenomics), to isolate and measure hypocotyl lengths of seedlings of
SOAPP is an in-development, browser-based plant phenotypic image analysis tool which enables users to analyze images for statistics-ready color and shape phenotypes using classic phenotypic analysis techniques without writing any code. The SOAPP program establishes five critical and extensible steps in a phenotypic image analysis workflow: preprocessing, masking, setting regions of interest (ROIs), analyzing the plants, and retrieving phenotypic data. The program guides users through each step and allows them to set, save as a downloadable file, and restore all parameters in the five-step workflow. Once a workflow is defined, analysis can be run sequentially on a set of images.
SOAPP is implemented in Python using the package Streamlit (
Hypocotyl size and root length are key phenotypic traits used to assess
In the long-term, this SOAPP analysis performance study will contribute to plant growth and development projects such as genome-wide association studies (GWAS), which aim to identify the genes associated with gravitropism and phototropism in
This project utilized

A sample scan of clinostat-grown 4-day-old

A manual hypocotyl length measurement of a seedling collected using the program ImageJ. To conduct a length measurement, the user manually plots a segmented line (circled, appears as a yellow line connected by manually placed white plotting points) using the included segmented line tool (circled in red in the toolbar).
Image data for measurements came from previously completed
SOAPP was originally designed to measure leaf and cotyledon area but has since been amended to measure lengths within an image. The program was run in-browser on a Windows 10 operating system using Docker, a software that allows users to run program local source code instances (technical info:
Before analysis, each TIF image was resized, and measurements with SOAPP then followed a specific protocol such that each image was analyzed using the same pre-analysis settings. Image size was reduced by using the “Resize Image” function within the native Windows MS Photos program included with Windows operating systems in order to make the file (about 1 GB in size) more manageable for the SOAPP software. This resulted in uniform image compression (e.g., a 2000 × 1000 photo was resized to 1000 × 500). After an image was uploaded to SOAPP for analysis, a binary mask was set using an RGB (red, green, blue) colorspace and threshold value to isolate the seedlings from the background (Figure 3). On this first pre-analysis page, users can select between multiple colorspaces, along with minimum and maximum threshold values to isolate different plant tissues of interest. Within SOAPP, L, A, and B are color channels for the CIELAB (or LAB for short) colorspace, and H, S, and V are the channels for the HSV colorspace, which are multiple representations of color in graphics (

The SOAPP binary masking interface. Prior to image analysis, users must assign and adjust a binary mask to each image for removal of unwanted background or plant tissue areas within the ROIs (Regions of Interest). First, a color space is designated (Box A) for isolation of objects with specific hues and brightness. Color space V was used for all plants in this study. Second, the user selects the masking method (binary by default) and threshold for the color space mask, or how intense the applied color space should be on the image to be analyzed (Box B). Finally, the user selects whether the target objects are darker or lighter than their relative background (Box C).
All images were analyzed using the software’s Value (“V”) colorspace (essentially the brightness), which was found to best isolate the hypocotyls from the other plant tissues in preliminary analysis, as shown in Figure 3. A different threshold value was used for each image, because variations in contrast made this necessary to isolate the hypocotyls from both the background and the roots. The maximum threshold value was kept the same for all images.
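The Value-channel masking described above can be illustrated with a minimal sketch. This is not SOAPP's code: the function name `value_mask`, the single `threshold` parameter, and the tiny test image are hypothetical, and SOAPP itself exposes both minimum and maximum threshold values. The sketch only shows how a brightness cutoff turns an RGB image into a binary mask.

```python
import colorsys

def value_mask(rgb_rows, threshold, target_is_lighter=True):
    """Return a binary mask (rows of 0/1) from the HSV Value channel.

    rgb_rows: list of rows, each a list of (r, g, b) tuples in 0-255.
    threshold: cutoff on the 0-255 Value scale (hypothetical parameter).
    """
    mask = []
    for row in rgb_rows:
        mask_row = []
        for r, g, b in row:
            # colorsys expects floats in [0, 1]; V is the largest of r, g, b.
            _, _, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            value = v * 255
            keep = value > threshold if target_is_lighter else value < threshold
            mask_row.append(1 if keep else 0)
        mask.append(mask_row)
    return mask

# Made-up 2 x 3 image: a bright "hypocotyl" column on a dark background.
image = [
    [(10, 12, 8), (200, 220, 190), (15, 10, 10)],
    [(12, 9, 11), (210, 230, 200), (90, 80, 70)],
]
mask = value_mask(image, threshold=128)
```

Raising or lowering `threshold` per image mirrors the per-image adjustment described in the text; the `target_is_lighter` flag corresponds loosely to the "darker or lighter than the background" choice in the masking interface.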
Following the construction of the binary mask, ROIs were placed over each plant to separate the plants individually for analysis (Figure 4). ROIs are defined by entering the corresponding vertical and horizontal pixel coordinates into the coordinate boxes to move the ROI grid, along with the number of plants to be analyzed (denoted by number of rows and columns, first and fourth entry boxes in Figure 4). ROIs are defined by the area within the purple circles which are placed by the user via pixel coordinate input (i.e., 5000, 2000 would place an ROI at pixel 5000 on the X axis and pixel 2000 on the Y axis of the image) (Figure 5). When placing multiple ROIs, they are spaced out according to the pixel value entered by the user (i.e., a horizontal spacing of 1000 and a vertical spacing of 1000 would separate the 2–4 ROIs by 1000 pixels in the x axis direction and 1000 in the y axis direction, respectively) (Figure 4). The last entry box is used to adjust the pixel radius of each of the ROIs. The same ROI selection radius was used for all plants (300 pixels).
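The ROI grid layout can be sketched as follows. This is an illustration under stated assumptions, not SOAPP's implementation: the helper names `roi_centers` and `in_roi` are hypothetical, and the sketch assumes that the entered coordinate is the center of the first ROI and that the remaining ROIs are offset from it by the entered spacing.

```python
def roi_centers(x0, y0, rows, cols, x_spacing, y_spacing):
    """Centers of a rows x cols grid of circular ROIs, in pixel coordinates."""
    return [
        (x0 + c * x_spacing, y0 + r * y_spacing)
        for r in range(rows)
        for c in range(cols)
    ]

def in_roi(point, center, radius):
    """True if a pixel lies inside (or on the edge of) a circular ROI."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    return dx * dx + dy * dy <= radius * radius

# One row of four ROIs starting at (5000, 2000), spaced 1000 pixels apart.
centers = roi_centers(5000, 2000, rows=1, cols=4, x_spacing=1000, y_spacing=1000)
```

With the 300-pixel radius used in this study, a pixel at (5200, 2100) would fall inside the first ROI, while one at (5300, 2100) would fall just outside it.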

The SOAPP ROI interface. After the construction of a binary mask, users must input the location of ROIs for SOAPP to measure within an image. Circular ROIs are designated by pixel location, with the x axis represented by horizontal inputs and the y axis represented by vertical inputs. The purple circles indicate the area where the ROIs have been designated, and the software will then locate the subject specified in that general area, i.e., so long as part of the subject is within the designated purple ROI area, that subject will be measured. When measuring multiple ROIs at once, indicated by row and column inputs >1, users must indicate the spacing between the ROIs, also on a per-pixel basis.

The SOAPP results interface. Following analysis of user-inputted ROIs, SOAPP will present its analysis results both visually (measured plant tissue areas are visualized by purple polygons overlaid upon the original image selected for analysis) and in value table format (Box D). Measurement values provided in the result value table for the designated ROIs include area of the target plant tissue, solidity, and the following are to the right of the pictured columns, accessible by the shown slide bar: Perimeter length, width, height, center of mass, number of convex hull (represented by purple polygons) vertices, major and minor axes of ellipses containing the convex hull and any plant tissue extending beyond the ROI circle (ellipses are not visualized by SOAPP, an external calculation was conducted to determine their shape and orientation), and the ellipse’s eccentricity.
Hypocotyls eclipsed by other plant tissues were not detected due to the chosen image threshold. Plant-point ellipses that failed Grubbs’ test for outliers were eliminated from the datasets (Grubbs, 1969; Stephens, 1979). Although the membranous background in the scan was wrinkled, the color space chosen for this study, “V”, effectively isolated the plant tissue from the background reflections caused by wrinkles in the membrane during scanning (as seen in the “cleaned” version of the image, after application of masking as shown in Figure 6). Thus, there was no need to eliminate these images. Severe wrinkling resulted in automatic exclusion of the plant/ROI; in these cases, it was often difficult even for a human to discern where the hypocotyl was. Such exclusions are indicated by “not detected” cells in the raw data set. Other reasons for “not detected” results include plants that did not germinate (the color space excluded seeds, as they were usually brown in color and differed greatly from hypocotyl tissue), or hypocotyls that were exceptionally dark in color (usually plants that dried out during the growing process).

The SOAPP binary masking interface, image cleanup panel. After the user has selected a color space, masking technique, colorspace threshold, and relative target object brightness, the final step is designating the size of stray unmasked objects to remove from the final binary mask. During the masking process, bright artifacts such as membrane creases can slip through the initial masking steps. This feature removes any unwanted artifacts of a certain size, leaving behind only the desired target objects. This setting was kept at 2000 pixels for all analyses (Box E).
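The cleanup step can be pictured as removing small connected components from the binary mask. The sketch below is one generic way to do this and is not taken from SOAPP: the function name `remove_small_objects`, the use of 4-connectivity, and the "smaller than `min_size`" rule are all assumptions made for illustration.

```python
from collections import deque

def remove_small_objects(mask, min_size):
    """Zero out 4-connected foreground components smaller than min_size pixels."""
    height, width = len(mask), len(mask[0])
    cleaned = [row[:] for row in mask]
    seen = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill collects one connected component.
            component, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Components below the size cutoff are treated as artifacts.
            if len(component) < min_size:
                for cy, cx in component:
                    cleaned[cy][cx] = 0
    return cleaned

# Toy mask: one 5-pixel object plus two single-pixel specks.
mask = [
    [1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 0, 1],
]
cleaned = remove_small_objects(mask, min_size=3)
```

In the study the cutoff was 2000 pixels; the toy example uses a cutoff of 3 only so that the effect is visible on a small grid.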
After analysis was completed, either the major or minor axis of the plant-point ellipses fitted to the target plants by SOAPP during image analysis was used as a proxy measurement of hypocotyl length, depending on the orientation of the ellipse (Figure 7). Ellipses are not visualized within the current version of the SOAPP graphical user interface (GUI). Instead, values of the ellipses, which are provided on the results page, were input into an Excel sheet which graphed each ellipse from the given values. This ellipse was then compared to the orientation of each target plant (ROI) to determine whether the major or minor axis should be used to represent the hypocotyl. SOAPP provides these axis values in a data table format along with other plant area values in post-analysis (Figure 5). At the time SOAPP measurements were being performed, the values for length were measured in pixels. The pixel values were then converted to millimeters using a scale determined with the scale function of ImageJ. This conversion is now done automatically in SOAPP after an update which allows SOAPP to measure length in an additional area.
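The pixel-to-millimeter conversion amounts to a single scale factor. The sketch below illustrates it with invented numbers; the function names and the example values (a 10 mm reference spanning 400 pixels, a 240-pixel major axis) are hypothetical and are not measurements from this study.

```python
def mm_per_pixel(known_length_mm, measured_pixels):
    """Scale factor from a reference of known length, such as a scanned ruler."""
    return known_length_mm / measured_pixels

def axis_length_mm(major_px, minor_px, use_major, scale_mm_per_px):
    """Pick the ellipse axis judged to follow the hypocotyl and convert it to mm."""
    axis_px = major_px if use_major else minor_px
    return axis_px * scale_mm_per_px

# Hypothetical values: 10 mm of ruler spans 400 pixels in the analyzed image.
scale = mm_per_pixel(known_length_mm=10.0, measured_pixels=400.0)
length = axis_length_mm(major_px=240.0, minor_px=60.0,
                        use_major=True, scale_mm_per_px=scale)
```

Because the images were resized before analysis, the scale would need to be determined on an image of the same pixel dimensions as the one SOAPP analyzed; halving the image dimensions doubles the millimeters represented by each pixel.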

A plant-point ellipse (digitally zoomed in to show the plant at about 20X its size). SOAPP generates a plant-point ellipse (shown in blue) for each analyzed plant that is fitted to the identified plant area polygon (shown in purple). In the current iteration of the software, plant-point ellipses are not available as a visualization within the graphical user interface (GUI). To resolve this issue, plant-point ellipse values were entered into an Excel sheet that would output the ellipse for each plant. As shown in this example, the major axis would be used to represent hypocotyl length due to its orientation relative to the target plant.
ImageJ was employed to take what we will describe as “manual measurements” as it requires an operator. The Segmented Line Tool was used to carefully trace through the center of the hypocotyl from the root-shoot junction to the base of the cotyledons (Figure 2). The scale for the measurements was established using the ruler that was scanned alongside the seedlings on their thin membranous layer, and the “Set Scale” function was used.
Images analyzed for this study were from the larger GWAS assay, which investigated the gravitropic response of over 170 different ecotypes grown on a 2-dimensional clinostat. Of the lines investigated in the GWAS study, 20 were selected as defined in the methods section.
All images were analyzed three times with each method, manually with ImageJ and automatically with SOAPP. Each image, containing up to 12 plants, took approximately 30 minutes to measure manually using ImageJ. Seedlings on six plates were analyzed for each ecotype, for a total of six images per ecotype. Images for one ecotype took approximately 3 hours to measure using ImageJ, for approximately 60 hours of measurement time for the entire 20-ecotype dataset. In comparison, analysis of each image using SOAPP took about 15 minutes on average, so each ecotype took approximately 1.5 hours to complete, for a total of 30 hours of analysis time for the entire ecotype dataset, making the automated measurement method twice as fast as the manual method.
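The time totals follow directly from the per-image estimates. As a quick check of the arithmetic, using only the figures stated above (the function name `total_hours` is introduced here for illustration):

```python
IMAGES_PER_ECOTYPE = 6
ECOTYPES = 20

def total_hours(minutes_per_image):
    """Total measurement time for the whole dataset, in hours."""
    return minutes_per_image * IMAGES_PER_ECOTYPE * ECOTYPES / 60

manual_hours = total_hours(30)  # ImageJ, about 30 minutes per image
soapp_hours = total_hours(15)   # SOAPP, about 15 minutes per image
speedup = manual_hours / soapp_hours
```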
The estimation of human error was calculated by taking the mean of ranges, expressed as percentages, of the mean hypocotyl length of each individual seedling. On average, manual measurements varied by 5.47% across the three individual measurements. Within the sets of three repeated manual measurements using ImageJ, the smallest deviation from the mean of the three values was 0.13% of the mean, and the greatest deviation was 121.91% of the mean. Notably, there was no variability across the three rounds of measurement when SOAPP was utilized.
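One reading of this calculation is sketched below: for each seedling, the range of its repeated measurements is divided by their mean and expressed as a percentage, and these percentages are then averaged over all seedlings. The function names and the sample values are hypothetical, and the exact formula used in the study may differ in detail.

```python
from statistics import mean

def percent_range(measurements):
    """Range of repeated measurements as a percentage of their mean."""
    return (max(measurements) - min(measurements)) / mean(measurements) * 100

def mean_percent_range(seedlings):
    """Average the per-seedling percentage ranges across all seedlings."""
    return mean(percent_range(m) for m in seedlings)

# Made-up triplicate measurements (mm) for three seedlings.
repeats = [
    [5.0, 5.2, 5.1],
    [3.0, 3.0, 3.3],
    [8.0, 8.0, 8.0],  # identical repeats, as with an automated method
]
human_error = mean_percent_range(repeats)
```

A method that returns identical values on every run, as reported for SOAPP, yields a percentage range of zero for every seedling.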
Of the 1440 plants considered in this study, 341 (23.68%) were eliminated from the dataset. The greatest contributor to removal from the dataset was obfuscation of the hypocotyl, either by imagery artifacts or by another plant. There were 303 plants eliminated for this reason (21.04% of the overall dataset). Additional reasons for removal include plants for which the plant-point ellipse generated by SOAPP during analysis was abnormally large or did not fit the target plant (19 plants, 1.32% of the overall dataset), plants not being detected due to the chosen binary mask threshold for an image in SOAPP’s pre-analysis stage (18 plants, 1.25% of the overall dataset), and plants that failed the outlier test (Grubbs’ test, 20 plants, including plants with “abnormally large” plant-point ellipses, 1.39% of the overall dataset).
Data collected from SOAPP showed that hypocotyl length was, on average, 1.684 mm larger than recorded with manual measurement. The SOAPP values were normalized to the manual measurements by subtracting from each SOAPP value the mean of the differences between the manual and automated measurements for all ecotypes (calculated to be 1.684 mm).
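This normalization is a constant-offset correction. The sketch below shows the idea on invented values; the function name `normalize_to_manual` is hypothetical, and pairing the measurements plant by plant is an assumption about how the mean difference was computed.

```python
from statistics import mean

def normalize_to_manual(soapp_mm, manual_mm):
    """Subtract the mean SOAPP-minus-manual difference from every SOAPP value."""
    offset = mean(s - m for s, m in zip(soapp_mm, manual_mm))
    return offset, [s - offset for s in soapp_mm]

# Made-up paired measurements (mm) for three plants.
soapp = [7.5, 6.0, 9.0]
manual = [6.0, 4.5, 7.0]
offset, normalized = normalize_to_manual(soapp, manual)
```

After this step the normalized SOAPP values have the same overall mean as the manual values, so any remaining differences reflect how the two methods disagree plant by plant or group by group rather than a uniform shift.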
Within the manual ImageJ measurement series, the largest deviations of individual hypocotyl lengths from the mean of a given ecotype were 133.2% longer than the mean and 80.3% shorter than the mean. For the SOAPP measurement series, the largest deviations from the mean of a given ecotype were 357.9% longer than the mean and 90.2% shorter than the mean, both of which were greater than the deviations observed in the manual measurements. The mean range in hypocotyl length values by ecotype was +2.02 mm for SOAPP and +10.04 mm for values recorded manually with ImageJ.
Paired t-tests were performed to test for any statistically significant differences between the normalized hypocotyl measurements produced by SOAPP and the manually recorded measurements using ImageJ for each ecotype. The greatest divergence between plants measured using SOAPP and those measured manually was observed in the plants that were grown on a rotating clinostat, with seven of the 20 lines (Ara-1, Li-7, Rd-0, TDr-2, Vinslav, Ha-HBT1-2, and Udul3-36) exhibiting a significantly (p-value < 0.05) reduced length as measured with SOAPP compared to those measured manually (indicated by blue asterisks in Figure 8). Conversely, with plants grown statically, in four of the 20 lines (Vinslav, Bach2-1, El-0, and Ru4-16) SOAPP measurements were significantly larger than manual measurements (indicated by red asterisks in Figure 8), with only one SOAPP measurement being lower (Rd-0).
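For readers unfamiliar with the paired t-test, the statistic can be computed from the per-plant differences as sketched below. The function name `paired_t` and the sample values are hypothetical, and the sketch stops at the t statistic and degrees of freedom; obtaining a p-value requires the t distribution, for which a statistics library (for example SciPy's `scipy.stats.ttest_rel`) would normally be used.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(a, b):
    """Return (t statistic, degrees of freedom) for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Made-up paired lengths (mm): normalized SOAPP vs. manual for three plants.
soapp_lengths = [2.0, 4.0, 6.0]
manual_lengths = [1.0, 2.0, 3.0]
t_stat, dof = paired_t(soapp_lengths, manual_lengths)
```

The sign of the t statistic indicates the direction of the difference, which corresponds to whether SOAPP values were larger or smaller than the manual values for a given ecotype and condition.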

Mean of manual and normalized SOAPP hypocotyl measurements by ecotype. Ecotypes are divided into clinostat-grown and stationarily grown (control) plant subsets. SOAPP measurements are shown in light blue and yellow, and manual measurements are shown in navy blue and orange. The yellow and orange dots indicate the mean lengths of the statically grown plants as measured by SOAPP and manually, respectively. The navy blue and light blue dots indicate the average lengths of hypocotyls grown on the 2D clinostat as measured manually and using SOAPP, respectively. Asterisks denote the degree of significance. Blue asterisks indicate a significant difference between the SOAPP and manual measurements for the clinostat group and red asterisks indicate significant differences for the control group. A single asterisk indicates a p-value below 0.05, a double asterisk indicates a p-value below 0.001, and a triple asterisk indicates a p-value below 0.0001.
When we compared the manually collected data of clinostat and statically grown plant measurements, all plant hypocotyls grew significantly longer (except for IP-Ara-4) in the stationary condition (shown in orange) than in the clinostat condition (shown in blue) (Figure 9). The largest differences between the two conditions were observed in ecotypes Ler-1 and Pig-0, with the clinostat-grown plants having a hypocotyl about 2.08 mm and 3.35 mm shorter on average, respectively. The mean reduction in hypocotyl length when grown on the clinostat was 57.8%. The Tsu-0, Tu-W1, and UKID96 lines were the least affected by the experimental condition, with their 2D clinostat-grown hypocotyl lengths reduced to 71.9%, 72.2%, and 97.4% of their length in stationary conditions, respectively. Pig-0 was by far the most affected, with a reduction to 29.9% of the statically grown hypocotyls. Tsu-0 was reduced to 41.6% of the stationary hypocotyl length.

A comparison of the manually measured hypocotyl lengths of the 20 assayed ecotypes grown stationarily (control) and on a rotating 2D clinostat. Mean lengths of the hypocotyls grown statically are shown in orange and those of the plants grown on the 2D clinostat are shown in blue. A single asterisk indicates a p-value below 0.05, a double asterisk indicates a p-value below 0.001, and a triple asterisk indicates a p-value below 0.0001.
This study was conducted to determine the quality of SOAPP measurements of seedling hypocotyls and to assess whether automated measurement software could be used in place of manual measurement methods, such as ImageJ (Jacob et al., 2011). Using images of seedlings from 20 different wild-type ecotypes grown statically and on a rotating 2D clinostat, manual and automated measurements were collected and compared to determine if automated software could more efficiently and effectively collect data in phenotypic assays and thereby reduce time spent in the data collection phase.
Overall, there was a weak correlation between the manually recorded hypocotyl lengths and the measurements made by SOAPP (Figure 8). The average difference in hypocotyl length between the manual and SOAPP methods of measurement was 1.684 mm, which created the need to normalize the data. Although this difference might not be a problem for other assays or phenotypic measurements, as the collected data could be normalized to a control wild-type group in non-GWAS studies, there were other issues of greater concern. The hypocotyls of
One possible explanation for the fact that SOAPP measurements were statistically greater in the static condition and smaller in plants grown in the clinostat lies in SOAPP’s means of measurement and the curvature of the hypocotyl. As shown in Figure 7, the ellipse encircling the hypocotyl may extend from the root-shoot junction all the way to the tip of the shoot apical meristem, whereas, as seen in Figure 2, manual measurements can more accurately determine the beginning and end of a hypocotyl. This observation may account for the larger measurements recorded by SOAPP in the static condition. SOAPP measured the length of the hypocotyl by measuring the length of the major axis in the determined ellipse within the ROI (Figure 7). While curvature in the hypocotyl can be measured accurately with the Segmented Line tool in ImageJ (Figure 2), SOAPP uses a straight line from the base of the hypocotyl to the end. Therefore, SOAPP measurements would be shorter along a curved hypocotyl. This may account for the smaller SOAPP-reported hypocotyl measurements in the clinostat condition.
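The effect of curvature can be made concrete by comparing the length of a segmented path with the straight-line distance between its endpoints. The sketch below uses invented coordinates and hypothetical function names; it illustrates the geometric argument only and does not reproduce either program's measurement code.

```python
from math import dist

def path_length(points):
    """Length of a segmented line traced through the given points."""
    return sum(dist(p, q) for p, q in zip(points, points[1:]))

def chord_length(points):
    """Straight-line distance between the first and last points."""
    return dist(points[0], points[-1])

# A bent "hypocotyl" traced with three points (arbitrary units).
curved = [(0, 0), (3, 4), (6, 0)]
traced = path_length(curved)     # follows the bend, as a segmented line does
straight = chord_length(curved)  # ignores the bend
```

For any bent path the straight-line distance is shorter than the traced length, and the gap widens as curvature increases, which is consistent with the shorter SOAPP values reported for clinostat-grown seedlings if those hypocotyls were more curved.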
SOAPP also produced multiple irregular results that fell well outside the standard deviations of the dataset and did not correlate with the value measured manually. On average, there was an anomalous plant-point ellipse result produced for more than 1 out of every 76 hypocotyl measurements (20 total anomalous results). Most of these anomalies occurred when analyzing seedlings grown on the control plates, with 16 of the 20 anomalous results involving stationary plants. Notably, seedlings grown on the control plates had longer hypocotyls, as indicated by manual measurements, which may have contributed to the anomalous readings by SOAPP.
However, SOAPP yielded zero variance for all
Regarding ease of use, SOAPP presented several challenges during the hypocotyl measurement process that at times required external solutions to calculate the values needed for determining hypocotyl length. Notably, the iteration of SOAPP used for this study lacked a visualization feature in post-analysis for plant-point ellipses and their axes, which was resolved by entering the values produced by SOAPP for each individual plant into a spreadsheet to visualize the ellipses and determine the appropriate axis for hypocotyl length. Excel was also used to calculate image scale, a feature present in ImageJ but lacking in the iteration of SOAPP used during the analysis phase of the study (automatic image scale conversion is now an operable feature of the revised version of SOAPP).
The chosen colorspace, “V”, in SOAPP isolated the hypocotyl and cotyledons from the roots (Figure 3). This resulted in ellipses that “cut off” at the hypocotyl/root transition. Ellipses included the cotyledons, but due to the anatomy of the plants and the geometrical orientation of the ellipses, the cut-off on the cotyledon side often fell where the hypocotyl ended. However, this may have contributed somewhat to the lengthening of the hypocotyl values when measuring with SOAPP, which necessitated the normalization of our SOAPP-generated data to the manually collected measurements. It is worth noting that our sample from a GWAS experimental data set lacks a true control line (a wild-type line against which to compare the other lines, such as Columbia, Col-0), and that this upward shift of SOAPP data could be corrected in future studies containing a true control line.
Additionally, SOAPP demonstrated notable difficulty with analyzing larger, pixel-dense images, resulting in extended analysis processing times and, at times, self-termination of the software during analysis. Reducing the file size of the input images appeared to resolve these issues, albeit at the cost of image resolution. Finding the optimal colorspace and image masking thresholds for the binary mask during pre-analysis was also challenging at times, as variations in plant tissue color and contrast between the plants and the background resulted in some plants being undetected by SOAPP because they were omitted by the binary mask.
This issue could have been mitigated by adjusting the threshold value for each individual plant as opposed to each image; however, for the sake of continuity within this study, these values were kept the same across all analyzed images (Alam et al., 2014). Significant differences between the manual and SOAPP hypocotyl lengths by ecotype and treatment group could be attributed to this consistent masking value, as determining the optimal binary mask is subjective and susceptible to erroneous output when the termination point between the hypocotyl and root is not well defined (similar plant tissue color and/or long gradients between the two plant components) (Figure 6). Furthermore, maintaining the same binary mask for all plants in an image may have contributed to some of the non-uniform plant-point ellipse values produced by the software, as brighter plants were often overrepresented and plants that contrasted less against the background were underrepresented in each of the images, producing larger and smaller plant-point ellipses than expected, respectively. These irregularities produced by binary masking resulted in the elimination from analysis of 38 plants from the total number of photographed seedlings across all ecotypes.
Outside of the binary masking and file size challenges, SOAPP was relatively intuitive to use. Its graphical user interface is sequential, and analysis steps are presented in a step-by-step fashion, guiding users through each of the input fields while offering a wide range of binary masking techniques and the flexibility to best isolate plant tissues of interest from raw images. After collecting a few preliminary images and learning how to effectively utilize the software, SOAPP proved to be significantly faster than ImageJ in processing time, even though only four of the 12 individuals on each plate could be measured per round of processing. Therefore, each image had to undergo multiple rounds of processing. However, this processing time was reduced and proved to be faster than the ImageJ measurement method when input imagery data file size was greatly reduced.
With regard to the gravitropic assay results, as previously reported, plants grew more slowly on the 2D clinostat than those grown stationarily, but within the 20 observed ecotypes, interesting responses were observed. In normal Earth gravity, phototropism causes plant shoots to move towards full-spectrum light and their roots to move away from it, but unique phototropic responses have been reported in microgravity (Millar et al., 2010; Vandenbrink et al., 2016). In our investigation, differences in hypocotyl growth were revealed between different wild-type lines of
As is the case with many newly emerging image analysis tools, SOAPP in its current form may not be the optimal tool for performing measurements if the goal is to extract values for length or width of hypocotyls from a large volume of high-resolution image data (Humplík et al., 2015). This is not unexpected, as the software was primarily designed to determine area, rather than linear values like length, width, and height (the values which we were collecting in the present study). However, the reproducibility and speed of data collection with SOAPP are invaluable, as the measurements it produces are unaffected by human input bias and deviations in user accuracy. As more improvements and features are introduced, the software has the potential to become a reliable tool for producing linear measurement values. With improvements, it is likely that SOAPP will become an asset in phenotypic assays of gravitropism and serve as a reliable tool for measuring length to assist in further investigation of the response of plants to gravity and microgravity.