Abstract
Data screening is an indispensable first phase of the scientific discovery process. Fractional factorial designs offer quick and economical options for engineering highly dense, structured datasets. Information content is maximized when a selected fractional factorial scheme is driven to saturation while data collection is restricted to a single replicate. A novel multi-factorial profiler is presented that permits screening of saturated, unreplicated designs by decomposing the examined response into its constituent contributions. Partial effects are sliced off systematically from the investigated response to form individual contrasts using simple robust measures. By isolating, one factor at a time, the disturbance attributable solely to a single controlling factor, Wilcoxon-Mann-Whitney rank statistics are employed to assign significance. We demonstrate that the proposed profiler possesses its own self-checking mechanism for detecting a potential influence arising from fluctuations attributable to the remaining unexplained error. The main benefits of the method are that it is: 1) easy to grasp, 2) supported by well-explained test-power properties, 3) distribution-free, 4) sparsity-free, 5) calibration-free, 6) simulation-free, 7) easy to implement, and 8) usable with multi-factorial screening designs of any type and size. The method is elucidated with a benchmarked profiling effort for a water filtration process.
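A minimal sketch of the significance step named above, using the Wilcoxon-Mann-Whitney test as implemented in SciPy; the factor split, the response values, and the 0.05 threshold are illustrative assumptions, not the authors' data or code.

    # Illustrative sketch (not the authors' implementation): testing whether a
    # single controlling factor shifts the response, using the
    # Wilcoxon-Mann-Whitney rank-sum test from SciPy.
    from scipy.stats import mannwhitneyu

    # Hypothetical unreplicated response values (e.g., filtration times in minutes),
    # split by the low/high setting of one factor in a saturated design.
    response_low  = [42.1, 47.5, 51.0, 39.8]   # runs with the factor at its low level
    response_high = [63.2, 58.9, 70.4, 84.3]   # runs with the factor at its high level

    # Two-sided test: do the two groups of ranks differ stochastically?
    stat, p_value = mannwhitneyu(response_low, response_high, alternative="two-sided")

    print(f"U statistic = {stat}, p-value = {p_value:.4f}")
    if p_value < 0.05:   # illustrative significance threshold
        print("The factor appears to influence the response.")
    else:
        print("No significant influence detected for this factor.")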
Highlights
Design of Experiments (DOE) furnishes a conceptual interface through which researchers perturb a phenomenon in an attempt to fathom its behavior
Discovering ways to reduce the amount of work without significantly affecting the information content has long been pursued [11]
The incremental rank assignment terminates with the least desirable entry value (84.3 minutes, the 7th run), which is awarded a rank of 8
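As a hedged illustration of that rank assignment (Python/SciPy; every value except the quoted 84.3-minute entry is hypothetical):

    # Illustrative rank assignment for an eight-run screening experiment.
    # Only the 84.3-minute value is quoted in the text; the other times are made up.
    from scipy.stats import rankdata

    times_min = [42.1, 47.5, 51.0, 39.8, 63.2, 58.9, 84.3, 70.4]  # run 7 holds 84.3

    ranks = rankdata(times_min)            # ascending ranks: smallest time gets rank 1
    print(dict(zip(range(1, 9), ranks)))   # the 84.3-minute run (run 7) receives rank 8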
Summary
Design of Experiments (DOE) furnishes a conceptual interface through which researchers perturb a phenomenon in an attempt to fathom its behavior. Screening can help make the cost of acquiring information and knowledge more manageable [13]. Executing trials on large processing units running in production mode may require experimenting with several process inputs at once. Harvesting reliable information from such systems demands tweaking inputs under pragmatic conditions. This, in turn, entails consuming a sizable quantity of materials while simultaneously removing availability from the regularly scheduled operational capacity, because a substantial risk always attends the fate of the output generated from a completed set of production experiments. These operational complications motivate more sophisticated data engineering methods that emphasize maximizing data mining efficacy in short structured datasets.