Toward Individual Fairness Testing with Data Validity

Abstract

Individual fairness testing (IFT) is a framework for finding discriminatory instances within a given classifier. In this paper, we present our idea of an IFT framework that integrates the notion of data validity, termed "Individual Fairness Testing with Data Validity (IFT-V)". We develop a solid foundation for IFT-V and demonstrate its feasibility. Our preliminary evaluation with IFT-V reveals the possibility that many of the discriminatory instances detected by state-of-the-art IFT algorithms are considered invalid. These findings prompt a rethink of the current IFT framework, suggesting a transition from focusing solely on the discovery of discriminatory instances to the consideration of valid ones.
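The idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the paper defines validity or which search algorithms it evaluates, so everything below is an assumption made for illustration: the feature domains, the toy `classifier`, and the hand-written `is_valid` rule are hypothetical, and the exhaustive enumeration stands in for a real test-generation algorithm. An instance is treated as discriminatory when changing only the protected attribute changes the prediction.

```python
from itertools import product

# Hypothetical feature domains; "gender" plays the role of the protected attribute.
DOMAINS = {
    "gender": [0, 1],
    "age": [20, 40, 60],
    "years_employed": [0, 30, 50],
}
PROTECTED = "gender"


def classifier(x):
    """Toy biased classifier (an assumption, not taken from the paper)."""
    score = x["years_employed"] + (10 if x["gender"] == 1 else 0)
    return int(score >= 35)


def is_discriminatory(x):
    """True if changing only the protected attribute changes the prediction."""
    for value in DOMAINS[PROTECTED]:
        if value != x[PROTECTED]:
            twin = {**x, PROTECTED: value}
            if classifier(twin) != classifier(x):
                return True
    return False


def is_valid(x):
    """Stand-in validity rule (hypothetical): employment cannot start before age 16."""
    return x["age"] - x["years_employed"] >= 16


# Enumerate the whole (tiny) input space instead of running a search algorithm.
names = list(DOMAINS)
instances = [dict(zip(names, values)) for values in product(*DOMAINS.values())]

discriminatory = [x for x in instances if is_discriminatory(x)]
valid_discriminatory = [x for x in discriminatory if is_valid(x)]

print(len(discriminatory), len(valid_discriminatory))
```

In this toy setting, six instances are flagged as discriminatory but only two of them satisfy the validity rule, which mirrors the abstract's point that counting discriminatory instances without checking validity can overstate the findings.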

Similar Papers
  • Research Article
  • 10.21831/jss.v12i2.11639
Social skill assessment for transvestives through the implementation of social problem solving method
  • Sep 1, 2016
  • Journal of Social Studies (JSS)
  • Aman Aman

This study aims to determine: 1) the steps for building a social skill assessment instrument for transvestives through the social problem solving method, and 2) the appropriateness of the social skill assessment instrument for transvestives. This study uses the Research and Development method, consisting of four stages: (1) an introduction stage, (2) a design and organization stage for the social skill assessment instrument for transvestives, as the development of the initial product form, (3) a trial, evaluation, and revision stage, and (4) an implementation stage. The sample in each trial in 2015 comprised 5 transvestives selected using purposive sampling. Data were collected using FGD, questionnaire, observation, and documentation techniques. Data validation uses expert validation, while the accuracy of the qualitative data is checked by triangulation of source, theory, and method. Data analysis uses descriptive quantitative techniques and qualitative analysis with an interactive model. Based on the research results, it is concluded that: 1) the steps for building the social skill assessment instrument for transvestives through the social problem solving model are: a) an introductory study to construct the theoretical framework, b) planning and organizing the instrument through FGD and expert validation, and c) a limited trial or individual test, followed by evaluation and revision of the instrument. The expert validation of the instrument guidelines gives an average score of 3.89, which means that the guidelines are good or appropriate to test. The transvestives' assessment in the individual trial also indicates a good result, with an average score of 3.69. Meanwhile, the character education model of Social Problem Solving (SPS) for transvestives, as an effort to develop their social skills in the Special Region of Yogyakarta, indicates a good result with an average score of 3.80. Keywords: character education, transvestives, and social skill

  • Research Article
  • Cite Count Icon 34
  • 10.1186/s12984-019-0579-8
Movement smoothness during a functional mobility task in subjects with Parkinson’s disease and freezing of gait – an analysis using inertial measurement units
  • Sep 5, 2019
  • Journal of NeuroEngineering and Rehabilitation
  • Camila Pinto + 6 more

Background: Impairments of functional mobility may affect locomotion and quality of life in subjects with Parkinson’s disease (PD). Movement smoothness measurements, such as the spectral arc length (SPARC), are novel approaches to quantify movement quality. Previous studies analyzed SPARC in simple walking conditions. However, SPARC outcomes during functional mobility tasks in subjects with PD and freezing of gait (FOG) were never investigated. This study aimed to analyze SPARC during the Timed Up and Go (TUG) test in individuals with PD and FOG. Methods: Thirty-one participants with PD and FOG and six healthy controls were included. SPARC during TUG test was calculated for linear and angular accelerations using an inertial measurement unit system. SPARC data were correlated with clinical parameters: motor section of the Unified Parkinson’s Disease Rating Scale, Hoehn & Yahr scale, Freezing of Gait Questionnaire, and TUG test. Results: We reported lower SPARC values (reduced smoothness) during the entire TUG test, turn and stand to sit in subjects with PD and FOG, compared to healthy controls. Unlike healthy controls, individuals with PD and FOG displayed a broad spectral range that encompassed several dominant frequencies. SPARC metrics also correlated with all the above-mentioned clinical parameters. Conclusion: SPARC values provide valid and relevant clinical data about movement quality (e.g., smoothness) of subjects with PD and FOG during a functional mobility test.

  • Research Article
  • Cite Count Icon 6
  • 10.1177/1040638717717558
Pooled sample testing for Bonamia ostreae: A tale of two SYBR Green real-time PCR assays.
  • Jun 23, 2017
  • Journal of Veterinary Diagnostic Investigation
  • Henry S Lane + 2 more

Pooled testing of samples is a common laboratory practice to increase efficiency and reduce expenses. We investigated the efficacy of 2 published SYBR Green real-time PCR assays when used to detect the haplosporidian parasite Bonamia ostreae in pooled samples of infected oyster tissue. Each PCR targets a different gene within the B. ostreae genome: the actin 1 gene or the 18S rRNA gene. Tissue homogenates (150 mg) of the New Zealand flat oyster Ostrea chilensis were spiked with ~1.5 × 10³ purified B. ostreae cells to create experimental pools of 3, 5, and 10. Ten positive replicates of each pool size were assayed twice with each PCR and at 2 different amounts of DNA template. The PCR targeting the actin 1 gene was unable to reproducibly detect B. ostreae in any pool size. Conversely, the 18S rRNA gene PCR could reproducibly detect B. ostreae in pools of up to 5. Using a general linear model, there was a significant difference in the number of pools that correctly detected B. ostreae between each PCR (p < 0.01) and each pool size (p < 0.01). The single-copy actin 1 gene is likely more prone to being diluted and not detected by pooling than the multi-copy 18S rRNA gene. Our study highlights that validation data are necessary for pooled sample testing because detection efficacy may not be comparable to individual sample testing.

  • Research Article
  • Cite Count Icon 24
  • 10.1002/jat.3496
Combinations of genotoxic tests for the evaluation of group 1 IARC carcinogens.
  • Jul 11, 2017
  • Journal of Applied Toxicology
  • Jacky Bhagat

Many of the known human carcinogens are potent genotoxins that are efficiently detected as carcinogens in human populations, but certain types of compounds, such as immunosuppressants, sex hormones, etc., act via non-genotoxic mechanisms. The absence of genotoxicity and the diversity of modes of action of non-genotoxic carcinogens make predicting their carcinogenic potential extremely challenging. There is evidence that combinations of different short-term tests provide better and more efficient prediction of human genotoxic and non-genotoxic carcinogens. The purpose of this study is to summarize the in vivo and in vitro comet assay (CMT) results of group 1 carcinogens selected from the International Agency for Research on Cancer and to discuss the utility of the comet assay along with other genotoxic assays such as Ames, in vivo micronucleus (MN), and in vivo chromosomal aberration (CA) test. Of the 62 agents for which valid genotoxic data were available, 38 of 61 (62.3%) were Ames test positive, 42 of 60 (70%) were in vivo MN test positive and 36 of 45 (80%) were positive for the in vivo CA test. Higher sensitivity was seen in in vivo CMT (90%) and in vitro CMT (86.9%) assay. A combination of two tests has greater sensitivity than individual tests: in vivo MN + in vivo CA (88.6%); in vivo MN + in vivo CMT (92.5%); and in vivo MN + in vitro CMT (95.6%). Combinations of in vivo or in vitro CMT with other tests provided better sensitivity. In vivo CMT in combination with in vivo CA provided the highest sensitivity (96.7%).
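The single-assay percentages quoted above can be reproduced from the reported counts. The short sketch below does only that arithmetic; since every agent in the set is a known group 1 carcinogen, the fraction testing positive is read here as the assay's sensitivity. The dictionary layout and function name are illustrative choices, and the combined-test figures are not recomputed because the abstract does not give their underlying counts.

```python
# Counts reported in the abstract: (positive agents, agents with valid data) per assay.
reported = {
    "Ames": (38, 61),
    "in vivo MN": (42, 60),
    "in vivo CA": (36, 45),
}


def sensitivity_percent(positives, tested):
    """Share of known carcinogens that the assay flags as positive, in percent."""
    return round(100 * positives / tested, 1)


sensitivities = {
    assay: sensitivity_percent(positives, tested)
    for assay, (positives, tested) in reported.items()
}

for assay, value in sensitivities.items():
    print(f"{assay}: {value}%")
```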

  • Research Article
  • Cite Count Icon 71
  • 10.1080/13854046.2015.1054437
Comparison of Cognitive Performance on the Cogstate Brief Battery When Taken In-Clinic, In-Group, and Unsupervised
  • May 19, 2015
  • The Clinical Neuropsychologist
  • Jason A Cromer + 7 more

Objective: Repeat cognitive assessment comparing post-injury performance to a pre-injury baseline is common in concussion management. Although post-injury tests are typically administered in clinical settings, baseline tests may be conducted individually with one-on-one supervision, in a group with supervision, or without supervision. The extent to which these different test settings affect cognitive performance is not well understood. To assess if performance on the Cogstate Brief Battery (CBB) differs across these settings, tests completed individually with one-on-one supervision were compared to those taken either in a group with supervision or individually but without supervision. Method: A crossover study design was utilized to account for any effect of individual variability or test order to provide an unbiased examination of the effect of test setting on cognitive performance. Young adult participants completed an individually supervised test either before or after also completing a group or unsupervised test. Results: CBB scores from the same individuals were not significantly different across test settings. Effect sizes ranged in magnitude from .09 to .12 for supervised versus unsupervised tests and from .01 to .37 for individual versus group tests across CBB tasks. Conclusion: These results suggest that cognitive testing with the CBB in alternate settings can provide valid cognitive data comparable to data obtained during individually supervised testing.

  • Research Article
  • Cite Count Icon 11
  • 10.1002/uog.15733
Cell‐free DNA testing: how to choose which laboratory to use?
  • Nov 1, 2015
  • Ultrasound in Obstetrics & Gynecology
  • J Jani + 2 more

  • Research Article
  • 10.53842/qvj.v2i2.34
DEVELOPING ECOLOGY AND ENVIRONMENTAL LEARNING MATERIALS BASED ON INTEGRATION CURRICULUM AND SCIENTIFIC LITERACY FOR NATIONAL PLUS SCHOOL STUDENTS IN INDONESIA
  • Apr 5, 2023
  • Quaerite Veritatem : Jurnal Pendidikan
  • Silvia Sabatini + 2 more

The integration curriculum is a blend of the current national curriculum in Indonesia (Kurikulum 2013) and a foreign curriculum from CIE (Cambridge International Examination). The problem is the unavailability of teaching materials adequate for delivering both curricula simultaneously. The aim of this study was to reveal the feasibility of learning material on ecology and environment topics based on the integration curriculum and all the scientific literacy components. The feasibility of the learning material is obtained through validation by scientific literacy content experts and design experts, assessment from biology teachers, and students’ responses. This research used the Borg and Gall model, which has 10 stages. However, this study is limited to preliminary field testing. Validation data were analyzed using descriptive qualitative analysis. The results showed that, according to content experts assessing the integration curriculum and scientific literacy, the learning material was very feasible: science as a body of knowledge has an average score of 93.75%, science as a way of investigation 88.54%, science as a way of thinking 87.5%, and the interaction of science, technology, and society 86.25%. The design of the learning material was also rated very feasible by the design expert, with 87.20%. The biology teachers' assessment of the learning material was 92.96% (very feasible). Students' response in preliminary field individual testing was 83.92% (very feasible), in small group testing 78.56% (feasible), and in large group testing 80.84% (very feasible). The students' learning outcomes increased between the experimental and control groups. The unpaired t-test gave a significance value, Sig. (2-tailed), of 0.000, which is below 0.001. This indicates a significant difference in the learning outcomes of IGCSE 2 / grade X students when using the learning material for the ecology and environmental management topic based on the integrated curriculum and scientific literacy, because Sig. (2-tailed) < 0.05.

  • Research Article
  • Cite Count Icon 2
  • 10.24114/jpb.v6i3.8045
Pengembangan Kegiatan MINI-LAB pada Topik Ekologi dan Lingkungan Untuk Siswa Kelas X SMA
  • Aug 1, 2017
  • Jurnal Pendidikan Biologi
  • Verronicha Crysty + 2 more

This research aimed to develop a Mini-Lab on ecology and environment topics, based on scientific literacy and the local potential of North Sumatera, that is empirically feasible. The feasibility of the learning material is obtained through validation by content experts and a design expert, assessment from a biology teacher, and students' responses to the developed product. This research and development used the Borg and Gall model. However, this study was limited to preliminary field testing only. Validation data and the questionnaire responses of teachers and students were analyzed using descriptive qualitative analysis. The results showed that, according to content experts, the feasibility of the product content was very good, with an average percentage score of 93.32%. The feasibility of the Mini-Lab’s design, as assessed by the design expert, was also very good, with a percentage score of 91.66%. The biology teacher's assessment of the Mini-Lab activity on ecology and environment topics for grade X senior high school students was 93.75% (very good). Students' response in preliminary field testing was 80.35% (good) for individual testing, 84.52% (very good) for the small group trial, and 91.38% (very good) for the big group.

  • Research Article
  • Cite Count Icon 3
  • 10.1177/0734282918787433
Convergent Validity of the A-ToM (Adult Theory of Mind) Test for Individuals With Autism Spectrum Disorder
  • Jul 11, 2018
  • Journal of Psychoeducational Assessment
  • Neil Brewer + 2 more

Brewer, Young, and Barnett reported a comprehensive psychometric evaluation of a new adult theory of mind measure (A-ToM) with a sample of high-functioning autism spectrum disorder (ASD) adults. Although correlations with existing theory of mind (ToM) instruments (i.e., the Strange Stories; the Frith-Happé animations) were reported, relationships with independent putative indicators of ToM development such as social–behavioral and interpersonal proficiencies were not examined. Here, we provide convergent validity data by examining the relations between A-ToM performance, and the social–behavioral skills and interpersonal relationships of ASD adults with IQs exceeding 85. ToM predicted interpersonal relationship quality via the mediating variable, social–behavioral skills, providing evidence of convergent validity for the A-ToM. Alternative models of the relationship between the three variables are described, as are the challenges associated with the interpretation of self-report social and interpersonal functioning measures.

  • Research Article
  • Cite Count Icon 25
  • 10.1007/s10071-014-0744-1
Testing problem-solving capacities: differences between individual testing and social group setting
  • Mar 26, 2014
  • Animal Cognition
  • Anastasia Krasheninnikova + 1 more

Testing animals individually in problem-solving tasks limits distractions of the subjects during the test, so that they can fully concentrate on the problem. However, such individual performance may not indicate the problem-solving capacity that is commonly employed in the wild when individuals are faced with a novel problem in their social groups, where the presence of a conspecific influences an individual's behaviour. To assess the validity of data gathered from parrots when tested individually, we compared the performance on patterned-string tasks among parrots tested singly and parrots tested in social context. We tested two captive groups of orange-winged amazons (Amazona amazonica) with several patterned-string tasks. Despite the differences in the testing environment (singly vs. social context), parrots from both groups performed similarly. However, we found that the willingness to participate in the tasks was significantly higher for the individuals tested in social context. The study provides further evidence for the crucial influence of social context on individual's response to a challenging situation such as a problem-solving test.

  • Research Article
  • Cite Count Icon 45
  • 10.1007/s00204-020-02802-6
The EU-ToxRisk method documentation, data processing and chemical testing pipeline for the regulatory use of new approach methods
  • Jul 1, 2020
  • Archives of Toxicology
  • Alice Krebs + 44 more

Hazard assessment, based on new approach methods (NAM), requires the use of batteries of assays, where individual tests may be contributed by different laboratories. A unified strategy for such collaborative testing is presented. It details all procedures required to allow test information to be usable for integrated hazard assessment, strategic project decisions and/or for regulatory purposes. The EU-ToxRisk project developed a strategy to provide regulatorily valid data, and exemplified this using a panel of > 20 assays (with > 50 individual endpoints), each exposed to 19 well-known test compounds (e.g. rotenone, colchicine, mercury, paracetamol, rifampicine, paraquat, taxol). Examples of strategy implementation are provided for all aspects required to ensure data validity: (i) documentation of test methods in a publicly accessible database; (ii) deposition of standard operating procedures (SOP) at the European Union DB-ALM repository; (iii) test readiness scoring according to defined criteria; (iv) disclosure of the pipeline for data processing; (v) link of uncertainty measures and metadata to the data; (vi) definition of test chemicals, their handling and their behavior in test media; (vii) specification of the test purpose and overall evaluation plans. Moreover, data generation was exemplified by providing results from 25 reporter assays. A complete evaluation of the entire test battery will be described elsewhere. A major learning from the retrospective analysis of this large testing project was the need for thorough definitions of the above strategy aspects, ideally in the form of a study pre-registration, to allow adequate interpretation of the data and to ensure overall scientific/toxicological validity.

  • Research Article
  • Cite Count Icon 202
  • 10.1207/s15324818ame1902_2
An Investigation of the Differential Effort Received by Items on a Low-Stakes Computer-Based Test
  • Apr 1, 2006
  • Applied Measurement in Education
  • Steven L Wise

In low-stakes testing, the motivation levels of examinees are often a matter of concern to test givers because a lack of examinee effort represents a direct threat to the validity of the test data. This study investigated the use of response time to assess the amount of examinee effort received by individual test items. In 2 studies, it was found that the strongest predictors of the effort received by items were item length (i.e., how much reading or scanning was required) and item position. In addition, it was found that by treating item responses resulting from rapid guesses as missing, item means and item-total correlations were differentially affected and test score reliability decreased, whereas validity increased. Several implications of these results for low-stakes testing are discussed.

  • Research Article
  • Cite Count Icon 17
  • 10.1093/eurpub/cks108
The accuracy of self-reported data concerning recent cannabis use in the French armed forces
  • Aug 28, 2012
  • The European Journal of Public Health
  • A Mayet + 10 more

The aims were to evaluate the accuracy of self-report of past-month cannabis use in a representative sample of French military staff members and to evaluate the scale of the prevarication bias. Data from three cross-sectional surveys conducted between 2005 and 2008 (n = 3493) were used. The characteristics of self-report (sensitivity, specificity, positive predictive value and negative predictive value) were computed using tetrahydrocannabinol detection in urine as the reference. The prevalence for past-month cannabis use was 16.1% and for positive testing was 13.4%. The discriminant power of self-report was good, with an area under the receiver operating characteristics curve 0.90. Specificity (94.5%) and negative predictive values (97.8%) were good, but sensitivity (85.7%) and positive predictive values (70.4%) were lower. The lowest sensitivity values were observed in the higher categories of personnel and in the Navy, which could reflect some prevarication in these sub-populations who might believe they were more exposed to sanctions if detected. Despite certain limitations of urine analysis as a reference, because of its poor detection of occasional users, our study is in favour of good accuracy of self-reported data on cannabis use, even among the military. However, our results, derived from a population study, do not enable any assumptions on the validity of self-reported data collected during individual testing procedures for the purpose of improving occupational safety.
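The figures in this abstract can be cross-checked against each other. The sketch below is a rough consistency check, not the authors' computation: it assumes the 13.4% positive-testing rate is the prevalence of the urine-test reference, and it derives the predictive values from the rounded sensitivity and specificity via Bayes' rule, so small deviations from the published 70.4% and 97.8% are expected.

```python
# Rounded values taken from the abstract; urine testing is the reference standard.
sensitivity = 0.857   # P(self-report positive | urine positive)
specificity = 0.945   # P(self-report negative | urine negative)
prevalence = 0.134    # P(urine positive), assumed to be the "positive testing" rate

# Joint probabilities of each cell of the 2x2 table.
true_pos = sensitivity * prevalence
false_neg = (1 - sensitivity) * prevalence
true_neg = specificity * (1 - prevalence)
false_pos = (1 - specificity) * (1 - prevalence)

# Predictive values follow from Bayes' rule.
ppv = true_pos / (true_pos + false_pos)
npv = true_neg / (true_neg + false_neg)

# Share of respondents who self-report use, implied by the figures above.
self_report_rate = true_pos + false_pos

print(f"PPV ~ {ppv:.1%}, NPV ~ {npv:.1%}, self-reported use ~ {self_report_rate:.1%}")
```

Under these assumptions the derived values land within about half a percentage point of the reported predictive values and of the 16.1% self-reported prevalence, which suggests the published numbers are mutually consistent.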

  • Research Article
  • Cite Count Icon 3
  • 10.1063/5.0257732
Development and testing of PMTSMC tensile, torsional, and bending single and composite forming test device.
  • May 1, 2025
  • The Review of scientific instruments
  • Ao Hu + 5 more

Mechanical testing can elucidate the relationship between material structure and guide the manufacturing process, yet the existing molding process apparatus often lacks sufficient integration of testing conditions, making it challenging to meet the demands of composite working environments. In this study, we developed a comprehensive set of molding test equipment capable of accommodating multiple mechanical load combinations to ascertain the elastic-plastic deformation behavior during the molding process. The testing apparatus comprises components for tension, torsion, and bending, and, through a modular structural design and an integrated communication network, it facilitates multi-mechanical load operations and multi-physical quantity control. This system enables the execution of molding tests under five distinct single or composite loading conditions and allows for the real-time collection of data pertaining to six physical quantities. We conducted test verification using the material from the permanent magnet traction motor stator coil of the China railway high-speed moving train (CR450). The results indicate that the comparative analysis error for individual tests remains within 7%, thereby ensuring the validity of the measurement data. Notably, in the stretching-bending composite condition tests, we observed a decrease in bending force and an increase in bending rebound following tensile loading. In the torsion-bending tests, both the bending force and rebound amount remained largely unchanged after torque application. This device offers valuable insights and mechanical testing support for the development of complex molding process apparatus for both metallic and non-metallic materials.
