Abstract

The creation of biomedical databases has been a strongly felt need since the early 1970s [1]. Despite this, very few databases of list mode cytometric files have been published so far, although single FCS files can be downloaded from the Internet as supporting information in reports of isolated cases [2, 3]. The ESCCABase project, conceived and developed in 2014 at the University of Urbino “Carlo Bo,” is currently supported by the European Society for Clinical Cell Analysis (ESCCA) and can be accessed at https://www.escca.eu/education/esccabase. The ESCCABase project can be described as a functional set consisting of a repository containing both metadata and list mode data-sets in FCS format, and a series of applications that allow the repository's management.

The uploading of files to ESCCABase is subject to a series of criteria verified by an Editorial Committee: the certainty of the diagnosis, the completeness of the panel, which must be adequate for the pathology in question, the correctness of the technical execution, the diagnostic relevance, and the relevance for educational purposes. Although the project is potentially exploitable for all the different uses recapitulated later, we started it to provide clinical cytometrists with a tool akin to a hematology atlas, that is, a collection of information on unequivocally diagnosed diseases to be used as an aid in daily practice. Accordingly, we systematized the files consistently with the current WHO classification [7], except for acute myeloid leukemias (AMLs), which have been systematized according to the FAB classification [8] because the WHO classification of AMLs has little correlation with the phenotypic aspects and because the FAB classification is still widely used in clinical settings.
The objective of representing all the known nosographic entities has not been achieved yet, since cases are uploaded as they become available, and cases of the rarest forms cannot be uploaded until contributors submit them. This continuous updating process, together with the need to comply with classifications that are continuously revised over time, explains why we see the ESCCABase project as a constantly evolving structure.

Both the repository files and the metadata reside in the storage layer of the ESCCABase platform. They can be directly consulted by those entitled, that is, the ESCCA members. ESCCA can modify this approach and grant free access at any time according to its policies. Users can analyze the files but cannot directly download them. They can choose the files of interest from a list available on the main page of the ESCCA website but can only analyze the selected files through the “ESCCABase Viewer” utility (see IT aspects).

The first group of files is the “Onco-Hematology” group, which includes eight subgroups, namely Myeloproliferative Neoplasms, Myelodysplastic/Myeloproliferative Neoplasms, Myelodysplastic Syndromes, Acute Myeloid Leukemias, Acute Leukemias of Ambiguous Phenotype, Acute Leukemias, Mature B-Cell Neoplasms, and Mature T-Cell Neoplasms. Each subgroup encompasses different sections, each of which corresponds to a separate clinical entity. Each section contains several cases sharing the same diagnosis, and each case includes the data-sets of the analyses performed for that patient.

The second group of files is the “Clinical Cytometry/Primary Immunodeficiencies” group, which includes five subgroups: Combined Immunodeficiencies, Predominantly Antibody Deficiencies, Diseases of Immune Regulation, Congenital Defects of Phagocytosis, and Other Well Defined Immunodeficiency Syndromes.
The systematization of the primary immunodeficiency cases follows the classification formulated in 2011 by the Committee for Primary Immunodeficiency Diseases of the International Union of Immunological Societies [9].

The third group of files is the “Clinical Cytometry/Case Miscellany” group, which includes four subgroups, namely Bone Marrow Failure, Polyclonal or Reactive Lymphocytoses, Normal Controls used as Comparison, and Ancillary Analyses (mainly body fluid samples).

The fourth group of files, called “Basic and Experimental Cytometry,” collects a series of files produced during various experimental activities; at the moment, its subgroups are Cell Biology, Environmental Toxicology, and Microbiology. This group is under revision because the FCS 2.0 format, with which most files in this group comply, is being abandoned.

Keeping the files non-downloadable raises the issue of which application to use for the analysis. For this purpose, one of us (MD) developed a utility called ESCCABase Viewer. This utility, freely downloadable from the ESCCABase website, allows users to view and analyze the selected files. The ESCCABase Viewer satisfies the project's fundamental purpose, that is, the graphic representation of the files' content, and is not intended as an alternative to commercially available analysis programs. Nevertheless, the ESCCABase Viewer provides a series of functions, including the choice of the scale, the choice between mono- and bi-parametric representation of data, the choice of the parameters to be represented, and the possibility of setting gates and obtaining simple statistical information (Figure 1). The available scales rely on linear and log-like transforms, of which the latter is intended for the representation of immunophenotypic analyses.
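The idea of a log-like display scale can be illustrated with a minimal sketch. The article does not state which transform the ESCCABase Viewer uses, so the inverse hyperbolic sine chosen here, the cofactor of 150, and the names `loglike` and `to_channels` are all assumptions made for illustration. The sketch also takes the display limits from percentiles rather than from the minimum and maximum, which is one common way to keep a few extreme events from compressing the bulk of the data into a narrow range of channels; it is not presented as the authors' solution.

```python
import math


def loglike(value, cofactor=150.0):
    """Hypothetical log-like scale: roughly linear near zero, roughly logarithmic far from it."""
    return math.asinh(value / cofactor)


def percentile(sorted_vals, pct):
    """Percentile of an already sorted, non-empty list (linear interpolation)."""
    pos = (len(sorted_vals) - 1) * pct / 100.0
    lower = int(math.floor(pos))
    upper = min(lower + 1, len(sorted_vals) - 1)
    return sorted_vals[lower] + (sorted_vals[upper] - sorted_vals[lower]) * (pos - lower)


def to_channels(values, n_channels=256, low_pct=0.5, high_pct=99.5, cofactor=150.0):
    """Map raw values onto display channels 0..n_channels-1.

    The axis limits come from percentiles of the transformed data, so isolated
    outliers are clipped to the first or last channel instead of stretching the axis.
    """
    if not values:
        return []
    transformed = [loglike(v, cofactor) for v in values]
    ordered = sorted(transformed)
    lo = percentile(ordered, low_pct)
    hi = percentile(ordered, high_pct)
    if hi <= lo:  # degenerate case: all events share (almost) the same value
        return [0] * len(transformed)
    channels = []
    for t in transformed:
        clipped = min(max(t, lo), hi)
        channel = int((clipped - lo) / (hi - lo) * n_channels)
        channels.append(min(channel, n_channels - 1))
    return channels
```

Calling `to_channels(events)` on a list of raw intensities returns one channel index per event; negative values, which are frequent in compensated data, are handled because the assumed transform is defined on the whole real line.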
The log-like transform behaves in a substantially satisfactory way; still, it sporadically piles up events in the first channels, probably because outliers adversely affect the automatic scaling, and attempts are underway to solve the problem. The editing of quadrant offsets is not available yet, but we could add it in the future, like other improvements requested by users. Compensation of the data is not currently adjustable, as the originally intended use of the archive only involved already compensated files. However, the uploaded files retain their spillover matrix, which might be used in a future version to manage their compensation.

The ESCCABase project is an ever-growing project because of a series of conditions, namely the rapid change of classifications, the evolution of the technology, and the expansion of its application fields. It follows that the project's functions can be maintained and further expanded only through contributions from the outside. This need raises the problem of erasing sensitive data from the submitted files. For this purpose, a second dedicated utility was developed, called ESCCABase Contribute, which will soon be downloadable from the ESCCABase website. ESCCABase Contribute allows editing or removing the keywords containing sensitive data through a simple grid-like interface, uploading the files to the ESCCABase “data bank” storage, and alerting the Editorial Committee that a new file has been submitted and is pending review. Besides the data-sets, the utility uploads metadata including information about the case, clinical data, images, and the content of all the residual keywords, which include technical details such as laser voltages and the spillover matrix. The availability of the keywords housing the spillover matrix makes future functions for modifying the compensation feasible.
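How a retained spillover matrix could be used to compensate data can be sketched as follows. This is not a description of ESCCABase code, since compensation is not yet adjustable in the Viewer. The sketch assumes that the keyword value follows the layout of the FCS 3.1 `$SPILLOVER` keyword (the number of parameters, their names, then the coefficients in row-major order, with rows as fluorochromes and columns as detectors); files written by older software may store the matrix differently. The function names `parse_spillover` and `compensate_event` are hypothetical.

```python
def parse_spillover(value):
    """Parse a spillover keyword value: n, then n parameter names, then n*n coefficients."""
    fields = [f.strip() for f in value.split(",")]
    n = int(fields[0])
    names = fields[1:1 + n]
    coeffs = [float(x) for x in fields[1 + n:]]
    if len(names) != n or len(coeffs) != n * n:
        raise ValueError("malformed spillover keyword")
    # Row r holds the fractions of fluorochrome r seen by each detector.
    matrix = [coeffs[r * n:(r + 1) * n] for r in range(n)]
    return names, matrix


def compensate_event(matrix, observed):
    """Recover true signals from one observed event.

    Assumed model: observed = true @ matrix (row vectors), so we solve
    transpose(matrix) * true = observed by Gauss-Jordan elimination
    with partial pivoting.
    """
    n = len(matrix)
    # Augmented system built from the transposed spillover matrix.
    aug = [[matrix[c][r] for c in range(n)] + [observed[r]] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular spillover matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]
```

Solving the linear system once per event keeps the sketch short; a real implementation would invert the matrix once and apply it to all events in a file.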
ESCCABase is a client–server platform and consists of a client application, ESCCABase Viewer, and a server application, stratified in three different layers, namely the storage, maintenance, and communication layers. As described above, a second client application, ESCCABase Contribute, will be released soon.

The storage layer mainly consists of a data structure based on a SQL relational database, which manages the logical connection between the FCS files and their related metadata. The maintenance layer allows the management of the data to be stored. ESCCABase Contribute opens files in FCS format from version FCS 3.0 on, anonymizes them, and sends them to the Editorial Committee. We are still studying a new version of ESCCABase Contribute able to manage files that are not fully FCS-compliant.

The communication layer is where all high-level operations (authentication, requests, et cetera) take place. In the first versions of the ESCCABase project, the communication between the client and the server was implemented through a custom-made protocol based on the User Datagram Protocol (UDP). Later, it was modified and enhanced to simulate regular browser communication and overcome usage restrictions inside protected institutions, like hospitals and universities, where firewalls and other network security instruments usually cause difficulties for users.

We designed ESCCABase Viewer as a portable, single-file client that only needs to be downloaded from the ESCCABase website and requires no installation. This design ensures the greatest flexibility and overcomes restrictions such as the impossibility of installing new software on institutional computers. While the server manages authentication, communication, maintenance, and storage of the FCS files, the client manages all the cytometry-related operations, like data display and comparison, gating, statistical calculations, et cetera.
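The logical connection that the storage layer keeps between FCS files and their metadata can be pictured with a small relational sketch. The article does not describe the actual schema or database engine, so everything below is hypothetical: the table and column names, the use of SQLite, and the helper functions `open_repository` and `files_for_diagnosis` are assumptions chosen only to show how a case, its files, and the residual keywords might be linked.

```python
import sqlite3

# Hypothetical schema: one case has many FCS files, one file has many keywords.
SCHEMA = """
CREATE TABLE cases (
    case_id    INTEGER PRIMARY KEY,
    group_name TEXT NOT NULL,
    subgroup   TEXT NOT NULL,
    diagnosis  TEXT NOT NULL
);
CREATE TABLE fcs_files (
    file_id      INTEGER PRIMARY KEY,
    case_id      INTEGER NOT NULL REFERENCES cases(case_id),
    storage_path TEXT NOT NULL UNIQUE,
    fcs_version  TEXT NOT NULL
);
CREATE TABLE keywords (
    file_id INTEGER NOT NULL REFERENCES fcs_files(file_id),
    name    TEXT NOT NULL,
    value   TEXT,
    PRIMARY KEY (file_id, name)
);
"""


def open_repository(path=":memory:"):
    """Create the illustrative schema and enforce the file-to-case links."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def files_for_diagnosis(conn, diagnosis):
    """Return the storage paths of all files attached to cases with this diagnosis."""
    rows = conn.execute(
        "SELECT f.storage_path FROM fcs_files AS f "
        "JOIN cases AS c ON c.case_id = f.case_id "
        "WHERE c.diagnosis = ? ORDER BY f.storage_path",
        (diagnosis,),
    )
    return [row[0] for row in rows]
```

In such a layout, the list shown to the user would come from the `cases` table, while the files themselves stay on the server and are referenced only through `storage_path`.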
Owing to the platform's distributed calculation pattern, in which the clients perform the cytometry-related computations, ESCCABase can easily manage thousands of simultaneous user connections.

From the very beginning, the potential of the ESCCABase project exceeded its original purposes, allowing the practical demonstration of correctly diagnosed cases during teaching activities. It is of the utmost importance that students consult files relevant to the topic discussed and interact with them. Collective access to ESCCABase can occur through the standard facilities of a computer room, and ESCCABase-based sessions have already taken place successfully during courses held by both ESCCA and the Italian Society for Cytometric Cell Analysis (ISCCA), a national scientific society affiliated with ESCCA. Individual but regulated access to the repository opens the way to a self-administered educational approach, that is, to a self-learning activity that allows participants to answer questions and earn the credits provided for by their country's legislation.

The use of ESCCABase Viewer as the only analytical utility makes ESCCABase especially suitable for external quality assessment (EQA). The absence of variables other than the operator's interpretation is highly desirable when evaluating single results against the general average of all the results obtained. Some attempts in this direction are currently in development thanks to a collaboration between ESCCA and UK NEQAS.

The original purposes of the ESCCABase project explain why adherence to the MIFlowCyt standard has not been among its primary goals so far. Nonetheless, the importance of the MIFlowCyt standard makes it desirable that the ESCCABase project also comply with it in the future. ESCCABase Contribute is a tool fit for this purpose. In a forthcoming version, we will shape its grid-like interface to fulfill the MIFlowCyt checklist [10] and transmit to ESCCABase the pieces of information not already contained in the keywords.
ESCCABase arose spontaneously and independently of the other projects mentioned in this article. Although it has undergone a sort of evolutionary convergence with them over time and shares a series of possible objectives, ESCCABase maintains its distinctiveness. The ImmPort project does not focus on cytometric techniques only but collects, besides clinical and methodological information, a large amount of data produced through other laboratory methods [11]. The FlowRepository project focuses on data produced by flow cytometry techniques, but its final goal is to offer the possibility of validating and replicating published results [12]. Cytobank is intended more as a collection of analytical tools and a platform for their development than as a real repository [6]. In each of the cases mentioned above, the files are uploaded “as such” and can be freely downloaded by the users, albeit with restrictions. These files have had a role in activities other than replicating experiments; still, as far as we know, they have not had a role in teaching or quality assurance programs. The files uploaded in the ESCCABase project have primarily an educational role; nevertheless, they constitute an innovative multi-purpose tool for clinical diagnostics, the transmission of knowledge, the training of operators, and the pursuit of analytical quality (Figure 2).

Claudio Ortolani, Mario D'Atri, and Stefano Papa: contributed to conception and design, analysis and interpretation of data, and drafting of the article. Genny Del Zotto, Barbara Canonico, and Loris Zamai: critically revised the article.

The authors report no potential conflict of interest.
