Abstract

Gastrointestinal endoscopy is a visual clinical discipline. Modern digital imaging and communication technology will make the work of the endoscopist widely available as part of the electronic medical record. Videoendoscopy systems can now create text and images that are the components of an electronic record. Most modern endoscopy systems include both image processors and microcomputers capable of capturing, storing, retrieving, and printing these endoscopic images. The ability to create an electronic file representing the endoscopic record, and to integrate that file into a broader electronic record, is limited by the absence of fully developed standards for medical information systems.

Fortunately, developments in fields outside medicine, in particular Internet technology, will contribute substantially to the development of a multimedia electronic record consisting of text, images, and waveform data. The concept of multimedia is widespread in modern computing and is responsible for the rapid expansion of Internet technology. The Internet easily transmits text, images, full-motion video, and sound (waveform) [1]. This multimedia capability requires an elaborate and evolving set of standards. The network communication standard is the transmission control protocol/Internet protocol (TCP/IP), and all devices on the Internet must use this standard for communication. Web pages are transmitted using the hypertext transfer protocol (HTTP) and written in hypertext markup language (HTML), which specifies the way text and images are displayed by Internet browsers. This elaborate system of standards and protocols is maintained by a host of organizations balancing the need to advance the technology rapidly against the need to avoid its domination by any proprietary interest.

Medicine confronts a similar problem: it must develop and support an information infrastructure capable of linking disparate systems, maintaining security, and adapting to changing needs. At present, the vast majority of endoscopic imaging systems are unable to share information with other medical information systems. These limitations are similar to those confronted by other image-oriented specialties such as radiology and cardiology. To circumvent these problems, professional societies, manufacturers, and federal agencies have combined to develop standards designed to improve both interconnection and interoperation among medical information systems. This effort has led to the development of the Digital Imaging and Communications in Medicine (DICOM) Standard [1]. The standard was developed through a collaboration between the American College of Radiology (ACR) and the National Electrical Manufacturers Association (NEMA) [2,3].
It is anticipated that the adoption of DICOM and other medical information system standards will advance the integration of information technology into the practice of medicine.

Imaging the gastrointestinal tract using a videoendoscope requires several basic steps (Fig. 1): illumination by fiberoptic light transmission, surface reflectance, magnification using a lens system, conversion of the reflected photons to a signal by a charge-coupled device (CCD), reconstruction of the signal, and projection onto a monitor [4]. High-speed microcomputers equipped with digital image capture (frame grabber) and network boards linked to video processors permit these images to be captured, stored, printed, and transmitted.

The physical quantities of the colors that represent an image are chromaticity, defined by wavelength, and luminance, defined by the amount of light. The colors produced by a videoendoscope are continuous values. In the digital domain, color must be converted from this continuous, or analog, value to a discrete digital value. These digital values are derived from a system of three-dimensional coordinates that define a color space. Color picture publishing represents a color as a combination of three different colors: cyan, magenta, and yellow (CMY). The cones of the human eye and most computer graphics systems represent color as a combination of red, green, and blue (RGB). Thus, any color at a point in space may be identified by a value on a scale representing the amount of red, green, and blue. An alternative system is to use hue, saturation, and intensity (HSI) as measures of color. Hue identifies the pure color, saturation measures how much the color is diluted with white, and intensity is the degree of brightness. Each coordinate system has properties that may make it more useful for a particular application. For example, most signals that come from cameras are transmitted as a combination of red, green, and blue; as a result, many image capture boards are based on RGB signals. Image processing is easier when color is represented as HSI because calculations may need to be applied to only one HSI axis as opposed to three RGB axes (a conversion sketch follows at the end of this section). Figure 2 represents the relationship between CMY, RGB, and HSI color space.

The number of unique colors that can be represented by the coordinate system depends on the length of each axis. Because the digital world is binary, that is, on or off, the number of possible values is a power of 2, or 2^x. If a color is represented in RGB space by 8 unique binary digits (bits), then there are only 2^8 = 256 colors to choose from. Increasing the number of digits representing a color increases the color range; that is, 16 or 24 bits define 2^16 = 65,536 and 2^24 = 16,777,216 colors, respectively.

An image is presented as a continuous signal which is converted, or transduced, by an analog-to-digital device. To create a digital image, a specific device in the computer called a frame grabber or capture board converts the color signal into digital form. The resulting digital values are mapped to specific locations and stored as a two-dimensional array of numbers. The frame grabber performs two functions: sampling and quantification. Sampling captures evenly spaced data points that represent the image. Quantification assigns each data point a binary value.
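The article itself contains no code; the following minimal Python sketch (all function and variable names are our own) illustrates two of the ideas above: converting an RGB triple to HSI, so that a processing step can operate on a single axis, and quantifying a continuous value into one of 2^bits discrete levels.

    import math

    def rgb_to_hsi(r, g, b):
        """Convert normalized RGB (each 0.0-1.0) to hue (degrees),
        saturation (0-1), and intensity (0-1)."""
        intensity = (r + g + b) / 3.0
        saturation = 0.0 if intensity == 0 else 1.0 - min(r, g, b) / intensity
        num = 0.5 * ((r - g) + (r - b))
        den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
        if den == 0:
            hue = 0.0                          # hue is undefined for pure gray
        else:
            hue = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
            if b > g:                          # lower half of the color circle
                hue = 360.0 - hue
        return hue, saturation, intensity

    def quantize(value, bits):
        """Map a continuous value in [0.0, 1.0] to one of 2**bits levels."""
        levels = 2 ** bits                     # 8 -> 256, 24 -> 16,777,216
        return min(int(value * levels), levels - 1)

    print(rgb_to_hsi(0.8, 0.3, 0.3))           # reddish pixel: hue near 0 degrees
    print(quantize(0.5, 8))                    # 128 of 256 levels

The HSI form makes it possible, for example, to adjust brightness by changing intensity alone while leaving hue untouched.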
The evenly spaced data points for an image represent specific two-dimensional locations called picture elements, or pixels [5]. The pixel is the basic unit of a digital image, and each pixel stores the value produced by quantification (Fig. 3). It is easiest to think of the value at a specific pixel location as a measure of intensity. A black and white (gray scale) image digitized by an 8-bit image capture board is represented by 256 shades of gray because 2^8 = 256, with black = 0 and white = 255. Color is more complex. The range of colors depends on the number of bits that can be stored at the pixel location. Thus, an 8-bit frame grabber can capture 8 bits/pixel or 256 colors/pixel, a 16-bit frame grabber 16 bits/pixel or 65,536 colors/pixel, and a 24-bit frame grabber 24 bits/pixel or 16.7 million colors/pixel. Figure 4 represents an endoscopic image with 24-, 16-, and 8-bit color ranges. It is important to recognize that the color range of an endoscopic image is small. It is for this reason that there appears to be little difference between frame grabbers that capture 16 and 24 bits/pixel.

The sampling process has a significant effect on the resolution of the image. Sampling density is simply the number of pixels into which an image is divided by the frame grabber. The greater the number of pixels per unit area, the higher the resolution of the image. For an image of a given size, sampling density can be defined by the dimensions of the image in pixels. For example, 640 × 480 represents an image that is 640 pixels wide and 480 pixels high. If this same image is sampled at 1024 × 768, then the number of pixels per unit area is higher and the resolution is greater. Sampling becomes important when images are enlarged because there is a discrete separation between adjacent points in the image. Thus, zooming an image that has been sampled at a low density quickly reveals the individual pixels, a phenomenon called pixelation. In addition to width and height, a digital image is represented by color as a third axis. The relationship between image size and color depth can be seen in Figure 5. The larger the image size and the greater the color depth, the bigger the digital file that is produced.

This has significant implications for image management. Larger images reveal more detail, but they require greater computer resources: more space for storage, faster networks for transmission, and more powerful processors for manipulation. Figure 6 illustrates the effect of changing the height and width of the sampling process without changing the color depth.

(Figure 6. Image size as a function of dimension.)

Each frame grabber has sampled the endoscopic image with the same color depth but with different dimensions. The result is that the file size of the smallest image is one tenth the size of the largest image; the smallest image occupies one tenth the disk space and will be transmitted in one tenth the time. However, if the smallest image were enlarged to the size of the largest image, the resolution would be reduced. In some clinical situations size will not make a difference; a large mass may be easily identified at low resolution, but subtle findings such as the granularity of the mucosa may not be identifiable. The degree of resolution needed also depends on how the image is processed.
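The file-size arithmetic behind Figures 5 and 6 is easy to make concrete. In this sketch (our own illustration, not from the article), the uncompressed size is simply width × height × bits per pixel ÷ 8 bytes:

    def image_bytes(width, height, bits_per_pixel):
        """Uncompressed size: bits_per_pixel stored for every pixel."""
        return width * height * bits_per_pixel // 8

    for w, h in [(320, 240), (640, 480), (1024, 768)]:
        print(f"{w} x {h}: {image_bytes(w, h, 24) / 1e6:.2f} MB at 24 bits/pixel")
    # 320 x 240:  0.23 MB
    # 640 x 480:  0.92 MB
    # 1024 x 768: 2.36 MB

The 1024 × 768 image is roughly ten times the size of the 320 × 240 image, which matches the tenfold differences in storage space and transmission time described above.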
If an image is printed at small dimensions, lower resolution may be sufficient; if it is being enlarged to examine fine detail, higher resolution may be necessary. The image produced by the video signal of an endoscope also has a specific aspect ratio, the relationship between height and width. This relationship can be altered by both the capture and the display process, and an image will be distorted if there is a significant difference between the capture and display aspect ratios.

Image compression is a computational processing technique that reduces the size of an image file [5]. These techniques allow large image files to be compressed for storage or transmission. Full-motion video requires a display rate of 30 frames/second; if each frame is 0.5 megabytes, then one second of digital video contains 15 megabytes of data. Disk storage would be rapidly exceeded, and image transmission even on high-speed networks would be slow. Compression is measured as the ratio of the size of the original data to that of the compressed data. There are two basic types of image compression: lossless and lossy. Lossless compression encodes and decodes the image exactly; no data is lost. Lossy compression allows redundant and nonessential information to be lost. Lossless compression has a lower compression ratio because all data is retained. A simple example of lossless compression is run length encoding (RLE), which takes advantage of repetitive data. For example: AAAACCCCCCDDDDDDDFFGGGGGGHHHHHH = 4A6C7D2F6G6H (a runnable sketch follows below). Common image compression techniques based on different algorithms include JPEG, produced by the Joint Photographic Experts Group; MPEG, from the Moving Picture Experts Group; and wavelet compression. One of the unresolved issues in medical imaging is the degree to which images can be compressed without loss of clinically useful information.

Images are captured and stored using specific file formats. The file format specifies the precise manner in which the captured data are organized: the header provides data about how the file is organized, and the image file contains the image data organized by pixel. Files in different formats are not equivalent and need to be translated (see the conversion example below). Examples of image file formats are .tga, .tif, .bmp, .jpg, and .gif. In endoscopy most capture boards use the .tga or .tif file format.

A network is a system of linked computers arranged in a specific manner, or topology. For software applications to operate correctly on these networks, several layers of standards must exist (Fig. 7). Each layer is precisely defined so that any vendor complying with the standard can plug its system into the network. This level of standardization is critical for the development of modern hospital information systems, which have moved away from centralized mainframe computers to networks of decentralized systems that support the operation of various departments. The challenge confronting the medical community is how to integrate these disparate systems so that information can move easily and securely. Endoscopic information systems must comply with these standards for their data to become part of these networks. When images reside on a single system there is no need for a common communication standard. However, the advent of large-capacity portable media, such as portable disks and writeable CD-ROMs, and of high-speed computer networks requires a common format for data exchange.
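As an illustration of the RLE example above, here is a minimal Python encoder and decoder (our own sketch; the article contains no code). It emits count-letter pairs and reports the compression ratio:

    from itertools import groupby

    def rle_encode(text):
        """Collapse each run of identical characters into 'countChar'."""
        return "".join(f"{len(list(run))}{ch}" for ch, run in groupby(text))

    def rle_decode(encoded):
        """Expand 'countChar' pairs back to the original string (lossless)."""
        out, count = [], ""
        for ch in encoded:
            if ch.isdigit():
                count += ch            # counts may span several digits
            else:
                out.append(ch * int(count))
                count = ""
        return "".join(out)

    data = "AAAACCCCCCDDDDDDDFFGGGGGGHHHHHH"
    packed = rle_encode(data)
    print(packed)                       # 4A6C7D2F6G6H
    print(rle_decode(packed) == data)   # True: exact round trip, no data lost
    print(f"compression ratio {len(data) / len(packed):.2f}:1")

Note that RLE only pays off when runs are long; on data with little repetition the encoded form can be larger than the original, which is one reason practical schemes such as JPEG use more sophisticated algorithms.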
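File-format translation of the kind described above is routine with standard imaging libraries. A sketch using the open-source Pillow library (our choice for illustration; the article names no software, and the file names are hypothetical) converts a captured .tga frame to .tif and .jpg:

    from PIL import Image  # pip install Pillow

    # Hypothetical frame captured by an endoscopy workstation.
    frame = Image.open("endoscopy_frame.tga")

    frame.save("endoscopy_frame.tif")               # lossless TIFF copy
    frame.save("endoscopy_frame.jpg", quality=90)   # lossy JPEG, smaller file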
Several barriers impede the development of the electronic medical record and the integration of endoscopic images into that record:

1. Complexity of the healthcare enterprise. Medical information systems must support real-world activities to be effective, and the extraordinarily complex, changing nature of healthcare makes systems development a daunting and expensive task.

2. Inadequate technologic infrastructure. The basic requirement of a comprehensive solution is effective communication technology. The quantity and nature of the data required, and the disparate sources of those data, can be supported only by systems that connect a broad range of computing devices. Thus, issues of speed, accessibility, and security restrict and impede development.

3. Capital investment. Rapid change in technology and the high cost of development inhibit corporate investment. Construction and execution of large data processing programs are notoriously difficult and are usually late and over budget. No single organization, including federal and state governments, is capable of developing and supporting this approach alone.

4. Standards development. In spite of the efforts of numerous organizations to develop common standards for information systems, there is no uniformity in approach, support, or dissemination.

The ACR and NEMA formed a joint committee early in 1983 to develop a standard means of interconnection for medical imaging devices. To accomplish this task, the standard would include a dictionary of the data elements needed for proper image display and a hardware specification for physically connecting the devices. The goals of the ACR-NEMA effort were to (1) promote communication of digital image information regardless of device manufacturer, (2) facilitate the development and expansion of picture archiving and communication systems (PACS) that can also interface with other systems within the hospital information system, and (3) allow the creation of diagnostic information databases that can be interrogated by a wide variety of geographically distributed devices. The DICOM Standard is recognized in the United States, Europe, and Japan as the standard for digital imaging in medicine.

DICOM relies on explicit and detailed models of how the "things" (patients, images, reports, etc.) involved in imaging operations are described and how they are related. These models are called entity-relationship (E-R) models and are a way to ensure that manufacturers and users understand the basis for the data structures used in DICOM. Figure 8 shows an example of an E-R diagram. This model is used to create information object definitions (IODs) for all of the imaging modalities covered by DICOM. In looking at an E-R diagram it is important to note that it is not a flowchart describing the steps of information movement; rather, it shows the relationships and hierarchies of information elements. Arrows are added to the diagrams so that the direction of relationships is not misinterpreted. These diagrams show the assumptions made in developing the components of the DICOM Standard. An information object is a combination of information entities, and each entity consists of specific modules. A service class defines the service that can be performed on an information object, for example, print, store, or retrieve.
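The article presents the E-R model only as a diagram. As a hedged illustration, the hierarchy it implies (a patient has studies, a study has series, a series has images) might be sketched in Python as follows; all class and field names are ours, not DICOM's:

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Image:
        sop_instance_uid: str          # unique identifier for this image
        rows: int
        columns: int

    @dataclass
    class Series:
        modality: str                  # e.g., "ES" (endoscopy), "CT", "US"
        images: List[Image] = field(default_factory=list)

    @dataclass
    class Study:
        study_uid: str
        series: List[Series] = field(default_factory=list)

    @dataclass
    class Patient:
        name: str
        patient_id: str
        studies: List[Study] = field(default_factory=list)

    # One patient -> one study -> one endoscopy series -> one image.
    pt = Patient("Doe^Jane", "MRN0001",
                 [Study("1.2.3", [Series("ES", [Image("1.2.3.4", 480, 640)])])])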
In DICOM a service is combined with an information object to form a service/object pair (SOP); for example, storing a CT scan or printing an ultrasound image is a SOP. A device that conforms to the DICOM Standard can perform this function. Thus, in a DICOM-conforming network the devices must be capable of executing one or more of the operations the SOP definition prescribes. Each imaging modality has an IOD. The result is that different imaging modalities such as CT, MR, digital angiography, ultrasound, endoscopy, and pathology; imaging workstations; picture archiving systems; and printing devices can be networked and can cooperate at a high level. In addition, these imaging networks can be connected to other networks found in a hospital or facility.

The modules that comprise an information entity are precisely defined and may be common to multiple entities. The patient entity in Figure 8 is common to all IODs. However, the image entity must be capable of supporting different imaging modalities: an IOD that supports endoscopy will of necessity include modules unique to endoscopy and distinct from those of a CT IOD. The patient information entity (IE) defines the characteristics of a patient who is the imaging subject of one or more procedures that produce images. The patient IE is modality independent, that is, it is common to all imaging modalities. The patient IE consists of only one module, which is illustrated in Table 1.

Table 1. DICOM patient information entity module attributes

Attribute name               | Tag         | Type | Attribute description
Patient's name               | (0010,0010) | 2    | Patient's full legal name
Patient ID                   | (0010,0020) | 2    | Primary hospital identification number or code for the patient
Patient's birth date         | (0010,0030) | 2    | Birth date of the patient
Patient's sex                | (0010,0040) | 2    | Sex of the named patient; enumerated values: M = male, F = female, O = other
Referenced patient sequence  | (0008,1120) | 3    | A sequence providing reference to a patient SOP class/instance pair; only a single reference is allowed; encoded as a sequence of items (0008,1150) and (0008,1155)
Referenced SOP class UID     | (0008,1150) | 1C   | Uniquely identifies the referenced SOP class; required if referenced patient sequence (0008,1120) is sent
Referenced SOP instance UID  | (0008,1155) | 1C   | Uniquely identifies the referenced SOP instance; required if referenced patient sequence (0008,1120) is sent
Patient's birth time         | (0010,0032) | 3    | Birth time of the patient
Other patient ID             | (0010,1000) | 3    | Other identification numbers or codes used to identify the patient
Other patient names          | (0010,1001) | 3    | Other names used to identify the patient
Ethnic group                 | (0010,2160) | 3    | Ethnic group or race of the patient
Patient comments             | (0010,4000) | 3    | User-defined additional information about the patient

Each module is a table consisting of four elements: attribute name, tag, type, and attribute description. The attribute name and description define the attribute precisely. The attribute tag uniquely identifies that attribute among all of the many other attributes present; the tag (0010,0010) always identifies the patient name. The attribute type specifies whether the attribute is mandatory or optional. For example, it is not necessary for an image to be transmitted with the patient's name. In fact, DICOM requires only a few mandatory attributes that give the study a unique identifier, define the modality (e.g., CT, MR, ultrasound), and provide information about the image (e.g., pixel data and the numbers of rows and columns).
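As a hedged illustration of the patient module (the article names no software), the open-source pydicom library exposes these attributes by keyword as well as by numeric tag; all patient values below are invented:

    from pydicom.dataset import Dataset

    ds = Dataset()
    # PN value representation: family^given^middle^prefix^suffix,
    # described in the text that follows.
    ds.PatientName = "Adams^John^Quincy^Rev^Jr"   # tag (0010,0010)
    ds.PatientID = "MRN0001"                      # tag (0010,0020)
    ds.PatientBirthDate = "19650401"              # tag (0010,0030), YYYYMMDD
    ds.PatientSex = "M"                           # tag (0010,0040): M, F, or O

    # The same element is reachable through its numeric tag.
    print(ds[0x0010, 0x0010].value)               # Adams^John^Quincy^Rev^Jr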
DICOM also provides a dictionary that specifies the form in which the value of each attribute must be presented. The patient name attribute (0010,0010) uses person name (PN) as its value representation. PN contains five components in the following order: family name, given name, middle name, name prefix, and name suffix. Thus any system that complies with DICOM knows that (0010,0010) is a person name and that the format of the transmitted information is defined by the DICOM Standard.

It is not sufficient to define a standard; it is also necessary to provide a mechanism that enables vendors and purchasers to determine whether a system conforms to the standard. DICOM defines a conformance statement that must be associated with a specific implementation of the DICOM Standard. It specifies the service classes, information objects, communication protocols, and media storage applications supported by the implementation. The conformance statement is provided by the vendor and identifies the system's capabilities.

The American Society for Gastrointestinal Endoscopy (ASGE), in collaboration with other medical and surgical societies such as the European Society for Gastrointestinal Endoscopy (ESGE), the American College of Radiology, the College of American Pathologists, the American Academy of Ophthalmology, and the American Dental Association, has defined a new supplement to the DICOM Standard. This supplement specifies a DICOM image IOD for visible light (VL) images. It will enable specialists working with color images to exchange images between different imaging systems using direct network connections, telecommunications, and portable media such as CD-ROM and magneto-optical disk. The DICOM Standard for endoscopy is part of a larger standard for color images in medicine that has been provisionally approved by the DICOM committee. The current version will go through a process of public comment and testing, which ensures that any interested party may review the document and suggest changes to the committee responsible for creating the final version. This process is time-consuming, but it ensures that the standard is comprehensive and meets the needs of a broad group of users.

The endoscopy community, through the ASGE and ESGE, has also suggested that the DICOM Standard be expanded to incorporate other information associated with the imaging study, including image labels and overlays, sound, and waveform data. The goal of a true multimedia report will be achieved only when these standards have been thoroughly tested and implemented as part of the daily clinical activities of gastrointestinal endoscopists throughout the world. The cooperation of endoscopists, professional societies, and industry is absolutely necessary for improved endoscopic information systems and will result in improved patient care.

We thank the ASGE, Ollie Cass, MD, James Barthel, MD, and members of the Informatics Committee, L. J. Hunyadi, and the vendor community, Michel Delvaux, MD (ESGE), and Dean Bidgood, MD (ACR), for their work in developing the standards for gastrointestinal endoscopy.

References

1. Fraser HS, Kohane IS, Long WJ. Using the technology of the World Wide Web to manage clinical information. BMJ 1997;314:1600-3.
2. Digital Imaging and Communications in Medicine (DICOM). Rosslyn (VA): National Electrical Manufacturers Association; 1997. p. PS3.1-PS3.12.
3. Bidgood WD Jr, Horii SC, Prior FW, Van Syckle DE. Understanding and using DICOM, the medical image communication standard. J Am Med Inform Assoc 1997;4:199-212.
4. Sivak MV. Video endoscopy. Clin Gastroenterol 1986;15:205-234.
5. Crane R. A simplified approach to image processing. Upper Saddle River (NJ): Prentice Hall; 1997.