Abstract

The Dutch and the French schools of data analysis differ in their approaches to the question: How does one understand and summarize the information contained in a data set? The commonalities and discrepancies between the schools are explored here with a focus on methods dedicated to the analysis of categorical data, which are known either as homogeneity analysis (HOMALS) or multiple correspondence analysis (MCA).

Highlights

  • In the 1960s two currents of research emerged in the spirit of Tukey’s exploratory data analysis (Tukey 1962): the French school and the Dutch school

  • Methods based on principal component analysis have roughly similar aims, such as studying similarities between rows, similarities between columns, and associations between rows and columns, but they differ in the nature of the data they handle: principal component analysis for continuous data, correspondence analysis for contingency tables, and multiple correspondence analysis for categorical data

  • In the words of Benzécri, “all in all, doing a data analysis, in good mathematics, is searching eigenvectors; all the science of it is in finding the right matrix to diagonalize.” Although many contributions in the French school were never translated to English, many references are available and include Benzécri (1982); Le Roux and Rouanet (2004); Murtagh (2005); Holmes (2008); Lebart (2008); Lebart and Saporta (2014); Lebaron and Le Roux (2015)
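
To make the "right matrix to diagonalize" idea concrete, here is a minimal sketch of multiple correspondence analysis computed as a correspondence analysis of the indicator (one-hot) matrix, using a singular value decomposition of the standardized residuals. It is an illustration under stated assumptions, not the authors' implementation and not the HOMALS algorithm: the function names `indicator_matrix` and `mca`, the `v{j}=category` label format, and the toy data are all hypothetical, and only NumPy is assumed.

```python
import numpy as np


def indicator_matrix(data):
    """One-hot encode rows of categorical labels (hypothetical helper)."""
    n_vars = len(data[0])
    blocks, labels = [], []
    for j in range(n_vars):
        column = [row[j] for row in data]
        cats = sorted(set(column))
        # One 0/1 column per category of variable j.
        block = np.array([[1.0 if v == c else 0.0 for c in cats] for v in column])
        blocks.append(block)
        labels.extend(f"v{j}={c}" for c in cats)
    return np.hstack(blocks), labels


def mca(data, n_components=2):
    """Sketch of MCA as correspondence analysis of the indicator matrix."""
    Z, labels = indicator_matrix(data)
    P = Z / Z.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    # Standardized residuals: the matrix whose SVD gives the solution.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sing, Vt = np.linalg.svd(S, full_matrices=False)
    k = n_components
    row_coords = (U[:, :k] * sing[:k]) / np.sqrt(r)[:, None]
    col_coords = (Vt.T[:, :k] * sing[:k]) / np.sqrt(c)[:, None]
    eigenvalues = sing[:k] ** 2          # principal inertias
    return row_coords, col_coords, eigenvalues, labels


# Made-up toy data: 6 individuals described by 3 categorical variables.
toy = [
    ("a", "x", "p"),
    ("a", "x", "q"),
    ("b", "y", "p"),
    ("b", "y", "q"),
    ("a", "y", "p"),
    ("b", "x", "q"),
]

rows, cols, eig, labels = mca(toy, n_components=2)
for name, xy in zip(labels, cols):
    print(name, np.round(xy, 3))
```

The squared singular values play the role of the eigenvalues in Benzécri's description. For an indicator matrix with K categories over Q variables, they sum to K/Q - 1 when all components are kept, which gives a quick sanity check on the sketch.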


Summary

Introduction

In the 1960s two currents of research emerged in the spirit of Tukey’s exploratory data analysis (Tukey 1962): the French school and the Dutch school. The French school of “analyse des données” (data analysis) was led by Jean-Paul Benzécri, a mathematician and linguist, who encouraged the idea of “letting the data speak for themselves.” One of his famous quotes (Benzécri 1973, p. 6, Tome 2) starts with, “The model must follow the data, not the other way around,” and ends with, “What we need is a rigorous method which extracts structures from the data.” He described “statistical analysis as a tool [...]” [...] This popularity is mainly due to the prevalence of categorical data in this area. [...] It is more difficult for us to talk about the Dutch school and to reflect on Jan de Leeuw’s views of statistics, models, and inference without taking the risk of misrepresenting his thoughts. We show how Jan’s developments influenced the French school.

HOMALS and multiple correspondence analysis
Classical MCA presentation
Classical HOMALS presentation
Connection between HOMALS and MCA
Handling missing values in HOMALS and MCA
Example
Influence on the early works
Contribution of optimization methods in recent work