Supplementary Figure 1. The anatomy of the vocal tract shows continuous and overlapping but identifiable variation between broad ethno-linguistic groups. We show the results of a Canonical Variate Analysis (CVA) of 57 classical anthropological measurements of the oral vocal tract, derived from the 3D intra-oral optical scans of n=94 participants from the ArtiVarK sample [ref16], who fall into four broad self-declared ethno-linguistic groups. These groups are: "Ca" = European or North American of European descent, speaking Indo-European (mostly Germanic and Romance) languages; "NI" = North Indian, speaking Indo-Aryan languages; "SI" = South Indian, speaking Dravidian languages; and "C" = Chinese, speaking Sino-Tibetan languages. Panels (a) and (b) show the distribution of the participants (each represented by their group label) in the space of the first three canonical variates (CVs; explaining 49.3%, 37.6% and 13.1% of the variance, respectively); the solid polygons are the convex hulls and the colored ellipses are the 95% confidence ellipses. Panel (c) plots, as vertical bars, the posterior probabilities of each participant belonging to each of the four groups; the symbols at the top show the actual group (squares) and the assigned group (circles). Gray circles mark "outlier" participants, who cannot be assigned to any group because their probabilities fall below the 5% threshold (solid horizontal line); the dotted horizontal line marks a probability of 1.0. CVA recovers the groups well (84% overall classification accuracy) despite a few misclassifications and "outliers", and the four groups, while overlapping, are separated by the first three CVs.
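The assignment rule read off panel (c) can be illustrated with a minimal sketch. It assumes that a participant is assigned to the group with the highest probability, and is treated as an "outlier" when even that highest value is below the 5% threshold. The function names, the toy probabilities (which are invented and not constrained to sum to one), and the choice to count outliers as errors in the accuracy are all assumptions made for illustration, not the procedure used to produce the figure.

```python
# Illustrative sketch only: names, toy values and the exact rule are assumptions.
GROUPS = ["Ca", "NI", "SI", "C"]
THRESHOLD = 0.05  # the 5% line in panel (c)


def assign_group(probs, groups=GROUPS, threshold=THRESHOLD):
    """Return the label with the highest probability, or None for an "outlier"."""
    best = max(range(len(groups)), key=lambda i: probs[i])
    if probs[best] < threshold:
        return None  # no group reaches the threshold
    return groups[best]


def accuracy(actual, assigned):
    """Share of participants whose assigned group matches the actual one.

    Assumption: outliers (None) never match and so count as errors.
    """
    hits = sum(a == b for a, b in zip(actual, assigned))
    return hits / len(actual)


# Invented toy data, one row of per-group probabilities per participant.
toy_probs = [
    [0.90, 0.05, 0.03, 0.02],  # highest is Ca
    [0.10, 0.70, 0.15, 0.05],  # highest is NI
    [0.01, 0.02, 0.03, 0.04],  # everything below the threshold: outlier
    [0.05, 0.40, 0.35, 0.20],  # highest is NI
]
toy_actual = ["Ca", "NI", "C", "SI"]

toy_assigned = [assign_group(p) for p in toy_probs]
toy_accuracy = accuracy(toy_actual, toy_assigned)
print(toy_assigned, toy_accuracy)
```

In this toy case the third participant is an outlier and the fourth is misclassified, which corresponds to a gray circle and a circle that differs from its square in panel (c), respectively.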