www.nature.com/ejhg
ARTICLE
OPEN
Identification of people with Lynch syndrome from those
presenting with colorectal cancer in England: baseline analysis
of the diagnostic pathway
Fiona E. McRonald1 ✉, Joanna Pethick1, Francesco Santaniello1,2, Brian Shand1,2, Adele Tyson1,3, Oliver Tulloch1,2, Shilpi Goel1,2,
Margreet Lüchtenborg1,4, Gillian M. Borthwick5, Clare Turnbull6, Adam C. Shaw3, Kevin J. Monahan7, Ian M. Frayling7,8,
Steven Hardy1 and John Burn5 ✉
© The Author(s) 2024
It is believed that >95% of people with Lynch syndrome (LS) remain undiagnosed. Within the National Health Service (NHS) in
England, formal guidelines issued in 2017 state that all colorectal cancers (CRC) should be tested for DNA Mismatch Repair
deficiency (dMMR). We used a comprehensive population-level national dataset to analyse implementation of the agreed
diagnostic pathway at a baseline point 2 years post-publication of official guidelines. Using real-world data collected and curated by
the National Cancer Registration and Analysis Service (NCRAS), we retrospectively followed up all people diagnosed with CRC in
England in 2019. Nationwide laboratory diagnostic data incorporated somatic (tumour) testing for dMMR (via
immunohistochemistry or microsatellite instability), somatic testing for MLH1 promoter methylation and BRAF status, and
constitutional (germline) testing of MMR genes. Only 44% of CRCs were screened for dMMR; these figures varied over four-fold with
respect to geography. Of those CRCs identified as dMMR, only 51% underwent subsequent diagnostic testing. Overall, only 1.3% of
patients with colorectal cancer had a germline MMR genetic test performed; up to 37% of these tests occurred outside of NICE
guidelines. The low rates of molecular diagnostic testing in CRC support the premise that Lynch syndrome is underdiagnosed, with
significant attrition at all stages of the testing pathway. Applying our methodology to subsequent years’ data will allow ongoing
monitoring and analysis of the impact of recent investment. If the diagnostic guidelines were fully implemented, we estimate that
up to 700 additional people with LS could be identified each year.
European Journal of Human Genetics (2024) 32:529–538; https://doi.org/10.1038/s41431-024-01550-w
INTRODUCTION
At least 3% of cancers are attributable to constitutional (germline)
pathogenic variants in a cancer susceptibility gene (CSG) [1].
Families harbouring these constitutional pathogenic variants were
classically ascertained by clinical geneticists, based on familial
clustering of related tumour types in several relatives, multiple
primary tumours in some individuals, and tumour development at
a younger age than typical for that cancer type. However, more
widespread availability of molecular diagnostics has revealed
other individuals who carry a similar genetic predisposition, but
with a more subtle familial phenotype, or absence of a family
history of similar cancers [2, 3]. Ascertainment has therefore been
biased towards the classical familial pattern rather than the
individual’s own phenotype.
The Mismatch Repair (MMR) family of proteins is responsible for
rectifying DNA replication errors that arise during the S-phase of
the cell cycle. Germline pathogenic variants affecting any of the
four MMR genes MLH1, MSH2, MSH6 or PMS2 underlie Lynch
syndrome (LS), conferring a strong predisposition towards various
cancers—predominantly colorectal and endometrial carcinoma,
but also others including urothelial, ovarian, and upper gastrointestinal cancers, and sebaceous dermatological tumours [4].
Estimates of the true population prevalence of LS [5–7] indicate
substantial underdiagnosis, hence NHS England’s imperative to
identify more cases. Outcomes for people diagnosed with LS
could be improved by offering regular colonoscopy, aspirin and
prophylactic gynaecological surgery, leading to reduced cancer
incidence and earlier diagnosis. This could result in significant
financial savings across the NHS [8], in addition to the primary
objective of saving lives.
National Institute for Health and Care Excellence (NICE) guidelines (DG27) [9] issued in February 2017 state that all colorectal
cancers (CRC) should be tested for MMR deficiency (dMMR) at the
point of diagnosis, using either immunohistochemistry (IHC) or
microsatellite instability (MSI) testing. Any tumours with evidence of
dMMR should undergo further molecular tests, culminating in
germline MMR gene testing for individuals at highest likelihood of
having LS. In 2018, the charity Bowel Cancer UK initiated a Freedom
1National Disease Registration Service, NHS England, London, UK. 2Health Data Insight, Cambridge, UK. 3Guy’s and St. Thomas’ NHS Foundation Trust, London, UK. 4Cancer
Epidemiology and Cancer Services Research, King’s College London, London, UK. 5Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, UK.
6The Institute of Cancer Research, Sutton, UK. 7St Mark’s Hospital Centre for Familial Intestinal Cancer, Imperial College, London, UK. 8St Vincent’s University Hospital, Dublin,
Ireland. ✉email: fiona.mcronald@nhs.net; john.burn@newcastle.ac.uk
Received: 8 August 2023 Revised: 8 January 2024 Accepted: 23 January 2024
Published online: 15 February 2024
F.E. McRonald et al.
of Information request [10] and campaign [11]—‘Time To Test’—
finding that MMR testing guidelines were being implemented by
only 17% of hospitals in England, with cited barriers to testing
including funding, staff capacity, awareness and local policy.
Whilst the diagnostic guidelines are clear, it is important to
evaluate whether these are being consistently applied across the
different NHS Cancer Alliances (regional healthcare partnerships
that drive integration of local cancer services), and to highlight
any inequities. This requires large scale, population-level collection
and curation of molecular testing data, and robust linkage to
cancer diagnoses. The National Disease Registration Service
(NDRS) has developed a programme of work collating germline
and somatic genetic testing data from NHS laboratories. By linking
these data at patient- and tumour-level to national cancer
registration records [12], we are, for the first time, able to describe
the English national landscape of LS molecular diagnostic testing.
The baseline data presented here refer to all colorectal cancers
diagnosed in England in the year 2019, the first year for which
national molecular data collections made this possible.
METHODS
Cancer registration
The National Cancer Registration and Analysis Service (NCRAS), part of NHS
England, constructs the population-based cancer registry for England [12].
Somatic genomic testing data was derived from two sources: bespoke data
extracts supplied by individual genomic laboratories, and pathology
reports acquired through the nationally mandated Cancer Outcomes and
Services Dataset (COSD). Laboratory germline data on MMR genes was
submitted and processed via pseudonymisation and bioinformatics
pipelines previously described [13], and linked at patient-level. Somatic
data was linked at tumour-level. Where MMR testing was referenced in the
initial pathology report, but there was no supplementary report containing
the MMR test results, this was fed back to the relevant NHS Trust by the
NCRAS Data Improvement Team, to maximise national data completeness.
Data analysis
From the 2019 end of year cancer registration table, 37,662 colorectal
tumours (10th revision of the International Classification of Diseases (ICD-10) C18, C19 or C20) diagnosed in 2019 were identified. All tumours were
linked to the genomic testing data up to the end of 2020 (latest available
data at the time of writing).
From the cancer registry data, information on patients’ demographics
and tumour information was retrieved. Patients were assigned a Cancer
Alliance based upon their postcode of residence at diagnosis, using the
2019 geographical boundaries. Age groups were banded from 10–29
years, then by 10-year intervals between 30–49 years, 5-year intervals
between 50–89 years, then 90 years+.
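The age banding described above can be sketched as a small function. This is an illustrative Python sketch only (the authors report using R); the function name `age_band` and the handling of ages under 10 are assumptions, not taken from the paper.

```python
def age_band(age: int) -> str:
    """Return the age band label for an age in whole years."""
    if age < 10:
        # Assumption: the paper does not describe a band below 10 years.
        raise ValueError("no band defined for ages under 10")
    if age <= 29:
        return "10–29"
    if age <= 49:
        lower = (age // 10) * 10  # 10-year bands: 30–39, 40–49
        return f"{lower}–{lower + 9}"
    if age <= 89:
        lower = (age // 5) * 5    # 5-year bands: 50–54 up to 85–89
        return f"{lower}–{lower + 4}"
    return "90+"


for example_age in (25, 34, 52, 89, 93):
    print(example_age, age_band(example_age))
```

The band labels match the row labels used in Table 1.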
Self-reported gender and ethnicity information is recorded in the cancer
registration data from clinical records; ethnicity was categorised according to
the 16-category classification as used in the 2021 Census of England and
Wales. This was then collapsed to seven ethnic groups: White, Asian, Chinese,
Black, Mixed, Other, and Unknown. Each patient’s socioeconomic deprivation
quintile was assigned using the patient’s residential postcode at the time of
diagnosis and based upon the quintile distribution of the lower-layer super
output area (LSOA) ranking of the Indices of Multiple Deprivation (IMD) 2019,
with 1 being the most deprived and 5 being the least deprived. Tumour
stage is recorded according to the Union for International Cancer Control
(UICC) Classification of Malignant Tumours (TNM). Colorectal cancer grading
is recorded as 1 to 4, with 1 representing well differentiated cancer cells
through to 4 when cancer cells are poorly differentiated or undifferentiated.
Descriptive statistics, chi-squared and t-tests, and logistic regression
analyses were carried out using R software [14].
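As an illustration of the chi-squared comparisons, the sketch below recomputes a Pearson chi-squared test for the gender rows of Table 1 in plain Python. It is not the authors' R code; the function name is hypothetical, and whether the original analysis applied a continuity correction is not stated, so none is applied here.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared statistic and p-value for a 2x2 table (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-squared upper tail equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: female, male; columns: not MMR tested, MMR tested (counts from Table 1).
gender_counts = [[9581, 7161], [11618, 9302]]
stat, p_value = chi_squared_2x2(gender_counts)
print(f"chi-squared = {stat:.2f}, p = {p_value:.4f}")
```

The resulting p-value is of the same order as the 0.001 reported for gender in Table 1.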
Ethical and legal considerations
The data included in this study were collected and analysed under the
National Disease Registries Directions 2021 [15], made in accordance with
sections 254(1) and 254(6) of the 2012 Health and Social Care Act.
Before embarking upon the collection of genetic data, we sought
courtesy permission from the Caldicott Guardian at each NHS Trust
housing the relevant laboratories.
Patient and public involvement
Author JB has been involved with the patient group Lynch Syndrome UK
(LSUK) from when it was established as a charity in 2014, initially as the
Clinical Director. JB, GMB, FEM, IMF and KJM have all presented work in
progress to LSUK at their annual conference, and are in regular contact,
receiving patient feedback.
RESULTS
Somatic testing
In 2019, 37,662 CRCs (from 37,090 people) were diagnosed in
England. Under half of these (44%; 16,463) were tested for
dMMR. IHC was the preferred test method in 89% of cases; the
remainder were tested by MSI (8%) or by both methods (3%). The
dMMR detection rates were slightly higher for MSI (19%
detection rate) than for IHC (16% detection rate) (χ2 = 12.0;
df = 1; p < 0.01).
To triage individuals for germline testing as per NICE guidelines,
dMMR tumours can be further subdivided according to
MLH1 status. Individuals whose tumours are proficient for MLH1,
but abnormal for one or more of the other MMR proteins (MSH2,
MSH6 or PMS2), should be offered direct referral for germline
testing; tumours with MLH1 abnormality require further
somatic tests.
Overall, 16% (n = 2576) of tested CRCs were dMMR. Of these, 15%
(n = 386 tumours from 372 patients) were deficient in MSH2,
MSH6 or PMS2 (but MLH1-proficient), so were eligible for germline
testing; 121 of these patients (33%) received a germline test. The
remaining 85% (n = 2190) CRCs were MLH1-deficient or MSI-High,
indicating requirement for further somatic tests. Downstream
testing was, however, performed on only 54% (n = 1178) of these,
comprising 1041 tumours tested for BRAF mutational status and a
further 137 tested for MLH1 promoter hypermethylation in the
absence of BRAF testing.
Of those MLH1-deficient CRCs tested for BRAF, 34% (n = 356)
had a normal (i.e. wild-type) result, of which 63% (n = 224) were
reflex tested for MLH1 promoter hypermethylation, as per NICE
guidance. An additional 52 MLH1 promoter tests were performed
following abnormal or failed BRAF results. Thus a total of 413
tumours were tested for MLH1 promoter hypermethylation as part
of the Lynch screening pathway, of which 138 tumours (33%),
from 137 patients, were unmethylated, and therefore eligible for
germline testing. Full testing pathways and results are shown in
Fig. 1.
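The attrition percentages in this section are simple ratios of the reported counts. The sketch below recomputes them in Python as a consistency check; the variable names are illustrative, and the counts are copied from the text above.

```python
# Counts reported in the Results text (tumours).
total_crc = 37_662
mmr_tested = 16_463
dmmr = 2_576
other_mmr_deficient = 386        # MSH2/MSH6/PMS2-deficient, MLH1-proficient
mlh1_deficient_or_msi = 2_190
braf_tested = 1_041
methylation_without_braf = 137
braf_wild_type = 356
reflex_methylation = 224         # after a BRAF wild-type result
methylation_after_other_braf = 52  # after abnormal or failed BRAF results
unmethylated = 138


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


downstream_tested = braf_tested + methylation_without_braf
methylation_total = (reflex_methylation + methylation_after_other_braf
                     + methylation_without_braf)

print("MMR tested:", pct(mmr_tested, total_crc), "%")
print("dMMR among tested:", pct(dmmr, mmr_tested), "%")
print("Downstream somatic testing:", pct(downstream_tested, mlh1_deficient_or_msi), "%")
print("Methylation tests in total:", methylation_total)
print("Unmethylated:", pct(unmethylated, methylation_total), "%")
```

The 413 methylation tests are the sum of the 224 reflex tests, the 52 tests after abnormal or failed BRAF results, and the 137 tests done without BRAF testing.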
Variation in MMR testing
Table 1 shows numbers and percentages of people having MMR
testing according to patient and tumour characteristics. Females
had a slightly lower testing rate than males (42.8% vs. 44.5%). The
lowest testing rates were found among persons of White (43.4%)
or unknown (39.8%) ethnicity, whereas the highest testing rate
was observed among Black persons (56.5%). Testing rates were
highest among persons from the least deprived areas (45.8%) and
lowest among those from the most deprived areas (40.8%). Higher
testing rates were observed for tumours with stage II and III (52.1%
and 52.6%, respectively) than for stage I (40%) and IV (41%) and
tumours with unknown stage (27.9%). Similarly, higher testing
rates were found among grade 2 (52.8%) and 3 (53.7%) tumours
than grade 1, 4 and unknown grade tumours (32.8%, 27.5% and
15.8%, respectively). The most striking difference in MMR testing
rates was according to Cancer Alliance, where tumour MMR
testing rates varied from 17 to 71% (Fig. 2). When compared to the
Cancer Alliance with the highest testing rate (West Yorkshire and
Harrogate), and apart from the surrounding Cancer Alliances
(Humber, Coast and Vale, and South Yorkshire and Bassetlaw),
tumours diagnosed in all other Cancer Alliances were significantly
less likely to be tested; more markedly so when adjusting for
Fig. 1 Consort diagram showing Lynch syndrome testing pathway from cancer diagnosis to germline testing in 37,662 colorectal cancers
(from 37,090 patients) diagnosed in England in 2019. For all levels of the Consort diagram, borderline results have been categorised as
eligible to proceed to the next stage of the testing pathway, e.g. ‘deficient’ box in ‘tested tumours with MMR deficiency or MSI’ row includes
both abnormal and borderline results; ‘proficient’ box includes normal results only; ‘failed’ box includes everything else (failed/not tested/
unknown). Dark pink boxes represent the NICE DG27 ‘official’ pathway to germline testing, defined as MMR deficiency with, in the case of
MLH1 deficiency or MSI-High status, an unmethylated MLH1 promoter. An unbroken line of pink boxes from top to bottom indicates the
‘textbook’ NICE-recommended pathway. Other pink boxes show paths to germline testing performed on samples that were incompletely
tested, but were MLH1 deficient and unmethylated. Orange boxes indicate germline tests done under broader inclusion criteria, i.e. MLH1
deficiency with BRAF wild type but MLH1 promoter methylated, failed testing, or untested. Dark grey boxes indicate either a lack of testing, or a
test result that would signify a legitimate end to the testing pathway. Light brown boxes indicate failed tests.
demographic differences between Cancer Alliances. Full outcomes
from the uni- and multivariable logistic regression analyses are
shown in Supplementary Table 1.
Access to somatic follow up testing
Significant variation between Cancer Alliances was also observed
when considering follow up of dMMR tumours (either germline
testing for MSH2/MSH6/PMS2 deficient tumours, or further
somatic testing for MLH1 deficient tumours). Performance of
Cancer Alliances on follow up metrics did not necessarily
correspond to their performance in arranging initial MMR testing
(Supplementary Fig. 1).
Constitutional (germline) testing
Overall, 507 individuals with CRC were eligible for germline
testing based on NICE guidelines—i.e. their tumours were either
abnormal for MSH2/MSH6/PMS2 (n = 372) or abnormal for
MLH1/MSI-High with no evidence of MLH1 promoter methylation
(n = 135). Of these 507 people, just 36% (n = 180) received a
germline full screen test following their diagnosis. If eligibility for
germline testing is instead based upon the NHS National
Genomic Test Directory (indication R210) [16], this includes all
patients whose MLH1-deficient/MSI-High tumours are BRAF wild-type (i.e. skipping MLH1 promoter methylation testing). Adopting
these broader eligibility criteria—i.e. at least one of BRAF wild
type or failed, or MLH1 promoter unmethylated or failed—786
people with CRC could have been offered a germline test. Of
these 786 patients, 36 (5%) had either already received a
germline test before diagnosis, or received a targeted germline
test after diagnosis—i.e. they were members of families already
known to genetics services. Thus 750 patients were, as a result of
tumour molecular testing, newly identified as being eligible for
germline testing, of whom only 210 (28%) actually received a
germline test.
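The two sets of eligibility criteria can be expressed as simple rules. The sketch below is a simplified reading of the criteria as worded in this section, not an implementation of NICE DG27 or the National Genomic Test Directory; the class, field names and result labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TumourResult:
    mlh1_deficient_or_msi_high: bool
    other_mmr_deficient: bool                # MSH2/MSH6/PMS2 abnormal, MLH1 proficient
    braf: Optional[str] = None               # "wild_type", "mutant", "failed" or None (untested)
    mlh1_methylation: Optional[str] = None   # "unmethylated", "methylated", "failed" or None


def eligible_nice(t: TumourResult) -> bool:
    """Narrow criteria: non-MLH1 deficiency, or MLH1 deficiency/MSI-High with an unmethylated promoter."""
    if t.other_mmr_deficient:
        return True
    return t.mlh1_deficient_or_msi_high and t.mlh1_methylation == "unmethylated"


def eligible_broad(t: TumourResult) -> bool:
    """Broader criteria: additionally accept BRAF wild type or failed, or methylation failed."""
    if eligible_nice(t):
        return True
    if not t.mlh1_deficient_or_msi_high:
        return False
    return (t.braf in {"wild_type", "failed"}
            or t.mlh1_methylation in {"unmethylated", "failed"})


# Counts reported in the text for the broader criteria.
broad_eligible = 786
already_known = 36
newly_eligible = broad_eligible - already_known
tested_share = round(100 * 210 / newly_eligible)
print(newly_eligible, f"{tested_share}%")
```

A tumour that is MLH1-deficient and BRAF wild-type with no methylation result is eligible under the broad rule but not the narrow one, which is the main source of the difference between 507 and 786 eligible patients.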
Of all 37,090 patients diagnosed with CRC in 2019, 487 (1.3%)
received germline MMR testing (Table 2). Those tested could be
split into four groups, depending on (1) the timing of the germline
test with respect to the 2019 CRC diagnosis (pre- or post-diagnosis), and (2) the scope of the germline test (full screening of
all MMR genes, versus targeted testing for a specific pathogenic
variant in a member of a known LS family) (Table 2). This
distinction is important, as it reflects how patients were
ascertained, and thus what proportion were identified through
the NICE-recommended tumour testing pathway, as opposed to
being already known to clinical genetics services.
A minority of germline tests (56/487; 11%) were targeted
tests; these are indicated when a specific pathogenic variant has
previously been identified in a relative. Of these, germline
testing preceded the 2019 CRC diagnosis (i.e. predictive/presymptomatic testing) in 30 (54%); the remaining 26 (46%)
underwent confirmatory germline testing following their CRC
diagnosis.
Forty-one people (8% of all tested) had full screen testing prior
to their 2019 CRC diagnosis; this could either follow an earlier
cancer diagnosis, or be a clinical genetics referral for ‘indirect
testing’ where family history or personal polyp status was
sufficiently strong to warrant variant-agnostic germline testing.
Three hundred and ninety out of 487 germline tests (80%) were
full screen, post-diagnosis tests; this group represents newly identified LS families, as opposed to those already known to
genetics services. However, not all 390 tests were performed as per
NICE or National Genomic Test Directory guidelines (Table 3). Even
taking the more liberal eligibility criteria for germline testing, as
outlined above [16], only 210 out of 390 (54%) followed
recommended diagnostic pathways. The remainder comprised 45
people whose tumour records showed no evidence of dMMR
testing, 74 with MMR proficient tumours, 53 with MLH1 deficiency/
MSI-High status but no evidence of downstream somatic testing,
Table 1. MMR testing according to patient and tumour characteristics.

Characteristic                                    Not tested, n (%)   Tested, n (%)
Total                                             21,199 (56.3%)      16,463 (43.7%)

Age (χ2 p value <0.001)
  10–29                                           40 (38.1%)          65 (61.9%)
  30–39                                           228 (31.1%)         504 (68.9%)
  40–49                                           450 (30.9%)         1006 (69.1%)
  50–54                                           675 (40.4%)         995 (59.6%)
  55–59                                           1289 (47.2%)        1444 (52.8%)
  60–64                                           1965 (49.7%)        1986 (50.3%)
  65–69                                           2239 (50.4%)        2203 (49.6%)
  70–74                                           3472 (54.2%)        2935 (45.8%)
  75–79                                           3196 (57.4%)        2374 (42.6%)
  80–84                                           3501 (64.8%)        1904 (35.2%)
  85–89                                           2670 (76.0%)        842 (24.0%)
  90+                                             1474 (87.8%)        205 (12.2%)

Gender (χ2 p value 0.001)
  Female                                          9581 (57.2%)        7161 (42.8%)
  Male                                            11,618 (55.5%)      9302 (44.5%)

Ethnicity (χ2 p value <0.001)
  Asian                                           363 (43.8%)         466 (56.2%)
  Black                                           261 (43.5%)         339 (56.5%)
  Chinese                                         46 (48.9%)          48 (51.1%)
  Mixed                                           79 (52.3%)          72 (47.7%)
  Other                                           250 (52.1%)         230 (47.9%)
  Unknown                                         1689 (60.2%)        1117 (39.8%)
  White                                           18,511 (56.6%)      14,191 (43.4%)

Socioeconomic deprivation quintile (χ2 p value <0.001)
  1 (most deprived)                               3571 (59.2%)        2463 (40.8%)
  2                                               3753 (55.6%)        3000 (44.4%)
  3                                               4584 (57.3%)        3410 (42.7%)
  4                                               4720 (55.9%)        3720 (44.1%)
  5 (least deprived)                              4571 (54.2%)        3870 (45.8%)

Cancer alliance (χ2 p value <0.001)
  Cheshire and Merseyside                         1497 (82.6%)        315 (17.4%)
  East Midlands                                   1691 (49.5%)        1724 (50.5%)
  East of England - North                         1399 (63.8%)        793 (36.2%)
  East of England - South                         1255 (54.1%)        1065 (45.9%)
  Greater Manchester                              1424 (80.8%)        338 (19.2%)
  Humber, Coast and Vale                          332 (31.2%)         731 (68.8%)
  Kent and Medway                                 874 (70.2%)         371 (29.8%)
  Lancashire and South Cumbria                    883 (67.0%)         435 (33.0%)
  North Central and East London                   580 (41.2%)         829 (58.8%)
  North East and Cumbria                          1561 (66.7%)        780 (33.3%)
  North West and South West London                1015 (59.0%)        706 (41.0%)
  Peninsula                                       877 (60.4%)         576 (39.6%)
  Somerset, Wiltshire, Avon and Gloucestershire   1353 (63.1%)        792 (36.9%)
  South East London                               304 (34.9%)         566 (65.1%)
  South Yorkshire and Bassetlaw                   324 (31.7%)         698 (68.3%)
  Surrey and Sussex                               1423 (61.2%)        904 (38.8%)
  Thames Valley                                   688 (44.2%)         870 (55.8%)
  Wessex                                          877 (46.3%)         1016 (53.7%)
  West Midlands                                   2366 (56.8%)        1798 (43.2%)
  West Yorkshire and Harrogate                    476 (29.2%)         1156 (70.8%)

Tumour stage (χ2 p value <0.001)
  I                                               3614 (60.0%)        2413 (40.0%)
  II                                              3740 (47.9%)        4076 (52.1%)
  III                                             4555 (47.4%)        5045 (52.6%)
  IV                                              4314 (59.0%)        3000 (41.0%)
  Unknown                                         4976 (72.1%)        1929 (27.9%)

Tumour grade (χ2 p value <0.001)
  1                                               868 (67.2%)         424 (32.8%)
  2                                               10,842 (47.2%)      12,117 (52.8%)
  3                                               2200 (46.3%)        2550 (53.7%)
  4                                               29 (72.5%)          11 (27.5%)
  Unknown                                         7260 (84.2%)        1361 (15.8%)

Data shown as absolute numbers and proportions.
Fig. 2 Geographical variation in compliance with guidelines to test all CRCs for dMMR. Proportion of 2019-diagnosed colorectal cancers
tested for dMMR, stratified by NHS England Cancer Alliance (using 2019 geographical boundaries and based upon patient postcode of
residence at diagnosis).
Table 2. Number of germline MMR tests performed in 2019, split by test timing and scope.

Timing of germline test is given with respect to CRC diagnosis in 2019.

Scope of germline test    Pre-diagnosis germline test    Post-diagnosis germline test    Total
Full screen test          41                             390                             431
Targeted test             30                             26                              56
Total                     71                             416                             487

Full screen test: interrogates all MMR genes for an unknown variant. Targeted test: looks for a specific MMR gene variant already known to segregate in family members.
Table 3. Full screen, post-diagnosis germline tests, split by route to testing (somatic test status), and outcome of the germline test.
Key: (a) Insufficient somatic testing performed; (b) germline testing not based on NICE guidelines; (c) NICE pathway followed correctly.

                                                          Germline test result
Somatic test status                                       Total tested  Normal  VUS (Class 3)  Pathogenic (Class 4/5)  % pathogenic
No MMR test done (a)                                      45            34      2              9                       20.0
MMR tested: all genes proficient / MSS (b)                74            68      3              3                       4.1
MMR tested: MSH2/MSH6/PMS2 deficient (c)                  121           41      2              78                      64.5
MLH1 deficient/MSI: no further somatic testing done (a)   53            31      4              18                      34.0
MLH1 deficient/MSI & BRAF wt +/or MLH1 unmeth (c)         89            63      1              25                      28.1
MLH1 deficient/MSI & BRAF mut + MLH1 meth                 8             8       0              0                       0.0
  (or 1 abnormal, the other untested) (b)
TOTAL                                                     390           245     12             133                     34.1

[Fig. 3 graphic: stacked bar chart. y-axis: Percentage of tumours having MMR testing (0–100%); x-axis: Nature of germline test, grouped as Pre-diagnosis of CRC (Full screen, Targeted) and Post-diagnosis of CRC (Full screen, Targeted); legend: Tumour MMR testing, No tumour MMR testing. Bar labels (tested vs. not tested): pre-diagnosis full screen 21 (51%) vs. 20 (49%); pre-diagnosis targeted 15 (50%) vs. 15 (50%); post-diagnosis full screen 345 (88%) vs. 45 (12%); post-diagnosis targeted 21 (81%) vs. 5 (19%).]
Fig. 3 Number and percent of tumours having dMMR testing, grouped by timing of patient’s germline genetic test (pre- or post-2019
diagnosis of CRC) and scope of their germline test (full screen or targeted). Bars from L to R: full screen germline test performed pre-2019
cancer diagnosis; targeted germline test performed pre-2019 cancer diagnosis; full screen germline test performed post-2019 cancer
diagnosis; targeted germline test performed post-2019 cancer diagnosis. Red bars signify that tumour dMMR testing has taken place; orange
bars indicate no tumour dMMR test was performed.
and eight with MLH1 deficiency but mutant BRAF/MLH1 promoter
hypermethylation.
Thus, of the total 390 full screen, post-diagnosis germline tests
carried out, 210 (54%) were performed appropriately, 98
(25%) were performed with no or insufficient somatic testing, and 82 (21%)
followed somatic results that did not indicate germline testing.
Overlap between somatic and germline testing
Individuals having a germline test post-diagnosis were significantly more likely (χ2 = 58; p < 0.0001) to have had MMR testing
on their 2019-diagnosed tumour(s) (366/416; 88%) than those
whose germline test had preceded their 2019 CRC diagnosis
(36/71; 51%). The group most likely to have had MMR tumour
testing was the full screen, post-diagnosis germline test group, at
88% (Fig. 3).
Outcome of germline testing
A germline MMR pathogenic or likely pathogenic (P/LP) variant
was reported in 206/487 (42%) people tested, comprising variants
detected in 156/431 (36%) people undergoing full screen testing,
and 50/56 (89%) people undergoing targeted (familial) testing.
Abnormal germline results were distributed between the four
MMR genes and EPCAM as expected [4], with variants in MLH1 and
MSH2 comprising 65% of cases, and PMS2 just 15% (Table 4).
When full screen, post-diagnosis germline tests were stratified
according to prior somatic testing status, variant detection rates
ranged from 0% to 64% (Table 3). A P/LP variant was detected in
103/210 (49%) of people whose tumour testing pathway
followed NICE guidelines, in 27/98 (28%) of those where somatic
testing was absent or incomplete, and in 3/82 (4%) of those
where somatic testing results did not indicate germline testing
(Table 3).
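The three detection rates quoted here can be reproduced by grouping the rows of Table 3 by their key letter. The sketch below does this in Python; the row labels are abbreviated, the structure is illustrative, and the counts are copied from Table 3.

```python
# (somatic test status, key, total tested, pathogenic Class 4/5), from Table 3.
rows = [
    ("No MMR test done", "a", 45, 9),
    ("All genes proficient / MSS", "b", 74, 3),
    ("MSH2/MSH6/PMS2 deficient", "c", 121, 78),
    ("MLH1 deficient/MSI, no further somatic testing", "a", 53, 18),
    ("MLH1 deficient/MSI, BRAF wt and/or MLH1 unmeth", "c", 89, 25),
    ("MLH1 deficient/MSI, BRAF mut + MLH1 meth", "b", 8, 0),
]

key_labels = {
    "a": "insufficient somatic testing",
    "b": "germline testing not based on NICE guidelines",
    "c": "NICE pathway followed",
}

# Sum tested and pathogenic counts within each key group.
groups = {key: [0, 0] for key in key_labels}
for _, key, tested, pathogenic in rows:
    groups[key][0] += tested
    groups[key][1] += pathogenic

for key, (tested, pathogenic) in groups.items():
    rate = round(100 * pathogenic / tested)
    print(f"{key_labels[key]}: {pathogenic}/{tested} ({rate}%)")
```

Summing across all groups gives 133 pathogenic results in 390 tests, the 34.1% overall rate shown in the final row of Table 3.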
Of patients undergoing full screen, post-diagnosis germline
testing, those with MMR-proficient (pMMR) tumours were
significantly younger than those with dMMR tumours (mean
Table 4. Mutated gene spectrum for all 2019-diagnosed colorectal cancer patients who had an abnormal germline Lynch test (n = 206; includes all germline test scopes and timings).

Gene     Number of patients with pathogenic/likely      Proportional distribution by gene of all patients
         pathogenic variant, split by gene (N = 206)    with pathogenic/likely pathogenic variant (%)
MLH1     65                                             31.6
MSH2     68                                             33.0
MSH6     41                                             19.9
PMS2     31                                             15.0
EPCAM    1                                              0.5
48.1 years vs. 56.8 years; Welch two-sample t-test statistic = 4.29,
95% CI = 4.67–12.68, df = 111.78, p = 0.001).
Timeline of complete molecular diagnostic pathway for LS
The median time between CRC diagnosis and functional MMR
testing (IHC and/or MSI) was 24 days (mean 58 days), with a further
34 days elapsing before follow up somatic testing, i.e. the total
median time to complete somatic testing was 58 days (mean
129 days). The main diagnostic pathway delay occurred between
somatic and germline testing, the latter being performed at median
315 days (mean 368 days) following initial CRC diagnosis. For all
tests, there was a long right-hand tail in the distribution, indicating
delays exceeding 1000 days for some individuals (Fig. 4).
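The gap between median and mean intervals reflects the long right-hand tail. The Python sketch below illustrates the effect with a small invented set of intervals; the values are hypothetical and are not drawn from the study data.

```python
from statistics import mean, median

# Hypothetical days from diagnosis to test: most are short, one is a very long delay.
days_to_test = [5, 12, 20, 24, 28, 35, 60, 1100]

median_days = median(days_to_test)
mean_days = mean(days_to_test)

print(f"median = {median_days} days, mean = {mean_days} days")
```

A single delay of over 1000 days moves the mean far more than the median, which is why the medians are the more representative summary of typical waiting times here.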
DISCUSSION
This is the first comprehensive analysis of a policy to identify
people with Lynch syndrome (LS) across a national healthcare
system serving 55 million people. Despite being a snapshot in
time, prior to coordinated expansion of testing [17], it provides a
baseline for assessment of future developments, and is a likely
reflection of underdiagnosis of this treatable disorder in other
developed countries [18, 19].
In-depth analysis of comparable populations suggests an LS birth
prevalence of 1 in 280–1 in 500 [5–7], implying a population
prevalence of one to two hundred thousand in England. Pooled data
across clinical and laboratory genetics services indicates under 10%
are known. A health economic analysis [8] indicated the clinical utility
of testing all CRCs for dMMR; on this evidence, NICE introduced the
current pathway in 2017 [9]. The rationale for identifying LS carriers is
[Fig. 4 graphic: box plots. y-axis: Type of test (MMR, Follow-up, Germline); x-axis: Length of time between diagnosis and test (days), 0–1200; medians: MMR 24, Follow-up 58, Germline 315.]
Fig. 4 Distribution and average time from initial diagnosis (at day 0) to functional testing (MMR IHC/MSI), subsequent follow-up (somatic
BRAF/MLH1 promoter methylation testing following an MMR test) and germline testing. Within each box, vertical black lines denote
median values (enumerated below the box), and red triangles denote mean values; boxes extend from the 25th to the 75th percentile of each
group’s distribution of values and denote the interquartile range (IQR). Horizontal extending black lines denote adjacent values (i.e. the most
extreme values within 1.5 x IQR of the 25th and 75th percentile of each group); black dots denote the observations outside the range of
adjacent values (i.e. the outliers). Only full screen, post-diagnosis germline tests are included here (pre-diagnosis tests went back ~18 years).
further enhanced by the demonstration of a 50% reduction in their
CRC incidence following daily aspirin [20] (now also a NICE guideline)
[21], and the highly significant reduction in their non-CRC LS-associated cancer risk when prescribed dietary supplementation
with resistant starch [22]. Identification of dMMR cancers as a target
for immunotherapy [23–26] provides further justification for functional testing of all tumours, regardless of patient LS status.
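The prevalence range quoted earlier in this paragraph can be checked with simple arithmetic. The sketch below applies the birth prevalence estimates directly to the 55 million population figure given at the start of the Discussion; treating birth prevalence as population prevalence is a simplifying assumption, and the variable names are illustrative.

```python
population_served = 55_000_000  # "55 million people", as stated in the Discussion

# Birth prevalence estimates of 1 in 500 and 1 in 280.
carriers_low = population_served / 500
carriers_high = population_served / 280

print(f"Estimated LS carriers: {carriers_low:,.0f} to {carriers_high:,.0f}")
```

Both bounds fall between one and two hundred thousand, consistent with the range given in the text.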
The health economic benefit can be maximised by offering
cascade testing to relatives to identify other at-risk carriers.
Management guidelines for LS are gene-specific: colonoscopic
surveillance should be offered at least every 2 years, starting from
age 25 for carriers of pathogenic or likely pathogenic variants (PVs)
in MLH1 or MSH2, and from age 35 for those with PVs in MSH6 or
PMS2 [27]. From 2023, colonoscopic surveillance of LS carriers will
be incorporated into the NHS national bowel cancer screening
programme.
Any guidelines, however good, are only beneficial if properly
implemented. The Bowel Cancer UK investigation in 2018
indicated that only 17% of hospitals in England were following
NICE recommendations for tumour MMR testing [10]; however, this
questionnaire-based investigation was limited in its design,
potentially subject to response bias, and set up to ask the
question at hospital level rather than patient level. The current
study is therefore the first national evaluation of MMR testing in
England, covering the entire LS diagnostic pathway from initial
tumour testing (IHC/MSI) through to germline testing, and is only
possible due to the systematic collection, curation, and linkage of
comprehensive NHS laboratory data within the National Disease
Registration Service (NDRS).
Our data show that only 44% of 2019-diagnosed CRCs were
tested for MMR status (IHC and/or MSI), and highlight large
disparities in provision across England. There was more than a
four-fold difference in MMR testing rates between the best- and
worst-performing Cancer Alliance. Notably, the three best
performing Cancer Alliances (West Yorkshire and Harrogate, South
Yorkshire and Bassetlaw, and Humber, Coast and Vale) belong to
the Yorkshire and Humber (YH) region, where, between April 2017
and March 2019, the Yorkshire Cancer Research Bowel Cancer
Improvement Programme (YCR BCIP) funded pilot MMR screening
for all CRC patients in the region who were not already covered by
the previous inclusion criteria (<50 years of age) [28]. Although the
YH pilot overlapped this NDRS evaluation only for the first
3 months of 2019, the region performed consistently well
throughout the year, indicating the ongoing positive legacy of
the YCR BCIP programme, and its implementation of suitable
infrastructure, education, and co-ordination.
Of the 16% of tested tumours found to be dMMR, only 51%
were followed up as per diagnostic guidance: 121/372 (33%)
patients with MSH2/MSH6/PMS2 deficient tumours had germline
testing, and 1178/2190 (54%) tumours with MLH1 deficiency or
MSI-High status had further somatic testing. The latter facilitates
distinction between sporadic (tumour-confined) dMMR versus
potential constitutional dMMR underpinned by a germline
pathogenic variant. As with initial MMR testing, the follow up of
dMMR tumours was observed to vary significantly across Cancer
Alliances.
There are some caveats here around data completeness, with
potential gaps in BRAF data particularly affecting London and the
Thames Valley region. Additionally, due to database challenges at
genomics laboratories, we are missing a small number of germline
MMR testing records from Great Ormond Street from December
2019 onwards, and from Bristol since the inception of their MMR
testing service in summer 2019. Nevertheless, these gaps
constitute a very small proportion of the overall national
LS-related testing activity, and do not alter our overall conclusions.
In 2019, 2 years after publication of the NICE guidance [9], MMR
testing and appropriate follow-up were generally poorly
implemented, with major geographical inequities, substantial
attrition at all levels of the testing pipeline, and very long time
lags between initial functional MMR tumour testing and germline
follow-up. This long delay in germline testing limits the analysis
that can be performed on more recently diagnosed tumours, as
the data need time to mature with respect to the time period
between diagnosis of cancer and genetic diagnosis of Lynch
syndrome. It also evidences the need to develop and implement
more efficient LS testing pathways, e.g. those co-ordinated via
mainstream oncology services.
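To make the scale of this attrition concrete, the headline percentages reported earlier in this section (44% of CRCs tested for MMR status, 16% of tested tumours found to be dMMR, 51% of dMMR tumours followed up) can be applied in sequence to a hypothetical cohort. The Python sketch below is illustrative only; the cohort size of 1000 and all names are assumptions made for the example, and the output is not an additional result of the study.

```python
# Illustrative attrition cascade for a hypothetical cohort of CRC diagnoses.
# The three fractions are the headline figures reported in the text;
# the cohort size and all names are assumptions for illustration.

COHORT_SIZE = 1000

TESTED_FRACTION = 0.44     # CRCs tested for MMR status (IHC and/or MSI)
DMMR_FRACTION = 0.16       # tested tumours found to be dMMR
FOLLOW_UP_FRACTION = 0.51  # dMMR tumours followed up as per guidance

tested = COHORT_SIZE * TESTED_FRACTION
dmmr = tested * DMMR_FRACTION
followed_up = dmmr * FOLLOW_UP_FRACTION

for label, value in [("tested for MMR status", tested),
                     ("found to be dMMR", dmmr),
                     ("followed up per guidance", followed_up)]:
    print(f"{label}: {value:.1f} per {COHORT_SIZE} CRCs")
```

Under these figures, roughly 36 of every 1000 people diagnosed with CRC would have a dMMR tumour that was both detected and followed up as per guidance.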
Where germline testing was performed, we observed a
relatively high detection rate of pathogenic/likely pathogenic (P/
LP) variants. Amongst full screen, post-diagnosis tests, the
detection rate was 34.1%; this is somewhat higher than the 28%
reported for all full screen MMR testing carried out in English labs
since 2008 [13]. The difference probably reflects the biased nature
of the 2019 CRC-diagnosed cohort, most of whose tumours had
been pre-screened for dMMR. In contrast, most historical full
screen germline testing would have been performed based on
family history and/or young age of cancer development. Accordingly, by restricting the 2019 analysis to patients whose tumour screening adhered properly to the NICE guidelines, the germline
detection rate increased to 49%. Strikingly, a germline P/LP MMR
variant was detected in 65% of patients whose tumours were
abnormal for MSH2/MSH6/PMS2, indicating the clinical utility of
this as a biomarker of LS.
Overall, 133 (65%) of the total 206 people with MMR germline
P/LP variants were identified following a full screen, post-diagnosis
germline test, i.e. represented new LS families not previously
known to genetics services. This demonstrates the importance of
the NICE-recommended tumour testing pathway in identifying
new cases. Were the pathways to be implemented fully, both lives
and health service resources could be saved [8, 29]. Based on
extrapolations from all tumour and germline data, we estimate
that, were NICE guidelines to be fully executed in all cases of CRC,
up to 700 additional LS index cases (above this 2019 baseline)
could be diagnosed per year; others could then be identified
through familial testing.
Since the current reporting period of 2019 diagnoses, there has
been more recognition of the importance of detecting LS, and a
national transformation project is now underway [17]. This report
provides a baseline for the anticipated improvement in LS
detection. To facilitate comparison, and provide figures for
subsequent reporting years beyond this baseline, we have made
regional and national data available online at
https://cancerstats.ndrs.nhs.uk/molecular/lynchsyndrome (requires an
NHS network connection and login).
The national-scale collection, collation, curation and standardisation of these data by NDRS is the world’s first example of linking
cancer records with both germline and somatic molecular testing
data in a real-world setting at population-level. Linkage of
genomic data to the rich clinical phenotype, treatment and
outcome data held within NDRS will enable the NHS to build up a
comprehensive picture of genotype-phenotype correlations, facilitate genetic counselling of families with cancer, and monitor
equity of access to molecular testing and targeted therapies.
Through our collaboration with the UK Cancer Variant Interpretation Group (CanVIG-UK) [30], the datasets are also supporting
national efforts to interpret germline variants of uncertain clinical
significance (VUS).
CONCLUSION
The data presented here for 2019 diagnoses of colorectal cancer
are the first of their kind to give a national picture of Lynch
syndrome diagnostics across the entire cancer pathway, encompassing both germline and somatic testing. Only 44% of CRCs
were screened for MMR deficiency; these figures varied over fourfold with respect to geography. These 2019 figures provide a
baseline level of tumour testing and indicate the level of
underdiagnosis of LS at a point 2 years from when NICE
recommended MMR testing in all colorectal cancers, but prior to
the widespread disruption to NHS services caused by the SARS-CoV-2 pandemic. Now that the national data collection, processing, and analytical methodology are embedded within NDRS, it is
possible to monitor improvements over time, and to benchmark
the relative performance of individual NHS Trusts and Cancer
Alliances.
DATA AVAILABILITY
Data are held within the National Disease Registration Service (NDRS), which is part of
NHS England. Formal data requests may be made through the Data Access Request
Service (DARS): https://digital.nhs.uk/services/data-access-request-service-dars.
CODE AVAILABILITY
Analytical code is available from the National Disease Registration Service (NDRS)
upon reasonable request.
REFERENCES
1. Rahman N. Realizing the promise of cancer predisposition genes. Nature.
2014;505:302–8.
2. Hampel H, Frankel WL, Martin E, Arnold M, Khanduja K, Kuebler P, et al. Screening
for the Lynch syndrome (hereditary nonpolyposis colorectal cancer). N Engl J
Med. 2005;352:1851–60.
3. Barnetson RA, Tenesa A, Farrington SM, Nicholl ID, Cetnarskyj R, Porteous ME,
et al. Identification and survival of carriers of mutations in DNA mismatch-repair
genes in colon cancer. N Engl J Med. 2006;354:2751–63.
4. Idos G, Valle L. Lynch syndrome. In: Adam MP, Everman DB, Mirzaa GM, et al.,
editors. GeneReviews® [Internet]. Seattle (WA): University of Washington;
1993–2022. https://www.ncbi.nlm.nih.gov/books/NBK1211/.
5. Win AK, Jenkins MA, Dowty JG, Antoniou AC, Lee A, Giles GG, et al. Prevalence
and penetrance of major genes and polygenes for colorectal cancer. Cancer
Epidemiol Biomark Prev. 2017;26:404–12.
6. Patel AP, Wang M, Fahed AC, Mason-Suares H, Brockman D, Pelletier R, et al.
Association of rare pathogenic DNA variants for familial hypercholesterolemia,
hereditary breast and ovarian cancer syndrome, and lynch syndrome with disease risk in adults according to family history. JAMA Netw Open. 2020;3:e203959.
7. Grzymski JJ, Elhanan G, Morales Rosado JA, Smith E, Schlauch KA, Read R, et al.
Population genetic screening efficiently identifies carriers of autosomal dominant
diseases. Nat Med. 2020;26:1235–9.
8. Snowsill T, Huxley N, Hoyle M, Jones-Hughes T, Coelho H, Cooper C, et al. A
systematic review and economic evaluation of diagnostic strategies for Lynch
syndrome. Health Technol Assess. 2014;18:1–406.
9. NICE Diagnostics guidance [DG27]. Molecular testing strategies for Lynch syndrome in people with colorectal cancer. 2017. https://www.nice.org.uk/guidance/dg27.
10. Bowel Cancer UK. People at high risk of cancer denied a £200 life saving genetic
test. 2018. https://www.bowelcanceruk.org.uk/news-and-blogs/news/people-at-high-risk-of-cancer-denied-a-£200-life-saving-genetic-test/.
11. Bowel Cancer UK. Testing for Lynch syndrome – what you need to know. 2018.
https://www.bowelcanceruk.org.uk/news-and-blogs/research-blog/testing-for-lynch-syndrome-%E2%80%93-what-you-need-to-know/.
12. Henson KE, Elliss-Brookes L, Coupland VH, Payne E, Vernon S, Rous B, et al. Data
resource profile: national cancer registration dataset in England. Int J Epidemiol.
2020;49:16–h.
13. Loong L, Huntley C, McRonald F, Santaniello F, Pethick J, Torr B, et al. Germline
mismatch repair (MMR) gene analyses from English NHS regional molecular
genomics laboratories 1996-2020: development of a national resource of patient-level genomics laboratory records. J Med Genet. 2023;60:669–78.
14. R Core Team. R: a language and environment for statistical computing. Vienna,
Austria: R Foundation for Statistical Computing; 2022. https://www.R-project.org/.
15. NHS England. National Disease Registries Directions 2021. 2021. https://digital.nhs.uk/about-nhs-digital/corporate-information-and-documents/directions-and-data-provision-notices/secretary-of-state-directions/national-disease-register-service-directions.
16. NHS England. Rare and inherited disease eligibility criteria. In: National Genomic
Test Directory. 2018. https://www.england.nhs.uk/publication/national-genomic-test-directories/. Version 5.2 accessed 5th July 2023.
17. Monahan KJ, Ryan N, Monje-Garcia L, Armstrong R, Church DN, Cook J, et al. The
English National Lynch Syndrome transformation project: an NHS Genomic
Medicine Service Alliance (GMSA) programme. BMJ Oncol. 2023;2:e000124.
https://doi.org/10.1136/bmjonc-2023-000124.
18. Tranø G, Wasmuth HH, Sjursen W, Hofsli E, Vatten LJ. Awareness of heredity in
colorectal cancer patients is insufficient among clinicians: a Norwegian
population-based study. Colorectal Dis. 2009;11:456–61.
19. Vasen HF, Möslein G, Alonso A, Aretz S, Bernstein I, Bertario L, et al. Recommendations to improve identification of hereditary and familial colorectal cancer
in Europe. Fam Cancer. 2010;9:109–15.
20. Burn J, Sheth H, Elliott F, Reed L, Macrae F, Mecklin JP, et al. Cancer prevention
with aspirin in hereditary colorectal cancer (Lynch syndrome), 10-year follow-up
and registry-based 20-year data in the CAPP2 study: a double-blind, randomised,
placebo-controlled trial. Lancet. 2020;395:1855–63.
21. NICE guideline [NG151]. Reduction in risk of colorectal cancer in people with
Lynch syndrome. 2020. https://www.nice.org.uk/guidance/ng151.
22. Mathers JC, Elliott F, Macrae F, Mecklin JP, Möslein G, McRonald FE, et al. Cancer
prevention with resistant starch in Lynch syndrome patients in the CAPP2-randomized placebo controlled trial: planned 10-year follow-up. Cancer Prev Res.
2022;15:623–34.
23. Le DT, Uram JN, Wang H, Bartlett BR, Kemberling H, Eyring AD, et al. PD-1
blockade in tumors with mismatch-repair deficiency. N Engl J Med.
2015;372:2509–20.
24. Le DT, Durham JN, Smith KN, Wang H, Bartlett BR, Aulakh LK, et al. Mismatch
repair deficiency predicts response of solid tumors to PD-1 blockade. Science.
2017;357:409–13.
25. André T, Shiu KK, Kim TW, Jensen BV, Jensen LH, Punt C, et al. Pembrolizumab in
microsatellite-instability-high advanced colorectal cancer. N Engl J Med.
2020;383:2207–18.
26. NICE Technology Appraisal Guidance [TA716]. Nivolumab with ipilimumab for
previously treated metastatic colorectal cancer with high microsatellite instability
or mismatch repair deficiency. 2021. https://www.nice.org.uk/guidance/ta716.
27. Monahan KJ, Bradshaw N, Dolwani S, Desouza B, Dunlop MG, East JE, et al.
Guidelines for the management of hereditary colorectal cancer from the British
Society of Gastroenterology (BSG)/Association of Coloproctology of Great Britain
and Ireland (ACPGBI)/United Kingdom Cancer Genetics Group (UKCGG). Gut.
2020;69:411–44.
28. West NP, Gallop N, Kaye D, Glover A, Young C, Hutchins GGA, et al. Lynch
syndrome screening in colorectal cancer: results of a prospective 2-year regional
programme validating the NICE diagnostics guidance pathway throughout a 5.2-million population. Histopathology. 2021;79:690–9.
29. Snowsill T, Coelho H, Huxley N, Jones-Hughes T, Briscoe S, Frayling IM, et al.
Molecular testing for Lynch syndrome in people with colorectal cancer: systematic reviews and economic evaluation. Health Technol Assess. 2017;21:1–238.
30. Garrett A, Callaway A, Durkie M, Cubuk C, Alikian M, Burghel GJ, et al. Cancer
Variant Interpretation Group UK (CanVIG-UK): an exemplar national subspecialty
multidisciplinary network. J Med Genet. 2020;57:829–34.
ACKNOWLEDGEMENTS
We thank the scientific and bioinformatic staff at the NHS Regional Genomics
Laboratories for working with us to realise this national data collection. Data that has
been provided by patients, the NHS and other health care organisations as part of
routine patient care and support is collated, maintained, and quality assured by the
National Disease Registration Service, which is part of NHS England. We thank Donna
Job for her ongoing support of the Lynch syndrome work programme at Newcastle
University. We extend our heartfelt thanks to our colleagues working in NDRS,
particularly the skilled cancer registration officers (CROs) who abstracted and
registered the tumour and molecular testing data. We would like to dedicate this
paper to the memory of Kathryn Dickinson, a much-loved CRO who tragically lost her
life on 17th October 2021.
AUTHOR CONTRIBUTIONS
Conceptualisation: FEM, JP, ML, GMB, CT, ACS, KJM, IMF, SH, and JB. Data curation:
FEM, JP, FS, BS, OT, and SG. Formal analysis: FEM, JP, FS, BS, AT, OT, SG, ML, IMF, and
SH. Funding acquisition: CT and JB. Investigation: FEM, JP, and SH. Methodology: FEM,
JP, FS, BS, AT, OT, SG, and ML. Project administration: FEM and GMB. Resources: FS,
BS, OT, and SG. Software: FS, BS, OT, and SG. Supervision: GMB, SH, and JB. Validation:
FEM, JP, FS, BS, AT, SG, and ML. Visualisation: FEM, JP, AT, and ML. Writing—original
draft: FEM, JP, ML, SH, and JB. Writing—review and editing: FEM, JP, FS, ML, GMB, CT,
ACS, KJM, IMF, SH, and JB.
FUNDING
We thank Bowel Cancer UK (18PG0019) and Cancer Research UK (CanGene-CanVar
Programme Grant, C61296/A27223) for their generous support in funding the
bioinformatics aspects of this work.
ADDITIONAL INFORMATION
Supplementary information The online version contains supplementary material
available at https://doi.org/10.1038/s41431-024-01550-w.
Correspondence and requests for materials should be addressed to Fiona E.
McRonald or John Burn.
COMPETING INTERESTS
JB was awarded funding from Bowel Cancer UK (grant 18PG0019), which provided
primary funding for the bioinformatics aspects of this work. CT is Chief Investigator
for the CRUK Programme CanGene-CanVar (grant C61296/A27223), which provided
additional funding for the bioinformatics aspects of this work. ACS is partly funded by
the NHS South East Genomic Medicine Service as Co-Chair of the National Lynch
syndrome diagnosis transformation project. IMF has received travel support from St
Vincent’s University Hospital, Dublin. He is an Assessor for the UK National External
Quality Assessment Service for immunocytochemistry and in situ hybridisation (UK
NEQAS ICC&ISH), receiving honoraria and travel expenses. He also undertakes unpaid
roles as Honorary Treasurer & Trustee, International Society for Gastrointestinal
Hereditary Tumours (InSiGHT); President & Trustee, Association of Clinical Pathologists, UK; and Member of Council, UK Cancer Genetics Group. KJM has previously
received funding from 40tude cancer charity, and sits on the Medical Advisory Board
for Bowel Cancer UK and Lynch Syndrome UK. All other authors have declared no
competing interests.
ETHICAL APPROVAL
Data used in this study were routinely collected by NDRS as part of NHS patient care
and support. No human or animal subjects were used. Further ethical approval for
this study was not required per the definition of research according to the UK Policy
Framework for Health and Social Care Research.
Reprints and permission information is available at http://www.nature.com/reprints
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims
in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons
Attribution 4.0 International License, which permits use, sharing,
adaptation, distribution and reproduction in any medium or format, as long as you give
appropriate credit to the original author(s) and the source, provide a link to the Creative
Commons licence, and indicate if changes were made. The images or other third party
material in this article are included in the article’s Creative Commons licence, unless
indicated otherwise in a credit line to the material. If material is not included in the
article’s Creative Commons licence and your intended use is not permitted by statutory
regulation or exceeds the permitted use, you will need to obtain permission directly
from the copyright holder. To view a copy of this licence, visit
http://creativecommons.org/licenses/by/4.0/.
© The Author(s) 2024
European Journal of Human Genetics (2024) 32:529 – 538
Genes, Chromosomes and Cancer
RESEARCH ARTICLE
OPEN ACCESS
Hereditary Colorectal Cancer and Polyposis Syndromes Caused by Variants in Uncommon Genes
Ahmed Bouras1,2 | Aurélie Fabre3 | Hélène Zattara3 | Sandrine Handallou4 | Françoise Desseigne5 | Caroline Kientz6 | Fabienne Prieur6 | Magalie Peysselon7 | Clémentine Legrand7 | Laura Calavas8 | Jean-Christophe Saurin8 | Qing Wang1,2
1Laboratory of Constitutional Genetics for Frequent Cancer HCL-CLB, Centre Léon Bérard, Lyon, France | 2Inserm U1052, Lyon Cancer Research Center, Lyon, France | 3Department of Genetics, Hôpital d'Enfants de La Timone, AP-HM, Marseille, France | 4Cancer Genetics Unit, Department of Public Health, Centre Léon Bérard, Lyon, France | 5Department of Medicine, Centre Léon Bérard, Lyon, France | 6Department of Clinical, Chromosomal and Molecular Genetics, Hôpital Nord, CHU Saint Etienne, Saint Etienne, France | 7Genetic Service, Department of Genetics and Procreation, CHU Grenoble Alpes, Grenoble, France | 8Department of Gastroenterology and Endoscopy, Edouard Herriot Hospital, Lyon, France
Correspondence: Ahmed Bouras (ahmed.bouras@lyon.unicancer.fr)
Received: 19 May 2024 | Revised: 18 July 2024 | Accepted: 25 July 2024
Keywords: cancer predisposition genes | colorectal cancer | polyposis
ABSTRACT
A substantial proportion of hereditary colorectal cancer (CRC) and colonic polyposis cases cannot be explained by alterations in confirmed predisposition genes, such as the mismatch repair (MMR) genes, APC and MUTYH. Recently, a number of potential predisposition genes have been suggested, each involving only a small number of cases reported so far. Here, we describe the detection of rare variants in the NTHL1, AXIN2, RNF43, BUB1, and TP53 genes in nine unrelated patients who were suspected of having inherited CRC and/or colonic polyposis. Seven of them were classified as pathogenic or likely pathogenic variants (PV/LPV). Clinical manifestations of carriers were largely consistent with reported cases, although with some distinct characteristics. PV/LPV in these uncommon genes can be responsible for up to 2.7% of inherited CRC or colonic polyposis syndromes. Our findings provide supporting evidence for the role of these genes in cancer predisposition, and contribute to the determination of the related cancer spectrum and cancer risk for carriers, allowing for the establishment of appropriate screening strategies and genetic counseling in affected families.
1 | Introduction
Hereditary colorectal cancer (CRC) and colonic polyposis have diverse etiologies and are associated with variable clinical phenotypes. Heterozygous pathogenic variants (PV) in mismatch repair (MMR) genes are the most frequent cause; they are responsible for Lynch syndrome (LS), which has a deficient MMR (dMMR) tumor phenotype. Monoallelic PVs of APC and biallelic MUTYH inactivation are common causes of hereditary adenomatous polyposis. Less frequently, several other genes are involved, including POLE/POLD1 for polymerase-proofreading-associated polyposis (PPAP) syndrome, SMAD4 and BMPR1A for juvenile polyposis syndrome, STK11 for Peutz–Jeghers syndrome and PTEN for Cowden syndrome [1]. These confirmed cancer predisposition genes are routinely screened in patients with suspected gastrointestinal cancer syndromes, following French recommendations [2]. However, the genetic causes of a substantial proportion of cases remain to be identified. Recent studies have identified several potential susceptibility genes, with growing evidence that strongly supports their role in hereditary CRC or colonic polyposis [1, 3]. Their related cancer risk needs to be evaluated as affected cases accumulate. Searching for germline inactivation of such genes is thus of considerable interest, both for understanding their roles and for the genetic counseling of affected families.
In this report, we describe the identification of germline variants in the AXIN2, BUB1, NTHL1, RNF43, and TP53 genes in patients suspected of having hereditary CRC or colonic polyposis. The AXIN2 gene (Axis inhibition protein 2) encodes a component of the Wnt pathway, in which it regulates the stability of β-catenin. The BUB1 gene encodes a mitotic checkpoint protein kinase that plays a key role in the mitotic spindle checkpoint. The NTHL1 gene encodes Endonuclease III-like protein 1, and the RNF43 gene encodes Ring-finger protein 43, which is involved in the regulation of the Wnt pathway. Together with the TP53 gene, which plays a key role in cancer development, all these genes have been reported to be involved in CRC/polyposis susceptibility [1, 4].
This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
© 2024 The Author(s). Genes, Chromosomes and Cancer published by Wiley Periodicals LLC.
Genes, Chromosomes and Cancer, 2024; 63:e23263
https://doi.org/10.1002/gcc.23263
2 | Materials and Methods
2.1 | Patients
Patients were identified through genetic consultation sessions, and genetic testing was performed after written informed consent was obtained. Patients with suspected predisposition to CRC or polyposis underwent germline variant screening using a 22-gene panel for digestive cancer predisposition (Table S1). Result assessment followed the geneticists' prescription based on clinical indication, that is, either an oriented LS panel for suspected LS, or an expanded panel of 14 confirmed CRC predisposing genes for patients with proficient or unknown MMR tumor status, as well as for patients suspected of having hereditary colonic polyposis. Eight potential CRC predisposition genes (called “research genes”) were further explored, including AXIN2, BUB1, GREM1, MSH3, NTHL1, RNF43, RPS20, and TP53 (Table S1). In total, 325 patients underwent an extensive genetic testing panel including all eight “research genes.”
2.2 | Germline Variant Screening and Interpretation
Total genomic DNA was extracted from blood samples using the automated STARlet platform (Hamilton Company, Reno, NV, USA). Next-generation sequencing (NGS) was performed using a customized Agilent XTHS panel with capture-based target enrichment (Agilent, Santa Clara, CA, USA). Sequence alignment and variant calling were carried out using an in-house bioinformatics pipeline. Sanger sequencing was subsequently used to confirm variants of interest. The Human Genome Variation Society (HGVS) guidelines were used for variant nomenclature, with c.1 corresponding to the first nucleotide of the coding sequence (www.hgvs.org/varnomen). The reference sequences are indicated in Table S2. Variant pathogenicity was determined using ACMG criteria [5]. General population data were taken from the gnomAD database v.2.1.1 (European non-Finnish), and in silico prediction algorithms included Align GVGD [6], SIFT [7], Polyphen 2 [8], CADD V1.6 [9], SPiP [10], and SpliceAI [11].
2.3 | Somatic Variant Screening
Somatic analysis of the RNF43 (NM_017763.5) gene was carried out using the method described previously [12].
3 | Results
Patients suspected of having a genetic predisposition to CRC or polyposis were screened for germline PVs using a digestive cancer predisposition gene panel including 14 confirmed predisposition genes [2], as well as eight “research” genes that were previously reported to be involved in CRC susceptibility, for further analysis in negative families. Among 880 patients tested between 2019 and 2023, 550 patients were tested with an LS-oriented panel, with a diagnostic yield of pathogenic or likely pathogenic variants (PV/LPV) in MMR genes of 26.8% (n = 149) (Figure 1). For the remaining 325 patients with proficient or unknown tumor mismatch repair (p/uMMR) status, PV/LPV were detected in 28 patients (8.6%) in one of the confirmed CRC predisposing genes, including MMR genes, APC, MUTYH, BMPR1A, STK11, and POLD1 (Figure 1 and Table S2). In an attempt to find the genetic cause in the large proportion of negative patients with p/uMMR status, we further assessed the “research” genes. In nine unrelated patients, rare variants were found in the NTHL1 gene (two cases), the AXIN2 gene (two cases), the RNF43 gene (two cases), the BUB1 gene (two cases), and the TP53 gene (one case). These cases are described below and summarized in Table 1 and Figure 2.
3.1 | Cases With NTHL1 Variant
The homozygous variant NTHL1 c.268C>T, p.(Gln90*) was detected in the probands of two families without familial consanguinity. This variant was previously reported as a recurrent PV [13]. The proband of Family 1 (Figure 2a) developed breast and endometrial cancers at 58 and 63 years of age, in addition to more than 10 colonic adenomatous polyps. Two brothers were diagnosed, respectively, with CRC at the age of 37 years and a pheochromocytoma at the age of 46 years. Her sister had breast cancer at the age of 42 years. Her mother had colonic polyps of unknown number and histology. The variant was not carried by her cancer-free sister.
For Family 2 (Figure 2b), the proband developed a meningioma at the age of 56 years and synchronous breast and endometrial cancers at the age of 68 years. Her father died of CRC and her mother had breast cancer. Three malignancies were diagnosed in her siblings: two sisters had, respectively, brain cancer at 18 and breast cancer at 62, and one brother had lung cancer at 59. The proband's mother and her cancer-free sister were both heterozygous carriers of the variant.
3.2 | Cases With AXIN2 Variants
The proband of Family 3 (Figure 2c) was diagnosed with CRC at the age of 59 years, associated with nine colonic adenomatous polyps. Genetic testing detected a heterozygous truncating variant in the AXIN2 gene: c.2303_2306del, p.(Tyr768Phefs*13). In her family, three members (mother, one brother, and one niece) were diagnosed with CRC at the ages of 71, 50, and 56 years, respectively. Based on the truncating nature of the variant and the coherent family history, together with its absence from the general population, we classified this variant as pathogenic.
For Family 4 (Figure 2d), later-onset CRCs were diagnosed in the proband and two first-degree relatives, in addition to adenomatous polyps found in the proband, his father, and six of his siblings, with the earliest onset at 42 years of age. The proband's tumor was MSS with normal expression of the four MMR proteins. We found in this patient a heterozygous missense variant in the AXIN2 gene: c.952A>G, p.(Ser318Gly). This variant is absent from the general population and is predicted to be deleterious by several in silico tools, including Align GVGD (C55), SIFT, Polyphen 2, and CADD V1.6 (phred score = 31). Furthermore, both SPiP and SpliceAI predict that this variant alters splicing by creating a de novo splice donor site. Nevertheless, its pathogenicity cannot be clearly determined until further analysis is conducted, in particular at the transcript level. Co-segregation analysis in the family is also awaited.
3.3 | Cases With RNF43 Variants
The proband of Family 5 (Figure 2e) was diagnosed with CRC at the age of 48 years, with a serrated polyp removed during colonoscopy. The proband's mother developed synchronous colorectal and ovarian cancers at the age of 68 years. A heterozygous truncating variant in the RNF43 gene was detected: c.394C>T, p.(Arg132*). It was classified as pathogenic based on the interruption of protein synthesis, its absence from the general population, the consistent clinical phenotype and, in addition, the loss of heterozygosity (LOH) of the wild-type allele revealed in the tumor (Figure S1).
For Family 6 (Figure 2f), the proband was diagnosed with CRC associated with one serrated and three adenomatous polyps. The patient's tumor was MSS with normal expression of MMR proteins. A heterozygous missense variant in the RNF43 gene was detected: c.655C>T, p.(Arg219Cys). This variant is present at a low frequency (0.018%) in the European non-Finnish population. It is predicted to be deleterious by SIFT and Polyphen 2 and is highly scored by Align GVGD (C65) and CADD (phred score = 32), compatible with impaired function. Nevertheless, the clinical and biological evidence is still insufficient to determine pathogenicity, and it remains a variant of unknown significance (VUS). No co-segregation study was possible for this family, as all affected members are deceased.
3.4 | Case With TP53 Variant
The proband of the Family 7 (Figure 2g) developed a CRC associated with 20 adenomatous polyps at the age of 67 years. His
FIGURE 1 | Screening strategy for germline cancer susceptibility gene variants. Number sign (#) indicates homozygous carrier. Asterisk (*)
indicates rare missense variant predicted to be deleterious with CADD Phred score ≥30. CRC, colorectal cancer; LS, Lynch syndrome; MMR,
mismatch repair; PJS, Peutz–Jeghers Syndrome; PPAP, polymerase proofreading-associated polyposis; PS, polyposis syndromes; PV/LPV, pathogenic
or likely pathogenic variant.
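The asterisk rule in the Figure 1 legend (rare missense variants with a CADD Phred score of at least 30) can be expressed as a simple filter. The Python sketch below is illustrative only: it uses the two missense variants whose CADD scores are quoted in the text, and the class name `MissenseVariant`, the helper `is_flagged`, and the data layout are our own assumptions rather than part of the authors' in-house pipeline.

```python
from dataclasses import dataclass

# Threshold stated in the Figure 1 legend (CADD Phred score >= 30).
CADD_THRESHOLD = 30


@dataclass(frozen=True)
class MissenseVariant:
    """Hypothetical record for a missense variant and its CADD score."""
    gene: str
    hgvs_c: str
    hgvs_p: str
    cadd_phred: float


# The two missense variants for which the text quotes a CADD Phred score.
variants = [
    MissenseVariant("AXIN2", "c.952A>G", "p.(Ser318Gly)", 31),
    MissenseVariant("RNF43", "c.655C>T", "p.(Arg219Cys)", 32),
]


def is_flagged(variant: MissenseVariant) -> bool:
    """True when the variant meets the CADD threshold from the legend."""
    return variant.cadd_phred >= CADD_THRESHOLD


flagged = [v.gene for v in variants if is_flagged(v)]
print(flagged)
```

Both variants meet the threshold, although a high CADD score alone did not suffice for either to be classified as pathogenic in this report.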
TABLE 1 | Summary of the clinicopathological characteristics of the patients carrying variants in recently proposed colorectal cancer susceptibility gene.
Nb. family | Gene | Variant | Proband's phenotype (age at diagnosis) | Phenotype in first- and second-degree relatives (age) | Variant classification
1 | NTHL1a | c.268C>T, p.(Gln90*) | Br (58), En (63), >10 AP | Polyps, CRC (37), Br (42), pheochromocytoma (46) | 5
2 | NTHL1a | c.268C>T, p.(Gln90*) | Meningioma (56), Br (68), En (68) | Br (?), CRC (?), Br (62), brain (18), lung cancer (59) | 5
3 | AXIN2 | c.2303_2306del, p.(Tyr768Phefs*13) | CRC (59), 9 AP | CRC (71), CRC (50), AP (41), CRC (56), 1 HP (36) | 5
4 | AXIN2 | c.952A>G, p.(Ser318Gly) | CRC (69), 3 AP | CRC (80) + 17 AP, CRC (69), several polyps in the sibling | 3
5 | RNF43 | c.394C>T, p.(Arg132*) | CRC (48), 1 SP (48) | CRC + OV (68) | 5
6 | RNF43 | c.655C>T, p.(Arg219Cys) | CRC (48), 1 SP, 3 AP | GC (48), CRC (55) | 3
7 | TP53 | c.845G>A, p.(Arg282Gln) | CRC (67), 20 AP | OV (43), gynecological cancer (?) | 4
8 | BUB1 | c.2166G>A, p.(Trp722*) | 15 AP (76) | CRC (67) | 5
9 | BUB1 | c.625C>T, p.(Arg209*) | Chondrosarcoma (25), 15 AP (40–62) | CRC (?), Br (54), pancreas cancer (88) | 5
Abbreviations: AP, adenomatous polyp(s); Br, tumors of breast; CRC, colorectum; En, endometrium; GC, stomach; HP, hyperplasic polyp(s); OV, ovary; SP, serrated polyp(s); (?), unknown age.
a Homozygous variant.
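The "Variant classification" column of Table 1 can be tallied to check it against the summary counts given in the text. The Python sketch below is illustrative only. It assumes the conventional five-tier reading of the numeric classes (5 = pathogenic, 4 = likely pathogenic, 3 = variant of uncertain significance), which the table itself does not spell out, and the variable names are our own. The tally does not depend on the order of the values.

```python
from collections import Counter

# Assumed meaning of the numeric classes (conventional five-tier scheme);
# the table does not define them explicitly.
CLASS_LABELS = {
    5: "pathogenic",
    4: "likely pathogenic",
    3: "variant of uncertain significance",
}

# The nine values listed in the "Variant classification" column of Table 1.
classifications = [5, 5, 4, 3, 5, 3, 5, 5, 5]

counts = Counter(classifications)
pv_lpv = sum(n for cls, n in counts.items() if cls >= 4)

for cls in sorted(counts, reverse=True):
    print(f"class {cls} ({CLASS_LABELS[cls]}): {counts[cls]}")
print(f"PV/LPV total: {pv_lpv} of {len(classifications)}")
```

Under this reading, seven of the nine entries are class 4 or 5, which agrees with the seven PV/LPV reported in the abstract, and the two class 3 entries correspond to the AXIN2 and RNF43 missense variants left unresolved in the text.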
10982264, 2024, 8, Downloaded from https://onlinelibrary.wiley.com/doi/10.1002/gcc.23263 by CAPES, Wiley Online Library on [01/11/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
Genes, Chromosomes and Cancer, 2024
FIGURE 2 | Family 1–9 pedigrees. No consanguinity was reported. Black arrows indicate index cases. “MUT/MUT” and “MUT/WT” denote
homozygous and heterozygous carriers, respectively. Age at diagnosis is indicated in brackets. AP, adenomatous polyp(s); Br, tumors of breast; CRC,
colorectal; En, endometrium; GC, stomach; HP, hyperplastic polyp(s); MSS, microsatellite stable; OV, ovary; pMMR, proficient mismatch repair; SP,
serrated polyp(s).
3.5 | Cases With BUB1 Variants
The variant in the BUB1 gene c.2166G>A p.(Trp722*) was identified in a patient from Family 8 (Figure 2h) who had 15 adenomatous polyps detected at the age of 76 years. His sister was
diagnosed with CRC at the age of 67 years. Based on the truncating nature, the absence in the general population and consistent
clinical manifestation, we classified it as pathogenic.
The proband of Family 9 (Figure 2i) was diagnosed with a
chondrosarcoma at the age of 25 years. He also had 15 colonic adenomatous polyps detected between the ages of 40 and 62 years.
His father developed pancreatic cancer at the age of 88 years,
and his mother was diagnosed with breast cancer at the age
of 55 years. The paternal grandfather had colon cancer at an
unknown age. A truncating variant of the BUB1 gene, c.625C>T
p.(Arg209*), was identified in the patient. The clinical features,
the biological consequence, and the very low prevalence in the general
population (0.004% in the non-Finnish European population) led us
to classify this variant as pathogenic.
4 | Discussion
We report here nine variants detected in five uncommon
genes (NTHL1, AXIN2, RNF43, BUB1, and TP53) in patients
suspected of having hereditary CRC or polyposis, seven of which
were considered PV/LPV. The pathogenicity of the two
other rare variants could not be determined, although
they were predicted to be deleterious or spliceogenic by in silico
algorithms. Further investigations are needed for their classification. Co-segregation studies are important for
determining the role of such uncommon cancer predisposition
genes in familial cancer syndromes and for establishing appropriate clinical surveillance for variant carriers. Unfortunately,
they could not be carried out in these families, partly because
of their small size, with few members affected by relevant
cancers available for testing. They will be completed when
possible.
All five genes have previously been reported to be associated with
hereditary CRC and/or polyposis susceptibility, with apparently
variable clinical or biological characteristics. NTHL1 PVs cause
a recessively inherited multitumor syndrome [13, 18, 19]. So far,
50 biallelic PV carriers have been reported [20, 21], and many
of them carried the homozygous c.268C>T variant. Of note, 0.38%
of the general non-Finnish European population are monoallelic
carriers of this variant. Biallelic carriers mainly developed CRC,
colonic polyps, breast cancer and, less frequently, endometrial cancer. A number of other cancers were reported at lower frequency. Both female biallelic c.268C>T carriers in our series developed multiple primary cancers.
Consistent with reported cases, breast and endometrial cancers
were diagnosed in both, but at a later age (58 and 68 years for breast
and 63 and 68 years for endometrial cancer), and one had more than
10 colonic polyps. One carrier had a meningioma, which is
one of the rare manifestations already found in biallelic NTHL1 PV
carriers [20]. Our findings, together with previous reports, suggest that female carriers have a higher risk of developing
breast and endometrial cancers, even at an advanced age. Other
NTHL1-associated cancers were found in family members, including early-onset CRC in a male patient and early-onset breast
cancer in female patients, as well as brain cancer, but unfortunately their carrier status could not be confirmed. Of note,
a pheochromocytoma at the age of 46 years and a lung cancer
at the age of 59 years were diagnosed in family members; these
tumors have not previously been described in NTHL1 PV carriers.
Monoallelic germline PVs in the AXIN2 gene have been reported in
fewer than 20 patients worldwide [22]. All were truncating variants leading to the synthesis of a protein lacking the functional C-terminal
disheveled and axin (DIX) domain and to
degradation by nonsense-mediated mRNA decay (NMD).
The PV identified in Family 3 is located within the DIX domain (exon 10) and thus doubtlessly impacts protein function. In
contrast, the consequence for protein synthesis/function of the
missense variant found in Family 4 (exon 3) requires further investigation, especially regarding the splicing defect predicted
by in silico algorithms. Reported AXIN2 PV carriers predominantly presented with CRC and colonic adenomatous polyposis.
Extracolonic cancers seem to be rare. These observations are
consistent with our findings in the two families, in which only colon
cancer and polyps were diagnosed in affected members.
Patients carrying AXIN2 PVs reportedly often present dental
anomalies such as anodontia or oligodontia [22]. Unfortunately,
no information about dental examination was available for carriers in these two families.
Monoallelic germline RNF43 PVs are associated with inherited
serrated polyposis syndrome [23–25]. To date, only seven PVs
have been reported in the literature, involving 10 families including
Family 5 from this study. The variant c.394C>T, p.(Arg132*)
identified in this family is likely recurrent, since it has been identified in three unrelated families. RNF43 PV carriers commonly
have serrated colorectal polyps susceptible to malignant transformation. Our carrier was diagnosed with CRC at 48 years with the
detection of one serrated polyp, but did not fulfill the clinical diagnostic criteria for serrated polyposis syndrome. A serrated polyp was also
detected in the patient carrying the RNF43 variant of uncertain significance (VUS) c.655C>T,
which constitutes a supporting element in its interpretation, although
a definitive classification cannot be established at present according to ACMG criteria.
It is well known that germline TP53 PVs cause LFS, an aggressive condition predisposing carriers to various malignancies,
especially sarcomas, brain tumors, and breast cancer, at young
ages. CRC was not conventionally considered a component
of the LFS tumor spectrum. However, a recent study from Terradas
mother was diagnosed with ovarian cancer at the age of 43 years
and his sister developed a gynecological cancer at an unknown
age. A TP53 gene variant was detected: c.845G>A p.(Arg282Gln). This variant is absent from the general population. It affects a
highly conserved amino acid and is predicted to be deleterious by
SIFT and PolyPhen-2. It is a hotspot somatic mutation in various
types of cancer [14]. Functional testing showed that it reduced
transactivation activity [15]. At the germline level, it has been reported
in families with childhood cancers or breast cancers compatible
with Li–Fraumeni syndrome (LFS), with, interestingly, a CRC
diagnosed in one of the carriers at the age of 47 years [16, 17].
Taken together, these data led us to classify it as likely pathogenic.
Germline monoallelic BUB1 PVs have reportedly been associated
with hereditary early-onset CRC [4]. To date, a total of eight
BUB1 variants have been reported in CRC patients, with only four
being considered functionally deleterious [27, 28]. Carriers
mainly presented with early-onset CRC and colonic adenomatous
polyps. We report here two additional carriers of BUB1 truncating variants with colonic polyps (>15 for both) as the predominant
phenotype. However, the fact that carriers in our series had only
colonic polyps, detected at a later age, seems to suggest that BUB1
PVs predispose carriers to the development of polyps rather
than to early-onset CRC. Of note, one of the carriers developed
a chondrosarcoma at the age of 25 years, but whether this tumor
was related to BUB1 inactivation could not be determined.
Apparently, each of these genes is responsible for a small subgroup of patients with inherited pMMR colon cancer and polyposis. The prevalence of PV/LPV was shown to be low in
large series: 0.2% (8/3936) for NTHL1 [13], 0.24%
(8/3322) for AXIN2 [22], 0.21% (1/473) for TP53 [26], 0.14% (combined studies) for BUB1 [27], as well as 1.3% (1/73) for RNF43
in selected patients with serrated polyposis syndrome [29]. In
our study, in 325 patients with pMMR or uMMR tumor status,
the prevalence of PV/LPV in these genes was 0.6% (2/325) for
NTHL1, 0.3% (1/325) for AXIN2, 0.6% (2/325) for BUB1, 0.3%
(1/325) for TP53, and 0.3% (1/325) for RNF43. Taken together, and including the two potentially deleterious rare missense
variants, these genes, which are currently not considered
confirmed diagnostic predisposition genes and are not systematically screened, were potentially implicated in 2.7% (9/325) of
cases with p/uMMR tumor status. It therefore seems
important to include these genes in routine testing, possibly as a second-line analysis when the investigation of diagnostic genes is negative, in particular for patients with a pMMR
tumor phenotype. The accumulation of such rare cases
is undoubtedly essential for precisely evaluating the cancer risks related to each of
these genes, in order to propose adapted clinical surveillance for
affected families.
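The per-gene and overall proportions quoted for this series follow directly from the carrier counts. A minimal Python sketch that recomputes them from the counts given in the text; the variable and function names are illustrative, and one carrier proband per family is assumed. The results agree with the printed percentages to within one unit of the last digit.

```python
# Carrier counts per gene among the 325 patients with pMMR or uMMR tumors,
# taken from the text (one proband per family is an assumption of this sketch).
COHORT_SIZE = 325
pv_lpv_carriers = {"NTHL1": 2, "AXIN2": 1, "BUB1": 2, "TP53": 1, "RNF43": 1}
RARE_MISSENSE_UNCLASSIFIED = 2  # the two rare missense variants left unclassified


def prevalence_pct(carriers: int, total: int) -> float:
    """Return the proportion of carriers as a percentage."""
    return 100.0 * carriers / total


per_gene = {gene: prevalence_pct(n, COHORT_SIZE) for gene, n in pv_lpv_carriers.items()}
total_carriers = sum(pv_lpv_carriers.values()) + RARE_MISSENSE_UNCLASSIFIED
overall = prevalence_pct(total_carriers, COHORT_SIZE)

for gene, pct in per_gene.items():
    print(f"{gene}: {pv_lpv_carriers[gene]}/{COHORT_SIZE} = {pct:.2f}%")
print(f"All variants: {total_carriers}/{COHORT_SIZE} = {overall:.2f}%")
```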
Limitations of this study include incomplete clinical data in
some affected families. The lack of access to samples from affected
family members hampered co-segregation analysis. In addition, functional studies, especially evaluation of the spliceogenicity of the two missense variants, remain to be carried out in order to determine their
pathogenicity.
In summary, our findings provide novel evidence that,
besides the major confirmed digestive tract cancer predisposition
genes, a number of other genes are involved in
inherited CRC and polyposis susceptibility, although each affects a small number of patients. Our observations further
confirm the etiological diversity of inherited CRC and polyposis. Indeed, additional genes with potential
susceptibility to CRC/colonic polyps are emerging, such as MBD4 [30]. It is
still difficult to establish a clear genotype–phenotype correlation, but
data from cases described in the literature and in this study appear to be consistent, providing clues for the understanding
of distinct gene-related cancer syndromes. We believe that
systematic screening of these uncommon genes should be recommended, allowing the collection of the related clinical and
biological characteristics necessary for establishing appropriate
surveillance programs for carriers and their families.
Author Contributions
A.B. and Q.W. were responsible for designing the study, supervising
the research, interpreting the variants, and editing the manuscript. A.F., H.Z., S.H., F.D., C.K., F.P., M.P., C.L., L.C., and J.-C.S. are
the geneticists or genetic counselors who identified the probands and saw them in
consultation. All authors approved the submitted version.
Acknowledgments
We thank the patients for their participation in this study.
Ethics Statement
Written informed consent was obtained for all patients who were tested
and diagnosed within the frame of genetic counseling, in accordance
with French law for diagnostic genetic testing. Samples were collected
in the frame of care, from patients who consented to a research use of
their samples. Testing was done in a hospital laboratory approved for
genetic molecular diagnosis. The analyses were performed in accordance with French regulations and the principles of the Declaration of
Helsinki.
Conflicts of Interest
The authors declare no conflicts of interest.
Data Availability Statement
The datasets generated and/or analyzed during the current study are
available on request from the corresponding author. The data are not
publicly available due to privacy or ethical restrictions.
References
1. L. Valle and K. J. Monahan, “Genetic Predisposition to Gastrointestinal Polyposis: Syndromes, Tumour Features, Genetic Testing, and
Clinical Management,” Lancet Gastroenterology & Hepatology 9 (2024):
68–82.
2. M. Dhooge, S. Baert-Desurmont, C. Corsini, et al., “National Recommendations of the French Genetics and Cancer Group—Unicancer on
the Modalities of Multi-Genes Panel Analyses in Hereditary Predispositions to Tumors of the Digestive Tract,” European Journal of Medical
Genetics 63 (2020): 104080.
3. R. Mao, P. Krautscheid, R. P. Graham, et al., “Genetic Testing for Inherited Colorectal Cancer and Polyposis, 2021 Revision: A Technical
Standard of the American College of Medical Genetics and Genomics
(ACMG),” Genetics in Medicine 23 (2021): 1807–1817.
4. R. M. de Voer, A. Geurts van Kessel, R. D. A. Weren, et al., “Germline
Mutations in the Spindle Assembly Checkpoint Genes BUB1 and BUB3
Are Risk Factors for Colorectal Cancer,” Gastroenterology 145 (2013):
544–547.
5. S. Richards, N. Aziz, S. Bale, et al., “Standards and Guidelines for the
Interpretation of Sequence Variants: A Joint Consensus Recommendation of the American College of Medical Genetics and Genomics and the
et al. reported the detection of TP53 loss-of-function alterations
in CRC patients with proficient tumor MMR (pMMR). Together
with other similar findings, this suggested a role for TP53 as a CRC predisposition gene that may be independent of LFS
[26]. The novel TP53 LPV found in this study provides an additional case supporting this hypothesis. However, the presence of
>20 adenomatous polyps in our patient seems to be uncommon,
having been described in only one family member of a carrier [26].
6. S. V. Tavtigian, G. B. Byrnes, D. E. Goldgar, and A. Thomas, “Classification of Rare Missense Substitutions, Using Risk Surfaces, With Genetic-and Molecular-Epidemiology Applications,” Human Mutation 29
(2008): 1342–1354.
7. P. C. Ng and S. Henikoff, “SIFT: Predicting Amino Acid Changes That
Affect Protein Function,” Nucleic Acids Research 31 (2003): 3812–3814.
8. I. A. Adzhubei, S. Schmidt, L. Peshkin, et al., “A Method and Server
for Predicting Damaging Missense Mutations,” Nature Methods 7
(2010): 248–249.
9. M. Kircher, D. M. Witten, P. Jain, B. J. O'Roak, G. M. Cooper, and J.
Shendure, “A General Framework for Estimating the Relative Pathogenicity of Human Genetic Variants,” Nature Genetics 46 (2014): 310–315.
10. R. Leman, B. Parfait, D. Vidaud, et al., “SPiP: Splicing Prediction
Pipeline, a Machine Learning Tool for Massive Detection of Exonic
and Intronic Variant Effects on mRNA Splicing,” Human Mutation 43
(2022): 2308–2323.
11. K. Jaganathan, S. Kyriazopoulou Panagiotopoulou, J. F. McRae,
et al., “Predicting Splicing From Primary Sequence With Deep Learning,” Cell 176 (2019): 535–548.e24.
12. C. Lefol, E. Sohier, C. Baudet, et al., “Acquired Somatic MMR Deficiency Is a Major Cause of MSI Tumor in Patients Suspected for ‘Lynch-
Like Syndrome’ Including Young Patients,” European Journal of Human
Genetics 29 (2021): 482–488.
13. F. Boulouard, E. Kasper, M.-P. Buisine, et al., “Further Delineation
of the NTHL1 Associated Syndrome: A Report From the French Oncogenetic Consortium,” Clinical Genetics 99 (2021): 662–672.
24. I. Quintana, R. Mejías-Luque, M. Terradas, et al., “Evidence Suggests That Germline RNF43 Mutations Are a Rare Cause of Serrated
Polyposis,” Gut 67 (2018): 2230–2232.
25. M. K. Gala, Y. Mizukami, L. P. Le, et al., “Germline Mutations in
Oncogene-Induced Senescence Pathways Are Associated With Multiple
Sessile Serrated Adenomas,” Gastroenterology 146 (2014): 520–529.
26. M. Terradas, P. Mur, S. Belhadj, et al., “TP53, a Gene for Colorectal
Cancer Predisposition in the Absence of Li–Fraumeni-Associated Phenotypes,” Gut 70 (2021): 1139–1146.
27. P. Mur, R. M. De Voer, R. Olivera-Salguero, et al., “Germline Mutations in the Spindle Assembly Checkpoint Genes BUB1 and BUB3 Are
Infrequent in Familial Colorectal Cancer and Polyposis,” Molecular
Cancer 17 (2018): 23.
28. M. Djursby, M. B. Madsen, J. H. Frederiksen, et al., “New Pathogenic
Germline Variants in Very Early Onset and Familial Colorectal Cancer
Patients,” Frontiers in Genetics 11 (2020): 566266.
29. A. Murphy, J. Solomons, P. Risby, et al., “Germline Variant Testing in Serrated Polyposis Syndrome,” Journal of Gastroenterology and
Hepatology 37 (2022): 861–869.
30. C. Palles, H. D. West, E. Chew, et al., “Germline MBD4 Deficiency
Causes a Multi-Tumor Predisposition Syndrome,” American Journal of
Human Genetics 109 (2022): 953–960.
Supporting Information
Additional supporting information can be found online in the
Supporting Information section.
14. H. Wang, M. Guo, H. Wei, and Y. Chen, “Targeting p53 Pathways:
Mechanisms, Structures, and Advances in Therapy,” Signal Transduction and Targeted Therapy 8 (2023): 1–35.
15. P. Campomenosi, P. Monti, A. Aprile, et al., “p53 Mutants Can Often
Transactivate Promoters Containing a p21 But Not Bax or PIG3 Responsive Elements,” Oncogene 20 (2001): 3573–3579.
16. U. Stoltze, A.-B. Skytte, H. Roed, et al., “Clinical Characteristics
and Registry-Validated Extended Pedigrees of Germline TP53 Mutation
Carriers in Denmark,” PLoS One 13 (2018): e0190050.
17. A. Chompret, L. Brugières, M. Ronsin, et al., “P53 Germline Mutations in Childhood Cancers and Cancer Risk for Carrier Individuals,”
British Journal of Cancer 82 (2000): 1932–1937.
18. C. B. Weatherill, S. A. Burke, C. G. Haskins, et al., “Six Case Reports
of NTHL1-Associated Tumor Syndrome Further Support It as a Multi-Tumor
Predisposition Syndrome,” Clinical Genetics 103 (2023): 231–235.
19. J. E. Grolleman, R. M. de Voer, F. A. Elsayed, et al., “Mutational Signature Analysis Reveals NTHL1 Deficiency to Cause a Multi-Tumor
Phenotype,” Cancer Cell 35 (2019): 256–266.e5.
20. S. H. Beck, A. M. Jelsig, H. M. Yassin, L. J. Lindberg, K. A. W. Wadt,
and J. G. Karstensen, “Intestinal and Extraintestinal Neoplasms in Patients With NTHL1 Tumor Syndrome: A Systematic Review,” Familial
Cancer 21 (2022): 453–462.
21. N. Grot, M. Kaczmarek-Ryś, E. Lis-Tanaś, et al., “NTHL1 Gene
Mutations in Polish Polyposis Patients-Weighty Player or Vague Background?” International Journal of Molecular Sciences 24 (2023): 14548.
22. J. Leclerc, M. Beaumont, R. Vibert, et al., “AXIN2 Germline Testing in a French Cohort Validates Pathogenic Variants as a Rare Cause
of Predisposition to Colorectal Polyposis and Cancer,” Genes, Chromosomes & Cancer 62 (2023): 210–222.
23. H. H. N. Yan, J. C. W. Lai, S. L. Ho, et al., “RNF43 Germline and Somatic Mutation in Serrated Neoplasia Pathway and Its Association With
BRAF Mutation,” Gut 66 (2017): 1645–1656.
Association for Molecular Pathology,” Genetics in Medicine 17 (2015):
405–424.
de Oliveira et al.
Orphanet Journal of Rare Diseases
(2024) 19:405
https://doi.org/10.1186/s13023-024-03392-7
Open Access
RESEARCH
Epidemiological characterization of rare
diseases in Brazil: A retrospective study
of the Brazilian Rare Diseases Network
Bibiana Mello de Oliveira1,2, Filipe Andrade Bernardi3, João Francisco Baiochi4, Mariane Barros Neiva5,
Milena Artifon6, Alberto Andrade Vergara7, Ana Maria Martins8, Anete Sevciovic Grumach9,
Angelina Xavier Acosta10, Antonette Souto El Husny11, Bethania de Freitas Rodrigues Ribeiro12,
Camila Ferreira Ramos13, Carlos Eduardo Steiner14, Chong Ae Kim15, Denise Maria Christofolini9,
Diego Bettiol Yamada16, Ellaine Doris Fernandes Carvalho17, Erlane Marques Ribeiro18,
Fabíola de Arruda Bastos19, Faradiba Sarquis Serpa20, Flávia Reseda Brandão21,
Giselle Maria Araujo Felix Adjuto22, Isabelle Carvalho5, Jonas Alex Morales Saute23,
Juan Clinton Llerena Junior24, Larissa Souza Mario Bueno25, Luiz Carlos Santana da Silva11,
Mara Lucia Schmitz Ferreira Santos26, Marcela Câmara Machado Costa27, Marcia Maria Costa Giacon Giusti28,
Marcial Francis Galera29, Márcio Eloi Colombo Filho3, Maria Denise Fernandes Carvalho de Andrade30,
Maria Teresinha De Oliveira Cardoso22,31, Marilaine Matos de Menezes Ferreira27, Michelle Zeny26,
Milena Coelho Fernandes Caldato32, Ney Boa Sorte13, Nina Rosa de Castro Musolino33,
Paula Frassinetti Vasconcelos de Medeiros34, Paulo Ricardo Gazzola Zen35, Raquel Tavares Boy Da Silva36,
Rayana Elias Maia37, Rodrigo Fock8, Rosemarie Elizabeth Schimidt Almeida38, Solange Oliveira Rodrigues Valle39,
Tatiana Amorim40, Thaís Bomfim Teixeira41, Vania Mesquita Gadelha Prazeres42,
Victor Evangelista de Faria Ferraz43, Vinicius Costa Lima44, Wagner José Martins Paiva38,
Ida Vanessa Doederlein Schwartz1,2, Domingos Alves4,45, Têmis Maria Félix46* and Raras Network Group
Abstract
Background The Brazilian Policy for Comprehensive Care for People with Rare Diseases was implemented in 2014;
however, national epidemiological data on rare diseases (RDs) are scarce and mainly focused on specific disorders.
To address this gap, University Hospitals, Reference Services for Neonatal Screening, and Reference Services for Rare
Diseases, all of which are public health institutions, established the Brazilian Rare Diseases Network (RARAS) in 2020.
The objective of this study was to perform a comprehensive nationwide epidemiological investigation of individuals
with RDs in Brazil. This retrospective survey collected data from patients receiving care in 34 healthcare facilities affiliated with RARAS in 2018 and 2019.
Results The survey included 12,530 participants with a median age of 15.0 years, with women representing 50.5%
of the cohort. Classification according to skin color demonstrated that 5044 (47.4%) participants were admixed. Most
*Correspondence:
Têmis Maria Félix
tfelix@hcpa.edu.br
Full list of author information is available at the end of the article
© The Author(s) 2024. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which
permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the
original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or
other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line
to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory
regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this
licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver
(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
had a confirmed diagnosis (63.2%), with a predominance of phenylketonuria (PKU), cystic fibrosis (CF), and acromegaly. Common clinical manifestations included global developmental delay and seizures. The average duration
of the diagnostic odyssey was 5.4 years (± 7.9 years). Among the confirmed diagnoses, 52.2% were etiological (biochemical: 42.5%; molecular: 30.9%), while 47.8% were clinical. Prenatal diagnoses accounted for 1.2%. Familial recurrence and consanguinity rates were 21.6% and 6.4%, respectively. Mainstay treatments included drug therapy (55.0%)
and rehabilitation (15.6%). The Public Health System funded most diagnoses (84.2%) and treatments (86.7%). Hospitalizations were reported in 44.5% of cases, and the mortality rate was 1.5%, primarily due to motor neuron disease
and CF.
Conclusion This study marks a pioneering national-level data collection effort for rare diseases in Brazil, offering
novel insights to advance the understanding, management, and resource allocation for RDs. It unveils an average
diagnostic odyssey of 5.4 years and a higher prevalence of PKU and CF, possibly associated with the specialized services network, which included newborn screening services.
Keywords Rare diseases, Public Health System, Brazil, Brazilian Rare Diseases Network
Introduction
Rare diseases (RDs) are individually rare but collectively
affect a significant proportion of the population. Approximately 71.9% of RDs have a genetic cause, and there are
over 6000 known RDs [1]. They represent a serious public
health problem with major unmet needs since many are
life-limiting or chronically debilitating. Patients and families with RDs often face long diagnostic journeys, while
healthcare professionals struggle with identifying, managing, and obtaining accurate information about these
conditions. RDs are often associated with early mortality
and a considerable reduction in quality of life [1–5].
In Brazil, the Ministry of Health defines an RD as any
disorder that affects up to 65 per 100,000 individuals [3,
4]. Previous international studies have reported an estimated population prevalence of RDs of 3.5–8.0%, suggesting that they have a substantial impact on public
health [1, 5, 6]. Extrapolating these estimates to the Brazilian population [7] produces a corresponding figure of
7.0–16.2 million Brazilians affected by RDs, highlighting
their significant burden and public health implications.
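The extrapolation above is a simple multiplication of a prevalence by a population size. Because the population figure from [7] is not restated here, the following Python sketch works backwards instead: it derives the population size implied by each quoted bound. The names are illustrative, and only the numbers printed in this paragraph are used.

```python
# Figures quoted in the text: prevalence range of RDs and the corresponding
# number of affected Brazilians, in millions.
PREVALENCE_LOW, PREVALENCE_HIGH = 0.035, 0.080
AFFECTED_LOW_M, AFFECTED_HIGH_M = 7.0, 16.2


def implied_population_millions(affected_millions: float, prevalence: float) -> float:
    """Population size (millions) for which prevalence yields the affected count."""
    return affected_millions / prevalence


pop_from_low = implied_population_millions(AFFECTED_LOW_M, PREVALENCE_LOW)
pop_from_high = implied_population_millions(AFFECTED_HIGH_M, PREVALENCE_HIGH)

print(f"Implied population from lower bound: {pop_from_low:.1f} million")
print(f"Implied population from upper bound: {pop_from_high:.1f} million")
```

The two bounds imply slightly different population sizes, which is what one would expect if the published figures were rounded to one decimal place.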
Brazil, the fifth-largest country worldwide, covers
8,510,417 square kilometers and is divided into five
regions with 26 states, a Federal District, and 5570
municipalities [7]. The Brazilian Unified Health System
(Sistema Único de Saúde [SUS]) was established in 1988
and aims to provide universal and equitable access to
promotion, prevention, and health care services for all
Brazilian citizens. Brazil has undergone an epidemiological transition in recent decades, marked by significant advancements in health indicators attributable to
external factors. Notably, hereditary diseases and congenital anomalies contribute significantly to child mortality, ranking second among infant mortality causes
since 2005 [8, 9].
In January 2014, the Brazilian Policy for Comprehensive Care for Persons with Rare Diseases was established within the scope of the SUS. This policy aims to
reduce morbidity and mortality and improve the quality of life of individuals with RDs through promotion,
prevention, early detection, timely treatment, disability reduction, and palliative care. It classifies RDs as
genetic and non-genetic, with genetic RDs grouped into
three categories: congenital anomalies and late-onset
disorders, intellectual disability, and inborn errors of
metabolism [10].
To date, over 30 reference services for RDs have been
accredited. This is still insufficient to meet population
demands. Most cases are treated in university hospitals (UHs), but whether their human and technological resources are adequate for RD care is unknown
[10, 11]. Despite advances in diagnosis, mainly due to
the development of new technologies and the recent
organization of RD care in Brazil, the country lacks an
established system for registering RDs. Except for a few
infectious RDs that require mandatory reporting, epidemiological data on these conditions are scarce and,
when available, are often restricted to specific RDs [2,
3].
High-quality epidemiological data on RDs are essential
for understanding patient needs, enhancing healthcare
management, and identifying the potential beneficiaries
of clinical trials and novel therapies. However, epidemiological research encounters obstacles since many studies
rely on limited national registries that often focus on specific disease groups [5]. Therefore, a coordinated effort
to map the epidemiology of RDs in Brazil is needed. The
Brazilian Rare Diseases Network (RARAS) was established in 2020 to bridge this gap, including UHs, RD
reference services (RDRSs), and newborn screening reference services (NSRSs). This initiative encompasses a
national survey of the epidemiology, diagnosis, clinical
presentation, and treatment of individuals with genetic
and non-genetic RDs. It has two phases: retrospective and prospective. The retrospective phase involved
data collection on RD cases treated at centers in 2018 and
2019, while data collection for the prospective phase has
been going on since 2022 [2, 3]. This study presents the
findings of the retrospective phase, undertaking a comparative analysis of distinct diagnostic status groups.
Materials and methods
A retrospective survey was conducted to collect data
from patients under diagnostic investigation or with
a diagnosis or suspicion of an RD who were evaluated
between 2018 and 2019 at 34 centers participating in the
RARAS. These centers include 15 UHs, 4 RDRSs, and 3
NSRSs, with the remaining centers having mixed roles: 8
are both an RDRS and a UH, 3 are both an RDRS and an
NSRS, and 1 is both an NSRS and a UH. A map of the
participating centers can be seen in Additional File 1.
This project’s methodology has been previously published by Alves et al. [2] and Félix et al. [3]. All participating network services retrospectively searched for
cases with genetic and non-genetic RDs and those under
diagnostic investigation. Researchers collected data from
each service by accessing medical records, using a standardized form in the Research Electronic Data Capture
(REDCap) platform hosted at Ribeirao Preto Medical
School, University of São Paulo [12]. The original survey
is available at LattesData [13]. The form collected demographic, clinical, and therapeutic data. Given the different backgrounds of the data collectors, training was
conducted for the participating centers. Initially, a pilot
project was performed in five centers with different medical record management forms (paper or electronic). Two
hundred and fifty cases were collected during the pilot
phase from December 7, 2020, to January 15, 2021. The
data were validated and curated. Based on this validation,
retrospective data collection was initiated in the centers,
which ended in March 2022.
Skin color was described according to the Brazilian
Institute of Geography and Statistics (IBGE) as parda
(admixed), branca (white), preta (black), amarela (yellow), and indígena (indigenous). Phenotypic data were
described according to the Human Phenotype Ontology (HPO) [14] and limited to five terms per case. Diagnostic information was recorded based on international
ontologies (International Statistical Classification of
Diseases and Related Health Problems, Tenth Revision
[ICD-10] [15]; Orphanet [ORPHA] [16]; or Online Mendelian Inheritance in Man [OMIM] [17]), enabling comparison and aggregation with Orphadata. Reasons for
hospitalization and causes of death were documented
using ICD-10 [3].
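As a rough illustration of how diagnostic codes recorded under the three terminologies might be separated before aggregation, the following Python sketch classifies code strings by pattern. The `ORPHA:` and `OMIM:` prefixes, the regular expressions, and the function name `classify_code` are assumptions made for this example; they are not taken from the RARAS data collection form.

```python
import re
from collections import Counter

# Assumed string formats (hypothetical; the study form may store codes differently).
PATTERNS = {
    "ICD-10": re.compile(r"^[A-Z]\d{2}(\.\d{1,2})?$"),  # e.g. E84, Q78.0
    "ORPHA": re.compile(r"^ORPHA:\d+$"),                 # e.g. ORPHA:586
    "OMIM": re.compile(r"^OMIM:\d{6}$"),                 # e.g. OMIM:219700
}


def classify_code(code):
    """Return the terminology a code string appears to belong to, or None."""
    normalized = code.strip().upper()
    for terminology, pattern in PATTERNS.items():
        if pattern.match(normalized):
            return terminology
    return None


# Illustrative strings only; these are not study records.
recorded = ["E84", "Q78.0", "ORPHA:586", "OMIM:219700", "orpha:716", "unknown"]
counts = Counter(classify_code(code) for code in recorded)
print(counts)
```

Strings that match none of the assumed patterns are returned as `None`, so they can be reviewed by hand rather than silently dropped.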
Data analyses were performed using the IBM® SPSS
Statistics software (version 26) and Python language
(version 3.9.17), leveraging the Pandas (version 1.5.3),
NumPy (version 1.24.3), and SciPy (version 1.10.1) libraries. In the descriptive analyses, each individual was evaluated independently. In the comparative analyses based
on diagnostic status, each diagnosis was considered independently, as an individual might have more than one
RD diagnosis. The chi-squared test was used to compare
nominal variables, while the Kruskal–Wallis test was
applied to compare continuous numerical variables. In
both cases, the Bonferroni correction was utilized for
multiple comparisons. The significance level was set at
0.05.
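To make the Kruskal–Wallis comparison concrete, the sketch below computes the H statistic for three groups using only the Python standard library. The age values are hypothetical and chosen for illustration; they are not study data. The p-value uses the chi-squared approximation with two degrees of freedom, whose upper tail is exp(-H/2), so this shortcut is valid only for exactly three groups. In the stack named above, `scipy.stats.kruskal` would normally be used instead.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = [value for group in groups for value in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)
    tie_sizes = {}
    for value in pooled:
        tie_sizes[value] = tie_sizes.get(value, 0) + 1
    correction = 1.0 - sum(t ** 3 - t for t in tie_sizes.values()) / (n ** 3 - n)
    return h / correction


# Hypothetical ages (years) for three diagnostic-status groups.
confirmed = [18, 9, 37, 25, 40]
suspected = [13, 6, 26, 15, 20]
undiagnosed = [11, 7, 17, 5, 8]

h_stat = kruskal_wallis_h([confirmed, suspected, undiagnosed])
p_value = math.exp(-h_stat / 2.0)  # chi-squared upper tail, 2 degrees of freedom
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```

With these made-up values the result does not reach the 0.05 threshold, which illustrates that the test operates on ranks rather than on the raw differences in medians.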
Results
Population
Data from 12,530 participants across 34 centers were collected. Slightly more than half of the sample was female (n = 6331; 50.6%),
and 13 (0.1%) individuals had undetermined sex. The
median age was 15.0 years (interquartile range [IQR]:
7–31; mean: 24.9 ± 20.4; range: 1–98) at the time of inclusion (Fig. 1a). The sample’s characteristics are shown in
Table 1.
Classification according to skin color demonstrated
that 5044 (47.5%) individuals were admixed and 4881
(45.9%) were white. Most participants were born in the
Southeast (n = 3765; 33.6%) and Northeast (n = 3729;
33.2%) regions. Individuals born in 1750 Brazilian municipalities were included. Twelve participants (0.1%) were
born in other countries: two in Lebanon and one each
in Egypt, Ecuador, Guinea-Bissau, Japan, Paraguay, Peru,
Portugal, and Venezuela (Table 1). Most participants
lived in the Southeast region (n = 3996; 32.8%), followed
by the Northeast region (n = 3950; 32.5%).
The first evaluation at the participating centers
occurred at a median age of 6.2 years (IQR: 0.9–20.7).
The participants had a median follow-up duration of
2.8 years in the centers (IQR: 0.6–7.9) and 1.7 years in the
medical specialty (IQR: 0.1–1.7). Of the total sample, 92
participants were followed up in more than one participating center.
Diagnosis
Regarding diagnosis status, 7931 (63.2%) participants
had a confirmed diagnosis, while 2450 (19.5%) had a
suspected diagnosis, and 2177 (17.3%) were considered
undiagnosed. Sixty-seven participants had more than one
confirmed RD diagnosis: 65 had two, and two had three.
de Oliveira et al. Orphanet Journal of Rare Diseases
(2024) 19:405
Fig. 1 a Histogram of participants’ age and sex distribution (n = 12,502) and b diagnostic status (n = 12,279)
Regarding the diagnostic terminology, 6644 (64.7%)
of the diagnoses were recorded using an ORPHA code,
2794 (27.2%) using an ICD-10 code, and 825 (8.0%) using
an OMIM code. A total of 1778 different diagnostic codes
were mentioned. The most frequent diseases were phenylketonuria (PKU; n = 623), cystic fibrosis (CF; n = 506),
and acromegaly (n = 382; Table 2). The diagnostic codes
aggregated for the ten most prevalent conditions are
detailed in Additional File 2. Upon excluding cases diagnosed through newborn screening, the most frequent
diagnoses were CF (n = 389), acromegaly (n = 381), and
osteogenesis imperfecta (n = 361). The distribution of the
most frequently reported diagnostic codes at each participating center is detailed in Additional file 3.
Most confirmed diagnoses were etiological (n = 5185;
52.2%), with clinical diagnoses accounting for the
remaining cases (n = 4743; 47.8%). Among the cases with
an etiological diagnosis, most were confirmed through
biochemical (n = 2164; 42.5%), molecular (n = 1574;
30.9%), and cytogenetic (n = 691; 13.6%) diagnostic methods (Fig. 1b). The primary funder for the diagnostic tests
was the SUS (84.2%).
On average, 2.85 HPOs were reported per case.
The most frequent signs and symptoms were global
developmental delay (HP:0001263; n = 1246), seizure
(HP:0001250; n = 734), and short stature (HP:0004322;
n = 678; Table 2). The median age at symptom onset was
0.8 years (IQR: 0–9; mean: 9.2), with a median age of 1
year for confirmed cases and 0.8 years for suspected diagnoses (Table 3). Only 17.8% of participants experienced
symptom onset after the age of 18 years (n = 1638).
The diagnosis was made prenatally in only 121 cases
(1.2%) and via newborn screening in 979 (9.9%) cases.
The median age at confirmatory diagnosis was 10.4 years
(IQR: 2.1–33.1) upon excluding prenatal and newborn
screening diagnoses (Table 3). The average time from the
onset of the first symptom to the diagnostic confirmation
was 5.4 ± 7.9 years (n = 4583).
Table 1 Sample characteristics (n = 12,530)
Characteristic             N       %
Color or race
  Admixed                  5044    47.5
  White                    4881    45.9
  Black                    609     5.7
  Yellow                   68      0.6
  Indigenous               30      0.3
Sex
  Female                   6331    50.6
  Male                     6171    49.3
  Undetermined             13      0.1
Region of birth
  Southeast                3765    33.6
  Northeast                3729    33.2
  South                    1659    14.8
  Midwest                  1377    12.3
  North                    673     6.0
  Born in other countries  12      0.1
Region of residence
  Southeast                3996    32.8
  Northeast                3950    32.5
  South                    2081    17.1
  Midwest                  1497    12.3
  North                    642     5.3
Family history
Family recurrence was reported in 2717 cases (21.6%)
and consanguinity in 803 cases (6.4%). Consanguinity
rates, expressed as percentages, were significantly higher
in the Northeast region (14.0%), followed by the South
(7.1%), North (6.5%), Southeast (6%), and Midwest (4.4%;
p < 0.0001). The mean maternal age at the patient’s birth
was 27.7 ± 7.0 years (range: 12–63), and the mean paternal age was 31.7 ± 8.4 years (range: 12–79).
Treatment
Regarding treatment, 6509 participants (54.3%) received
specific therapy to treat their RD or manage its signs and
symptoms. The most frequent therapies were drug therapy (n = 6108; 55.0%), rehabilitation therapy (n = 1739;
15.6%), and dietary therapy (n = 976; 8.8%). Drug treatment was initiated at an average age of 22 ± 21.8 years,
dietary treatment at 3.2 ± 8.3 years, and rehabilitation at
14.9 ± 19.4 years. The primary funding source for treatments was the SUS (86.7%), which supported 85.6% of
the drug treatments, 83.2% of the dietary treatments, and
88.2% of the rehabilitative treatments.
Multi-specialty medical follow-up was reported in
84.0% (n = 9864) of participants. Apart from medical
genetics, the specialty where most data was collected,
neurology was the most consulted specialty, representing
31% of consultations, followed by endocrinology (22.6%),
neuropediatrics (21%), and ophthalmology (18.2%).
Hospitalization and death
A previous hospitalization was recorded for 4922 participants (44.5%). The mean number of hospitalizations was
4.12 ± 14.2 (range: 0–379), with 5% of participants undergoing at least 13 hospitalizations. The most frequent
reasons for hospitalization were ICD-10 codes E22.0
(acromegaly and pituitary gigantism; n = 189), Q78.0
(osteogenesis imperfecta; n = 161), and E84 (CF; n = 125;
Table 2).
A mortality rate of 1.5% (n = 177) was observed in the
studied population during the evaluated period. The
median age at death was 20.3 years (IQR: 1.6–55.7; mean:
30.3 ± 27.8; range: 0–87.7). The leading causes of death
were ICD-10 codes G12.2 (motor neuron disease; n = 30),
E84 (CF; n = 10), and I46 (cardiac arrest; n = 7; Table 2).
Autopsy was performed in 18 (10.3%) cases.
Table 3 presents comparative data on cases with confirmed diagnoses, suspected diagnoses, and undiagnosed
cases based on the investigated characteristics. Details
of the statistical results and pairwise comparisons with
Bonferroni correction are available in Additional file 4.
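The following Python sketch shows one way a chi-squared comparison with Bonferroni-corrected pairwise tests could be carried out on counts such as those in Table 3. It uses the Female and Male rows of the sex by diagnostic-status cross-tabulation and leaves out the Undetermined row for simplicity, so it is not a reproduction of the published analysis or of Additional file 4. The helper names are hypothetical, no continuity correction is applied, and the p-value helper covers only 1 and 2 degrees of freedom, where closed forms exist; `scipy.stats.chi2_contingency` handles the general case.

```python
import math
from itertools import combinations


def chi2_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def chi2_upper_tail(stat, dof):
    """Upper-tail probability; closed forms for 1 and 2 degrees of freedom only."""
    if dof == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if dof == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch supports only 1 or 2 degrees of freedom")


statuses = ["confirmed", "suspected", "undiagnosed"]
table = [
    [4254, 1085, 971],   # Female (Table 3)
    [3687, 1200, 1184],  # Male (Table 3)
]

omnibus_stat, omnibus_dof = chi2_statistic(table)
omnibus_p = chi2_upper_tail(omnibus_stat, omnibus_dof)

# Pairwise 2 x 2 comparisons with Bonferroni adjustment (raw p times number of tests).
pairs = list(combinations(range(len(statuses)), 2))
pairwise = {}
for a, b in pairs:
    sub_table = [[row[a], row[b]] for row in table]
    stat, dof = chi2_statistic(sub_table)
    adjusted_p = min(1.0, chi2_upper_tail(stat, dof) * len(pairs))
    pairwise[(statuses[a], statuses[b])] = adjusted_p

print(f"omnibus chi2 = {omnibus_stat:.1f}, dof = {omnibus_dof}, p = {omnibus_p:.2e}")
for pair, adjusted_p in pairwise.items():
    print(pair, f"adjusted p = {adjusted_p:.3g}")
```

Multiplying each raw p-value by the number of comparisons and capping at 1 is equivalent to testing each pair against 0.05 divided by the number of comparisons.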
Table 2 The ten most frequent disorders, signs and symptoms, causes of hospitalization, and causes of death
Most frequent diagnoses (N = 12,261)*
Description                     N       %
Phenylketonuria                 623     5.1
Cystic Fibrosis                 506     4.1
Acromegaly                      382     3.1
Osteogenesis Imperfecta         360     2.9
Dystrophinopathy                278     2.3
Congenital adrenal hyperplasia  275     2.2
Neurofibromatosis               271     2.2
Mucopolysaccharidosis           225     1.8
Amyotrophic lateral sclerosis   211     1.7
Turner Syndrome                 197     1.6
Most frequent signs and symptoms (N = 34,685)**
HPO         Description                              N       %
HP:0001263  Global developmental delay               1246    3.6
HP:0001250  Seizure                                  734     2.1
HP:0004322  Short stature                            678     2.0
HP:0001249  Intellectual disability                  514     1.5
HP:0001252  Hypotonia                                451     1.3
HP:0005982  Reduced phenylalanine hydroxylase level  391     1.1
HP:0001324  Muscle weakness                          390     1.1
HP:0002315  Headache                                 331     0.9
HP:0000252  Microcephaly                             326     0.9
HP:0002015  Dysphagia                                298     0.8
Most frequent causes of hospitalization (N = 4,922)***
ICD-10     Description                                    N       %
E22.0      Acromegaly and pituitary gigantism             189     3.8
Q78.0      Osteogenesis imperfecta                        161     3.3
E84        Cystic fibrosis                                125     2.5
J18–J18.9  Pneumonia, organism unspecified                119     2.4
G12.2      Motor neuron disease                           87      1.8
E25        Adrenogenital disorders                        50      1.0
E84.0      Cystic fibrosis with pulmonary manifestations  46      0.9
R56        Convulsions, not elsewhere classified          38      0.8
G71.0      Muscular dystrophy                             33      0.7
G40        Epilepsy and recurrent seizures                32      0.6
Most frequent causes of death (N = 177)
ICD-10     Description                                    N       %
G12.2      Motor neuron disease                           28      15.8
E84        Cystic fibrosis                                10      5.6
I46        Cardiac arrest                                 7       3.9
R09.2      Respiratory arrest                             3       1.7
J96.9      Respiratory failure, unspecified               3       1.7
J96.0      Acute respiratory failure                      3       1.7
J96        Respiratory failure, not elsewhere classified  3       1.7
J38.4      Edema of larynx                                2       1.1
E74.0      Glycogen storage disease                       2       1.1
A41.9      Sepsis, unspecified organism                   2       1.1
A41        Other sepsis                                   2       1.1
*Overall diagnoses. **Total mentioned HPOs. ***Number of individuals with previous hospitalizations
Discussion
This study represents Brazil’s first comprehensive evaluation of RD epidemiology, embodying an innovative
approach based on collaborative efforts and a network-based framework. The specialized services network,
including NSRSs, contributed to the higher prevalence
of PKU and CF diagnoses in this epidemiological survey.
Additionally, this study revealed the average duration of
the diagnostic odyssey for individuals with RDs in Brazil (5.4 years). Moreover, a substantial portion of patients
with RDs were found to remain undiagnosed.
The study population mainly comprised individuals
born and residing in Brazil’s Southeast, Northeast, and
Southern regions, respectively, which are ranked as the
most populous regions in the country [7]. Individuals
born in 1750 Brazilian cities were included, representing 31.4% of all national municipalities [7]. Notably, São
Paulo city, with 12.4 million inhabitants, has the highest population and contributed the most participants
to this study. Higher rates of confirmed diagnoses were
found among participants born and residing in the South
and Southeast regions of the country compared to other
regions, likely due to the greater availability of genetic
testing and specialized resources for RDs in these areas,
as reported in previous studies [8, 9, 11, 18].
The newborn screening program in Brazil encompasses
PKU and CF, contributing to the high frequency of these
conditions in this study. The screening also covers congenital hypothyroidism, hemoglobinopathies, congenital adrenal hyperplasia, and biotinidase deficiency [3].
Sickle cell disease was excluded because it is not rare in certain Brazilian states, especially among individuals with African ancestry [19]. The predominance of medical genetics services among the
participating centers may explain the lower frequency of
congenital hypothyroidism. Upon excluding newborn
screening cases, PKU was no longer the most common diagnosis. A considerable number of CF cases were not identified through neonatal screening. This may be because CF was
added to the Brazilian neonatal screening program only around 2001 [20], and its full implementation
may not have been immediate. False negatives in the screening process
are another possible explanation.
Acromegaly emerged as a notable focal point in our
study, standing out as one of the three most prevalent
conditions in seven participating centers and the most
frequent cause of hospitalization in the studied population. This prominence could be attributed to the specialized nature of at least four of these centers, which
function as dedicated reference services for acromegaly
treatment. This specialization can potentially cause
selection bias, as individuals seeking care specifically
for acromegaly may contribute disproportionately to the
study population from these centers.
In our study, 67 participants had multiple confirmed
RD diagnoses, which poses unique challenges and
impacts patients physically, emotionally, and financially.
With the advancing scope of genomic techniques, having
multiple confirmed RD diagnoses is becoming increasingly common [21].
Compared to the 6.4% consanguinity rate observed in
our study, previous research indicates variable consanguinity rates in different populations. Leutenegger et al.
[22] found inbreeding in various populations around the
world, with the highest levels in the Middle East, Central
South Asia, and the Americas. A mean consanguinity
rate of 0.96% was reported in South America, with higher
rates in Venezuela (1.84%) and Brazil (1.60%) [23]. Previous studies have also indicated higher consanguinity
rates in the Northeastern region [24]. Factors such as low
paternal education and occupation levels were positively
associated with consanguinity [23]. The higher consanguinity rates in our study compared to previous studies can be attributed to the population of participants
with diagnosed or suspected RDs, including autosomal
recessive disorders.
Many participants experienced numerous hospitalizations, especially those with confirmed RD diagnoses,
suggesting that these hospitalizations may be related to
therapeutic requirements. This observation underscores
the complex, multidisciplinary, specialized care that
individuals with RDs need and emphasizes the
importance of accessible healthcare tailored to those needs.
Previous studies have reported the elevated economic
burden of hospitalizations for RDs [6] and higher hospitalization rates among patients with metabolic and
genitourinary system-related RDs [25]. Additionally, RDs
have been previously associated with unfavorable inpatient outcomes, including in-hospital deaths, extended
stays, intensive care unit admissions, and 30-day readmissions when compared to an inpatient population
without RDs [26].
Some form of instituted therapy was identified more
frequently among individuals with confirmed RD diagnoses. Participants with confirmed RD diagnoses may
have received more frequent therapy due to selection
bias, reflecting possibly more severe symptoms and referrals to specialized centers. Disease severity may have
also driven immediate therapy initiation for improved
management and outcomes. Ninety-two participants
received care from multiple centers, illustrating co-management challenges in complex, multisystem RDs [10,
25]. Our study also emphasized the importance of multidisciplinary care for individuals with RDs. However, it
is essential to acknowledge that medical genetics data
were not separately collected as a distinct medical specialty. Instead, this specialty was encompassed within the
primary care for most cases, where data collection and
treatment were conducted.
Table 3 Comparative analysis based on diagnostic status
Variable                                                       Confirmed diagnosis   Suspected diagnosis   Undiagnosed           P value
Median (IQR)                                                   (N = 7931)            (N = 2450)            (N = 2177)
Age (years) (N = 12,159)                                       18 (9–37)             13 (6–26)             11 (6–18)             < 0.0001*
Age of symptom onset (years) (N = 9328)                        1 (0–14)              0.8 (0–8)             0.2 (0–2)             < 0.0001*
Age at first evaluation at the center (years) (N = 11,546)     7.3 (0.7–26.8)        6.5 (1.3–17.4)        3.8 (0.9–10.8)        < 0.0001*
Age at first evaluation in the specialty (years) (N = 11,277)  8.1 (1.1–27.3)        7.6 (1.9–18.5)        5.6 (1.7–12.6)        < 0.0001*
Length of follow-up at the center (years) (N = 11,592)         3.7 (1–9.4)           1.3 (0.2–4.6)         1.8 (0.3–5.5)         < 0.0001*
Length of follow-up in the specialty (years) (N = 11,317)      2.7 (0.6–7.2)         0.6 (0–2.6)           0.5 (0–2.6)           < 0.0001*
Age at confirmatory diagnosis (years) (N = 4944)               10.4 (2.1–33.1)       NA                    NA                    –
Number of previous hospitalizations (N = 4294)                 2 (1–4)               1 (1–3)               1 (1–2)               < 0.0001*
Maternal age at birth (years) (N = 4837)                       27 (22–33)            27 (22–32)            27 (22–33)            0.332
Paternal age at birth (years) (N = 3996)                       31 (25–37)            30 (25.7–37)          31 (25–37)            0.995

Variable                   Confirmed diagnosis   Suspected diagnosis   Undiagnosed           P value
N (%)                      (N = 7931)            (N = 2450)            (N = 2177)
Color or race
  White                    3330 (66.6)           763 (15.3)            907 (18.1)            < 0.0001*
  Admixed                  3054 (61.2)           1072 (21.5)           863 (17.3)
  Black                    425 (69.2)            103 (16.8)            86 (14.0)
  Yellow                   45 (64.3)             11 (15.7)             14 (20.0)
  Indigenous               21 (70.0)             5 (16.7)              4 (13.3)
Sex
  Female                   4254 (67.4)           1085 (17.2)           971 (15.4)            < 0.0001*
  Male                     3687 (60.7)           1200 (19.8)           1184 (19.5)
  Undetermined             7 (53.8)              5 (38.5)              1 (7.7)
Region of birth
  Southeast                2516 (66.3)           582 (15.3)            700 (18.4)            < 0.0001*
  Northeast                2200 (59.6)           739 (20.0)            753 (20.4)
  South                    1258 (72.1)           213 (12.2)            275 (15.7)
  Midwest                  828 (61.2)            325 (24.0)            201 (14.8)
  North                    321 (47.8)            254 (37.9)            96 (14.3)
  Born in other countries  7 (63.6)              2 (18.2)              2 (18.2)
Region of residence
  Southeast                2688 (66.5)           610 (15.1)            744 (18.4)            < 0.0001*
  Northeast                2339 (60.0)           800 (20.5)            758 (19.5)
  South                    1658 (76.2)           234 (10.7)            286 (13.1)
  Midwest                  906 (61.5)            355 (24.1)            213 (14.4)
  North                    299 (46.8)            246 (38.6)            93 (14.6)
Family recurrence
  No                       4953 (62.9)           1418 (18.0)           1503 (19.1)           0.030
  Yes                      1713 (63.5)           531 (19.7)            452 (16.8)
Consanguinity
  No                       5487 (61.5)           1697 (19.1)           1734 (19.4)           < 0.0001*
  Yes                      440 (55.1)            158 (19.8)            200 (25.1)
Previous hospitalization
  No                       3789 (62.8)           1143 (19.0)           1099 (18.2)           < 0.0001*
  Yes                      3583 (70.4)           809 (15.9)            697 (13.7)
Death
  No                       7688 (65.0)           2113 (17.9)           2021 (17.1)           0.094
  Yes                      127 (71.8)            30 (16.9)             20 (11.3)
Treatment related to rare disease
  Yes                      5317 (83.9)           620 (9.8)             397 (6.3)             < 0.0001*
  No                       134 (40.9)            73 (22.3)             121 (36.8)
Each row corresponds to the total number of valid data, i.e., without considering missing values. In this analysis, each diagnosis was evaluated independently,
considering that a participant may have more than one RD diagnosis
P-values marked with * represent statistical significance (P < 0.05)
The SUS plays a vital role in RD diagnosis and treatment. It serves as the primary funder for therapies and
diagnostic methods related to RDs. The SUS enables
the availability of genetic testing [11], specialized consultations, and treatment options that incorporate the
National Committee for Health Technology Incorporation recommendations and enable the subsequent
development of clinical guidelines [10, 27]. Working as a network becomes essential to optimize the
use of resources and enhance collaboration between
institutions.
Five of the 34 participating centers exclusively care
for pediatric patients, while the remaining centers offer
care to both pediatric and adult patients. This distribution reflects the prevalence of RDs affecting individuals
across the age spectrum. Interestingly, our data revealed
a median age at symptom onset of 0.8 years, indicating
that symptoms typically manifest early in life. Additionally, our findings show that over 80% of individuals experienced symptoms before the age of 18 years, surpassing
the figure of 70% reported in a previous study [1]. This
difference could be attributed to the participation of dedicated pediatric care centers in our study. Our findings
suggest that RD symptoms often present at a younger
age, highlighting the need for early diagnosis and intervention, especially in pediatric patients, but continue to
pose challenges into adulthood.
The diagnostic odyssey, defined as the time from symptom recognition to a definitive diagnosis [28], averaged
5.4 years, consistent with the figure of 4.8–7.6 years
reported in other studies worldwide [29, 30]. Notably,
a previous study in Brazil reported that the diagnostic
odyssey for mucopolysaccharidosis lasted 4.8 years [31].
Prolonged diagnostic odysseys for RDs often involve disease progression, incorrect diagnoses, invasive procedures, delayed treatment initiation, financial burden, and
inappropriate interventions [32].
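Under the definition above, the interval for each participant is the time between symptom onset and confirmed diagnosis, summarized as a mean and standard deviation. The Python sketch below illustrates that calculation on hypothetical records; the field names and dates are invented for the example, and skipping records with a missing date or with a diagnosis that precedes onset is an assumption about how valid intervals might be selected.

```python
from datetime import date
from statistics import mean, stdev

# Hypothetical records; not study data.
records = [
    {"onset": date(2000, 1, 1), "diagnosis": date(2004, 1, 1)},
    {"onset": date(2010, 6, 15), "diagnosis": date(2011, 6, 15)},
    {"onset": date(1995, 1, 1), "diagnosis": date(2007, 1, 1)},
    {"onset": None, "diagnosis": date(2015, 3, 1)},  # missing onset, skipped
]


def odyssey_years(record):
    """Years from symptom onset to confirmed diagnosis, or None if not computable."""
    onset, diagnosis = record["onset"], record["diagnosis"]
    if onset is None or diagnosis is None or diagnosis < onset:
        return None
    return (diagnosis - onset).days / 365.25


intervals = [years for years in map(odyssey_years, records) if years is not None]
mean_years = mean(intervals)
sd_years = stdev(intervals)
print(f"n = {len(intervals)}, mean = {mean_years:.1f} years, SD = {sd_years:.1f} years")
```

A standard deviation larger than the mean, as reported in this study, is what one would expect when most intervals are short and a minority are very long.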
Despite thousands of described RDs, many remain
undiagnosed, subjecting individuals to prolonged, costly
diagnostic odysseys across multiple healthcare centers
[32]. Even after such efforts, around 6% and
7% of patients with RDs in the United States and Australia, respectively, remained undiagnosed despite evaluation in expert
clinical settings [32, 33]. Factors that may explain the
higher rates of undiagnosed cases (exceeding 17%) in our
study include poor access to molecular diagnostic techniques. A recent study by RARAS reported that molecular diagnostic tests were available in just over half of the
participating centers [11]. Most cases with an etiological diagnosis were confirmed through biochemical and
molecular methods. Interestingly, while not the primary
confirmatory method, cytogenetic testing was the most
accessible diagnostic method in the participating centers,
according to the same study.
In the comparative analysis, individuals with a confirmed RD diagnosis were older, had a longer follow-up duration in specialized centers, and had a higher number of
previous hospitalizations. The undiagnosed
group may include individuals who are still on the diagnostic journey, or odyssey, and have not yet obtained a confirmed diagnosis. Subsequent investigations within the
RARAS initiative will aim to prospectively assess such
cases, establishing a national registry of RDs.
The average age at death was 30.3 years, representing a
47-year reduction compared to the Brazilian population’s
2021 life expectancy [34]. In our study, 25% of deaths
occurred within the first 1.6 years of life, indicating that
RDs significantly impact life expectancy. Previous data
suggested that 22% of infant deaths were due to confirmed genetic disorders [35]. Causes of death related to
RDs vary and are often documented as complications
rather than the underlying disease. Cardiac and respiratory arrests were frequently recorded causes that did not
fully represent the primary cause. The accurate documentation of complications and comorbidities is crucial
in RDs, offering insights into disease progression and
leading to the development of targeted interventions to
improve patient care and reduce RD-related mortality
[36]. It is important to recognize that undiagnosed cases
might also contribute to mortality figures since some
individuals may miss the opportunity to receive care in
specialized healthcare facilities, leading to an unrealized
suspicion of an RD.
While our study provides valuable information, it has
limitations, including sample size and potential bias. The
estimated population prevalence for RDs ranged from 3.5
to 8.0% [1, 5, 6], suggesting a significantly larger affected
population. Considering the Brazilian population, the
country’s total number of individuals with RDs would
be 550–1200 times larger than the population studied in
this project phase [7]. It is essential to note that this study
did not include all national healthcare centers, potentially missing patients not evaluated during the study or
not receiving care at participating centers. Moreover, the
predominance of genetic RDs may have resulted from the
specialized expertise and diagnostic resources in genetic
centers, leading to selection bias.
This study faced operational limitations related to data
sources, including finding, accessing, sharing, and reusing information. A “data quality culture” was promoted
to address these issues, emphasizing the need for reliable and comprehensive data. Collectors had diverse
backgrounds and digital literacy levels, which could have
introduced errors and affected data reliability. Tools,
training, support materials, and dedicated channels
were provided to mitigate their effects. The complex
RD domain made case identification and classification
challenging, potentially leading to underreporting and
underdiagnosis. Awareness efforts, feedback sessions,
outlier identification, case discussions, and standardized
data collection protocols were implemented to address
this issue [2, 37].
This study revealed appreciable missing data in medical
records, which limits retrospective research and can introduce record-keeping, memory,
and registration biases. However, data collection directly from
participants in the prospective project phase aims to fill
these gaps. A potential contribution of our study is the
enhancement of registration methods. By identifying and
addressing limitations in data collection and diagnostic
terminology classification, we lay the groundwork for
more accurate and comprehensive RD registration. This
enhancement improves our understanding of RD epidemiology and supports the development of effective public
health policies and resource allocation strategies. Standardized data collection protocols and advanced information systems will ensure that future studies and registries
capture vital data points, facilitating ongoing RD monitoring and research [2].
Diagnosis data in our study came from three different
ontologies, each with limitations regarding disease terminology. Although this study’s protocol allowed centers to
select the RD terminology, one of the options, ICD-10, has known limitations in RD classification [38, 39]. Accurate RD classification is crucial for efficient healthcare resource allocation
and improved analysis for differential diagnosis and clinical decision support. While data were aggregated from
the Orphadata database designed for RDs, this database
does not encompass all described RDs. In Brazil, ICD-10
remains the classification used by the SUS for diagnosis,
hospitalization, and death registration [10, 39]. In the
context of HPO terminology, it is noteworthy that the
number of HPO terms may have been underestimated
due to the limitation of five terms per case.
Future research within the RARAS will encompass the
diagnostic and treatment journey of participants with
multiple confirmed diagnoses, explore specific therapies and the duration of hospitalizations, investigate the
correlation between diagnostic ontologies, and examine population genetics. Other research avenues include
exploring the relationship between parental age and RDs
and examining the correlations of diagnoses with available diagnostic methods at each center.
We also identified challenges in finding a minimal data
set (MDS) that applied to Brazilian patients with RDs. To
address this issue, we conducted a systematic review to
create a comprehensive MDS for future project phases
[40, 41]. Standardizing data collection through an MDS
is critical for accurately identifying RDs and optimizing diagnostic and treatment processes, particularly in
resource-limited settings. Validating it as a national tool
for epidemiological tracking and analysis is essential for
structuring health information systems and guiding more
effective public health policies. Further research phases
are required to refine prevalence estimates and comprehensively understand specific RDs and their impact on
the Brazilian population by including a broader range of
healthcare facilities. This retrospective analysis did not
address factors such as participants’ socioeconomic status, referral sources, or willingness to participate in other
studies. However, these variables became part of the data
collection protocol and will be examined in forthcoming
studies.
The perspectives presented here shed light on the
future research directions derived from our study, fostering further advancements in the field. These data can
support future studies and ultimately lead to improvements in RD diagnosis, treatment, and management.
Understanding the magnitude of RDs is crucial for effective resource allocation, policy development, and the
provision of appropriate healthcare services for affected
individuals [3, 5].
This multicenter study presents the initial nationwide
data on the care provided to individuals with RDs in Brazil, highlighting the importance of collaboration between
specialized services. Reliable epidemiological data will
support public health approaches, including population
impact assessment, cost evaluation, and improved RD
management, and facilitate clinical trial development [5].
This study also emphasizes the vital role of the collected
information in shaping public policies while identifying
limitations such as data gaps and constrained terminologies for disease classification. Until this study was performed, our understanding of RDs in Brazil, except for
specific disorders, was limited by a lack of comprehensive evidence. Establishing a national network, including
data collection infrastructure, marked a significant step
towards advancing the understanding of RDs in Brazil
and addressing this gap.
The longitudinal and prospective continuation of this
study is necessary and currently underway, with the
expectation that it will impact health policy for RDs
regarding resource allocation and improving the quality of life of affected individuals. The results of our study
also provide valuable guidance for the refinement of data
collection forms and instruments, thereby enhancing the
effectiveness and accuracy of information related to RDs
in Brazil.
Supplementary Information
The online version contains supplementary material available at https://doi.org/10.1186/s13023-024-03392-7.
Additional file 1. Map of participating centers
Additional file 2. The ten most frequent RD diagnoses in RARAS and the
applied coding
Additional file 3. Top three diagnostic codes and their corresponding
counts and percentages at each participating center
Additional file 4. Post-test analysis of demographic factors and medical
outcomes across distinct diagnostic statuses
MBN, DBY; JFB, DA; Supervision: IVDS, DA, TMF; Validation: BMO, MBN, MA, TMF;
Visualization: IVDS, DA, TMF; Writing-original draft: BMO, FAB, JFB, TMF; Writing-review and editing: All authors. Raras Network Group: Adlya de Sousa
Melo; Adrya Rafaela da Silva Rocha; Amanda Aragão; Amanda Delfino Braccini;
Amanda Maria Schmidt; Ana Mondadori dos Santos; Ana Carolina de Souza e
Silva; Ana Catarina Góes Leite Lima; Anna Luiza Scasso; Anne Caroline Magalhães Oliveira; Arthur Perico; Bárbara da Silva Aniceto; Barbara Pinheiro; Beatriz
Ono Badaró; Beatriz Brasil Braga; Beatriz de Oliveira Chapiesk; Beatriz Felix Pinheiro; Beatriz Pereira; Betânia de Souza Ponce; Bianca Martins; Blenda Antunes
Cacique Curçino de Eça; Bruna de Souza; Brunno Busnardo Paschoalino; Bruno
Valadares; Caio Lôbo de Oliveira; Camila Sales; Carine Pacheco Alexandre; Carla
Desengrini Girelli; Carolina Balluz; Carolina de Paiva Farias; Carolina Oliveira
Vilemar; Caroline Duarte Arrigoni; Catharina de Almeida Passos; Catharine
Harumi; Cleber Barbieri; Daniel Prado; Daniela Monteiro; Dhallya Andressa da
Silva Cruz; Eduardo Batista; Eduardo José Pereira Naves; Elaine Samara Pinheiro
Mendes da Silva; Estela Teixeira; Fabio Amaral Jr; Fernanda Caroline Moreira;
Flavia Liberato de Souza; Flavia Boggian; Francisco André Gomes Bastos Filho;
Gabriel Lima Lôla; Gabriel Pereira; Gabrielle Diehl; Giovanna Pessanha Cordeiro;
Giulia Duran; Gustavo Foz Fonseca; Helena Mello; Henrique Serpa; Henrique
Veiga; Ingrid Gabriel; Isabella Formenti; Isabella de Brito Ramos; Isabella
Ramos Paiva; Janaina Ferreira; Jannine Barboza Rangel; Jôbert Pôrto Florêncio;
Josevaldo Monteiro Maia Filho; Júlia Emily Silva Dantas; Julia Cordeiro Milke;
Juliana Rios; Julya Pavao; Kahue Aluaxe Angelo; Karina Montemor Klegen de
Oliveira; Katheryne Barbosa de Carvalho; Kauanne Zulszeski; Leticia Raabe
Mota de Lima; Livia Polisseni Cotta Nascimento; Lorena Alves dos Santos
Pereira; Lorenzo Makariewicz; Luan Junio Pereira Bittencourt; Luana Medeiros;
Luana Souza Vasconcelos; Lucca Nogueira Paes Jannuzzi; Luciana Costa Pinto
da Silva; Luisa Aguilar; Luiza Valeria Chibicheski; Luiza de Oliveira Simões; Maria
Teresa Aires Cabral Dias; Mariana Lopes dos Santos; Mariana Pacheco Oliveira
Neves; Marina Teixeira Henriques; Matheus Viganô Leal; Milena Atique Tacla;
Milena Soares Souza; Moises Ribeiro da Paz; Morya Silva; Natan Soares; Nicole
da Silva Gilbert; Otavio Mauricio Silva; Paula Dourado Sousa; Paulo Rocha;
Raissa Emanuelle Jacob; Raissa Vieira Leite da Silva; Raniery Barros Carvalho;
Raphaella Nagib Carvalho Santos; Raquel Silva; Rebeca Pedrosa Holanda;
Rebeca Falcão Lopes Mourão; Ricardo Cunha de Oliveira; Rodrigo Mesquita
Costa Braga; Sabrina Macely; Sergio Morais; Sheila Constância Adolfo Mabote
Mucumbi; Simei Nhime; Stefanny Karla Ferreira de Sousa; Tauane Franca Rego;
Thayane Holanda Gurjão; Thuanne Cidreira dos Santos Gomes; Tiago Ramos
Gazineu; Victória Scheibe Machado; Victória Feitosa Muniz; Victória Rocha; Vitor
Leão; Wendyson Oliveira; Willian Miguel; Yasmin de Araújo Ribeiro; Yasmin
Amorim dos Santos.
Funding
This study was funded by the National Council for Scientific and Technological
Development (CNPq) and the Department of Science and Technology of the
Ministry of Health of Brazil (Decit/SCTIE/MS) (Grant No. 443030/2019/7).
Availability of data and materials
Data analyzed in this study are available interactively through the Brazilian
Online Atlas of RD (RARASBR; https://doi.org/10.25504/FAIRsharing.d7b6c8)
[42] and LattesData [13]. For any further inquiries, please contact the corresponding author.
Declarations
Ethics approval and consent to participate
This study protocol was reviewed and approved by the Research Ethics
Committee (REC) of Hospital de Clínicas de Porto Alegre (approval number:
33970820.0.1001.5327), the coordinator center for the study, and in all participating centers. Written informed consent was dispensed by the respective
RECs for this project phase.
Acknowledgements
We thank all the patients and families who participated in this study.
Consent for publication
All authors have given final permission to submit for publication.
Author contributions
Conceptualization: TMF, IVDS, AXA, DA, VEFF, JAMS, NBS, BMO, FAB; Data
curation: BMO, JFB; Formal analysis: BMO, FAB, JFB; Funding acquisition: TMF;
Investigation: All authors; Methodology: TMF, BMO, FAB, JFB, MA; Project
administration: TMF, DA; Resources: All authors; Software: FAB; IS; VCL; MECF;
Competing interests
The authors declare that they have no competing interests.
de Oliveira et al. Orphanet Journal of Rare Diseases
(2024) 19:405
Author details
1 Medical Genetics Service, Hospital de Clínicas de Porto Alegre, Porto Alegre,
Brazil. 2 Postgraduation Program in Genetics and Molecular Biology, Federal
University of Rio Grande Do Sul, Porto Alegre, RS, Brazil. 3 Engineering School
of São Carlos, Bioengineering Department, University of São Paulo, São Carlos,
SP, Brazil. 4 Ribeirão Preto Medical School, University of São Paulo, Ribeirão
Prêto, SP, Brazil. 5 Institute of Mathematics and Computer Sciences, São Carlos
Campus, University of São Paulo, São Carlos, SP, Brazil. 6 Medical Genetics
Service, Hospital de Clínicas de Porto Alegre, Porto Alegre, RS, Brazil. 7 Hospital
Infantil João Paulo II, Belo Horizonte, MG, Brazil. 8 Hospital São Paulo, São Paulo,
SP, Brazil. 9 Faculdade de Medicina do Centro Universitario FMABC, Santo
André, SP, Brazil. 10 Hospital Universitário Prof. Edgar Santos and Faculdade
de Medicina da Bahia da Universidade Federal da Bahia, Salvador, BA, Brazil.
11 Hospital Universitário Bettina Ferro de Souza, Universidade Federal Do Pará,
Belém, PA, Brazil. 12 Fundação Hospital Estadual do Acre, Rio Branco, AC, Brazil.
13 Hospital Universitário Prof. Edgar Santos, Salvador, BA, Brazil. 14 Universidade
Estadual de Campinas, Campinas, SP, Brazil. 15 Instituto da Criança, Faculdade
de Medicina da Universidade de São Paulo, São Paulo, SP, Brazil. 16 Ribeirao
Preto Medical School, University of Sao Paulo, Ribeirão Prêto, SP, Brazil. 17 Hospital Geral Dr. César Cals, Fortaleza, CE, Brazil. 18 Hospital Infantil Albert Sabin,
Fortaleza, CE, Brazil. 19 Centro Universitário do Estado do Pará, Belém, PA, Brazil.
20 Hospital Santa Casa de Misericórdia de Vitória, Vitória, ES, Brazil. 21 Centro de
Diabetes e Endocrinologia da Bahia, Salvador, BA, Brazil. 22 Hospital de Apoio
de Brasília, Brasília, DF, Brazil. 23 Universidade Federal do Rio Grande do Sul,
Porto Alegre, RS, Brazil. 24 Instituto Nacional de Saúde da Mulher, da Criança e
do Adolescente Fernandes Figueira/Fiocruz, Rio de Janeiro, RJ, Brazil. 25 Maternidade Climério de Oliveira, Salvador, BA, Brazil. 26 Hospital Pequeno Príncipe,
Curitiba, PR, Brazil. 27 Escola Bahiana de Medicina e Saúde Pública, Salvador,
BA, Brazil. 28 Instituto Jô Clemente, São Paulo, SP, Brazil. 29 Hospital Universitário Júlio Müller, Cuiabá, MT, Brazil. 30 Hospital Universitário Walter Cantídio,
Universidade Estadual do Ceará, Fortaleza, CE, Brazil. 31 Hospital Materno
Infantil de Brasília, Brasília, DF, Brazil. 32 Centro Universitário do Pará, Belém, PA,
Brazil. 33 Instituto de Psiquiatria Hospital das Clínicas da Faculdade de Medicina
da Universidade de São Paulo, São Paulo, SP, Brazil. 34 Unidade Acadêmica de
Medicina, Centro de Ciências Biológicas e de Saúde, Hospital Universitário
Alcides Carneiro, Universidade Federal de Campina Grande, Campina Grande,
PB, Brazil. 35 Hospital da Criança Santo Antônio, Universidade Federal de Ciências da Saúde de Porto Alegre, Porto Alegre, RS, Brazil. 36 Hospital Universitário
Pedro Ernesto, Rio de Janeiro, RJ, Brazil. 37 Hospital Universitário Lauro Wanderley, João Pessoa, PB, Brazil. 38 Universidade Estadual de Londrina, Londrina,
PR, Brazil. 39 Hospital Universitário Clementino Fraga Filho, Rio de Janeiro, RJ,
Brazil. 40 Associação de Pais e Amigos dos Excepcionais de Salvador, Salvador,
BA, Brazil. 41 Associação de Pais e Amigos dos Excepcionais de Anápolis,
Anápolis, GO, Brazil. 42 Policlínica Codajás, Manaus, AM, Brazil. 43 Hospital das
Clínicas da Faculdade de Medicina de Ribeirão Preto da Universidade de São
Paulo, Ribeirão Prêto, SP, Brazil. 44 Health Intelligence Laboratory, Ribeirão Preto
Medical School, University of São Paulo, Ribeirão Prêto, SP, Brazil. 45 Department
of Social Medicine, Ribeirão Preto Medical School, University of São Paulo,
Ribeirão Prêto, SP, Brazil. 46 Medical Genetics Service, Hospital de Clínicas de
Porto Alegre, Rua Ramiro Barcelos, 2350, Porto Alegre, RS 90035‑903, Brazil.
Received: 25 November 2023 Accepted: 3 October 2024
References
1. Nguengang Wakap S, Lambert DM, Olry A, Rodwell C, Gueydan C, Lanneau V, Murphy D, Le Cam Y, Rath A. Estimating cumulative point prevalence of rare diseases: analysis of the Orphanet database. Eur J Hum Genet. 2020;28(2):165–73. https://doi.org/10.1038/s41431-019-0508-0.
2. Alves D, Yamada DB, Bernardi FA, Carvalho I, Filho MEC, Neiva MB, Lima VC, Félix TM. Mapping, infrastructure, and data analysis for the Brazilian network of rare diseases: protocol for the RARASnet observational cohort study. JMIR Res Protoc. 2021;10(1):e24826. https://doi.org/10.2196/24826.
3. Félix TM, de Oliveira BM, Artifon M, et al. Epidemiology of rare diseases in Brazil: protocol of the Brazilian Rare Diseases Network (RARASBRDN). Orphanet J Rare Dis. 2022;17:84. https://doi.org/10.1186/s13023-022-02254-4.
4. Giugliani R, Vairo FP, Riegel M, de Souza CF, Schwartz IV, Pena SD. Rare disease landscape in Brazil: report of a successful experience in inborn errors of metabolism. Orphanet J Rare Dis. 2016;11(1):76. https://doi.org/10.1186/s13023-016-0458-3.
5. Bruckner-Tuderman L. Epidemiology of rare diseases is important. J Eur Acad Dermatol Venereol. 2021;35(4):783–4. https://doi.org/10.1111/jdv.17165.
6. Ferreira CR. The burden of rare diseases. Am J Med Genet A. 2019;18;179(6):885–92. https://doi.org/10.1002/ajmg.a.61124.
7. IBGE - Instituto Brasileiro de Geografia e Estatística. Censo Demográfico 2022: População e domicílios. Brasília: IBGE; 2023.
8. Melo DG, Sequeiros J. The challenges of incorporating genetic testing in the unified national health system in Brazil. Genet Test Mol Biomark. 2012;16(7):651–5. https://doi.org/10.1089/gtmb.2011.0286.
9. Horovitz DDG, de Faria Ferraz VE, Dain S, Marques-de-Faria AP. Genetic services and testing in Brazil. J Community Genet. 2013;4(3):355–75. https://doi.org/10.1007/s12687-012-0096-y.
10. Brasil. Ministério da Saúde. Portaria No 199, de 30 de janeiro de 2014. 2014. Available from: https://bvsms.saude.gov.br/bvs/saudelegis/gm/2014/prt0199_30_01_2014.html [Internet]. Accessed 24 May 2023.
11. de Oliveira BM, Neiva MB, Carvalho I, Schwartz IVD, Alves D, Felix TM, Raras Network Group. Availability of genetic tests in public health services in Brazil: data from the Brazilian Rare Diseases Network. Public Health Genomics. 2023;26(1):1. https://doi.org/10.1159/000531547.
12. Harris PA, Taylor R, Minor BL, Elliott V, Fernandez M, O’Neal L, et al. The REDCap consortium: building an international community of software platform partners. J Biomed Inform. 2019;95:103208. https://doi.org/10.1016/j.jbi.2019.103208.
13. LattesData. 2023. https://doi.org/10.57810/lattesdata/XEL53O.
14. Köhler S, Gargano M, Matentzoglu N, Carmody LC, Lewis-Smith D, Vasilevsky NA, et al. The human phenotype ontology in 2021. Nucleic Acids Res. 2021;8;49(D1):D1207–17. https://doi.org/10.1093/nar/gkaa1043.
15. World Health Organization. ICD-10: international statistical classification of diseases and related health problems: tenth revision / Vol. I [Tabular list]. Geneva: WHO; 2004.
16. Orphanet: an online database of rare diseases and orphan drugs. Copyright, INSERM 1997. http://www.orpha.net. Accessed 26 Dec 2023.
17. Amberger JS, Bocchini CA, Scott AF, Hamosh A. OMIM.org: leveraging knowledge across phenotype-gene relationships. Nucleic Acids Res. 2019;8;47(D1):D1038–43. https://doi.org/10.1093/nar/gky1151.
18. Bonilla C, Albuquerque Sortica V, Schuler-Faccini L, Matijasevich A, Scheffer MC. Medical geneticists, genetic diseases and services in Brazil in the age of personalized medicine. Pers Med. 2022;19(6):549–63. https://doi.org/10.2217/pme-2021-0153.
19. Arduini GAO, Rodrigues LP, Trovó de Marqui AB. Mortality by sickle cell disease in Brazil. Revista Brasileira de Hematologia e Hemoterapia. 2017;39(1):52–6. https://doi.org/10.1016/j.bjhh.2016.09.008.
20. Brasil. Ministério da Saúde. Portaria no 822/GM/MS, de 06 de junho de 2001. Available from: https://bvsms.saude.gov.br/bvs/saudelegis/gm/2001/prt0822_06_06_2001.html [Internet]. Accessed 20 July 2023.
21. Ferrer A, Schultz-Rogers L, Kaiwar C, Kemppainen JL, Klee EW, Gavrilova RH. Three rare disease diagnoses in one patient through exome sequencing. Cold Spring Harb Mol Case Stud. 2019;5(6):a004390. https://doi.org/10.1101/mcs.a004390.
22. Leutenegger AL, Sahbatou M, Gazal S, Cann H, Génin E. Consanguinity around the world: What do the genomic data of the HGDP-CEPH diversity panel tell us? Eur J Hum Genet. 2011;19(5):583–7. https://doi.org/10.1038/ejhg.2010.205.
23. Liascovich R, Rittler M, Castilla E. Consanguinity in South America: demographic aspects. Hum Hered. 2000;51(1–2):27–34. https://doi.org/10.1159/000022956.
24. Santos S, Kok F, Weller M, de Paiva FR, Otto PA. Inbreeding levels in Northeast Brazil: strategies for the prospecting of new genetic disorders. Genet Mol Biol. 2010;33(2):220–3. https://doi.org/10.1590/S1415-47572010005000020.
25. Baldacci S, Santoro M, Pierini A, Mezzasalma L, Gorini F, Coi A. Healthcare burden of rare diseases: a population-based study in Tuscany (Italy). Int J Environ Res Public Health. 2022;19(13):7553. https://doi.org/10.3390/ijerph19137553.
26. Blazsik RM, Beeler PE, Tarcak K, Cheetham M, von Wyl V, Dressel H. Impact of single and combined rare diseases on adult inpatient outcomes: a retrospective, cross-sectional study of a large inpatient population. Orphanet J Rare Dis. 2021;16(1):105. https://doi.org/10.1186/s13023-021-01737-0.
27. Cunico C, Vicente G, Leite SN. Initiatives to promote access to medicines after publication of the Brazilian Policy on the Comprehensive Care of People with Rare Diseases. Orphanet J Rare Dis. 2023;18(1):259. https://doi.org/10.1186/s13023-023-02881-5.
28. Wainstock D, Katz A. Advancing rare disease policy in Latin America: a call to action. Lancet Reg Health Am. 2023;27(18):100434. https://doi.org/10.1016/j.lana.2023.100434.
29. Engel P, Bagal S, Broback M, Boice N. Physician and patient perceptions regarding physician training in rare diseases: the need for stronger educational initiatives for physicians. J Rare Disord. 2013;1(2):1–15.
30. The Lancet Diabetes Endocrinology. Spotlight on rare diseases. Editorial. Lancet Diabetes Endocrinol. 2019;7(2):75. https://doi.org/10.1016/S2213-8587(19)30006-3.
31. Vieira T, Schwartz I, Muñoz V, Pinto L, Steiner C, Ribeiro M, Boy R, Ferraz V, de Paula A, Kim C, Acosta A, Giugliani R. Mucopolysaccharidoses in Brazil: What happens from birth to biochemical diagnosis? Am J Med Genet A. 2008;146A(13):1741–7. https://doi.org/10.1002/ajmg.a.32320.
32. Zurynski Y, Deverell M, Dalkeith T, Johnson S, Christodoulou J, Leonard H, Elliott EJ, APSU Rare Diseases Impacts on Families Study Group. Australian children living with rare diseases: experiences of diagnosis and perceived consequences of diagnostic delays. Orphanet J Rare Dis. 2017;12(1):68. https://doi.org/10.1186/s13023-017-0622-4.
33. Gahl WA, Tifft CJ. The NIH Undiagnosed Diseases Program: lessons learned. JAMA. 2011;305(18):1904–5. https://doi.org/10.1001/jama.2011.613.
34. Brasil. Diário Oficial da União. Portaria Pr-3.746, De 24 De Novembro De 2022. Available from: https://in.gov.br/en/web/dou/-/portaria-pr-3.746-de-24-de-novembro-de-2022-446105011 [Internet]. Accessed 20 July 2023.
35. Wojcik MH, Schwartz TS, Thiele KE, Paterson H, Stadelmaier R, Mullen TE, VanNoy GE, Genetti CA, Madden JA, Gubbels CS, Yu TW, Tan WH, Agrawal PB. Infant mortality: the contribution of genetic disorders. J Perinatol. 2019;39(12):1611–9. https://doi.org/10.1038/s41372-019-0451-5.
36. Gunne E, McGarvey C, Hamilton K, Treacy E, Lambert DM, Lynch SA. A retrospective review of the contribution of rare diseases to paediatric mortality in Ireland. Orphanet J Rare Dis. 2020;15(1):311. https://doi.org/10.1186/s13023-020-01574-7.
37. Yamada DB, Bernardi FA, Colombo ME, Neiva MB, Lima VC, Vinci ALT, de Oliveira BM, Felix TM, Alves D. National Network for Rare Diseases in Brazil: the computational infrastructure and preliminary results. In: Groen D, de Mulatier C, Paszynski M, Krzhizhanovskaya VV, Dongarra JJ, Sloot PMA, editors. Computational science: ICCS 2022. Lecture notes in computer science, vol. 1; 2022, p. 43–49. https://doi.org/10.1007/978-3-031-08757-8_4.
38. Aymé S, Bellet B, Rath A. Rare diseases in ICD11: making rare diseases visible in health information systems through appropriate coding. Orphanet J Rare Dis. 2015;10:35. https://doi.org/10.1186/s13023-015-0251-8.
39. Neiva MB, de Oliveira BM, Schmidt AM, Scheibe VM, Milke JC, dos Santos ML, Yamada DB, Colombo Filho ME, Soares GT, de Araújo Ribeiro Y, Bruno OM, Félix TM, Alves D, RARAS Network group. ICD-10 - ORPHA: an interactive complex network model for Brazilian rare diseases. Procedia Comput Sci. 2024;239:634–42. https://doi.org/10.1016/j.procs.2024.06.218.
40. Bernardi FA, Yamada DB, de Oliveira BM, Lima VC, et al. The minimum dataset for rare diseases in Brazil: a systematic review protocol. Procedia Comput Sci. 2022;196:439–44. https://doi.org/10.1016/j.procs.2021.12.034.
41. Bernardi FA, de Oliveira BM, Yamada DB, Artifon M, Schmidt AM, Scheibe VM, et al. The minimum data set for rare diseases: systematic review. J Med Internet Res. 2023;27;25:e44641–1. https://doi.org/10.2196/44641.
42. Brazilian Rare Disease Portal (RARASBR). 2020. https://doi.org/10.25504/FAIRsharing.d7b6c8.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
www.nature.com/scientificreports
OPEN
Population‑specific facial traits
and diagnosis accuracy of genetic
and rare diseases in an admixed
Colombian population
Luis M. Echeverry‑Quiceno 1,6, Estephania Candelo 2,3,6, Eidith Gómez 2, Paula Solís 2,
Diana Ramírez 2, Diana Ortiz 2, Alejandro González 4, Xavier Sevillano 4, Juan Carlos Cuéllar 5,
Harry Pachajoa 2,3 & Neus Martínez‑Abadías 1*
Up to 40% of rare disorders (RD) present facial dysmorphologies, and visual assessment is commonly
used for clinical diagnosis. Quantitative approaches are more objective, but mostly rely on
European descent populations, disregarding diverse population ancestry. Here, we assessed the
facial phenotypes of Down (DS), Morquio (MS), Noonan (NS) and Neurofibromatosis type 1 (NF1)
syndromes in a Latino-American population, recording the coordinates of 18 landmarks in 2D images
from 79 controls and 51 patients. We quantified facial differences using Euclidean Distance Matrix
Analysis, and assessed the diagnostic accuracy of Face2Gene, an automatic deep-learning algorithm.
Individuals diagnosed with DS and MS presented severe phenotypes, with 58.2% and 65.4% of
significantly different facial traits. The phenotype was milder in NS (47.7%) and non-significant in NF1
(11.4%). Each syndrome presented a characteristic dysmorphology pattern, supporting the diagnostic
potential of facial biomarkers. However, population-specific traits were detected in the Colombian
population. Diagnostic accuracy was 100% in DS, moderate in NS (66.7%) but lower in comparison
to a European population (100%), and below 10% in MS and NF1. Moreover, admixed individuals
showed lower facial gestalt similarities. Our results underscore that incorporating populations with
Amerindian, African and European ancestry is crucial to improve diagnostic methods of rare disorders.
According to the Online Mendelian Inheritance in Man (OMIM) databank, there are more than 10,000 genetic and rare diseases (RD) affecting 7% of the world’s population1,2. This corresponds to approximately 500 million people. Although, as a whole, genetic and RD are a significant cause of morbidity and mortality in the pediatric population3, each disorder taken separately affects very few people. Depending on the country, the prevalence threshold for considering a disease rare ranges from 1 affected individual in 50,000 people to 1 in 200,000. This low prevalence has limited research on rare disorders.
Currently, there is limited knowledge on the etiology of these disorders. Only a small percentage of diseases (20%) have a known molecular basis associated with a detailed phenotype description, and treatment is only available for 0.04% of RD3. As orphan diseases, many RD are chronic and incurable, representing severe and debilitating conditions4. The diagnosis and management of genetic RD is currently a clinical challenge5. Precise and early diagnosis is crucial for individuals and their families to get effective care and to reduce disease progression. However, due to the limited knowledge and complexity of these pathologies, diagnosis may take several years6. People often suffer during a long diagnostic odyssey, with delays in their correct treatment and management7. For most rare diseases, there are no reliable biomarkers for early diagnosis8.
Among the wide constellation of clinical symptoms associated with genetic and rare disorders, craniofacial dysmorphologies emerge as potential biomarkers9,10. These phenotypes are highly prevalent2,6 and are commonly used for diagnosis, management and treatment monitoring of genetic and RD6. Up to 40% of these disorders present characteristic craniofacial phenotypes, including Down, Morquio, Noonan, Apert, Rett, Fragile X, Williams-Beuren, Treacher-Collins and velocardiofacial syndromes, as well as other conditions such as microcephaly, holoprosencephaly, cleft lip/palate, and some 2,000 other rare genetic disorders10,11.
1Departament de Biologia Evolutiva, Ecologia i Ciències Ambientals (BEECA), Facultat de Biologia, Universitat de Barcelona (UB), Av. Diagonal, 643. Planta 2, 08028 Barcelona, Spain. 2Centro de Investigaciones en Anomalías Congénitas y Enfermedades Raras (CIACER), Universidad ICESI, Cali, Colombia. 3Servicio de Genética Clínica, Fundación Valle del Lili, Cali, Colombia. 4HER - Human-Environment Research Group, La Salle - Universitat Ramon Llull, Barcelona, Spain. 5Universidad ICESI, Cali, Colombia. 6These authors contributed equally: Luis M. Echeverry-Quiceno and Estephania Candelo. *email: neusmartinez@ub.edu
Scientific Reports | (2023) 13:6869 | https://doi.org/10.1038/s41598-023-33374-x
The genetic and environmental factors causing these disorders alter the complex process that orchestrates facial morphogenesis during pre- and postnatal development, inducing facial dysmorphologies. Facial development is highly regulated by multiple signaling pathways12–14, including Fibroblast Growth Factor (FGF), Hedgehog (HH), Wingless (WNT), Transforming Growth Factor Beta (TGF-β) and Bone Morphogenetic Proteins (BMPs). Disruptions in the regulation of any of these signaling pathways can lead to facial dysmorphogenesis15.
The facial patterns associated with each disorder are unique, but vary within and among diagnoses, ranging from subtle facial anomalies to severe malformations16. In clinical practice, craniofacial dysmorphology is commonly assessed through qualitative visual assessment and basic anthropometric measurements. However, this approach may not capture with optimal precision the anatomical complexity of the facial dysmorphologies associated with these disorders. Qualitative descriptions of facial phenotypes are sometimes based on general terms such as coarse face, large and bulging head; saddle-like, flat bridged nose with broad, fleshy tip; or malformed teeth17–19. Accurate identification of dysmorphic features for diagnosis thus depends on the clinician’s expertise, and only highly trained dysmorphologists are able to recognize the facial “gestalt” characteristic of the rarest disorders19.
Recent research seeks to incorporate objective and quantitative tools for assessing facial phenotypes into the clinical diagnosis of RD20–25. Automated systems have been developed to improve and accelerate the diagnostic process9,10,26. In clinical practice, the most commonly used system is Face2Gene (FDNA Inc., https://www.face2gene.com/), a community-driven phenotyping platform trained on more than 17,000 people representing more than 200 syndromes9. Face recognition is performed on 2D images that can be collected with any type of digital camera or phone, without previous training. Syndrome classification is achieved using DeepGestalt, a cascade Deep Convolutional Neural Network (DCNN)-based method that achieved an average 91% top-10 accuracy in identifying the correct syndrome9.
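As an aside on the metric quoted above, top-k accuracy counts a case as correct when the true syndrome appears anywhere among the k highest-ranked suggestions. The following is a minimal illustrative sketch, not DeepGestalt's or Face2Gene's code; the function name, the input layout (one ranked list of syndrome labels per patient) and the toy labels are assumptions made here for illustration.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    true_labels: Sequence[str],
    k: int = 10,
) -> float:
    """Fraction of cases whose true label is among the first k ranked suggestions.

    ranked_predictions[i] is a hypothetical ranked list (best first) for case i.
    """
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("one ranked list is required per true label")
    if not true_labels:
        raise ValueError("at least one case is required")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Toy data (invented): three cases, three ranked suggestions each.
ranked = [["DS", "NS", "MS"], ["NF1", "NS", "DS"], ["MS", "DS", "NS"]]
truth = ["DS", "NS", "NF1"]
top1 = top_k_accuracy(ranked, truth, k=1)
top2 = top_k_accuracy(ranked, truth, k=2)
```

With larger k the metric can only stay equal or increase, which is why a top-10 figure is more lenient than a top-1 figure for the same classifier.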
Other diagnostic approaches based on 3D photogrammetry have been developed more recently10,20,21. The advantage of 3D facial models is that they capture the complexity of facial phenotypes better than 2D images, but their widespread use is limited because the photographic equipment required for generating 3D models is not commonly available in clinical practice. Hallgrímsson et al. (2020)10 analyzed 3D facial models from 7,057 subjects, including individuals with 396 different syndromes, their relatives, and unrelated unaffected individuals (https://www.facebase.org/). Deep phenotyping based on quantitative 3D facial imaging and machine learning presented a balanced accuracy of 73% for syndrome diagnosis20.
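Balanced accuracy, the metric quoted above, is commonly defined as the mean of the per-class recalls, so that rare syndromes weigh as much as common ones. The sketch below illustrates that common definition only; it is not taken from the cited study, and the function name and toy labels are assumptions.

```python
from collections import defaultdict
from typing import Sequence


def balanced_accuracy(true_labels: Sequence[str], predicted_labels: Sequence[str]) -> float:
    """Mean per-class recall over the classes present in true_labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("at least one case is required")
    totals = defaultdict(int)   # cases per true class
    correct = defaultdict(int)  # correctly predicted cases per true class
    for truth, predicted in zip(true_labels, predicted_labels):
        totals[truth] += 1
        if truth == predicted:
            correct[truth] += 1
    recalls = [correct[label] / totals[label] for label in totals]
    return sum(recalls) / len(recalls)


# Toy data (invented): a classifier that always answers "DS".
true = ["DS", "DS", "DS", "DS", "NS"]
pred = ["DS", "DS", "DS", "DS", "DS"]
score = balanced_accuracy(true, pred)  # plain accuracy would be 4/5 here
```

In this toy case the recall is 1.0 for DS and 0.0 for NS, so the balanced accuracy is 0.5 even though four of five cases are labelled correctly.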
Automated methods have thus demonstrated high potential to facilitate the diagnosis of facial dysmorphic syndromes6,9,10,26. These tools achieve high diagnostic accuracy in European and North American populations, which are the populations in which the machine learning algorithms have been trained and validated. However, these tools have not been thoroughly tested in populations with different ancestries, and it is not well understood how facial phenotypes associated with genetic and RD might be influenced by the complex patterns of population ancestry characterizing human populations.
Population ancestry in facial dysmorphologies: a long-disregarded factor. Facial shape shows wide variation across world-wide human populations27. Facial differences between populations are detected in the shape of the forehead, brow ridges, eyes, nose, cheeks, mouth and jaw28. These facial phenotypes result from the divergent evolutionary and adaptive histories of human populations during the evolution of Homo sapiens over the last 200,000 years. Nowadays, continuous migration and admixture keep shaping the facial phenotypes of human populations. Depending on dominance and epistatic interactions between alleles fixed or predominant in each parental group30, admixed populations can display a variety of craniofacial morphologies, ranging from resemblance to one of the parental groups to a combination of both parental phenotypes and the evolution of novel phenotypes29. Therefore, the evolutionary and population dynamics of human populations result in genetic and phenotypic patterns that act as surrogates of population ancestry30–32, and can modulate the facial phenotypes associated with disease.
Few studies to date have analyzed the craniofacial phenotypes associated with genetic and RD in populations of non-European descent33–36, leaving African, Asian and Latin-American populations often disregarded and underrepresented. Unfortunately, there are no reliable representations of facial phenotypes in genetic and rare diseases in populations of non-European descent. However, it is crucial to account for the influence of population ancestry on facial variation to develop quantitative approaches that efficiently diagnose these disorders in populations from all over the world.
To cover this gap, here we assessed the facial dysmorphologies associated with prevalent genetic and RD in a Latin-American population from the Southwest of Colombia. Latin-Americans are fascinating cases of hybrid/admixed populations that evolved over relatively short periods of time30,37. Peopling of the Americas likely started 12,000–18,000 years ago38,39 by migration waves coming from North and South East Asia30, following coastal and continental routes41. Amerindian populations became established all over the continent and adapted to a variety of environments over thousands of years. During the last 600 years, admixture with European and African populations further shaped the genetic ancestry of Latin-American populations42,43. In particular, the population from the region of Cali is the result of diverse migratory processes44. Admixture with the indigenous Amerindian population began in the sixteenth century with the arrival of Spanish colonizers. In the eighteenth century, large colonial settlements of slaves brought from Africa were established in Cali for the exploitation of sugar cane, which significantly changed the population structure of Valle del Cauca. Nowadays, the population of Cali is characterized by indigenous and mestizo communities, with Amerindian and African ancestry components predominating over the European ancestry contribution44.
In this study, we compared the facial phenotypes associated with four genetic and RD: Down syndrome (DS); the metabolic disorder Mucopolysaccharidosis type IVA, known as Morquio syndrome (MS); and two types of RASopathies, Noonan syndrome (NS) and Neurofibromatosis type 1 (NF1). The facial phenotype of these syndromes has not been previously characterized in Latin-American populations, and differences between populations with different ancestry backgrounds have not been assessed34–36. Here, we quantitatively assessed the facial phenotypes associated with these syndromes, and compared our results in a Colombian admixed population with those reported in European descent populations. We also assessed the diagnostic accuracy of automatic methods currently used in clinical practice, and detected evidence suggesting that further research is needed to optimize these methods in admixed populations of non-European descent.
Materials and methods
Participant recruitment for photographic sessions. The Colombian sample comprised 130 individuals from Valle del Cauca, a Southwest region in Colombia (Table 1). The cohort included 79 age-matched controls and 51 individuals diagnosed with Down, Morquio, Noonan and Neurofibromatosis type 1 syndromes that were recruited from the clinical genetics consultation at Hospital-Fundación Valle del Lili in Cali (Colombia), a tertiary health reference center for these genetic and rare disorders. In most cases, clinical diagnoses were confirmed by molecular genetic testing.
Down syndrome (DS, OMIM 190685), caused by trisomy of chromosome 21, was selected because it is one of the most common genetic disorders, and previous studies have shown that the clinical manifestations associated with DS vary across ethnicities35. Within RD, we included Morquio syndrome type A (MS, OMIM 253000) because Colombia presents one of the highest prevalences of MS in the world, probably as a result of founder effects45. Morquio syndrome is a subtype of Mucopolysaccharidosis disorders caused by more than 180 autosomal recessive mutations in the GALNS gene46 that alter the metabolism of the extracellular matrix glycosaminoglycans47. Individuals with MS show coarse facies with an excessively rapid growth of the head48.
Finally, we also included in the analyses two RASopathies, Noonan syndrome (NS, OMIM 163950) and Neurofibromatosis type 1 (NF1, OMIM 162200), which are prevalent in Valle del Cauca and present altered craniofacial development caused by genetic mutations that dysregulate the Ras/MAPK pathway49.
To assess the facial phenotypes associated with these disorders, individuals diagnosed with DS, MS, NS and NF1 and age-matched controls were recruited for photographic sessions at educational and research centers in Cali (Colombia) in 2021. The photographic material was taken under the protocol approved by the Human Research Ethics Committee of the Icesi University (Approval Act No. 309). To photograph the participants and to record relevant clinical information, we obtained informed consent from the participants or from their parents or legal guardians in the case of minor children, in accordance with national guidelines and regulations.
Facial image acquisition and anatomical landmark collection. Facial shape was captured from 2D images taken using a professional digital camera (SONY Alpha 58 + 18–55) that was attached to a tripod and placed at one-meter distance in front of the participants. To capture a natural facial gesture, the images were acquired in an upright position with a neutral facial expression. Participants were asked to sit still, looking towards the front, with open eyes and closed mouth. Although this was challenging in children with Down syndrome, who usually show hyperactivity and tongue protrusion due to hypotonia, several photographs were taken until a neutral facial expression was achieved.
To measure facial shape of each individual and to detect the traits associated with each disorder, we recorded
the 2D coordinates of a set of 18 anatomical facial landmarks (Fig. 1 and Supplementary Table 1). Landmarks
were acquired using an automatic facial landmark detection procedure adapted from the open-source software
library Dlib50. The automatic landmarking process is explained in detail in Supplementary Information. In brief,
from the set of 68 landmarks registered by Dlib, 15 landmarks directly matched our configuration of 18 facial
landmarks (Fig. 1, Fig. S1, Table S1). Three additional landmarks were approximated through direct computations
between the landmark coordinates automatically returned by Dlib: the glabella was computed as the midpoint
between the innermost points located in the eyebrows, and the palpebrale inferius landmarks of the right
and left eyes were each computed as the midpoint between the two central lower eyelid landmarks.
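The three derived landmarks can be illustrated with a minimal Python sketch. It assumes the widely used 68-point layout with 0-based indexing, in which points 21 and 22 are the inner ends of the eyebrows and points 40, 41 and 46, 47 are the central lower eyelid points. These indices are an assumption, since the exact points are described only in the Supplementary Information, and the function name `derived_landmarks` is hypothetical.

```python
# Assumed 0-based indices in a 68-point facial landmark layout.
# They are illustrative; the source does not list the indices it used.
RIGHT_BROW_INNER, LEFT_BROW_INNER = 21, 22
RIGHT_LOWER_LID = (40, 41)
LEFT_LOWER_LID = (46, 47)


def midpoint(p, q):
    """Return the midpoint of two (x, y) points."""
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)


def derived_landmarks(points):
    """Approximate the three landmarks not returned directly by the detector.

    points: sequence of 68 (x, y) tuples from an automatic detector.
    """
    if len(points) != 68:
        raise ValueError("expected 68 landmarks")
    return {
        "glabella": midpoint(points[RIGHT_BROW_INNER], points[LEFT_BROW_INNER]),
        "palpebrale_inferius_right": midpoint(
            points[RIGHT_LOWER_LID[0]], points[RIGHT_LOWER_LID[1]]
        ),
        "palpebrale_inferius_left": midpoint(
            points[LEFT_LOWER_LID[0]], points[LEFT_LOWER_LID[1]]
        ),
    }
```

The remaining 15 landmarks would be read directly from the detector output, so only these three need any computation.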
Diagnosis                  M    F    Total   Age (years old)
Control                    32   47   79      4–59 (x = 23.5)
Down syndrome              8    11   19      3–28 (x = 12.7)
Morquio syndrome           6    5    11      9–28 (x = 17.9)
Noonan syndrome            4    5    9       5–39 (x = 16.4)
Neurofibromatosis type 1   6    6    12      6–52 (x = 17.5)
Total                      56   74   130
Table 1. Sample composition by diagnosis. The table provides the number of male (M) and female (F)
participants, as well as the total sample size for each syndrome. The age range within each diagnostic group is
also provided, where x represents the average age.
Scientific Reports |
(2023) 13:6869 |
https://doi.org/10.1038/s41598-023-33374-x
www.nature.com/scientificreports/
Figure 1. Anatomical position of facial landmarks used in morphometric and statistical analyses to
quantify dysmorphologies associated with the genetic and rare disorders Down, Morquio and Noonan syndromes and
Neurofibromatosis type 1 in a Colombian population.
The validity of the data was assessed by comparing the coordinates of landmarks automatically detected
by Dlib with the coordinates of landmarks manually collected by an expert facial morphologist. Manual and
automatic measurement differences were assessed for each individual landmark using the root mean square
error (RMSE) (Fig. S2). This method was first validated with the 2D facial images of 20 control subjects, and the
average RMSE was 1.75 mm. To validate the automatic landmarking method with images of syndromic patients,
we manually landmarked 20 patients, including 5 individuals diagnosed with each syndrome represented in our
sample. The RMSE for syndromic patients was slightly higher (RMSE = 1.96 mm), but below 2 mm (Fig. S2).
Considering that this error threshold is widely accepted in studies of biological anthropology for craniometric
measurements51, the precision of the automatic detection method of anatomical points was validated on both
control and syndromic samples.
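A per-landmark RMSE of this kind could be computed as in the sketch below. It assumes that coordinates are already expressed in millimetres and that the error for a landmark is the Euclidean distance between its manual and automatic positions, pooled across subjects. The exact formula behind Fig. S2 is not stated in this section, so this is one plausible reading rather than the authors' implementation.

```python
import math


def landmark_rmse(manual, automatic):
    """Root mean square error per landmark between two landmark sets.

    manual, automatic: one entry per subject, each a list of (x, y)
    coordinates in mm, with landmarks in the same order.
    Returns a list with one RMSE value per landmark.
    """
    n_subjects = len(manual)
    n_landmarks = len(manual[0])
    rmse = []
    for j in range(n_landmarks):
        squared_error = 0.0
        for s in range(n_subjects):
            dx = manual[s][j][0] - automatic[s][j][0]
            dy = manual[s][j][1] - automatic[s][j][1]
            squared_error += dx * dx + dy * dy
        rmse.append(math.sqrt(squared_error / n_subjects))
    return rmse
```

Averaging the returned values gives a single summary figure that can be compared against the 2 mm threshold.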
Quantification of facial phenotypes. We used Euclidean distance matrix analysis (EDMA) to describe
the facial phenotype associated to each syndrome. EDMA is a robust morphometric method for assessing local
differences between samples52 by detecting linear distances that significantly differ between pairwise sample
contrasts and comparing patterns of significant differences across samples.
To account for size differences between subjects, the 2D coordinates of the facial landmarks of each subject
were scaled by their centroid size, estimated as the square root of the sum of squared distances of all the landmarks from their centroid53. After scaling, as EDMA represents shape as a matrix of linear distances between
all possible pairs of landmarks, a total of 153 unique facial measurements (18 × 17 / 2) were calculated for each individual.
Linear distances were compared for each group of DS, MS, NS and NF1 syndromes with control individuals by
performing two-tailed two-sample shape contrasts on all unique inter-landmark linear distances from each
sample. Relative differences between patients and controls were computed as (mean distance in controls − mean
distance in patients) / mean distance in controls.
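The scaling and distance steps can be sketched in a few lines of Python. This is a minimal illustration of centroid-size scaling and pairwise distances as described in the text, not the EDMA software itself; the function names are hypothetical.

```python
import math
from itertools import combinations


def centroid_size(landmarks):
    """Square root of the summed squared distances of landmarks to their centroid."""
    n = len(landmarks)
    cx = sum(x for x, _ in landmarks) / n
    cy = sum(y for _, y in landmarks) / n
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in landmarks))


def scaled_distances(landmarks):
    """All unique inter-landmark distances after scaling by centroid size."""
    cs = centroid_size(landmarks)
    scaled = [(x / cs, y / cs) for x, y in landmarks]
    return [math.dist(p, q) for p, q in combinations(scaled, 2)]


def relative_difference(mean_controls, mean_patients):
    """(mean distance in controls - mean distance in patients) / mean in controls."""
    return (mean_controls - mean_patients) / mean_controls
```

With 18 landmarks, `scaled_distances` returns 153 values, and because of the scaling step the result does not change if all coordinates are multiplied by a constant.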
Statistical significance was assessed using a non-parametric bootstrap test with 10,000 resamples. EDMA
statistically evaluated the number of significant local linear distances in each two-sample comparison based
on confidence interval testing. We used the default α level in EDMA (α = 0.10), and a 90% confidence interval
was calculated for each linear distance. The shape differences were sorted in increasing order, and the first 5%
and the last 5% differences were discarded. The resulting minimum and maximum differences were used to set
up the lower and upper confidence limits for each linear distance. Interlandmark distances were considered
non-significantly different between controls and patients when the resulting interval contained the value zero.
Otherwise, the null hypothesis of equality was rejected, and we assumed that a significant shape difference
existed at the α level54. To pinpoint specific local shape differences and to reveal the unique morphological pattern of variation associated with each disorder, the ten longest and shortest significant relative differences were
plotted on facial figures.
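The confidence interval test for a single inter-landmark distance can be sketched as follows. The sketch assumes that controls and patients are resampled with replacement within each group and that the statistic is the relative difference of group means; the resampling scheme and statistic used internally by EDMA may differ, and the function names are hypothetical.

```python
import random


def bootstrap_interval(controls, patients, n_resamples=10000, alpha=0.10, seed=0):
    """Percentile bootstrap interval for the relative difference of means.

    controls, patients: lists of one scaled distance per individual.
    The sorted differences are trimmed by alpha/2 at each end, and the
    remaining minimum and maximum form the confidence limits.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        c = [rng.choice(controls) for _ in controls]
        p = [rng.choice(patients) for _ in patients]
        mean_c = sum(c) / len(c)
        mean_p = sum(p) / len(p)
        diffs.append((mean_c - mean_p) / mean_c)
    diffs.sort()
    cut = int(round(n_resamples * alpha / 2))
    kept = diffs[cut:n_resamples - cut]
    return kept[0], kept[-1]


def is_significant(interval):
    """A distance differs significantly when the interval excludes zero."""
    lower, upper = interval
    return not (lower <= 0.0 <= upper)
```

With α = 0.10 and 10,000 resamples, 500 values are discarded at each end, which corresponds to the 5% trimming described above.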
Facial dysmorphology score. To confirm that results were not random due to the small sample sizes
available in rare diseases, we combined the results from EDMA with an iterative bootstrapping method that
further assessed whether the facial dysmorphologies associated with each syndrome were statistically significant55.
First, we estimated from the EDMA results a facial dysmorphology score (FDS) as the percentage of significantly different distances between patient and control groups. Then, we ran simulations with random samples
of controls and patients generated by iterative bootstrapping to assess the statistical significance of the patterns
revealed by EDMA. For each disorder, we first created subsamples of N randomly chosen controls (where N is
the total number of patients available in the sample). Then, using a subsampling approach, we automatically
generated random pseudo-subsamples containing a known number of patients (namely M). This procedure
was repeated with increasing numbers of patients and resulted in a series of staggered pseudo sub-samples that
contained from M = 0 to M = N patients. A total of 150 simulations were run in each round, and in each of these
simulations, we computed an EDMA analysis and an FDS score.
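One way to generate the groups for a single simulation is sketched below. It assumes that each simulation contrasts a reference group of N controls with a mixed group of size N holding M patients and N − M further controls, and that no control appears in both groups. This reading of the procedure is an assumption, and `pseudo_groups` is a hypothetical name.

```python
import random


def pseudo_groups(control_ids, patient_ids, m, rng):
    """Draw one reference group and one mixed pseudo-subsample.

    control_ids, patient_ids: lists of individual identifiers.
    m: number of patients to place in the mixed group (0 <= m <= N).
    rng: a random.Random instance, passed in for reproducibility.
    Returns (reference, mixed), each of size N = len(patient_ids).
    """
    n = len(patient_ids)
    if not 0 <= m <= n:
        raise ValueError("m must lie between 0 and the number of patients")
    needed_controls = 2 * n - m
    if len(control_ids) < needed_controls:
        raise ValueError("not enough controls for disjoint groups")
    drawn = rng.sample(control_ids, needed_controls)
    reference = drawn[:n]
    mixed = drawn[n:] + rng.sample(patient_ids, m)
    return reference, mixed
```

Calling this function 150 times for each value of M, and running the shape contrast on every pair of groups, would yield one distribution of scores per round.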
The results from each round of random groups were separately represented in histograms. The first round
of simulations contained no patients (M = 0) and only included control individuals, representing facial differences that can be found randomly in the general population. To assess whether the FDS value obtained using
the complete patient dataset was significantly different or similar to the FDS resulting from a random sample,
we compared the distribution of FDS random values with the FDS observed in the whole sample. The P-value
assessing the statistical significance of the comparison was computed as the number of simulations containing no patients that provided a higher FDS than the observed FDS, divided by the total number of
simulations. P-values below 0.05 indicated that the FDS obtained using the real dataset was significantly higher than the FDS
obtained randomly in a sample of control subjects.
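The score and its P-value reduce to two short computations, sketched here in Python under the definitions given in the text. The function names are hypothetical, and the inputs are assumed to come from the shape contrasts and the M = 0 simulations.

```python
def facial_dysmorphology_score(significant_flags):
    """Percentage of inter-landmark distances flagged as significantly different."""
    return 100.0 * sum(significant_flags) / len(significant_flags)


def empirical_p_value(null_scores, observed_score):
    """Fraction of control-only simulations with a higher score than observed."""
    higher = sum(1 for score in null_scores if score > observed_score)
    return higher / len(null_scores)
```

With 150 simulations per round, the smallest non-zero P-value this estimate can return is 1/150, so a value below 0.05 means that at most 7 control-only simulations exceeded the observed score.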
Face2Gene diagnostic assessment. To assess the accuracy of automated diagnostic methods in the
Colombian sample, we compared the clinical diagnosis based on clinical and genetic testing with the diagnosis
estimated from the frontal facial 2D images of the patients using the Face2Gene technology (FDNA Inc., Boston,
MA, USA; https://www.face2gene.com). Following Gurovich9, we assessed the top-one and top-five accuracies
for each disorder, estimated as the percentage of cases where the Face2Gene model predicted the correct syndrome as the first result or within the five first results from the sorted list of probable diagnoses. We also calculated these accuracies expanding the diagnostic range to the disorder family.
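Top-one and top-five accuracies of this kind can be computed from ranked prediction lists as sketched below. The sketch does not call Face2Gene; it assumes the ranked syndrome labels have already been recorded for each patient, and the `family` mapping used to widen the match to the disorder family is a hypothetical input.

```python
def top_k_accuracy(ranked_predictions, true_labels, k, family=None):
    """Percentage of cases whose true label is among the first k predictions.

    ranked_predictions: one list of labels per case, most probable first.
    true_labels: the clinically and genetically confirmed label per case.
    family: optional dict mapping a label to its disorder family; when
    given, a prediction in the same family counts as a match.
    """
    def key(label):
        return family.get(label, label) if family else label

    hits = 0
    for predictions, truth in zip(ranked_predictions, true_labels):
        if key(truth) in {key(p) for p in predictions[:k]}:
            hits += 1
    return 100.0 * hits / len(true_labels)
```

Setting `k=1` or `k=5`, with or without the family mapping, gives the four accuracy variants assessed for each disorder.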
Moreover, we evaluated the similarity between the Colombian patients and the facial gestalt models used
by Face2Gene for syndrome classification. For each individual, we selected the first diagnostic prediction that
matched their clinical and genetic diagnosis and recorded the gestalt similarity. We classified the level of similarity between the individual and the corresponding gestalt model into seven categories, including “very low”,
“low”, “low-medium”, “medium”, “medium–high”, “high”, and “very high” gestalt similarity, using the “gestalt
level” barplot provided by Face2Gene.
Finally, to further test the influence of population ancestry on the diagnostic accuracy of Face2Gene, and to
directly compare the results with individuals from populations of European descent, we performed an extensive
search of public image databases to obtain 2D photos of European subjects diagnosed with DS, NS, MS and
NF1 syndromes. We collected the images of 45 subjects with DS56 and 24 diagnosed with NS57. Unfortunately,
no 2D images of European individuals diagnosed with MS and NF1 were found publicly available. Using these
images, we tested the accuracy of Face2Gene in DS and NS employing the same method previously described
for the Colombian population. However, we could not use these publicly available images to perform EDMA
and FDS analyses on the European samples, because the pictures were not taken under controlled conditions56,
and diverse facial expression and head position would lead to bias in results of quantitative shape comparisons.
Results
EDMA analyses showed that each syndrome presented a characteristic facial phenotype.
In individuals with Down syndrome, all facial structures including the eyes, nose and mouth presented significant differences as compared to controls. Overall, DS was associated with wider but shorter facial traits (Fig. 2A).
Results showed a 6.5% increase of relative distance between the midpoint between the eyebrows (glabella)
and the most inferior medial point of the lower right eyelid (palpebrale inferius), and a 7.5% increase between
the right palpebrale inferius and the outer commissure of the right eye (exocanthion), indicating hypertelorism.
Additionally, in this Colombian sample, people with DS exhibited longer measurements in the buccal portion,
with a 6–8% increase of mouth width as measured from the crista philtri to the chelions (Fig. 2A). However, the
midfacial and nasal regions were reduced (Fig. 2A). People with DS presented a 6–8% reduction in measurements
of midfacial height, with the largest difference detected as a 9.7% reduction of the distance between the tip and
the root of the nose (Fig. 2A). The facial dysmorphology score (FDS) indicated that up to 58.2% of facial traits
were significantly different in people with DS (Fig. 2B).
The facial pattern associated with Morquio syndrome was also characterized by wider and shorter midfacial
traits, as observed in Down syndrome. However, facial dysmorphologies were more abundant and severe in MS
than in DS, with 65.4% of facial traits significantly different in diagnosed individuals and higher percentages of
relative change (Fig. 3 A, B). The most affected regions were the midface and the nose, whereas the mouth was
the least affected. Individuals with MS presented hypertelorism, with 14% increase in the distance between the
midpoint between the eyebrows (glabella) and the inner commissures of the left and right eyes (endocanthions).
Individuals with MS also showed larger distances in the base of the nose, with a 14–19% increase in the distance
from the tip of the nose to the insertion of the right and left alar bases (subalare) as compared to controls. Mouth
width was also increased in MS; whereas midfacial heights measuring the distance between the eyes and the nose
were significantly reduced from 10 to 16% in individuals with MS (Fig. 3A).
In Noonan syndrome, facial dysmorphologies were abundant and concentrated in the orbital and nasal
regions. EDMA detected significantly increased distances in the upper face, but decreased distances in the
midface (Fig. 4A).
Patients presented a lower position of the eyes, with 9 to 13% increased distances between the glabella or
sellion and the landmarks located in the eyes. The mouth also showed a more inferior position, with 8–10%
increased relative distance between the tip of the nose and the superior lip, but the shape of the mouth did not
show large differences between patients and controls. The reduction of midfacial heights in individuals with NS
Figure 2. Localized Euclidean Distance Matrix Analysis facial shape pairwise contrasts and iterative
bootstrapping tests of facial dysmorphology between controls and individuals diagnosed with Down syndrome.
(A) EDMA results. Dotted lines represent facial measurements significantly different in control and patient
groups. Lines in light tones indicate measurements that are shorter in patients as compared to controls, whereas
lines in dark tones represent measurements that are longer in patients. (B) Iterative bootstrapping tests based
on facial dysmorphology scores (FDS). Histograms represent the simulation results for each random group
separately, which contain an increasing number of patients, from no patients (M = 0) to all patients (M = N).
From top to bottom histograms, the simulations included 0, 3, 6, 9, 12, 15 and 18 patients. The dotted red
line shows the FDS score obtained with the complete sample of control and patients (Table 1). * Statistically
significant P-value.
Figure 3. Localized Euclidean Distance Matrix Analysis facial shape pairwise contrasts and iterative
bootstrapping tests of facial dysmorphology between controls and individuals diagnosed with Morquio
syndrome. From top to bottom histograms, the simulations included 0, 2, 4, 6, 8 and 10 patients. For more
details see legend in Fig. 2.
ranged from 5 to 11%, with a similar magnitude as in DS (Fig. 4A). FDS indicated that 47.7% of facial traits were
significantly different in NS (Fig. 4B).
Neurofibromatosis type 1 was associated with minor facial dysmorphologies, which were less abundant and
less severe than in the previous syndromes (Fig. 5A). Individuals with NF1 only presented 11.4% of significantly
different facial traits as compared to controls, and the percentages of relative change were low, mostly ranging
from 1 to 5% (Fig. 5A,B). The largest difference was a 10% increase in facial distance between the glabella and
the labiale superius (Fig. 5A). Along with larger distances in the midline of the face, EDMA detected reduced
distances on the right and left sides of the face, with shorter distances from the right and left chelion to the eye
landmarks, the endocanthion and the palpebrale inferius. Hypertelorism was not present in individuals with
NF1 (Fig. 5A). In NF1, the FDS score was not significant (Fig. 5B), indicating that the facial dysmorphology
Figure 4. Localized Euclidean Distance Matrix Analysis facial shape pairwise contrasts and iterative
bootstrapping tests of facial dysmorphology between controls and individuals diagnosed with Noonan
syndrome. From top to bottom histograms, the simulations included 0, 2, 4 and 6 patients. For more details see
legend in Fig. 2.
Figure 5. Localized Euclidean Distance Matrix Analysis facial shape pairwise contrasts and iterative
bootstrapping tests of facial dysmorphology between controls and individuals diagnosed with
Neurofibromatosis type 1. From top to bottom histograms, the simulations included 0, 2, 4, 6, 8 and 10 patients.
For more details see legend in Fig. 2.
pattern associated with NF1 is so subtle that overall it is not larger than facial differences that could be randomly
detected using a sample of control subjects.
For the other syndromes, the simulation tests confirmed that the facial dysmorphologies associated with
Down, Morquio and Noonan syndromes were significant and different from random comparisons in control
subjects. Few simulations resulted in a higher FDS than the FDS obtained with the complete real sample (Figs. 2B,
3B, 4B, first row and blue line). Moreover, in DS, MS and NS, facial dysmorphology scores increased as larger
numbers of diagnosed individuals were included in the simulations (Figs. 2B, 3B, 4B, middle rows), confirming
the severity of the facial dysmorphologies associated to these syndromes. Finally, the simulations comparing all
recruited diagnosed individuals (last row) with random subsamples of control subjects (first row) indicated that
FDS scores can range widely from 10 to 80%, underscoring the biasing effects of small sample sizes.
Face2Gene accuracy in Colombian and European populations. After quantifying the facial dysmorphologies associated with DS, MS, NS and NF1 in the Colombian sample, we tested the accuracy of the diagnosis provided by the automatic diagnostic algorithms of Face2Gene. We assessed the correspondence between
the estimated Face2Gene diagnosis based on facial frontal 2D images with the diagnosis based on clinical and
genetic testing.
Face2Gene estimated Down syndrome diagnosis with top-1 accuracy of 100%, as DS diagnosis was listed as
the first diagnosis in all individuals, with an average gestalt similarity of 6.2 (Table 2, Fig. 6). When comparing
the gestalt similarities in Colombian and European populations, a Wilcoxon test did not find a significant difference between the average gestalt similarity (P = 0.4). However, a Levene test detected a significant difference
in the variance of gestalt similarity scores (P = 0.01). Whereas in the Colombian population the gestalt similarity
in DS ranged from very high to very low; in the European population the range of variation was limited from
very high to medium (Fig. 7).
In Morquio syndrome, the top-1 accuracy of Face2Gene was 0%, as the specific diagnosis of mucopolysaccharidosis type IVA (MPSIVA) was never listed as a first prediction (Table 2). Although Face2Gene could not
identify the specific type of MS, the automatic diagnostic algorithms associated the facial dysmorphologies with
a diagnosis related with mucopolysaccharidosis disorders in 36.4% of cases, with a medium–high average gestalt
similarity of 5.6 (Table 2). When the first 5 diagnostic predictions were considered, the top-5 accuracy rose to
45.4% for exact MPSIVA diagnosis and to 100% for mucopolysaccharidosis disorders, but with a low-medium
gestalt similarity (Table 2, Fig. 6). In our sample, we detected five genetic variants (p.Gly301Cys, p.Arg386Cys,
p.Arg94Cys, p.Gly333Asp, and p.Ser80Leu) that are missense mutations commonly found in the Colombian
population45 (Table S2). Due to the small sample size and genetic heterogeneity of the patients, it was not possible
to test whether different genetic variants were associated to different facial phenotypes. Comparative European
samples were not available.
The top-1 accuracy of Face2Gene for Noonan syndrome was 66.7%, with a medium–high average gestalt similarity of 5.2 when considering subjects in which the diagnosis was successful (Table 2). Top-5 accuracy increased
to 77.8% for exact NS diagnosis, and to 88.9% when considering Noonan Syndrome-Like Disorder diagnoses,
          Top-1 accuracy                                              Top-5 accuracy
          Exact diagnosis               Within disorder family        Exact diagnosis               Within disorder family
Syndrome  % cases  Gestalt similarity   % cases  Gestalt similarity   % cases  Gestalt similarity   % cases  Gestalt similarity
DS        100      6.2                  100      6.2                  100      6.2                  100      6.2
MS        0        0                    36.4     5.6                  45.4     3                    100      3.4
NS        66.7     5.2                  66.7     5.2                  77.8     4.7                  88.9     4.4
NF1       8.3      1                    50       1.7                  66.6     1.2                  66.6     1.6
Table 2. Accuracy of Face2Gene diagnosis based on 2D facial images in Down, Morquio, Noonan and
Neurofibromatosis type 1 syndromes in a Colombian population. Percentage of cases matching the genetic
diagnosis are provided for each syndrome, as well as gestalt similarity values.
Figure 6. Gestalt similarity scores between Colombian individuals and Face2Gene models of Down, Morquio,
Noonan and Neurofibromatosis type 1 syndromes. Violin plots are based on top-5 accuracy Face2Gene
predictions within family disorder. Each plot shows the number of individuals scored at each gestalt similarity
level, from very high to very low.
Figure 7. Comparison of gestalt similarity scores between Colombian and European populations in Down and
Noonan syndromes. Raincloud plots are based on top-5 accuracy Face2Gene predictions within family disorder,
and show the corresponding average gestalt similarity score, the range of variation, and the distribution within
each disorder and population.
with a medium gestalt similarity of 4.4 and wide variation among individuals (Table 2, Fig. 6). Although differences did not reach statistical significance probably due to small sample sizes (P = 0.09), the comparison between
populations showed that in Europe, both the diagnostic accuracy and the gestalt similarity were higher than in
Colombia. Using 2D images of patients of European origin, the Face2Gene top-1 accuracy for NS was 100%
and the average gestalt similarity was 5.5 (Fig. 7).
Finally, in Neurofibromatosis type 1, Face2Gene presented a top-1 accuracy of 8.3% associated with a very
low gestalt similarity of 1 (Table 2). When diagnoses within the RASopathies disorder family were considered, 5
out of 12 individuals were diagnosed as Noonan syndrome and the top-1 accuracy rose to 50% (Table 2). The
top-5 diagnostic accuracy was 66.6% and was associated with low gestalt similarity values of between 1 and 2 in
87.5% of individuals (Table 2, Fig. 6). Comparative European samples were not available for NF1.
Discussion
Our analyses provided an accurate quantitative comparison of facial dysmorphologies in Down, Morquio, Noonan
and Neurofibromatosis type 1 syndromes in a Latin-American population from Colombia. An objective and
highly detailed description of the facial phenotype is a major improvement over qualitative descriptions of the
complex facial dysmorphologies associated with these genetic disorders. We quantified local facial trait differences present in people diagnosed with these disorders as compared with age-matched controls of the same
population, localizing the largest statistically significant facial dysmorphologies.
Our results indicated differential facial patterns associated with each disorder, with major significant dysmorphologies in DS, MS and NS, and minor facial dysmorphologies associated with NF1. Different types of genetic
alterations, which ranged from aneuploidy and overall genetic imbalance in DS; to point genetic mutations
affecting different processes or signaling pathways, such as the metabolism of mucopolysaccharides in MS, and
the RAS/MAPK pathway in NS and NF1, significantly affected the facial phenotypes. These genetic alterations
deviate the signaling pathways regulating normal facial development16,58, and alter normal morphogenesis and
growth during pre- and postnatal development15 of individuals with genetic and rare disorders.
Population-specific facial traits in Colombian individuals with genetic and rare disorders. Overall, the facial patterns observed in the Colombian Latin-American population coincide with the
descriptions reported in the literature for each syndrome48,59–61. However, there are specific local traits that
differ, suggesting that facial traits associated to genetic and rare diseases might be modulated by population
ancestry, as a result of different evolutionary and adaptive histories of human populations33–35.
Down syndrome. Down syndrome presents a worldwide prevalence of 14 per 10,000 live births, with life expectancy increasing from 25 to 60 years in developing countries62–65. In most Latin American regions, the real
incidence of DS remains unknown, and is usually underreported. A cross-sectional study in Brazil
reported a DS birth rate of 4 cases per 10,000 live births66; whereas in Colombia several studies have reported
a prevalence rate between 1 per 1,000 and 5 per 10,000 live births67,68. DS is an aneuploidy caused by trisomy of
chromosome 21, and is the leading genetic cause of intellectual disability63. Moreover, DS is associated with
craniofacial dysmorphologies that impair vital functions such as breathing, eating, and speaking. In the literature, the DS craniofacial phenotype is mostly based on the analysis of European descent populations, and the
characteristic traits include brachycephalic heads with maxillary hypoplasia leading to facial flatness; depressed
nasal bridge and reduced airway passages59; dysplastic ears with lobe absence; eyes with oblique palpebral fissures, epicanthal folds, strabismus and nystagmus16,69; and oral alterations including open mouth, cleft lip, lingual furrows and protrusion, macroglossia, micrognathia, and narrow palate70,71.
In the Colombian population, we found facial dysmorphologies that are consistent with the craniofacial
patterns reported in the literature. For instance, our analyses detected differences in linear facial measurements
that correspond to typical DS traits such as hypertelorism, maxillary hypoplasia, and shorter and wider faces
associated with a brachycephalic head16,72. Results also suggested other characteristic traits of DS, such as midfacial retrusion and depressed nasal bridge59. Open mouth and macroglossia70,71 were also observed during the
photographic sessions in the participants of our study.
However, in contrast to European and North American populations55, in the Colombian population we
detected that the mouth was wider in individuals diagnosed with DS as compared to euploid controls. This difference could be caused by unnatural facial gestures of the participants when asked to close the mouth during
the photo shoot, or by facial differences associated to ancestry. In fact, Kruszka et al.33–35 analyzed individuals
diagnosed with DS in diverse populations, and showed craniofacial differences between individuals from different populations (Africans, Asians, and Latin Americans), demonstrating that ancestry is a relevant factor when
assessing craniofacial variation associated to rare disorders.
Morquio syndrome. In Morquio syndrome, the worldwide prevalence ranges from 1 case per 75,000 to 1
per 200,000 live births; whereas in Colombia the prevalence rises up to 0.68 per 100,000 live births45. As a
mucopolysaccharidosis syndrome, the typical alterations of MS involve the supporting tissue and the osteoarticular system73. Individuals with MS display abnormalities such as skeletal dysplasia, short stature and trunk,
kyphoscoliosis, pectus carinatum, genu valgum, and joint hyperlaxity74. Oral diseases often include periodontal disease, malocclusions, caries, and premature tooth loss46. Individuals with MS show coarse facies, with an
excessively rapid growth of the head48. Craniofacial features include a prominent forehead, hypertelorism, prognathism, wide mouth and nose, depressed nasal bridge, plump cheeks, and lips with an oversized tongue48.
In the Colombian population, the facial dysmorphologies observed were consistent with traits reported in the
literature, which included hypertelorism, prognathism, wide nose, and wide mouth46,48.
In the Colombian sample, Morquio syndrome was associated with the most severe facial dysmorphologies.
Considering that keratan and chondroitin sulfate alterations associated with MS cause irreparable damage to
leukocytes and fibroblasts, and accumulate over life inducing extreme deformations of the osteoarticular system,
facial dysmorphologies associated with MS are expected to increase with age, becoming more severe in adult
individuals46. Further research is required to test this hypothesis and to assess whether pharmacological treatments can slow down the progression of the disease and reduce the facial dysmorphologies associated with MS.
This is especially relevant in Colombia, which is a country with one of the highest prevalences of MS in the world45.
Moreover, dysmorphologies associated with MS vary among individuals. Typically, MS patients present
severe phenotypes, although less severe forms have been described as mild or attenuated phenotypes73. There
is no consistent evidence regarding the genotype–phenotype correlation in MS, and whether different GALNS
mutations are associated with the degree of severity in facial dysmorphology. In our Colombian sample, we
detected five genetic variants (p.Gly301Cys, p.Arg386Cys, p.Arg94Cys, p.Gly333Asp, and p.Ser80Leu). Two of
these genetic variants, p.Gly301Cys and p.Arg386Cys, are the most frequently reported mutations in cases
of Morquio syndrome, specifically in Colombia, but also in other American (Brazil, Chile, Argentina, Canada),
and European countries (Spain, Portugal, Italy, Poland)45,75–77. The high prevalence of the p.Gly301Cys mutation
in the Colombian population could result from founder and migration effects45. The p.Arg386Cys variant has
been further detected in China and Turkey75–77; whereas the p.Arg94Cys allele has been previously reported in
the Middle East, Brazil, and Italy76,77. Other genetic variants, such as p.Ile113Phe, which are more frequently reported
in British and Irish populations45,75–77, were not detected in our Colombian sample. Further tests with larger
samples associated to each genotype are needed to test whether the population-specific genetic variants can be
associated to different facial phenotypes in Morquio syndrome.
RASopathies: Noonan and NF1 syndromes. Regarding Noonan syndrome, the worldwide prevalence of NS is 1
per 1,000 to 1 per 2,500 live births49. NS is the most common type of RASopathy, and is a rare genetically heterogeneous autosomal dominant disorder caused by mutations in either the PTPN11, SOS1, KRAS, BRAF or RAF1
genes. Individuals with NS display facial features such as hypertelorism, epicanthic folds, strabismus, downward
slanting palpebral fissures, ptosis, high arched palate, deeply grooved philtrum with high peaks of upper lip vermillion border, midfacial hypoplasia and micrognathia, broad flat nose, low-set posteriorly rotated ears, curly/
sparse/coarse hair, and short webbed neck60. In the Colombian population, we detected hypertelorism, downward slanting palpebral fissures, and midfacial hypoplasia in cases of NS, as reported in populations of European
descent60. In addition, our results quantified relative changes in the position of the mouth in Colombian individuals diagnosed with NS not reported before78.
In Neurofibromatosis type 1, the worldwide incidence is 1 per 2,500 to 1 per 3,000 individuals49. NF1 is an
autosomal dominantly inherited neurocutaneous disorder caused by a mutation in the neurofibromin gene.
The clinical manifestations of NF1 are variable, and the timing of the onset has a major influence49. Regarding
craniofacial traits, individuals with NF1 present macrocephaly, facial asymmetry caused by dysplasia of the
sphenoid wings61, as well as bone deformities caused by plexiform neurofibromas, enlarged mandibular canal,
retrognathic mandible and maxilla, and short cranial base79. The facial pattern associated with NF1 in individuals
from Colombia was also compatible with typical traits of NF1, such as midface hypoplasia49. However, our results
did not detect facial asymmetry or hypertelorism as prominent facial differences between diagnosed individuals
and controls in the Colombian population49.
Overall, our results support previous evidence demonstrating that rare disorders present distinctive facial
traits that are population specific, with clinical features that are significantly different in Africans, Asians, and
Latin Americans34–36. However, comparative facial quantitative analyses including subjects from different world
regions are not usually available for most genetic and rare disorders, and reference data for diagnosis is mainly
based on phenotypes defined on populations of European descent. In fact, almost no images of individuals of
Latin American origin are included in reference medical texts16. Our results underscore the need to extend the
analyses to populations from all over the world to achieve a complete and more accurate phenotypic representation of genetic and RD to optimize the diagnostic potential of facial biomarkers in the clinical practice.
Variable accuracy diagnosis in a Colombian population with diverse ancestry. Deep learning
algorithms such as Face2Gene have shown potential as a reliable and precise tool for genetic diagnosis by image
recognition9,26,80,81. In the Colombian sample analyzed here, Face2Gene diagnosed Down syndrome with 100%
accuracy, with the same accuracy as in the European sample. This result suggests that in a relatively common
genetic disorder such as DS, in which the machine learning algorithm is likely trained in a large sample of individuals with a distinctive and well-represented facial phenotype, Face2Gene shows high diagnostic accuracy,
independently from the genetic ancestry.
However, we found that this result cannot be extrapolated to other rare disorders. For instance, we detected a
lower accuracy in the diagnosis of Noonan syndrome in the Colombian sample as compared with the European
sample. Although Face2Gene correctly identified the disorder in most Colombian subjects, especially when considering the top5-accuracy within Noonan syndrome-like disorders (88.9%), the percentage of top1-accuracy was
reduced from 100% to 66.7% in the Colombian sample. We hypothesize that when machine learning algorithms
are trained in a relatively small sample of individuals with homogeneous European ancestry, the accuracy of
diagnosing rare disorders might be more sensitive to population ancestry. Individuals from diverse populations
may show lower gestalt similarity scores when assessed with predictive models that are trained on a population
with different genetic and facial variation, and this may lead to reduced diagnostic accuracy.
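The top1- and top5-accuracy figures discussed above can be illustrated with a minimal sketch. It assumes that each case yields a ranked list of suggested syndromes and that a case counts as a hit when the true diagnosis appears among the first k suggestions. The function name `top_k_accuracy`, the syndrome labels, and the three example cases are hypothetical and are not study data; the grouping into "Noonan syndrome-like disorders" used in the text is not modelled here.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    true_labels: Sequence[str],
    k: int,
) -> float:
    """Fraction of cases whose true label is among the first k ranked predictions."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(ranked_predictions) != len(true_labels) or not true_labels:
        raise ValueError("need one ranked list per true label, and at least one case")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Hypothetical ranked outputs for three illustrative cases (not study data).
example_predictions = [
    ["Noonan", "Williams-Beuren", "Costello"],
    ["Williams-Beuren", "Noonan", "Costello"],
    ["Costello", "Cardiofaciocutaneous", "Williams-Beuren"],
]
example_truth = ["Noonan", "Noonan", "Noonan"]

top1 = top_k_accuracy(example_predictions, example_truth, k=1)  # 1 of 3 cases hit
top2 = top_k_accuracy(example_predictions, example_truth, k=2)  # 2 of 3 cases hit
```

In this toy example the gap between `top1` and `top2` mirrors the pattern described in the text: a correct syndrome that is ranked second still counts toward the looser metric but not toward top1-accuracy.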
Unfortunately, no data was publicly available on European samples to compare the diagnostic accuracy of
Face2Gene in Morquio and Neurofibromatosis type 1 syndromes. Our results showed that the top1-accuracy for
exact diagnosis of Mucopolysaccharidosis type IVA was 0% in the Colombian sample, even though Morquio syndrome
was associated with the most severe facial dysmorphologies. Only a low percentage of cases (36.4%) were identified as a mucopolysaccharidosis-like syndrome in the first prediction. In the case of NF1, the top1-accuracy was
also very low (8.3%), although the facial dysmorphologies in this disorder were less abundant and severe, and
this result could simply reflect the difficulty of diagnosing NF1 from facial traits.
Finally, in the Colombian sample we detected a wide range of variation in gestalt similarity scores for most
disorders, even for Down syndrome. In European subjects, the gestalt similarity for DS was high or very high in
95.5% of cases, and only 5% of subjects showed a medium gestalt score, even when the images included in Ferry
et al. (2014)56 were ordinary photos with uncontrolled lighting, pose, and image quality. In Colombia, 79% of
individuals diagnosed with DS were associated with very high gestalt similarity values, but in 21% of subjects
the gestalt similarity was lower, and ranged from medium–high to very low values. Specifically, individuals with
the lowest scores exhibited traits that suggested an admixed ancestry, a hypothesis that needs further assessment.
The potential of facial biomarkers to diagnose genetic and rare disorders. Qualitative visual
assessment of facial dysmorphologies is frequently employed for diagnosis, clinical management and treatment
monitoring of RD16. Experts in dysmorphologies can identify the facial “gestalt” distinctive of many dysmorphic
syndromes16. However, this facial assessment relies on the expertise of the clinician, and is very challenging
because there is no clear one-to-one correspondence between disorders and facial dysmorphologies. Different
genetic mutations can cause the same syndrome or similar phenotypes, whereas the same mutation can induce
different phenotypes12,82. In addition, within the same rare disease there may be several subtypes, and symptoms
may vary even within individuals of the same genetic disorder and the same family3. This complex biology
generates confusion at the time of diagnosis and warrants the development of efficient, objective and reliable
diagnostic methods.
Computer-assisted phenotyping can overcome these pitfalls and provide widely accessible technologies for
quick syndrome screening6. In this automated approach, methods can be based on 2D or 3D images9,10,26. The
advantage of 2D methods is that data collection is easy and can be readily translated into the clinical practice, as
physicians can take facial images even with simple digital cameras or smartphones. The collection of 3D models
is more sophisticated and requires specialized equipment but provides more accurate phenotype descriptions
by incorporating the depth dimension.
To further improve the methods of craniofacial assessment to diagnose individuals with genetic syndromes
and RD that exhibit facial dysmorphologies, it is crucial to assess the large morphological variation displayed
by human populations in facial phenotypes. Factors such as age, sex and ancestry should be accounted for in
diagnostic methods. Clinical manifestations in some genetic disorders usually begin at an early age, with two
thirds of patients expressing symptoms before the second year of life3; although in other disorders facial dysmorphologies develop later, during postnatal development. Male and female faces present sexual dimorphism
at adulthood83, and diseases can differently affect the facial phenotype depending on sex differences84.
The role of population ancestry in the facial phenotype associated with genetic and rare disorders also needs
to be further investigated in future analyses, assessing the reliability and validity of automatic diagnostic tools
in admixed populations with diverse contributions of Amerindian, African and European ancestry components.
This is critical in rare disorders with heterogenous clinical presentation and phenotype, where clinical diagnosis
is a challenging process5,6 that may take several years, leading to the so-called diagnostic odyssey7.
Accurate and early diagnosis of genetic and rare disorders is crucial for adequate health care and clinical management. Without a diagnosis, individuals and their families must proceed without basic information
regarding their health and future developmental outcomes6. Even though gene-based technologies have greatly
improved diagnostic procedures25, the mutations causing many rare diseases are still not known and access to
genetic testing is limited3. Genetic consultations may become a long process, and broad molecular testing such
as exome and genome sequencing represent a high expense that is not affordable for all families and health care
systems, especially in low- and middle-income countries7. In this context, faster, non-invasive and low-cost diagnostic methods based on facial phenotypes emerge as complementary tools for providing an earlier, reliable first diagnosis9,10,25,26.
Therefore, in future research the recruitment of participants must be expanded to include as many individuals
with RD as possible, together with large comparative samples of age-matched controls, from both sexes, and from
diverse world regions that faithfully represent the complex craniofacial variation and evolutionary histories of
human populations. For instance, the population in Southwestern Colombia is characterized by high levels of
admixture from people with Native American, African, and European ancestry44,85. Including the morphological
variation of faces from such different ancestry backgrounds is key to pinpoint the facial dysmorphologies associated with diseases in worldwide diverse populations86. Our simulation analyses further highlight the importance
of maximizing the recruitment of diagnosed and control individuals, as results considerably change depending
on the cohort and sample sizes.
Conclusions
Facial phenotypes associated with genetic and rare disorders can be influenced by population ancestry34–36. Our
ancestry comparisons highlight that diverse genetic background variation can modulate the phenotypic response
to disease, affecting the accuracy of current tools of clinical diagnosis. In the future, deep learning algorithms
including a high variety of populations with different ancestry backgrounds will optimize the precision and
accuracy of diagnosis in an unbiased approach. Such predictive models will support clinicians in decision-making across the world.
Data availability
Raw phenotype data from the Colombian population cannot be made available due to restrictions imposed by
the ethics approval. Images from publicly available sources can be accessed from the original publications56,57.
Anonymized landmark data and Matlab code for computing the Facial Dysmorphology Score (FDS) are available at
https://github.com/xaviersevillano/EDMA_FDS_analysis_2D.
Received: 10 December 2022; Accepted: 12 April 2023
References
1. Nguengang Wakap, S. et al. Estimating cumulative point prevalence of rare diseases: analysis of the Orphanet database. Eur. J. Hum.
Genet. 28, 165–173. https://doi.org/10.1038/s41431-019-0508-0 (2020).
2. Viteri, J. et al. Enfermedades huérfanas. Arch. Ven. Farm. Terap. 39, 627–636. https://doi.org/10.5281/ZENODO.4263347 (2020).
3. Suárez-Obando, F. La atención clínica de las enfermedades raras: Un reto para la educación médica. Med. BA 40, 228–241 (2018).
4. Cortés, F. Las enfermedades raras. Rev. Méd. Clín. Cond. 26, 425–431. https://doi.org/10.1016/j.rmclc.2015.06.020 (2015).
5. Schieppati, A., Henter, J.-I., Daina, E. & Aperia, A. Why rare diseases are an important medical and social issue. Lancet 371,
2039–2041. https://doi.org/10.1016/S0140-6736(08)60872-7 (2008).
6. Bannister, J. J. et al. Fully automatic landmarking of syndromic 3D facial surface scans using 2D images. Sensors 20, 3171. https://
doi.org/10.3390/s20113171 (2020).
7. González-Lamuño, D. & García-Fuentes, M. Enfermedades de base genética. An. Sist. San. Nav. 31, 105–126 (2008).
8. Gülbakan, B. et al. Discovery of biomarkers in rare diseases: innovative approaches by predictive and personalized medicine.
EPMA J. 7, 1–6. https://doi.org/10.1186/s13167-016-0074-2 (2016).
9. Gurovich, Y. et al. Identifying facial phenotypes of genetic disorders using deep learning. Nat. Med. 25, 60–64. https://doi.org/10.
1038/s41591-018-0279-0 (2019).
10. Hallgrímsson, B. et al. Automated syndrome diagnosis by three-dimensional facial imaging. Gen. Med. 22, 1682–1693. https://
doi.org/10.1038/s41436-020-0845-y (2020).
11. Farrera, A. et al. Ontogeny of the facial phenotypic variability in Mexican patients with 22q11.2 deletion syndrome. Hea. Fac. Med.
15, 29. https://doi.org/10.1186/s13005-019-0213-9 (2019).
12. Martínez-Abadías, N. et al. FGF/FGFR signaling coordinates skull development by modulating magnitude of morphological
integration: Evidence from Apert syndrome mouse models. PLoS ONE 6, e26425. https://doi.org/10.1371/journal.pone.0026425
(2011).
13. Richtsmeier, J. T. & Flaherty, K. Hand in glove: Brain and skull in development and dysmorphogenesis. Act. Neu. 125, 469–489.
https://doi.org/10.1007/s00401-013-1104-y (2013).
14. Hallgrímsson, B. et al. Morphometrics, 3D imaging, and craniofacial development. Curr. Top. Dev. Bio. 115, 561–597. https://doi.
org/10.1016/bs.ctdb.2015.09.003 (2015).
15. Kouskoura, T. et al. The genetic basis of craniofacial and dental abnormalities. Riv. Men. Svi. Odon. Sto. 121, 636–646 (2011).
16. Jones, K.L., Jones, M.C., & Campo, M. Smith’s recognizable patterns of human malformation (ed. Elsevier Health Sciences) (Amsterdam, 2021).
17. Aase, J.M. The physical examination in dysmorphology in Diagnostic dysmorphology (ed. Plenum Medical Book Company) 33–42
(New York and London, 1990).
18. Johannes, M., Clara, V., Hubert, C. & Raoul, H. Phenotypic abnormalities: Terminology and classification. Am. J. Med. Gen. 123A,
211–230. https://doi.org/10.1002/ajmg.a.20249 (2003).
19. Reardon, W. & Donnai, D. Dysmorphology demystified. Arch. Dis. Child. Fet. Neo. 92, F225–F229. https://doi.org/10.1136/adc.
2006.110619 (2007).
20. Hammond, P. et al. 3D analysis of facial morphology. Am. J. Med. Gen. 126A, 339–348. https://doi.org/10.1002/ajmg.a.20665
(2004).
21. Hammond, P. The use of 3D face shape modelling in dysmorphology. Arch. Dis. Child. 92, 1120–1126. https://doi.org/10.1136/adc.2006.103507 (2007).
22. Hammond, P. & Suttie, M. Large-scale objective phenotyping of 3D facial morphology. Hum. Mut. 33, 817–825. https://doi.org/10.1002/humu.22054 (2012).
23. Hurst, A. C. E. Facial recognition software in clinical dysmorphology. Curr. Op. Ped. 30, 701–706. https://doi.org/10.1097/MOP.0000000000000677 (2018).
24. Köhler, S. et al. Expansion of the Human Phenotype Ontology (HPO) knowledge base and resources. Nuc. Ac. Res. 47, D1018–D1027. https://doi.org/10.1093/nar/gky1105 (2019).
25. Agbolade, O., Nazri, A., Yaakob, R., Ghani, A. A. & Cheah, Y. K. Down syndrome face recognition: A review. Symmetry. 12, 1182.
https://doi.org/10.3390/sym12071182 (2020).
26. Hsieh, T. C. et al. GestaltMatcher facilitates rare disease matching using facial phenotype descriptors. Nat. Gen. 54, 349–357.
https://doi.org/10.1038/s41588-021-01010-x (2022).
27. Xiong, Z. et al. Novel genetic loci affecting facial shape variation in humans. Elife 8, e49898. https://doi.org/10.7554/eLife.49898 (2019).
28. Qiao, L. et al. Genome-wide variants of Eurasian facial shape differentiation and a prospective model of DNA based face prediction. J. Gen. Gen. 45, 419–432. https://doi.org/10.1016/j.jgg.2018.07.009 (2018).
29. Martínez-Abadías, N. et al. Phenotypic evolution of human craniofacial morphology after admixture: a geometric morphometrics
approach. Am. J. Phys. Anth. 129, 387–398. https://doi.org/10.1002/ajpa.20291 (2006).
30. Quinto-Sánchez, M. et al. Facial asymmetry and genetic ancestry in Latin American admixed populations. Am. J. Phys. Anth. 157,
58–70. https://doi.org/10.1002/ajpa.22688 (2015).
31. Ruiz-Linares, A. et al. Admixture in Latin America: geographic structure, phenotypic diversity and self-perception of ancestry
based on 7,342 individuals. PLoS Gen. 10, e1004572. https://doi.org/10.1371/journal.pgen.1004572 (2014).
32. Sheehan, M. J. & Nachman, M. W. Morphological and population genomic evidence that human faces have evolved to signal
individual identity. Nat. Commun. 5, 4800. https://doi.org/10.1038/ncomms5800 (2014).
33. Kruszka, P. et al. 22q11.2 deletion syndrome in diverse populations. Am. J. Med. Gen. A 173, 879–888. https://doi.org/10.1002/
ajmg.a.38199 (2017).
34. Kruszka, P. et al. Noonan syndrome in diverse populations. Am. J. Med. Gen Part A. 173, 2323–2334. https://doi.org/10.1002/
ajmg.a.38362 (2017).
35. Kruszka, P. et al. Down syndrome in diverse populations. Am. J. Med. Gen. Part A. 173, 42–53. https://doi.org/10.1002/ajmg.a.
38043 (2017).
36. Dowsett, L. et al. Cornelia de Lange syndrome in diverse populations. Am. J. Med. Gen. A 179, 150–158. https://doi.org/10.1002/
ajmg.a.61033 (2019).
37. Mendoza-Revilla, J. et al. Disentangling signatures of selection before and after European colonization in Latin Americans. Mol.
Biol. Ev. 39, msac076. https://doi.org/10.1093/molbev/msac076 (2022).
38. Ardelean, C. F. et al. Evidence of human occupation in Mexico around the Last Glacial Maximum. Nature 584, 87–92. https://doi.
org/10.1038/s41586-020-2509-0 (2020).
39. Becerra-Valdivia, L. & Higham, T. The timing and effect of the earliest human arrivals in North America. Nature 584, 93–97.
https://doi.org/10.1038/s41586-020-2491-6 (2020).
40. Castro E Silva, M. A., Ferraz, T., Bortolini, M. C., Comas, D. & Hünemeier, T. Deep genetic affinity between coastal Pacific and
Amazonian natives evidenced by Australasian ancestry. Proc. Nat. Ac. Sci. USA 118, 1. https://doi.org/10.1073/pnas.2025739118
(2021).
41. González-José, R. et al. Craniometric evidence for Palaeoamerican survival in Baja California. Nature 425, 62–65. https://doi.org/
10.1038/nature01816 (2003).
42. Salzano, F. M. & Bortolini, M. C. The Evolution and Genetics of Latin American Populations 512 (Cambridge University Press,
Cambridge, 2002).
43. Salzano, F. M. & Sans, M. Interethnic admixture and the evolution of Latin American populations. Gen. Mol. Biol. 37, 151–170.
https://doi.org/10.1590/s1415-47572014000200003 (2014).
44. Urrea-Giraldo, F. & Álvarez, A. F. C. Cali an enlarged region city: an approximation from the ethnic-racial dimension and population flows. Rev. Soc. Ec. UV. 33, 145–174. https://doi.org/10.25100/sye.v0i33.5628 (2017).
45. Pachajoa, H. et al. Molecular characterization of mucopolysaccharidosis type IVA patients in the Andean region of Colombia. Am.
J. Med. Gen. Part C. 187, 388–395. https://doi.org/10.1002/ajmg.c.31936 (2021).
46. Herrera, L. M. C., Martínez, A. V., López, N. M., Téllez, J. M. & Contreras, X. D. M. Síndrome de Morquio, enfermedad de interés
para la odontopediatría. Presentación de un caso. Rev. Ped. Elec. 14, 2–11 (2017).
47. Sawamoto, K. et al. Mucopolysaccharidosis IVA: Diagnosis, treatment, and management. Int. J. Mol. Sci. 21, 1517. https://doi.org/
10.3390/ijms21041517 (2020).
48. Suárez-Guerrero, J. L., Suárez, A. K. B., Santos, M. C. V. & Contreras-García, G. A. Caracterización clínica, estudios genéticos, y
manejo de la Mucopolisacaridosis tipo IV A. Med. UIS. 26, 43–50 (2013).
49. Hernández-Martín, A. & Torrelo, A. Rasopathies: Developmental disorders that predispose to cancer and skin Manifestations.
Act. Dermo-Sifiliográficas. 102, 402–416. https://doi.org/10.1016/j.adengl.2011.02.002 (2011).
50. King, D. E. Dlib-ml: A machine learning toolkit. J. Mach. Learn. Res. 10, 1755–1758 (2009).
51. Stull, K. E., Tise, M. L., Ali, Z. & Fowler, D. R. Accuracy and reliability of measurements obtained from computed tomography 3D
volume rendered images. Foren. Sci. Int. 238, 133–140. https://doi.org/10.1016/j.forsciint.2014.03.005 (2014).
52. Lele, S. R. & Richtsmeier, J. T. Euclidean Distance Matrix Analysis: A coordinate-free approach for comparing biological shapes
using landmark data. Am. J. Phys. Anth. 86, 415–427 (1991).
53. Rohlf, F. J. & Slice, D. Extensions of the Procrustes method for the optimal superimposition of landmarks. Syst. Biol. 39, 40–59
(1990).
54. Lele, S. R. & Cole, T. A new test for shape differences when variance-covariance matrices are unequal. J. Hum. Evo. 31, 193–212
(1996).
55. Starbuck, J. M. et al. Green tea extracts containing epigallocatechin-3-gallate modulate facial development in Down syndrome.
Sci. Rep. 11, 4715. https://doi.org/10.1038/s41598-021-83757-1 (2021).
56. Ferry, Q. et al. Diagnostically relevant facial gestalt information from ordinary photos. Elife 3, e02020. https://doi.org/10.7554/
eLife.02020 (2014).
57. Allanson, J. E. et al. The face of Noonan syndrome: Does phenotype predict genotype. Am. J. Med. Gen. 152A, 1960–1966. https://
doi.org/10.1002/ajmg.a.33518 (2010).
58. Terrazas, K., Dixon, J., Trainor, P. A. & Dixon, M. J. Rare syndromes of the head and face: mandibulofacial and acrofacial dysostoses.
Wiley Interd. Rev. Dev. Biol. 6, 263. https://doi.org/10.1002/wdev.263 (2017).
59. Starbuck, J. M., Cole, T. M., Reeves, R. H. & Richtsmeier, J. T. Trisomy 21 and facial developmental instability. Am. J. Phys. Anth.
151, 49–57. https://doi.org/10.1002/AJPA.22255 (2013).
60. Athota, J. P. et al. Molecular and clinical studies in 107 Noonan syndrome affected individuals with PTPN11 mutations. BMC.
Med. Gen. 21, 50. https://doi.org/10.1186/s12881-020-0986-5 (2020).
61. Khosrotehrani, K., Bastuji-Garin, S., Zeller, J., Revuz, J. & Wolkenstein, P. Clinical risk factors for mortality in patients with Neurofibromatosis 1: A cohort study of 378 patients. Arch. Derm. 139, 187–191. https://doi.org/10.1001/archderm.139.2.187 (2003).
62. Glasson, E. J. et al. The changing survival profile of people with Down’s syndrome: Implications for genetic counselling. Clin. Gen.
62, 390–393. https://doi.org/10.1034/j.1399-0004.2002.620506.x (2002).
63. Roper, R. & Reeves, R. Understanding the basis for Down syndrome phenotypes. PLoS Gen. 2, e50. https://doi.org/10.1371/journal.pgen.0020050 (2006).
64. Patterson, D. Molecular genetic analysis of Down syndrome. Hum. Gen. 126, 195–214. https://doi.org/10.1007/s00439-009-0696-8
(2009).
65. Aivazidis, S. et al. The burden of trisomy 21 disrupts the proteostasis network in Down syndrome. PLoS ONE 12, e0176307. https://doi.org/10.1371/journal.pone.0176307 (2017).
66. Laignier, M. R., Lopes-Júnior, L. C., Santana, R. E., Leite, F. M. C. & Brancato, C. L. Down syndrome in Brazil: Occurrence and
associated factors. Int. J. Env. Res. Pub. He. 18, 11954. https://doi.org/10.3390/ijerph182211954 (2021).
67. Hernández Ramírez, I. & Manrique Hernández, R. D. Prevalencia de síndrome de Down en CEHANI-ESE, San Juan de Pasto
Colombia 1998–2003. Nova 4, 50–56. https://doi.org/10.22490/24629448.347 (2006).
68. Valencia Arana, C. A. et al. Prevalencia al nacimiento de síndrome de Down en la ciudad de Manizales (Caldas-Colombia) durante
el periodo 2004–2005. Biosalud. 69. https://link.gale.com/apps/doc/A258132055/IFME?u=anon~ab6dcaef&sid=googleScholar&
xid=7f6e25b7 (2008).
69. Korayem, M. & Bakhadher, W. Craniofacial manifestations of Down syndrome: A review of literature. Ac. J. Sci. Res. 3, 176–181.
https://doi.org/10.15413/ajsr.2019.0502 (2019).
70. Hennequin, M., Faulks, D., Veyrune, J.-L. & Bourdiol, P. Significance of oral health in persons with Down syndrome: A literature
review. Dev. Med. Child. Neu. 41, 275–283. https://doi.org/10.1111/j.1469-8749.1999.tb00599 (1999).
71. Oliveira, A. C. B., Paiva, S. M., Campos, M. R. & Czeresnia, D. Factors associated with malocclusions in children and adolescents
with Down syndrome. Am. J. Orth Dent. Orth. 133, 489-e1 (2008).
72. Vicente, A. et al. Craniofacial morphology in down syndrome: A systematic review and meta-analysis. Sci Rep 10, 19895. https://
doi.org/10.1038/s41598-020-76984-5 (2020).
73. Suárez-Guerrero, J. L., Gómez Higuera, P. J. I., Arias Flórez, J. S. & Contreras-García, G. A. Mucopolisacaridosis: Características
clínicas, diagnóstico y de manejo. Rev. Chil. Ped. 87, 295–304. https://doi.org/10.1016/j.rchipe.2015.10.004 (2016).
74. Ortiz-Quiroga, D., Ariza-Araújo, Y. & Pachajoa, H. Calidad de vida familiar en pacientes con síndrome de Morquio tipo IV-A.
Una mirada desde el contexto colombiano (Suramérica). Rehabilitación. 52, 230–237. https://doi.org/10.1016/j.rh.2018.07.002/
(2018).
75. Tomatsu, S. et al. Mutation and polymorphism spectrum of the GALNS gene in mucopolysaccharidosis IVA (Morquio A). Hum.
Mut. 26, 500–512. https://doi.org/10.1002/humu.20257 (2005).
76. Morrone, A. et al. Molecular testing of 163 patients with Morquio A (Mucopolysaccharidosis IVA) identifies 39 novel GALNS
mutations. Mol. Gen. Metab. 112, 160–170. https://doi.org/10.1016/j.ymgme.2014.03.004 (2014).
77. Zanetti, A. et al. Molecular basis of mucopolysaccharidosis IVA (Morquio A syndrome): A review and classification of GALNS
gene variants and reporting of 68 novel variants. Hum. Mut. 42, 1384–1398. https://doi.org/10.1002/humu.24270 (2021).
78. Lores, J., Prada, C. E., Ramírez-Montaño, D., Nastasi-Catanese, J. A. & Pachajoa, H. Clinical and molecular analysis of 26 individuals with Noonan syndrome in a reference institution in Colombia. Am. J. Med. Gen. Part C. 184, 1042–1051. https://doi.org/10.
1002/ajmg.c.31869 (2020).
79. Visnapuu, V., Peltonen, S., Alivuotila, L., Happonen, R.-P. & Peltonen, J. Craniofacial and oral alterations in patients with Neurofibromatosis 1. Orph. J. Rar. Dis. 13, 131. https://doi.org/10.1186/s13023-018-0881-8 (2018).
80. Park, S., Kim, J., Song, T.-Y. & Jang, D.-H. Case Report: The success of face analysis technology in extremely rare genetic diseases
in Korea: Tatton–Brown–Rahman syndrome and Say-Barber –Biesecker–Young–Simpson variant of ohdo syndrome. Front. Gen.
13, 903199. https://doi.org/10.3389/fgene.2022.903199 (2022).
81. Pascolini, G., Calvani, M. & Grammatico, P. First Italian experience using the automated craniofacial gestalt analysis on a cohort
of pediatric patients with multiple anomaly syndromes. It. J. Ped. 48, 91. https://doi.org/10.1186/s13052-022-01283-w (2022).
82. Aldridge, K. et al. Brain phenotypes in two FGFR2 mouse models for Apert syndrome. Dev. Dyn. 239, 987–997. https://doi.org/
10.1002/dvdy.22218 (2010).
83. Enlow, D.H., & Hans, M.G. Essentials of facial growth (ed. Saunders) (Saunders, 1996).
84. Martínez-Abadías, N. et al. Facial Biomarkers Detect Gender-Specific Traits for Bipolar Disorder. FASEB. J. 35. https://doi.org/
10.1096/fasebj.2021.35.S1.03695 (2021).
85. Adhikari, K., Chacón-Duque, J. C., Mendoza-Revilla, J., Fuentes-Guajardo, M. & Ruiz-Linares, A. The Genetic Diversity of the
Americas. Ann. Rev. Gen. Hum. Gen. 18, 277–296. https://doi.org/10.1146/annurev-genom-083115-022331 (2017).
86. Conley, A. B. et al. A comparative analysis of genetic ancestry and admixture in the Colombian Populations of Chocó and Medellín.
G3 (Bethesda, Md) 7, 3435–3447. https://doi.org/10.1534/g3.117.1118 (2017).
Acknowledgements
We are grateful for the voluntary collaboration of all participants, including children and their families. We are
thankful to Colegio Ecológico Scout and Universidad Icesi for granting us permission to organize the photographic sessions in Colombia; and Dr. Nelläker for help with accessing the European database. We thank Max
Rubert for technical photographic assistance. We also thank the reviewers and editor for their insightful comments, which have greatly improved the quality of our manuscript. We acknowledge support from Proyecto
COL0012168-1097 Interfacultades-ICESI, Grup de Recerca Consolidat (2021 SGR 00706), and Biological
Anthropological Master UB-UAB.
Author contributions
L.M.E., E.C., H.P. and N.M.A. designed the study and wrote the manuscript; L.M.E., E.C., E.G., P.S., D.R., D.O,
J.C.C., H.P. and N.M.A. organized and performed data collection; L.M.E., E.C., A.G, X.S. and N.M.A. performed
data analysis and prepared the figures. All authors reviewed the manuscript.
Competing interests
The authors declare no competing interests.
Additional information
Supplementary Information The online version contains supplementary material available at https://doi.org/
10.1038/s41598-023-33374-x.
Correspondence and requests for materials should be addressed to N.M.-A.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International
License, which permits use, sharing, adaptation, distribution and reproduction in any medium or
format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons licence, and indicate if changes were made. The images or other third party material in this
article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the
material. If material is not included in the article’s Creative Commons licence and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from
the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
© The Author(s) 2023
Scientific Reports | (2023) 13:6869 | https://doi.org/10.1038/s41598-023-33374-x
Received: 8 December 2020 | Accepted: 10 February 2021
DOI: 10.1002/mgg3.1636
ORIGINAL ARTICLE
Objective differential diagnosis of Noonan and Williams–Beuren
syndromes in diverse populations using quantitative facial
phenotyping
Antonio R. Porras1,2 | Marshal Summar3 | Marius George Linguraru1,4
1 Sheikh Zayed Institute for Pediatric Surgical Innovation, Children’s National Hospital, Washington, D.C., USA
2 Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
3 Rare Disease Institute – Genetics and Metabolism, Children’s National Hospital, Washington, D.C., USA
4 School of Medicine and Health Sciences, George Washington University, Washington, D.C., USA
Correspondence
Antonio R. Porras, Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Fitzsimons Building, 4th Floor, 13001 E. 17th Place, Aurora, CO 80045, USA.
Email: antonio.porras@cuanschutz.edu
Abstract
Introduction: Patients with Noonan and Williams–Beuren syndrome present similar
facial phenotypes modulated by their ethnic background. Although distinctive facial
features have been reported, studies show a variable incidence of those characteristics in populations with diverse ancestry. Hence, a differential diagnosis based on
reported facial features can be challenging. Although accurate diagnoses are possible
with genetic testing, such testing is not available in developing and remote regions.
Methods: We used a facial analysis technology to identify the most discriminative
facial metrics between 286 patients with Noonan syndrome and 161 with Williams–Beuren syndrome from diverse ethnic backgrounds. We quantified the most discriminative metrics, and their ranges both globally and in different ethnic groups. We also created
population-based appearance images that are useful not only as clinical references but
also for training purposes. Finally, we trained both global and ethnic-specific machine
learning models with previous metrics to distinguish between patients with Noonan
and Williams–Beuren syndromes.
Results: We obtained a classification accuracy of 85.68% in the global population
evaluated using cross-validation, which improved to 90.38% when we adapted the
facial metrics to the ethnicity of the patients (p = 0.024).
Conclusion: Our facial analysis provided, for the first time, quantitative reference facial metrics for the differential diagnosis of Noonan and Williams–Beuren syndromes
in diverse populations.
KEYWORDS
facial analysis, facial phenotyping, machine learning, Noonan, Williams–Beuren
1
|
IN T RO D U C T IO N
Noonan syndrome is a congenital genetic disorder that affects between 1 per 1000 and 1 per 2500 live births (Noonan,
1994; Nora, 1974), and it is caused by different mutations
in several genes (OMIM #163950, #605275, #609942,
#610733, #611553, #613224, #613706, #615355, #616559,
#616564, #618499, #618624 or #619087). Subjects with
Noonan syndrome typically present characteristic facial features and short stature (Allanson et al., 2010; van der Burgt
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original
work is properly cited.
© 2021 The Authors. Molecular Genetics & Genomic Medicine published by Wiley Periodicals LLC.
Mol Genet Genomic Med. 2021;9:e1636.
https://doi.org/10.1002/mgg3.1636
wileyonlinelibrary.com/journal/mgg3
et al., 1999), and about half have congenital cardiac abnormalities (Noonan, 1994). Although it is generally diagnosed
based on the observation of key features, molecular testing
can provide a confirmation of diagnosis in about 70% of the
cases (Allanson & Roberts, 1993; Bhambhani et al., 2014).
An early diagnosis is important not only for prompt treatment but also for providing genetic counseling to the family.
However, early diagnosis of Noonan syndrome is challenging and late diagnoses are frequent, with reports showing an
average age of diagnosis of 9 years (Sharland et al., 1992).
The differential diagnosis of Noonan syndrome includes
Williams–Beuren syndrome (OMIM #194050) (Allanson,
1987; Morris, 1993), among other disorders. Williams–Beuren
syndrome has a prevalence of about 1 in 7500 live
births (Strømme et al., 2002), and patients with this condition present similar characteristics to patients with Noonan
syndrome, including facial dysmorphology and short stature
(Allanson, 1987; Cassidy & Allanson, 2010; Morris, 1993).
Williams–Beuren syndrome is also associated with congenital heart disease (Morris, 1993, 2010). As both the physical
manifestations and their severity are variable, individuals
with Williams–Beuren syndrome are often undetected during
early childhood, with an average diagnostic age of 3.66 years
(Huang et al., 2002). Diagnostic confirmation of Williams–
Beuren syndrome is often attained using fluorescence in
situ hybridization, but it can also be established using other
techniques such as array comparative genomic hybridization
(Pober, 2010).
Diagnostic tests are typically requested after the identification of signs and symptoms associated with either Noonan
or Williams–Beuren syndrome, and they are often not available in developing countries. In many cases, the examination
is made based only on phenotypical observations and symptoms, which may lead to errors and delays in the correct diagnosis. Although several studies have independently reported
similar facial phenotypes among patients with Noonan and
Williams–Beuren syndrome, there are also studies reporting
distinctive facial features specific to each syndrome (Allanson,
1987; Castelo-Branco et al., 2007; Digilio & Marino, 2001;
Morris & Mervis, 2000; Noonan, 1994; Romano et al., 2010;
Winter et al., 2018; Wu et al., 1999). However, even though
these distinctive observations are often found in patients presenting either Noonan or Williams–Beuren syndromes, they
are not always present and they are modulated by the ethnic
background of the patients (Kruszka, Porras, Addissie, et al.,
2017; Kruszka et al., 2018). An objective and accurate way
to differentiate between these two genetic syndromes can significantly improve the clinical management of these patients
and their outcomes.
In this work, we use a digital facial analysis technology to
objectively quantify and illustrate facial phenotypical differences between patients with Noonan and Williams–Beuren
syndrome. We use our technology to determine a set of
objective metrics that can be used as a reference to help differentiate between these two syndromes. As the phenotype
of genetic syndromes is modulated by the ethnic background
of the patients (Kruszka, Addissie, et al., 2017; Kruszka,
Porras, Addissie, et al., 2017; Kruszka et al., 2018; Kruszka,
Porras, Sobering, et al., 2017), we also present the metrics
that are relevant for patient populations from four different
ethnic groups: African descent, Asian, Caucasian, and Latin
American.
1.1 | State of the art
The phenotypical observations of patients with Williams–
Beuren and Noonan syndromes have been studied independently in the literature (Allanson, 1987, 2016; Kruszka,
Porras, Addissie, et al., 2017; Kruszka et al., 2018; Morris,
1993, 2010; Noonan, 1994; Roberts et al., 2013). Some
studies have reported similar facial observations among
patients with either of those syndromes: hypertelorism
(Allanson, 1987; Levin & Enzenauer, 2017; Noonan, 1994;
Wu et al., 1999), telecanthus (Castelo-Branco et al., 2007;
Chen, 2012; Morris & Mervis, 2000; Romano et al., 2010),
ptosis (Allanson, 2016; Digilio & Marino, 2001; Winter
et al., 2018), epicanthal folds (Allanson, 2016; Kruszka
et al., 2018; Morris, 1993; Roberts et al., 2013), and short
nose (Allanson, 2016; Kruszka et al., 2018; Morris, 1993;
Roberts et al., 2013). However, other studies have reported
distinctive facial features between patients with Williams–
Beuren and Noonan syndromes. Patients with Noonan syndrome are often described as presenting low-set ears and
widely spaced eyes (Bertola et al., 2006; Essawi et al., 2013;
Kruszka, Porras, Addissie, et al., 2017; Rokhaya et al., 2014;
Şimşek-Kiper et al., 2013), whereas patients with Williams–
Beuren syndrome are described as presenting a short nose
and a wide mouth (Kruszka et al., 2018; Patil et al., 2012;
Pérez Jurado et al., 1996). Other discriminative facial features reported include down-slanted palpebral fissures in patients with Noonan syndrome (Bertola et al., 2006; Essawi
et al., 2013; Hung et al., 2007; Kruszka, Porras, Addissie,
et al., 2017; Şimşek-Kiper et al., 2013) and a long philtrum
in patients with Williams–Beuren syndrome (Kruszka et al.,
2018; Patil et al., 2012; Pérez Jurado et al., 1996). However,
as given in Table 1, variable reports on the incidence of
these observations suggest that those characteristics are not
discriminative for an accurate differential diagnosis based
on physical observations between Noonan and Williams–
Beuren syndromes. Only 17% of the patients with Noonan
syndrome in the study from Senegal (Rokhaya et al., 2014) and 58%
of the patients in the study from Turkey (Şimşek-Kiper et al., 2013)
were reported as presenting low-set ears. When patients
with Noonan syndrome were stratified based on their ethnic
background (Kruszka, Porras, Addissie, et al., 2017), 82% of
African descent, 94% of Asian, and 88% of Latin American
patients presented low-set ears. Similarly, the reported incidence of widely spaced eyes in patients with Noonan syndrome ranged from the 44% reported (Bertola et al., 2006) in
a Brazilian population to the 100% reported (Rokhaya et al.,
2014) for a patient population from Senegal, and (Hung et al.,
2007) for a population from Taiwan.

TABLE 1 Reported incidence of discriminative facial features between patients with Noonan and Williams–Beuren syndromes in different studies and populations

Noonan syndrome
Study | Population | Low ears | Down-slanted eyes | Widely spaced eyes | Epicanthal folds
Rokhaya et al. (2014) | Senegal | 17% | Not reported | 100% | Not reported
Şimşek-Kiper et al. (2013) | Turkey | 58% | 73% | 85% | Not reported
Essawi et al. (2013) | Egypt | 57% | 100% | 100% | Not reported
Hung et al. (2007) | Taiwan | Not reported | 59% | Not reported | 56%
Bertola et al. (2006) | Brazil | Not reported | 66% | 44% | Not reported
Yoshida et al. (2004) | Japan | Not reported | Not reported | 100% | Not reported
Kruszka, Porras, Addissie, et al. (2017) | African | 82% | 87% | 80% | 70%
Kruszka, Porras, Addissie, et al. (2017) | Asian | 94% | 86% | 96% | 64%
Kruszka, Porras, Addissie, et al. (2017) | Latin American | 88% | 73% | 94% | 55%

Williams–Beuren syndrome
Study | Population | Wide mouth | Short nose | Long philtrum | Epicanthal folds
Patil et al. (2012) | India | 100% | 100% | 85% | 52%
Pérez Jurado et al. (1996) | Mixed | Not reported | 90% | 83% | 71%
Kruszka et al. (2018) | African | 88% | 88% | 88% | 13%
Kruszka et al. (2018) | Asian | 78% | 75% | 79% | 63%
Kruszka et al. (2018) | Latin American | 91% | 74% | 93% | 73%

23249269, 2021, 5, Downloaded from https://onlinelibrary.wiley.com/doi/10.1002/mgg3.1636 by CAPES, Wiley Online Library on [01/11/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
On the other hand, only 78% of the Asian population with
Williams–Beuren syndrome (Kruszka et al., 2018) presented
a wide mouth, as compared to the 100% reported (Patil et al.,
2012) for an Indian population. When looking at the nose
size, 100% of patients from India presented a short nose
(Patil et al., 2012), compared with 74% of Latin American patients
(Kruszka et al., 2018).
To the best of our knowledge, quantitative methods to distinguish between patients with Noonan and Williams–Beuren
syndrome have been explored only in the study by Preus
(2008). In that study, a clustering analysis showed
that patients with Noonan and Williams–Beuren syndrome
are clinically distinguishable. However, that study focused
on many clinical observations that are not easily observable.
For instance, cardiac abnormalities cannot be observed without specialized equipment, which may not be available in
rural areas and developing countries. Similarly, although
family history information is essential for an early diagnosis,
it is sometimes unknown to the clinical team. In addition, that
previous study analyzed a small population of patients, did
not provide objective metrics that can be translated into direct
clinical use, and did not consider the ethnic variability of
the patients.
In the current study, we provide reference facial metrics
adapted to the ethnic background of the patients that can
be used directly at any clinic. In addition, we illustrate facial appearance features that can be quantified by computer
methods, but only qualitatively assessed by the human eye,
and which are relevant to differentiate between Noonan and
Williams–Beuren syndrome. To the best of our knowledge,
this is the first time that facial analysis technology is used
to quantify and illustrate graphically on population-based
computer-generated images the specific facial features that
allow for the distinction of these two genetic syndromes in
diverse populations, in addition to providing reference geometric measurements.
2 | METHODS
2.1 | Data
We evaluated the face photographs of 286 (49 infants, 47
toddlers, 71 children, 28 adolescents, and 91 adults; 150
male and 136 female) individuals with Williams–Beuren
syndrome from 19 countries, and 161 (45 infants, 29 toddlers, 47 children, 18 adolescents, and 22 adults; 93 male
and 68 female) patients with Noonan syndrome from 14
countries. All participants were diagnosed with molecular
testing and/or clinical evaluation by local expert geneticists.
Verbal or written formal consent from the parent/guardian
was obtained by local institutional review boards and the
protocol #7134 at the Children's National Hospital. A subset
of this dataset is publicly available through the “Atlas of
Human Malformation Syndromes in Diverse Populations” of
the National Human Genome Research Institute–National
Institutes of Health (Muenke et al., 2016). Clinical findings
and additional details on these data can be found in previous studies (Kruszka, Porras, Addissie, et al., 2017; Kruszka
et al., 2018). We categorized the patients into four groups:
African descent (28 patients with Williams–Beuren and 35
with Noonan syndrome), Asian (26 patients with Williams–
Beuren and 40 with Noonan syndrome), Caucasian (121 patients with Williams–Beuren and 40 with Noonan syndrome),
or Latin American (111 patients with Williams–Beuren and
46 with Noonan syndrome). In this study, we only included
those patients whose face photographs were frontal, with
eyes open, and with even illumination conditions. We discarded all pictures with illumination artifacts or shadows that
could affect the appearance of the face. We also discarded
pictures in which any part of the face was not totally visible
(e.g., glasses, hair over the eyes).
2.2 | Facial analysis
The facial analysis methods used in this study are based on
the technology previously described (Cerrolaza et al., 2016;
Ojala et al., 1996). We have used that technology to distinguish individuals with
Down (Kruszka, Porras, Sobering, et al., 2017), 22q11.2 deletion (Kruszka, Addissie, et al., 2017), Noonan (Kruszka,
Porras, Addissie, et al., 2017), and Williams–Beuren syndromes (Kruszka et al., 2018) from healthy individuals in
diverse populations.
2.2.1 | Quantification of facial features
Our face analysis technology quantifies a set of geometric
measurements (i.e., distances and angles) from 44 anatomical
facial landmarks (e.g., lateral canthi, oral commissures…).
The location of each of the landmarks and the geometric
measurements is represented in Figure 1. We estimated the
average of the measurements on the right and left sides of the
face to obtain symmetric metrics that are easier to interpret
and to use as clinical references, and their absolute differences to quantify asymmetry. All horizontal measurements
were normalized with respect to the ear-to-ear distance, and
all vertical measurements were normalized to the distance
between the mid-point between the oral commissures and the
nose root. Asymmetry measurements were normalized with
respect to the average value from the measurements at the
left and right sides. In addition, our technology quantifies the
appearance around each of a subset of 33 inner facial landmarks using texture descriptors based on local binary patterns (LBP) as represented in Figure 2 (Cerrolaza et al., 2016;
Ye et al., 2005), which are sensitive to lines, shadows, and
local intensity contrast.

FIGURE 1 Representation of the facial landmarks and geometric
metrics. Inner facial landmarks are represented as red circles.
Horizontal distances between these landmarks are represented as blue
lines. Vertical distances are represented as magenta lines. Angles
are represented with green dashed lines, with the center of the angle
represented as a green circle around the landmark, and the extremes
represented with a green dot inside the landmark
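The two kinds of metrics described in this section can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the function names and numeric values are hypothetical, and `lbp_code` is the basic pixel-level 8-neighbor LBP operator, a simplification of the multi-resolution patch comparison described for Figure 2.

```python
from statistics import mean


def normalize_horizontal(distance, ear_to_ear):
    """Express a horizontal distance as a fraction of the ear-to-ear distance."""
    return distance / ear_to_ear


def symmetric_and_asymmetry(left, right):
    """Return (mean of both sides, absolute difference normalized by that mean)."""
    avg = mean([left, right])
    return avg, abs(left - right) / avg


def lbp_code(image, row, col):
    """Basic 8-neighbor local binary pattern code for one interior pixel.

    Each neighbor contributes one bit: 1 if it is at least as bright as the
    center pixel, 0 otherwise. Neighbors are visited clockwise from top-left.
    """
    center = image[row][col]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


# Hypothetical measurements in pixels (illustrative values only).
palpebral_left, palpebral_right, ear_to_ear = 30.0, 34.0, 160.0
avg, asym = symmetric_and_asymmetry(palpebral_left, palpebral_right)
print(normalize_horizontal(avg, ear_to_ear))  # 0.2
print(asym)                                   # 0.125

# Hypothetical 3x3 grayscale patch centered on a landmark.
patch = [[5, 9, 1],
         [4, 6, 7],
         [2, 8, 3]]
print(lbp_code(patch, 1, 1))                  # 42
```

Dividing by a face-level reference distance makes the metrics independent of image scale, which is why the symmetric average and the asymmetry value are both reported as ratios rather than pixel counts.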
2.2.2 | Feature selection and classification
Once all geometric and appearance metrics were calculated, we selected the most discriminative ones between
Noonan and Williams–Beuren syndrome using recursive
feature elimination (Guyon et al., 2002) based on a support vector machine (SVM) classifier (Cortes & Vapnik,
1995). To compensate for the different number of patients
with Noonan and Williams–Beuren syndromes, we used a
weighting scheme (Du & Chen, 2005) that balanced the
contribution of each individual to the SVM classifier,
so that the total weight of the patients with Noonan
and Williams–Beuren syndrome was the same. We evaluated our classifier using leave-one-out cross-validation
(Devijver & Kittler, 1982) for increasing numbers of features, and we selected the optimal number as the minimum number
of features at which the area under the receiver operating characteristic curve converged (Bradley, 1997). In addition to
the optimal list of features obtained, we also estimated the
individual discriminative power of each feature using the
non-parametric Mann–Whitney U test (Mann & Whitney,
1947).

FIGURE 2 Representation of the image patches used to calculate the local binary patterns (LBP) around the medial canthus of the right eye. (a)
The area around the landmark that is involved in the calculation of the LBPs at the three resolutions, in yellow for the highest resolution (R1), green
for a medium resolution (R2), and blue for the lowest resolution (R3). (b), (c), and (d) illustrate the image patches involved in the calculation of the
LBP at resolution levels R1, R2, and R3, respectively. At each level, the LBPs are calculated by comparing the image patch around the landmark
(in red) with the patches in their neighborhood (in yellow for R1, green for R2, and blue for R3)

TABLE 2 Interpretation of the quantitative results in the global population
Region | Significant differences: Noonan | Significant differences: Williams–Beuren | Relevant differences: Noonan | Relevant differences: Williams–Beuren
Eyes | More pronounced hypertelorism and telecanthus | More pronounced down-slanted palpebral fissures | Higher orbital rim | Smaller palpebral fissures
Nose | - | Longer nasal alas; shorter nose | More asymmetric nasal bridge | -
Mouth | - | Thicker lower lip; wider mouth | - | -
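The class weighting and leave-one-out evaluation described in Section 2.2.2 can be sketched as follows. This is a hedged illustration rather than the authors' pipeline: the recursive feature elimination step is not shown, a nearest-centroid rule stands in for the SVM, and all function names and toy values are hypothetical. In an actual weighted SVM, the per-sample weights would scale each patient's misclassification penalty; the stand-in classifier below does not use them.

```python
def balanced_weights(labels):
    """Per-sample weights chosen so that every class has the same total weight."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n, k = len(labels), len(counts)
    return [n / (k * counts[y]) for y in labels]


def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier (NOT an SVM): assign x to the closest class centroid."""
    groups = {}
    for xi, yi in zip(train_x, train_y):
        groups.setdefault(yi, []).append(xi)

    def dist2(label):
        rows = groups[label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        return sum((c - v) ** 2 for c, v in zip(centroid, x))

    return min(groups, key=dist2)


def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Class sizes from Section 2.1: 161 Noonan and 286 Williams-Beuren patients.
labels = ["noonan"] * 161 + ["williams_beuren"] * 286
weights = balanced_weights(labels)
print(sum(weights[:161]), sum(weights[161:]))  # both close to 223.5

# Toy two-feature data (hypothetical values, clearly separable).
xs = [[0.10, 0.20], [0.20, 0.10], [0.15, 0.25],
      [0.90, 0.80], [0.80, 0.95], [0.85, 0.90]]
ys = ["a", "a", "a", "b", "b", "b"]
print(leave_one_out_accuracy(xs, ys))  # 1.0
```

With these weights, each Noonan patient counts about 1.39 times and each Williams–Beuren patient about 0.78 times, so the smaller class is not dominated during training.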
We performed the above process to obtain the optimal
list of features that are discriminative in the global population, regardless of the ethnic background of the patients.
Then, we repeated it for each different population, thus
obtaining a list of optimal discriminant features adapted
to the ethnicity of the patients. Finally, we compared the
performance of the global and the ethnic-specific models
in discriminating between Williams–Beuren and Noonan
syndromes.
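The per-feature significance test mentioned in this section (Mann–Whitney U) can be illustrated with a short standard-library sketch. It assumes a normal approximation without tie or continuity correction, which is a simplification; the sample values are hypothetical and the function name is ours, not part of the original analysis.

```python
import math


def mann_whitney_u(a, b):
    """Return (U statistic of sample a, two-sided p-value).

    The p-value uses the normal approximation without tie or continuity
    correction, so it is only indicative for very small samples.
    """
    pooled = sorted((v, g) for g, sample in enumerate((a, b)) for v in sample)
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the 1-based ranks i+1 .. j
        rank_sum_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(a), len(b)
    u = rank_sum_a - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    return u, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical normalized values of one facial metric in two groups.
group_a = [0.31, 0.35, 0.38, 0.40]
group_b = [0.22, 0.25, 0.28, 0.33]
u, p = mann_whitney_u(group_a, group_b)
print(u)  # 15.0
print(p)
```

Because the test is rank-based, it makes no assumption about the distribution of each metric, which suits facial measurements whose ranges differ between populations.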
3 | RESULTS
We obtained an average accuracy of 85.68% in the discrimination of patients with Noonan syndrome and Williams–Beuren
syndrome in the global population using the list of 14
optimal facial features identified by our face analysis technology. Specifically, we obtained accuracies of 87.58% and
84.62% in the correct identification of Noonan and Williams–Beuren
syndrome, respectively. The list of optimal geometric
and appearance features, their distribution, and individual p-value
in the global population can be consulted in our supplementary material. The clinical interpretation of those features
is given in Table 2, organized according to the region of the
face at which they were observed: eyes, nose, and mouth.
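As a consistency check, the overall accuracy can be recovered from the two per-syndrome accuracies and the class sizes given in Section 2.1. The sketch below assumes that the number of correctly classified patients can be reconstructed by rounding to the nearest whole patient; the function name is hypothetical.

```python
def overall_accuracy(per_class):
    """per_class maps a label to (reported accuracy, number of patients)."""
    # Assumption: correct counts are recovered by rounding to whole patients.
    correct = {label: round(acc * n) for label, (acc, n) in per_class.items()}
    total = sum(n for _, n in per_class.values())
    return correct, sum(correct.values()) / total


reported = {"Noonan": (0.8758, 161), "Williams-Beuren": (0.8462, 286)}
correct, overall = overall_accuracy(reported)
print(correct)                  # {'Noonan': 141, 'Williams-Beuren': 242}
print(f"{100 * overall:.2f}%")  # 85.68%
```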
We obtained average accuracies of 93.65%, 87.88%,
91.30%, and 89.17% in the African descent, Asian, Caucasian,
and Latin American populations, respectively, when using
population-specific models. As with the global population,
the details of the geometric and appearance facial features
can be consulted in our supplementary material. Table 3
gives our interpretation of the optimal features identified for
each population.
Table 4 gives the accuracy in differentiating between
Noonan and Williams–Beuren syndromes of the models created both for the global population and for each population
included in this study. Similar to our previous works identifying genetic syndromes from a healthy population (Cerrolaza
et al., 2016; Kruszka, Addissie, et al., 2017; Kruszka, Porras,
Addissie, et al., 2017; Kruszka et al., 2018; Kruszka, Porras,
Sobering, et al., 2017; Zhao et al., 2014), we obtained improved results when we adapted our technology to specific
ethnic groups. On average, we obtained an improvement of
5.49% when using specific models for each ethnicity, with
a p-value of 0.024 estimated using a Fisher's exact test.
However, our results also show that, among the individual ethnic groups, the improvement is
statistically significant (p < 0.05) only in the Caucasian population, with a p-value of 0.044.

TABLE 3 Interpretation of the quantitative results in the African descent, Asian, Caucasian, and Latin American populations. Characteristics
not observed in the global population are indicated in green
Significant differences (Noonan, Williams–Beuren) | Relevant differences (Noonan, Williams–Beuren)
African descent population
Eyes
• More pronounced hypertelorism
• Smaller palpebral fissures with more significant ptosis
• Smaller palpebral fissures
• More asymmetric palpebral fissures
Nose
• Thicker/more rounded nasal lobe
• More asymmetric nasal alas
Mouth
• Thicker lower lip
• Wider mouth
Asian population
Eyes
• More pronounced down-slanted palpebral fissures
• Smaller palpebral fissures
• More asymmetric palpebral fissures
Nose
• Longer nasal alas
Mouth
• Thicker lower lip
• Wider mouth
• More asymmetric philtrum and cupid's bow
Caucasian population
Eyes
• More pronounced hypertelorism and telecanthus
• More pronounced down-slanted palpebral fissures
Nose
• More asymmetric nasal alas and lobe
Mouth
• Thicker lower lip
• More asymmetric upper lip thickness
• Wider mouth
• Higher orbital rim
• More pronounced ptosis
• Shorter nose
Latin American population
Eyes
• More pronounced hypertelorism
• Higher orbital rim
• Smaller palpebral fissures
Nose
• Shorter nose
Mouth
• Thicker lower lip
• Wider mouth
• More asymmetric lips
• Flatter philtrum and cupid's bow

TABLE 4 Comparison of the accuracy obtained with the global model (trained with all ethnic groups) and with the specific model trained with
a specific ethnic group on each population
Ethnicity | Global model | Ethnicity-specific model | Improvement | p-value*
African descent | 87.30% | 93.65% | 7.27% | 0.363
Asian | 84.85% | 87.88% | 3.57% | 0.800
Caucasian | 83.23% | 91.30% | 9.70% | 0.044
Latin American | 86.62% | 89.17% | 1.91% | 0.727
Global population | 85.68% | 90.38% | 5.49% | 0.024
*p-value calculated using a Fisher's exact test.
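The comparison between the global and the ethnicity-specific models relies on Fisher's exact test. A minimal two-sided version can be written with the standard library, as sketched below. Its application to the Caucasian group is an assumption on our part: the counts of correctly and incorrectly classified patients are reconstructed from the reported accuracies (83.23% and 91.30% of 161 patients), and the exact contingency table used in the study is not given, so the result need not match the reported p-value exactly.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / total

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Classic check: table [[3, 1], [1, 3]] gives 34/70.
print(fisher_exact_two_sided(3, 1, 1, 3))  # 0.4857142857142857

# Assumed counts for the Caucasian group (161 patients per model):
# global model 134 correct / 27 wrong, specific model 147 correct / 14 wrong.
p_caucasian = fisher_exact_two_sided(134, 27, 147, 14)
print(p_caucasian)
```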
4 | DISCUSSION
Despite many phenotypical similarities reported in the literature between patients with Noonan and Williams–Beuren
syndrome (e.g., short stature, ptosis, down-slanted palpebral fissures, cardiac abnormalities) (Allanson, 1987;
Morris, 1993, 2010; Noonan, 1994; Roberts et al., 2013),
our facial analysis demonstrated that these two genetic
conditions can be distinguished in the global population
with accuracy higher than 85% based only on facial observations. Patients with Noonan syndrome present significantly more pronounced hypertelorism and telecanthus,
whereas patients with Williams–Beuren syndrome present
significantly more down-slanted palpebral fissures, a shorter
nose with longer alas, and a wider mouth with a thicker
lower lip. In addition, patients with Noonan syndrome are
likely to have a higher orbital rim and a more asymmetric
nasal bridge, and patients with Williams–Beuren syndrome
often present smaller and less rounded palpebral fissures,
although differences between the two populations in these
observations were not found to be statistically significant
when evaluated individually.
Our results also indicate that the physical manifestations
are modulated by the ethnic background of the patients.
Similar to previous works classifying individuals with genetic syndromes from healthy subjects (Kruszka, Addissie,
et al., 2017; Kruszka, Porras, Addissie, et al., 2017; Kruszka
et al., 2018; Kruszka, Porras, Sobering, et al., 2017), we
obtained a higher classification accuracy when we adapted
the list of relevant discriminative facial features to specific
ethnic groups. Our results show that, although the features
described above are discriminative between Noonan and
Williams–Beuren syndromes in the global population, there
are other features that can be more discriminant on specific
populations, either individually or combined with previous
features.
In the African-descent population, unlike the global
population, the palpebral slanting angle is not essential
to discriminate between Williams–Beuren and Noonan syndrome.
Patients of this ethnic group with Williams–Beuren syndrome often present a more rounded nasal lobe and asymmetric nasal alas, and more asymmetric palpebral fissures.
Importantly, although these features combined were relevant to distinguish patients with Williams–Beuren syndrome
from Noonan syndrome, they were not found to be significantly different between the two populations when evaluated individually.
In the Asian population, a wider mouth with a thicker
lower lip and more down-slanted palpebral fissures were
significant to distinguish patients with Williams–Beuren
syndrome from patients with Noonan syndrome. Moreover,
patients with Williams–Beuren syndrome often showed more
asymmetry in the palpebral fissures and in the cupid's bow
and philtrum, in addition to smaller palpebral fissures and
longer nasal alas. Differences in these features were not statistically significant when compared individually with patients with Noonan syndrome.
We identified discriminative features in the
Caucasian population similar to those found in the global population, except for the nasal observations. Moreover, in this population, patients with Williams–Beuren syndrome presented
significantly more asymmetric nasal alas and lobe than patients with Noonan syndrome, and a significantly more asymmetric upper lip. They often presented a shorter nose as well,
although differences with respect to patients with Noonan
syndrome were not found to be statistically significant.
The Latin American population with Noonan syndrome
showed a significantly higher orbital rim and more pronounced hypertelorism. Patients with Williams–Beuren syndrome presented a significantly wider mouth with a thicker
lower lip, and a shorter nose. They often presented smaller
palpebral fissures and a flatter philtrum and cupid's bow, but
these features were not found to be significantly different between the two populations when evaluated individually.
Although ethnic-specific classification models provided
a higher accuracy compared with the model created from the
global population, this improvement was statistically significant only for patients from the Caucasian population. One
possible explanation for this is a lower phenotypical variability of the Caucasian population used in this work compared with the other ethnic groups. To categorize patients,
we followed the racial and ethnic categories used by the
National Institutes of Health. However, the Asian population
analyzed in this work includes patients from China, India,
and Malaysia, thus introducing a high ethnic variability in
the Asian group. This higher variability makes it difficult to
find ethnic-specific features, which translates into a classification model with an accuracy that is higher on average but not
significantly different from that of the model built from the global population. As more data become available, it will be possible to
focus on the study of more specific populations.
Although many of the discriminant facial observations
between Noonan and Williams–Beuren syndromes found
are consistent among ethnicities (i.e., more significant hypertelorism in patients with Noonan syndrome and wider
mouth in patients with Williams–Beuren syndrome), there
are a few observations that are specific to each ethnic group
and that can be subtle to the human eye. However, they can
be quantified using a systematic analysis as presented in
this work. Our facial analysis technology uses an objective and quantitative approach to identify and stratify facial
phenotypes, which is essential to detect those subtle facial
features that are indicators of genetic conditions. In this
work, we used this technology not only to distinguish patients with Noonan and Williams–Beuren syndromes, but
also to provide reference metrics that can be used in any
clinic. Moreover, these metrics were objectively defined
for different ethnic groups, which resulted in improved accuracy for the potential diagnosis of the syndromes from
phenotypical observations. Our results show the potential
of our facial analysis technology to support the assessment
of patients with genetic syndromes in areas of the world
with diverse populations and where access to specialists is
sometimes limited.
Finally, we also used our technology to create population-based
computer-generated images that illustrate the specific
appearance of relevant facial features for the differential diagnosis of Noonan and Williams–Beuren syndromes. These
images can be used as a reference for the identification of
these syndromes in populations with different ethnic backgrounds, both for training and diagnostic purposes. However,
other observations from clinical evaluation as well as family
history or behavioral observations, if they are available, provide additional information that needs to be considered for a
clinical diagnosis.
ACKNOWLEDGMENTS
Support for this work was partially provided by a philanthropic gift from the Government of Abu Dhabi to Children’s
National Hospital.
CONFLICT OF INTEREST
The authors do not have any conflicts of interest that are relevant to this manuscript.
AUTHOR CONTRIBUTIONS
All authors conceptualized this work together. A.R.P. and
M.G.L. designed the methods. A.R.P. implemented the methods, performed the experiments, and wrote the initial draft
of the manuscript. M.G.L. reviewed the results and revised
the manuscript. M.S. provided the clinical perspective and
revised the results and manuscript.
DATA AVAILABILITY STATEMENT
A subset of the facial photographs used in this study are available through the “Atlas of Human Malformation Syndromes
in Diverse Populations” of the National Human Genome
Research Institute–National Institutes of Health (Muenke
et al., 2016). The discriminative facial metrics between
Noonan and Williams–Beuren syndromes and their ranges in
diverse populations are available as supplementary material
of this article.
ORCID
Antonio R. Porras https://orcid.org/0000-0001-5989-2953
REFERENCES
Allanson, J. E. (1987). Noonan syndrome. Journal of Medical Genetics,
24(1), 9–13. http://www.ncbi.nlm.nih.gov/pubmed/17639592
Allanson, J. E. (2016). Objective studies of the face of Noonan, Cardio-facio-cutaneous,
and Costello syndromes: A comparison of three
disorders of the Ras/MAPK signaling pathway. American Journal
of Medical Genetics Part A, 170(10), 2570–2577.
https://doi.org/10.1002/ajmg.a.37736
Allanson, J. E., Bohring, A., Dörr, H.-G., Dufke, A., Gillessen-Kaesbach,
G., Horn, D., König, R., Kratz, C. P., Kutsche, K., Pauli,
S., Raskin, S., Rauch, A., Turner, A., Wieczorek, D., & Zenker, M.
(2010). The face of Noonan syndrome: Does phenotype predict
genotype. American Journal of Medical Genetics Part A, 152A(8),
1960–1966. https://doi.org/10.1002/ajmg.a.33518
Allanson, J. E., & Roberts, A. E. (1993). Noonan syndrome. In
GeneReviews®. https://www.ncbi.nlm.nih.gov/books/NBK1124/
Bertola, D. R., Pereira, A. C., Albano, L. M. J., De Oliveira, P. S. L.,
Kim, C. A., & Krieger, J. E. (2006). PTPN11 gene analysis in 74
Brazilian patients with Noonan syndrome or noonan-like phenotype. Genetic Testing, 10(3), 186–191.
https://doi.org/10.1089/gte.2006.10.186
Bhambhani, V., & Muenke, M. (2014).
Noonan syndrome. American Family Physician, 89(1), 37–43.
Bradley, A. P. (1997). The use of the area under the ROC curve in the
evaluation of machine learning algorithms. Pattern Recognition,
30(7), 1145–1159. https://doi.org/10.1016/S0031-3203(96)00142-2
Cassidy, S. B., & Allanson, J. E. (2010). Management of genetic syndromes. John Wiley & Sons.
Castelo-Branco, M., Mendes, M., Sebastião, A. R., Reis, A., Soares, M.,
Saraiva, J., Bernardes, R., Flores, R., Pérez-Jurado, L., & Silva,
E. (2007). Visual phenotype in Williams-Beuren syndrome challenges magnocellular theories explaining human neurodevelopmental visual cortical disorders. Journal of Clinical Investigation,
117(12), 3720–3729. https://doi.org/10.1172/JCI32556
Cerrolaza, J. J., Porras, A. R., Mansoor, A., Zhao, Q., Summar, M., &
Linguraru, M. G. (2016). Identification of dysmorphic syndromes
using landmark-specific local texture descriptors. In 2016 IEEE 13th
International Symposium on Biomedical Imaging (ISBI) (pp. 1080–1083).
IEEE. https://doi.org/10.1109/ISBI.2016.7493453
Chen, H. (Ed.). (2012). Noonan syndrome. In Atlas of genetic diagnosis and counseling (pp. 1577–1586). Springer US.
https://doi.org/10.1007/978-1-4614-1037-9_180
Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine
Learning, 20(3), 273–297. https://doi.org/10.1007/BF00994018
Devijver, P. A., & Kittler, J. (1982). Pattern recognition: A statistical approach.
http://www.scopus.com/inward/record.url?eid=2-s2.0-0019926397&partnerID=40
Digilio, M. C., & Marino, B. (2001). Clinical manifestations of Noonan
syndrome. Images in Paediatric Cardiology, 3(2), 19–30.
http://www.ncbi.nlm.nih.gov/pubmed/22368597
Du, S.-X., & Chen, S.-T. (2005). Weighted support vector machine for
classification. In 2005 IEEE International Conference on Systems,
Man and Cybernetics (Vol. 4, pp. 3866–3871).
https://doi.org/10.1109/ICSMC.2005.1571749
Essawi, M. L., Ismail, M. F., Afifi, H. H., Kobesiy, M. M., El Kotoury,
A., & Barakat, M. M. (2013). Mutational analysis of the PTPN11
gene in Egyptian patients with Noonan syndrome. Journal of the
Formosan Medical Association, 112(11), 707–712.
https://doi.org/10.1016/j.jfma.2012.06.002
Guyon, I., Weston, J., Barnhill, S., & Vapnik, V. (2002). Gene selection
for cancer classification using support vector machines. Machine
Learning, 46(1/3), 389–422. https://doi.org/10.1023/A:1012487302797
Huang, L., Sadler, L., O’Riordan, M. A., & Robin, N. H. (2002). Delay
in diagnosis of Williams syndrome. Clinical Pediatrics, 41(4),
257–261. https://doi.org/10.1177/000992280204100410
Hung, C.-S., Lin, J.-L., Lee, Y.-J., Lin, S.-P., Chao, M.-C., & Lo, F.-S.
(2007). Mutational analysis of PTPN11 gene in Taiwanese children with Noonan syndrome. Journal of the Formosan Medical
Association, 106(2), 169–172.
https://doi.org/10.1016/S0929-6646(09)60235-7
Kruszka, P., Addissie, Y. A., McGinn, D. E., Porras, A. R., Biggs, E.,
Share, M., Crowley, T. B., Chung, B. H. Y., Mok, G. T. K., Mak,
C. C. Y., Muthukumarasamy, P., Thong, M.-K., Sirisena, N. D.,
Dissanayake, V. H. W., Paththinige, C. S., Prabodha, L. B. L.,
Mishra, R., Shotelersuk, V., Ekure, E. N., … Muenke, M. (2017).
22q11.2 deletion syndrome in diverse populations. American
Journal of Medical Genetics Part A, 173(4), 879–888.
https://doi.org/10.1002/ajmg.a.38199
Kruszka, P., Porras, A. R., Addissie, Y. A., Moresco, A., Medrano, S.,
Mok, G. T. K., Leung, G. K. C., Tekendo-Ngongang, C., Uwineza,
A., Thong, M.-K., Muthukumarasamy, P., Honey, E., Ekure, E.
N., Sokunbi, O. J., Kalu, N., Jones, K. L., Kaplan, J. D., Abdul-Rahman,
O. A., Vincent, L. M., … Muenke, M. (2017). Noonan
syndrome in diverse populations. American Journal of Medical
Genetics Part A, 173(9), 2323–2334.
https://doi.org/10.1002/ajmg.a.38362
Kruszka, P., Porras, A. R., de Souza, D. H., Moresco, A., Huckstadt,
V., Gill, A. D., Boyle, A. P., Hu, T., Addissie, Y. A., Mok,
G. T. K., Tekendo-Ngongang, C., Fieggen, K., Prijoles, E. J.,
Tanpaiboon, P., Honey, E., Luk, H.-M., Lo, I. F. M., Thong, M.-K.,
Muthukumarasamy, P., … Muenke, M. (2018). Williams-Beuren
syndrome in diverse populations. American Journal of Medical
Genetics Part A, 176(5), 1128–1136.
https://doi.org/10.1002/ajmg.a.38672
Kruszka, P., Porras, A. R., Sobering, A. K., Ikolo, F. A., La Qua, S.,
Shotelersuk, V., Chung, B. H. Y., Mok, G. T. K., Uwineza, A.,
Mutesa, L., Moresco, A., Obregon, M. G., Sokunbi, O. J., Kalu, N.,
Joseph, D. A., Ikebudu, D., Ugwu, C. E., Okoromah, C. A. N.,
Addissie, Y. A., … Muenke, M. (2017). Down syndrome in diverse populations. American Journal of Medical Genetics Part A,
173(1), 42–53. https://doi.org/10.1002/ajmg.a.38043
Levin, A. V., & Enzenauer, R. W. (2017). The eye in pediatric systemic
disease. Springer International Publishing.
https://books.google.com/books?id=AvIoDwAAQBAJ
Mann, H., & Whitney, D. (1947). On a test of whether one of two random variables is stochastically larger than the other. Annals of
Mathematical Statistics, 18(1), 50–60.
https://doi.org/10.1214/aoms/1177730491
Morris, C. A. (1993). Williams syndrome. In GeneReviews® (pp. 1–22).
http://www.ncbi.nlm.nih.gov/pubmed/20301427
Morris, C. A. (2010). Introduction: Williams syndrome. American
Journal of Medical Genetics Part C: Seminars in Medical Genetics,
154C(2), 203–208. https://doi.org/10.1002/ajmg.c.30266
Morris, C. A., & Mervis, C. B. (2000). Williams Syndrome, 60, 389–395.
https://doi.org/10.1016/B978-0-08-097086-8.55055-7
Muenke, M., Adeyemo, A., & Kruszka, P. (2016). An electronic
atlas of human malformation syndromes in diverse populations.
Genetics in Medicine, 18(11), 1085–1087.
https://doi.org/10.1038/gim.2016.3
Noonan, J. A. (1994). Noonan syndrome: An update and review for the
primary pediatrician. Clinical Pediatrics, 33(9), 548–555.
https://doi.org/10.1177/000992289403300907
Nora, J. J. (1974). The Ullrich-Noonan syndrome (Turner Phenotype).
Archives of Pediatrics & Adolescent Medicine, 127(1), 48.
https://doi.org/10.1001/archpedi.1974.02110200050007
Ojala, T., Pietikäinen, M., & Harwood, D. (1996). A comparative
study of texture measures with classification based on featured
distributions. Pattern Recognition, 29(1), 51–59.
https://doi.org/10.1016/0031-3203(95)00067-4
Patil, S. J., Madhusudhan, B. G., Shah, S., & Suresh, P. V. (2012). Facial
phenotype at different ages and cardiovascular malformations in
children with Williams-Beuren syndrome: A study from India.
American Journal of Medical Genetics Part A, 158A(7), 1729–1734.
https://doi.org/10.1002/ajmg.a.35443
Pérez Jurado, L. A., Peoples, R., Kaplan, P., Hamel, B. C. J., & Francke,
U. (1996). Molecular definition of the chromosome 7 deletion
in Williams syndrome and parent-of-origin effects on growth.
American Journal of Human Genetics, 59(4), 781–792.
http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1914804&tool=pmcentrez&rendertype=abstract
Pober, B. R. (2010). Williams-Beuren syndrome. New England Journal
of Medicine, 362(3), 239–252. https://doi.org/10.1056/NEJMra0903074
Preus, M. (2008). Differential diagnosis of the Williams and the
Noonan syndromes. Clinical Genetics, 25(5), 429–434.
https://doi.org/10.1111/j.1399-0004.1984.tb02012.x
Roberts, A. E., Allanson, J. E., Tartaglia, M., & Gelb, B. D. (2013).
Noonan syndrome. The Lancet, 381(9863), 333–342.
https://doi.org/10.1016/S0140-6736(12)61023-X
Rokhaya, N., Coumba, N., Mohamed, L., Babacar, M., Mama, S. D.,
Jean, P. D. D., Omar, F., Ibrahima, B. D., & Haby, S. S. (2014).
Mutation N308T of protein tyrosine phosphatase SHP-2 in two
Senegalese patients with Noonan syndrome. Journal of Medical
Genetics and Genomics, 6(1), 6–10.
https://doi.org/10.5897/JMGG2013.0072
Romano, A. A., Allanson, J. E., Dahlgren, J., Gelb, B. D., Hall, B.,
Pierpont, M. E., Roberts, A. E., Robinson, W., Takemoto, C. M.,
& Noonan, J. A. (2010). Noonan syndrome: Clinical features, diagnosis, and management guidelines. Pediatrics, 126(4), 746–759.
https://doi.org/10.1542/peds.2009-3207
Sharland, M., Burch, M., McKenna, W. M., & Paton, M. A. (1992).
A clinical study of Noonan syndrome. Archives of Disease
in Childhood, 67(2), 178–183.
https://doi.org/10.1136/adc.67.2.178
Şimşek-Kiper, P. Ö., Alanay, Y., Gülhan, B., Lissewski, C., Türkyılmaz,
D., Alehan, D., Çetin, M., Utine, G. E., Zenker, M., & Boduroğlu,
K. (2013). Clinical and molecular analysis of RASopathies in a
group of Turkish patients. Clinical Genetics, 83(2), 181–186.
https://doi.org/10.1111/j.1399-0004.2012.01875.x
Strømme, P., Bjørnstad, P. G., & Ramstad, K. (2002). Prevalence estimation of Williams syndrome. Journal of Child Neurology, 17(4),
269–271. https://doi.org/10.1177/088307380201700406
van der Burgt, I., Thoonen, G., Roosenboom, N., Assman-Hulsmans,
C., Gabreels, F., Otten, B., & Brunner, H. G. (1999). Patterns of
cognitive functioning in school-aged children with Noonan syndrome associated with variability in phenotypic expression. The
Journal of Pediatrics, 135(6), 707–713.
http://www.ncbi.nlm.nih.gov/pubmed/10586173
Winter, M., Pankau, R., Amm, M., Gosch, A., & Wessel, A. (2018).
The spectrum of ocular features in the Williams-Beuren syndrome.
Clinical Genetics, 49(1), 28–31.
https://doi.org/10.1111/j.1399-0004.1996.tb04320.x
Wu, Y.-Q., Nickerson, E., Shaffer, L. G., Keppler-Noreuil, K., &
Muilenburg, A. (1999). A case of Williams syndrome with a large,
visible cytogenetic deletion. Journal of Medical Genetics, 36(12),
931–932. https://doi.org/10.1136/jmg.36.12.931
Ye, J., Janardan, R., & Li, Q. (2005). Two-dimensional linear discriminant analysis. In L. K. Saul, Y. Weiss, & L. Bottou (Eds.),
Advances in neural information processing systems 17 (pp. 1569–1576).
MIT Press. http://papers.nips.cc/paper/2547-two-dimensional-linear-discriminant-analysis.pdf
Yoshida, R., Hasegawa, T., Hasegawa, Y., Nagai, T., Kinoshita, E.,
Tanaka, Y., Kanegane, H., Ohyama, K., Onishi, T., Hanew, K.,
Okuyama, T., Horikawa, R., Tanaka, T., & Ogata, T. (2004).
Protein-tyrosine phosphatase, nonreceptor type 11 mutation analysis and clinical assessment in 45 patients with Noonan syndrome.
The Journal of Clinical Endocrinology & Metabolism, 89(7),
3359–3364. https://doi.org/10.1210/jc.2003-032091
Zhao, Q., Okada, K., Rosenbaum, K., Kehoe, L., Zand, D. J., Sze,
R., Summar, M., & Linguraru, M. G. (2014). Digital facial
dysmorphology for genetic screening: Hierarchical constrained
local model using ICA. Medical Image Analysis, 18(5), 699–710.
https://doi.org/10.1016/j.media.2014.04.002
SUPPORTING INFORMATION
Additional Supporting Information may be found online in
the Supporting Information section.
How to cite this article: Porras AR, Summar M,
Linguraru MG. Objective differential diagnosis of
Noonan and Williams–Beuren syndromes in diverse
populations using quantitative facial phenotyping. Mol
Genet Genomic Med. 2021;9:e1636.
https://doi.org/10.1002/mgg3.1636
